on 06-06-2013 9:44 AM
Hi all,
Are there built-in HANA DB operations to find outliers in an OLAP view (a data cube, or a star-schema analytic view), or do I have to write the query to find these outliers manually?
And if I have to do it manually, can you give me a SQL/SQLScript example of such an operation?
You can assume that there is one measure and one (or many) dimensions.
Thanks & Regards
Mohamed Ali
Hi Mohamed,
You need the SAP HANA Predictive Analysis Library (PAL) to build a model that can determine outliers. The PAL function that can be used to determine outliers is called Anomaly Detection.
You may refer to the following guide on how to implement Anomaly Detection to identify outliers.
Thanks,
Sharan
If you want to avoid using PAL, you can certainly define a procedure using CE functions and implement the logic according to your requirements. Performance with PAL shouldn't really be an issue, because PAL functions usually run on the calculation engine, which is highly optimized for such calculations. You can also improve PAL performance by not passing unnecessary data to the PAL function: pass only the fields required for the calculation rather than a whole table.
The other option, if you want a really complex definition that PAL doesn't support or that is too complicated for a procedure, is to reach out to R.
Here is a blog by Blag (Alvaro) that shows how you can achieve complex statistical analysis using R.
Thanks,
Sharan
Hi,
I could not understand your question about outliers. Could you please elaborate or frame it in another way?
Regards
Raj
An outlier is something that is very different from the rest of the data in a dimension of an analytic view.
So, say for example I have the dimensions CITY and MONTH and a measure like AMOUNT_SOLD (you can refer to this thread, http://scn.sap.com/thread/3371063, to see the analytic view I am currently using), and I want to find the most different data cell in this set of data: AMOUNT_SOLD per CITY per MONTH (say, for example, the one that has the highest STD, or an STD higher than some threshold).
I would like to know if there is a CE function or something built into HANA to do that, or if I have to code it myself.
And I am having difficulty getting this kind of data AMOUNT_SOLD per CITY per MONTH using MDX (and I am not using Excel by the way)
Thanks & regards
Mohamed Ali
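For illustration only, here is a minimal sketch of the "manual" route described above: aggregate AMOUNT_SOLD per CITY per MONTH with a plain SQL GROUP BY, then flag the cells whose distance from the mean exceeds a chosen number of standard deviations. It is written in Python against an in-memory SQLite table so that it runs anywhere; the table name `sales`, the sample figures, the helper `find_outlier_cells`, and the threshold of 2.0 are all assumptions made for the example, not something taken from the analytic view in this thread or from PAL.

```python
import sqlite3
from statistics import mean, pstdev


def find_outlier_cells(rows, threshold=2.0):
    """Return (city, month, amount, z) for cells whose |z-score| exceeds threshold.

    rows is an iterable of (city, month, amount) tuples, one per data cell.
    The population standard deviation is used; this is an assumption.
    """
    rows = list(rows)
    amounts = [amount for _, _, amount in rows]
    mu = mean(amounts)
    sigma = pstdev(amounts)
    if sigma == 0:
        # All cells are identical, so nothing can be an outlier.
        return []
    scored = [(city, month, amount, (amount - mu) / sigma)
              for city, month, amount in rows]
    return [cell for cell in scored if abs(cell[3]) > threshold]


# Hypothetical sample data standing in for the analytic view.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (city TEXT, month INTEGER, amount_sold REAL)")
sample = [
    ("Berlin", 1, 100), ("Berlin", 2, 110), ("Berlin", 3, 90), ("Berlin", 4, 100),
    ("Paris", 1, 105), ("Paris", 2, 95), ("Paris", 3, 100),
    ("Paris", 4, 200), ("Paris", 4, 300),  # two rows that sum to one large cell
    ("Rome", 1, 100), ("Rome", 2, 90), ("Rome", 3, 110), ("Rome", 4, 100),
]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", sample)

# One row per data cell: AMOUNT_SOLD per CITY per MONTH.
cells = conn.execute(
    "SELECT city, month, SUM(amount_sold) FROM sales GROUP BY city, month"
).fetchall()

outliers = find_outlier_cells(cells, threshold=2.0)
for city, month, amount, z in outliers:
    print(f"{city} month {month}: amount={amount}, z={z:.2f}")
```

With this sample data only the Paris, month 4 cell is flagged. The same two steps (aggregate per cell, then compare each cell with the overall mean and standard deviation) could in principle be expressed in a procedure, but the exact SQLScript form is not shown here.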