Calc View and Out of Memory Errors

Former Member
0 Kudos

I have successfully tested two analytic views, each selecting a fact table (joined to attribute views).  Function and performance were good, as expected.  These fact tables are identical except that certain measures exist in one fact table but not the other.  To provide a single view of the data, I created a calculation view that joins the two analytic views.  With queries that are relatively non-selective, e.g.,

WHERE calendar_year_name = 'CY2012'

I hit out of memory errors.  The peak memory usage in the server is consistent with these errors.

Queries that are more selective (e.g., below) execute without error but take considerably longer than the queries against the analytic views.

WHERE calendar_year_name = 'CY2012'

AND department_name = 'Computers'

AND category_name = 'Total Personal Computers'

AND region_name = 'South America'

I wish to avoid joining the two fact tables in the ETL process.  Is joining analytic views in a calculation view the best solution?  The impression I get is that data is being copied from one layer/engine to another during this process (thus the out of memory errors).  Is that impression reasonable?

One other detail that might be useful: in the calculation view I've created joins between all attribute columns.  This appears to be necessary in order to get those columns into the output of the join.  The tables really only need to join on a few columns (the primary keys of each fact table).  Are my joins (on all attributes) the correct practice?  Would the number of joins have any bearing on the out of memory errors?

Thanks.

Accepted Solutions (0)

Answers (2)


Former Member
0 Kudos

Hi William,

You shouldn't join multiple analytic views; instead, combine them using the 'Union With Constant Value' approach, starting on slide 34 here: http://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/6056911a-07cc-2e10-7a8a-ffa9b8cf5...
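For reference, here is roughly what the 'Union with Constant Value' pattern looks like when written out as plain SQL (the table, column, and constant names below are illustrative, not from your model). Each branch supplies NULL for the measures it lacks and tags its rows with a constant, and aggregation happens only after the UNION ALL:

```sql
-- Illustrative only: assume SALES_A lacks UNITS_SHIPPED and SALES_B has it.
SELECT calendar_year_name,
       source_system,
       SUM(revenue)       AS revenue,
       SUM(units_shipped) AS units_shipped
FROM (
    SELECT calendar_year_name, revenue,
           NULL AS units_shipped,           -- measure missing in SALES_A
           'A'  AS source_system            -- the constant value
    FROM   sales_a
    UNION ALL
    SELECT calendar_year_name, revenue,
           units_shipped,
           'B'  AS source_system
    FROM   sales_b
)
GROUP BY calendar_year_name, source_system;
```

In a graphical calculation view the same shape is modeled with a Union node, with the constant assigned per source in the node's mapping.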

That should help.

Cheers,

Jody

vikasbansal2009
Discoverer
0 Kudos

Hi William,

The approach you are using, building analytic views first and then joining those analytic views in a calculation view, is not recommended.

Both the HANA OLAP engine and the calculation engine will be used to process the result, which in turn reduces performance.

It is better to use a graphical calculation view, or a scripted view with CE functions on the tables, which uses only the calculation engine.
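As a rough sketch of what such a scripted view could look like (schema, table, and column names here are made up, and the exact CE function signatures should be checked against your HANA revision):

```sql
/* Script body of a scripted calculation view; names are illustrative.
   Reads both fact tables, tags each with a constant, unions, aggregates,
   all inside the calculation engine. */
ta = CE_COLUMN_TABLE("MYSCHEMA"."SALES_A",
                     ["CALENDAR_YEAR_NAME", "REVENUE"]);
pa = CE_PROJECTION(:ta,
                   ["CALENDAR_YEAR_NAME", "REVENUE",
                    CE_CALC('''A''', NVARCHAR(1)) AS "SOURCE_SYSTEM"]);
tb = CE_COLUMN_TABLE("MYSCHEMA"."SALES_B",
                     ["CALENDAR_YEAR_NAME", "REVENUE"]);
pb = CE_PROJECTION(:tb,
                   ["CALENDAR_YEAR_NAME", "REVENUE",
                    CE_CALC('''B''', NVARCHAR(1)) AS "SOURCE_SYSTEM"]);
un = CE_UNION_ALL(:pa, :pb);
var_out = CE_AGGREGATION(:un,
                         [SUM("REVENUE") AS "REVENUE"],
                         ["CALENDAR_YEAR_NAME", "SOURCE_SYSTEM"]);
```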

Regards,

Vikas

Former Member
0 Kudos

Thank you for the replies.

The Building Advanced Data Models with SAP HANA presentation (in Jody's message) suggests a UNION between two analytic views, so I will give that a try first (because it's easy, understanding that a script with CE functions would be better).

What would be the best practice for adding the constant to the analytic view? I could imagine:

- Adding new columns to the fact tables, but I would rather not do that because it would add a bit of extra space (but probably not much, because there would be only a single value, right?) and I would prefer not to change the table definition.

- Creating a regular/SQL DDL view over the table and adding the constant in the view.  Would the view disrupt performance?

- Add a calculated measure to the analytic view with the calc'd measure set to a constant (seems easy, so that would be a nice approach).
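To make the second option concrete, it would be a thin projection view along these lines (names purely illustrative):

```sql
-- A plain SQL view that tags each row with the constant;
-- no extra storage, and the base table definition is untouched.
CREATE VIEW sales_a_tagged AS
  SELECT 'A' AS source_system,
         t.*
  FROM   sales_a t;
```

Note that whether a modeled view can be built on top of a plain SQL view may depend on your HANA revision, which is one reason the calculated-measure option inside the analytic view is attractive.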

When I try the script/CE function method, I'm assuming that I can add the constant in the script, right?  (I haven't yet made a script/CE function calc view).

With either approach, how will the constants impact aggregation with functions such as min, median and average?

Because union is recommended, it would seem that HANA is happier scanning the union rather than the join between two large tables.  Correct?  Is the join one of the cases where too much data gets moved between engines?  What is the use case for a join in the calculation view?

Thanks!

Former Member
0 Kudos

William Endress wrote:

Thank you for the replies.

The Building Advanced Data Models with SAP HANA presentation (in Jody's message) suggests a UNION between two analytic views, so I will give that a try first (because it's easy, understanding that a script with CE functions would be better).

If you're talking about the UNION node in graphical CalcView, it's the same as CE_UNION_ALL. 'Union with constant value' can be accomplished graphically (recommended), with CE functions, and/or with SQL.

What would be the best practice for adding the constant to the analytic view? I could imagine:

- Adding new columns to the fact tables, but I would rather not do that because it would add a bit of extra space (but probably not much, because there would be only a single value, right?) and I would prefer not to change the table definition.

- Creating a regular/SQL DDL view over the table and adding the constant in the view.  Would the view disrupt performance?

- Add a calculated measure to the analytic view with the calc'd measure set to a constant (seems easy, so that would be a nice approach).

Add a calculated measure

When I try the script/CE function method, I'm assuming that I can add the constant in the script, right?  (I haven't yet made a script/CE function calc view).

Yes

With either approach, how will the constants impact aggregation with functions such as min, median and average?

I don't get your question

Because union is recommended, it would seem that HANA is happier scanning the union rather than the join between two large tables.  Correct?  Is the join one of the cases where too much data gets moved between engines?  What is the use case for a join in the calculation view?

It's not a matter of data movement between engines. A couple of comments:

1) Generally speaking, fact data from two sources should always be combined via 'Union with constant value' (UCV) rather than joins. In the UCV approach, no real work is required for the UNION ALL; HANA essentially just stacks two chunks of memory. After the UNION ALL, aggregation is performed, which is HANA's performance hallmark. So UNION ALL + AGGREGATION is much faster than large joins.

2) That being said, suppose you have 500M sales records from 10 stores, covering thousands of products, a few thousand days, and a few thousand sales per day. If you aggregate the two analytic views per store and per year, the number of result records is in the tens (say, 10 stores in 2012 and 10 stores in 2011). In a case like this, a join should actually perform just fine. So, it depends on granularity.
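To illustrate the granularity point with made-up names: when each side is already aggregated down to store and year before the join runs, the join inputs are tiny and the join itself is cheap:

```sql
-- Each subquery stands in for an analytic view that aggregates
-- 500M raw rows down to ~10 stores x 2 years before the join.
SELECT a.store_name, a.year, a.revenue, b.returned_units
FROM  (SELECT store_name, year, SUM(revenue) AS revenue
       FROM   sales_facts
       GROUP BY store_name, year) a
JOIN  (SELECT store_name, year, SUM(returned_units) AS returned_units
       FROM   returns_facts
       GROUP BY store_name, year) b
  ON  a.store_name = b.store_name
  AND a.year       = b.year;
```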

Thanks!

rindia
Active Contributor
0 Kudos

Dear William,

How about creating the calculation view based directly on the column tables instead of on analytic views? That way you can use measures from multiple fact tables, which is a restriction in an analytic view.

The performance should also be better. Please try it and let me know your results.

Regards

Raj