Troubleshooting sidecar causing memory unloading

patrickbachmann
Active Contributor
0 Kudos

Hi folks,

I occasionally get complaints about a report becoming very slow.  The report uses MSEG, which is queried frequently every single day, so I would not expect it to be automatically unloaded; if a table that popular were unloaded, I would assume we have a serious capacity issue.  Anyhow, when this happens I often don't hear about it until AFTER the unload; when I then go back into the Performance/Load KPIs I can see massive unloading at a certain time of day.

I'm trying to find out which sidecar query triggers the need for such massive unloading.  I asked the SAP NetWeaver team for a list of queries running during this window, but of course it's a huge list.  I looked at the SQL plan cache in HANA for what was executed that day and filtered out anything using MSEG, and I'm still left with a needle in a haystack.  To make matters worse, the NetWeaver team needs the application user, which doesn't seem to show up in the plan cache.  The only way I can capture the application user is to catch the query running LIVE at the time of the unloading event.

Does anybody have tips on how to pinpoint the root cause of a memory-hog report like this?  All our sidecar reports are tested before going to production, but what I think may be happening is that users are running them with much larger selections than originally tested, and I'm trying to figure out which report it could be.

Thanks,

-Patrick

Accepted Solutions (1)

former_member182114
Active Contributor
0 Kudos

Hi Patrick,

The list of executed queries will not tell you which one demanded so much memory that the system had to unload others, and I wouldn't expect it to be one using the MSEG table.

Since you roughly know the time at which it happens, I'd turn on memory tracking and the expensive statements trace to catch the statement to focus on.

alter system alter configuration ('global.ini','SYSTEM') set ('resource_tracking','enable_tracking') ='on' with reconfigure;

alter system alter configuration ('global.ini','SYSTEM') set ('resource_tracking','memory_tracking') ='on' with reconfigure;

... and turn the expensive statements trace on... and actively monitor... Things will slow down, but you have a better chance of catching the hungry statement.
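
If you prefer doing it in SQL rather than via the trace dialog in the Studio, something along these lines should work (a sketch from memory; the section should be 'expensive_statement' and the duration threshold sits in the same section, but please double-check the parameter names for your revision):

-- assumed parameter name; switches the expensive statements trace on
alter system alter configuration ('global.ini','SYSTEM') set ('expensive_statement','enable') = 'true' with reconfigure;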

Regards, Fernando Da Rós

patrickbachmann
Active Contributor
0 Kudos

Hi Fernando,

OK, I will look into your suggestions, which are greatly appreciated.  I was thinking about the expensive statements trace too, but my concern was that it only catches queries running for a long time, and this query might be a resource hog that is quick to execute (making it even more of a needle in a haystack).  I was not aware of memory_tracking, though, which sounds like it could help.  I'm going to do a quick test in our staging system first (with both traces, as you suggest) and see how it goes.

Thanks!

-Patrick

patrickbachmann
Active Contributor
0 Kudos

Fernando,

I ran the two alter statements on my test box and am now toying with memory. Where can I see the results?  I'm guessing in one of the TRC files?

Thanks,

-Patrick

former_member182114
Active Contributor
0 Kudos

Hi Patrick,

After activating the tracking you need to activate the expensive statements trace.

You can follow it on the Performance -> Expensive Statements tab.

There's a MEMORY_SIZE column that is usually always -1, but with memory tracking enabled you get the actual amount of memory used.

Best regards, Fernando Da Rós

patrickbachmann
Active Contributor
0 Kudos

Ahhh, ok, thanks for clarifying!  I thought they were somehow independent.  OK, will try now.

patrickbachmann
Active Contributor
0 Kudos

Ok, just one more question!  The threshold duration: I'm not clear what that weird character is that looks like an upside-down n.

former_member182114
Active Contributor
0 Kudos

Heh - I don't know the name either, but 1000 of them is 1ms, so that value is probably 1 second.

Best regards, Fernando Da Rós

patrickbachmann
Active Contributor
0 Kudos

Great, thanks.  Would you happen to know if the MEMORY_SIZE column in the trace is in KB or something else?

patrickbachmann
Active Contributor
0 Kudos

I found the troubleshooting and performance guide, which makes the same suggestions as yours, but it still doesn't say what the size unit is for MEMORY_SIZE, so that's the last missing piece of my puzzle!

former_member182114
Active Contributor
0 Kudos

Hi Patrick,

The MEMORY_SIZE is in bytes.

Look for the huge ones 🙂

Best regards,

Fernando Da Rós

patrickbachmann
Active Contributor
0 Kudos

Awesome, thanks!!  This is great so far.  Now I can tie a $ amount to various queries if I want to.  Obviously memory is a reused/shared resource, but this will still give us a better idea of where the most costly queries are.  So often when we test and approve a report it's for selection criteria X, but then users run criteria Y and we're unaware of where the memory is going.

-Patrick

patrickbachmann
Active Contributor
0 Kudos

Is there a way to run SQL against the expensive statements data?  I.e., one sidecar query produced 19 statements and I want to sum the MEMORY_SIZE column.  I'm assuming yes, it's just a matter of finding the right view.

patrickbachmann
Active Contributor
0 Kudos

Never mind, found it!

select * from M_EXPENSIVE_STATEMENTS

patrickbachmann
Active Contributor
0 Kudos

Actually, I get 19 rows via the Expensive Statements Trace tab in the performance monitor vs. 29 entries in M_EXPENSIVE_STATEMENTS, so I'm scratching my head....

patrickbachmann
Active Contributor
0 Kudos

I "THINK" I have to exclude OPERATION 'AGGREGATED_EXECUTION' when I query from the table.  That's my theory at least as there seem to be 10 of these in my case accounting for the difference.

patrickbachmann
Active Contributor
0 Kudos

select * from M_EXPENSIVE_STATEMENTS WHERE APP_USER = 'TEST_USER' AND NOT OPERATION = 'AGGREGATED_EXECUTION'

former_member182114
Active Contributor
0 Kudos

Hi Patrick,

The view already aggregates.

About your monitoring, I just remembered a tip from Lucas related to multi-node execution: if that's your case, keep in mind that an execution started on HOST 1 may differ from one started on HOST 2, depending on where the tables involved in the execution are located.

Good luck

Regards, Fernando Da Rós

patrickbachmann
Active Contributor
0 Kudos

Hey Fernando,

I've now tried a bigger test in our staging environment, which closely resembles production since we recently did a refresh from production.  When I run a sidecar report I'm essentially seeing thousands of the same select, each returning a record or two.  The problem is I can't really just SUM the MEMORY_SIZE of all those selects, as I believe it's essentially the same memory being used over and over within the query.  So I'm back to not really knowing how much total memory the sidecar query is truly consuming.  See the example below (I have 20,000 of these lines so far and it's still running; I can't simply sum the column like I originally thought).

-Patrick

patrickbachmann
Active Contributor
0 Kudos

I'm wondering if maybe I need to calculate an AVG MEMORY_SIZE by summing MEMORY_SIZE, counting the distinct SELECTs, and dividing by that number, or something like that....
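
Something like this might be a starting point (just a sketch; it assumes the trace only captured the test user's run, and that MEMORY_SIZE is in bytes as discussed above):

select count(*)          as executions,
       sum(memory_size)  as total_memory_bytes,   -- overcounts memory that is reused across statements
       avg(memory_size)  as avg_memory_bytes,
       max(memory_size)  as max_memory_bytes      -- the single hungriest statement
  from M_EXPENSIVE_STATEMENTS
 where app_user = 'TEST_USER'
   and operation <> 'AGGREGATED_EXECUTION';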

lucas_oliveira
Advisor
0 Kudos

Hi guys,

Just a few additions here:

1) As mentioned, depending on the host the query starts on (in a scale-out system), the memory consumption will vary. You can try to rule that out by using the ROUTE_TO hint for testing purposes (see the sketch after this list).

2) Data is written into the internal tables behind M_EXPENSIVE_STATEMENTS *only* when the threshold limit is hit, so you are probably skipping a few queries (which is totally fine). As this view has time and app_user information, you could use those as filters together with avg/max to get better insight into the costs and queries involved in your scenario.

3) Normally you'd want to go after the greatest memory consumer and start from there. From that point on it's PlanViz playing time.

4) I assume this is a new scenario for your customer. If so, it's always good to understand their growth and see if they're far above the expected growth figures... in other words, they could be growing more than planned, which can definitely lead to such scenarios.
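
For (1), the hint looks roughly like this (sketch only; the volume id 3 is a placeholder, you can look up the id per host in M_VOLUMES):

-- pin the statement to one host/volume while testing
select matnr, sum(menge) as total_qty
  from mseg
 group by matnr
 with hint (ROUTE_TO(3));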


BRs,
Lucas de Oliveira

patrickbachmann
Active Contributor
0 Kudos

Hi Lucas,

Thanks for feedback.

1) I should have answered Fernando on this: in this case it's not a scale-out system, so we have a single host.

2) I set the threshold very low, for just one user, and had that user run the query.  I set the threshold to 1000, so I believe it's picking up everything this user does.

3) This is an ABAP sidecar query being called from SAP, so all of the selections are masked in the SQL plan cache and in M_EXPENSIVE_STATEMENTS, and I don't think I can run a visual plan on them.

4) No, just internal users running inefficient sidecar reports whose logic we will eventually push down to HANA.  We are experimenting with calling views from ABAP to start exploring that option.

Thanks,

-Patrick

lucas_oliveira
Advisor
0 Kudos

Hello,

Hm.... Threshold set to 1000us?  Then you're getting queries that run in far less than a second; actually, anything slower than 0,001s will be caught. I think it's easier to start by checking the costlier queries (1s+), but that's up to you.

Is that a particular SAP accelerator or was it some other development?

Finally: sure you can use Planviz on those. I don't see why not.

BRs,

Lucas de Oliveira

patrickbachmann
Active Contributor
0 Kudos

Well, here's the thing.  This particular query, like many of the ABAP sidecar queries I'm seeing that simply redirect to HANA as the secondary DB, can run for an hour or more in SAP.  But in HANA you barely notice it's running, because it sends thousands of small, separate, sub-second queries to perform lookups against BKPF, for example.  So I'm trying to evaluate the cost of that entire SAP query by totaling all of the thousands of little queries the ABAP report generates.  Your 1-hour query may only amount to 2 minutes of processing in HANA, but how much memory is it consuming as a whole?  Ideally I want to run our top 10 ABAP queries and get the following:

TOTAL MEMORY USED = X GB

TOTAL $ PER GB = X $ (Monetary cost)

TOTAL DURATION = X Minutes of CPU in HANA

Then I can go to the business and have a case for improving the worst offenders.
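
A rough roll-up along those lines might look like this (just a sketch: it assumes each report is run under its own test user, so APP_USER approximates a per-report total; the $-per-GB factor is a made-up placeholder, MEMORY_SIZE is in bytes, and DURATION_MICROSEC is elapsed time rather than pure CPU time):

select app_user,
       count(*)                                              as statement_count,
       round(sum(memory_size) / 1024 / 1024 / 1024, 2)       as total_memory_gb,
       round(sum(memory_size) / 1024 / 1024 / 1024 * 10, 2)  as est_cost_dollars,       -- assumes a placeholder 10 $ per GB
       round(sum(duration_microsec) / 1000000 / 60, 2)       as total_duration_minutes  -- elapsed, not pure CPU
  from M_EXPENSIVE_STATEMENTS
 where operation <> 'AGGREGATED_EXECUTION'
 group by app_user
 order by total_memory_gb desc;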

-Patrick

lucas_oliveira
Advisor
0 Kudos

Hello Patrick,

The reason I asked "Is that a particular SAP accelerator or was it some other development?" is that, depending on how you develop your ABAP application, you can make very bad decisions (like a SELECT SINGLE on BKPF inside a huge loop) that will definitely impact overall performance. By the looks of your expensive statements screenshot, that could also be contributing to the bad performance.

Makes sense?

BRs,

Lucas de Oliveira

former_member182114
Active Contributor
0 Kudos

Hi Patrick,

Repeating one single select many times is for sure the source of many problems; anyhow, I got curious about the numbers in your evidence:

- result: 0 records

- one select directly on a column table (BKPF in this case)

- with what, depending on the values, can be a good key (AWKEY...)

- taking 2 GB of memory

If I were you I'd go deeper into this with PlanViz:

- look at the values the application is passing and how selective they are in HANA

- is there partitioning involved?

- is there a GROUP BY? Can you post the complete SQL, with the parameters if possible?

- in what order are the filters being applied on the data source?

- is the access only on MAIN, or on DELTA as well? Is the delta in good shape?

Regards, Fernando Da Rós

patrickbachmann
Active Contributor
0 Kudos

Guys, I really appreciate all your help.  That said, I'm actually not looking to focus on performance tuning in this thread.  I'm simply trying to find which of the dozens of reports running from our sidecar are consuming the most memory, so I'm looking for a measuring mechanism to identify those queries.  I completely agree about my example, but it's just one of many; I'm eventually going to review this query and many more with the ABAP team, and I'm going to train them to push logic down to HANA as well.  This post is really just about identifying which queries may have caused the massive unloading in our production environment in recent weeks, i.e. which query may have triggered the unloading of MSEG and many other tables.  So this is preliminary research to find the biggest memory consumers that may have caused the unload.  So far it's only happened about once a week, and the day isn't consistent, but the time of day does seem to be consistent at about 9:19 AM.  This example of repeated BKPF lookups actually doesn't seem to consume a lot of memory; while it's really poor code, it doesn't seem to be the smoking gun.

-Patrick

lucas_oliveira
Advisor
0 Kudos

Well, then it's better to increase your expensive statements trace threshold. Having it at 1000us won't help you find the bigger queries, imo. Start with 1000000us and, if the results are still too overwhelming, increase it further (4-5s) and retest.
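
For example (same assumed 'expensive_statement' parameters as above; the value is in microseconds, so please verify the parameter name against your revision):

-- trace only statements slower than 1 second; raise to '5000000' for 5 seconds if still too noisy
alter system alter configuration ('global.ini','SYSTEM') set ('expensive_statement','threshold_duration') = '1000000' with reconfigure;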

BRs,

Lucas de Oliveira

former_member182114
Active Contributor
0 Kudos

Hi Patrick,

That said, I agree with Lucas. You should look at the ones taking more seconds: if a statement is hungry enough for memory to force unloads, it can't be that fast.

Best regards, Fernando Da Rós

patrickbachmann
Active Contributor
0 Kudos

Ok thanks again guys as always it's much appreciated. 

-Patrick

lbreddemann
Active Contributor
0 Kudos

Patrick Bachmann wrote:

Ok, just one more question!  The threshold duration: I'm not clear what that weird character is that looks like an upside-down n.

That's the Greek letter lowercase mu (https://en.wikipedia.org/wiki/Mu_(letter)), one of the standard abbreviations for microseconds (https://en.wikipedia.org/wiki/Microseconds).

The threshold here is given in microseconds, that is, one millionth of a second (1/1,000,000).

There you go - now you know

Answers (0)