Troubleshooting latency

patrickbachmann · ‎02-18-2015

Hi folks,

I've been troubleshooting some major latency issues and have read a bunch of suggestions in this forum which were quite old and wondering if there's any new approach/ideas on troubleshooting latency problems. For my particular case I've been looking at performance/load charts during the early morning slow down times and am not seeing any high amount of delta merges or anything that indicates a problem on the HANA side. On SLT I've increased amount of jobs processing and found some improvements however still seeing delays where replication just stops for an hour or two at a time for certain tables. Talking to some folks on my team they said DBA's suggested re-index tables on the SAP side could improve things. I'm interested in understanding why that could help plus if there's any other new ideas/suggestions to look for since some of the old posts were placed on SCN.

Thanks,

-Patrick

patrickbachmann · ‎03-06-2015

Ok I exhausted everything I could read about and created SAP message and they recommend upgrading SLT client drivers from 82 to 85. Apparently we are experiencing frequent connection issues which should automatically resolve themselves but somehow are not. For anybody interested see note 2089430 - SQLDBC Connectivity Issues after failover of master node if you are curious or experience similar problems.

Thanks everyone.

-Patrick

patrickbachmann · ‎03-03-2015

Guys,

I haven't forgotten about this thread, I'm still tinkering. Last night I stayed up late and monitored and was able to catch a latency window where I could clearly see records stacking up in logging tables and could see it had been written to logging table an hour earlier. I then looked at SM37 at my 22 transfer jobs and they were all active. I have however noticed many of them with CANCELED status almost hourly and when I look at those, the errors looks something like this;

Log not found (in main memory)

Job cancelled after system exception ERROR_MESSAGE

I then tried to look at SM50 to see work processes available but I don't have access. I tried to engage my Netweaver team but by the time they looked everything had caught up. So my next step is trying to catch this happening live again and then have Netweaver team look at work processes and load during this time and finally I will create an SAP message as I think at that point I've done all the stuff they recommend doing for troubleshooting.

-Patrick

patrickbachmann · ‎02-19-2015

Lars,

Here's an excerpt from my dba to answer your question on the full table scans (below). I have been watching delta merges, locks, etc on the target system. Esssentially everything available in performance/load charts and looking for outliers during the suspect time frame.

Martin,

Your statement doesn't make sense to me. I started digging into this problem because indeed data was not getting into the target system. So I looked at latency and could see the many hours gap exactly at the time the users complained. If I can not use replication statistics to measure latency then what tools can I use to monitor latency on a daily basis? Can I not trust latency alert emails either? Aren't those also based on these same statistics? Thanks for any more insight you can provide.

-Patrick

1. One of the top statements ordered by CPU time:

- Note that it does FULL TABLE scan (no index access):

CPU CPU per Elapsed

Time (s) Executions Exec (s) %Total Time (s) %CPU %IO SQL Id

---------- ------------ ---------- ------ ---------- ------ ------ -------------

10,119.8 10,324 0.98 4.6 10,682.6 94.7 .0 amcd4pbg4068a

Module: /1CADMC/SAPLDMC010000000002665

DELETE FROM ""/1CADMC/00010824"" WHERE ""IUUC_PROCESSED"">=:A0

- It uses index (INDEX RANGE scan), but index itself did grew big.

5,689.7 10,386 0.55 2.6 5,999.8 94.8 .0 1pbm6w9w0snym

2. The same statement as previously, but this time tops buffer gets list. Pretty unusual for a table that has 16 rows! Note that those 3 statements below makes 10% of all “Buffer gets” (which translates to IO operations) from overall system load

Buffer Gets Elapsed

Gets Executions per Exec %Total Time (s) %CPU %IO SQL Id

----------- ----------- ------------ ------ ---------- ----- ----- -------------

8.66250E+08 10,386 83,405.6 4.2 5,999.8 94.8 0 1pbm6w9w0snym

7.61395E+08 10,324 73,750.0 3.7 10,682.6 94.7 0 amcd4pbg4068a

Module: /1CADMC/SAPLDMC010000000002665

DELETE FROM ""/1CADMC/00010824"" WHERE ""IUUC_PROCESSED"">=:A0

3.02521E+08 2,666 113,473.8 1.5 1,506.8 92.7 0 bu7azdh0r8c1f

I agree with ([LARS THE GREAT]) observation that table fragmentation, if accessed by index does not impact performance too much, but as I showed above, tables are accessed by full table scan or by indexes that are also became inefficient.

Former Member · ‎02-18-2015

Hi Patrick,

Can you please if back up running at a particular time and also check the availability of the background jobs or delays in the background jobs.

Check both source system and replication server.

Thanks,

Shakthi Raj Natarajan.

Troubleshooting latency

Accepted Solutions (0)

Answers (4)

Answers (4)

Re: Connecting SAP CAP with SAP Cloud ALM

How to use CopyProvider on Table created by TypeSc...

How to execute/fire onPost when user press enter o...

Re: Connecting SAP CAP with SAP Cloud ALM

Re: How can assign in Identity Authentication Serv...