Skip to Content

HWM & Active counters in monProcedureCacheMemory|ModuleUsage

Since 15.0.x ASE comes with handy PC MDAs - monProcedureCacheMemoryUsage and monProcedureCacheModuleUsage. Both are "essential" in sizing the PC correctly - and analyzing PC behavior for your application & ASE. There counters are very nice but... over time they become user-un-friendly.

Both counters are not absolute point-in-time sample values but rather grow over time. But if they grow - what do they show?

  • Active (for ModuleUsage) must show "Number of memory pages (2K) currently allocated to this module".
  • HWM show "The maximum number of memory pages allocated since the server was started."

There may be further broken down into sub-allocators.

You'd expect Active to indicate the sample value from the point-of-time of inspection and HWM to indicate the number of memory pages since the startup (i.e. relative to the DaysRunning from the monState). Yet, both behave in the same cumulative manner and both yield useless values if divided into daily portions and compared to say monitorconfig. The same applies to the reuse values packed into these tables.

Anyone tried to make sense of these values? They are essential in configuring PC on the one hand but accumulate over time on the other (and exceed PC size in hundreds of % + are marked as indicator&1). What do these in fact display? How do they relate to monitorconfig? When you restart ASE (or play with these on your private ASE) the numbers seem to make sense. When you analyze them on the real production environment - they only mislead (as does the sysmon bit in the Indicator column related to them).

Any insights?

Thanks,

Andrew

ps. PERFORMANCE SCALABILITY ENHANCEMENTS IN SAP ADAPTIVE SERVER® ENTERPRISE has a neat section on PC sizing which, too, rely on monProcedureCacheModuleUsage. However, the values are taken as simple aggregates - without any respect to their cumulative nature. Was ASE bounced to take correct measurement? How reliable are the HWM/Reuse numbers in that paper?

Add comment
10|10000 characters needed characters exceeded

  • Follow
  • Get RSS Feed

1 Answer

  • May 29, 2015 at 11:03 AM

    Hi Andrew,

    Those two tables can provide you with some useful data but as you've already noticed they can mislead (sometimes gratuitously!) These have been observerations/findings over time:

    • If using a version of ASE prior to 15.7 SP130, forget trying to correlate the total used sizes with anything reported by sp_monitorconfig. You will regularly see massively higher sums for certain modules. This was due to a bug covered under CR #747948, I believe it related to duplicate allocator entries when memory was moved from one module to another, In short, this bug over time will lead to monProcedureCacheMemoryUsage reporting enormous 'Active' figures for certain modules (usually 5 and 10), way, way in excess of the actual configured 'procedure cache size'.
    • This bug is fixed in SP130, however, there is still a distinct difference between the data in monProcedureCacheModuleUsage and monProcedureCacheMemoryUsage. So from SP130 the total 'Active' figure for each table should be roughly the same but when summing it per module you'll again see discrepancies and this is due to the proc memory manager and memory transfers.
      • Memory is moved from one module to another (this happens with reasonable regularity particularly from 5 to 10, likely lava execution contexts).
      • When the memory is moved the ModuleID will change but the AllocatorID does not, this leads to discrepancy over time. This isn't really a monitoring issue but more a proc mgr issue as all monitoring is doing is reading the detail from the control structure.
      • monProcedureCacheModuleUsage is 'current' usage by module.
      • monProcedureCacheMemoryUsage does report thecurrent usage (as a total) but broken down by the *original* allocator and as each allocator only has one parent module, the data is not representative of the current allocations by module.
    • The HWM figures are of course useful but you can't start summing these else you'll end up with useless data, you need to treate each HWM figure as standalone and use it as a proportion of the total procedure cache used. I.e. You could say that ModuleID 5 has used a maximum of 20% of the configured proc cache for example.
    • For comparison to sp_monitorconfig, this is reasonably reliable but there are a few points you need to be wary of.
      • Don't trust the total percentage used as reported by sp_monitorconfig. The monitor counter it is reporting on brings back the total number of pages used in the full procedure cache memory pool, this includes the cfg parameter 'statement cache size'.
        • When it calculates the percentage however it will only use the 'procedure cache size' configuration parameter.
        • As such, (not a realistic example) say you had a 50MB proc cache and 60MB statement cache and you fill 50MB of the statement cache. This will report 100% used by sp_monitorconfig and any further additions to the statement cache will end up with an active figure going above the size of cfg parameter 'procedure cache size'. You see no reuse for statement cache removals (despite each removal removing the entry from the 'true' statement cache and the associated lwp). You WILL see reuse if it is the statement cache lwp that has been directly flushed from the proc cache (rather than the the lwp being removed because the statement has been removed)
        • The reuse_cnt in sp_monitorconfig should match the sum of the NumReuseCaused in monProcedureCacheMemoryUsage. This figure represents the number of regular procbufs forcibly pushed out of the proc cache. Each procbuf might use of X number of proc_hdrs, this reuse_cnt is not a measure of pages it is a measure of how many trees/plans have been forced out.
      • monProcedureCacheModuleUsage.NumPagesReused is the actual number of 2KB pages that have been reused for a given module.
      • If you are using dbcc proc_cache ('free_unused') this will increase the reuse count by the total number of lwp and procs in the proc cache memory pool but it won't increment the counter in monProcedureCacheModuleUsage, so immediately you'll see discrepancies.

      So my overall take on this is:

      • If running earlier than 15.7 SP130 don't use those two tables, they'll just confuse.
      • Only use monProcedureCacheModuleUsage to break down by module, the other table is just a smidge too granular and the historical nature of it will confuse.
      • Even if breaking it down by module it rarely gives you an insight into the behaviour of the proc cache, it's almost always the usual suspects using the bulk of it up.

      • You can get some reuse even with a low percentage used, but your bad reuse should be avoided if you keep the proc cache at < 70-80%.
        • Let's not get into the engine local caches and TF 753, 758 and all that jazz, that's another discussion :-)
      • If running with a large statement cache be wary of the sp_monitorconfig problem and work out the actual percentage used by summing the Active from monProcedureCacheModuleUsage and dividing it by the 'procedure cache size' + 'statement cache size'.
        • You can also run a modified sp_monitorconfig which gives better data for procedure cache.

      Hope this helps!

      Cheers,

      Simon

      Add comment
      10|10000 characters needed characters exceeded

      • Does it mean that backup (either full or tran) may potentially consume fairly large amount of proc cache memory (with the bunch of servers I inspect it may range anywhere between 10 to 90 % of it)?

        The one server that has had this huge percentage of PC consumed by backups also reports 20+% proc_cache spinlock contention - from time to time. Is this related to the pressure on the PC memory caused by concurrent backup (actually, the only concurrent dump at that time was the tran log dump which takes place every 5 minutes)?

        Which brings me back to inability to use this MDA conclusively - at least in 15.0.3. The counters in it increase all the time - so if at some point of time some activity pushed one of the modules into using 90% out of all available PC memory it will stay there until the server is bounced (for both Active and HWM counters - I would expect the HWM value being pushed up while the Active value being brought back to real non-cumulative value). All the subsequent attempts to see which module is most active and potentially has greater influence on the overall behavior of ASE at the time of inspection will be unsuccessful. Measuring delta values will also make no sense. Is there no way to reset the counters without bouncing the server? How can I make sense of the values the table lists?