
How to address High Page & Row Lock HashTable spinlock contention

We are running ASE 15.7 SP135 on a Linux host -- threaded kernel mode -- 20 engine threads on a 32-core host. We experience periods of high CPU utilization, during which sysmon shows high contention (over 20%) on the Page & Row Lock HashTable spinlocks. Based on review of the documentation and work with SAP support, we have been adjusting lock spinlock ratio, lock hashtable size, and lock address spinlock ratio. Here are the current related configuration settings:

number of locks = 1000000
lock spinlock ratio = 20
lock address spinlock ratio = 5
lock hashtable size = 65536
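For reference, the current run values of these parameters can be confirmed from isql with sp_configure (a sketch; run against your own server):

```sql
-- Sketch: sp_configure with a parameter name reports that parameter's
-- default, config value, and run value.
sp_configure "number of locks"
go
sp_configure "lock spinlock ratio"
go
sp_configure "lock address spinlock ratio"
go
sp_configure "lock hashtable size"
go
```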

I have seen some documentation saying lock hashtable size should be 8192, based on the recommendation:

lock hashtable size >= (number of locks / 1000000) * 8192 [REC]

With number of locks = 1000000 that works out to exactly 8192, well below our current setting of 65536.

Any suggestions would be appreciated.



7 Answers

  • Aug 24, 2017 at 03:29 PM

    Can't tell from the limited info if the contention on Page & Row Lock HashTable is an actual issue or a symptom of another issue, eg, are processes experiencing some other sort of contention which in turn is causing them to take longer than normal to process locks?

    And while 20% sounds high, it's a relative number so some context would be helpful, eg, 20% of 100 spinlock requests over a 30-minute time period may not be as big of an issue when compared to 20% of 10 million spinlock requests over a 1-minute time period.


    Assuming you've ruled out 'normal' explanations for high cpu usage (eg, high logical IO counts; high volume of queries needing to be compiled; a known/buggy configuration setting), I'd want to get a better picture of all spinlock activity during the period in question ... and while you can dig through sp_sysmon results (and the underlying spinlock data), it may be a bit easier to query the master..monSpinlockActivity (MDA) table.

    NOTE: You could also ask tech support for a copy of, and instructions on how to use, a custom proc named sp_spinmon, which can also provide some drilldown on spinlock issues.

    Since monSpinlockActivity counters represent total counts since the dataserver was booted, you'll need to take periodic samplings of the table and then take deltas to figure out the counter values for a given period of time (eg, take snapshot of monSpinlockActivity, wait 1 minute, take another snapshot of monSpinlockActivity, take difference in counters and you get the spinlock activity for the past minute).

    NOTE: You'll need to make sure enable spinlock monitoring = 1.
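    A minimal sketch of that snapshot/delta approach (column names per the monSpinlockActivity MDA table; the #temp table names are made up, and this assumes mon_role plus the monitoring config options are enabled):

    ```sql
    -- Sketch: sample monSpinlockActivity twice, then compute deltas
    -- to get spinlock activity for the past minute.
    select SpinlockName, Grabs, Spins, Waits
      into #spin1
      from master..monSpinlockActivity
    go
    waitfor delay "00:01:00"   -- sample interval: 1 minute
    go
    select SpinlockName, Grabs, Spins, Waits
      into #spin2
      from master..monSpinlockActivity
    go
    -- Deltas for the past minute, worst spinners first
    select a.SpinlockName,
           Grabs = b.Grabs - a.Grabs,
           Spins = b.Spins - a.Spins,
           Waits = b.Waits - a.Waits
      from #spin1 a, #spin2 b
     where a.SpinlockName = b.SpinlockName
     order by 3 desc
    go
    ```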

    A common reason (for high cpu usage spikes) that pops up way too often is an undersized procedure cache. This will show up in monSpinlockActivity with high (delta) spins/grabs and waits/grabs values.


    A couple good resources:

    ASE WIKI page re: spinlocks and (high) cpu usage (contains example code for sampling monSpinlockActivity)

    Jeff Tallman's blog re: rightsizing procedure cache (could be pertinent if you find your real issue is spinlock contention related to procedure cache)


  • Aug 24, 2017 at 04:33 PM

    Following is a select from monSpinlockActivity over a 1-minute interval:

    SpinlockName               Grabs        Spins        Waits     Contention
    fglockspins                40818673     2310551156   13616664       33.35
    default data cache         -4081864140  588372648    7133812        -0.17
    SSQLCACHE_SPIN             274884       7587679      1788            0.65
    Resource->rdbt_spin        10094447     4827526      78312           0.77
    tablockspins               12216912     3100127      186391          1.52
    Resource->rproccache_spin  1211028      267436       11305           0.93
    Pdes Chain Spinlocks       34425474     249964       28132           0.08
    Ides Chain Spinlocks       31359003     191883       31202           0.09
    Sched Q                    317878       75159        4365            1.37
    Resource->rdesmgr_spin     696313       36021        2581            0.37

  • Aug 24, 2017 at 04:49 PM


    Attached sysmon output


  • Aug 24, 2017 at 05:51 PM

    Thanks for the additional data (monSpinlockActivity and sp_sysmon) ...


    The monSpinlockActivity data appears to rule out my concern about spinlock contention related to the procedure cache, with fglockspins being the main issue:

    SpinlockName     Grabs      Spins        Waits       Contention 
    fglockspins      40818673   2310551156   13616664         33.35 

    And per that wiki page I referenced, fglockspins is related to the config parameters you mention in your initial post/question.

    NOTE: While the lock hashtable size = 65536 seems a bit excessive (eg, sp_sysmon is showing a tiny avg chain length = 0.00015), I don't see this as being an issue.

    Since you're still having what appears to be spinlock contention issues (and resulting high cpu) after making your config changes, I'm wondering if there could be an issue somewhere else in your system ...


    Looking through the rest of the sp_sysmon output I did find one item of concern under the Disk I/O Management section, namely, all of your IOs (with a vast majority of these being writes) are being performed synchronously ... *yikes*:

      Completed Disk I/O's
        Asynchronous I/O's
          Total Completed I/Os            0.0           0.0           0       n/a   
        Synchronous I/O's
          Total Completed I/Os          307.9           1.6       55429     100.0 %
      -------------------------  ------------  ------------  ----------             
      Total Completed I/Os              307.9           1.6       55429             

    Ideally ASE should be running with 100% asynchronous IOs as this allows threads to continue working while waiting for disk IOs to complete.

    At this point I'm wondering if you could be seeing a lot of page/row spinlock contention because said spinlocks are being held while waiting for synchronous IOs to complete? [SAP tech support should be able to confirm/refute this idea]

    I'd want to find out why ASE is running with synchronous IOs (eg, the linux host is not configured to support async IO; ASE's allow sql server async i/o = 0 [off] ), and see what you can do to get asynchronous IOs enabled.

    - at ASE startup the errorlog should contain individual messages for each device, to include whether the device was started synchronously or asynchronously

    - see the Config guide (Unix/Linux), section 'Enabling Asynchronous Disk I/O on Linux' for items to have your system admin check.

    - if you can get asynchronous IO enabled, make sure you review the max async i/os per engine/server config parameters to make sure they're not set too low [default settings are typically too low for today's disk subsystems with the result being that too-low settings cause an artificial bottleneck for ASE disk IO activity]
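    As a quick way to check the errorlog for the device-start messages mentioned above, something like the following grep works; the log lines here are made up for illustration (exact wording varies by ASE version), and the real log lives under $SYBASE/$SYBASE_ASE/install/:

    ```shell
    #!/bin/sh
    # Illustrative device-start lines as they might appear in an ASE errorlog;
    # substitute your real errorlog path for the sample file below.
    cat > /tmp/ase_errorlog.sample <<'EOF'
    00:0002:00000:00001:2017/08/24 10:00:01.00 kernel  Virtual device 0 started using synchronous i/o.
    00:0002:00000:00001:2017/08/24 10:00:01.05 kernel  Virtual device 1 started using asynchronous i/o.
    EOF
    # Count devices started synchronously (leading space excludes "asynchronous");
    # any non-zero count is a candidate for investigation.
    grep -c " synchronous i/o" /tmp/ase_errorlog.sample   # prints 1
    ```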


  • Aug 24, 2017 at 08:56 PM

    Thanks Mark -- we'll look into setting this up


  • Aug 25, 2017 at 03:55 PM


    Thanks Mark - made the config change to allow asynchronous IOs. The latest runs show spinlock contention in the default data cache, so we are going to set up named caches next. Currently the default data cache has 32 partitions -- are there drawbacks to setting the number of partitions to 64?


  • Aug 25, 2017 at 05:32 PM

    OK, so with async io enabled you were able to eliminate the spinlocks on the page/row lock hashtable, and you were able to double the volume of IO requests (from 55,429 to 110,847), but in doing so the bottleneck has now shifted to the default data cache.


    Might be interesting to see which spinlock(s) are having issues (ie, run that delta query against monSpinlockActivity).

    With 22 engines I wouldn't expect much improvement by increasing the cache partitions from 32 to 64.


    With approximately a third of your disk IO requests being writes against (what appear to be) tempdb devices, I'd probably want to start with setting up a named cache for your tempdbs.

    You may have to play with the pool sizes (4k; 8k, assuming tempdb's log i/o size = 8k; 32k) to find the right size/mix; this should go a long way towards reducing the spinlock contention in your default data cache.
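    A sketch of what that setup might look like; the cache name, sizes, and pool sizes below are placeholders to adjust for your system:

    ```sql
    -- Sketch: create a named cache for tempdb and carve out buffer pools.
    -- All names and sizes are illustrative placeholders.
    sp_cacheconfig "tempdb_cache", "2G", mixed_cache
    go
    -- Add larger pools alongside the default 4k pool
    sp_poolconfig "tempdb_cache", "512M", "8K"    -- matches tempdb log i/o size = 8k
    go
    sp_poolconfig "tempdb_cache", "512M", "32K"   -- for large scans/sorts
    go
    -- Bind tempdb to the new cache
    sp_bindcache "tempdb_cache", tempdb
    go
    ```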

    Assuming a majority of the tempdb cache activity is in support of short-lived operations (eg, short-lived #temp tables), I'd probably also want to disable the HK (housekeeper) in your new tempdb cache. This should eliminate a lot of unnecessary writes (by the HK) to your tempdb devices. While this may help reduce the potential for some spinlock contention in the tempdb cache, the greater effect will come in the form of freeing up bandwidth for more important disk IO activity (eg, in case you're at/near the max throughput of your disk subsystem). One downside: for long-running tempdb activity you may see an increase in pages being cleaned as they cross into the wash area, so you may find yourself having to bump up the size of the wash area if you see the tempdb cache stalling while waiting for dirty pages to be flushed to disk.

    To disable the HK in your new tempdb cache you'll need to shutdown ASE and update the ASE's config file entry for the tempdb cache as follows:

    [Named Cache:<tempdb_cache_name>]
            cache size = xxx
            cache status = xxx
            cache status = HK ignore cache     <==== add this line
            cache replacement policy = xxx
            local cache partition number = xxx

    NOTE: Yes, you'll have 2x 'cache status' entries for the tempdb cache.

    Once you restart ASE you should find that sp_cacheconfig shows the HK is ignoring (ie, disabled in) the tempdb cache.
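    For example, after the restart the check could look like this (cache name is a placeholder); the status column in the output should include the HK-ignore setting:

    ```sql
    -- Sketch: display the named cache's configuration and status
    sp_cacheconfig "tempdb_cache"
    go
    ```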

    Some references:

    ASE SAG (manual) entry re: disabling HK in cache (repeat of what I've posted above re: updating the ASE config file)

    Presentation by Jeff Tallman re: tuning data caches (includes small blip on disabling HK in tempdb caches; good refresher for tuning data caches in general)
