SQL Server AlwaysOn Failover threshold and Lease Timeout

Hi experts,

Each time I build or restore another log shipping (standby) server, the cluster log shows "ERR [RES] SQL Server Availability Group: [hadrag] Failure detected, diagnostics heartbeat is lost" and the errorlog shows "A connection timeout has occurred on a previously established connection to availability replica 'DL980-4' with id". I searched and found a document (http://download.microsoft.com/download/0/F/B/0FBFAA46-2BFD-478F-8E56-7BF3C672DF9D/Troubleshooting%20SQL%20Server%20AlwaysOn.pdf ) which indicates that “This may be a performance issue”. I run the database restore and the AlwaysOn synchronization over the same 10GbE link at the same time.

Should I increase

leaseTimeout from 20000 to 100000 and

HealthCheckTimeout from 30000 to 300000?

Would this prevent unnecessary failovers?

---

Please refer to

(http://blogs.msdn.com/b/psssql/archive/2012/09/07/how-it-works-sql-server-alwayson-lease-timeout.aspx )

parag

10-18-2013 3:19 AM

Hi Denzil

We see the lease expire very frequently when the server is under very high CPU pressure. Our failure condition level is 1.

Is it possible to prevent this situation? The problem is that when the lease expires, all current connections seem to be dropped. Is there a way to avoid that?

Also, is it possible to affinitize the AlwaysOn health check process to a particular core?

Thanks for your help!


1 Answer

  • Mar 05, 2014 at 05:59 AM

    Hi Dennis

    Should I increase

    leaseTimeout from 20000 to 100000 and

    HealthCheckTimeout from 30000 to 300000?

    Would this prevent unnecessary failovers?

    1. You can increase the timeout parameters, but failover will then be slightly delayed when a real failure occurs (this is a workaround, not a fix).

    2. Check the network connections (cluster heartbeat and public network).

    3. Have you applied the latest OS and database patches?

    4. If possible, raise a ticket with Microsoft. They may provide an update for the cluster resource based on your issue.

    Regards

    Sriram
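
To make the trade-off in point 1 concrete, here is a rough sketch of how the HealthCheckTimeout value affects detection time. It relies on the documented behavior that sp_server_diagnostics reports at an interval equal to one third of the health-check timeout, and that the minimum allowed value is 15000 ms. The function names and the availability group name `MyAG` are hypothetical, chosen only for illustration. This is not a tuning recommendation.

```python
def diagnostics_interval_ms(health_check_timeout_ms: int) -> int:
    """Approximate interval at which sp_server_diagnostics reports,
    taken as 1/3 of the health-check timeout."""
    return health_check_timeout_ms // 3


def health_check_timeout_sql(ag_name: str, timeout_ms: int) -> str:
    """Build the T-SQL statement that sets HEALTH_CHECK_TIMEOUT.

    ag_name is a placeholder; substitute your own availability group name.
    """
    if timeout_ms < 15000:
        raise ValueError("HEALTH_CHECK_TIMEOUT must be at least 15000 ms")
    return (
        f"ALTER AVAILABILITY GROUP [{ag_name}] "
        f"SET (HEALTH_CHECK_TIMEOUT = {timeout_ms});"
    )


if __name__ == "__main__":
    # Compare the default value with the value proposed in the question.
    for timeout in (30000, 300000):
        print(timeout, "ms timeout ->", diagnostics_interval_ms(timeout), "ms report interval")
    # 'MyAG' is a hypothetical availability group name.
    print(health_check_timeout_sql("MyAG", 300000))
```

With the proposed 300000 ms, a genuinely hung instance could go undetected for up to five minutes, which is the delay mentioned in point 1. LeaseTimeout is a cluster resource property rather than an availability group option, so it is not covered by the statement above.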
