We recently did some testing to understand how our application behaves in the event of database outages. To do so, we just blocked all network traffic to the HANA instance. Since we had set connection and communication timeouts, the expectation was that HTTP requests would fail after the timeout was exceeded.
Instead, we saw requests hanging indefinitely. Here is an excerpt from a stack trace:
"http-nio-8080-exec-10" - Thread t@818
   java.lang.Thread.State: TIMED_WAITING
        at java.base/java.lang.Thread.sleep(Native Method)
        at com.sap.db.jdbc.ConnectionSapDB._doSleep(ConnectionSapDB.java:5594)
        at com.sap.db.jdbc.ConnectionSapDB._tryReconnect(ConnectionSapDB.java:5570)
        at com.sap.db.jdbc.ConnectionSapDB._handleSendReceiveException(ConnectionSapDB.java:5385)
        at com.sap.db.jdbc.ConnectionSapDB._send(ConnectionSapDB.java:4649)
        at com.sap.db.jdbc.ConnectionSapDB.exchange(ConnectionSapDB.java:2008)
        - locked <4d65f6ae> (a com.sap.db.jdbc.HanaConnectionClean)
        at com.sap.db.jdbc.StatementSapDB._executeDirect(StatementSapDB.java:1845)
        at com.sap.db.jdbc.StatementSapDB._execute(StatementSapDB.java:1819)
        at com.sap.db.jdbc.StatementSapDB._execute(StatementSapDB.java:1780)
        at com.sap.db.jdbc.StatementSapDB.execute(StatementSapDB.java:585)
        - locked <4837ddd5> (a com.sap.db.jdbc.HanaStatement)
        at com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(PoolBase.java:169)
        at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:186)
        at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:162)
        at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:128)
You can see that the connection pool (HikariCP) checks whether the connection is still alive before handing it to the application. For this purpose, it executes a test statement. The driver, however, never returns control to the pool. Instead, it keeps retrying on its own (_tryReconnect).
This behavior is controlled by the JDBC connection property "reconnect", which is set to "true" by default. I think this is problematic, because it leads to "stop the world" behavior that is undesirable for most applications. For applications that use connection pools and serve interactive users, I would recommend setting it to "false".
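To make the recommendation concrete, here is a minimal sketch of how the property could be passed to the driver. It only builds the connection settings and does not open a connection. The class and method names, the host, the port, and the timeout values are made up for illustration, and I am assuming that the timeouts mentioned at the beginning correspond to the driver properties "connectTimeout" and "communicationTimeout" (both in milliseconds).

```java
import java.util.Properties;
import java.util.TreeSet;

public class HanaConnectionSettings {

    // Builds driver properties that switch off the driver's internal
    // reconnect loop, so a failed send surfaces as an error instead of a retry.
    // The two timeout property names are assumptions, see the text above.
    public static Properties failFastProperties(int connectTimeoutMs, int communicationTimeoutMs) {
        Properties props = new Properties();
        props.setProperty("reconnect", "false");
        props.setProperty("connectTimeout", Integer.toString(connectTimeoutMs));
        props.setProperty("communicationTimeout", Integer.toString(communicationTimeoutMs));
        return props;
    }

    // Appends the properties to a jdbc:sap:// URL as query parameters.
    // Names are sorted so the resulting URL is deterministic.
    public static String toJdbcUrl(String host, int port, Properties props) {
        StringBuilder url = new StringBuilder("jdbc:sap://" + host + ":" + port + "/");
        char separator = '?';
        for (String name : new TreeSet<>(props.stringPropertyNames())) {
            url.append(separator).append(name).append('=').append(props.getProperty(name));
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Hypothetical host, port, and timeout values.
        Properties props = failFastProperties(5000, 10000);
        System.out.println(toJdbcUrl("hana.example.com", 30015, props));
    }
}
```

With HikariCP, the resulting URL could be handed to HikariConfig.setJdbcUrl. Whether failing fast is the right trade-off still depends on the application, for example batch jobs may prefer the driver's retry behavior.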