Skip to Content
avatar image
Former Member

SAPDB stuck with no recognizable pattern

We are using sapdb 7.6.0.34 as database in a webapp (spring, hibernate as orm mapper, commons dbcp) with the jdbc-driver version 7.6.0.30.5567. Database and application are on the same machine and the cache hit rate is at 100% as our cache is the same size than the database.

After some days the application stucks and is inresponsible after that. After killing and restarting the app it is responsible again

Doing a thread-dump at that point reveals that all threads are waiting for the database.

At that point i find tons of that in the xserver-log:

2008-07-28 10:51:12 13306 12915 XSERVER Connection lost

2008-07-28 10:51:12 13306 ERR -11987 COMMUNIC semop (receive 44793978) interrupted and check reports error

2008-07-28 10:51:12 13288 12915 XSERVER Connection lost

2008-07-28 10:51:12 13288 ERR -11987 COMMUNIC semop (receive 44269674) interrupted and check reports error

About half an hour before we can see a dramatic growth in connections. It seems the application starts opening new connections as the available ones are bound and waiting for a call to the db to return:

2008-07-28 06:38:53 4089 11560 COMMUNIC Releasing T244

2008-07-28 06:38:53 4089 12827 COMMUNIC wait for connection T244

2008-07-28 06:39:00 4073 11561 COMMUNIC Connecting T266 localhost 0

2008-07-28 06:39:00 4089 12929 TASKING Task T266 started

2008-07-28 06:39:00 4089 11007 COMMUNIC wait for connection T266

2008-07-28 06:39:00 4089 11561 COMMUNIC Connected T266 localhost 0

2008-07-28 08:06:04 4073 11561 COMMUNIC Connecting T242 localhost 0

2008-07-28 08:06:04 4089 12929 TASKING Task T242 started

2008-07-28 08:06:04 4089 11007 COMMUNIC wait for connection T242

2008-07-28 08:06:04 4089 11561 COMMUNIC Connected T242 localhost 0

2008-07-28 08:28:26 4073 11561 COMMUNIC Connecting T230 localhost 0

2008-07-28 08:28:26 4089 12929 TASKING Task T230 started

2008-07-28 08:28:26 4089 11007 COMMUNIC wait for connection T230

2008-07-28 08:28:26 4089 11561 COMMUNIC Connected T230 localhost 0

2008-07-28 09:24:44 4073 11561 COMMUNIC Connecting T218 localhost 0

2008-07-28 09:24:44 4089 12929 TASKING Task T218 started

2008-07-28 09:24:44 4089 11007 COMMUNIC wait for connection T218

2008-07-28 09:24:44 4089 11561 COMMUNIC Connected T218 localhost 0

2008-07-28 10:19:50 4073 11561 COMMUNIC Connecting T300 localhost 0

2008-07-28 10:19:50 4089 12929 TASKING Task T300 started

2008-07-28 10:19:50 4089 11007 COMMUNIC wait for connection T300

2008-07-28 10:19:50 4089 11561 COMMUNIC Connected T300 localhost 0

2008-07-28 10:19:52 4073 11561 COMMUNIC Connecting T243 localhost 0

2008-07-28 10:19:52 4089 12929 TASKING Task T243 started

2008-07-28 10:19:52 4089 11007 COMMUNIC wait for connection T243

2008-07-28 10:19:52 4089 11561 COMMUNIC Connected T243 localhost 0

2008-07-28 10:19:55 4073 11561 COMMUNIC Connecting T234 localhost 0

2008-07-28 10:19:55 4089 12929 TASKING Task T234 started

2008-07-28 10:19:55 4089 11007 COMMUNIC wait for connection T234

2008-07-28 10:19:55 4089 11561 COMMUNIC Connected T234 localhost 0

2008-07-28 10:20:15 4089 12929 TASKING Task T267 started

2008-07-28 10:20:15 4073 11561 COMMUNIC Connecting T267 localhost 0

2008-07-28 10:20:15 4089 11007 COMMUNIC wait for connection T267

2008-07-28 10:20:15 4089 11561 COMMUNIC Connected T267 localhost 0

2008-07-28 10:20:16 4073 11561 COMMUNIC Connecting T233 localhost 0

2008-07-28 10:20:16 4089 12929 TASKING Task T233 started

2008-07-28 10:20:16 4089 11007 COMMUNIC wait for connection T233

2008-07-28 10:20:16 4089 11561 COMMUNIC Connected T233 localhost 0

2008-07-28 10:21:08 4089 12929 TASKING Task T290 started

2008-07-28 10:21:08 4073 11561 COMMUNIC Connecting T290 localhost 0

2008-07-28 10:21:08 4089 11007 COMMUNIC wait for connection T290

2008-07-28 10:21:08 4089 11561 COMMUNIC Connected T290 localhost 0

2008-07-28 10:21:17 4089 12929 TASKING Task T235 started

2008-07-28 10:21:17 4073 11561 COMMUNIC Connecting T235 localhost 0

2008-07-28 10:21:17 4089 11007 COMMUNIC wait for connection T235

2008-07-28 10:21:17 4089 11561 COMMUNIC Connected T235 localhost 0

2008-07-28 10:21:21 4073 11561 COMMUNIC Connecting T215 localhost 0

2008-07-28 10:21:21 4089 12929 TASKING Task T215 started

2008-07-28 10:21:21 4089 11007 COMMUNIC wait for connection T215

2008-07-28 10:21:21 4089 11561 COMMUNIC Connected T215 localhost 0

2008-07-28 10:21:33 4073 11561 COMMUNIC Connecting T263 localhost 0

2008-07-28 10:21:33 4089 12929 TASKING Task T263 started

2008-07-28 10:21:33 4089 11007 COMMUNIC wait for connection T263

2008-07-28 10:21:33 4089 11561 COMMUNIC Connected T263 localhost 0

2008-07-28 10:21:41 4073 11561 COMMUNIC Connecting T225 localhost 0

I've googled for that sort of problem but the best answer i found is "this can happen if there is a problem with the socket communication" so that the socket is brocken but the jdbc-driver never realizes this and sits there, waiting for an answer.

Now i'm totally hosed, i've no clue where to look next.

Any hint would be appreciated.

Regards

Dirk Lachowski

Add comment
10|10000 characters needed characters exceeded

  • Get RSS Feed

1 Answer

  • Best Answer
    Jul 29, 2008 at 09:32 PM

    Hi Dirk,

    > About half an hour before we can see a dramatic growth in connections. It seems the application starts opening new connections as the available ones are bound and waiting for a call to the db to return:

    > Now i'm totally hosed, i've no clue where to look next.

    >

    > Any hint would be appreciated.

    I propose to check why the client starts to create so many connections.

    If the db calls hang on the client with semaphore operators in the JDBC driver, than perhaps checking the JAVA runtime environment might be paying off.

    It don't sound like a db (kernel) problem at all.

    regards,

    Lars

    BTW: the MaxDB version you use is pretty outdated - get the current patch here in SDN. As this also comes with a new version of the JDBC driver this perhaps improves things for you.

    Add comment
    10|10000 characters needed characters exceeded

    • Former Member Lars Breddemann

      Hi Lars,

      setting up a test-system should be no big deal but creating a high enough load to reproduce the error?

      I'll give it a try in the next weeks.

      Regards,

      Dirk