We have a portal running on the latest JDK (Linux x86_64, IBM JDK SR9) with the recommended parameters set, and we are still facing lots of stability problems. The portal has three server processes, and what we currently see is that two of them each consume 100 % CPU, although no one is logged into the portal at the moment. I can see "strange" messages in the default trace, such as
-
predecessor system -
com.sap.engine.services.applocking.exception.AppLockingTechnicalLockException: The lifetime can not be the user-session, because currently no user is logged in. Creating locks for the system-user is not allowed.
at com.sap.engine.services.applocking.AbstractBaseLocking.getObjectForLifetime(AbstractBaseLocking.java:483)
at com.sap.engine.services.applocking.LogicalLockingImpl.getCurrentLifetimeDescription(LogicalLockingImpl.java:193)
at com.sap.engine.services.applocking.AbstractBaseLocking.lockInternal(AbstractBaseLocking.java:128)
at com.sap.engine.services.applocking.LogicalLockingImpl.lock(LogicalLockingImpl.java:43)
at com.sap.engine.services.applocking.NamespaceLogicalLockingImpl.lock(NamespaceLogicalLockingImpl.java:47)
at com.sap.engine.services.applocking.LogicalLocking_Stub.lock(LogicalLocking_Stub.java:65)
at com.sapportals.wcm.service.taskqueue.TaskQueue$Lock.lock(TaskQueue.java:316)
at com.sapportals.wcm.service.taskqueue.TaskQueue$Lock.writeLock(TaskQueue.java:293)
at com.sapportals.wcm.service.taskqueue.TaskQueue.writeLock(TaskQueue.java:467)
at com.sapportals.wcm.service.taskqueue.TaskQueue.get(TaskQueue.java:122)
at com.sapportals.wcm.service.taskqueue.TaskQueueReader.get(TaskQueueReader.java:68)
at com.sapportals.wcm.service.taskqueue.TaskQueueReader.get(TaskQueueReader.java:58)
at com.sapportals.wcm.service.jobprocessor.LocalDispatcher.stopJobs(LocalDispatcher.java:161)
at com.sapportals.wcm.service.jobprocessor.LocalDispatcher.run(LocalDispatcher.java:101)
at java.lang.Thread.run(Thread.java:838)
The two server processes are "looping" somewhere, and the filesystem gets trashed with these log files: 4 or 5 new default.trc.X files each minute.
Restarting the portal is NOT an option, since the problem will sooner or later arise again.
What can we as a CUSTOMER do, aside from creating thread dumps, attaching them to an OSS call and "waiting"? How can we find out what the system is doing?
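One thing that could at least be done on the customer side is to summarize the trace flood, to see which exception dominates and which background thread keeps triggering it. The following is only a rough sketch, not an SAP tool: it assumes that in the raw default.trc.X files each exception class and each "at ..." frame starts on its own line (unlike the wrapped paste above), and the function name `summarize` and the script name in the usage line are made up for illustration.

```python
import re
import sys
from collections import Counter

# Assumption: an exception line starts with a fully qualified class name
# ending in "Exception" or "Error", as in the trace excerpt above.
EXCEPTION_RE = re.compile(
    r"^\s*((?:[A-Za-z_$][\w$]*\.)+[A-Za-z_$][\w$]*(?:Exception|Error))\b"
)
# Assumption: a stack frame looks like "at pkg.Class.method(File.java:123)".
FRAME_RE = re.compile(r"^\s*at\s+([\w$.<>]+)\(")


def summarize(lines):
    """Count exception classes and the outermost non-JDK frame of each trace."""
    exceptions = Counter()
    entry_points = Counter()
    frames = []
    collecting = False

    def flush():
        # Walk from the bottom of the stack upwards and record the first
        # frame that is not part of the JDK (e.g. skip java.lang.Thread.run).
        for frame in reversed(frames):
            if not frame.startswith(("java.", "javax.", "sun.")):
                entry_points[frame] += 1
                break
        frames.clear()

    for line in lines:
        exc = EXCEPTION_RE.match(line)
        if exc:
            flush()
            exceptions[exc.group(1)] += 1
            collecting = True
            continue
        frame = FRAME_RE.match(line) if collecting else None
        if frame:
            frames.append(frame.group(1))
        else:
            # Any other line ends the current stack trace.
            flush()
            collecting = False
    flush()
    return exceptions, entry_points


if __name__ == "__main__":
    total_exc, total_entry = Counter(), Counter()
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            exc, entry = summarize(handle)
        total_exc.update(exc)
        total_entry.update(entry)
    print("Most frequent exceptions:")
    for name, count in total_exc.most_common(10):
        print(f"  {count:8d}  {name}")
    print("Most frequent entry points:")
    for name, count in total_entry.most_common(10):
        print(f"  {count:8d}  {name}")
```

It could be run as, for example, `python summarize_trc.py default.trc.*`. For the trace above, the entry point would come out as com.sapportals.wcm.service.jobprocessor.LocalDispatcher.run, which only tells us which background job is involved, not why it loops.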
--
Markus