Hi everyone,
I have a small question about an HA failover test scenario.
I installed an NW 7.5 Java stack on Windows and configured two nodes for HA; each node has a Java instance. When I kill the msg_server.exe process, I expect the message server to be restarted automatically, and it is. After that restart, however, both Java instances went into yellow state, and they came back to green after about 3 minutes.
But when we switch the SCS over from one node to the other manually, the Java instances stay green the whole time.
So may I ask: is it normal for the Java instances to restart after we kill the msg_server process?
Here are some logs found in the work directory:
########################################
Dev_datcol
F ********************************************************************************
F Process datcol started with pid 22040
F ********************************************************************************
F [Thr 22388] *** LOG => Process datcol started (pid 22040).
F
F [Thr 22388] Fri Aug 12 15:27:27 2016
F [Thr 22388] *** LOG => Process datcol stopping (pid 22040).
F
F [Thr 20752] Fri Aug 12 15:27:27 2016
F [Thr 20752] *** LOG => Signal 13 SIGCHLD.
F [Thr 22388] *** LOG => Process datcol stopped (pid 22040).
F [Thr 22388] *** LOG => exiting (exitcode 0, retcode 0).
Jvm_datcol
Aug 12, 2016 3:27:27 PM com.sap.engine.datcol.Task error
SEVERE: while trying to get the length of a null array loaded from a local variable at slot 6
java.lang.NullPointerException: while trying to get the length of a null array loaded from a local variable at slot 6
at com.sap.engine.datcol.internal.Scanner.scanPattern(Scanner.java:67)
at com.sap.engine.datcol.internal.Scanner.scan(Scanner.java:31)
at com.sap.engine.datcol.tasks.Copy.execute(Copy.java:50)
at com.sap.engine.datcol.Task.perform(Task.java:96)
at com.sap.engine.datcol.internal.DataSet.execute(DataSet.java:28)
at com.sap.engine.datcol.internal.DataCollectorApp.run(DataCollectorApp.java:191)
at com.sap.engine.datcol.internal.DataCollectorApp.main(DataCollectorApp.java:31)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.sap.engine.offline.OfflineToolStart.main(OfflineToolStart.java:162)
Dev_icm
[Thr 11044] Fri Aug 12 15:26:54 2016
[Thr 11044] JNCMReconnectAsync: successfully reconnected to message server. Waiting 30 sec for consistent cluster.
[Thr 11044] Fri Aug 12 15:27:15 2016
[Thr 11044] *** WARNING => P4RecvHandShake: read failed: NIECONN_BROKEN(-6) [p4_plg.c 3158]
[Thr 11044] *** WARNING => P4PlugInReadHandler(id=6/16781): P4RecvHandShake failed: Network error (NI)(-8) [p4_plg.c 1245]
[Thr 11044] *** WARNING => P4RecvHandShake: read failed: NIECONN_BROKEN(-6) [p4_plg.c 3158]
[Thr 11044] *** WARNING => P4PlugInReadHandler(id=1/16780): P4RecvHandShake failed: Network error (NI)(-8) [p4_plg.c 1245]
[Thr 11044] Fri Aug 12 15:27:29 2016
[Thr 11044] JNCMReconnectAsync: inconsistent cluster reconnect [2 nodes are still not connected]
[Thr 11044] JNCMIReconnectMerge: can't find node [17181751] in reconnect list -> element loss
[Thr 11044] JNCMIReconnectMerge: can't find node [17181750] in reconnect list -> element loss
[Thr 11044] JNCMIHttpMsPutLogon: set http logon port (port:50100) (lbcount: 2)
[Thr 11044] JNCMIHttpMsPutLogon: set https logon port (port:0) (lbcount: 2)
[Thr 11044] JNCMIP4MsPutLogon: set p4 logon port (port:50104) (lbcount: 2)
[Thr 11044] JNCMIIIOPMsPutLogon: set iiop logon port (port:50107) (lbcount: 2)
[Thr 11044] JNCMITelnetMsPutLogon: set telnet logon port (port:50108) (lbcount: 2)
[Thr 11044] JNCMIHttpMsPutLogon: set http logon port (port:50100) (lbcount: 1)
[Thr 11044] JNCMIHttpMsPutLogon: set https logon port (port:0) (lbcount: 1)
[Thr 11044] Fri Aug 12 15:36:12 2016
[Thr 11044] *** ERROR => can't delete node [cluster id:24168720] [jncmxx.c 2268]
Dev_server0
J Fri Aug 12 15:26:51 2016
J Heap
J par new generation reserved 1397760K, committed 1397760K, used 405765K [0x00000006f0000000, 0x0000000745500000, 0x0000000745500000)
J eden space 1048320K, 22% used [0x00000006f0000000, 0x00000006fe1a1568, 0x000000072ffc0000)
J from space 174720K, 100% used [0x000000073aa60000, 0x0000000745500000, 0x0000000745500000)
J to space 174720K, 0% used [0x000000072ffc0000, 0x000000072ffc0000, 0x000000073aa60000)
J concurrent mark-sweep generation reserved 2796544K, committed 2796544K, used 372127K [0x0000000745500000, 0x00000007f0000000, 0x00000007f0000000)
J Metaspace used 323690K, capacity 359164K, committed 359808K, reserved 575488K
J class space used 38446K, capacity 47840K, committed 48124K, reserved 262144K
F
F [Thr 20432] Fri Aug 12 15:26:52 2016
F [Thr 20432] *** LOG => SfCJavaVm: exit hook is called. (rc = 11114)
F
F ********************************************************************************
F *** ERROR => Java node 'server0' terminated with exit code 11114.
F ***
F *** Please see section 'Java program exit codes'
F *** in SAP Note 1316652 for additional information and trouble shooting advice.
F ********************************************************************************
F
F [Thr 20432] *** LOG => exiting (exitcode 11114, retcode 1).
M [Thr 20432] CCMS: CCMS Monitoring Cleanup finished successfully.
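To line up the events across these traces, it can help to pull out only the timestamped ERROR and WARNING lines. The following is just a sketch in Python: the function name `extract_events` and both regular expressions are assumptions based on the line format visible in the excerpts above, not part of any SAP tool, and the timestamp parsing assumes English day and month names.

```python
import re
from datetime import datetime

# Timestamp lines in the dev_* excerpts look like "Fri Aug 12 15:26:52 2016",
# optionally preceded by a component letter and a thread tag.
TS_RE = re.compile(r"(\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4})\s*$")
# Error and warning lines look like "*** ERROR => message".
EVENT_RE = re.compile(r"\*\*\* (ERROR|WARNING) => (.*)")


def extract_events(trace_text):
    """Return (timestamp, level, message) for each ERROR/WARNING line.

    The timestamp is the most recent timestamp line seen before the event,
    or None if no timestamp line has appeared yet.
    """
    events = []
    current = None
    for line in trace_text.splitlines():
        ts = TS_RE.search(line)
        if ts:
            # Collapse repeated spaces (single-digit days) before parsing.
            stamp = " ".join(ts.group(1).split())
            current = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
            continue
        ev = EVENT_RE.search(line)
        if ev:
            events.append((current, ev.group(1), ev.group(2).strip()))
    return events
```

Running it over the text of each trace file and sorting the combined results by timestamp gives a single timeline, which makes it easier to see that server0 exited with code 11114 at 15:26:52, shortly before the ICM reported the successful reconnect to the message server at 15:26:54.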
Hi Minas.
"So may I ask: is it normal for the Java instances to restart after we kill the msg_server process?"
It will not restart in a Windows failover cluster environment. You can do a manual failover.
You can use both nodes: one node holds the SAP Java group and the other holds the database.
Refer to the HA FAQ: High Availability - Frequently Asked Questions
BR
SS