Former Member

AS Java HA failover test on Windows

Hi everyone,

I have a small question about an HA failover test scenario.

I installed an NW 7.5 Java stack on Windows and configured two nodes for HA; each node has a Java instance. When I kill the msg_server.exe process, I expect the message server to restart automatically, and it does. After the restart, however, both Java instances go into yellow state, and they come back to green after 3 minutes.

When we switch the SCS over from one node to the other manually, the Java instances stay green the whole time.

So may I ask: is it normal for the Java instances to restart after we kill the msg_server process?

Here are some logs from the work directory:

########################################

Dev_datcol

F ********************************************************************************

F Process datcol started with pid 22040

F ********************************************************************************

F [Thr 22388] *** LOG => Process datcol started (pid 22040).

F

F [Thr 22388] Fri Aug 12 15:27:27 2016

F [Thr 22388] *** LOG => Process datcol stopping (pid 22040).

F

F [Thr 20752] Fri Aug 12 15:27:27 2016

F [Thr 20752] *** LOG => Signal 13 SIGCHLD.

F [Thr 22388] *** LOG => Process datcol stopped (pid 22040).

F [Thr 22388] *** LOG => exiting (exitcode 0, retcode 0).

Jvm_datcol

Aug 12, 2016 3:27:27 PM com.sap.engine.datcol.Task error

SEVERE: while trying to get the length of a null array loaded from a local variable at slot 6

java.lang.NullPointerException: while trying to get the length of a null array loaded from a local variable at slot 6

at com.sap.engine.datcol.internal.Scanner.scanPattern(Scanner.java:67)

at com.sap.engine.datcol.internal.Scanner.scan(Scanner.java:31)

at com.sap.engine.datcol.tasks.Copy.execute(Copy.java:50)

at com.sap.engine.datcol.Task.perform(Task.java:96)

at com.sap.engine.datcol.internal.DataSet.execute(DataSet.java:28)

at com.sap.engine.datcol.internal.DataCollectorApp.run(DataCollectorApp.java:191)

at com.sap.engine.datcol.internal.DataCollectorApp.main(DataCollectorApp.java:31)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:497)

at com.sap.engine.offline.OfflineToolStart.main(OfflineToolStart.java:162)

Dev_icm

[Thr 11044] Fri Aug 12 15:26:54 2016

[Thr 11044] JNCMReconnectAsync: successfully reconnected to message server. Waiting 30 sec for consistent cluster.

[Thr 11044] Fri Aug 12 15:27:15 2016

[Thr 11044] *** WARNING => P4RecvHandShake: read failed: NIECONN_BROKEN(-6) [p4_plg.c 3158]

[Thr 11044] *** WARNING => P4PlugInReadHandler(id=6/16781): P4RecvHandShake failed: Network error (NI)(-8) [p4_plg.c 1245]

[Thr 11044] *** WARNING => P4RecvHandShake: read failed: NIECONN_BROKEN(-6) [p4_plg.c 3158]

[Thr 11044] *** WARNING => P4PlugInReadHandler(id=1/16780): P4RecvHandShake failed: Network error (NI)(-8) [p4_plg.c 1245]

[Thr 11044] Fri Aug 12 15:27:29 2016

[Thr 11044] JNCMReconnectAsync: inconsistent cluster reconnect [2 nodes are still not connected]

[Thr 11044] JNCMIReconnectMerge: can't find node [17181751] in reconnect list -> element loss

[Thr 11044] JNCMIReconnectMerge: can't find node [17181750] in reconnect list -> element loss

[Thr 11044] JNCMIHttpMsPutLogon: set http logon port (port:50100) (lbcount: 2)

[Thr 11044] JNCMIHttpMsPutLogon: set https logon port (port:0) (lbcount: 2)

[Thr 11044] JNCMIP4MsPutLogon: set p4 logon port (port:50104) (lbcount: 2)

[Thr 11044] JNCMIIIOPMsPutLogon: set iiop logon port (port:50107) (lbcount: 2)

[Thr 11044] JNCMITelnetMsPutLogon: set telnet logon port (port:50108) (lbcount: 2)

[Thr 11044] JNCMIHttpMsPutLogon: set http logon port (port:50100) (lbcount: 1)

[Thr 11044] JNCMIHttpMsPutLogon: set https logon port (port:0) (lbcount: 1)

[Thr 11044] Fri Aug 12 15:36:12 2016

[Thr 11044] *** ERROR => can't delete node [cluster id:24168720] [jncmxx.c 2268]

Dev_server0

J Fri Aug 12 15:26:51 2016

J Heap

J par new generation reserved 1397760K, committed 1397760K, used 405765K [0x00000006f0000000, 0x0000000745500000, 0x0000000745500000)

J eden space 1048320K, 22% used [0x00000006f0000000, 0x00000006fe1a1568, 0x000000072ffc0000)

J from space 174720K, 100% used [0x000000073aa60000, 0x0000000745500000, 0x0000000745500000)

J to space 174720K, 0% used [0x000000072ffc0000, 0x000000072ffc0000, 0x000000073aa60000)

J concurrent mark-sweep generation reserved 2796544K, committed 2796544K, used 372127K [0x0000000745500000, 0x00000007f0000000, 0x00000007f0000000)

J Metaspace used 323690K, capacity 359164K, committed 359808K, reserved 575488K

J class space used 38446K, capacity 47840K, committed 48124K, reserved 262144K

F

F [Thr 20432] Fri Aug 12 15:26:52 2016

F [Thr 20432] *** LOG => SfCJavaVm: exit hook is called. (rc = 11114)

F

F ********************************************************************************

F *** ERROR => Java node 'server0' terminated with exit code 11114.

F ***

F *** Please see section 'Java program exit codes'

F *** in SAP Note 1316652 for additional information and trouble shooting advice.

F ********************************************************************************

F

F [Thr 20432] *** LOG => exiting (exitcode 11114, retcode 1).

M [Thr 20432] CCMS: CCMS Monitoring Cleanup finished successfully.
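
In case it helps anyone reading traces like these, here is a minimal sketch for pulling the node termination events and their timestamps out of a dev_server0-style trace. It assumes the layout seen in the excerpt above (an optional one-letter prefix, an optional `[Thr …]` tag, timestamps on their own line, English day and month names); the function name `find_node_exits` and the regular expressions are my own and are not part of any SAP tool.

```python
import re
from datetime import datetime

# Timestamp lines look like "[Thr 20432] Fri Aug 12 15:26:52 2016" once the
# one-letter component prefix is dropped; the thread tag is optional.
TIMESTAMP = re.compile(
    r"^(?:\[Thr \d+\] )?(\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4})$"
)
NODE_EXIT = re.compile(r"Java node '([^']+)' terminated with exit code (-?\d+)")


def find_node_exits(lines):
    """Return (timestamp, node, exit_code) for every node termination message."""
    current = None  # most recent timestamp seen so far
    exits = []
    for raw in lines:
        # Strip the component prefix ("F ", "J ", "M ") if the line has one.
        line = re.sub(r"^[A-Z] ", "", raw.strip())
        stamp = TIMESTAMP.match(line)
        if stamp:
            current = datetime.strptime(stamp.group(1), "%a %b %d %H:%M:%S %Y")
            continue
        hit = NODE_EXIT.search(line)
        if hit:
            exits.append((current, hit.group(1), int(hit.group(2))))
    return exits


# Lines copied from the dev_server0 excerpt above.
sample = [
    "J Fri Aug 12 15:26:51 2016",
    "J Heap",
    "F",
    "F [Thr 20432] Fri Aug 12 15:26:52 2016",
    "F [Thr 20432] *** LOG => SfCJavaVm: exit hook is called. (rc = 11114)",
    "F *** ERROR => Java node 'server0' terminated with exit code 11114.",
]

for when, node, code in find_node_exits(sample):
    print(when, node, code)
```

On the excerpt above this reports the server0 exit under the 15:26:52 timestamp, two seconds before dev_icm logs the reconnect to the message server at 15:26:54, which makes it easier to line the traces up against each other.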


1 Answer

  • Posted on Aug 23, 2016 at 12:44 PM

    Hi Minas.

    so may i ask is it normal after we kill msg_server process, java instance will restart ?

    It will not restart in a Windows failover cluster environment. You can do a manual failover.

    You can use both nodes: one node holds the SAP Java group and the other holds the database.

    Refer to the HA FAQ: High Availability - Frequently Asked Questions

    BR

    SS


    • Hi Minas.

      Yes, the ERS keeps a replica of the SAP enqueue lock table and is normally active on both nodes in Windows MSCS. If you restart the ERS, it will restart the MSG and SCS instances.

      If you want to test the Windows failover cluster, you can unplug the MSCS cluster network cable (don't remove the cluster heartbeat card) from the node where the SAP group is currently active. This will fail over all cluster resources to the other node.

      BR

      SS
