on 11-17-2008 2:30 AM
Hi,
Recently our production server restarted by itself.
We could not find the reason. Below are the entries from the dev_disp log.
The system stopped around 20:23 and started again at 21:49.
We found log entries such as:
>>Operating system call WSASend failed
>>1 possible network problems detected - check tracefile and adjus |
I searched SMP and found some notes relating to NIPING:
55147 WinNT: Connection reset by peer
500235 Network Diagnosis with NIPING
I would like to know whether anyone has faced the same problem and, if so, how you solved it.
Thanks & Regards
L Raghunahth
DEV_DISP
-
16:44:04 | <<SERVER NAME>> | DP | Q0 | 4 | Connection to user 769 (USR NAME), terminal 49 (GCC81559 ) lost | |||||
16:50:39 | <<SERVER NAME>> | DP | Q0 | I | Operating system call recv failed (error no. 10054) | |||||
16:50:44 | <<SERVER NAME>> | DP | Q0 | 4 | Connection to user 783 (USR NAME), terminal 46 (GCC14636 ) lost | |||||
16:54:36 | <<SERVER NAME>> | DIA | 000 | 100 | K-KURASAWA | SESS | US | 1 | User <usrname> locked due to incorrect logon | |
20:23:15 | <<SERVER NAME>> | DP | Q0 | I | Operating system call recv failed (error no. 10054) | |||||
20:23:19 | <<SERVER NAME>> | DP | Q0 | 4 | Connection to user 1748 (USR NAME), terminal 51 (GCC11196 ) lost | |||||
20:23:38 | <<SERVER NAME>> | DP | Q0 | I | Operating system call recv failed (error no. 10054) | |||||
20:23:39 | <<SERVER NAME>> | DP | Q0 | G | Request (type DIA) cannot be processed | |||||
20:23:39 | <<SERVER NAME>> | DP | Q0 | G | Request (type DIA) cannot be processed | |||||
20:23:39 | <<SERVER NAME>> | DP | Q0 | G | Request (type DIA) cannot be processed | |||||
20:23:39 | <<SERVER NAME>> | DP | Q0 | G | Request (type DIA) cannot be processed | |||||
20:23:39 | <<SERVER NAME>> | DP | Q0 | G | Request (type DIA) cannot be processed | |||||
20:23:39 | <<SERVER NAME>> | DIA | 001 | R0 | Z | The update dispatch info was reset | ||||
20:23:39 | <<SERVER NAME>> | DP | Q0 | G | Request (type DIA) cannot be processed | |||||
20:23:39 | <<SERVER NAME>> | DP | Q0 | G | Request (type DIA) cannot be processed | |||||
20:23:39 | <<SERVER NAME>> | DP | Q0 | G | Request (type DIA) cannot be processed | |||||
20:23:39 | <<SERVER NAME>> | DP | Q0 | G | Request (type DIA) cannot be processed | |||||
20:23:39 | <<SERVER NAME>> | DP | Q0 | G | Request (type DIA) cannot be processed | |||||
20:23:39 | <<SERVER NAME>> | DP | Q0 | G | Request (type DIA) cannot be processed | |||||
20:23:39 | <<SERVER NAME>> | DP | Q0 | G | Request (type DIA) cannot be processed | |||||
20:23:39 | <<SERVER NAME>> | DP | Q0 | G | Request (type DIA) cannot be processed | |||||
20:23:39 | <<SERVER NAME>> | DP | Q0 | G | Request (type DIA) cannot be processed | |||||
20:23:39 | <<SERVER NAME>> | DP | Q0 | G | Request (type DIA) cannot be processed | |||||
20:23:39 | <<SERVER NAME>> | DP | Q0 | G | Request (type DIA) cannot be processed | |||||
20:23:39 | <<SERVER NAME>> | DP | Q0 | G | Request (type DIA) cannot be processed | |||||
20:23:39 | <<SERVER NAME>> | DP | Q0 | G | Request (type DIA) cannot be processed | |||||
20:23:39 | <<SERVER NAME>> | DP | Q0 | G | Request (type DIA) cannot be processed | |||||
20:23:40 | <<SERVER NAME>> | DP | Q0 | I | Operating system call connect failed (error no. 10061) | |||||
20:23:40 | <<SERVER NAME>> | DP | Q0 | N | Failed to send a request to the message server | |||||
20:23:40 | <<SERVER NAME>> | DIA | 001 | R0 | R | The update has been deactivated following a system error | ||||
20:23:40 | <<SERVER NAME>> | DIA | 001 | R0 | Z | The update dispatch info was reset | ||||
20:23:40 | <<SERVER NAME>> | DP | Q0 | N | Failed to send a request to the message server | |||||
20:23:40 | <<SERVER NAME>> | DIA | 001 | R0 | R | The update has been deactivated following a system error | ||||
20:23:41 | <<SERVER NAME>> | DP | Q0 | N | Failed to send a request to the message server | |||||
20:23:41 | <<SERVER NAME>> | RD | Q0 | I | Operating system call recv failed (error no. 10054) | |||||
20:23:45 | <<SERVER NAME>> | DP | Q0 | I | Operating system call connect failed (error no. 10061) | |||||
20:23:56 | <<SERVER NAME>> | DIA | 001 | Q0 | I | Operating system call WSASend failed (error no. 10054) | ||||
20:23:57 | <<SERVER NAME>> | DIA | 001 | Q0 | I | Operating system call connect failed (error no. 10061) | ||||
20:23:58 | <<SERVER NAME>> | UP2 | 030 | Q0 | 2 | Stop Workproc30, PID 9540 | ||||
20:23:58 | <<SERVER NAME>> | DIA | 009 | Q0 | 2 | Stop Workproc 9, PID 10992 | ||||
20:23:58 | <<SERVER NAME>> | DIA | 016 | Q0 | 2 | Stop Workproc16, PID 7360 | ||||
20:23:58 | <<SERVER NAME>> | BTC | 027 | Q0 | 2 | Stop Workproc27, PID 7892 | ||||
20:23:58 | <<SERVER NAME>> | BTC | 028 | Q0 | 2 | Stop Workproc28, PID 9988 | ||||
20:23:58 | <<SERVER NAME>> | DIA | 003 | Q0 | 2 | Stop Workproc 3, PID 10036 | ||||
20:23:58 | <<SERVER NAME>> | DIA | 006 | Q0 | 2 | Stop Workproc 6, PID 4360 | ||||
20:23:58 | <<SERVER NAME>> | BTC | 025 | Q0 | 2 | Stop Workproc25, PID 8528 | ||||
20:23:58 | <<SERVER NAME>> | DIA | 002 | Q0 | 2 | Stop Workproc 2, PID 4336 | ||||
20:23:59 | <<SERVER NAME>> | UP1 | 021 | Q0 | 2 | Stop Workproc21, PID 9652 | ||||
20:23:59 | <<SERVER NAME>> | DIA | 013 | Q0 | 2 | Stop Workproc13, PID 7260 | ||||
20:23:59 | <<SERVER NAME>> | BTC | 026 | Q0 | 2 | Stop Workproc26, PID 1060 | ||||
20:23:59 | <<SERVER NAME>> | DIA | 015 | Q0 | 2 | Stop Workproc15, PID 8388 | ||||
20:23:59 | <<SERVER NAME>> | UP1 | 020 | Q0 | 2 | Stop Workproc20, PID 10048 | ||||
20:23:59 | <<SERVER NAME>> | DIA | 005 | Q0 | 2 | Stop Workproc 5, PID 572 | ||||
20:23:59 | <<SERVER NAME>> | DIA | 011 | Q0 | 2 | Stop Workproc11, PID 7224 | ||||
20:24:00 | <<SERVER NAME>> | DIA | 014 | Q0 | 2 | Stop Workproc14, PID 7956 | ||||
20:24:00 | <<SERVER NAME>> | RD | Q0 | I | Operating system call recv failed (error no. 10054) | |||||
20:24:00 | <<SERVER NAME>> | DIA | 012 | Q0 | 2 | Stop Workproc12, PID 10592 | ||||
20:24:00 | <<SERVER NAME>> | SPO | 029 | Q0 | 2 | Stop Workproc29, PID 10760 | ||||
20:24:00 | <<SERVER NAME>> | DIA | 017 | Q0 | 2 | Stop Workproc17, PID 6264 | ||||
20:24:00 | <<SERVER NAME>> | RD | Q0 | I | Operating system call recv failed (error no. 10054) | |||||
20:24:00 | <<SERVER NAME>> | DIA | 018 | Q0 | 2 | Stop Workproc18, PID 9332 | ||||
20:24:00 | <<SERVER NAME>> | RD | Q0 | I | Operating system call recv failed (error no. 10054) | |||||
20:24:00 | <<SERVER NAME>> | RD | Q0 | I | Operating system call recv failed (error no. 10054) | |||||
20:24:00 | <<SERVER NAME>> | RD | Q0 | I | Operating system call recv failed (error no. 10054) | |||||
20:24:00 | <<SERVER NAME>> | RD | Q0 | I | Operating system call recv failed (error no. 10054) | |||||
20:24:00 | <<SERVER NAME>> | RD | Q0 | I | Operating system call recv failed (error no. 10054) | |||||
20:24:00 | <<SERVER NAME>> | DIA | 008 | Q0 | 2 | Stop Workproc 8, PID 9592 | ||||
20:24:00 | <<SERVER NAME>> | RD | Q0 | I | Operating system call recv failed (error no. 10054) | |||||
20:24:01 | <<SERVER NAME>> | BTC | 024 | Q0 | 2 | Stop Workproc24, PID 6832 | ||||
20:24:01 | <<SERVER NAME>> | DIA | 019 | Q0 | 2 | Stop Workproc19, PID 1576 | ||||
20:24:01 | <<SERVER NAME>> | DIA | 004 | Q0 | 2 | Stop Workproc 4, PID 9968 | ||||
20:24:01 | <<SERVER NAME>> | RD | Q0 | I | Operating system call recv failed (error no. 10054) | |||||
20:24:01 | <<SERVER NAME>> | RD | Q0 | I | Operating system call recv failed (error no. 10054) | |||||
20:24:01 | <<SERVER NAME>> | DIA | 007 | Q0 | 2 | Stop Workproc 7, PID 10464 | ||||
20:24:01 | <<SERVER NAME>> | UP1 | 023 | Q0 | 2 | Stop Workproc23, PID 9512 | ||||
20:24:01 | <<SERVER NAME>> | RD | Q0 | I | Operating system call recv failed (error no. 10054) | |||||
20:24:02 | <<SERVER NAME>> | UP1 | 022 | Q0 | 2 | Stop Workproc22, PID 8328 | ||||
20:24:02 | <<SERVER NAME>> | RD | Q0 | I | Operating system call recv failed (error no. 10054) | |||||
20:24:02 | <<SERVER NAME>> | RD | Q0 | I | Operating system call recv failed (error no. 10054) | |||||
20:24:02 | <<SERVER NAME>> | RD | Q0 | I | Operating system call recv failed (error no. 10054) | |||||
20:24:02 | <<SERVER NAME>> | RD | Q0 | I | Operating system call recv failed (error no. 10054) | |||||
20:24:03 | <<SERVER NAME>> | DIA | 001 | Q0 | I | Operating system call connect failed (error no. 10061) | ||||
20:24:04 | <<SERVER NAME>> | DIA | 010 | Q0 | 2 | Stop Workproc10, PID 9764 | ||||
20:24:09 | <<SERVER NAME>> | DIA | 001 | Q0 | I | Operating system call connect failed (error no. 10061) | ||||
20:24:15 | <<SERVER NAME>> | DIA | 001 | Q0 | I | Operating system call connect failed (error no. 10061) | ||||
20:24:20 | <<SERVER NAME>> | DIA | 001 | GI | 0 | Error calling the central lock handler | ||||
20:24:20 | <<SERVER NAME>> | DIA | 001 | GI | 3 | > Failed to clean up lock entries | ||||
20:24:20 | <<SERVER NAME>> | DIA | 001 | Q0 | 2 | Stop Workproc 1, PID 9852 | ||||
20:24:20 | <<SERVER NAME>> | RD | Q0 | I | Operating system call recv failed (error no. 10054) | |||||
20:24:57 | <<SERVER NAME>> | DIA | 000 | 100 | <USRNME> | Y_GE | Q0 | 2 | Stop Workproc 0, PID 8212 | |
20:24:57 | <<SERVER NAME>> | RD | S3 | 0 | SAP gateway was closed | |||||
20:25:24 | <<SERVER NAME>> | DP | Q0 | 5 | Stop SAP System, Dispatcher Pid 68 | |||||
21:49:37 | <<SERVER NAME>> | DP | E1 | 0 | Buffer SCSA Generated with Length 4096 | |||||
21:49:37 | <<SERVER NAME>> | DP | Q0 | 0 | Start SAP System, SAPSYSTEM 02, Dispatcher Pid 8116 | |||||
21:49:42 | <<SERVER NAME>> | DP | GZ | Z | > 1 possible network problems detected - check tracefile and adjus | |||||
21:49:43 | <<SERVER NAME>> | RD | S0 | 0 | SAP Gateway Started (PID: 6368) | |||||
21:49:43 | <<SERVER NAME>> | WRK | 000 | Q0 | Q | Start Workproc 1, Pid 11060 | ||||
21:49:43 | <<SERVER NAME>> | WRK | 000 | Q0 | Q | Start Workproc 4, Pid 9344 |
21:49:43 | <<SERVER NAME>> | WRK | 000 | Q0 | Q | Start Workproc 0, Pid 10656 | ||||
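For anyone reading a log like this, a quick tally of the failing OS calls can help. The sketch below is only an illustration, not an SAP tool: the function names are made up, and it assumes the lines keep the "Operating system call X failed (error no. N)" wording shown above. The two codes in the log are standard Winsock values: 10054 is WSAECONNRESET (the peer reset the connection) and 10061 is WSAECONNREFUSED (nothing was listening on the target port).

```python
import re
from collections import Counter

# Standard Winsock error codes that appear in the log above.
WINSOCK_ERRORS = {
    10054: "WSAECONNRESET (connection reset by peer)",
    10061: "WSAECONNREFUSED (connection refused)",
}

# Matches e.g. "Operating system call recv failed (error no. 10054)".
OS_CALL = re.compile(r"Operating system call (\w+) failed \(error no\. (\d+)\)")


def tally_os_call_errors(lines):
    """Count (call, error number) pairs found in the given log lines."""
    counts = Counter()
    for line in lines:
        match = OS_CALL.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


def describe(counts):
    """Render the tally as readable strings, most frequent first."""
    return [
        f"{call} failed with {code} "
        f"{WINSOCK_ERRORS.get(code, '(code not in table)')}: {n}x"
        for (call, code), n in counts.most_common()
    ]


if __name__ == "__main__":
    # Three lines copied from the log above, used as sample input.
    sample = [
        "20:23:15 | <<SERVER NAME>> | DP | Q0 | I | Operating system call recv failed (error no. 10054) | |||||",
        "20:23:40 | <<SERVER NAME>> | DP | Q0 | I | Operating system call connect failed (error no. 10061) | |||||",
        "20:23:56 | <<SERVER NAME>> | DIA | 001 | Q0 | I | Operating system call WSASend failed (error no. 10054) | ||||",
    ]
    for row in describe(tally_os_call_errors(sample)):
        print(row)
```

In this log the resets and refusals start at the moment the instance goes down, so they are more likely a symptom of the peers disappearing than a cause.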
Usually on Windows we have to reboot servers after a while; looks like yours does it automatically! Just kidding.
As Markus already said, if Windows rebooted by itself, it is very unlikely that a network card caused it. Look at the Event Viewer, because the cause should be there. When the problem is OS- or hardware-related, the SAP logs will show only consequences, not causes.
Hi,
Thanks everyone for your valuable replies.
The OS log is in Japanese, so I have translated the entries from the time of the incident.
There are more entries after 21:45; if you need them, I will translate those too.
-
2008/10/30,20:23:34,Service Control Manager,Error,none,7034,N/A,SRV15752,SAPOsCol service terminated unexpectedly. It has done this 1 time(s).
2008/10/30,20:23:36,ClusSvc,Error,Failover Manager,1069,N/A,SRV15752,Resource Group 'SAP SRV' Cluster Resource 'SAP SRV SAPOsCol' has failed.
2008/10/30,20:23:38,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,SRV15752,The SAPSRV_01 service was successfully sent a Stop control.
2008/10/30,20:23:38,Service Control Manager,Info,none,7036,N/A,SRV15752,The SAPSRV_01 service entered the stopped state.
2008/10/30,20:23:38,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,SRV15752,The SAPSRV_00 service was successfully sent a Stop control.
2008/10/30,20:23:38,Service Control Manager,Info,none,7036,N/A,SRV15752,The SAPSRV_00 service entered the stopped state.
2008/10/30,20:24:09,Service Control Manager,Error,none,7011,N/A,SRV15752,Timeout (3000 milliseconds) waiting for transaction response from the SAPSRV_02 service.
2008/10/30,20:24:39,Service Control Manager,Error,none,7011,N/A,SRV15752, Timeout (3000 milliseconds) waiting for transaction response from the service.
2008/10/30,20:24:40,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,SRV15752,SAPSRV_10 successfully sent a Stop control
The Cluster Service failed to bring the Resource Group "Cluster Group" completely online or offline.
2008/10/30,20:24:40,Service Control Manager,Info,none,7036,N/A,SRV15752,SAPSRV_10 Service entered the stopped state.
2008/10/30,20:24:45,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,SRV15752,SAPOsCol successfully sent a Start control
2008/10/30,20:24:45,Service Control Manager,Info,none,7036,N/A,SRV15752,SAPOsCol service entered the running state.
2008/10/30,20:24:46,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,SRV15752,SAPSRV_01 successfully sent a Start control
2008/10/30,20:24:47,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,SRV15752,SAPSRV_00 successfully sent a Start control
2008/10/30,20:24:47,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,SRV15752,SAPSRV_02 successfully sent a Start control
2008/10/30,20:24:47,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,SRV15752,SAPSRV_10 successfully sent a Start control
2008/10/30,20:24:49,Service Control Manager,Info,none,7036,N/A,SRV15752,SAPSRV_01 service entered the running state.
2008/10/30,20:24:50,Service Control Manager,Info,none,7036,N/A,SRV15752,SAPSRV_00 service entered the running state.
2008/10/30,20:24:50,Service Control Manager,Info,none,7036,N/A,SRV15752,SAPSRV_02 service entered the running state.
2008/10/30,20:24:51,Service Control Manager,Info,none,7036,N/A,SRV15752,SAPSRV_10 service entered the running state.
2008/10/30,20:25:04,ClusSvc,Error,Failover Manager ,1069,N/A,SRV15752,Resource Group 'SAP SRV' cluster resource 'SAP SRV 02 Instance' failed.
2008/10/30,20:25:25,ClusSvc,Error,Failover Manager ,1069,N/A,SRV15752,Resource Group 'SAP SRV' cluster resource 'SAP SRV 02 Service' failed.
2008/10/30,20:25:25,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,SRV15752,SAPSRV_02 successfully sent a Stop control
2008/10/30,20:25:25,Service Control Manager,Info,none,7036,N/A,SRV15752,SAPSRV_02 Service entered the stopped state.
2008/10/30,20:25:26,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,SRV15752,SAPSRV_02 successfully sent a Start control
2008/10/30,20:25:28,Service Control Manager,Info,none,7036,N/A,SRV15752,SAPSRV_02 service entered the running state.
2008/10/30,21:45:41,ClusSvc,Info,Failover Manager ,1205,N/A,SRV15752, The Cluster Service failed to bring the Resource Group "SAP SRV" completely online or offline.
2008/10/30,21:45:41,ClusSvc,Warning,Failover Manager ,1146,N/A,SRV15752,The cluster resource monitor died unexpectedly, an attempt will be made to restart it.
-
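In case it helps others read an export like this, here is a rough sketch that pulls out only the Error rows in order. It assumes the comma-separated layout of the lines above; the field names (date, time, source, level, category, event ID, user, computer, message) are my own labels and not official column headers. The message field can itself contain commas, so the split is limited to the first eight separators, and lines that do not fit the layout are skipped.

```python
from typing import NamedTuple, Optional


class EventRow(NamedTuple):
    # Field names are illustrative labels for the nine columns seen above.
    date: str
    time: str
    source: str
    level: str
    category: str
    event_id: int
    user: str
    computer: str
    message: str


def parse_event_line(line: str) -> Optional[EventRow]:
    """Split one exported line; return None if it does not fit the layout."""
    # Split at most 8 times so commas inside the message are preserved.
    parts = [p.strip() for p in line.strip().split(",", 8)]
    if len(parts) != 9 or not parts[5].isdigit():
        return None
    return EventRow(*parts[:5], int(parts[5]), *parts[6:])


def errors_in_order(lines):
    """Return only the rows whose level is 'Error', in their original order."""
    rows = (parse_event_line(line) for line in lines)
    return [row for row in rows if row is not None and row.level == "Error"]


if __name__ == "__main__":
    # Two lines copied from the export above, used as sample input.
    sample = [
        "2008/10/30,20:23:34,Service Control Manager,Error,none,7034,N/A,SRV15752,SAPOsCol service terminated unexpectedly. It has done this 1 time(s).",
        "2008/10/30,20:23:38,Service Control Manager,Info,none,7036,N/A,SRV15752,The SAPSRV_01 service entered the stopped state.",
    ]
    for row in errors_in_order(sample):
        print(row.time, row.source, row.event_id, row.message)
```

Applied to the export above, the first Error row is the SAPOsCol service terminating at 20:23:34, just before the cluster starts stopping the SAP services.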
Regards,
Raghunahth L
Hi,
According to your OS log, it seems you are running an MSCS cluster.
We had similar problems (SAP bouncing / being restarted) in our clusters, which were solved by updating the SAP cluster resource DLLs saprc.dll and saprcex.dll according to SAP note [1043592|http://service.sap.com/sap/support/notes/1043592].
Maybe this could help you.
By the way, it does no harm to check the versions of the DLLs and update them to the newest available.
Regards
Rolf
Hi,
No, we did not uninstall SP2.
We installed updated versions of the SAP DLLs.
Start by downloading the newest NTCLUST.SAR file matching your environment from the Service Marketplace.
(It can be found in the database-independent kernel folder.)
Then follow only step 1 of the solution in SAP note [867521|http://service.sap.com/sap/support/notes/867521].
(This note describes other things to do, but the important part here is how to replace the files from the NTCLUST.SAR file.)
Hope this helps.
Regards
Rolf
Can we see the work process event log?
> can we see the work process event log
If an operating system crashes or reboots, the reason should be in the event log (in the case of Windows). Windows 2003 writes something like "the operating system has been rebooted after an unexpected shutdown" when this happens. The messages before that entry should give an idea of what happened.
The work process is an application running on TOP of the OS; it won't have any information about what happened.
Markus
What is in your Windows event log? I doubt that the problem is caused by the network card.
Markus
Hi Raghunath,
We faced a similar problem. In our case it was a Windows patch problem: we were running 4.7 on Windows, and one application server used to restart by itself with a similar message. The problem was resolved by applying patches at OS level.
You can also check the message server and gateway logs, but you might not find anything substantial related to SAP other than the network problem.
Ardhian is quite right; it can be caused by a network problem as well.
Hope this information helps you.
cheers
dEE
Edited by: Deep Kwatra on Nov 17, 2008 1:44 PM
hi,
Please check your network connection. Also check your NIC. If possible, try replacing it.
ardhian