on 07-01-2013 1:34 PM
Hi,
We were upgrading the SAP kernel from patch level 236 to 254. The upgrade was successful in our dev and test systems, but we ran into a problem in Production.
Our ECC version is 6.0 and the kernel release is 700. In Production, we have a Solaris cluster in place, and the SAP and DB resources are clustered across 2 servers for high availability. The process we followed for the kernel upgrade in Production is as follows:
1) Extracted the 2 SAR files downloaded from SAP Service Marketplace to a new directory exe_new under /sapmnt/<SID>
2) Stopped the 4 application servers
3) Stopped SAP resources and Oracle resources in cluster
4) Mounted the file system /sapmnt/<SID> on the server hosting the central instance
5) Stopped all services running under <SIDADM> in central instance and application server
6) Renamed exe folder in /sapmnt/<PRD> to exe_backup
7) Renamed exe_new folder in /sapmnt/<SID> to exe
8) Started Oracle resources in cluster, which was successful
9) Started SAP resources in cluster, which was not getting started
Can someone please look at the issue and tell us whether we missed any steps?
Regards,
BIJOY
Hi,
I forgot icmbnd; please check it. icmbnd must be owned by root with the setuid bit set, like this:
-rwsr-x--- 1 root sapsys 4107696 May 28 07:23 icmbnd
Best regards
Willi Eimler
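The permission listing above (the `s` in `-rwsr-x---`) can be checked mechanically. The following is a minimal sketch, not an SAP-provided tool; `check_setuid` is a hypothetical helper name, and it only reports the setuid bit and the owner rather than changing anything.

```shell
#!/bin/sh
# Report whether a binary has the setuid bit set, and who owns it.
# check_setuid is a hypothetical helper for illustration only.
check_setuid() {
    file=$1
    if [ ! -e "$file" ]; then
        echo "missing: $file"
        return 2
    fi
    # Third column of `ls -ld` is the owning user.
    owner=$(ls -ld "$file" | awk '{ print $3 }')
    if [ -u "$file" ]; then
        echo "setuid owner=$owner: $file"
        return 0
    fi
    echo "no-setuid owner=$owner: $file"
    return 1
}
```

Assuming the kernel directory used elsewhere in this thread, a call such as `check_setuid /usr/sap/<SID>/SYS/exe/run/icmbnd` should report `setuid owner=root` if the file matches the listing above.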
Hi,
first of all: sorry for my bad English, I'm short on time!
But I think your shared memory segments were not cleaned up and the instance exe directories were not supplied with the new kernel. Try this procedure:
1.) Make a copy of the old kernel
2.) Stop all SAP instances with stopsap
3.) Upgrade saphostagent (on every instance):
root> cd /tmp
root> mkdir saphostagent; cd saphostagent
root> /sapmnt/PIE/SAPCAR -xvf /<Path of saphostagent SAR file>/SAPHOSTAGENT<Version>.SAR
root> ./saphostexec -upgrade
root> /usr/sap/hostctrl/exe/saphostexec -stop
root> /usr/sap/hostctrl/exe/saposcol -k
4.) Stop diagnostic agent on all instances
5.) Stop sapstartsrv on all instances
sidadm> sapcontrol -nr <Systemnumber of instances> -prot NI_HTTP -function StopService;
6.) Check shared memory segments
sidadm> showipc all
If segments are still present:
sidadm> cleanipc <Systemnumber shown by showipc> remove
7.) Delete old kernel:
root> cd /usr/sap/<SID>/SYS/exe/run/
root> rm -rf *
8.) Extract kernel
sidadm> /sapmnt/PIE/SAPCAR -xfv <Path and filename of SAR files>
Don't forget the sapcryptolib, if used ;)
9.) saproot:
root> cd /usr/sap/<SID>/SYS/exe/run/
root> ./saproot.sh <SID>
10.) Delete and rebuild the exe directories of the instance directories
On every instance do:
root> cd /usr/sap/<SID>/<Instance e.g. DVEBMGS00 or D10>/exe
root> rm -rf *
Rebuild:
sidadm> cd /usr/sap/<SID>/<Instance e.g. DVEBMGS00 or D10>/work
sidadm> sapcpe pf=/usr/sap/<SID>/SYS/profile/<SID>_<Instance e.g. DVEBMGS00 or D10>_<HOST>
11.) startsap
Best regards
Willi Eimler
> We were in the process of upgrading the SAP kernel version from 236 to 254.
If you want to patch the SAP kernel to a higher patch level within the same release, you need to extract the new .SAR files into the existing kernel directory (without removing the old files).
Hi,
> If you want to patch the SAP kernel to a higher patch level within the same release, you need to extract the new .SAR files into the existing kernel directory (without removing the old files).
Actually, that is not true. People do it this way because it saves the time of re-adding other dependencies that may be in use, e.g. sapcryptolib, IGS etc. Sometimes it's a good idea to start fresh and update all these components in a new directory where no old files are left behind.
Having said that, the system may be failing to start because some of these "other dependencies" are missing, and he will need to check that.
Regards,
Nelis
Hi Bijoy,
Please share more information about point
> 9) Started SAP resources in cluster, which was not getting started
- Are there any errors in the OS syslog about the SAP resources?
- Review / attach sapstart.log, sapstartsrv.log, stderr and dev_disp from the /usr/sap/SID/Instance/work folder; they probably contain relevant information.
You should not rename /sapmnt/SID/exe; instead, move its content to another folder (e.g. exe_backup) and extract the new kernel into /sapmnt/SID/exe. Try it this way.
Best regards,
Adam
Hello Adam,
I have listed the logs below. I also have one question about your last point on directory renaming: I followed the same process in our dev and test systems and it worked fine there; the main difference in production is the Solaris cluster. Are there any specific steps or configurations needed on the cluster side during kernel upgrade activities?
sapstart.log -
SAP-R/3-Startup Program Rel 700 V1.8 (2003/04/24)
-------------------------------------------------
Starting at 2013/06/15 13:12:13
Startup Profile: "/usr/sap/PRD/SYS/profile/START_DVEBMGS00_sscprdsap"
Execute Pre-Startup Commands
----------------------------
(8989) Local: /usr/sap/PRD/SYS/exe/run/sapmscsa -n pf=/usr/sap/PRD/SYS/profile/PRD_DVEBMGS00_sscprdsap
(8993) Local: ln -s -f /usr/sap/PRD/SYS/exe/run/rslgcoll co.sapPRD_DVEBMGS00
(8995) Local: ln -s -f /usr/sap/PRD/SYS/exe/run/rslgsend se.sapPRD_DVEBMGS00
(8997) Local: ln -s -f /usr/sap/PRD/SYS/exe/run/msg_server ms.sapPRD_DVEBMGS00
(8999) Local: ln -s -f /usr/sap/PRD/SYS/exe/run/disp+work dw.sapPRD_DVEBMGS00
Starting Programs
-----------------
(9015) Starting: local co.sapPRD_DVEBMGS00 -F pf=/usr/sap/PRD/SYS/profile/PRD_DVEBMGS00_sscprdsap
(9015) New Child Process created.
(9015) Starting local Command:
Command: co.sapPRD_DVEBMGS00
-F
pf=/usr/sap/PRD/SYS/profile/PRD_DVEBMGS00_sscprdsap
(9016) Starting: local se.sapPRD_DVEBMGS00 -F pf=/usr/sap/PRD/SYS/profile/PRD_DVEBMGS00_sscprdsap
(9016) New Child Process created.
(9017) Starting: local ms.sapPRD_DVEBMGS00 pf=/usr/sap/PRD/SYS/profile/PRD_DVEBMGS00_sscprdsap
(9016) Starting local Command:
Command: se.sapPRD_DVEBMGS00
-F
pf=/usr/sap/PRD/SYS/profile/PRD_DVEBMGS00_sscprdsap
(9018) Starting: local dw.sapPRD_DVEBMGS00 pf=/usr/sap/PRD/SYS/profile/PRD_DVEBMGS00_sscprdsap
(9017) New Child Process created.
(9017) Starting local Command:
Command: ms.sapPRD_DVEBMGS00
pf=/usr/sap/PRD/SYS/profile/PRD_DVEBMGS00_sscprdsap
(9018) New Child Process created.
(9018) Starting local Command:
Command: dw.sapPRD_DVEBMGS00
pf=/usr/sap/PRD/SYS/profile/PRD_DVEBMGS00_sscprdsap
(9019) Starting: local /usr/sap/PRD/SYS/exe/run/igswd_mt -mode=profile pf=/usr/sap/PRD/SYS/profile/PRD_DVEBMGS00_sscprdsap
(9019) New Child Process created.
(8988) Waiting for Child Processes to terminate.
(9019) Starting local Command:
Command: /usr/sap/PRD/SYS/exe/run/igswd_mt
-mode=profile
pf=/usr/sap/PRD/SYS/profile/PRD_DVEBMGS00_sscprdsap
(8988) **** 2013/06/15 13:12:16 Child 9016 terminated with Status 2 . ****
(9016) **** 2013/06/15 13:12:16 No RestartProgram command for program 2 ****
sapstartsrv.log - (old)
---------------------------------------------------
trc file: "sapstartsrv.log", trc level: 0, release: "700"
---------------------------------------------------
pid 8605
Sat Jun 15 09:40:19 2013
No halib defined => HA support disabled
Initializing SAPControl Webservice
SapSSLInit failed => https support disabled
Starting WebService thread
Webservice thread started, listening on port 50013
Trusted http connect via Unix domain socket '/tmp/.sapstream50013' enabled.
sapstartsrv.log -
---------------------------------------------------
trc file: "sapstartsrv.log", trc level: 0, release: "700"
---------------------------------------------------
pid 8984
Sat Jun 15 13:12:13 2013
No halib defined => HA support disabled
Initializing SAPControl Webservice
SapSSLInit failed => https support disabled
Starting WebService thread
Webservice thread started, listening on port 50013
Trusted http connect via Unix domain socket '/tmp/.sapstream50013' enabled.
dev_disp (old) -
---------------------------------------------------
trc file: "dev_disp.new", trc level: 1, release: "700"
---------------------------------------------------
sysno 00
sid PRD
systemid 370 (Solaris on SPARCV9 CPU)
relno 7000
patchlevel 0
patchno 236
intno 20050900
make: single threaded, ASCII, 64 bit, optimized
pid 8631
Sat Jun 15 09:40:28 2013
kernel runs with dp version 243(ext=110) (@(#) DPLIB-INT-VERSION-243)
length of sys_adm_ext is 364 bytes
*** SWITCH TRC-HIDE on ***
***LOG Q00=> DpSapEnvInit, DPStart (00 8631) [dpxxdisp.c 1287]
shared lib "dw_xml.so" version 236 successfully loaded
shared lib "dw_xtc.so" version 236 successfully loaded
shared lib "dw_stl.so" version 236 successfully loaded
shared lib "dw_gui.so" version 236 successfully loaded
shared lib "dw_mdm.so" version 236 successfully loaded
rdisp/softcancel_sequence : -> 0,5,-1
use internal message server connection to port 13900
MtxInit: 30000 0 0
DpSysAdmExtInit: ABAP is active
DpSysAdmExtInit: VMC (JAVA VM in WP) is not active
DpIPCInit2: start server >sscprdsap_PRD_00 <
DpShMCreate: sizeof(wp_adm) 40656 (1232)
DpShMCreate: sizeof(tm_adm) 53610880 (26792)
DpShMCreate: sizeof(wp_ca_adm) 88064 (88)
DpShMCreate: sizeof(appc_ca_adm) 176000 (88)
DpCommTableSize: max/headSize/ftSize/tableSize=2000/8/2192040/2192048
DpShMCreate: sizeof(comm_adm) 2192048 (1088)
DpSlockTableSize: max/headSize/ftSize/fiSize/tableSize=0/0/0/0/0
DpShMCreate: sizeof(slock_adm) 0 (104)
DpFileTableSize: max/headSize/ftSize/tableSize=0/0/0/0
DpShMCreate: sizeof(file_adm) 0 (72)
DpShMCreate: sizeof(vmc_adm) 0 (1840)
DpShMCreate: sizeof(wall_adm) (224040/346312/80/104)
DpShMCreate: sizeof(gw_adm) 48
DpShMCreate: SHM_DP_ADM_KEY (addr: ffffffff70800000, size: 56686064)
DpShMCreate: allocated sys_adm at ffffffff70800000
DpShMCreate: allocated wp_adm at ffffffff70801e18
DpShMCreate: allocated tm_adm_list at ffffffff7080bce8
DpShMCreate: allocated tm_adm at ffffffff7080bd48
DpShMCreate: allocated wp_ca_adm at ffffffff73b2c6c8
DpShMCreate: allocated appc_ca_adm at ffffffff73b41ec8
DpShMCreate: allocated comm_adm at ffffffff73b6ce48
DpShMCreate: system runs without slock table
DpShMCreate: system runs without file table
DpShMCreate: allocated vmc_adm_list at ffffffff73d840f8
DpShMCreate: allocated gw_adm at ffffffff73d84178
DpShMCreate: system runs without vmc_adm
DpShMCreate: allocated ca_info at ffffffff73d841a8
DpShMCreate: allocated wall_adm at ffffffff73d841b0
MBUF state OFF
DpCommInitTable: init table for 2000 entries
rdisp/queue_size_check_value : -> off
Sat Jun 15 09:40:29 2013
ThTaskStatus: rdisp/reset_online_during_debug 0
EmInit: MmSetImplementation( 2 ).
MM global diagnostic options set: 0
<ES> client 0 initializing ....
<ES> InitFreeList
<ES> block size is 4096 kByte.
Using implementation std
<ES> Info: use normal pages (no huge table support available)
EsStdUnamFileMapInit: ES base = 0xfffffffba8000000
EsStdInit: Extended Memory 15360 MB allocated
<ES> 3839 blocks reserved for free list.
ES initialized.
mm.dump: set maximum dump mem to 96 MB
Sat Jun 15 09:41:30 2013
rdisp/http_min_wait_dia_wp : 1 -> 1
***LOG Q0K=> DpMsAttach, mscon ( sscprdsap) [dpxxdisp.c 12650]
use SAPLOCALHOST=<sscprdsap> as internal hostname
DpStartStopMsg: send start message (myname is >sscprdsap_PRD_00 <)
DpStartStopMsg: start msg sent
CCMS: AlInitGlobals : alert/use_sema_lock = TRUE.
DpMsgAdmin: Set release to 7000, patchlevel 0
MBUF state PREPARED
MBUF component UP
DpMBufHwIdSet: set Hardware-ID
***LOG Q1C=> DpMBufHwIdSet [dpxxmbuf.c 1050]
DpMsgAdmin: Set patchno for this platform to 236
Release check o.K.
Sat Jun 15 09:41:41 2013
MBUF state ACTIVE
DpModState: change server state from STARTING to ACTIVE
Sat Jun 15 09:46:11 2013
DpSigInt: caught signal 2
DpHalt: shutdown server >sscprdsap_PRD_00 < (normal)
DpModState: change server state from ACTIVE to SHUTDOWN
Stop work processes
Sat Jun 15 09:46:13 2013
Stop gateway
Stop icman
Terminate gui connections
wait for end of work processes
wait for end of gateway
waiting for termination of gateway ...
Sat Jun 15 09:46:15 2013
wait for end of icman
waiting for termination of icman ...
Sat Jun 15 09:46:16 2013
waiting for termination of icman ...
Sat Jun 15 09:46:17 2013
waiting for termination of icman ...
Sat Jun 15 09:46:18 2013
waiting for termination of icman ...
Sat Jun 15 09:46:19 2013
waiting for termination of icman ...
Sat Jun 15 09:46:21 2013
DpStartStopMsg: send stop message (myname is >sscprdsap_PRD_00 <)
DpStartStopMsg: stop msg sent
Sat Jun 15 09:46:22 2013
DpHalt: sync with message server o.k.
detach from message server
***LOG Q0M=> DpMsDetach, ms_detach () [dpxxdisp.c 12996]
MBUF state OFF
MBUF component DOWN
cleanup EM
cleanup event management
cleanup shared memory/semaphores
Profile configuration error detected, use temporary corrected setup
Shared Pool 40: ipc/shm_psize_40 = 128000000 (too small)
Shared Pool 40: (smaller than min requirement 153676088)
Shared Pool 40: (estimated size assumed 156000000)
*** INFO Shm 42 in Pool 40 17547 KB estimated 12037 KB real ( -5510 KB -32 %)
removing request queue
***LOG Q05=> DpHalt, DPStop ( 8631) [dpxxdisp.c 11467]
*** shutdown completed - server stopped ***
Thanks in advance for any help on the topic.
Regards,
BIJOY
Hello Bijoy,
we can see that the rslgsend process (se.sapPRD_DVEBMGS00) stopped
**** 2013/06/15 13:12:16 Child 9016 terminated with Status 2 . ****
but this is not the cause of the startup issue.
Refer to my previous comment as well.
One more remark regarding point
> 5) Stopped all services running under <SIDADM> in central instance and application server
The sapstartsrv process needs to be stopped as well. This process may be started by root (by the sapinit script while booting; see e.g. SAP notes 936273 / 823941).
Adam
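The reading of sapstart.log in the reply above (child 9016 is the rslgsend link se.sapPRD_DVEBMGS00) comes from matching the PID in the "terminated" line against the earlier "Starting: local" line. A minimal sketch of that matching follows; `terminated_children` is a hypothetical helper name, and the parsing assumes the log layout shown in this thread.

```shell
#!/bin/sh
# Summarise children that sapstart reported as terminated.
# Reads sapstart.log on stdin; prints "PID PROGRAM STATUS" per terminated child.
# Assumption: log lines look like the sapstart.log excerpt in this thread.
# On Solaris, nawk or /usr/xpg4/bin/awk may be needed instead of the legacy awk.
terminated_children() {
    awk '
    / Starting: local / {
        pid = $1
        gsub(/[()]/, "", pid)        # "(9016)" -> "9016"
        prog[pid] = $4               # program name follows "local"
    }
    /terminated with Status/ {
        child = ""; status = ""
        for (i = 1; i < NF; i++) {
            if ($i == "Child")  child  = $(i + 1)
            if ($i == "Status") status = $(i + 1)
        }
        name = (child in prog) ? prog[child] : "unknown"
        print child, name, status
    }'
}
```

It would be used as `terminated_children < sapstart.log` from the instance work directory. A program that never appears in the output did not necessarily start correctly; the sketch only reports what sapstart itself logged as terminated.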
The sapstartsrv process starts (and must start) as the <sid>adm user, not root (see the -u option in the /usr/sap/sapservices file).
What is the relation between the sapstartsrv service (the administration and monitoring service for an SAP instance) and the SAP instance itself that could prevent the instance from starting?
Former Member:
Can you attach dev_w* log also?
Hello,
Just an example:
probud2:bcsadm 51> ps -ef | grep sapstartsrv
bcsadm 4118 1 0 Jun11 ? 00:00:00 /usr/sap/BCS/SCS01/exe/sapstartsrv pf=/usr/sap/BCS/SYS/profile/START_SCS01_probud2 -D -u bcsadm
The process is started by root and runs with the uid of sidadm.
How it must be running and how it is running are sometimes two different things...
When changing the kernel, sapstartsrv must be stopped as well. This is important.
How is sapstartsrv connected to starting the SAP instance? It's very simple. Just have a look at the startsap script. sapcontrol is called to start the SAP system (sapcontrol -nr NR -host HOST -prot PROT -function StartWait XXX YY). sapcontrol is the control program for sapstartsrv. Therefore, if sapcontrol (...) StartWait (or just -function Start) is called, the request goes to sapstartsrv and the SAP system is started by sapstartsrv. So if sapstartsrv is, for example, not running (or hanging) when sapcontrol calls the StartWait function, SAP won't start either. Take a look at note 936273 for example.
Adam
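Since the discussion turns on which user sapstartsrv actually runs as, it can help to reduce the `ps -ef` output to just user and PID. The following is a small sketch; `sapstartsrv_owners` is a hypothetical helper name, and it assumes the usual `ps -ef` column order (user first, PID second). Note that `ps` shows the user the process is running as, not who launched it.

```shell
#!/bin/sh
# Print "USER PID" for every sapstartsrv process found in `ps -ef` output on stdin.
# The grep process itself is filtered out so it is not mistaken for a match.
sapstartsrv_owners() {
    awk '$0 ~ /sapstartsrv/ && $0 !~ /grep/ { print $1, $2 }'
}
```

Typical use would be `ps -ef | sapstartsrv_owners` on each host before and after the kernel switch, to confirm that no old sapstartsrv is still running from the previous executables.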
Adam Csaba Goetz wrote:
> probud2:bcsadm 51> ps -ef | grep sapstartsrv
> bcsadm 4118 1 0 Jun11 ? 00:00:00 /usr/sap/BCS/SCS01/exe/sapstartsrv pf=/usr/sap/BCS/SYS/profile/START_SCS01_probud2 -D -u bcsadm
> The process is started by root and runs with the uid of sidadm.
> How it must be running and how it is running are sometimes two different things...
Your sapstartsrv is started as the bcsadm user. You can't start the sapstartsrv service as the root user until you adjust the user profile (e.g. LD_LIBRARY_PATH to resolve dependencies; these are manual actions).
> How is sapstartsrv connected to starting the SAP instance? It's very simple. Just have a look at the startsap script. sapcontrol is called to start the SAP system (sapcontrol -nr NR -host HOST -prot PROT -function StartWait XXX YY). sapcontrol is the control program for sapstartsrv. Therefore, if sapcontrol (...) StartWait (or just -function Start) is called, the request goes to sapstartsrv and the SAP system is started by sapstartsrv. So if sapstartsrv is, for example, not running (or hanging) when sapcontrol calls the StartWait function, SAP won't start either. Take a look at note 936273 for example.
But in this case you simply get an error from the web service method call. sapstartsrv starts the SAP instance much as you usually do with the startsap command. Moreover, on UNIX you can start an SAP instance without sapstartsrv running; startsap will start it during startup. I can't see how sapstartsrv can influence the result of the SAP instance startup (success or failure). Also, in cluster environments the startup of SAP instances is handed over to the cluster software.
Hi,
> Your sapstartsrv is started as the bcsadm user. You can't start the sapstartsrv service as the root user until you adjust the user profile (e.g. LD_LIBRARY_PATH to resolve dependencies; these are manual actions).
Yes I can, without any adjustment:
probud2:~ # ps -ef | grep sapstartsrv
bcsadm 4118 1 0 Jun11 ? 00:00:00 /usr/sap/BCS/SCS01/exe/sapstartsrv pf=/usr/sap/BCS/SYS/profile/START_SCS01_probud2 -D -u bcsadm
probud2:~ # kill 4118
probud2:~ # /usr/sap/BCS/SCS01/exe/sapstartsrv pf=/usr/sap/BCS/SYS/profile/START_SCS01_probud2 -D
probud2:~ # ps -ef | grep sapstartsrv
root 26365 1 0 11:11 ? 00:00:00 /usr/sap/BCS/SCS01/exe/sapstartsrv pf=/usr/sap/BCS/SYS/profile/START_SCS01_probud2 -D
But this is NOT about my SAP test system... It's about the logic and the possibilities.
Let's focus on Bijoy's question.
> But in this case you simply get an error from the web service method call. sapstartsrv starts the SAP instance much as you usually do with the startsap command.
When you call startsap, it starts sapstartsrv as well. But what did I write before?
> So if sapstartsrv is, for example, not running (or hanging) WHEN sapcontrol calls the StartWait function, SAP won't start either.
sapstartsrv does not run -> startsap is called -> sapstartsrv gets started -> sapcontrol ... StartWait is called -> SAP will start (but this is NOT what I was talking about)
sapstartsrv does not run -> startsap is called -> sapstartsrv cannot be started (for any reason) or is hanging -> sapcontrol ... StartWait is called -> SAP won't start (this is what I was talking about)
> Moreover, on UNIX you can start an SAP instance without sapstartsrv running.
When you call sapstart pf=path/startup_profile in the background, it will. But this is not how the instance should be started, at least not with SAP NW release 700 or later, and it is not how startsap works for these releases.
Adam
SC[SUNW.sap_ci_v2,prd-sap-rg,prd-sap-ci-res,sap_ci_svc_start]: [ID 930059 daemon.error] /sapmnt/PRD/exe/startsap_sscprdsap_00: No such file or directory
> 6) Renamed exe folder in /sapmnt/<PRD> to exe_backup
> 7) Renamed exe_new folder in /sapmnt/<SID> to exe
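The cluster error above names a file (startsap_sscprdsap_00) that is not part of the kernel SAR archives, so it would have been left behind in exe_backup when the directories were renamed. A quick way to spot such leftovers is to list everything that exists in the old directory but not in the new one. This is a minimal sketch under that assumption; `missing_in_new` is a hypothetical helper name and it only compares top-level entry names, not file contents.

```shell
#!/bin/sh
# List entries present in the old kernel directory but absent from the new one.
# Usage: missing_in_new OLD_DIR NEW_DIR
# Only names at the top level are compared; nothing is copied or changed.
missing_in_new() {
    old_dir=$1
    new_dir=$2
    for path in "$old_dir"/*; do
        [ -e "$path" ] || continue      # skip the literal pattern if OLD_DIR is empty
        name=$(basename "$path")
        [ -e "$new_dir/$name" ] || printf '%s\n' "$name"
    done
}
```

With the directory names from the original post, the call would be `missing_in_new /sapmnt/<SID>/exe_backup /sapmnt/<SID>/exe`. Each name printed is a candidate to copy across (start/stop scripts, sapcryptolib, IGS and similar add-ons), to be reviewed by hand rather than copied blindly.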
> sapstartsrv does not run -> startsap is called -> sapstartsrv cannot be started (for any reasons) or hanging -> sapcontrol ... StartWait is called -> SAP won't start (this is what I was talking about)
In that case you receive web service method call errors (like NIECONN_REFUSED). Moreover, according to the log provided by Former Member, sapstartsrv was started successfully. But for some reason the dispatcher (dev_disp) was stopped:
> DpSigInt: caught signal 2