Hi Basis Gurus,
Today my SAP PRD system suddenly started running slowly, and users had difficulty executing TCodes because it took minutes before each screen appeared. My system is SAP SRM 4.00, Oracle Release 220.127.116.11.0 and AIX 5.3.
From a Basis point of view, we noticed that all the work processes were hung because RFCs were occupying them. We couldn't investigate further on the system, as we needed to restart ASAP so that users could log on. A quick check with SICK showed no errors. What we did manage to find out is that we need to check the parameter below (rdisp/rfc_min_wait_dia_wp) and set it accordingly:-
Number of work processes kept free for other users.
This parameter is used to reserve a number of dialog work processes for dialog mode. It prevents parallel RFCs from occupying all the processes.
The parameter rdisp/wp_no_dia specifies the absolute number of dialog work processes.
Unit: number of dialog work processes
Default value: 1
If 10 dialog work processes are configured for the instance (rdisp/wp_no_dia = 10) and the parameter rdisp/rfc_min_wait_dia_wp = 3 is set, parallel RFCs can occupy a maximum of 7 dialog work processes. Three dialog work processes always remain free for dialog mode.
There are 20 DIA processes in our system, so reserving a minimum of about 5 work processes for dialog users via this parameter may be useful.
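To make the arithmetic concrete, here is a minimal sketch of the calculation described in the documentation excerpt above. The function name is my own and purely illustrative; it only reproduces the subtraction from the excerpt and does not model any other dispatcher quotas that may apply.

```python
def max_rfc_dialog_wps(wp_no_dia: int, rfc_min_wait_dia_wp: int) -> int:
    """Upper bound on the dialog work processes parallel RFCs can occupy.

    wp_no_dia           -- value of rdisp/wp_no_dia (total dialog work processes)
    rfc_min_wait_dia_wp -- value of rdisp/rfc_min_wait_dia_wp (kept free for dialog)
    """
    if wp_no_dia < 0 or rfc_min_wait_dia_wp < 0:
        raise ValueError("work process counts must be non-negative")
    # Never report a negative number if the reservation exceeds the total.
    return max(wp_no_dia - rfc_min_wait_dia_wp, 0)


# Example from the documentation excerpt: 10 DIA, 3 reserved.
print(max_rfc_dialog_wps(10, 3))   # 7

# Our system: 20 DIA, with the proposed reservation of 5.
print(max_rfc_dialog_wps(20, 5))   # 15
```

So with the proposed setting, parallel RFCs could take at most 15 of our 20 dialog work processes, going by the excerpt.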
So my first question is: are there any other suggestions, besides adjusting the parameter mentioned above, to ensure that the work processes do not hang because RFCs are occupying them? This issue only ever happens at the end of the month, when a large number of users are accessing the system.
When we restarted the system, we encountered another issue. The steps leading to it were as below:-
1) Did a proper shutdown of Oracle and SAP.
2) When we started the system using the startsap script, it didn't start the DB, so we started the DB manually, but the listener had a problem, so we stopped everything.
3) We also ran cleanipc, which threw the error below:-
sidadm> cleanipc <systemno> remove
exec(): 0509-036 Cannot load program cleanipc because of the following errors:
0509-130 Symbol resolution failed for cleanipc because:
0509-136 Symbol memmove (number 106) is not exported from
dependent module /usr/sap/sid/SYS/exe/run/libsapu16.so.
0509-192 Examine .loader section symbols with the
'dump -Tv' command.
4) Also, the listener file has no contents; it is an empty file.
5) This is another error :-
exec(): 0509-036 Cannot load program /oracle/sid/112_64/bin/tnslsnr because of the following errors:
0509-150 Dependent module /oracle/sid/112_64/lib/libttsh11.so could not be loaded.
0509-101 The module has too many section headers
or the file is damaged
The libttsh11.so file in PRD was empty, with a timestamp of the 25th, as shown below:-
1 sid dba 0 Nov 25 07:12 libttsh11.so
Further checking has led me to understand that this is an Oracle bug, from this link:-
This Oracle bug is only supposed to occur during an upgrade, so how could it happen to my production system during a restart?
Also, this system was restarted during last weekend's maintenance window and came up with no issues.
6) We copied libttsh11.so over from our QA environment to PRD and did a startup:-
-rwxr-x--- 1 sid dba 65967496 Nov 25 07:40 libttsh11.so (This is the copied over file from QA)
7) After the startup, everything has been in order so far.
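Since the root symptom was a 0-byte shared library, a quick scan for other empty library files may be worth doing before the next restart. Below is a minimal sketch, assuming Python is available on the host; the function name is hypothetical, and the directory in the usage note is simply the lib path taken from the error message above.

```python
import os
from typing import List


def find_empty_files(directory: str, suffix: str = ".so") -> List[str]:
    """Return the sorted paths of zero-byte files under `directory`
    whose names end with `suffix`."""
    empty = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            # isfile() is False for broken symlinks, so getsize() is safe here.
            if name.endswith(suffix) and os.path.isfile(path) \
                    and os.path.getsize(path) == 0:
                empty.append(path)
    return sorted(empty)
```

For example, find_empty_files("/oracle/sid/112_64/lib") would list any other libraries in that directory that were truncated the same way as libttsh11.so.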
My second question is: what went wrong with the libttsh11.so file? How could it be 0 bytes in PRD when there was no sign of any change to the PRD system? Is this a proven Oracle bug or something else? I have never encountered anything like this before. I hope the Gurus here can shed some light on my two questions, as I am looking for helpful answers.