Former Member

SAP application server down

Hi everyone,
     My manager told me that our PRD server went down yesterday morning. Some users could not log in to the system, and he found that one application server was down. He restarted that application server and everything seems to be OK now, but he asked me to find out what caused the outage.

I checked the dev trace files of the application server today.

I did not find anything useful in dev_disp.old.

However, some of the dev_wp*.old files contain entries like the ones below (the server has 76 work processes; the traces of work processes 0 to 19 contain these entries):

M Tue May 28 06:55:22 2013
M  ThAlarmHandler (1)
M  ThAlarmHandler: set CONTROL_TIMEOUT/DP_CONTROL_JAVA_EXIT and break sql
B  db_sqlbreak() = 1
M  ThAlarmHandler: return from signal handler
M
M Tue May 28 06:56:22 2013
M  ThAlarmHandler (2)
M  ThAlarmHandler: 2. ALARM: terminate process (pid=9408, user is T138/M0)
M  ThAlarmHandler: prv_action of W0: 0x2
M  ThAlarmHandler: set clean state of T138/M0 to DP_TIMEOUT
M  ThAlarmHandler: prv_action of W0: 0xa
M  ThAlarmHandler: save snc contexts
M  ThISncSaveAllContexts: save 0 snc contexts
M  ThAlarmHandler: C-Stack during alarm handler
M                     C-STACK
(0)  0x4000000001b363b0  CTrcStack + 0x1d0 at dptstack.c:227 [dw.sapP20_D20]
(1)  0x4000000001733100  ThAlarmHandler + 0x11e0 at thxxhead.c:21417 [dw.sapP20_D20]
(2)  0x4000000001664520  DpSigAlrm + 0x220 at dpxxtool.c:2295 [dw.sapP20_D20]
(3)  0xe00000013305f440       Signal 14 (SIGALRM) delivered
(4)  0xc00000000054ee70  _semop_sys + 0x30 [/usr/lib/hpux64/libc.so.1]
(5)  0xc0000000005607e0  _semop + 0xe0 at ../../../../../core/libs/libc/shared_em_64_perf/../core/syscalls/t_semop.c:19 [/usr/lib/hpux64/libc.so.1]
(6)  0x4000000001707680  RqOsSem + 0xb0 at semux.c:1186 [dw.sapP20_D20]
(7)  0x40000000017097a0  SemRq + 0x810 at semux.c:1814 [dw.sapP20_D20]
(8)  0x4000000004cc2990  EsILock + 0x2410 at esxx.c:3449 [dw.sapP20_D20]
(9)  0x4000000004cca410  STD_EsAttach + 0x1d0 at esxx.c:2348 [dw.sapP20_D20]
(10) 0x4000000004cd5110  EsAttach + 0x90 at esxxfunc.c:874 [dw.sapP20_D20]
(11) 0x4000000004c988c0  EmContextAttach + 0x1e0 at emxx.c:932 [dw.sapP20_D20]
(12) 0x40000000018e20a0  ThCheckEmState + 0x300 at thxxmem.c:438 [dw.sapP20_D20]
(13) 0x40000000018dd780  ThRollIn + 0x380 at thxxmem.c:870 [dw.sapP20_D20]
(14) 0x400000000175bc20  ThSessionRestore + 0x180 at thxxhead.c:22129 [dw.sapP20_D20]
(15) 0x40000000017250b0  TskhLoop + 0x1210 at thxxhead.c:3542 [dw.sapP20_D20]
(16) 0x400000000171f000  ThStart + 0x5d0 at thxxhead.c:10759 [dw.sapP20_D20]
(17) 0x40000000015ab260  DpMain + 0x870 at dpxxdisp.c:1152 [dw.sapP20_D20]
(18) 0x40000000015a4b60  main + 0x80 at thxxanf.c:64 [dw.sapP20_D20]
(19) 0xc00000000006e9b0  main_opd_entry + 0x50 [/usr/lib/hpux64/dld.so]
M
M  ***LOG Q02=> wp_halt, WPStop (Workproc 0 9408) [dpuxtool.c   268]

The other dev_wp*.old files only report that the work process heap memory is insufficient and suggest changing the parameter.
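With 76 work processes, checking every trace by hand is tedious. The following is a minimal Python sketch, not an SAP tool, that pulls the "2. ALARM: terminate process" events out of trace text together with the timestamp line that precedes them. The regular expressions are based only on the excerpt above, and the function names and the default file pattern `dev_w*.old` are assumptions to be adjusted to the actual work directory.

```python
import glob
import re

# Matches lines such as:
# M  ThAlarmHandler: 2. ALARM: terminate process (pid=9408, user is T138/M0)
ALARM_RE = re.compile(
    r"ThAlarmHandler: 2\. ALARM: terminate process "
    r"\(pid=(\d+), user is (T\d+/M\d+)\)"
)
# Matches timestamp lines such as: M Tue May 28 06:56:22 2013
TIMESTAMP_RE = re.compile(
    r"^M (\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2} \d{4})\s*$"
)


def find_alarm_terminations(trace_text):
    """Return (timestamp, pid, session) for each second-alarm termination."""
    events = []
    last_timestamp = None
    for line in trace_text.splitlines():
        ts = TIMESTAMP_RE.match(line)
        if ts:
            # Remember the most recent timestamp header in the trace.
            last_timestamp = ts.group(1)
            continue
        hit = ALARM_RE.search(line)
        if hit:
            events.append((last_timestamp, int(hit.group(1)), hit.group(2)))
    return events


def scan_trace_files(pattern="dev_w*.old"):
    """Apply find_alarm_terminations to every file matching the (assumed) pattern."""
    results = {}
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8", errors="replace") as handle:
            events = find_alarm_terminations(handle.read())
        if events:
            results[path] = events
    return results
```

Calling `scan_trace_files()` from inside the work directory returns a dictionary keyed by file name, which makes it easy to see whether all affected work processes were terminated within the same minute.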


Could anyone suggest how I can find out what caused the issue?

Regards.


4 Answers

  • Former Member
    Jun 05, 2013 at 09:34 AM

    Hello,

    Please check the dev_disp file to see whether it contains any useful information.

    Regards,

    Denish


  • Jun 05, 2013 at 10:41 AM

    Hello

    Please provide the complete log file dev_w0.old.

    Regards

    RB


  • Jun 05, 2013 at 01:04 PM

    Hi Ashuai

    Is the kernel the same on all servers?

    Best Regards

    Marius


  • Jun 05, 2013 at 09:50 PM

    If the dev traces report that heap memory is insufficient, that is where to start looking.

    Compare the memory parameters between the different application servers.

    Also check whether they match the resources available to each application server: the memory parameters should be aligned with the physical memory and swap space available.

    Check and compare the heap parameters:
    abap/heap_area_dia
    abap/heap_area_nondia
    abap/heap_area_total

    Also check the roll, page, and extended memory parameters:
    rdisp/ROLL_SHM
    rdisp/ROLL_MAXFS
    rdisp/PG_SHM
    rdisp/PG_MAXFS
    em/initial_size_MB

    Compare them, and if any of them is being exhausted, enlarge it, provided the extra memory resources are available on your application server.

    In addition, check ST02 for the buffers and ST22 for short dumps.
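
The parameter comparison described above can be sketched in Python as follows. This is only an illustration: it assumes the instance profiles are plain text files with `name = value` lines and `#` comments, and the function names are made up for this example. A profile file only lists values that were set explicitly, so a parameter missing from both files is still running on its default.

```python
# Parameter names taken from the list above.
MEMORY_PARAMS = [
    "abap/heap_area_dia",
    "abap/heap_area_nondia",
    "abap/heap_area_total",
    "rdisp/ROLL_SHM",
    "rdisp/ROLL_MAXFS",
    "rdisp/PG_SHM",
    "rdisp/PG_MAXFS",
    "em/initial_size_MB",
]


def parse_profile(text):
    """Parse 'name = value' lines into a dict, skipping blanks and # comments."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        params[name.strip()] = value.strip()
    return params


def diff_params(profile_a, profile_b, names=MEMORY_PARAMS):
    """Return {name: (value_a, value_b)} for parameters whose values differ.

    A value of None means the parameter is not set in that profile.
    """
    diffs = {}
    for name in names:
        value_a = profile_a.get(name)
        value_b = profile_b.get(name)
        if value_a != value_b:
            diffs[name] = (value_a, value_b)
    return diffs
```

To use it, read the two instance profile files into strings, pass each through `parse_profile`, and hand both results to `diff_params`; any entry in the returned dictionary is a memory parameter that is configured differently on the two application servers.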

    Regards, Marco
