911: My HANA One system on AWS has fallen and it can't get back up - Indexserver crash

Hi Juergen or anyone else from SAP,

I was trying to load a 5GB CSV file into a ROW table when HANA's indexserver crashed and wrote a crashdump file. I have the files zipped up and copied to my local drive. A quick examination of the trace files shows an issue in the indexserver. Whenever I stop and start the HANA database, the indexserver crashes again, and I can't use SAP HANA Studio for anything. This is on the "production" HANA One instance I have on AWS with 63GB of RAM, so I figured I'd be OK. Right now, my server is useless. The good news is I have a backup.

Any suggestions before I kill the instance and recreate it?

Thanks,

Bill

5 Answers

  • Former Member
    Posted on Jan 16, 2013 at 07:59 AM

    Hi Bill,

    Any details on the index server trace?

    If you have a recent backup, the fastest way to get your db back up and running is indeed a fresh HANA One instance into which you restore the backup. 20 minutes and you're back online... But maybe you want to investigate the root cause of the crash...

    Btw, the support forum for the production instances of HANA One is at http://www.saphana.com/community/solutions/cloud-info/cloud/hana-platform-aws, so it might be a good idea to cross-post your 911 there.

    Cheers

    --Juergen

    • Hi Juergen & Rahul,

      I've been busy getting my new instance up and running. Here are the trace specifics from indexserver_alert_hanaserver.trc:

      [3084]{0}[0] 2013-01-17 19:45:09.598506 e Metadata ptl_shm.cc(00520) : ShmSystem::attach (shmid=12550492, align=67108864) - Cannot allocate memory

      [3084]{0}[0] 2013-01-17 19:45:09.599115 e Row_Engine msglog.cc(00088) : Error during RowStore recovery: transaction rolled back due to unavailable resource (at ptime/storage/recovery/CheckpointMgr.cc:535 )

      [3084]{0}[0] 2013-01-17 19:45:09.599273 e Basis Crash.cpp(00558) : Crash at /HDB/IMP/NewDB100_REL/src///sys/src/Basis/Diagnose/impl/FaultProtectionImpl.cpp:531

      Reason:

      exception 1: no.2100002 (Basis/Diagnose/impl/FaultProtectionImpl.cpp:531)

      Illegal call to exit(), _exit() or _Exit() detected

      exception throw location:

      1: 0x00007f573f5b1f6a in exit_handler+0x46 at FaultProtectionImpl.cpp:531 (libhdbbasis.so)

      2: 0x0000000000542bba in _exit+0x16 at LinuxMallocInitializer.cpp:143 (hdbindexserver)

      3: 0x00007f5735f55897 in ptime::CheckpointMgr::restarter(void*)+0xd3 at CheckpointMgr.cc:536 (libhdbrskernel.so)

      4: 0x00007f57354e9a00 in ptime::PtimeThread::run(void*)+0x10 at ptime_thread.h:131 (libhdbrskernel.so)

      5: 0x00007f574c16c000 in TrexThreads::PoolThread::run()+0xc30 at PoolThread.cpp:255 (libhdbbasement.so)

      6: 0x00007f574c16d7a8 in TrexThreads::PoolThread::run(void*&)+0x14 at PoolThread.cpp:104 (libhdbbasement.so)

      7: 0x00007f573f641545 in Execution::Thread::staticMainImp(void**)+0x671 at Thread.cpp:448 (libhdbbasis.so)

      8: 0x00007f573f64170d in Execution::Thread::staticMain(void*)+0x39 at Thread.cpp:512 (libhdbbasis.so)

      [3084]{0}[0] 2013-01-17 19:45:09.645464 e Basis FaultProtectionImpl.cpp(00961) : SIGNAL 6 (SIGABRT) caught, sender PID: 3042, PID: 3042, thread: 2710[thr=3084]: Checkpointer, value int: 464540312, ptr: 0xffff880f1bb05298, time: 2013-01-17 19:45:09 000 Local

      Instance HDB/00, OS Linux hanaserver 2.6.32.27-0.2-default #1 SMP 2010-12-29 15:03:02 +0100 x86_64
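
For anyone who wants to pull only the error entries out of a long trace file, here is a minimal Python sketch. The line layout is inferred purely from the header lines quoted above, the field names (`thread`, `conn`, `txn`, and so on) are hypothetical labels rather than SAP terminology, and it assumes the single letter `e` marks an error-level entry.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Layout inferred from the trace lines quoted in this thread, e.g.
# [3084]{0}[0] 2013-01-17 19:45:09.598506 e Metadata ptl_shm.cc(00520) : message
# The group names are hypothetical labels, not official SAP field names.
TRACE_LINE = re.compile(
    r"^\[(?P<thread>\d+)\]\{(?P<conn>-?\d+)\}\[(?P<txn>-?\d+)\]\s+"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<level>[a-z])\s+"
    r"(?P<component>\S+)\s+"
    r"(?P<source>\S+\(\d+\))\s+:\s+"
    r"(?P<message>.*)$"
)


class TraceEntry(NamedTuple):
    timestamp: str
    level: str
    component: str
    source: str
    message: str


def parse_trace_line(line: str) -> Optional[TraceEntry]:
    """Return a TraceEntry for a header line, or None for any other line."""
    match = TRACE_LINE.match(line.strip())
    if match is None:
        return None
    return TraceEntry(
        match["timestamp"],
        match["level"],
        match["component"],
        match["source"],
        match["message"],
    )


def error_entries(lines: Iterable[str]) -> List[TraceEntry]:
    """Keep only entries whose level letter is 'e' (assumed to mean error)."""
    parsed = (parse_trace_line(line) for line in lines)
    return [entry for entry in parsed if entry is not None and entry.level == "e"]


SAMPLE = [
    "[3084]{0}[0] 2013-01-17 19:45:09.598506 e Metadata ptl_shm.cc(00520) : "
    "ShmSystem::attach (shmid=12550492, align=67108864) - Cannot allocate memory",
    "Reason:",
    "[3084]{0}[0] 2013-01-17 19:45:09.599273 e Basis Crash.cpp(00558) : "
    "Crash at /HDB/IMP/NewDB100_REL/src///sys/src/Basis/Diagnose/impl/FaultProtectionImpl.cpp:531",
]

for entry in error_entries(SAMPLE):
    print(entry.timestamp, entry.component, entry.source, entry.message)
```

Stack frames and other continuation lines do not match the pattern and are simply skipped, so the output is a short list of the error headers in time order.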

      The crash occurred while I was attempting to load my test database with approx 30GB of data into ROW tables on the 63 GB instance of HANA One. The import of two of the tables took forever, so I stopped the instance and restarted it. I then learned about ALTER SYSTEM LOGGING OFF; however, the damage was done. I believe that I managed to get too much data into the ROW tables and HANA ran out of memory. Here are the sizes of the data files for the database:

      hanaserver:/hanadata/HDB/data/mnt00001> ls -l ./hdb00001

      total 24668

      -rw------- 1 hdbadm sapsys 335577088 2013-01-17 22:43 datavolume_0000.dat

      -rw-rw-r-- 1 hdbadm sapsys 36 2013-01-17 19:43 landscape.id

      hanaserver:/hanadata/HDB/data/mnt00001> ls -l ./hdb00002

      total 34172160

      -rw------- 1 hdbadm sapsys 35130195968 2013-01-16 02:48 datavolume_0000.dat

      hanaserver:/hanadata/HDB/data/mnt00001> ls -l ./hdb00003

      total 96152

      -rw------- 1 hdbadm sapsys 351059968 2013-01-16 02:03 datavolume_0000.dat

      hanaserver:/hanadata/HDB/data/mnt00001> ls -l ./hdb00004

      total 26444

      -rw------- 1 hdbadm sapsys 271745024 2013-01-16 02:03 datavolume_0000.dat
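
To put those byte counts in more readable units, here is a small Python sketch that reads the size field from the `ls -l` lines above and converts it to GiB. The `VOLUMES` dictionary just copies the four lines from the listing. Note that the size of a data volume file on disk is not the same thing as the memory the row store needs, so treat the result as a rough indication only.

```python
# The four datavolume lines from the listing above, keyed by directory.
VOLUMES = {
    "hdb00001": "-rw------- 1 hdbadm sapsys 335577088 2013-01-17 22:43 datavolume_0000.dat",
    "hdb00002": "-rw------- 1 hdbadm sapsys 35130195968 2013-01-16 02:48 datavolume_0000.dat",
    "hdb00003": "-rw------- 1 hdbadm sapsys 351059968 2013-01-16 02:03 datavolume_0000.dat",
    "hdb00004": "-rw------- 1 hdbadm sapsys 271745024 2013-01-16 02:03 datavolume_0000.dat",
}

GIB = 2**30


def size_bytes(ls_line: str) -> int:
    """Return the size in bytes, the fifth whitespace-separated field of `ls -l`."""
    return int(ls_line.split()[4])


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / GIB


total = 0
for name, line in VOLUMES.items():
    n_bytes = size_bytes(line)
    total += n_bytes
    print(f"{name}: {to_gib(n_bytes):6.2f} GiB")

print(f"total:    {to_gib(total):6.2f} GiB")
```

Almost all of the space sits in hdb00002, at a little under 33 GiB.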

      The instance seems to be running, but the SapService isn't running, so HANA Studio can't connect even to get the diagnostic traces; I have to go to Linux for them. I say the instance seems to be running because when I issue the ./HDB stop command, it takes about 6 minutes to shut down.

      Is there any way to delete tables or truncate tables in a database so that I can recover it?

      Thanks,

      Bill

  • Former Member
    Posted on Jan 16, 2013 at 04:18 PM

    Hi Bill,

    If you find the reason behind the crash, please share your findings for future reference.

    Thanks & Regards,

    Rahul Rajagopalan Nair

  • Posted on Feb 13, 2014 at 02:08 AM

    Hey Bill, did you ever get to the bottom of this? I have the same thing on a system.

    • Interestingly this wasn't our problem, it got resolved by .

      The ACTIVE_TABLES table had a corrupted delta and mergedog would try to merge it around 15 minutes after startup. At this point the indexserver would restart.

      Lloyd found it by starting the indexserver in console mode and watching the logs. He then copied the affected table, deleted the original and copied it back.

      Then it was fine. Thanks for the reply.

  • Former Member
    Posted on Jun 18, 2014 at 11:59 PM

    Hi Bill,

    I do not think the row tables themselves were the problem, but rather the size of the data load for the indexserver and HANA.

    If you were loading 30G of data on the HANA One 60G AWS instance, then going by the volume usage in your listing, it looks like you did not have enough room on HANA, which is probably the root cause of the crash.

    HANA requires 30 of the 60G from HANA One to run its application/services/etc - so in reality, only 30G of memory is available. I know you are well versed and a HANA expert, as I have followed many of your posts, so if you already knew this and it does not apply, no worries - just trying to help! 😊

    Cheers!

    • Hi Peter,

      You are quite right on the memory limits - especially in the case of row tables. I ended up with a scenario where the data loader added too much data to the system and ideally should have thrown an error during the load. It did - just too late. The row table ended up having more data than could load into memory and the server crashed. Unfortunately, I could never recover other than tearing down the instance and building a new instance - this time using column tables.

      Regards,

      Bill
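
The kind of up-front check discussed above (refuse the load before it starts rather than fail too late) could look roughly like this Python sketch. It is not a HANA feature. The function names and `LoadTooLargeError` are hypothetical, and the default `reserved_fraction` of 0.5 is only the rule of thumb from this thread (about 30 of 60G kept for HANA itself), not an official SAP figure. The planned size should be an estimate of the in-memory row table size, which may differ from the CSV size.

```python
GIB = 2**30


class LoadTooLargeError(RuntimeError):
    """Raised when a planned row table load exceeds the assumed memory budget."""


def usable_bytes(total_ram_bytes: int, reserved_fraction: float = 0.5) -> int:
    """Memory assumed to be available for data after HANA's own overhead.

    reserved_fraction=0.5 is the rule of thumb quoted in this thread,
    not a documented SAP value.
    """
    if not 0 <= reserved_fraction < 1:
        raise ValueError("reserved_fraction must be in [0, 1)")
    return int(total_ram_bytes * (1 - reserved_fraction))


def check_row_store_load(
    planned_bytes: int,
    total_ram_bytes: int,
    already_loaded_bytes: int = 0,
    reserved_fraction: float = 0.5,
) -> int:
    """Return the remaining headroom in bytes, or raise before the load starts."""
    budget = usable_bytes(total_ram_bytes, reserved_fraction)
    remaining = budget - already_loaded_bytes
    if planned_bytes > remaining:
        raise LoadTooLargeError(
            f"planned load of {planned_bytes / GIB:.1f} GiB exceeds "
            f"remaining budget of {remaining / GIB:.1f} GiB"
        )
    return remaining - planned_bytes


# A 5 GiB load on a 63 GiB instance passes the check.
headroom = check_row_store_load(5 * GIB, 63 * GIB)
print(f"headroom after load: {headroom / GIB:.1f} GiB")

# A 33 GiB load on the same instance is rejected up front.
try:
    check_row_store_load(33 * GIB, 63 * GIB)
except LoadTooLargeError as err:
    print("refused:", err)
```

Under these assumptions, a load of roughly the size of the hdb00002 volume would have been refused before any data went in.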

  • Former Member
    Posted on Dec 16, 2014 at 09:57 AM

    I have the same issue. Please go through SAP Note 2081663.
