S4HANA 1511 installation: problem in HANA DB import (defunct process)

Former Member
0 Kudos

Hello Experts,

We have a strange problem with our S4HANA 1511 installation.

During the DB import phase, the import halts without any apparent reason, and no error logs are updated.

On checking, we find that CPU utilization for all the R3load processes drops to zero.

Further checks show that the indexserver process ends up in this state:

hd1adm   21900 14230 19 04:44 ?        00:12:15 [hdbindexserver] <defunct>
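
For anyone who wants to check for the same symptom, here is a minimal sketch that scans `ps -ef` output for `<defunct>` entries and reports each one with its parent PID (a defunct entry stays in the process table until its parent collects its exit status, so the parent is the process to look at next). The function name `find_defunct` is made up for illustration, and the column layout assumed is the standard `ps -ef` one shown in the line above.

```python
def find_defunct(ps_output):
    """Return (pid, ppid, command) for every <defunct> entry in `ps -ef` text."""
    found = []
    for line in ps_output.splitlines():
        if "<defunct>" not in line:
            continue
        # Assumed `ps -ef` columns: UID PID PPID C STIME TTY TIME CMD
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # not a full ps -ef row, skip it
        found.append((int(fields[1]), int(fields[2]), fields[7]))
    return found


if __name__ == "__main__":
    # The line posted above, used as sample input.
    sample = "hd1adm   21900 14230 19 04:44 ?        00:12:15 [hdbindexserver] <defunct>"
    for pid, ppid, cmd in find_defunct(sample):
        print(f"defunct pid={pid} parent={ppid} cmd={cmd}")
```

On a live system the input could come from `subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout`.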

The DB instance cannot be used afterwards, even after an OS reboot. We have to install a new DB instance each time it fails.

System Environment :

OS         - SUSE Linux 11 SP2.

HDB      - HANA 1.0 SP11.

SAP      - S4HANA 1511.

Memory - 512GB

All the mount points have sufficient free space as well.

mhdhana02:~ # df -h
Filesystem                Size  Used Avail Use% Mounted on
/dev/mapper/vg0-root      9.9G  5.2G  4.3G  55% /
devtmpfs                  253G  304K  253G   1% /dev
tmpfs                     380G     0  380G   0% /dev/shm
/dev/sda1                  98M   23M   71M  25% /boot
/dev/mapper/vg0-hanadata  985G   71G  864G   8% /hana/data
/dev/mapper/vg0-shared    504G  198G  307G  40% /hana/shared
/dev/mapper/vg0-usrsap     99G  5.4G   94G   6% /usr/sap
/dev/mapper/vg0-hanalog   493G   17G  451G   4% /hana/log
/dev/mapper/vg0-sapmnt     99G  1.7G   92G   2% /sapmnt
mhdhana02:~ #

We are unable to find the root cause. Please let us know if anyone has come across this situation and found a fix.

Regards,

Suresh Kadali

Accepted Solutions (0)

Answers (4)

former_member185239
Active Contributor
0 Kudos

Hi Suresh,

Which revision are you using for HANA SP11?

It would be better to update the revision of the HANA database.

Can you paste the current log for the HANA database?

With Regards

Ashutosh Chaturvedi

Former Member
0 Kudos

Hello Ashutosh,

Thanks for the response.

I upgraded HANA DB SP11 from Rev 110 to Rev 112, but no luck. It crashed once during the revision upgrade and once during the SAP installation.

The indexserver crash trace is too large to copy here in full.

Regards,

Suresh Kadali

Former Member
0 Kudos

borting validity check. Please make sure that you have up-to-date timezone data tables. (see SAP Note 1932132)

[45816]{-1}[-1/-1] 2016-05-18 09:50:12.048327 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.

[45816]{-1}[-1/-1] 2016-05-18 09:50:12.101196 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.

[45816]{-1}[-1/-1] 2016-05-18 09:50:12.151069 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.

[45816]{-1}[-1/-1] 2016-05-18 09:50:12.200920 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.

[45816]{-1}[-1/-1] 2016-05-18 09:50:12.250837 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.

[45816]{-1}[-1/-1] 2016-05-18 09:50:12.300669 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.

[45816]{-1}[-1/-1] 2016-05-18 09:50:12.350575 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.

[45816]{-1}[-1/-1] 2016-05-18 09:50:12.400223 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.

[45816]{-1}[-1/-1] 2016-05-18 09:50:12.450445 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.

[45816]{-1}[-1/-1] 2016-05-18 09:50:12.500676 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.

[45816]{-1}[-1/-1] 2016-05-18 09:50:12.550748 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.

[45816]{-1}[-1/-1] 2016-05-18 09:50:12.600816 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.

[45816]{-1}[-1/-1] 2016-05-18 09:50:12.650990 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.

[45816]{-1}[-1/-1] 2016-05-18 09:50:12.816002 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.

[45816]{-1}[-1/-1] 2016-05-18 09:50:13.153395 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.

[45816]{-1}[-1/-1] 2016-05-18 09:50:13.606238 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.

[45816]{-1}[-1/-1] 2016-05-18 09:50:13.659750 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.

[45816]{-1}[-1/-1] 2016-05-18 09:50:13.710082 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.

[45816]{-1}[-1/-1] 2016-05-18 09:50:13.947206 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.

[45816]{-1}[-1/-1] 2016-05-18 09:50:14.175453 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.

[46109]{-1}[12/-1] 2016-05-18 09:50:14.700386 e Memory           MallocProxy.cpp(01617) : libnuma: Error: mbind: Bad address

[46124]{-1}[12/-1] 2016-05-18 09:50:14.702508 e Memory           MallocProxy.cpp(01617) : libnuma: Error: mbind: Bad address

[46100]{-1}[12/-1] 2016-05-18 09:50:14.708522 e Basis            FaultProtectionImpl.cpp(01592) : SIGNAL 11 (SIGSEGV) caught, thread: 31882[thr=46100]: JobWrk0054, addr: 0x00007feab7902000, time: 2016-05-18 09:50:14 000 Local

Instance DBH/00, OS Linux mhdhana02 3.0.13-0.27-default #1 SMP Wed Feb 15 13:33:49 UTC 2012 (d73692b) x86_64

----> Register Dump <----

  rax: 0x00000000ffffffff  rbx: 0x00007feb7d0d94b0

  rcx: 0x00007fec5a2fb3c9  rdx: 0x0000000000000003

  fp[0]: 0x0.ed00 4e88 7fec 0000 * 2^0x1520

  fp[1]: 0x0.1fa0 0000 ffff 0000 * 2^0xc

  fp[2]: 0x0.ffff 0000 0000 0000 * 2^0x34

  fp[3]: 0x0.ffff 0000 0000 0000 * 2^0x0

  fp[4]: 0x0.ffff 0000 0000 0000 * 2^0x1

  fp[5]: 0x0.ffff 0000 0000 0000 * 2^0x0

  fp[6]: 0x0.4002 0000 0000 0000 * 2^0x0

  fp[7]: 0x0.4002 0000 0000 0000 * 2^0x0

  xmm[00]: 0x00004002.00000000.00000000.00000000

  xmm[01]: 0x0000ffff.00000000.00000000.00000000

  xmm[02]: 0x00000000.00000000.0000ffff.00000000

  xmm[03]: 0x00000000.00000000.00000000.40c89680

  xmm[04]: 0x00400020.01000080.ffff0908.ffff0b0a

  xmm[05]: 0xffff0d0c.ffff0f0e.ffff0100.ffff0302

  xmm[06]: 0xffff0504.ffff0706.00000038.00000038

  xmm[07]: 0x00000038.00000038.00000038.00000038

  xmm[08]: 0x00000038.00000038.00000038.00000038

  xmm[09]: 0x00000038.00000038.00000038.00000038

  xmm[10]: 0x00000038.00000038.04030201.08070605

  xmm[11]: 0x0b0a0908.0f0e0d0c.3fffffff.3fffffff

  xmm[12]: 0x3fffffff.3fffffff.00000000.00000000

  xmm[13]: 0xffffffff.ffffffff.00000000.3fffffff

  xmm[14]: 0x00000000.3fffffff.3fffffff.00000000

  xmm[15]: 0x3fffffff.00000000.03020100.07060504

[46100]{-1}[12/-1] 2016-05-18 09:50:14.708522 e Basis            FaultProtectionImpl.cpp(01592) : NOTE: full crash dump will be written to /usr/sap/DBH/HDB00/mhdhana02/trace/indexserver_mhdhana02.30003.crashdump.20160518-095014.045766.trc

Call stack of crashing context:

1: 0x00007fec5e79e940 in MemoryManager::assertReadability(void*)+0x0 at MallocProxy.cpp:1709 (libhdbbasis.so)

2: 0x00007fec5e79f1a8 in MemoryManager::mmapOverride(void*, unsigned long, int, int, int, long, void*)+0x534 at MallocProxy.cpp:1836 (libhdbbasis.so)

3: 0x00007fec5e79f4c2 in mmap64+0x10 at MallocProxy.cpp:1478 (libhdbbasis.so)

4: 0x00007fec5e8cbf8a in System::UX::mmap(void*, unsigned long, int, int, int, unsigned long)+0x46 at SystemCallsUNIX.cpp:442 (libhdbbasis.so)

5: 0x00007fec5e8a60dd in System::memAllocSystemPages(void*&, unsigned long, System::NUMAPolicy)+0x129 at Memory.cpp:1201 (libhdbbasis.so)

6: 0x00007fec5e7c8462 in MemoryManager::MemorySource::allocateSystemMemory(void*, unsigned long, unsigned long&, System::NUMAPolicy)+0x90 at MemorySource.cpp:346 (libhdbbasis.so)

7: 0x00007fec5e7c941b in MemoryManager::MemorySource::allocateBigBlock(MemoryManager::BlockInfo*&, unsigned long, unsigned long, unsigned long&, System::NUMAPolicy)+0x67 at MemorySource.cpp:477 (libhdbbasis.so)

8: 0x00007fec5e768861 in MemoryManager::BigBlockAllocator::createNewBlock(unsigned long, unsigned long&)+0x1f0 at BigBlockAllocator.cpp:629 (libhdbbasis.so)

9: 0x00007fec5e76c78e in MemoryManager::BigBlockAllocator::allocateBlock(unsigned long, MemoryManager::OutOfMemoryHandlingMode, unsigned long&)+0x44a at BigBlockAllocator.cpp:1008 (libhdbbasis.so)

10: 0x00007fec5e769201 in MemoryManager::BigBlockAllocator::allocateBig(MemoryManager::BlockInfo*&, unsigned long, unsigned long, unsigned short, ltt::allocator_statistics::SubStats&, void const*, MemoryManager::OutOfMemoryHandlingMode, MemoryManager::BigBlockAllocator::AllocationType, unsigned long&)+0x360 at BigBlockAllocator.cpp:1159 (libhdbbasis.so)

11: 0x00007fec5e7ba8c5 in MemoryManager::MemoryPool::reserveMemoryAndAllocateBigBlock(unsigned long, bool, unsigned short, ltt::allocator_statistics::SubStats&, void const*, bool, Synchronization::LockHandle<Synchronization::Mutex, false>&)+0x141 at MemoryPool.cpp:2231 (libhdbbasis.so)

12: 0x00007fec5e7be612 in MemoryManager::MemoryPool::allocateBigOrHugeBlock(unsigned long, unsigned short, ltt::allocator_statistics&, void const*, bool, bool, Synchronization::LockHandle<Synchronization::Mutex, false>&, bool)+0x290 at MemoryPool.cpp:1613 (libhdbbasis.so)

13: 0x00007fec5e7be9e7 in MemoryManager::MemoryPool::allocate(unsigned long, unsigned short, ltt::allocator_statistics&, bool&, bool, bool, void const*)+0x373 at MemoryPool.cpp:1558 (libhdbbasis.so)

14: 0x00007fec5e7fa7ee in MemoryManager::PoolAllocator::allocateNoThrowImpl(unsigned long, void const*)+0x6a at PoolAllocator.cpp:1833 (libhdbbasis.so)

15: 0x00007fec56240066 in ltt::allocator::allocate(unsigned long)+0x22 at memory.cpp:83 (libhdbsasso.so)

16: 0x00007fec5621b99b in Keylist_CompressedBitList::reallocate()+0x3f7 at memory.hpp:866 (libhdbsasso.so)

17: 0x00007fec562139d7 in KeylistScanJob::run()+0x63 at escada.cpp:69 (libhdbsasso.so)

18: 0x00007fec76d30ee9 in TRexUtils::Parallel::JobBase::runEx()+0x15 at ParallelDispatcher.cpp:222 (libhdbbasement.so)

19: 0x00007fec76d30dc0 in TRexUtils::Parallel::JobBase::run(Execution::Context&, Execution::JobObject&)+0x40 at Timer.hpp:69 (libhdbbasement.so)

20: 0x00007fec5e5e238e in Execution::JobObjectImpl::run(Execution::JobWorker*)+0xf1a at JobExecutorImpl.cpp:1086 (libhdbbasis.so)

21: 0x00007fec5e5ed383 in Execution::JobWorker::runJob(ltt::smartptr_handle<Execution::JobObjectForHandle>&)+0x390 at JobExecutorThreads.cpp:203 (libhdbbasis.so)

22: 0x00007fec5e5ef99a in Execution::JobWorker::run(void*&)+0x1b6 at JobExecutorThreads.cpp:411 (libhdbbasis.so)

23: 0x00007fec5e6439f0 in Execution::Thread::staticMainImp(void**)+0x700 at Thread.cpp:461 (libhdbbasis.so)

24: 0x00007fec5e644fc8 in Execution::Thread::staticMain(void*)+0x34 at ThreadMain.cpp:26 (libhdbbasis.so)
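
Because the same message repeats many times before the crash, it can help to collapse the trace into distinct messages with a count and first timestamp. The following is a rough sketch only: the regular expression is inferred from the header lines pasted above (`[thread]{..}[..] date time level COMPONENT file(line) : message`) and is an assumption, not an official description of the HANA trace format; `summarize` is a hypothetical helper name.

```python
import re
from collections import Counter

# Layout inferred from the lines pasted above (an assumption):
# [thread]{conn}[tx] date time level COMPONENT source(line) : message
LINE_RE = re.compile(
    r"^\[(\d+)\]\{[^}]*\}\[[^\]]*\]\s+(\S+ \S+)\s+(\w)\s+(\S+)\s+(\S+\(\d+\))\s+:\s+(.*)$"
)


def summarize(trace_text):
    """Group trace lines by (level, component, message).

    Returns a list of (first_timestamp, level, component, message, count)
    in order of first appearance. Lines that do not match the header
    pattern (register dump, call stack) are ignored.
    """
    counts = Counter()
    first_seen = {}
    for line in trace_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        _thread, stamp, level, component, _src, message = m.groups()
        key = (level, component, message)
        counts[key] += 1
        first_seen.setdefault(key, stamp)
    return [(first_seen[k], k[0], k[1], k[2], n) for k, n in counts.items()]


if __name__ == "__main__":
    sample = "\n".join([
        "[45816]{-1}[-1/-1] 2016-05-18 09:50:12.048327 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.",
        "[45816]{-1}[-1/-1] 2016-05-18 09:50:12.101196 e REPOSITORY       biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.",
        "[46109]{-1}[12/-1] 2016-05-18 09:50:14.700386 e Memory           MallocProxy.cpp(01617) : libnuma: Error: mbind: Bad address",
    ])
    for stamp, level, component, message, n in summarize(sample):
        print(f"{stamp} {level} {component}: {message} (x{n})")
```

Applied to the excerpt above, this makes the ordering easier to see: the repeated REPOSITORY messages come first, and the `libnuma: Error: mbind: Bad address` lines appear only milliseconds before the SIGSEGV.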

Former Member
0 Kudos

Hello Suresh,

Did you find a solution for your issue? We are having exactly the same issue with our HANA upgrade 102.04. We were asked to upgrade to SUSE 11 SP3. We are also on SP2, and the server has the same amount of memory (512GB).

Please let me know if you have a fix for this issue.

Thank you

Dinali

Former Member
0 Kudos

Hello Dinali,

Unfortunately we couldn't solve the issue. A really weird one!
We are currently setting up a new system with SLES SP3 and hope we will not face the issue there.

Regards,

Suresh Kadali

Former Member
0 Kudos

Hello Suresh,

Thank you for your prompt reply. We are also upgrading to SP3. I will only know the results in a week's time. When are you upgrading yours to SP3?

Thanks

Dinali

Former Member
0 Kudos

Hello All,

The issue was solved after upgrading from SLES SP2 to SP3.

Closing this thread; thanks to all for the replies.

Best Regards,

Suresh Kadali

Former Member
0 Kudos

Hello All,

To update you all on this thread, we have tried the solutions below:

1. SAP Note 2001528

Applied libgcc_s1-4.7.2_20130108-0.17.2.x86_64.rpm & libstdc++6-4.7.2_20130108-0.17.2.x86_64.rpm

2. The most relevant SAP Note for the issue was 2263929

    As per its recommendation I have installed glibc, i.e. glibc-2.11.3-17.95.2.

3. I also found SAP Note 1978433, about possible import errors on HANA.

In spite of all the above fixes, the issue is not yet resolved. In fact, we were once successful in installing S4H 1511, but even that system crashed while running SGEN.

We are unable to pin down the root cause; the issue is either with SUSE or with the HANA DB.

Regards,

Suresh Kadali

Former Member
0 Kudos

Though I haven't seen this behaviour before, just a couple of thoughts.

Am I right that you are still running the default kernel of SLES11 SP2 3.0.13-0.27?

Which glibc version are you using?

(see SAP note 1888072 SAP HANA DB: Indexserver crash in __strcmp_sse42)

Did you implement the recommended settings in SAP note 1824819? (SAP HANA DB: Recommended OS settings for SLES 11 / SLES for SAP Applications 11 SP2)

And which database revision are you running?

Regards,

Thorsten

Former Member
0 Kudos

Hello Nitsch,

Your observations seem valid as well.

Based on the recommendations in SAP Note 1888072, we have updated the glibc RPM below, but no luck.

We will check with the Unix admin to validate the other recommendations in SAP Note 1824819.

Our SAP HANA Database version is  -> '1.00.110.00.1447753075'

SUSE Linux glibc version  ->  glibc-2.11.3-17.74.13
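
As a quick way to compare a package build against the level a note recommends, here is a small sketch that turns version strings of this shape into tuples of integers. It is a simplification and not the full RPM version-comparison algorithm (it assumes purely numeric parts separated by `.` or `-`); `version_key` is a hypothetical helper, and the second value is the glibc build quoted elsewhere in this thread.

```python
import re


def version_key(rpm_name):
    """Turn 'glibc-2.11.3-17.74.13' into (2, 11, 3, 17, 74, 13).

    Simplified: assumes every part after the package name is numeric.
    """
    # Everything after the first '-' that is followed by a digit is
    # treated as version-release.
    m = re.search(r"-(\d.*)$", rpm_name)
    if not m:
        raise ValueError(f"no version found in {rpm_name!r}")
    return tuple(int(part) for part in re.split(r"[.-]", m.group(1)))


if __name__ == "__main__":
    installed = "glibc-2.11.3-17.74.13"    # from the post above
    other_build = "glibc-2.11.3-17.95.2"   # build mentioned elsewhere in this thread
    if version_key(installed) < version_key(other_build):
        print(f"{installed} is older than {other_build}")
    else:
        print(f"{installed} is at or above {other_build}")
```

On the system itself, `rpm -q glibc` prints the installed package name in this form.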

Thanks for the details you provided.

Best Regards,

Suresh Kadali

Former Member
0 Kudos

If not already done, have a look at SAP note 2001528 as well.

This might be relevant as well.

Regards,

Thorsten

yakcinar
Active Contributor
0 Kudos

Hello Suresh,

Isn't there any log from the installation tool?

Are you looking at the DB logs only?

Can you attach the indexserver traces?

Regards,

Yuksel AKCINAR

Former Member
0 Kudos

Hello Yuksel AKCINAR,

Thanks for the quick response.

The installation logs don't get updated when the load hangs.

We are checking the DB logs as well but could not find a resolution.

Attached are the indexserver, nameserver & xsengine traces from when the DB crashed.

What we understood is that a "defunct" Unix process is a kind of zombie process, which we are not able to kill through OS commands; not even a kill by the user can terminate it. Only an OS restart clears it.
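
Some background that may help here: a zombie has already exited, so signals have no effect on it, and its entry only disappears once the parent collects the exit status. With a multi-threaded process, the main thread can show as defunct while other threads of the same process are still alive, for example stuck in uninterruptible sleep (state `D`). Whether that is what happened in this case is not confirmed by the traces. The sketch below reads the state of every thread from `/proc`; the helper names are made up, and the sample stat line is constructed from the `ps` output earlier in the thread rather than copied from the system.

```python
import glob


def parse_stat(stat_text):
    """Parse the start of a /proc/<pid>/stat or /proc/<pid>/task/<tid>/stat line."""
    # The command name is in parentheses and may itself contain spaces
    # or ')', so split around the LAST ')'.
    head, _, tail = stat_text.rpartition(")")
    id_text, _, comm = head.partition("(")
    fields = tail.split()
    return {
        "id": int(id_text),       # pid (or tid when read from task/<tid>/stat)
        "comm": comm,
        "state": fields[0],       # e.g. R, S, D (uninterruptible), Z (zombie)
        "ppid": int(fields[1]),
    }


def thread_states(pid):
    """Return [(tid, state), ...] for all threads of `pid` (Linux only)."""
    states = []
    for path in glob.glob(f"/proc/{pid}/task/*/stat"):
        try:
            with open(path) as fh:
                info = parse_stat(fh.read())
        except OSError:
            continue  # thread vanished between listing and reading
        states.append((info["id"], info["state"]))
    return states


if __name__ == "__main__":
    # Constructed sample, NOT real output from the affected host.
    sample = "21900 (hdbindexserver) Z 14230 21900 21900 0 -1"
    print(parse_stat(sample))
```

If `thread_states(21900)` on the affected host showed threads in state `D`, that would point towards the kernel or storage layer rather than the database itself.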

We are not able to find the root cause of why the indexserver process becomes "defunct" at OS level.

Once we were successful in installing S4HANA 1511, but a subsequent OS reboot again produced a defunct indexserver process and crashed the database.

Regards,

Suresh Kadali