on 05-02-2016 12:07 PM
Hello Experts,
We have a strange problem with our S4HANA 1511 installation.
During the DB import phase, the import halts for no apparent reason, and the error logs are not updated either.
On checking, we find that CPU utilization for all the R3load processes drops to zero.
Further checks show that the indexserver process ends up in this state:
hd1adm 21900 14230 19 04:44 ? 00:12:15 [hdbindexserver] <defunct>
We cannot use that DB instance afterwards, even after an OS reboot. We have to install a new DB instance each time it fails.
System Environment :
OS - SUSE Linux 11 SP2.
HDB - HANA 1.0 SP11.
SAP - S4HANA 1511.
Memory - 512GB
All the mount points have sufficient free space as well.
mhdhana02:~ # df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg0-root 9.9G 5.2G 4.3G 55% /
devtmpfs 253G 304K 253G 1% /dev
tmpfs 380G 0 380G 0% /dev/shm
/dev/sda1 98M 23M 71M 25% /boot
/dev/mapper/vg0-hanadata 985G 71G 864G 8% /hana/data
/dev/mapper/vg0-shared 504G 198G 307G 40% /hana/shared
/dev/mapper/vg0-usrsap 99G 5.4G 94G 6% /usr/sap
/dev/mapper/vg0-hanalog 493G 17G 451G 4% /hana/log
/dev/mapper/vg0-sapmnt 99G 1.7G 92G 2% /sapmnt
mhdhana02:~ #
We are unable to find the root cause. Please let us know if anyone has come across this situation and found a fix.
Regards,
Suresh Kadali
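The free-space claim in the post above can be checked mechanically. The following is a minimal sketch that scans `df -h` output and lists mount points at or above a usage threshold. The function name `mounts_over` and the threshold values are illustrative assumptions, not part of any SAP tooling, and the parser assumes mount points without spaces.

```python
def mounts_over(df_output, threshold_pct):
    """Return (mount point, use %) for filesystems at or above the threshold."""
    flagged = []
    for line in df_output.splitlines():
        fields = line.split()
        # Data rows have six columns and a Use% value such as "55%".
        # The header row splits into seven words and is skipped here.
        if len(fields) != 6 or not fields[4].rstrip("%").isdigit():
            continue
        use_pct = int(fields[4].rstrip("%"))
        if use_pct >= threshold_pct:
            flagged.append((fields[5], use_pct))
    return flagged


# Rows copied from the df -h output in the post above.
SAMPLE = """\
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg0-root 9.9G 5.2G 4.3G 55% /
devtmpfs 253G 304K 253G 1% /dev
tmpfs 380G 0 380G 0% /dev/shm
/dev/sda1 98M 23M 71M 25% /boot
/dev/mapper/vg0-hanadata 985G 71G 864G 8% /hana/data
/dev/mapper/vg0-shared 504G 198G 307G 40% /hana/shared
/dev/mapper/vg0-usrsap 99G 5.4G 94G 6% /usr/sap
/dev/mapper/vg0-hanalog 493G 17G 451G 4% /hana/log
/dev/mapper/vg0-sapmnt 99G 1.7G 92G 2% /sapmnt
"""

if __name__ == "__main__":
    # 80 is an arbitrary example threshold, not an SAP recommendation.
    print(mounts_over(SAMPLE, 80))
```

With the values posted, no filesystem reaches the example threshold of 80%, which is consistent with the statement that disk space is not the limiting factor.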
Hi Suresh,
Which revision are you using for HANA SP11?
It would be better to update the HANA database to a newer revision.
Can you paste the current log for the HANA database?
With Regards
Ashutosh Chaturvedi
borting validity check. Please make sure that you have up-to-date timezone data tables. (see SAP Note 1932132)
[45816]{-1}[-1/-1] 2016-05-18 09:50:12.048327 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[45816]{-1}[-1/-1] 2016-05-18 09:50:12.101196 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[45816]{-1}[-1/-1] 2016-05-18 09:50:12.151069 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[45816]{-1}[-1/-1] 2016-05-18 09:50:12.200920 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[45816]{-1}[-1/-1] 2016-05-18 09:50:12.250837 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[45816]{-1}[-1/-1] 2016-05-18 09:50:12.300669 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[45816]{-1}[-1/-1] 2016-05-18 09:50:12.350575 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[45816]{-1}[-1/-1] 2016-05-18 09:50:12.400223 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[45816]{-1}[-1/-1] 2016-05-18 09:50:12.450445 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[45816]{-1}[-1/-1] 2016-05-18 09:50:12.500676 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[45816]{-1}[-1/-1] 2016-05-18 09:50:12.550748 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[45816]{-1}[-1/-1] 2016-05-18 09:50:12.600816 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[45816]{-1}[-1/-1] 2016-05-18 09:50:12.650990 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[45816]{-1}[-1/-1] 2016-05-18 09:50:12.816002 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[45816]{-1}[-1/-1] 2016-05-18 09:50:13.153395 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[45816]{-1}[-1/-1] 2016-05-18 09:50:13.606238 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[45816]{-1}[-1/-1] 2016-05-18 09:50:13.659750 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[45816]{-1}[-1/-1] 2016-05-18 09:50:13.710082 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[45816]{-1}[-1/-1] 2016-05-18 09:50:13.947206 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[45816]{-1}[-1/-1] 2016-05-18 09:50:14.175453 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[46109]{-1}[12/-1] 2016-05-18 09:50:14.700386 e Memory MallocProxy.cpp(01617) : libnuma: Error: mbind: Bad address
[46124]{-1}[12/-1] 2016-05-18 09:50:14.702508 e Memory MallocProxy.cpp(01617) : libnuma: Error: mbind: Bad address
[46100]{-1}[12/-1] 2016-05-18 09:50:14.708522 e Basis FaultProtectionImpl.cpp(01592) : SIGNAL 11 (SIGSEGV) caught, thread: 31882[thr=46100]: JobWrk0054, addr: 0x00007feab7902000, time: 2016-05-18 09:50:14 000 Local
Instance DBH/00, OS Linux mhdhana02 3.0.13-0.27-default #1 SMP Wed Feb 15 13:33:49 UTC 2012 (d73692b) x86_64
----> Register Dump <----
rax: 0x00000000ffffffff rbx: 0x00007feb7d0d94b0
rcx: 0x00007fec5a2fb3c9 rdx: 0x0000000000000003
fp[0]: 0x0.ed00 4e88 7fec 0000 * 2^0x1520
fp[1]: 0x0.1fa0 0000 ffff 0000 * 2^0xc
fp[2]: 0x0.ffff 0000 0000 0000 * 2^0x34
fp[3]: 0x0.ffff 0000 0000 0000 * 2^0x0
fp[4]: 0x0.ffff 0000 0000 0000 * 2^0x1
fp[5]: 0x0.ffff 0000 0000 0000 * 2^0x0
fp[6]: 0x0.4002 0000 0000 0000 * 2^0x0
fp[7]: 0x0.4002 0000 0000 0000 * 2^0x0
xmm[00]: 0x00004002.00000000.00000000.00000000
xmm[01]: 0x0000ffff.00000000.00000000.00000000
xmm[02]: 0x00000000.00000000.0000ffff.00000000
xmm[03]: 0x00000000.00000000.00000000.40c89680
xmm[04]: 0x00400020.01000080.ffff0908.ffff0b0a
xmm[05]: 0xffff0d0c.ffff0f0e.ffff0100.ffff0302
xmm[06]: 0xffff0504.ffff0706.00000038.00000038
xmm[07]: 0x00000038.00000038.00000038.00000038
xmm[08]: 0x00000038.00000038.00000038.00000038
xmm[09]: 0x00000038.00000038.00000038.00000038
xmm[10]: 0x00000038.00000038.04030201.08070605
xmm[11]: 0x0b0a0908.0f0e0d0c.3fffffff.3fffffff
xmm[12]: 0x3fffffff.3fffffff.00000000.00000000
xmm[13]: 0xffffffff.ffffffff.00000000.3fffffff
xmm[14]: 0x00000000.3fffffff.3fffffff.00000000
xmm[15]: 0x3fffffff.00000000.03020100.07060504
[46100]{-1}[12/-1] 2016-05-18 09:50:14.708522 e Basis FaultProtectionImpl.cpp(01592) : NOTE: full crash dump will be written to /usr/sap/DBH/HDB00/mhdhana02/trace/indexserver_mhdhana02.30003.crashdump.20160518-095014.045766.trc
Call stack of crashing context:
1: 0x00007fec5e79e940 in MemoryManager::assertReadability(void*)+0x0 at MallocProxy.cpp:1709 (libhdbbasis.so)
2: 0x00007fec5e79f1a8 in MemoryManager::mmapOverride(void*, unsigned long, int, int, int, long, void*)+0x534 at MallocProxy.cpp:1836 (libhdbbasis.so)
3: 0x00007fec5e79f4c2 in mmap64+0x10 at MallocProxy.cpp:1478 (libhdbbasis.so)
4: 0x00007fec5e8cbf8a in System::UX::mmap(void*, unsigned long, int, int, int, unsigned long)+0x46 at SystemCallsUNIX.cpp:442 (libhdbbasis.so)
5: 0x00007fec5e8a60dd in System::memAllocSystemPages(void*&, unsigned long, System::NUMAPolicy)+0x129 at Memory.cpp:1201 (libhdbbasis.so)
6: 0x00007fec5e7c8462 in MemoryManager::MemorySource::allocateSystemMemory(void*, unsigned long, unsigned long&, System::NUMAPolicy)+0x90 at MemorySource.cpp:346 (libhdbbasis.so)
7: 0x00007fec5e7c941b in MemoryManager::MemorySource::allocateBigBlock(MemoryManager::BlockInfo*&, unsigned long, unsigned long, unsigned long&, System::NUMAPolicy)+0x67 at MemorySource.cpp:477 (libhdbbasis.so)
8: 0x00007fec5e768861 in MemoryManager::BigBlockAllocator::createNewBlock(unsigned long, unsigned long&)+0x1f0 at BigBlockAllocator.cpp:629 (libhdbbasis.so)
9: 0x00007fec5e76c78e in MemoryManager::BigBlockAllocator::allocateBlock(unsigned long, MemoryManager::OutOfMemoryHandlingMode, unsigned long&)+0x44a at BigBlockAllocator.cpp:1008 (libhdbbasis.so)
10: 0x00007fec5e769201 in MemoryManager::BigBlockAllocator::allocateBig(MemoryManager::BlockInfo*&, unsigned long, unsigned long, unsigned short, ltt::allocator_statistics::SubStats&, void const*, MemoryManager::OutOfMemoryHandlingMode, MemoryManager::BigBlockAllocator::AllocationType, unsigned long&)+0x360 at BigBlockAllocator.cpp:1159 (libhdbbasis.so)
11: 0x00007fec5e7ba8c5 in MemoryManager::MemoryPool::reserveMemoryAndAllocateBigBlock(unsigned long, bool, unsigned short, ltt::allocator_statistics::SubStats&, void const*, bool, Synchronization::LockHandle<Synchronization::Mutex, false>&)+0x141 at MemoryPool.cpp:2231 (libhdbbasis.so)
12: 0x00007fec5e7be612 in MemoryManager::MemoryPool::allocateBigOrHugeBlock(unsigned long, unsigned short, ltt::allocator_statistics&, void const*, bool, bool, Synchronization::LockHandle<Synchronization::Mutex, false>&, bool)+0x290 at MemoryPool.cpp:1613 (libhdbbasis.so)
13: 0x00007fec5e7be9e7 in MemoryManager::MemoryPool::allocate(unsigned long, unsigned short, ltt::allocator_statistics&, bool&, bool, bool, void const*)+0x373 at MemoryPool.cpp:1558 (libhdbbasis.so)
14: 0x00007fec5e7fa7ee in MemoryManager::PoolAllocator::allocateNoThrowImpl(unsigned long, void const*)+0x6a at PoolAllocator.cpp:1833 (libhdbbasis.so)
15: 0x00007fec56240066 in ltt::allocator::allocate(unsigned long)+0x22 at memory.cpp:83 (libhdbsasso.so)
16: 0x00007fec5621b99b in Keylist_CompressedBitList::reallocate()+0x3f7 at memory.hpp:866 (libhdbsasso.so)
17: 0x00007fec562139d7 in KeylistScanJob::run()+0x63 at escada.cpp:69 (libhdbsasso.so)
18: 0x00007fec76d30ee9 in TRexUtils::Parallel::JobBase::runEx()+0x15 at ParallelDispatcher.cpp:222 (libhdbbasement.so)
19: 0x00007fec76d30dc0 in TRexUtils::Parallel::JobBase::run(Execution::Context&, Execution::JobObject&)+0x40 at Timer.hpp:69 (libhdbbasement.so)
20: 0x00007fec5e5e238e in Execution::JobObjectImpl::run(Execution::JobWorker*)+0xf1a at JobExecutorImpl.cpp:1086 (libhdbbasis.so)
21: 0x00007fec5e5ed383 in Execution::JobWorker::runJob(ltt::smartptr_handle<Execution::JobObjectForHandle>&)+0x390 at JobExecutorThreads.cpp:203 (libhdbbasis.so)
22: 0x00007fec5e5ef99a in Execution::JobWorker::run(void*&)+0x1b6 at JobExecutorThreads.cpp:411 (libhdbbasis.so)
23: 0x00007fec5e6439f0 in Execution::Thread::staticMainImp(void**)+0x700 at Thread.cpp:461 (libhdbbasis.so)
24: 0x00007fec5e644fc8 in Execution::Thread::staticMain(void*)+0x34 at ThreadMain.cpp:26 (libhdbbasis.so)
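The trace excerpt above repeats the same REPOSITORY message many times before the `mbind` errors and the SIGSEGV, which makes the sequence hard to read. The following is a minimal sketch that groups trace lines by component, source file and message so the distinct errors stand out. The regular expression is derived only from the lines pasted in this thread, and the group names (`thread`, `conn`, `ctx`) are this sketch's own labels, not official SAP field names.

```python
import re
from collections import Counter

# Layout as seen in the excerpt:
# [n]{n}[n/n] date time level component file(line) : message
# What the bracketed numbers mean is not assumed here; they are only matched.
TRACE_LINE = re.compile(
    r"^\[(?P<thread>\d+)\]\{(?P<conn>-?\d+)\}\[(?P<ctx>-?\d+/-?\d+)\] "
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<level>\w) (?P<component>\S+)\s+(?P<source>\S+)\((?P<line>\d+)\) : "
    r"(?P<message>.*)$"
)


def summarize(trace_text):
    """Count trace entries per (component, source file, message)."""
    counts = Counter()
    for raw in trace_text.splitlines():
        match = TRACE_LINE.match(raw.strip())
        if match is None:
            continue  # register dump, call stack or wrapped continuation line
        key = (match["component"], match["source"], match["message"])
        counts[key] += 1
    return counts


# A few lines copied from the excerpt above, plus one register-dump line.
SAMPLE = """\
[45816]{-1}[-1/-1] 2016-05-18 09:50:12.048327 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[45816]{-1}[-1/-1] 2016-05-18 09:50:12.101196 e REPOSITORY biSetup.cpp(01751) : BiSetup:migrateSchema: currentDBSchema is 0.
[46109]{-1}[12/-1] 2016-05-18 09:50:14.700386 e Memory MallocProxy.cpp(01617) : libnuma: Error: mbind: Bad address
[46124]{-1}[12/-1] 2016-05-18 09:50:14.702508 e Memory MallocProxy.cpp(01617) : libnuma: Error: mbind: Bad address
rax: 0x00000000ffffffff rbx: 0x00007feb7d0d94b0
"""

if __name__ == "__main__":
    for (component, source, message), n in summarize(SAMPLE).most_common():
        print(f"{n:3d}x {component} {source}: {message}")
```

Run against a full indexserver trace, a summary like this would show at a glance that the last errors before the crash come from the Memory component (`libnuma: Error: mbind: Bad address`) rather than from the repeated REPOSITORY messages.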
Hello Suresh,
Did you find a solution for your issue? We are having exactly the same issue with our HANA upgrade 102.04. We were asked to upgrade to SUSE 11 SP3. We are also on SP2, and the server has the same amount of memory (512GB).
Please let us know if you have a fix for this issue.
Thank you
Dinali
Hello All,
To update you all on this thread, we have tried the following solutions:
1. SAP note 2001528
Applied libgcc_s1-4.7.2_20130108-0.17.2.x86_64.rpm and libstdc++6-4.7.2_20130108-0.17.2.x86_64.rpm
2. The most relevant SAP note for the issue was: 2263929
As per its recommendation, I installed glibc-2.11.3-17.95.2.
3. I also found SAP Note 1978433, about possible import errors on HANA.
In spite of all the above fixes, the issue is not yet resolved. In fact, we succeeded in installing S4H 1511 once, but even that system crashed while running SGEN.
We are unable to pin down the root cause; the issue lies either with SUSE or with the HANA DB.
Regards,
Suresh Kadali
Though I haven't seen this behaviour before, just a couple of thoughts.
Am I right that you are still running the default kernel of SLES11 SP2 3.0.13-0.27?
Which glibc version are you using?
(see SAP note 1888072 SAP HANA DB: Indexserver crash in __strcmp_sse42)
Did you implement the recommended settings in SAP note 1824819? (SAP HANA DB: Recommended OS settings for SLES 11 / SLES for SAP Applications 11 SP2)
And which database revision are you running?
Regards,
Thorsten
Hello Nitsch,
Your observations seem valid as well.
Based on the recommendations in SAP note 1888072, we updated the glibc RPM listed below, but with no luck.
We will check with our Unix admin to validate the other recommendations in SAP note 1824819.
Our SAP HANA Database version is -> '1.00.110.00.1447753075'
SUSE Linux glibc version -> glibc-2.11.3-17.74.13
Thanks for the details you provided.
Best Regards,
Suresh Kadali
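Two glibc builds are mentioned in this thread: glibc-2.11.3-17.74.13 in the post above and glibc-2.11.3-17.95.2 in connection with SAP note 2263929. The following is a minimal sketch for checking which of two such package strings is older. It is a simplification that only handles purely numeric version and release fields; it is not RPM's own comparison algorithm, and the helper names `build_key` and `is_older` are made up for this illustration.

```python
import re


def build_key(package):
    """Turn 'glibc-2.11.3-17.74.13' into ((2, 11, 3), (17, 74, 13)).

    Simplification: only numeric, dot-separated version and release fields
    are accepted. Strings such as 'libgcc_s1-4.7.2_20130108-0.17.2' are
    rejected rather than guessed at.
    """
    match = re.fullmatch(
        r"[A-Za-z0-9_+]+-(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)", package
    )
    if match is None:
        raise ValueError(f"unexpected package string: {package!r}")
    version, release = match.groups()
    return (
        tuple(int(part) for part in version.split(".")),
        tuple(int(part) for part in release.split(".")),
    )


def is_older(installed, reference):
    """Return True if `installed` sorts before `reference`."""
    return build_key(installed) < build_key(reference)


if __name__ == "__main__":
    installed = "glibc-2.11.3-17.74.13"  # reported in the post above
    reference = "glibc-2.11.3-17.95.2"   # build named for SAP note 2263929
    print(is_older(installed, reference))
```

For these two strings the check reports the 17.74.13 build as older than 17.95.2. On a real system, `rpm -q glibc` shows the installed package, and the exact minimum level should be taken from the SAP note itself.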
Hello Suresh,
Isn't there any log for the installation tool?
Are you looking at DB logs only?
Can you attach index server traces?
Regards,
Yuksel AKCINAR
Hello Yuksel AKCINAR,
Thanks for the quick response.
The installation logs don't get updated when the load hangs.
We are checking the DB logs as well, but could not find a resolution.
Attached are the indexserver, nameserver and xsengine traces from when the DB crashed.
Our understanding is that a "defunct" Unix process is a kind of zombie process that we cannot kill with OS commands; not even a kill issued by the user terminates it. Only an OS restart clears it.
We have not been able to find the root cause of why the indexserver process becomes "defunct" at OS level.
We did succeed once in installing S4HANA 1511, but a subsequent OS reboot again produced a defunct indexserver process and crashed the database.
Regards,
Suresh Kadali
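On the point about the defunct process: a zombie has already exited, and its entry stays in the process table until its parent collects the exit status, which is why sending signals to the zombie itself has no effect. The parent PID is therefore the more useful thing to look at. The following is a minimal sketch that picks `<defunct>` entries out of `ps -ef` style output and reports their parent PIDs. The function name `find_defunct` is illustrative, the parser assumes the eight-column layout of the line posted at the top of this thread, and this is general Unix behaviour rather than a diagnosis of the crash discussed here.

```python
def find_defunct(ps_output):
    """Return (user, pid, ppid, command) for each <defunct> entry.

    Expects `ps -ef` style lines: UID PID PPID C STIME TTY TIME CMD,
    with STIME as a single token such as "04:44".
    """
    zombies = []
    for line in ps_output.splitlines():
        fields = line.split(None, 7)  # at most 8 columns; CMD keeps its spaces
        if len(fields) < 8 or not fields[1].isdigit() or not fields[2].isdigit():
            continue  # header or malformed line
        user, pid, ppid, command = fields[0], int(fields[1]), int(fields[2]), fields[7]
        if command.endswith("<defunct>"):
            zombies.append((user, pid, ppid, command))
    return zombies


# The line posted at the start of this thread.
SAMPLE = "hd1adm 21900 14230 19 04:44 ? 00:12:15 [hdbindexserver] <defunct>"

if __name__ == "__main__":
    for user, pid, ppid, command in find_defunct(SAMPLE):
        print(f"{command} pid={pid} owner={user} parent={ppid}")
```

For the posted line this identifies PID 21900 with parent PID 14230, so the next step would be to check what process 14230 is and what state it is in.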