SAP Hana Installation - hdbindexserver not starting

Oct 28, 2016 at 02:01 PM

Hello,

I recently installed SAP HANA with all the options (DB, Client, AFL, Studio, XS) on an AWS machine with ~128 GB RAM and 16 processor cores, which the documentation lists as the sizing for a small development machine (https://websmp201.sap-ag.de/~sapidb/011000358700000050632013E, page 36).

After installation, the hdbindexserver service is not running, although I can see the other HANA processes running:


hdbnameserver
hdbcompileserver
hdbpreprocessor
hdbwebdispatcher

When I checked the hdbindexserver logs, the following line caught my attention:

JobExecutorUtil.cpp(01581) : dubious configuration detected, switching to single numa node mode: 128 log.CPUs, 16 active, 8 phys.cores, 1 sockets

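After that, the program iterates 128 times, once per logical CPU, trying to find the processor cores. For logical CPUs 0 to 15 it finds the 16 cores (mapped to core indexes 0 to 7 on socket 0), but it cannot map logical CPUs 16 to 127 (core index -1, socket index -1), and it ends up shutting down the service.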
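As a cross-check outside HANA, one could compare how many CPU slots the Linux kernel reports as possible with how many are actually online. The sketch below reads the standard sysfs files `/sys/devices/system/cpu/possible` and `/sys/devices/system/cpu/online`. It is only an assumption that the "128 log.CPUs, 16 active" figures in the trace correspond to these two values; the trace itself does not confirm that.

```python
from pathlib import Path
from typing import List, Optional

def parse_cpulist(text: str) -> List[int]:
    """Expand a kernel cpulist string such as '0-15' or '0-3,8' into CPU numbers."""
    cpus: List[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        low, _, high = part.partition("-")
        cpus.extend(range(int(low), int(high or low) + 1))
    return cpus

def read_cpulist(name: str) -> Optional[List[int]]:
    """Read a cpulist file from sysfs; return None if it does not exist (non-Linux)."""
    path = Path("/sys/devices/system/cpu") / name
    return parse_cpulist(path.read_text()) if path.exists() else None

possible = read_cpulist("possible")
online = read_cpulist("online")
if possible is not None and online is not None:
    # Assumption: a large gap here may be related to the mismatch HANA reports.
    print(f"possible CPUs: {len(possible)}, online CPUs: {len(online)}")
```

If the two counts differ widely on the affected machine, that would at least show that the mismatch is visible at the operating-system level and not only inside HANA.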

[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013408 e Job              JobExecutorUtil.cpp(01581) : dubious configuration detected, switching to single numa node mode: 128 log.CPUs, 16 active, 8 phys.cores, 1 sockets
[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013425 e Job              JobExecutorUtil.cpp(01584) : running as VM guest
[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013426 e Job              JobExecutorUtil.cpp(01589) : log.CPU 0 core index 0 socket index 0
[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013429 e Job              JobExecutorUtil.cpp(01589) : log.CPU 1 core index 1 socket index 0
[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013430 e Job              JobExecutorUtil.cpp(01589) : log.CPU 2 core index 2 socket index 0
[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013432 e Job              JobExecutorUtil.cpp(01589) : log.CPU 3 core index 3 socket index 0
[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013433 e Job              JobExecutorUtil.cpp(01589) : log.CPU 4 core index 4 socket index 0
[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013434 e Job              JobExecutorUtil.cpp(01589) : log.CPU 5 core index 5 socket index 0
[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013435 e Job              JobExecutorUtil.cpp(01589) : log.CPU 6 core index 6 socket index 0
[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013436 e Job              JobExecutorUtil.cpp(01589) : log.CPU 7 core index 7 socket index 0
[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013436 e Job              JobExecutorUtil.cpp(01589) : log.CPU 8 core index 0 socket index 0
[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013437 e Job              JobExecutorUtil.cpp(01589) : log.CPU 9 core index 1 socket index 0
[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013438 e Job              JobExecutorUtil.cpp(01589) : log.CPU 10 core index 2 socket index 0
[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013439 e Job              JobExecutorUtil.cpp(01589) : log.CPU 11 core index 3 socket index 0
[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013440 e Job              JobExecutorUtil.cpp(01589) : log.CPU 12 core index 4 socket index 0
[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013441 e Job              JobExecutorUtil.cpp(01589) : log.CPU 13 core index 5 socket index 0
[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013442 e Job              JobExecutorUtil.cpp(01589) : log.CPU 14 core index 6 socket index 0
[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013443 e Job              JobExecutorUtil.cpp(01589) : log.CPU 15 core index 7 socket index 0
[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013444 e Job              JobExecutorUtil.cpp(01589) : log.CPU 16 core index -1 socket index -1
(...)
[18376]{-1}[-1/-1] 2016-10-27 18:50:19.408141 e Job              JobExecutorUtil.cpp(01589) : log.CPU 127 core index -1 socket index -1
[18374]{-1}[-1/-1] 2016-10-27 18:50:20.175527 i Service_Shutdown TrexService.cpp(00803) : Preparing for shutting service down
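For a longer trace file, the per-CPU lines can be summarized with a small script. This is a minimal sketch: the regular expression is derived only from the lines quoted above, and the function name `split_cpu_mapping` is made up for illustration. It treats a core or socket index of -1 as "unmapped".

```python
import re
from typing import Iterable, List, Tuple

# Matches the per-CPU lines written at JobExecutorUtil.cpp(01589) in the trace above.
CPU_LINE = re.compile(r"log\.CPU (\d+) core index (-?\d+) socket index (-?\d+)")

def split_cpu_mapping(lines: Iterable[str]) -> Tuple[List[int], List[int]]:
    """Return (mapped, unmapped) logical CPU numbers found in the trace lines."""
    mapped: List[int] = []
    unmapped: List[int] = []
    for line in lines:
        match = CPU_LINE.search(line)
        if not match:
            continue  # not a per-CPU mapping line
        cpu, core, socket = (int(group) for group in match.groups())
        if core < 0 or socket < 0:
            unmapped.append(cpu)
        else:
            mapped.append(cpu)
    return mapped, unmapped

# Two lines copied from the trace above.
sample = [
    "[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013443 e Job              JobExecutorUtil.cpp(01589) : log.CPU 15 core index 7 socket index 0",
    "[18276]{-1}[-1/-1] 2016-10-27 18:48:07.013444 e Job              JobExecutorUtil.cpp(01589) : log.CPU 16 core index -1 socket index -1",
]
mapped, unmapped = split_cpu_mapping(sample)
print("mapped:", mapped, "unmapped:", unmapped)
```

To run it against a real trace, pass an open file object instead of `sample`; the file name depends on the installation and is not shown here.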

I have attached the trace files from the daemon and hdbindexserver for further analysis.

Has anyone faced this problem before? Any help is appreciated.

Thank you.


2 Answers

Doug Rainey Nov 25, 2016 at 03:25 PM

Sorry, I don't have an answer, but we have exactly the same issue, on HANA SPS12 and SUSE Linux 11.4.

If you look in /var/log/warn, do you also see this message at the same or a similar time? Of course, there are lots of messages.

BUG: soft lockup - CPU#1 stuck for 25s! [MemoryCompactor:28564]
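Since /var/log/warn contains many messages, a short script can pick out just the soft lockup lines so their timestamps can be compared with the HANA trace. This is a sketch only: the pattern is based on the single message quoted above, and other kernel versions may format it differently.

```python
import re
from typing import Iterable, List, Tuple

# Pattern derived from the one message quoted above (an assumption about the format).
LOCKUP = re.compile(r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.+?):(\d+)\]")

def find_soft_lockups(lines: Iterable[str]) -> List[Tuple[int, int, str, int]]:
    """Return (cpu, seconds, task, pid) for every soft lockup message in the lines."""
    hits = []
    for line in lines:
        match = LOCKUP.search(line)
        if match:
            cpu, seconds, task, pid = match.groups()
            hits.append((int(cpu), int(seconds), task, int(pid)))
    return hits

example = ["BUG: soft lockup - CPU#1 stuck for 25s! [MemoryCompactor:28564]"]
print(find_soft_lockups(example))
```

On a real system, the function would be called with the lines of /var/log/warn, for example `find_soft_lockups(open("/var/log/warn"))`, which typically needs root permissions.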

I'm a DBA, not an OS or VMware guy, but I'll speculate. I suspect it is looking for NUMA boundaries, which in turn causes thrashing that VMware or Linux can't handle. Could this be a VMware bug?
Tiago Marinho Dec 07, 2016 at 01:14 PM

Hi Doug,

Since I could not solve the problem, I reinstalled the DB and it worked. The first time (the scenario I described when I posted this thread), it installed with some warnings. I am not sure what exactly solved the problem, but the second time I ran the installation from the command line instead of the GUI.

One thing I noticed was that when I tested the library dependencies (with the ldd command on UNIX), the check returned a false positive: I had the 32-bit versions of the libraries, but the installer needed the ones from the 64-bit folder. I hope this helps with the solution of your problem.
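One way to confirm whether a given library file is 32-bit or 64-bit, independently of ldd, is to read its ELF header: after the four magic bytes `\x7fELF`, the fifth byte (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects. The sketch below is illustrative only; the function names are made up, and which libraries the HANA installer actually checks is not stated in this thread.

```python
from typing import Optional

def elf_class_from_header(header: bytes) -> Optional[str]:
    """Classify an ELF file from its first five bytes; None if it is not ELF."""
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    # EI_CLASS: 1 = ELFCLASS32, 2 = ELFCLASS64
    return {1: "32-bit", 2: "64-bit"}.get(header[4])

def elf_class(path: str) -> Optional[str]:
    """Open the file at `path` and report whether it is a 32-bit or 64-bit ELF object."""
    with open(path, "rb") as handle:
        return elf_class_from_header(handle.read(5))

# Self-contained demonstration on synthetic headers (no real library is read here).
print(elf_class_from_header(b"\x7fELF\x01"))
print(elf_class_from_header(b"\x7fELF\x02"))
```

Calling `elf_class` on each library that ldd resolves would show whether a 32-bit copy is being picked up where a 64-bit one is expected.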


Best Regards
