
Job: Importing data into DB table

Former Member
0 Kudos

Hi everybody,

we are experiencing severe performance problems when importing accounts (about 36000 objects) from an LDAP directory into a DB table for further processing. The whole import takes more than 1800 seconds and aborts because the LDAP server closes the connection. The Advanced button in the LDAP pass didn't help.

Obviously, LDAP is not the bottleneck here, because reading the data into a CSV file with a self-implemented Java tool takes only 2 minutes with the same basic transformation logic for the attributes. Trying to read this generated file with a From ASCII file pass also takes very long. It seems like SAP IDM performs a commit after each insert. Is it possible to influence this behaviour?
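For reference, a minimal sketch of the kind of standalone Java/JNDI reader described above; the host, base DN, filter and attribute names are placeholders, not the actual values from this system:

```java
import java.io.FileWriter;
import java.io.PrintWriter;
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.directory.*;

public class LdapToCsv {
    public static void main(String[] args) throws Exception {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, "ldap://notes.example.com:389"); // placeholder host
        DirContext ctx = new InitialDirContext(env);

        SearchControls sc = new SearchControls();
        sc.setSearchScope(SearchControls.SUBTREE_SCOPE);

        try (PrintWriter out = new PrintWriter(new FileWriter("accounts.csv"))) {
            NamingEnumeration<SearchResult> results =
                    ctx.search("o=example", "(objectClass=person)", sc); // placeholder base DN/filter
            while (results.hasMore()) {
                Attributes attrs = results.next().getAttributes();
                Attribute cn = attrs.get("cn");
                Attribute mail = attrs.get("mail");
                // simple per-attribute transformation would go here
                out.println((cn != null ? cn.get() : "") + ";" + (mail != null ? mail.get() : ""));
            }
        }
        ctx.close();
    }
}
```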

Importing the CSV file into the database with the Microsoft "bcp" tool takes less than 5 seconds.

Kind regards

Matthias

Accepted Solutions (0)

Answers (2)


Former Member
0 Kudos

Are you limiting the attributes returned from LDAP at all? Getting * is very convenient, but not very efficient if all you need is 10 attributes out of 250. Especially if there are multivalued attributes you don't need that hold large amounts of data. What type of directory (Sun, OpenLDAP, ADS, ...) is it?
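For illustration, limiting the attribute set in a plain JNDI search looks like this; the attribute names are just an example subset, not the actual mapping (in IdM the same effect comes from listing attributes in the pass instead of querying *):

```java
import javax.naming.directory.SearchControls;

// Request only the attributes the job actually maps, instead of "*".
SearchControls sc = new SearchControls();
sc.setSearchScope(SearchControls.SUBTREE_SCOPE);
sc.setReturningAttributes(new String[] {
    "cn", "uid", "mail", "sn", "givenName"  // example subset, not the real mapping
});
// ctx.search("o=example", "(objectClass=person)", sc);
```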

Br,

Per Christian


Former Member
0 Kudos

Hi Per,

Currently there's a query for *, but there are only about 15 attributes in the full set and we are processing 11 or 12 of them, so I don't think there's much benefit in limiting the attributes.

kind regards

Matthias

Former Member
0 Kudos

Which directory are you using? I'm pretty sure it's not ADS, since you only get 15 attributes 🙂

Also check whether disabling "follow referrals" and alias resolving makes any difference. I've seen instances where LDP and other tools are much faster simply because they skip these settings.
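In JNDI terms the two settings look roughly like this, both disabled; whether IdM maps its pass options exactly to these properties is an assumption:

```java
import java.util.Hashtable;
import javax.naming.Context;

// JNDI equivalents of the two settings mentioned above:
Hashtable<String, String> env = new Hashtable<>();
env.put(Context.REFERRAL, "ignore");               // don't follow referrals
env.put("java.naming.ldap.derefAliases", "never"); // don't resolve aliases
```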

If I remember correctly, you could run into problems with servers that don't support paging but only support virtual list views, which IdM doesn't support. But that was some years ago now.

Br

Per Christian

Former Member
0 Kudos

Hi Per,

No, it's not AD. It's Notes LDAP (a kind of LDAP access to the Notes address book). I will check the settings you mentioned, but apparently the database guys fixed something, because the jobs are performing better right now.

Nevertheless, doing the same operation with a Java tool or Microsoft's bcp tool is much faster than importing the data with the IDM job. A setting for the commit interval would be very nice!

kind regards

Matthias

former_member2987
Active Contributor
0 Kudos

Matthias,

It all depends on what those attributes are. As Per Christian said, if they are multivalued or contain large amounts of data (PKI certs), this can still slow you down. It's worth investigating.

Also, where is the server located in relation to the IDM server? Is there a slow or flaky network connection in the way? I know you benchmarked against another tool, but did you run that test from the same server as IDM? Also, were you able to check the paging setting?

Matt

Former Member
0 Kudos

Hi Matthias

Just confirming - you're doing a FROM LDAP pass to a temporary destination DB (essentially like the AD initial load jobs).

The performance you're getting is woeful. It takes me 180 seconds for about 9000 objects, and I'm doing JavaScript processing on each object. Is there any other processing going on during the LDAP read / write to DB? How many attributes are you reading?

You could try changing the page size from 50 to something larger: it'll fetch bigger chunks and hopefully process them more efficiently.
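A sketch of what a paged read with a bigger page size looks like at the protocol level (RFC 2696 simple paged results), assuming a plain JNDI client; the host and base DN are placeholders, and IdM's own page size is set in the pass options rather than in code:

```java
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;
import javax.naming.ldap.*;

static void pagedSearch() throws Exception {
    int pageSize = 500; // instead of the default 50
    Hashtable<String, String> env = new Hashtable<>();
    env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
    env.put(Context.PROVIDER_URL, "ldap://notes.example.com:389"); // placeholder host
    LdapContext ctx = new InitialLdapContext(env, null);

    byte[] cookie = null;
    do {
        // ask the server for the next page of at most pageSize entries
        ctx.setRequestControls(new Control[] {
                new PagedResultsControl(pageSize, cookie, Control.CRITICAL) });
        NamingEnumeration<SearchResult> results =
                ctx.search("o=example", "(objectClass=person)", new SearchControls());
        while (results.hasMore()) {
            results.next(); // process each entry here
        }
        cookie = null; // read the next-page cookie from the response controls
        Control[] resp = ctx.getResponseControls();
        if (resp != null) {
            for (Control c : resp) {
                if (c instanceof PagedResultsResponseControl) {
                    cookie = ((PagedResultsResponseControl) c).getCookie();
                }
            }
        }
    } while (cookie != null);
    ctx.close();
}
```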

Peter

Former Member
0 Kudos

Hi Peter,

Yes, it's like an AD initial load job, but the source system is Notes LDAP.

The strange thing about this is that this process (and other processes loading many accounts from a system, whether Notes LDAP, AD, or ADAM) has been performing very badly for a few days now. It seems like there have been problems with the database recently (we've already asked the DB admins to check).

I think the FromLDAP pass issues an INSERT and a commit for each record, whereas committing after a configurable amount of data would be more efficient. Is this possible?
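The thread doesn't establish whether the pass exposes such an option, but the pattern being asked for is ordinary JDBC batching with a configurable commit interval; a minimal sketch, with hypothetical table and column names:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Insert rows in batches and commit every COMMIT_INTERVAL rows
// instead of committing after each insert.
static void batchInsert(Connection con, Iterable<String[]> rows) throws SQLException {
    final int COMMIT_INTERVAL = 1000; // configurable batch size
    con.setAutoCommit(false);
    try (PreparedStatement ps = con.prepareStatement(
            "INSERT INTO tmp_accounts (dn, cn, mail) VALUES (?, ?, ?)")) { // hypothetical table
        int n = 0;
        for (String[] row : rows) {
            ps.setString(1, row[0]);
            ps.setString(2, row[1]);
            ps.setString(3, row[2]);
            ps.addBatch();
            if (++n % COMMIT_INTERVAL == 0) {
                ps.executeBatch();
                con.commit(); // one commit per batch, not per insert
            }
        }
        ps.executeBatch(); // flush the remainder
        con.commit();
    }
}
```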

kind regards

Matthias

Former Member
0 Kudos

Are you using SQL 2008? If you reindex the database you might find that performance picks up. You may also have to reindex TEMP_DB. I've seen issues on SQL 2008 with other IDM products where the server essentially grinds to a halt doing inserts. In fact, I think TEMP_DB was the whole problem...

It's a temporary fix, not a solution, but it would at least let you know where to start looking for the real solution.

Peter

Former Member
0 Kudos

Hi Matthias

I had a similar problem before with a very large directory. We had populated a group of approx. 4000 members, which had a series of nested groups and was set to grow significantly later in the project.

I solved the problem in two steps.

  1. Performance was problematic because the standard initial load job reads all users first and then all group memberships of those users. We switched the ReadGroups and ReadUsers passes around, so that a temporary table with the group membership data was created in the ReadGroups pass. Then, in the ReadUsers pass, I used the Advanced button on the FromLDAP Source tab to define a query based on the temporary table filled in the previous step, limiting the DNs read to usernames beginning with A% to D%, for example. I then duplicated this pass and adjusted the SQL to read from D% to H%, and so on.
  2. Even using this method, we reached a point in our read where we needed to read managers of users as individual identities in their own right, and we still experienced crashes because the open connections from the previous pass had not been released when the next pass started. I had discussions with the directory expert, and through LDAP tracing we established that IdM was opening a connection to the directory for each individual DN. The directory had a limit of 1000 concurrent open connections, so although it was load balanced we were connected to just one server and were consuming too many connections.

So IdM was actually working too fast for the directory server. I understand that technically it is possible to open one connection and read all the data over that one connection (as the bcp tool probably does), but I have not found out how to make IdM do that; I assume some scripting would be required to change the default behaviour.

The solution I used was to introduce a delay of around 1 minute, using the generic JavaScript function uSleep(<Milliseconds>); in a pass between the FromLDAP passes. This gave enough time for the connections between IdM and the directory to be released, so the next pass could start again.

At the time, on 7.1 SP5, this approach was endorsed by the SAP IdM product team.

I hope this helps your situation.

arndt_spitzer
Member
0 Kudos

Hi Matthias,

did you check whether the performance issues could come from the Notes LDAP server? There is also an option on the Domino side to build an index, and so on. Just an idea:

http://www-12.lotus.com/ldd/doc/domino_notes/rnext/help6_admin.nsf/f4b82fbb75e942a6852566ac0037f284/...

Besides this, there could be processes in the database that are deadlocked; that could cause performance issues as well. The DBAs could figure that out on the DB host via Management > Activity Monitor (MS SQL).

Regards, Arndt