
SDI Initial Load from Hive tables to HANA

Jul 28, 2017 at 06:58 PM



We configured a Hive adapter using the SDI DP Agent. We have a huge table in the Hive database with 125 fields.

We created a virtual table on top of the Hive table and a flowgraph that pulls the data based on a date filter.
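For context, the setup described above can be sketched in HANA SQL. The remote source, schema, table, and column names below are placeholders, not the actual objects:

```sql
-- Create a virtual table over the Hive table exposed through the
-- DP Agent's Hive adapter (all object names here are placeholders).
CREATE VIRTUAL TABLE "MYSCHEMA"."VT_HIVE_SALES"
  AT "HIVE_REMOTE_SRC"."<NULL>"."default"."sales";

-- The flowgraph then effectively issues a date-filtered read like:
SELECT *
FROM "MYSCHEMA"."VT_HIVE_SALES"
WHERE "LOAD_DATE" = '2017-07-28';
```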

We have around 10 million records per day, and it is taking around 1 hour to read the data from Hive.

When we look at the Hadoop YARN logs, the job completed within 4 minutes.

Can you help me with the questions below?

1) Are there any settings in the SDI DP Agent (such as increasing the number of threads/processes) to improve the speed of the data loads?

2) How do we know how many threads the DP Agent is using when pulling data from Hive?

3) How do we monitor the loads in the DP Agent?




1 Answer

Timo Wagner
Jul 31, 2017 at 06:44 PM

Hi Srini,

There are a couple of parameters that we can tweak to get the data into HANA, as I've mentioned in this answer ( ).

In addition to the two things mentioned there (configuring the fetch size in the agent and partitioning on the source), you can also try to partition the load using task partitions, which are accessible through the flowgraph settings. This enables actual parallel loading (or sequential, if necessary).

In your case you can add task partitions based on the date column you mentioned. You can then also specify the number of parallel jobs that should run.
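Task partitions are defined in the flowgraph editor rather than in SQL, but the effect is equivalent to splitting the filtered read into disjoint ranges that can run in parallel. A conceptual illustration (table and column names are placeholders):

```sql
-- Conceptually, range task partitions on the date column turn one large
-- read into several smaller, non-overlapping ones, e.g. for a backfill:
SELECT * FROM "MYSCHEMA"."VT_HIVE_SALES"
  WHERE "LOAD_DATE" >= '2017-07-01' AND "LOAD_DATE" < '2017-07-11';
SELECT * FROM "MYSCHEMA"."VT_HIVE_SALES"
  WHERE "LOAD_DATE" >= '2017-07-11' AND "LOAD_DATE" < '2017-07-21';
SELECT * FROM "MYSCHEMA"."VT_HIVE_SALES"
  WHERE "LOAD_DATE" >= '2017-07-21' AND "LOAD_DATE" < '2017-08-01';
-- With the number of parallel jobs set to 3, these partitions are read
-- concurrently instead of as a single 10-million-row scan.
```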

For more information on task partitions and the performance boost we see, please have a look at this blog post:
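Regarding your question 3 (monitoring): HANA ships system views for monitoring the data provisioning framework. A sketch, assuming these views are present in your revision; column sets vary by release, so `SELECT *` is used here:

```sql
-- DP Agents registered with this HANA system, with their status:
SELECT * FROM "SYS"."M_AGENTS";

-- Statements currently executing against remote sources
-- (a long-running Hive read shows up here while it is in flight):
SELECT * FROM "SYS"."M_REMOTE_STATEMENTS";
```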

Please let me know if you have any questions.

