Former Member

How are remote source inputs loaded when several flowgraphs run in parallel? Are there manuals on this topic?

I'm looking for a good manual on Smart Data Integration jobs, especially on running them in parallel, but I can't find anything detailed enough.

In particular, I'm looking for answers to this question:

I have two flowgraphs, F1 and F2. F1 has two remote sources (T1 and T2) and F2 has two remote sources (T1 and T3), so T1 is used by both. If I run these two flowgraphs, will T1 be loaded twice (once by F1 and once by F2), or is there some kind of "temporary cache area" so that T1 is loaded only once? And if I run the two flowgraphs in parallel, how does the loading of T1 work?

Thanks in advance


1 Answer

  • Feb 07, 2017 at 11:26 PM

    Hi Enrico,

    I think that when you mention a Remote Source you are actually referring to Virtual Tables pointing to a Remote Source, correct?
    In general, operations are pushed down to the source if the adapter supports them. For example, if a Filter or Aggregation node comes directly after the Data Source node of a virtual table, the query might get pushed down, and the filter or aggregation is then applied on the remote source system. In that case the remote database may do some caching of its own, so that the same query issued by the second flowgraph can reuse the caches it created internally.
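To make the effect of pushdown concrete, here is a minimal toy model in Python. It is not the SDI or DPAgent API: `REMOTE_T1`, `fetch`, and `is_eu` are hypothetical names invented for this sketch, and the "remote source" is just an in-memory list. It only illustrates that a pushed-down filter reduces the number of rows that have to cross the network, while the result stays the same.

```python
# Toy model only: names and data are hypothetical, not part of SDI.
REMOTE_T1 = [{"id": i, "region": "EU" if i % 4 == 0 else "US"} for i in range(1000)]


def fetch(predicate=None):
    """Return the rows the simulated remote source sends over the network."""
    if predicate is None:
        # No pushdown: the remote side returns every row.
        return list(REMOTE_T1)
    # Pushdown: the remote side applies the filter before sending rows.
    return [row for row in REMOTE_T1 if predicate(row)]


def is_eu(row):
    return row["region"] == "EU"


# Without pushdown: transfer everything, then filter locally.
transferred_all = fetch()
local_result = [row for row in transferred_all if is_eu(row)]

# With pushdown: only matching rows are transferred.
pushed_result = fetch(is_eu)

rows_without_pushdown = len(transferred_all)
rows_with_pushdown = len(pushed_result)
print(rows_without_pushdown, rows_with_pushdown)
```

Whether a given Filter or Aggregation is really pushed down depends on the adapter, so the actual saving has to be checked on the real system.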

    From the HANA side, two flowgraphs that read the same source at the same time can run slower, because the DPAgent's usual bottleneck is the network, and the same data then has to travel over it twice.

    As an alternative, you could replicate the common data to a HANA table, or put everything into the same flowgraph. T1 is then loaded only once, which should give better performance (especially with the newly introduced task partitions).
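The difference between reading T1 directly in both flowgraphs and staging it once can be sketched with a small Python simulation. This is a toy model, not SDI code: `RemoteSource`, `run_flowgraph`, and `read_with_staging` are hypothetical names, and it assumes that every flowgraph reading T1 directly triggers its own remote read.

```python
# Toy model only: counts how often each table is read from the remote side.
class RemoteSource:
    def __init__(self, tables):
        self.tables = tables
        self.reads = {name: 0 for name in tables}

    def read(self, name):
        """Simulate one remote load of a table and count it."""
        self.reads[name] += 1
        return list(self.tables[name])


def run_flowgraph(read_table, inputs):
    """Load every input through read_table and return row counts."""
    return {name: len(read_table(name)) for name in inputs}


def make_source():
    return RemoteSource({"T1": [1, 2, 3], "T2": [4, 5], "T3": [6]})


# Case A: F1 and F2 both read T1 directly from the remote source.
direct = make_source()
run_flowgraph(direct.read, ["T1", "T2"])  # F1
run_flowgraph(direct.read, ["T1", "T3"])  # F2

# Case B: T1 is first replicated once into a local (HANA-like) copy.
staged = make_source()
local_t1 = staged.read("T1")


def read_with_staging(name):
    # T1 comes from the local copy; other tables are still read remotely.
    return local_t1 if name == "T1" else staged.read(name)


run_flowgraph(read_with_staging, ["T1", "T2"])  # F1
run_flowgraph(read_with_staging, ["T1", "T3"])  # F2

direct_t1_reads = direct.reads["T1"]
staged_t1_reads = staged.reads["T1"]
print(direct_t1_reads, staged_t1_reads)
```

In the staged case the cost of T1 is paid once up front, which is the trade-off behind replicating the common data or merging the flowgraphs.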


