Extract BSEG table using SLT/Data Services to flat files

yogesh_joshi
Discoverer

The scenario is ERP > SLT > DS > flat files.

We need to do an initial extract of SAP table BSEG, which has billions of records. SLT is set up to trigger the initial load. DS has to fetch this data from SLT and generate multiple flat files.

When this data is read from SLT, DS automatically includes an additional column, DI_SEQUENCE_NUMBER (int, starting from 0, mapped through to the Query output, and the output is already sorted on this column).

We want to write the first 10 million records into one flat file, the next 10 million into another, and so on. What is the best approach to designing a flow to handle this, given the billions of records? There is no need to group the incoming data or apply any further transforms.

Edit: the SLT-to-DS data transfer uses RFC.

Any ideas or suggestions are much appreciated, thanks for your time.

Yogesh

Accepted Solutions (0)

Answers (2)

former_member208402
Active Contributor

Hi Yogesh Joshi,

You can achieve this by using a loop. Take a count of your source data and divide it by 10 million; you will get a value, say 10, which means you have to loop your dataflow 10 times.

Initialize variables outside the loop:

$G_Loop = 1;

$G_max_loop = 10;   (the value you got after dividing your source count by 10 million)

$G_INITIAL = 1;

$G_MAX = 10000000;
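
If you would rather derive $G_max_loop than hard-code it, here is a minimal sketch; the datastore name DS_ERP is a placeholder, and pulling the count via sql() with a ceil() round-up is my assumption, not something from the original setup:

# Hypothetical: fetch the source row count through a direct datastore connection
# (sql() returns a varchar; DS converts it implicitly in the arithmetic below)
$G_COUNT = sql('DS_ERP', 'SELECT COUNT(*) FROM BSEG');
# Round up so a partial last chunk still gets its own file and iteration
$G_max_loop = ceil($G_COUNT / 10000000);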

Set the target file name using a variable, e.g. $G_FILENAME = 'filename_' || $G_Loop;

Now connect source -----> Query -----> target file.

In the Query's where condition, give:

DI_SEQUENCE_NUMBER >= $G_INITIAL and DI_SEQUENCE_NUMBER <= $G_MAX   (which will be DI_SEQUENCE_NUMBER >= 1 and DI_SEQUENCE_NUMBER <= 10000000; note that DI_SEQUENCE_NUMBER starts from 0, so start $G_INITIAL at 0 if you do not want to drop the very first record)

This will load 10 million records to the file (filename_1, since $G_Loop = 1).

Now, inside the loop, add a script after the dataflow:

$G_INITIAL = $G_MAX + 1;   (this value will be 10000001)

$G_MAX = $G_MAX + 10000000;   (this value will be 20000000)

$G_Loop = $G_Loop + 1;   (this value will be 2)

Now the where condition will be

DI_SEQUENCE_NUMBER >= $G_INITIAL and DI_SEQUENCE_NUMBER <= $G_MAX   (which will be DI_SEQUENCE_NUMBER >= 10000001 and DI_SEQUENCE_NUMBER <= 20000000)

This will load the next 10 million records to the file (filename_2, since $G_Loop = 2).

The file name will be different every time, so one dataflow in a loop creates multiple files for you, with 10 million records in each; a consolidated sketch follows below.
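
For reference, here is a consolidated sketch of the scripts around the While Loop, with the dataflow sitting inside the loop between the two inner scripts. Starting the window at 0 is my adjustment, since DI_SEQUENCE_NUMBER starts from 0; everything else follows the steps above:

# Script before the While Loop: initialize the control variables
$G_Loop = 1;
$G_max_loop = 10;   # source count divided by 10 million, rounded up
$G_INITIAL = 0;     # DI_SEQUENCE_NUMBER starts from 0, so the first window begins there
$G_MAX = 9999999;   # first window 0..9999999 = 10 million records

# While Loop condition: $G_Loop <= $G_max_loop

# Script inside the loop, before the dataflow: a fresh file name per iteration
# (following the expression above; add an explicit to_char() if the implicit
# int-to-varchar conversion complains in your version)
$G_FILENAME = 'filename_' || $G_Loop;

# Script inside the loop, after the dataflow: advance the window by 10 million
$G_INITIAL = $G_MAX + 1;
$G_MAX = $G_MAX + 10000000;
$G_Loop = $G_Loop + 1;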

Change this logic according to your requirement.

Thanks,

Ravi kiran

yogesh_joshi
Discoverer

Hi Ravi,

Thanks for checking this and for the suggestions. What you have suggested is to read from the ODP source repeatedly with filters, and that doesn't seem to work for us. I have two questions:

1. Using DI_SEQUENCE_NUMBER in the Query's where clause trips something up: the job doesn't get executed, with errors returned by the ODP API, I believe. Are you able to run this without any issues?

2. Similar thing with the ODP source inside a loop: the first iteration runs fine and we get the first 10 million records, but when the second iteration starts, there are errors returned from within the ODP API again, I think. Are you able to read the ODP source repeatedly in a loop?
