cancel
Showing results for 
Search instead for 
Did you mean: 

Process Multiple Files Sequentially and load into staging table.

former_member241220
Participant
0 Kudos

Hi All,

I have requirement we are extracting the Multiple files from on of the source folder( D:\Source). the folder contains files such as .CSV format

Ex:


Test-20160616-00000049.csv

Test-20160616-00000050.csv

Test-20160616-00000051.csv

Test-20160616-00000052.csv

I need to extract this files one by one ( Sequentially ) and load into staging table( DB2 database). My guess is I have to use while loop in this scenario to loop files one by one. can you please give the script to do this process.

Thanks

Murali

Accepted Solutions (1)

Accepted Solutions (1)

former_member208402
Active Contributor
0 Kudos

Hi Murali,

No need to use while loop. in the source file format give file name as  Test*.csv   or    *.csv and it will pick all the files that start with the name Test from that path and load files to target sequentially in single execution.

if you give  *.csv  it will pick all the files that end with .csv.

Thanks,

Ravi Kiran.

former_member241220
Participant
0 Kudos

Hi Ravi Kiran,

Thanks for your swift reply.

I have done same as it is what you suggested. But my client does not want like this, he wants load the files one after one and want to see filename in print function. I think there is a time gap between each and every file. please give me your suggestion how to achieve this scenario.

Thanks

Murali.

former_member208402
Active Contributor
0 Kudos

Hi Murali,

Your file name is Test-20160616-00000049.csv

This is in the format Test-sysdate-00000049

if Test and 00000049 will be same everyday and if only sysdate value changes.. i think you can achieve this.

design you job this way

Script-------- While_Loop

write the script like this

$G_FILENAME='Test'||'-'||to_char(sysdate();,'yyyymmdd')||'-'||'00000049'||'.csv';

$G_FILENAME_CHANGE=substr($G_FILENAME,15,8);

Here 00000049 will be assigned to this variable ($G_FILENAME_CHANGE)

now inside the while loop design like this

script_start----------Dataflow---------- script_end

inside script_start:-

$G_FILENAME='Test'||'-'||to_char(sysdate();,'yyyymmdd')||'-'||$G_FILENAME_CHANGE||'.csv';

print($G_FILENAME);

inside Dataflow:-

design your flow and in the source file filename give this variable ($G_FILENAME)

inside Script_end:-

$G_FILENAME_CHANGE=$G_FILENAME_CHANGE+1;

this make 00000049 as 00000050 and this will be assigned to the $G_FILENAME variable and that file will be picked.

But give condition in the loop based on the number of files you have to read

Please reply if u need more information and also check if this works.

Please excuse if there are any syntax errors

Thanks,

Ravi Kiran.

former_member241220
Participant
0 Kudos

Hi Ravi kiran.

Thank you so much for you patience. you are awesome.

I need small conformation, what are the condition i have to give in while loop. Suppose if i have 4 files

I have to give like this whlileloop ($G_FILENAME_CHANGE>=4).Please suggest me

your help much appreciated

Thanks,

Murali

former_member208402
Active Contributor
0 Kudos

Hi Murali,

$G_FILENAME_CHANGE variable is used to store 00000049/00000050 values which are last part of the file names.


Declare new global variable for loop condition like $G_LOOP_COUNT of datatype int and intialize it to 1 in the script_start


$G_LOOP_COUNT=1;


and in the script_end


$G_LOOP_COUNT=$G_LOOP_COUNT+1;


Now give condition in Loop as


$G_LOOP_COUNT<=4 (less than or equal to 4



and please mark my above answer as correct if it solved your issue, so that it will be useful for other members.



Thanks,

Ravi Kiran.

former_member241220
Participant
0 Kudos

Hi Ravi Kiran,

Thanks for your support to develop this my job.

As you mentioned in the previous repose.

inside Dataflow:-

design your flow and in the source file filename give this variable ($G_FILENAME)

I have pass variable "$G_FILENAME" in the source file filename. After try to update schema, i am getting only one field. But my table having 15 fields. Because of that I am getting error in my job.

Hope you understand my question. I hope if i fix this, i can run job successfully.

Once again thanks your help.

Thanks

Murali

former_member208402
Active Contributor
0 Kudos

Hi Murali,

You try adding the fields manually in file format sturcture and this will fix the issue.

in the local object library right click on the file format u defined for this file and

EDIT----> and add column names manually in the structure

Thanks,

Ravi Kiran

former_member208402
Active Contributor
0 Kudos

Hi Murali,

You can also try this.. update the schema of the source file with actual file name and after updating schema replace file name with $G_FILENAME and dont update schema now and just save it and close.

former_member241220
Participant
0 Kudos

Hi Ravi kiran,

Thank you so much for your swift reply.

Yes, this will help to get all fields in my source.

Thanks you so much. Thank you is small word for your help. But I don't know how to say another way.

Regards,

Murali.

former_member208402
Active Contributor
0 Kudos

Hi Murali,

So you are working in Mindtree, hyderabad?

former_member241220
Participant
0 Kudos

Hi Ravi Kiran,

You can reach me out "muralikrishna.bods156@gmail.com". Then we can talk personally.

I think it is not good sign to talk our things here. Hope you understand.

Thanks and Regards,

Murali.

former_member241220
Participant
0 Kudos

Hi Ravi Kiran,

Hope you are doing good,

I have implemented my job as you suggest and job is going smoothly.

But I am getting small problem to load data into database.

you know that i have 4 flat files, assume each flat file having 15 records. That means in target the records should be 4x15= 60 records.

But i am getting 15 records only that too final flat file (Test-20160616-00000052.csv) data only.

I do not know what is happeing inside dataflow.

Please guide me how to overcome this issue.

Thanks,

Murali

former_member241220
Participant
0 Kudos

Thank to all,

Issue resolved, I forget disable "Delete Data before loading" check box in target table.Now I disable, now I can see all files records in target table.

Thanks you so much you hlep to resolve this issue.

Thanks

Murali

former_member241220
Participant
0 Kudos

Hi Ravi,

How are you? Hope you are doing good,

I came back for few scenarios on this requirement.I am trying to myself to fix, and  I want take your help in this requirement.

Now my requirement is, Initial days my files are Test-20160616-00000049.csv

for this we passed variable file name $G_FILENAME and apply this variable in file_format

but now i am getting files with different format like one day i am getting files

such as

Test-20160617-00000086.csv,

Test-20160617-00000092.csv,

Test-20160618-0000012AA.csv

How to over come this.

For while loop we create a variable $G_Loop_Count

some times i am getting 6 files and some times i am getting 15 files from source

how can I give particular count for while because we do not know how many files coming that day.

Hope you understand my requirement.

Please give me your suggestion on this scenario.

Regards,

Murali.

former_member208402
Active Contributor
0 Kudos

Hi Murali,

for $G_Loop_Count logic use cmd to read number of files exist in that folder and assign that value to $G_Loop_Count.

Refer this link how to read count of files usig command

http://www.digitalcitizen.life/4-ways-count-number-folders-and-files-inside-folder

or try anything like this in command.

and for files with different names we need to work out and find the solution.

first try this reading count. and let me know.

Thanks,

Ravi kiran.

Answers (0)