on 06-16-2016 12:36 PM
Hi All,
I have requirement we are extracting the Multiple files from on of the source folder( D:\Source). the folder contains files such as .CSV format
Ex:
Test-20160616-00000049.csv
Test-20160616-00000050.csv
Test-20160616-00000051.csv
Test-20160616-00000052.csv
I need to extract this files one by one ( Sequentially ) and load into staging table( DB2 database). My guess is I have to use while loop in this scenario to loop files one by one. can you please give the script to do this process.
Thanks
Murali
Hi Murali,
No need to use while loop. in the source file format give file name as Test*.csv or *.csv and it will pick all the files that start with the name Test from that path and load files to target sequentially in single execution.
if you give *.csv it will pick all the files that end with .csv.
Thanks,
Ravi Kiran.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi Ravi Kiran,
Thanks for your swift reply.
I have done same as it is what you suggested. But my client does not want like this, he wants load the files one after one and want to see filename in print function. I think there is a time gap between each and every file. please give me your suggestion how to achieve this scenario.
Thanks
Murali.
Hi Murali,
Your file name is Test-20160616-00000049.csv
This is in the format Test-sysdate-00000049
if Test and 00000049 will be same everyday and if only sysdate value changes.. i think you can achieve this.
design you job this way
Script-------- While_Loop
write the script like this
$G_FILENAME='Test'||'-'||to_char(sysdate();,'yyyymmdd')||'-'||'00000049'||'.csv';
$G_FILENAME_CHANGE=substr($G_FILENAME,15,8);
Here 00000049 will be assigned to this variable ($G_FILENAME_CHANGE)
now inside the while loop design like this
script_start----------Dataflow---------- script_end
inside script_start:-
$G_FILENAME='Test'||'-'||to_char(sysdate();,'yyyymmdd')||'-'||$G_FILENAME_CHANGE||'.csv';
print($G_FILENAME);
inside Dataflow:-
design your flow and in the source file filename give this variable ($G_FILENAME)
inside Script_end:-
$G_FILENAME_CHANGE=$G_FILENAME_CHANGE+1;
this make 00000049 as 00000050 and this will be assigned to the $G_FILENAME variable and that file will be picked.
But give condition in the loop based on the number of files you have to read
Please reply if u need more information and also check if this works.
Please excuse if there are any syntax errors
Thanks,
Ravi Kiran.
Hi Murali,
$G_FILENAME_CHANGE variable is used to store 00000049/00000050 values which are last part of the file names.
Declare new global variable for loop condition like $G_LOOP_COUNT of datatype int and intialize it to 1 in the script_start
$G_LOOP_COUNT=1;
and in the script_end
$G_LOOP_COUNT=$G_LOOP_COUNT+1;
Now give condition in Loop as
$G_LOOP_COUNT<=4 (less than or equal to 4
and please mark my above answer as correct if it solved your issue, so that it will be useful for other members.
Thanks,
Ravi Kiran.
Hi Ravi Kiran,
Thanks for your support to develop this my job.
As you mentioned in the previous repose.
inside Dataflow:-
design your flow and in the source file filename give this variable ($G_FILENAME)
I have pass variable "$G_FILENAME" in the source file filename. After try to update schema, i am getting only one field. But my table having 15 fields. Because of that I am getting error in my job.
Hope you understand my question. I hope if i fix this, i can run job successfully.
Once again thanks your help.
Thanks
Murali
Hi Ravi Kiran,
You can reach me out "muralikrishna.bods156@gmail.com". Then we can talk personally.
I think it is not good sign to talk our things here. Hope you understand.
Thanks and Regards,
Murali.
Hi Ravi Kiran,
Hope you are doing good,
I have implemented my job as you suggest and job is going smoothly.
But I am getting small problem to load data into database.
you know that i have 4 flat files, assume each flat file having 15 records. That means in target the records should be 4x15= 60 records.
But i am getting 15 records only that too final flat file (Test-20160616-00000052.csv) data only.
I do not know what is happeing inside dataflow.
Please guide me how to overcome this issue.
Thanks,
Murali
Hi Ravi,
How are you? Hope you are doing good,
I came back for few scenarios on this requirement.I am trying to myself to fix, and I want take your help in this requirement.
Now my requirement is, Initial days my files are Test-20160616-00000049.csv
for this we passed variable file name $G_FILENAME and apply this variable in file_format
but now i am getting files with different format like one day i am getting files
such as
Test-20160617-00000086.csv,
Test-20160617-00000092.csv,
Test-20160618-0000012AA.csv
How to over come this.
For while loop we create a variable $G_Loop_Count
some times i am getting 6 files and some times i am getting 15 files from source
how can I give particular count for while because we do not know how many files coming that day.
Hope you understand my requirement.
Please give me your suggestion on this scenario.
Regards,
Murali.
Hi Murali,
for $G_Loop_Count logic use cmd to read number of files exist in that folder and assign that value to $G_Loop_Count.
Refer this link how to read count of files usig command
http://www.digitalcitizen.life/4-ways-count-number-folders-and-files-inside-folder
or try anything like this in command.
and for files with different names we need to work out and find the solution.
first try this reading count. and let me know.
Thanks,
Ravi kiran.
User | Count |
---|---|
93 | |
10 | |
10 | |
9 | |
9 | |
7 | |
6 | |
5 | |
5 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.