Need help on OS command to split files

varun_kasireddy
Explorer

Dear Experts,

I am working on a requirement where the source file (CSV format) is large (up to 100 MB).
I want to split the source file into multiple files of, say, 20 MB each, i.e. 5 files, for further processing.

I want to achieve this using the script option available in the PI file sender channel:

"Run Operating System Command before message processing"

We are using the SunOS operating system.

In the script, I want to do the following with the source file:
1. Delete the first 5 rows and the last row (the first 5 rows are the header and the last row is the footer).
2. Count the number of rows and add the count to the file name.
3. Split the file by line count, say 20,000 lines per file, or by size, 20 MB each.

Please suggest how I can achieve this at the sender communication channel (run OS command before processing).

Thanks in advance

Accepted Solutions (0)

Answers (6)

weberpat
Contributor

Hi Varun,

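# Remove the first 5 lines (header) and the last line (footer) of "yourfile".
# "sed -i" is a GNU extension that the stock SunOS sed may not support, so write to a temporary file instead:
sed -e '1,5d' -e '$d' yourfile > yourfile.tmp && mv yourfile.tmp yourfile

# Count the remaining rows in "yourfile" and add the count to the file name
# (tr strips the leading blanks that some wc implementations print):
linecount=`wc -l < yourfile | tr -d ' '`
mv yourfile yourfile_$linecount

# Split the renamed file into "yourfile_<count>_aa", "yourfile_<count>_ab", ... with 20K lines each:
split -l 20000 yourfile_$linecount yourfile_${linecount}_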
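
For the size-based variant of step 3, note that "split -b 20m" counts bytes only and would cut a CSV row in half at each chunk boundary. Below is a minimal sketch that starts a new chunk at a row boundary once a byte budget would be exceeded. It assumes a POSIX awk (on SunOS that is typically nawk or /usr/xpg4/bin/awk rather than the legacy /usr/bin/awk); the function name split_by_size, the file names and the tiny demo budget are made up for illustration. For 20 MB chunks the budget would be 20971520.

```shell
#!/bin/sh
# split_by_size FILE MAX_BYTES PREFIX   (hypothetical helper, not part of any tool)
# Writes PREFIX000, PREFIX001, ...; a new chunk is started before a row
# that would push the current chunk over MAX_BYTES.
split_by_size() {
    awk -v max="$2" -v prefix="$3" '
        BEGIN { part = 0; size = 0; out = sprintf("%s%03d", prefix, part) }
        {
            # +1 for the newline; length() counts characters, which equals
            # bytes only for single-byte data (an assumption of this sketch)
            len = length($0) + 1
            if (size > 0 && size + len > max) {
                close(out)
                part++
                size = 0
                out = sprintf("%s%03d", prefix, part)
            }
            print > out
            size += len
        }
    ' "$1"
}

# Demo: five 5-byte rows with a 10-byte budget give chunks of 2, 2 and 1 rows
workdir=`mktemp -d`
printf 'aaaa\nbbbb\ncccc\ndddd\neeee\n' > "$workdir/data.csv"
split_by_size "$workdir/data.csv" 10 "$workdir/chunk_"
ls "$workdir"
```

A single row larger than the budget still ends up in a chunk of its own, so no row is ever split.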
 

If you're not familiar with shell scripting and have a similar requirement in the future, you can find snippets for pretty much anything you can think of on Stack Overflow.

As Evgeniy explained, running such a script as an external command in the sender communication channel is not going to cut it. You will need to run it as a post-processing step after the file is created in your source application, or set up a job that processes the file and only then moves it to the PI input directory.
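
The "process first, then move" approach could be sketched roughly as follows. The function name stage_and_release, the directory layout and the file names are hypothetical; the idea is only that the file is trimmed and split in a staging directory, and the finished chunks are moved into the directory polled by the sender channel afterwards. The sketch assumes both directories are on the same file system, so that mv is a rename rather than a copy.

```shell
#!/bin/sh
# stage_and_release SRC STAGING PI_IN LINES   (hypothetical helper)
stage_and_release() {
    src=$1; staging=$2; pi_in=$3; lines=$4
    base=`basename "$src"`

    # Strip the 5 header rows and the footer row into the staging directory
    sed -e '1,5d' -e '$d' "$src" > "$staging/$base.body" || return 1

    # Row count without the blanks some wc implementations print
    count=`wc -l < "$staging/$base.body" | tr -d ' '`

    # Split into chunks named <file>_<count>_aa, <file>_<count>_ab, ...
    ( cd "$staging" && split -l "$lines" "$base.body" "${base}_${count}_" ) || return 1
    rm -f "$staging/$base.body"

    # Hand the finished chunks over to the directory the channel polls
    mv "$staging/${base}_${count}_"* "$pi_in"/
}

# Demo with made-up data: 5 header rows, 7 data rows, 1 footer row, 3 rows per chunk
demo=`mktemp -d`
mkdir "$demo/staging" "$demo/pi_in"
{
    for i in 1 2 3 4 5; do echo "header $i"; done
    for i in 1 2 3 4 5 6 7; do echo "row,$i"; done
    echo "footer"
} > "$demo/data.csv"
stage_and_release "$demo/data.csv" "$demo/staging" "$demo/pi_in" 3
ls "$demo/pi_in"
```

In a real setup the function would be called from a scheduled job (cron, for example) with 20000 as the last argument.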


Regards,

Patrick

former_member190293
Active Contributor

Hi Varun!

It's worth noting that the OS command is executed after the source file has been picked up by the adapter and before it is sent to the Messaging System. So you won't be able to fulfill your requirement using the mentioned functionality.

Regards, Evgeniy.

BJarkowski
Active Contributor

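In Linux you can use the "split" command:

split -l 20000 inputFile.txt newFile.txt

Note that the second argument is a prefix: the output files are named newFile.txtaa, newFile.txtab, and so on.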

More details:

https://kb.iu.edu/d/afar
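
A quick way to check the output naming on your own system before running it on the real file is a small throwaway test. The file names and the 3-line chunk size below are made up for the demonstration:

```shell
#!/bin/sh
# Work in a scratch directory so nothing real is touched
tmp=`mktemp -d`
cd "$tmp" || exit 1

# Seven sample rows, split into chunks of at most 3 lines
printf '1\n2\n3\n4\n5\n6\n7\n' > inputFile.txt
split -l 3 inputFile.txt newFile_

# Lists newFile_aa, newFile_ab and newFile_ac
ls newFile_*
```

The last chunk simply holds the remaining rows, here a single line.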

Former Member

Hi Varun

I think it can be done through the maximum file size (in MB) setting, which is available in the file adapter itself.

Regards

Nagur

former_member190293
Active Contributor

Hi Nagur!

Could you please clarify how restricting the adapter from picking up files larger than a defined size helps in this case?

Or are you talking about something else, like chunk mode?

Regards, Evgeniy.