Need help on OS command to split files

Sep 19 at 08:27 PM


Dear Experts,

I am working on a requirement where the source file (CSV format) is large (up to 100 MB).
I want to split the source file into multiple files of, say, 20 MB each (i.e. 5 files) for further processing.

I want to achieve this using the script option available in the PI file sender channel:

"Run Operating System Command Before Message Processing"

We are using the SunOS operating system.

In the script, I want to do the following with the source file:
1. Delete the first 5 rows and the last row (the first 5 rows are the header and the last row is the footer)
2. Count the number of rows and add the count to the file name
3. Split the file based on lines (say 20000 per file) or based on size (20 MB each)

Please suggest how I can achieve this at the sender communication channel (Run OS command before message processing).

Thanks in advance


6 Answers

Bartosz Jarkowski Sep 19 at 09:52 PM
1

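On Linux and other Unix systems you can use the "split" command:

split -l 20000 inputFile.txt newFile.txt

The last argument is a prefix for the output names, so this produces newFile.txtaa, newFile.txtab, and so on, with 20000 lines each.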

More details:

https://kb.iu.edu/d/afar
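
The question also asks about splitting by size. A minimal sketch of both variants follows, assuming a POSIX "split" that accepts -b with a k or m suffix. The file names, the scratch directory, and the small 100k piece size are made up for the demonstration; for the 20 MB case in the question the option would be -b 20m. Note that -b cuts at byte boundaries, so a CSV row can end up divided between two pieces, which is why splitting by lines is usually the safer choice here. GNU split also offers -C to cap the size while keeping lines whole, but that option may not exist in the stock SunOS split.

```shell
# Hypothetical demo data: 50,000 short CSV rows in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
awk 'BEGIN { for (i = 1; i <= 50000; i++) print i ",value" i }' > input.csv

# Split by line count: every piece holds whole rows only.
split -l 20000 input.csv bylines_

# Split by size: each piece is 100 KB (the last one may be smaller),
# so a row can be cut in the middle at a piece boundary.
split -b 100k input.csv bysize_

# List what was produced.
ls -1 bylines_* bysize_*
```

In both cases, concatenating the pieces in name order (cat bylines_* or cat bysize_*) gives back the original file.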

Evgeniy Kolmakov Sep 19 at 10:04 PM
1

Hi Varun!

It's worth noting that the OS command is executed after the adapter has picked up the source file and before the message is sent to the Messaging System. So you won't be able to fulfill your requirement using the mentioned functionality.

Regards, Evgeniy.

Patrick Weber Sep 20 at 11:40 AM
1

Hi Varun,

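# Remove the first 5 lines (header) and the last line (footer) of "yourfile".
# "sed -i" is a GNU extension that the stock SunOS sed may not support,
# so write to a temporary file and move it back:
sed -e '1,5d' -e '$d' yourfile > yourfile.tmp && mv yourfile.tmp yourfile

# Count the number of rows in "yourfile" and add the count to the file name
# (tr strips the leading blanks that some wc implementations print):
linecount=`wc -l < yourfile | tr -d ' '`
mv yourfile yourfile_$linecount

# Split the renamed file into several "yourfile_<count>_*" files with 20K lines each
split -l 20000 yourfile_$linecount yourfile_${linecount}_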
 

If you're not familiar with shell scripting and have a similar requirement in the future, you can find snippets for pretty much anything you can think of on Stack Overflow.

As Evgeniy explained, running such a script as an external command in the sender communication channel is not going to cut it. You will need to run it as a post-processing step after the file is created in your source application, or set up a job that processes the file and only then moves it to the PI input directory.
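
The job-based approach could be sketched roughly as follows. This is only an illustration, not a tested PI setup: the function name prepare_for_pi, the ".csv" suffix, and the two directories passed in are assumptions. The idea is to do all the work in a staging directory and move the finished pieces into the directory polled by the sender channel only at the end, so the adapter never picks up a half-written file. It assumes both directories are on the same file system (so that mv is a rename) and that the file still has rows left after the header and footer are removed.

```shell
#!/bin/sh
# Sketch: pre-process a CSV file in a staging directory, then hand the
# finished pieces to the directory polled by the PI sender channel.

prepare_for_pi() {
    src=$1        # full path of the incoming file (hypothetical, e.g. /staging/orders.csv)
    pi_dir=$2     # directory polled by the PI file sender channel (hypothetical)
    base=`basename "$src" .csv`
    work=`dirname "$src"`

    # Drop the 5 header rows and the footer row.
    sed -e '1,5d' -e '$d' "$src" > "$work/$base.body" || return 1

    # Row count, without the padding some wc implementations print.
    count=`wc -l < "$work/$base.body" | tr -d ' '`

    # Split into 20,000-row pieces named <base>_<count>_aa, <base>_<count>_ab, ...
    split -l 20000 "$work/$base.body" "$work/${base}_${count}_" || return 1
    rm -f "$work/$base.body"

    # Move the pieces only after all of them are complete.
    for piece in "$work/${base}_${count}_"*; do
        mv "$piece" "$pi_dir/" || return 1
    done
}
```

A scheduler entry (cron, for example) could then call something like prepare_for_pi /staging/orders.csv /pi/inbound, where both paths are placeholders for your own landscape.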


Regards,

Patrick

Nagur Meera Sep 20 at 05:10 AM
0

Hi Varun

I think it can be done through the maximum file size (in MB) setting, which is available in the file adapter itself.

Regards

Nagur


Hi Nagur!

Could you please clarify how restricting the adapter from picking up files larger than a defined size helps in this case?

Or are you talking about something else, like chunk mode?

Regards, Evgeniy.
