Need help on OS command to split files

Dear Experts,

I am working on a requirement where the source file (CSV format) is large (max 100 MB).
I want to split the source file into multiple files of, say, 20 MB each (i.e. 5 files) for further processing.

I want to achieve this using the script option available in the PI File sender channel:

"Run Operating System Command before message processing"

We are using the SunOS operating system.

In the script, I want to do the following to the source file:
1. Delete the first 5 rows and the last row (the first 5 rows are the header and the last row is the footer)
2. Count the number of rows and add the count to the file name
3. Split the file by line count (say 20,000 lines per file) or by size (20 MB each)

Please suggest how I can achieve this at the sender communication channel ("Run Operating System Command before message processing").

Thanks in advance


6 Answers

  • Sep 19, 2017 at 09:52 PM

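    On Linux you can use the "split" command (it is a standard Unix utility, so SunOS has it as well). The last argument is the prefix for the output file names, so the command below produces newFile.txtaa, newFile.txtab, and so on:

    split -l 20000 inputFile.txt newFile.txt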

    More details:

    https://kb.iu.edu/d/afar
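
    To make the prefix behaviour concrete, here is a small sketch that builds a throwaway 50,000-line file and splits it. The file name inputFile.txt comes from the command above; the part_ prefix, the scratch directory, and the generated rows are assumptions made only for this demo.

    ```shell
    #!/bin/sh
    # Work in a scratch directory so nothing real is touched.
    # Backticks are used instead of $(...) because older SunOS /bin/sh lacks the latter.
    workdir=`mktemp -d`
    cd "$workdir" || exit 1

    # Generate 50,000 sample rows (demo data only).
    awk 'BEGIN { for (i = 1; i <= 50000; i++) print "row" i }' > inputFile.txt

    # Split into pieces of at most 20,000 lines each.
    # "part_" is a name prefix; split appends aa, ab, ac, ...
    split -l 20000 inputFile.txt part_

    # List the pieces with their line counts.
    wc -l part_*

    # A size-based split would be "split -b 20m inputFile.txt part_",
    # but it cuts at byte boundaries and can break a CSV row in two.
    ```

    Splitting by line count is usually the safer choice for CSV data, because every piece then ends on a complete row.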


  • Sep 19, 2017 at 10:04 PM

    Hi Varun!

    It's worth noting that the OS command is executed after the adapter has picked up the source file and before the file is sent to the Messaging System. So you won't be able to fulfill your requirement with this functionality.

    Regards, Evgeniy.


    Former Member
    Sep 20, 2017 at 11:40 AM

    Hi Varun,

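    # Remove the first 5 lines and the last line of "yourfile".
    # sed -i is a GNU extension and may be missing on SunOS, so write to a temporary file.
    sed -e '1,5d' -e '$d' yourfile > yourfile.tmp && mv yourfile.tmp yourfile

    # Count the rows in "yourfile" and add the count to the file name.
    # tr strips the padding that some wc implementations print.
    linecount=`wc -l < yourfile | tr -d ' '`
    mv yourfile yourfile_$linecount

    # Split the renamed file into "yourfile_<count>_*" files with 20K lines each
    split -l 20000 yourfile_$linecount yourfile_${linecount}_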
    

    If you're not familiar with shell scripting and have a similar requirement in the future, you can find snippets for pretty much anything on Stack Overflow.

    As Evgeniy explained, running such a script as an external command in the sender communication channel is not going to cut it. You will need to run it as a post-processing step after the file is created in your source application, or set up a job that processes the file and only then moves it to the PI input directory.
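
    A minimal sketch of such a job is shown here, assuming a staging directory that PI does not poll and a separate PI input directory. The function name process_file, the file name orders.csv, and both directories are hypothetical; the demo at the end uses temporary directories and generated data so it can be run safely.

    ```shell
    #!/bin/sh
    # Sketch only: strip header/footer, count rows, split, then hand over to PI.

    process_file() {
        src=$1                        # file waiting in the staging directory
        outdir=$2                     # directory polled by the PI sender channel
        dir=`dirname "$src"`
        base=`basename "$src" .csv`
        body=$src.body                # intermediate file, stays in staging

        # Drop the 5 header lines and the footer line (no sed -i needed).
        sed -e '1,5d' -e '$d' "$src" > "$body" || return 1

        # Number of data rows, with any padding from wc removed.
        count=`wc -l < "$body" | tr -d ' '`

        # Split inside staging first, so PI never sees a half-written piece.
        split -l 20000 "$body" "$dir/${base}_${count}_" || return 1
        rm -f "$body"

        # Move the finished pieces into the PI input directory.
        mv "$dir/${base}_${count}_"* "$outdir"/
    }

    # Demo data: 5 header lines, 45,000 rows, 1 footer line.
    stage=`mktemp -d`
    inbox=`mktemp -d`
    awk 'BEGIN {
        for (i = 1; i <= 5; i++) print "HEADER" i
        for (i = 1; i <= 45000; i++) print i ",value" i
        print "FOOTER"
    }' > "$stage/orders.csv"

    process_file "$stage/orders.csv" "$inbox"
    ls "$inbox"
    ```

    In a real setup the function would be called from a scheduled job (cron, for example) with the actual staging and PI input paths. Keeping both directories on the same file system makes the final mv a rename rather than a copy.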


    Regards,

    Patrick


    Former Member
    Sep 20, 2017 at 05:10 AM

    Hi Varun

    I think it can be done with the maximum file size (in MB) setting, which is available in the file adapter itself.

    Regards

    Nagur


    • Hi Nagur!

      Could you please clarify how restricting the adapter from picking up files larger than a defined size helps in this case?

      Or are you talking about something else, like chunk mode?

      Regards, Evgeniy.
