on 04-13-2018 10:00 PM
I need to process a CSV file from a vendor that has multiple sets of data--not multiple schemas, but entirely different sets. Here is an example:
** Message
** Timestamp
Data Set 1
Value 1, Value 2, Value 3, Value 4, Value 5
Data, Data,,Data,
Data, Data, Data,,Data
** Message
** Timestamp
Data Set 2
Value 1, Value 2, Value 3
Data,,
Data,Data,Data
** Message
** Timestamp
Data Set 3
Value 1, Value 2, Value 3, Value 4
Data, Data, Data, Data
Data,,Data,
I need to retrieve the data from the second set. I don't know how to have Data Services skip the first set and the three rows of file information above the second set before processing the file. The record count for each set varies, of course, so I can't specify a fixed row at which to start reading the file. Can this be done?
I solve this type of problem by first dumping everything into a two-column table: the first column is populated with a sequence number and the second contains the input data. I then use DS logic (transforms and functions) for further processing.
Have a look at
for a slightly comparable use case.
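To make the staging idea concrete, here is a minimal sketch in Python rather than in Data Services itself. It assumes, based only on the sample file in the question, that each set is introduced by a line holding its name (for example "Data Set 2"), followed by one header line, and that the set ends at the next line starting with "**". The function names `stage_lines` and `extract_set` are made up for illustration.

```python
import csv

SAMPLE = """** Message
** Timestamp
Data Set 1
Value 1, Value 2, Value 3, Value 4, Value 5
Data, Data,,Data,
Data, Data, Data,,Data
** Message
** Timestamp
Data Set 2
Value 1, Value 2, Value 3
Data,,
Data,Data,Data
** Message
** Timestamp
Data Set 3
Value 1, Value 2, Value 3, Value 4
Data, Data, Data, Data
Data,,Data,
"""


def stage_lines(text):
    """Pair every input line with a sequence number (the 2-column staging table)."""
    return list(enumerate(text.splitlines(), start=1))


def extract_set(staged, set_name):
    """Return (header, rows) for the set whose name line equals set_name."""
    # Sequence number of the line that names the wanted set.
    start = next((seq for seq, line in staged if line.strip() == set_name), None)
    if start is None:
        raise ValueError(f"set {set_name!r} not found")
    # Assumption: the set ends at the first '**' line after its name line,
    # or at the end of the file if there is no such line.
    end = next(
        (seq for seq, line in staged if seq > start and line.startswith("**")),
        len(staged) + 1,
    )
    block = [line for seq, line in staged if start < seq < end and line.strip()]
    if not block:
        raise ValueError(f"set {set_name!r} has no header line")
    parsed = [[cell.strip() for cell in row] for row in csv.reader(block)]
    return parsed[0], parsed[1:]


header, rows = extract_set(stage_lines(SAMPLE), "Data Set 2")
```

Because the boundaries are found by sequence number rather than by a fixed row offset, the varying record count of the first set does not matter. In Data Services the same lookups would be expressed with transforms and functions against the staging table.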
Thank you for sharing that, dirk.venken. You shared that same blog post on my last question, and it worked wonderfully. Now I'm thinking I should've thought of using the same method!
The only issue I can see is the headers in my file are split into two lines, like so:
Column 1, Column 2, Column 3, Column 4, Column 5, Column 6, Col
umn 7, Column 8, Column 9
I'm working on using this method--perhaps I can just skip the second header row and have my columns predefined. Finding the end of each data set also concerns me, but it'll just need to be in stages.
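The idea of skipping the wrapped header and relying on predefined columns could look roughly like the following Python sketch. It is not Data Services code, and it assumes the header always occupies exactly two lines and that the nine column names are known up front; `COLUMNS`, `rows_with_predefined_columns` and `header_line_count` are hypothetical names.

```python
import csv

# Predefined column names (assumption: the layout is known in advance).
COLUMNS = [f"Column {i}" for i in range(1, 10)]


def rows_with_predefined_columns(block_lines, columns, header_line_count=2):
    """Skip the wrapped header lines and label each data row with fixed names."""
    records = []
    for values in csv.reader(block_lines[header_line_count:]):
        if not values:
            continue  # ignore blank lines
        values = [value.strip() for value in values]
        if len(values) != len(columns):
            raise ValueError(f"expected {len(columns)} values, got {len(values)}")
        records.append(dict(zip(columns, values)))
    return records


block = [
    "Column 1, Column 2, Column 3, Column 4, Column 5, Column 6, Col",
    "umn 7, Column 8, Column 9",
    "a, b,, d, e, f, g, h, i",
]
records = rows_with_predefined_columns(block, COLUMNS)
```

The length check is there so that a set with a different layout fails loudly instead of being silently mislabeled.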
That proves I am consistent in my answers, doesn't it? 🙂