Skip to Content

Best approach for transformation

We have a single flat file.

900 distinct groups with multiple members.

The data is written by user ID.

ID, Group

ID, Group


We need to transform this into a format like this:

Group, "UserID1, UserID2...UserIDn"

Basically, we need to concatenate the user id's into a single comma separated string quoted at the beginning and end of the string

We were thinking about some sort of pivot but each group could have a variable number of ID's.

Furthermore we are also wondering about the limits of varchar,  6000 user ID's with a comma, each 5 to 12 characters would be a pretty long string.

We would be willing to break up the list and have multiple records for a group each with a subset of user ID's (say two records for the group with 500 user ID's instead of one record with 1000 ID's)

There are 900 distinct groups, and can have anywhere from 1 to 6000+ members.

Our thinking now might be that we create a query transform of all records Group Name, User ID (sorted by Group Name).

Then using a script we might invoke sql to"walk" the table and concate each ID into a string until we encounter a change in group, then write out that string for the group ID - and continue this process until the end of the table. 

This seems a little awkward and we are wondering if there is a better approach or some built-in BODS features that might make this easier

Any ideas or suggestions would be appreciated,


Jeff Henke

Add comment
10|10000 characters needed characters exceeded

  • Get RSS Feed

2 Answers

  • Oct 17, 2015 at 09:09 AM

    This is possible with a DS custom function, if your data would reside in a database. But once you have the data is in a database, there is a much more effective approach by using a database-level function. That will avoid thousands of roundtrips from DS to database.

    Build a data flow to copy your file into a database. Make sure you add a column with a sequence number, use the built-in gen_row_num_by_group function to do so.

    Write the database function with 2 input parameters (group and sequence number) to concatenate (up to) 500 userid's, and import it into DS.

    Build a 2nd data flow, use the table as a source. Add a Query transform that reads records #1, #501, #1001... from the table (use where-clause mod(seq_number,500) = 1) and calls the function. Write the output to a file or another databse table.

    Add comment
    10|10000 characters needed characters exceeded

  • avatar image
    Former Member
    Oct 18, 2015 at 08:54 PM
    Add comment
    10|10000 characters needed characters exceeded