
Information Steward : Compare 2 flat files

Hello,

We need to compare two flat files in Information Steward.

We would appreciate answers to the following questions:

1) Is it possible to compare two flat files in Information Steward?

(We are aware that a file can be compared with a table using the 'exists' function.)

2) We need to find which records exist, or do not exist, in one file relative to the other.

3) For records that exist in both files (matching records), we also need to compare their values to check whether they are the same.

Any thoughts or inputs would be highly appreciated.

Regards,

Krupali

Accepted Solutions (1)

NielsWeigel
Product and Topic Expert

Hi Krupali,

Assuming that both files contain the records you want to compare, and that both have an ID column (e.g. BuPaID or MaterialID) where records with the same ID should have identical content, I think you can achieve the following in SAP Information Steward:

1) Add the files to your project and run Redundancy Profiling on the two files, comparing the ID column. This creates a Venn diagram showing how many IDs are in both files, how many are in FileA but not in FileB, and how many are in FileB but not in FileA.

--> This gives a first insight. It does not compare the content of the records themselves, but if you want, you can increase the number of fields compared (ID + ProductName, ...).

2) If you want to apply rules to your data and even bring scores to a DQ Scorecard on consistency across data sources, then Devilal's approach is the way to go: create Information Steward view(s). One view if you focus on just one file, two views if you want to understand both directions, A--B and B--A.

https://scn.sap.com/docs/DOC-33471

Add to the Information Steward View all the fields that you want to use for comparison (e.g. ProductNameFromFileA, ProductNameFromFileB, ProductColorFromFileA, ProductColorFromFileB, ...)

Then define validation rules stating that the names, colors, etc. should be identical, and bind the rules to the view(s).

Create the Rule Task and you will get scores showing how many times records with the same ID have different field content.
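
Outside the tool, the logic of the two steps above can be sketched in a few lines of Python. This is only an illustration of what the Venn diagram counts and the rule scores express, not how Information Steward implements them. The sample data, the function names (`load_by_id`, `compare_files`) and the column names other than MaterialID are assumptions made for the example.

```python
import csv
import io

def load_by_id(text, id_column):
    """Read CSV text into a dict keyed by the ID column."""
    return {row[id_column]: row for row in csv.DictReader(io.StringIO(text))}

def compare_files(text_a, text_b, id_column, compare_columns):
    """Split IDs into only-A / only-B / both, then diff the shared records."""
    a = load_by_id(text_a, id_column)
    b = load_by_id(text_b, id_column)
    ids_a, ids_b = set(a), set(b)
    both = ids_a & ids_b

    # For IDs present in both files, list the columns whose values differ.
    mismatches = {}
    for rec_id in sorted(both):
        diff = [c for c in compare_columns if a[rec_id][c] != b[rec_id][c]]
        if diff:
            mismatches[rec_id] = diff

    return {
        "only_in_a": sorted(ids_a - ids_b),
        "only_in_b": sorted(ids_b - ids_a),
        "in_both": sorted(both),
        "mismatches": mismatches,
    }

# Hypothetical sample data standing in for FileA and FileB.
FILE_A = """MaterialID,ProductName,ProductColor
M1,Bolt,Silver
M2,Nut,Black
M3,Washer,Grey
"""
FILE_B = """MaterialID,ProductName,ProductColor
M2,Nut,Blue
M3,Washer,Grey
M4,Screw,Silver
"""

result = compare_files(FILE_A, FILE_B, "MaterialID", ["ProductName", "ProductColor"])
print(result)
```

The three ID lists correspond to the three regions of the Venn diagram, and `mismatches` corresponds to the records that would fail an "identical values" rule.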

Best regards,

Niels

Hello Niels,

Thank you for the detailed explanation.

This seems to match our requirement perfectly.

We have just tested this for a demo, and we plan to put the solution to use once our analysis is complete.

Thank you again! 🙂

Answers (2)

former_member187605
Active Contributor

This is a perfect use case for SAP Data Services. Would that be an option?

Hello Dirk,

Thanks for the reply.

This could be an option.

Have you already used this? If so, could you share some details?

Regards,

Krupali

former_member187605
Active Contributor

You will need two data flows:

  1. Load the first file into a database table.
  2. Read the second file and use a Table_Comparison transform to compare it against that table. Make sure to select "Detect deleted row(s) from comparison table". Add a Map_Operation transform to reset the row types to normal and set the I/D/U flag. You can write the results to a table or a file.

Check for more details on performance implications.
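
To make the expected output of such a comparison concrete, here is a minimal Python sketch of the flagging logic. It is not Data Services code, and it assumes that rows found only in the second file are flagged I, rows whose compared columns differ are flagged U, rows found only in the loaded table are flagged D, and identical rows produce no output. The function name `flag_changes` and the sample rows are hypothetical.

```python
def flag_changes(comparison_rows, input_rows, key, compare_columns):
    """Return (flag, row) pairs: I = insert, U = update, D = delete."""
    existing = {row[key]: row for row in comparison_rows}
    seen = set()
    flagged = []

    for row in input_rows:
        seen.add(row[key])
        old = existing.get(row[key])
        if old is None:
            flagged.append(("I", row))          # only in the second file
        elif any(old[c] != row[c] for c in compare_columns):
            flagged.append(("U", row))          # same key, different values
        # identical rows are dropped (assumed behaviour)

    # Rows in the comparison table that never appeared in the input.
    for k, old in existing.items():
        if k not in seen:
            flagged.append(("D", old))
    return flagged

# Hypothetical data: `table_rows` plays the first file loaded into a table,
# `file2_rows` plays the second file read by the data flow.
table_rows = [
    {"id": "M1", "name": "Bolt", "color": "Silver"},
    {"id": "M2", "name": "Nut", "color": "Black"},
    {"id": "M3", "name": "Washer", "color": "Grey"},
]
file2_rows = [
    {"id": "M2", "name": "Nut", "color": "Blue"},
    {"id": "M3", "name": "Washer", "color": "Grey"},
    {"id": "M4", "name": "Screw", "color": "Silver"},
]

flagged = flag_changes(table_rows, file2_rows, "id", ["name", "color"])
for flag, row in flagged:
    print(flag, row["id"])
```

Building the lookup dictionary once keeps the comparison at one pass over each input, which matters when the files hold millions of records.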

Thank you Dirk for the details.

I will explore this option further.

It is great that the performance aspects are also described in your document, as we also have to process millions of records.

Regards,

Krupali

former_member27665
Active Participant

Hi Former Member,

One option (in SAP Information Steward) is to create a view that left outer joins File1 to File2, and then create rules to compare the columns.
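
As a rough illustration of what such a view and its rules compute, here is a sketch using Python's built-in sqlite3 module. It is not Information Steward syntax. The table names, column names and sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-ins for the two flat files.
conn.executescript("""
    CREATE TABLE file1 (id TEXT PRIMARY KEY, name TEXT, color TEXT);
    CREATE TABLE file2 (id TEXT PRIMARY KEY, name TEXT, color TEXT);
    INSERT INTO file1 VALUES
        ('M1', 'Bolt', 'Silver'), ('M2', 'Nut', 'Black'), ('M3', 'Washer', 'Grey');
    INSERT INTO file2 VALUES
        ('M2', 'Nut', 'Blue'), ('M3', 'Washer', 'Grey'), ('M4', 'Screw', 'Silver');
""")

# The left outer join keeps every File1 row; File2 columns are NULL when
# there is no match. The two computed columns play the role of the rules.
rows = conn.execute("""
    SELECT f1.id,
           (f2.id IS NOT NULL) AS exists_in_file2,
           (f2.id IS NOT NULL
            AND f1.name = f2.name
            AND f1.color = f2.color) AS values_match
    FROM file1 AS f1
    LEFT OUTER JOIN file2 AS f2 ON f1.id = f2.id
    ORDER BY f1.id
""").fetchall()

for row in rows:
    print(row)
```

A join in this direction only reports on File1 records. Records that exist only in File2 need a second view with the join reversed.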

Thx,