
How to implement data duplication check for flat files using Data Services?

santossh
Active Participant
0 Kudos

Hi Experts,

I am new to Data Services and curious to learn about the data duplication check functionality. I want to know:

How can we implement a data duplication check for flat files using Data Services?

Do I need to install any separate components on the Data Services side?

An explanation with an example will be much appreciated.

Thanks in anticipation,

Santosh

Accepted Solutions (0)

Answers (1)

former_member187605
Active Contributor
0 Kudos

The Match transform is by far the most complex DS component. If you're new to DS, IMO that doesn't seem the right one to start with. SAP runs a 2-day training course on the DS Data Quality components, and almost the entire 2nd day is spent on Matching and De-duplication.

The first thing you need to get going is a Data Quality license.

The matching process gives good results on cleansed data only. To that end, you might need a custom dictionary (to be developed with the Cleansing Package Builder in Information Steward) and/or address directories (licensed separately).
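To illustrate why cleansing has to come first, here is a minimal Python sketch of the general idea only, not the DS Match transform itself (in DS this is all configured in the Designer, not coded). The column names, similarity measure, and threshold are assumptions for the example:

# Conceptual illustration only -- NOT the DS Match transform.
# Records are normalized ("cleansed") first, then compared with a
# simple similarity score to flag potential duplicates.
from difflib import SequenceMatcher

def cleanse(record):
    # Very rough stand-in for DS cleansing: trim, lowercase, strip punctuation.
    return {k: "".join(ch for ch in v.lower().strip() if ch.isalnum() or ch == " ")
            for k, v in record.items()}

def similarity(a, b):
    # Similarity of two cleansed records on name + city (0.0 .. 1.0).
    key_a = f"{a['name']} {a['city']}"
    key_b = f"{b['name']} {b['city']}"
    return SequenceMatcher(None, key_a, key_b).ratio()

customers = [
    {"name": "ACME Corp.",  "city": "New York"},
    {"name": "Acme Corp",   "city": "new york "},
    {"name": "Globex Ltd.", "city": "London"},
]

cleansed = [cleanse(c) for c in customers]
THRESHOLD = 0.9  # assumed value; tune to your data

for i in range(len(cleansed)):
    for j in range(i + 1, len(cleansed)):
        score = similarity(cleansed[i], cleansed[j])
        if score >= THRESHOLD:
            print(f"Potential duplicate: {customers[i]} ~ {customers[j]} (score {score:.2f})")

Without the cleanse step, "ACME Corp." and "Acme Corp" would score much lower and the duplicate could be missed; that is the same reason DS matching expects cleansed input.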

santossh
Active Participant
0 Kudos

Thanks, Dirk, for the quick response.

Just for clarity's sake, we have a scenario where we are not using SAP CRM, MDG, or ECC. We have multiple flat files of global customer data on which we want to perform deduplication and report on the duplicate data.

So, is it possible to perform deduplication on an independent system?

Could you please help out with an approach to achieve this?

Regards,

Santosh

former_member187605
Active Contributor
0 Kudos

The data source really does not matter. In your case, you'll have to import all files into a database staging area (you could use the same database where your DS repositories are created). Then cleanse the data and finally match and de-duplicate.
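As a rough sketch of that flow (outside DS, just to make the approach concrete): load every file into one staging table, then start with a plain report of exact duplicates; fuzzy matching on cleansed data comes on top of that. The file names and columns below are assumptions for the example; in DS you would model this as file formats feeding a Query/Match dataflow.

# Hypothetical staging sketch: load flat files into a staging table,
# then flag exact duplicates with plain SQL.
import csv
import sqlite3

conn = sqlite3.connect("staging.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS stg_customer (
        source_file TEXT,
        name        TEXT,
        city        TEXT,
        country     TEXT
    )
""")

def load_file(path):
    # Append one flat file into the staging table.
    with open(path, newline="", encoding="utf-8") as f:
        rows = [(path, r["name"], r["city"], r["country"]) for r in csv.DictReader(f)]
    conn.executemany("INSERT INTO stg_customer VALUES (?, ?, ?, ?)", rows)
    conn.commit()

for path in ["customers_emea.csv", "customers_apac.csv"]:  # assumed file names
    load_file(path)

# Report records that appear more than once across all files
# (exact matches only; fuzzy matching would follow cleansing).
dupes = conn.execute("""
    SELECT name, city, country, COUNT(*) AS cnt
    FROM stg_customer
    GROUP BY name, city, country
    HAVING COUNT(*) > 1
""").fetchall()

for name, city, country, cnt in dupes:
    print(f"{cnt}x duplicate: {name}, {city}, {country}")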

You can find a lot of information on DS cleansing and matching in .