Match Transform Help

william_grdovich
Explorer
0 Kudos

Folks,

I need some clarification on how to use the Match transform. I really can't figure it out. I have read the Designer Guide and Reference Guide repeatedly and I am at my wit's end. I am also not sure it is the right tool, since what I am really trying to do feels more like a search.

I am trying to take data from one file, compare fields such as first name, last name, phone number, and email address, and find matches for this data in another file. My assumption was that this transform would take data from multiple sources (two files in my example) and let me perform a match. Out of the transform I would then get the records that match.

This does not seem to be the case. I have no clue what is coming out of this transform, and I have no idea how to pass multiple sources in. If the transform only takes one source of data, which I believe is the case, then I would have to use a Merge transform first.

How do I tell this transform that one set of records is used only for comparison and another set is to be looked up to find duplicates?

Accepted Solutions (0)

Answers (1)

Former Member
0 Kudos

Matching is a complex subject for sure.

Have you tried downloading the Data Services match blueprints? Those will provide a template that you can consider reusing or simply use them to pick up match techniques. To download, simply go to and download the match transform blueprints and import them into Data Services.

If you are running the Match transform in batch, then yes, you must merge the files together before matching. Make sure to assign a data source identifier so that you can tell which rows came from which source. You can also use this data source identifier to set the matching composition options you mentioned.
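To illustrate the idea outside Data Services, here is a minimal sketch in plain Python. It is not the Match transform itself: the field names, the `source` tag, and the simple normalized exact-match rule are all assumptions made up for the example. It tags each row with its source before merging, then compares only rows that come from different sources.

```python
from itertools import product

def tag(records, source_id):
    """Return copies of the records with a hypothetical 'source' field added."""
    return [{**r, "source": source_id} for r in records]

def normalize(record):
    """Build a comparison key from trimmed, lower-cased fields (assumed rule)."""
    return tuple(record[f].strip().lower() for f in ("first", "last", "email"))

def cross_source_matches(merged, driver="A", lookup="B"):
    """Pair each driver row with lookup rows that have the same key."""
    drivers = [r for r in merged if r["source"] == driver]
    lookups = [r for r in merged if r["source"] == lookup]
    return [(d, l) for d, l in product(drivers, lookups)
            if normalize(d) == normalize(l)]

# Two illustrative inputs standing in for the two files.
file_a = [
    {"first": "Ann", "last": "Lee", "email": "ann@example.com"},
    {"first": "Bob", "last": "Ray", "email": "bob@example.com"},
]
file_b = [
    {"first": "ann ", "last": "LEE", "email": "Ann@Example.com"},
    {"first": "Cy", "last": "Day", "email": "cy@example.com"},
]

# Merge into one stream, keeping the source identifier on every row.
merged = tag(file_a, "A") + tag(file_b, "B")
matches = cross_source_matches(merged)
```

Because every row keeps its source tag after the merge, rows from the same file are never paired with each other, and the lookup-only rows can be dropped from the output afterwards.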

Optionally, you could match a flat file to a database using the candidate selection option within the Match transform, but that is an even more advanced option.

My recommendation is to use the match wizard to set up your Match transform and its options if the blueprints don't meet your needs. After merging your sources, simply right-click on the Merge transform, select the "match wizard" option, and step through the setup.

Good luck!

william_grdovich
Explorer
0 Kudos

I have looked at the blueprints and after playing around with them and the match transform I can match my data. I have also done as you mentioned and put the source in as one of the fields so I can remove the records that I need to remove.

Part of my issue may also have been what I was expecting from the Match transform. Nonetheless, we have come up with another solution. Since my match results were coming from an SAP function, we are having the SAP function perform more of our matching, and I will receive just one result back from it: the record that matches based on rules we have decided to program.

I do now see the power this transform can provide when working with large amounts of data, although my knowledge of all its possibilities is still limited.

A couple of items I would like clarification on:

When running the wizard, I am still confused about what exactly a break key is. If I say I want my postal code to be a break key, what does that mean? And what effect does it have on the other fields that make up my match criteria?

The second item is the input sources on the Options tab. What exactly are we specifying here? I see a value field, but it seems to be made up of my match criteria. My expectation was that we would specify a field in our source that acts as a selector for which source a record belongs to in a match.

Thanks for your help.

Bill

paul_kessler
Active Participant
0 Kudos

William,

A break key allows you to "break" your input data into smaller datasets to reduce the number of match comparisons. In the match process, each input record must be compared to every other input record, so the number of comparisons grows quadratically with the number of input records: n records require n(n-1)/2 pairwise comparisons.

Some records will never match. For example, an address in one zip code will never match an address in another zip code. Break keys allow you to separate these records into separate groups. Each break group is passed to the match transform separately. Records in different break groups will never be compared.
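To make the effect of a break key concrete, here is a small sketch in plain Python. It is only an illustration of the grouping idea, not how Data Services implements it: the `postal` field, the sample records, and the helper names are assumptions. It counts the pairwise comparisons with and without grouping by postal code.

```python
from collections import defaultdict
from itertools import combinations

def pair_count(n):
    """Number of pairwise comparisons among n records: n(n-1)/2."""
    return n * (n - 1) // 2

def break_groups(records, key):
    """Split records into groups that share the same break key value."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return groups

def candidate_pairs(records, key):
    """Yield only the pairs of records that fall in the same break group."""
    for group in break_groups(records, key).values():
        yield from combinations(group, 2)

# Six illustrative records spread over three postal codes.
records = [
    {"id": 1, "postal": "10001"},
    {"id": 2, "postal": "10001"},
    {"id": 3, "postal": "10001"},
    {"id": 4, "postal": "94105"},
    {"id": 5, "postal": "94105"},
    {"id": 6, "postal": "60601"},
]

all_pairs = pair_count(len(records))                                # 15 pairs without a break key
grouped_pairs = sum(1 for _ in candidate_pairs(records, "postal"))  # 3 + 1 + 0 = 4 pairs
```

The trade-off is that a break key only decides which records get compared at all. The other match criteria still decide whether two records in the same group actually match, and a wrong or missing postal code would put a true duplicate in a different group where it is never compared.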

In the Options tab of the match transform editor, under Match Criteria Key Layout, two fields are defined. The Mapped Field Name is the source field for the match criteria definition. The Standard Key Name is an internal key that tells match transform what type of data is in the input field and, therefore, what types of operations to perform on that data.

Paul