USA Address cleansing and English NorthAmerica

I am doing address cleansing on a name field, and I am getting a standardized name as an output field, but I am not sure how the data is split into its different parts. Please explain. Any response will be appreciated.

I am sharing the output screen and the settings applied in the North America Address Cleanse transform. Please explain on what basis the data in the Name field is split into different fields such as Prename, Person1_Person_Standardised, and Title. Name1 in the screenshot below is the input field. Please reply. 🙂

Accepted Solutions (1)

former_member106536
Active Participant
I would suggest spending some time reading through the Information Steward (IS) Cleansing Package Builder documentation.

The high-level gist is that the Data Cleanse transform uses a data dictionary and a rules file: it classifies each word, matches those classifications against several output category patterns, performs some math to determine which rule is the best fit, and then outputs each word using the output format definition for the winning rule.
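To make that flow concrete, here is a rough Python sketch of the idea. It is a toy illustration only and not SAP's actual engine: the dictionary entries, the rule patterns, the weights, the output field names, and the helper names (`classify`, `matching_rules`, `parse_name`) are all made up for the example, and the real "math" is more involved than picking the highest weight.

```python
# Toy illustration only: NOT the real Data Cleanse dictionary, rules, or scoring.
# Every entry, weight, and field name below is a hypothetical stand-in.

# Data dictionary: each known word maps to one or more classifications.
DICTIONARY = {
    "mr": {"PRENAME"},
    "dean": {"PRENAME", "GIVEN_NAME"},  # ambiguous word: can hit several rules
    "john": {"GIVEN_NAME"},
    "smith": {"FAMILY_NAME"},
    "manager": {"TITLE"},
}

# Rules: (classification pattern, weight, output fields for each position).
RULES = [
    (["PRENAME", "GIVEN_NAME", "FAMILY_NAME"], 90, ["Prename", "Given_Name", "Family_Name"]),
    (["GIVEN_NAME", "FAMILY_NAME", "TITLE"], 80, ["Given_Name", "Family_Name", "Title"]),
    (["GIVEN_NAME", "FAMILY_NAME"], 70, ["Given_Name", "Family_Name"]),
    (["PRENAME", "FAMILY_NAME"], 60, ["Prename", "Family_Name"]),
]


def classify(word):
    """Look up a word's possible classifications, ignoring case and trailing punctuation."""
    key = word.lower().strip(".,")
    return DICTIONARY.get(key, {"UNKNOWN"})


def matching_rules(words):
    """Return every rule whose pattern fits the classifications of the words."""
    classes = [classify(w) for w in words]
    return [
        rule
        for rule in RULES
        if len(rule[0]) == len(classes)
        and all(wanted in possible for wanted, possible in zip(rule[0], classes))
    ]


def parse_name(text):
    """Pick the highest-weight matching rule and lay the words out in its output fields."""
    words = text.split()
    hits = matching_rules(words)
    if not hits:
        return None  # no rule fits: the dictionary or rules would need changes
    _pattern, _weight, fields = max(hits, key=lambda rule: rule[1])
    return dict(zip(fields, words))


print(parse_name("Mr. John Smith"))
print(parse_name("Dean Smith"))
```

In this sketch "Dean Smith" fits two rules because "dean" carries two classifications, and the rule with the higher weight decides which output fields the words land in.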

Your input may hit several different rules, which can have different weights assigned. It is possible that NONE of them is correct for your particular input, in which case you need to make dictionary and/or rule file changes to coerce the data to come out appropriately. Depending on the source of your data, you may also wish to identify common abnormalities and address those before presenting the data to the address/data cleanse transforms.
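As a rough idea of what addressing abnormalities up front can look like, here is a small Python sketch that tidies a raw name string before it reaches a cleansing step. The function name `pre_clean` and the set of characters it keeps are assumptions for illustration; what counts as an abnormality depends entirely on your own source data.

```python
import re


def pre_clean(raw):
    """Hypothetical pre-processing step: drop stray symbols and normalize whitespace."""
    # Keep letters, digits, whitespace, and a few characters common in names;
    # replace anything else with a space (this character set is an assumption).
    text = re.sub(r"[^\w\s.,'&-]", " ", raw)
    # Collapse runs of whitespace (including tabs) into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


print(pre_clean("  JOHN   SMITH ## "))
print(pre_clean("Mr.\tJohn  O'Neil"))
```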

I would say more but I have to get back to making dictionary/rules file changes...

Answers (1)

Hi Joshua,

Thanks for answering my question. Do you mean that the rules applied here are the same as those in the IS Cleansing Package Builder? If so, can you provide a link to the IS Cleansing Package Builder documentation?

former_member106536
Active Participant
You select the cleansing package in the Data Cleanse transform. Inside IS you can modify the rules/data dictionary for that cleansing package and republish it. While the CP is publishing, it is not available for use.

I've been trying to find a link to the doc I'm referring to, but I'm having trouble finding it. I don't see it in the install directory either. It's 170 pages and goes over each component of setting up a CP. I'm working off a printed copy of this section.