Former Member

USA Address cleansing and English NorthAmerica

I am running address cleansing on a name field and I get a standardised name as an output field, but I am not sure how the data is split into its different parts. Please explain. Any response will be appreciated.

I am sharing the output screen and the settings applied in the North America Address Cleanse transform. Please explain on what basis the data in the Name field is split into different fields such as Prename, Person1_Person_Standardised and title. Name1 in the screenshot below is the input field. Please reply. 😊

Settings.png (14.6 kB)
OutPut.png (101.4 kB)

2 Answers

  • Best Answer
    Posted on Sep 26, 2014 at 02:07 PM

    I would suggest spending some time reading through the Information Steward (IS) Cleansing Package Builder documentation.

    The high-level gist is that Data Cleanse uses a data dictionary and a rules file: it classifies each word, matches the resulting word classifications against several output category patterns, performs some math to determine which rule is the best fit, then outputs each word using the output format definition for the winning rule.

    Your input may hit several different rules, which can have differing weights assigned. None of them may be correct for your particular input, in which case you will need to make dictionary and/or rule file changes to coerce the data to come out appropriately. Depending on the source of your data, you may also wish to identify common abnormalities and fix them before passing the data to the address/data cleanse transforms.
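
    To make the dictionary-plus-rules idea concrete, here is a toy Python sketch. It is not how the transform is implemented, and it does not use any SAP API. The dictionary entries, classification names, rule patterns and weights are all made up for illustration, and a single weight per rule stands in for whatever scoring the real engine performs.

```python
# Toy illustration only: every word, classification, rule and weight
# below is hypothetical, not taken from a real cleansing package.

# "Data dictionary": each known word maps to the classifications it may carry.
DICTIONARY = {
    "mr": {"PRENAME"},
    "dr": {"PRENAME"},
    "john": {"GIVEN_NAME"},
    "james": {"GIVEN_NAME", "FAMILY_NAME"},  # ambiguous word
    "smith": {"FAMILY_NAME"},
    "manager": {"TITLE"},
}

# "Rules file": a pattern of classifications plus a weight.
# The pattern also decides which output field each word lands in.
RULES = [
    (("PRENAME", "GIVEN_NAME", "FAMILY_NAME"), 90),
    (("GIVEN_NAME", "FAMILY_NAME", "TITLE"), 85),
    (("GIVEN_NAME", "FAMILY_NAME"), 80),
    (("PRENAME", "FAMILY_NAME"), 70),
    (("FAMILY_NAME", "GIVEN_NAME"), 60),
]


def parse_name(raw):
    """Return {field: word} for the best-matching rule, or None if no rule fits."""
    tokens = raw.replace(".", " ").replace(",", " ").split()
    # Look up every token; unknown words get a class that no rule accepts.
    classes = [DICTIONARY.get(t.lower(), {"UNKNOWN"}) for t in tokens]

    best = None  # (pattern, weight) of the winning rule so far
    for pattern, weight in RULES:
        if len(pattern) != len(tokens):
            continue
        # A rule matches when every token may carry the class the pattern expects.
        if all(expected in found for expected, found in zip(pattern, classes)):
            if best is None or weight > best[1]:
                best = (pattern, weight)

    if best is None:
        return None  # nothing fits: the dictionary or the rules need changing
    # "Output format": one field per pattern slot, with simple case standardisation.
    return {field: token.capitalize() for field, token in zip(best[0], tokens)}


print(parse_name("mr. john smith"))
print(parse_name("Smith James"))
print(parse_name("Xyz Smith"))
```

    In this sketch "Smith James" only fits the FAMILY_NAME, GIVEN_NAME rule, while "James James" fits two rules and the heavier one wins. "Xyz Smith" fits none because "xyz" is not in the dictionary, which is the situation where you end up editing the dictionary or the rules.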

    I would say more but I have to get back to making dictionary/rules file changes... 😎


  • Former Member
    Posted on Sep 26, 2014 at 03:09 PM

    Hi Joshua,

    Thanks for answering my question. Do you mean that the rules applied here are the same as those of the IS package builder? If so, can you provide a link to the IS package builder documentation?


    • You select the cleansing package in the Data Cleanse transform. Inside IS you can modify the rules and data dictionary for that cleansing package, then republish it. While the cleansing package (CP) is publishing, it is not available for use. 😔

      I've been trying to find a link to the doc I'm referring to, but I'm having trouble locating it. I don't see it in the install directory either. It's 170 pages and goes over each component of setting up a CP. I'm working off a printed copy of this section.
