Data Cleansing Terms Clarification

william_grdovich
Explorer
0 Kudos

Folks,

Another question and thanks again to all those that have been helpful so far.

In a data flow that I am building I have both customer information and address information. I plan on running my data both through an English base data cleanse and a Canada address cleanse.

I notice that there are many options for the same output field. The difference being the generated field class. I am confused as to what is the difference between Parsed and Standardized in regards to the base data cleanse.

Also I am confused that there are extra types such as alternate or none. Also in dealing with address cleansing there are other items such as generated field category and generated field addrclass which also seem to have an impact on the data that is output from these transforms.

I would like my transforms to correct error data and also if it is possible add items such as city or province when they are missing based on the postal code which is there for example. I have gone through both the designer guide and the reference guide looking for these terms, and they have left me more confused than anything.

Which way of implementing this would be the best?

Thanks in advance,

Bill

Edited by: William Grdovich on Oct 5, 2010 4:13 PM

Accepted Solutions (1)

paul_kessler
Active Participant
0 Kudos

Bill,

I notice that there are many options for the same output field. The difference being the generated field class. I am confused as to what is the difference between Parsed and Standardized in regards to the base data cleanse.

Parsed means that the address has been separated into its components (house number, street name, city, etc.). Standardized means that the address has been parsed, and each component value has been corrected, updated or enhanced.
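To make the distinction concrete, here is a toy sketch in plain Python. It is not Data Services code and does not reflect how the transforms work internally; the function names, the dictionary keys and the tiny lookup table are all made up for illustration. `parse` only splits a line into components, while `standardize` takes the parsed components and cleans up each value.

```python
import re

# Hypothetical lookup table for illustration only; a real cleanse
# relies on postal directories, not a hand-written dictionary.
STREET_TYPE_FIXES = {
    "st": "St", "str": "St", "street": "St",
    "av": "Ave", "ave": "Ave", "avenue": "Ave",
}

def parse(line):
    """Split an address line into components without changing the values."""
    match = re.match(r"\s*(\d+)\s+(.+?)\s+(\S+)\s*$", line)
    if match is None:
        raise ValueError("cannot parse address line: %r" % line)
    number, name, street_type = match.groups()
    return {
        "house_number": number,
        "street_name": name,
        "street_type": street_type,
    }

def standardize(parsed):
    """Return a copy of the parsed components with each value cleaned up."""
    result = dict(parsed)
    result["street_name"] = parsed["street_name"].title()
    key = parsed["street_type"].lower().rstrip(".")
    # Unknown street types are passed through unchanged.
    result["street_type"] = STREET_TYPE_FIXES.get(key, parsed["street_type"])
    return result

parsed = parse("123 main STREET")
standardized = standardize(parsed)
print(parsed)
print(standardized)
```

The parsed result keeps "main" and "STREET" exactly as typed; the standardized result carries the same components with the values normalized.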

Also I am confused that there are extra types such as alternate or none.

An alternate type means that an alternate value is available. For example, in New York City, 6th Ave (the official name) is also known as Avenue of the Americas (the alternate name). If a field has type 'None', it means that there is only one type associated with this field.

Also in dealing with address cleansing there are other items such as generated field category and generated field addrclass which also seem to have an impact on the data that is output from these transforms.

A description of the field category columns is provided in the SAP Business Objects Data Services Reference Guide, Data Quality Fields, Global Address Cleanse fields.

I would like my transforms to correct error data and also if it is possible add items such as city or province when they are missing based on the postal code which is there for example.

Use Generated Field Class 'Best', Generated Field Category 'Component', Generated Field Addrclass 'Official'. If your selected output field has Generated Field Class 'None' then use Generated Field Category 'Standardized'.

Paul

william_grdovich
Explorer
0 Kudos

Paul,

Thanks for the info. Regarding your first two answers, I now see that they are correct. One thing to mention: besides Parsed and Best for Generated Field Class, the reference guide also lists an option called Corrected. I don't see a generated field class named Corrected anywhere inside the Canada Address Cleanse. Is the documentation out of date? I am on the latest Data Services, 12.2.2.2.

In regards to your mentioning the reference guide, if I can rant, I must say it's absolute garbage. This goes for the designer guide as well. 2000 pages of documentation and at the end of it I am no clearer on understanding data cleansing, or matching for that matter. Why can't there be a simple table that shows what input fields map to output fields? I am trying to map an address line 2 that may contain unit numbers and apt numbers, but I keep on getting errors saying an invalid combination of fields was set. With 2000 pages of documentation I shouldn't ever have to go to a forum to get answers. Thank god I can, and there are helpful people out there like yourself. Rant over.

In your last response you mention using Generated Field Addrclass Official. Most of the fields that I want to correct, such as locality1, region and postcode_full, do not have the addrclass. The best practices for the Canada Address Cleanse mainly list Generated Field Class Best, Generated Field Category Component and Generated Field Addrclass Delivery. Again, maybe I am missing something.

Thanks again for all your help, I really appreciate it.

Former Member
0 Kudos

Bill,

If you haven't downloaded the DQ blueprints, you might want to look into them.

They are a good starting point for most basic DQ workflows.

Regards,

Victor

paul_kessler
Active Participant
0 Kudos

Bill,

The generated field class 'Correct' is used in the US Regulatory Address Cleanse (URAC) transform but not the Global Address Cleanse (GAC) transform, which is what Canada Address Cleanse is built on.

Regarding the input field errors, address cleanse expects certain input fields to be mapped. For example, if you map an input address field then you should also map a locality and region, or a postcode, or all three components.
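As a rough illustration of that rule only, the check below is written in plain Python. It is not how the transform validates its input, and the field names (`address_line`, `locality`, `region`, `postcode`) are placeholders rather than the actual Data Services input field names. The treatment of the case where no address line is mapped is also an assumption, since the rule above does not cover it.

```python
def mapping_is_valid(mapped_fields):
    """Check the rule: an address line needs locality+region and/or a postcode."""
    mapped = set(mapped_fields)
    if "address_line" not in mapped:
        # Assumption: the rule only constrains mappings that include an address line.
        return True
    has_locality_and_region = {"locality", "region"} <= mapped
    has_postcode = "postcode" in mapped
    return has_locality_and_region or has_postcode

print(mapping_is_valid(["address_line"]))                        # address alone
print(mapping_is_valid(["address_line", "locality"]))            # locality without region
print(mapping_is_valid(["address_line", "postcode"]))            # address plus postcode
print(mapping_is_valid(["address_line", "locality", "region"]))  # address plus locality and region
```

Running a check like this against a list of the fields you have mapped can help narrow down which companion field is missing when the "invalid combination of fields" error appears.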

Regarding best practice, you are correct. I meant to say use Addrclass Delivery for best practice. You would use Official if you want to always use the official address component value from the postal address lookup tables.

Paul
