
Information Steward data cleansing

former_member185138
Participant
0 Kudos

Dear All,

I am new to the Information Steward space.

Can someone please explain how data cleansing can be achieved through IS?

Do we need to create BODS jobs to clean up the data?

Thanks in advance.

Accepted Solutions (1)

former_member185138
Participant
0 Kudos

Hi Niels,


Thanks for the quick response.

So should I infer from this that IS identifies the irregular data, and the cleansing itself can be done with any ETL tool, such as the DQ transforms in BODS, or with SAP metadata management?

But now I am confused about the Cleansing Package Builder part of IS. I have seen in some tutorials that we can create the BODS job directly from it. Does that mean we define cleansing rules in IS and a BODS job is then created automatically for the cleanup? Please confirm.

NielsWeigel
Product and Topic Expert
0 Kudos

Hi S N,

Yes, to cleanse the identified bad data you have to use SAP Data Services (note: the official name no longer includes "BusinessObjects"), or any other tool that supports your cleansing activity. Keep in mind that the Data Cleanse transform in SAP Data Services supports you in parsing, standardizing, and cleansing information based on dictionaries and cleansing rules, which are represented in so-called Cleansing Packages. That said, sometimes even simple "Search and Replace" functions within a Data Services data flow are used to "cleanse and improve" data.

To resolve your confusion:

Information Steward

We see Information Steward as the business user tool, where the experts on your data, such as the Data Stewards, work. They are the ones who can define from a business perspective how the data domain PRODUCT must be cleansed and standardized. So they start by creating a new Cleansing Package within the Information Steward UI based on some sample data, defining the attributes of a product (e.g. Manufacturer, Size, Weight, Color, ...) and the valid standard values and variations to be parsed and standardized.

--> They transfer their knowledge and expertise on HOW to parse, standardize, and cleanse a product / product description into a technical Cleansing Package. Once they have set up all the policies and content, they publish this Cleansing Package to the Data Services repository to make it available in the Data Services landscape.
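
To make the idea more concrete, here is a minimal Python sketch of what a cleansing package conceptually holds: attributes, their standard values, and the variations that map to each standard value. This is not the Cleansing Package Builder format or any SAP API. The names `CLEANSING_PACKAGE`, `build_lookup`, and `parse_product`, the sample values, and the choice of "HP" as the standard form are all made up for illustration.

```python
import re

# Hypothetical, simplified stand-in for a cleansing package:
# attribute -> standard value -> known variations (all lowercase).
CLEANSING_PACKAGE = {
    "Manufacturer": {"HP": ["hp", "hewpack", "hewlett-packard"]},
    "Color": {"Black": ["black", "blk"], "White": ["white", "wht"]},
}


def build_lookup(package):
    """Flatten the package into variation -> (attribute, standard value)."""
    lookup = {}
    for attribute, standards in package.items():
        for standard, variations in standards.items():
            for variation in variations:
                lookup[variation] = (attribute, standard)
    return lookup


def parse_product(description, package=CLEANSING_PACKAGE):
    """Assign each recognized token to an attribute in its standard form.

    Returns the parsed attributes and the tokens that were not recognized.
    """
    lookup = build_lookup(package)
    parsed, unparsed = {}, []
    # Split into lowercase word tokens; hyphenated words stay together.
    for token in re.findall(r"[\w-]+", description.lower()):
        if token in lookup:
            attribute, standard = lookup[token]
            parsed[attribute] = standard
        else:
            unparsed.append(token)
    return parsed, unparsed
```

With these sample values, `parse_product("HewPack laser printer blk")` returns `({"Manufacturer": "HP", "Color": "Black"}, ["laser", "printer"])`. The business expert owns the content of the dictionary; the code that applies it stays the same whatever they decide.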

Data Services

Within Data Services Designer, the technical user who is responsible for setting up the data movement or data cleansing jobs (or for creating real-time jobs to be exposed as web services and integrated into applications for real-time data cleansing) just picks up the Cleansing Package and integrates it into the data flow with the Data Cleanse transform. He is not the business expert and does not know whether "HewPack" should be standardized to "HP" or "Hewlett-Packard", but he is the expert in setting up the technical data flow that prepares the cleansed data in a target table or even plays corrected data back to the system.

Hope this makes sense...

What I am not sure about is your reference to "SAP metadata management" for cleansing the data. Was that a typo, and did you mean "SAP Master Data Management"?

Niels

former_member185138
Participant
0 Kudos

Yes, I meant SAP Master Data Management.

Is cleansing also possible with it?

Answers (2)

Former Member
0 Kudos

Yes, you will need BODS to cleanse data.

First you will need to create the cleansing package in Information Steward. It can be either a Custom Cleansing Package or the Person & Firm Cleansing Package.

If the SAP-provided cleansing packages fulfill your requirement, there is no need to create one; they are already available, such as Global Address Cleanse, etc.

If you create the cleansing package in IS, then import it into BODS and create a job to cleanse the data.

Please find below a link where you can get more information on IS:

http://scn.sap.com/docs/DOC-8751

For Cleansing Package Builder, use the link below:

http://wiki.scn.sap.com/wiki/display/EIM/Cleansing+Package+Builder

Let me know if you need more information on any step; I would be happy to help you.

Regards,

Chetan

NielsWeigel
Product and Topic Expert
0 Kudos

Hi S N,

SAP Information Steward enables users to identify data quality issues (whether already known or not yet known) and to monitor the data quality scores of their current data based on the central validation rules defined in Information Steward. (Note: not to forget the further capabilities around the central Business Term Glossary, Business Value Monitoring, Metadata Management, Match Review, ...)
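
As a rough illustration of what a validation rule and a data quality score mean here, consider this minimal Python sketch. It is not Information Steward rule syntax or an SAP API. The rule `is_valid_postal_code`, the helper `score_records`, the field name `postal_code`, and the five-digit format are assumptions made for the example.

```python
import re


def is_valid_postal_code(record):
    """Hypothetical validation rule: postal code must be exactly 5 digits."""
    code = record.get("postal_code") or ""
    return re.fullmatch(r"[0-9]{5}", code) is not None


def score_records(records, rule):
    """Return (score in percent, failed records) for a single rule.

    The score is the share of records that pass the rule. An empty
    input is treated as 100 percent, which is a choice of this sketch.
    """
    failed = [record for record in records if not rule(record)]
    if not records:
        return 100.0, failed
    score = 100.0 * (len(records) - len(failed)) / len(records)
    return score, failed


# Made-up sample data for illustration only.
sample = [
    {"id": 1, "postal_code": "69190"},
    {"id": 2, "postal_code": "6919"},   # too short
    {"id": 3, "postal_code": None},     # missing
    {"id": 4, "postal_code": "10115"},
]
score, failed = score_records(sample, is_valid_postal_code)
```

For this sample the score is 50.0 and the failed records are those with id 2 and 3. Note that the sketch only identifies and counts the bad records; it does not change them.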

Data REMEDIATION or data CLEANSING is not an activity done within Information Steward.

Why?

We think data cleansing should be a tracked and governed process, not just a one-off step taken when someone identifies individual issues. And in many cases it should be done within the context of the application the data belongs to.

So we have integrated Information Steward's Scorecard and Failed Record Database into the SAP MDG solution, so that any cleansing or remediation activity for bad master records can be fully initiated and monitored within an MDG workflow as a change request.

Or, as you mention, customers set up a Data Services cleansing batch job to correct the errors with one of the DQ transforms or with Queries, Mappings, or Custom Functions embedded in the data flow.

Niels