on 10-22-2013 6:37 AM
Dear All,
I am new to the Information Steward space.
Can someone please explain how data cleansing can be achieved through IS?
Do we necessarily need to create BODS jobs to clean up the data?
Thanks in advance.
Hi Niels,
Thanks for the quick response.
So should I infer from this that IS will identify the irregular data, and that the cleansing itself can be done with any ETL tool, such as the DQ transforms in BODS, or with SAP metadata management?
But now I am confused about the Cleansing Package Builder part of IS. I have seen in some tutorials that we can create the BODS job directly from it. Does that mean we define some cleansing rules in IS and a BODS job is created automatically for the cleanup? Please confirm.
Hi S N,
Yes, for cleansing the identified bad data, you have to use SAP Data Services (note: the official name no longer includes "BusinessObjects") or any other tool that supports your cleansing activity. Keep in mind that the Data Cleanse transform in SAP Data Services supports you in parsing, standardizing and cleansing information based on dictionaries and cleansing rules, which are represented in the so-called Cleansing Packages. Sometimes, however, even simple "search and replace" functions within a Data Services data flow are used to "cleanse and improve" data.
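To illustrate the simple "search and replace" end of that spectrum, here is a minimal Python sketch. It is not Data Services code and does not use any SAP API; the replacement rules and the function name `simple_cleanse` are hypothetical and only show the idea of applying a fixed list of substitutions to a field.

```python
import re

# Hypothetical replacement rules for illustration only;
# they are not taken from any SAP-delivered content.
REPLACEMENTS = [
    (re.compile(r"\bHewPack\b", re.IGNORECASE), "HP"),  # one known misspelling
    (re.compile(r"\s+"), " "),                          # collapse runs of whitespace
]


def simple_cleanse(value: str) -> str:
    """Apply each search-and-replace rule in order, then trim the result."""
    for pattern, replacement in REPLACEMENTS:
        value = pattern.sub(replacement, value)
    return value.strip()


print(simple_cleanse("  hewpack   LaserJet "))
```

Rules like these work for a handful of known cases, but they do not parse a value into attributes or manage standard values and their variations, which is what a Cleansing Package is meant for.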
To resolve your confusion:
Information Steward
We see Information Steward as the business user tool, where the experts on your data, such as the Data Stewards, work. They are the ones who can define from a business perspective how, for example, the data domain PRODUCT must be cleansed and standardized. So they start by creating a new Cleansing Package in the Information Steward UI based on some sample data, defining the attributes of a product (e.g. Manufacturer, Size, Weight, Color, ...) and the valid standard values and variations to be parsed and standardized.
--> They transfer their knowledge and expertise on HOW to parse, standardize and cleanse a product / product description into a technical Cleansing Package. When they have set up all the policies and content, they publish this Cleansing Package to the Data Services repository to make it available in the Data Services landscape.
Data Services
Within Data Services Designer, the technical user who is responsible for setting up the data movement or data cleansing jobs themselves (or for creating real-time jobs to be exposed as web services and integrated into applications for real-time data cleansing) just picks up the Cleansing Package and integrates it into the data flow with the Data Cleanse transform. This user is not the business subject matter expert and does not know whether "HewPack" should be standardized to "HP" or "Hewlett-Packard", but is the expert in setting up the technical data flow that delivers the cleansed data to a target table or even writes corrected data back to the system.
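The split between the two roles can be pictured with a small Python sketch. This is a toy model only: the class `CleansingPackage`, the function `data_cleanse`, and the choice of "HP" as the standard value are all assumptions made for illustration, and none of it reflects the real Cleansing Package format or the Data Cleanse transform's interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CleansingPackage:
    """Toy stand-in for a published cleansing package (hypothetical structure)."""
    # attribute name -> {lowercased variation: standard value}
    variations: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def add_standard(self, attribute: str, standard: str, variants: List[str]) -> None:
        """Register a standard value and the variations that map to it."""
        table = self.variations.setdefault(attribute, {})
        table[standard.lower()] = standard
        for variant in variants:
            table[variant.lower()] = standard


def data_cleanse(package: CleansingPackage, description: str) -> Dict[str, str]:
    """Parse a free-text description into standardized attributes."""
    result: Dict[str, str] = {}
    unparsed: List[str] = []
    for token in description.split():
        key = token.lower()
        for attribute, table in package.variations.items():
            if key in table:
                result[attribute] = table[key]
                break
        else:
            unparsed.append(token)  # token not known to the package
    result["unparsed"] = " ".join(unparsed)
    return result


# Data Steward side: business knowledge captured as content.
product = CleansingPackage()
product.add_standard("Manufacturer", "HP", ["HewPack", "Hewlett-Packard"])
product.add_standard("Color", "Black", ["blk", "bk"])

# Technical side: apply the package in a flow without knowing the rules.
rows = ["HewPack printer blk", "hewlett-packard scanner"]
cleansed = [data_cleanse(product, row) for row in rows]
for row in cleansed:
    print(row)
```

The point of the design is that changing the standard from "HP" to "Hewlett-Packard" only touches the package content, not the flow that applies it.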
Hope this makes sense...
What I am not sure about is your reference to "SAP metadata management" as a tool for cleansing the data. Might that have been a typo, and you meant "SAP Master Data Management"?
Niels
Yes, you will need BODS to cleanse data.
First you will need to create the cleansing package in Information Steward. It can be either a Custom Cleansing Package or a Person & Firm Cleansing Package.
If your requirements are fulfilled by an SAP-provided cleansing package, there is no need to create one; it is already available, such as Global Address Cleanse etc.
If you create the cleansing package in IS, import it into BODS and create a job to cleanse the data.
Please find below a link where you can get more information on IS:
http://scn.sap.com/docs/DOC-8751
For the Cleansing Package Builder, use the link below:
http://wiki.scn.sap.com/wiki/display/EIM/Cleansing+Package+Builder
Let me know if you need more information on any step; I would be happy to help you.
Regards,
Chetan
Hi S N,
SAP Information Steward enables users to identify data quality issues (already known or not yet known) and to monitor the data quality scores of their current data based on the central validation rules defined in Information Steward. (Note: not to forget its further capabilities, such as the central Business Term Glossary, Business Value Monitoring, Metadata Management, Match Review, ...)
The data REMEDIATION or data CLEANSING is not an activity done within Information Steward.
Why?
We think data cleansing should be a tracked and governed process, not just a one-off step performed while someone identifies single issues. And in many cases it should be done within the context of the application the data belongs to.
So we have integrated Information Steward's Scorecard and Failed Record Database into the SAP MDG solution, so that any cleansing or remediation activity on bad master records can be fully initiated and monitored within an MDG workflow as a change request.
Or, as you mention, customers set up their Data Services cleansing batch jobs to correct the errors with one of the DQ transforms or with Queries, Mappings or Custom Functions embedded in the data flow.
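To make the "identify and monitor, but do not fix" role concrete, here is a small Python sketch of a validation rule that produces a score and a list of failed records. It is an illustration under stated assumptions: the rule `country_is_iso2`, the function `score_rule`, the sample data, and the percentage scale are all hypothetical and do not represent Information Steward's rule language or its scoring scale.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]


def score_rule(records: List[Record],
               rule: Callable[[Record], bool]) -> Tuple[float, List[Record]]:
    """Return the percentage of passing records and the list of failed records."""
    failed = [record for record in records if not rule(record)]
    passed = len(records) - len(failed)
    score = 100.0 * passed / len(records) if records else 100.0
    return score, failed


def country_is_iso2(record: Record) -> bool:
    """Hypothetical rule: country must be a two-letter uppercase code."""
    code = record.get("country", "")
    return len(code) == 2 and code.isalpha() and code.isupper()


customers = [
    {"id": "1", "country": "DE"},
    {"id": "2", "country": "Germany"},
    {"id": "3", "country": "US"},
    {"id": "4", "country": ""},
]

score, failed = score_rule(customers, country_is_iso2)
print(f"score: {score:.1f}%")
print("failed ids:", [record["id"] for record in failed])
```

Note that nothing here changes the data: the failed records are only collected so that a separate, governed process (an MDG change request or a Data Services job, as described above) can correct them.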
Niels