Data Wrangling / Cleansing in PA

How much data wrangling and/or data cleansing does PA perform, and how much do I need to take care of myself (either beforehand or within the PA tool)? For example, how does PA deal with gaps in the data? Does it treat them as a unique data point to take into consideration when generating a model, or does it ignore blanks and leave them out of the equation? How does PA deal with anomalies in the data, e.g. text in a numeric field? Any links to reference materials would be helpful 😊


3 Answers

  • Best Answer
    Former Member
    Oct 24, 2015 at 08:59 AM

    Hi Jerold,

    It depends on which PA mode you use (Expert or Automated). I use Expert mode, and I have noticed that null values in the data can cause problems with R-based algorithms. I recommend cleaning null values before loading the data into PA, for instance in a Microsoft SQL Server or Microsoft Access database if you work on Windows.
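    If you would rather clean the nulls in a script than in a database, here is a minimal sketch in plain Python. It is not part of PA; the column names, the sample data and the list of tokens treated as "null" are all assumptions for illustration, and dropping incomplete rows is only one possible strategy.

```python
import csv
import io

# Tokens treated as "missing". This set is an assumption; adjust it to your data.
NULL_TOKENS = {"", "null", "na", "n/a", "nan"}

def is_missing(value):
    """Return True if a CSV cell should be treated as a null value."""
    return value is None or value.strip().lower() in NULL_TOKENS

def drop_incomplete_rows(rows, required_columns):
    """Keep only the rows where every required column holds a real value."""
    return [
        row for row in rows
        if not any(is_missing(row.get(col)) for col in required_columns)
    ]

# Hypothetical sample standing in for a file exported from your data source.
SAMPLE = """customer_id,age,revenue
1001,34,250.0
1002,,310.5
1003,41,NULL
1004,29,120.0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
clean_rows = drop_incomplete_rows(rows, ["age", "revenue"])

# Only customers 1001 and 1004 have both age and revenue filled in.
print([row["customer_id"] for row in clean_rows])
```

    Whether to drop rows or fill in a default value depends on the algorithm and on how much data you can afford to lose.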


    • Former Member

      Hi Jerold

      If you're using SAP PA Automated mode / InfiniteInsight, things like null values are handled automatically. However, if you are using a time series or regression-based algorithm that involves a date column, check the SAP documentation to see which date formats are supported. The second thing to watch for is junk columns: your data may contain columns with very long text or descriptions, or columns that are blank. In that case, first select the algorithm, load the data, and exclude the unwanted columns before running the model; you will then get your desired output. The link below will help you get acquainted with the tool.

      Use InfiniteInsight Recommendation: SAP InfiniteInsight 7.0 - YouTube
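      To decide which columns to exclude before loading a large file, a quick profile of the data can help. The sketch below is plain Python, not a PA feature; the sample data, the function names and the 50-character threshold for "huge text" are assumptions chosen for illustration.

```python
import csv
import io

def column_profile(rows, column):
    """Summarise one column: is it entirely blank, and how long is its longest value?"""
    values = [(row.get(column) or "").strip() for row in rows]
    non_blank = [v for v in values if v]
    return {
        "all_blank": not non_blank,
        "max_length": max((len(v) for v in non_blank), default=0),
    }

def columns_to_exclude(rows, columns, max_text_length=50):
    """List columns that are entirely blank or hold very long text values."""
    excluded = []
    for column in columns:
        profile = column_profile(rows, column)
        if profile["all_blank"] or profile["max_length"] > max_text_length:
            excluded.append(column)
    return excluded

# Hypothetical sample: "notes" is blank everywhere, "description" holds long text.
LONG_TEXT = "free-text description " * 5
SAMPLE = (
    "id,notes,description,amount\n"
    "1,,Short text,10.5\n"
    f"2,,{LONG_TEXT},20.0\n"
    "3,,Another short one,7.25\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE))
rows = list(reader)
excluded = columns_to_exclude(rows, reader.fieldnames)
print(excluded)
```

      The result is only a list of candidates; you would still exclude the columns yourself in the tool before running the model.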

      Once you understand Predictive Analysis in Automated mode, move on to SAP PA Expert Mode. It requires a more experienced user who understands the algorithms and their coefficients. The out-of-the-box predictive algorithms provide an option to handle missing values and make predictions for them; you just need to tick the check box. The snapshot below shows an example where the option to handle missing values is available.

      For anomalies in the data, e.g. text in a numeric field, SAP PA Expert Mode has a Data Preparation tab where you can convert a numeric field to a text field. In the snapshot below, the customer number is converted to text for analysis.
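      If you want to find stray text in a numeric column before the data reaches PA, a small check like the one below can list the offending values. This is a sketch in plain Python rather than anything PA provides; the column name and sample values are hypothetical.

```python
import csv
import io

def non_numeric_values(rows, column):
    """Return (row_number, value) pairs for non-blank cells that are not numbers.

    Blank cells are skipped here because missing values are a separate problem.
    Note that float() also accepts strings such as "nan" and "inf".
    """
    bad = []
    for row_number, row in enumerate(rows, start=1):
        value = (row.get(column) or "").strip()
        if not value:
            continue
        try:
            float(value)
        except ValueError:
            bad.append((row_number, value))
    return bad

# Hypothetical sample with text and a thousands separator in a numeric column.
SAMPLE = """order_id,amount
A1,12.5
A2,abc
A3,
A4,7
A5,"1,200"
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
problems = non_numeric_values(rows, "amount")
print(problems)
```

      Values such as "1,200" are flagged because of the thousands separator, so some hits may need reformatting rather than removal.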

      Regards

      Ranajay

      Capture1.PNG (37.2 kB)
      Capture1.PNG (56.8 kB)
  • Oct 26, 2015 at 08:17 AM

    Hi Jerold,

    In order to complete the previous answers:

    I hope this helps. You can also try the solution for 30 days; just follow the link: Welcome | SAP

    Best regards,

    Antoine


  • Nov 02, 2015 at 07:24 AM

    Hi Jerold,

    Please mark the question as "Answered" (if you feel that is the case) to make future follow-up easier.

    Thanks & regards

    Antoine
