Former Member

How is data modeling and hypothesis models done in predictive analytics?

What supervised/unsupervised models are used for predictive analytics? Do we need a data scientist to create the model, or does the predictive analytics solution have built-in libraries that choose and deploy the models automatically, for example based on clustering? How can we determine the fit of the model and which model was used for the data set?

I have seen some products in which the system creates multiple models for each line of the dataset automatically, without the need for a data scientist. Is the SAP Predictive Analytics solution along similar lines?


3 Answers

  • Best Answer
    Former Member
    Aug 15, 2017 at 10:56 AM

    Hi Prashant,

    SAP Predictive Analytics has a rich set of functionality for both the Citizen Data Scientist/Business Analyst (Automated mode) and the Data Scientist (Expert mode). Automated mode provides built-in ML techniques (which are supervised), so a user does not need to select a specific algorithm for a given business problem. It also takes care of binning, missing value handling and encoding while generating predictive models. Expert mode, on the other hand, gives Data Scientists the flexibility to choose algorithms and build complicated model chains via predictive libraries, including R. The algorithms are not limited to training alone; they also cover data preparation, feature generation/selection, scoring and operationalization of models. There are a lot of blogs, forums and media resources on the community that can help you get up to speed, in addition to the links above.
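
    To make the three preparation steps named above concrete, here is a minimal generic sketch in Python. It is not how Automated mode implements them internally (the thread does not describe that); the function names and the choices of median imputation, equal-width bins and one-hot encoding are assumptions for illustration only.

```python
from statistics import median


def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index in 0..n_bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls in the first bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def one_hot(categories):
    """Encode each category as a 0/1 vector over the sorted distinct levels."""
    levels = sorted(set(categories))
    return [[1 if c == level else 0 for level in levels] for c in categories]


ages = impute_missing([20.0, None, 60.0])
age_bins = equal_width_bins(ages, 2)
regions = one_hot(["EU", "US", "EU"])
```

    An automated tool performs steps of this kind for you; in Expert mode you would choose and chain them yourself.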

    Regards. Priti


  • Oct 13, 2017 at 02:57 PM

    A few comments.

    SAP Predictive Analytics - Automated Mode makes it possible to perform supervised clustering, which is yet another unique value of the tool. It also allows recommendation and social models to be created in a simple way.

    Whether a Data Scientist is required to operate Automated Analytics depends on how we define a data scientist. In my view, Michael makes a strong point: most of the work consists of translating a business question into the relevant data set, so that it can be addressed with predictive technologies. So you need a person who can translate business needs into datasets.

    Predictive Power is strongly tied to AUC.

    I have seen some products in which the system creates multiple models for each line of the dataset automatically without the need for data scientist

    We have segmented time series in Factory out of the box. You can perform segmented automated classification using our scripting language KxShell. Having this capability out of the box in the product is part of the roadmap.
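
    For readers unfamiliar with the idea of segmented modelling, here is a minimal Python sketch of "one model per segment". It is not KxShell and not the Factory implementation; the function name, the column names and the deliberately trivial per-segment "model" (the base rate of the target) are assumptions for illustration.

```python
from collections import defaultdict


def fit_per_segment(rows, segment_key, target_key):
    """Group rows by segment and fit one trivial model per segment.

    The "model" here is just the mean of the target within the segment;
    a real tool would train a full classifier or forecast per segment.
    """
    targets_by_segment = defaultdict(list)
    for row in rows:
        targets_by_segment[row[segment_key]].append(row[target_key])
    return {
        segment: sum(targets) / len(targets)
        for segment, targets in targets_by_segment.items()
    }


rows = [
    {"region": "EU", "churn": 1},
    {"region": "EU", "churn": 0},
    {"region": "US", "churn": 1},
]
models = fit_per_segment(rows, "region", "churn")
```

    The point is only the structure: split the data by a segment column, train independently on each part, and keep a dictionary of models keyed by segment.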

    Kind regards

    Antoine Chabert


  • Oct 13, 2017 at 01:37 AM

    Hi Prashant,

    What supervised/unsupervised models are used for predictive analytics?

    Predictive Analytics is composed of two products. Automated Analytics has a "Clustering Model" for unsupervised learning; for supervised learning it offers machine-learning-based regression and binary classification.

    Expert Analytics has this functionality too, plus access to R libraries, which contain many supervised/unsupervised algorithms.

    Do we need a data scientist to create the model, or will the predictive analytics solution have built-in libraries and, based on the clustering, deploy the models automatically?

    Yes and yes. The libraries are available, and the models can be easily deployed using the Predictive Factory. You will still need a data scientist to convert the business problem/question into an analytics solution on these systems, and to build the complex data transformation pipeline that these models often require.

    How can we determine the fit of the model...

    Automated Analytics gives an "Area under curve" metric for its supervised algorithms; for clustering there are "Overlap" and "Unassigned Records" counts. The metrics "Predictive Power" and "Predictive Confidence" are given in both cases.
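
    If you want to sanity-check an "Area under curve" figure outside the tool, here is a minimal Python sketch that computes AUC as the fraction of (positive, negative) pairs the model ranks correctly. The function names and sample data are assumptions for illustration. The Gini coefficient, 2 * AUC - 1, is one common rescaling of AUC to a 0..1 range for useful models; this thread does not confirm whether "Predictive Power" is defined exactly that way.

```python
def auc(labels, scores):
    """AUC via pairwise comparison: ties between a positive and a negative count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def gini(auc_value):
    """Rescale AUC so that a random model scores 0 and a perfect one scores 1."""
    return 2.0 * auc_value - 1.0


labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]
model_auc = auc(labels, scores)   # 7 of the 9 positive/negative pairs are ordered correctly
model_gini = gini(model_auc)
```

    The pairwise loop is quadratic, which is fine for a spot check on a sample but not for large data sets.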

    For Expert Analytics, it depends on which model you are running. Here is an example output of an R model I made for a simple churn model.

    and which model was used for the data set?

    You usually select the model to be used, but Automated Analytics uses proprietary models (KXEN). For instance, the regression/classification is based on Ridge Regression.
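
    As background on ridge regression in general (not the proprietary KXEN implementation, whose details are not given here), this minimal Python sketch fits a single-feature ridge model. It minimizes the sum of squared errors plus lam times the squared slope, leaving the intercept unpenalized. The function name and the parameter name lam are assumptions for illustration.

```python
def ridge_fit_1d(xs, ys, lam):
    """Closed-form ridge fit for y = intercept + slope * x with penalty lam * slope**2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Centered cross-product and sum of squares.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx + lam == 0:
        raise ValueError("x has no variance and lam is 0; slope is undefined")
    slope = sxy / (sxx + lam)          # lam > 0 shrinks the slope toward 0
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]              # exactly y = 1 + 2x
ols = ridge_fit_1d(xs, ys, 0.0)        # lam = 0 is ordinary least squares
shrunk = ridge_fit_1d(xs, ys, 5.0)     # the penalty pulls the slope below 2
```

    With lam = 0 the fit recovers the true line; increasing lam trades some bias for lower variance, which is why ridge tends to behave well with many correlated inputs.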

    Is SAP predictive analytics solution on similar lines?

    I need more info to answer this question, but it generally shares features with other analytics tools. It's a nice product if you want to productionise machine learning models in organisations that use SAP ERP.

    Cheers, Michael

