on 12-05-2016 9:45 PM
Team
Currently, with delegation, we have the following restrictions with APL:
Restriction
Cases when model training is not delegated to APL:
1) Will the same restrictions apply to delegation with Native Spark as well?
2) Why would a cutting strategy affect delegation if I want to manually specify a cutting strategy (70% Training, 20% Validation, 10% Testing)?
3) If I specify a manual cutting strategy, will all the data be replicated to the PA server?
Hi, first things first, let's not confuse APL delegation (HANA world) and Native Spark Modeling (Hadoop/Spark world) 😉
1) The reference for Native Spark Modeling restrictions is the SAP note https://launchpad.support.sap.com/#/notes/2391541.
2) This is a current restriction where APL delegation does not kick in. I do not have the detailed answer at hand, but I can involve the appropriate product manager if need be. Can you please explain why this is a concern?
3) When a restriction applies to APL delegation (HANA world) or Native Spark Modeling (Hadoop/Spark world), it does not mean that the data is replicated. It is in fact queried from the underlying database or data lake and transferred to the PA server (or to the desktop, if the desktop is used) for processing. The real bottleneck for predictive projects is not necessarily the creation of the predictive model, but rather the scoring of new data rows, and scoring is processed purely in-database.
I hope this helps, thanks & regards Antoine
Thank you, Antoine, for the detailed explanation.
1) Since the data is queried, we are still bringing data over to the PA server for processing, and if I am querying more than a million rows this would definitely affect performance.
2) As for the cutting strategy, based on what I have seen, data scientists prefer a customized method rather than relying on SAP's cutting strategy, because it gives them more control.
On point 2, this is a technical restriction: the underlying APL stored procedure takes a single table as the input dataset in its signature, so it does not support custom cutting strategies where the user specifies 2 or 3 input datasets (estimation, validation and test).
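To make the shape mismatch concrete, here is a minimal Python sketch of a manual 70/20/10 cut. It is only an illustration and not APL or Automated Analytics code: the function name `manual_cut`, its parameters and the fixed seed are all assumptions made for the example. The point is that a custom cut produces three separate datasets, whereas a procedure whose signature accepts one input table has nowhere to receive them.

```python
import random


def manual_cut(rows, fractions=(0.7, 0.2, 0.1), seed=42):
    """Hypothetical helper: split rows into estimation, validation and test sets."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")

    # Shuffle a copy so the caller's data is left untouched;
    # the fixed seed keeps the split reproducible.
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    n_est = round(n * fractions[0])
    n_val = round(n * fractions[1])

    estimation = shuffled[:n_est]
    validation = shuffled[n_est:n_est + n_val]
    test = shuffled[n_est + n_val:]  # remainder goes to the test set
    return estimation, validation, test


# Three outputs, not one: this is what a single-table input signature cannot take.
estimation, validation, test = manual_cut(range(100))
print(len(estimation), len(validation), len(test))
```

With 100 input rows this yields 70, 20 and 10 rows respectively, each row landing in exactly one of the three sets.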