on 08-12-2021 4:10 AM
Hello, everyone
My client is using PA Time Series,
so I need to explain all the details to my clients
Below are my questions (sorry, there are a lot of them).
1. In "Data Description" panel, I need to set 1 on "Order" for the date variable.
What is the meaning of "Order"?
2. I guess, Time Series are doing Regression too internally in most cases.
Is Time Series using the same strategy as that of Regression?
That is, for example, is Time Series using criteria like KI + KR >= (some number) by default?
And also, is Time Series doing the same steps to get maximum smart variable selection?
3. As related to question 2, in Time Series, when it is doing regression, is it doing the same "repeated" procedures to remove an unnecessary variable for each step?
(For example, "Classification/Regression", unlike "Time Series", repeats steps that each remove one unimportant variable.)
4. In "Specific Parameters of the Model" --> General Panel, there is "Variable Selection"
and "Percentage of Variable Contributions to Keep"
What is the exact meaning of this? Does it mean that each variable is kept if KI drops by 5% or less?
5. At the same panel, there is "Maximum Order of the Autoregressive Model",
In many cases, this edit box is empty without default value. If this box is empty, what value does Time Series use?
Or if that is empty, what does Time Series do in doing AR modeling?
6. At the same panel, there is "Activate for All Extrapredictable-based Trends"
I cannot understand what this option is doing.
Please explain to me what Time Series do when this check box is on and also when it is off.
7. As related to question 6, there I can see in Display Menu (after generating the model)
"Regression: Contribution by Variables"
There I can see the chart showing Maximum Smart Variable Contributions, Variable Contributions etc.
But even though I check the "Activate for All Extrapredictable-based Trends" on or off,
this chart never shows the whole list of extrapredictable variables in any case
but the number of extrapredictable variables shown changes if I check it on or off, anyway.
What is the difference between when "Activate for All Extrapredictable-based Trends" is on and when it is off
in deciding the number of extrapredictable variables selected in Variable Contributions or in Maximum Smart Variable Contributions?
8. At the same panel, there is "Activate for all autoregressive models".
It says, when it is checked on, it is doing parsimonious AR modeling.
Does this mean Time Series will set some of the coefficients in AR(n) to zero to get parsimonious one?
What is the exact formula when it is parsimonious?
9. The reference guide says, Time Series will create model = Trend + Cycle + AR + residual, if it is not doing Exponential Smoothing.
Can I get the exact formula including coefficients when the model is created?
10. For Trend, there is list Lag1, Lag2, Second Order Differencing....etc
Can the Trend component combine several of these at the same time when it is created?
That is, Can Trend be something like Lag1 + Lag2 + Linear in Time?
11. For Trend, I guess Time Series looks like it is doing Regression, not "Time Series".
Is this correct? (But I am not sure what it does for Lag1, Lag2, Second Order Differencing)
12. I ran several forecasts after generating models. I found that the forecast MAPE does not increase over time,
even though each forecast is built from the previous forecast, which should increase the error.
May I ask why the forecast MAPEs do not increase over time?
13. In "Model Overview", there is shown horizon-wide MAPE.
Is this MAPE calculated using estimation data only ? or using validation data only ?
14. In "Display" --> "Regression: Contributions by Variables" chart,
I can see variable something like ExtrasPred_xxxx where xxxx is one of the extrapredictable variable.
What are these variables?
15. In classification/regression, there are variables like c_xxxx where xxxx is the name of the variable.
In Time Series, are these c_xxxx variables also created?
16. In "Statistical Reports" --> "Performance Indicators" --> "Forecast MAPE"
I found that the MAPE value is constant across the different forecasts.
Could you please explain why MAPE values are constant?
Also, for Forecast Error Bars, Forecast Efficiency, Other Performance Indicators, they are also all constant. Why are they constant?
17. Also, related to 16, what is Forecast Efficiency?
18. In "Statistical Reports" --> "Performance Indicators" --> "Other Performance Indicators"
What is Quality Coefficient?
Sorry for all long list of questions. Thank you in advance
Here are complementary answers.
Q: Regarding the report Performance Indicators > Forecast Mean Absolute Percentage Error, under the menu Display > Statistical Reports:
how come the Forecast 1 MAPE, the Forecast 2 MAPE, …, the Forecast n MAPE are sometimes constant, and sometimes not?
A: They are not constant if there is at least one component
in the forecasting model that relies on past values, such as: L1, L2, 2*L1-L2,
AR, or Exponential smoothing.
The evaluation of the time series model accuracy is based on a rolling forecast
origin. Take as an example a forecast with horizon 6 on a monthly
series. Under the hood, the application produces 6-month-wide forecasts from
each single period in the past, then the MAPE values for these forecasts are
computed. The Forecast 1 MAPE is obtained by averaging all the H1 MAPE values, the
Forecast 2 MAPE by averaging all the H2 MAPE values, …, the Forecast 6 MAPE by averaging
all the H6 MAPE values.
With a linear model, say, Ŷ = 92574 - 26.18 X
whatever the origin is (e.g. Feb-21 or Mar-21) the forecast for Jun-21
remains the same, therefore the 6 Forecast MAPE values in the report will be
constant.
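To make the rolling-origin idea concrete, here is a minimal Python sketch. It is not the SAP Predictive Analytics implementation: the function names, the synthetic series, and the assumption that every horizon is scored on the same target dates are mine. It only shows why a model that depends on the time index alone gives identical Forecast 1 to Forecast 6 MAPE values, while a model that relies on the last observed value does not.

```python
import numpy as np

def per_horizon_mape(series, forecaster, horizon, first_target):
    """Return [Forecast 1 MAPE, ..., Forecast `horizon` MAPE] in percent.

    Assumption: every horizon h is scored on the same target dates. For a
    target t, the h-step forecast is the one issued from origin t - h.
    """
    mape = []
    for h in range(1, horizon + 1):
        ape = []
        for t in range(first_target, len(series)):
            origin = t - h  # index of the last observed value
            preds = forecaster(series[: origin + 1], origin, h)
            ape.append(abs((series[t] - preds[-1]) / series[t]))
        mape.append(100.0 * np.mean(ape))
    return np.array(mape)

def linear_trend(history, origin, steps):
    # Fixed equation from the answer above; it uses the time index only.
    x = np.arange(origin + 1, origin + steps + 1)
    return 92574 - 26.18 * x

def last_value(history, origin, steps):
    # Lag-1 style forecast: repeats the last observed value.
    return np.full(steps, history[-1])

# Synthetic series (hypothetical data): the linear trend plus noise.
rng = np.random.default_rng(0)
x = np.arange(60)
series = 92574 - 26.18 * x + rng.normal(0, 100, size=x.size)

trend_mape = per_horizon_mape(series, linear_trend, horizon=6, first_target=12)
naive_mape = per_horizon_mape(series, last_value, horizon=6, first_target=12)
print("linear trend:", trend_mape)
print("last value:  ", naive_mape)
```

With the linear trend, the forecast for a given date does not change with the origin, so all six values are equal. With the last-value forecaster, the forecast for a given date changes with the origin, so the values differ by horizon.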
Q: Can I get the time series model equation?
A: Unlike with Classification or Regression, the generation of the model equation in the case of Time Series is not supported.
The first diagram in the blog below gives the different calculation steps for Time Series modeling.
Note that Exponential smoothing is not represented in that diagram. With SAP Predictive Analytics, the Exponential smoothing technique is used separately in two cases: i) It has been explicitly requested by the user through the interface ii) It serves as a fallback mechanism when the dataset is too small or when a pattern could not be found.
Q: What is the default value for "Maximum Order of the Autoregressive Model"
A: 100.
I could not reproduce the case of an empty edit box that you mentioned.
Q: What’s in the Forecast Efficiency report?
A: The squared Pearson correlation coefficient.
Dear Marc:
Thank you for your detailed explanation
It helps me a lot.
I have some questions about your answers.
- For the Forecast Efficiency Report, if those values are r^2 (squared Pearson), how is r^2 calculated for each forecast?
(Sorry, I actually don't understand how r^2 makes sense for a future forecast and what r^2 actually is here.)
- For Forecast MAPE Report and Forecast Error Bars Report,
What is the origin if training period is 2015-01-01 ~ 2015-03-31 (daily series) ?
What is horizon if training period is 2015-01-01 ~ 2015-03-31 (daily) and if I requested 2 future forecasts (daily)?
And using the same (training period is 2015-01-01 ~ 2015-03-31 daily and if I requested 2 future forecasts (daily)),
what is H1 and what is H2?
and how are H1 MAPE and H2 MAPE and so on... calculated?
Thank you and see you again, Marc.
Squared Pearson in the Forecast Efficiency report is based
on the Pearson coefficient defined as:
Covariance(Actual,Forecast) / (StandardDeviation(Actual) * StandardDeviation(Forecast))
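As a minimal Python sketch of that definition (the function name and the sample numbers are hypothetical, and how the product pairs up actuals and forecasts for each horizon is not stated in this thread):

```python
import numpy as np

def squared_pearson(actual, forecast):
    """Squared Pearson coefficient between actual and forecast values."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    # Population covariance and population standard deviations (ddof=0),
    # so numerator and denominator use the same convention.
    cov = np.mean((actual - actual.mean()) * (forecast - forecast.mean()))
    r = cov / (actual.std() * forecast.std())
    return r ** 2

# Made-up values, only to exercise the formula.
actual = [102.0, 98.0, 110.0, 107.0, 95.0]
forecast = [100.0, 99.0, 108.0, 104.0, 97.0]
r2 = squared_pearson(actual, forecast)
print(r2)
```

The result lies between 0 and 1, and it equals 1 when the forecasts are an exact linear function of the actuals.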
As for H1, H2, …, H6, it is a way to represent the sequence within the forecast horizon. The MAPE calculations are described here: https://blogs.sap.com/2021/04/21/understand-accuracy-measure-of-time-series-forecasting-models
Thanks Marc!
Just a note in case anyone else is looking into this - remaining points are 4/5/7/8/12/16/17/18. We are half-way 🙂
9. There is no formula for trend & cycles. This is not relevant.
Hi, answers below.
To be completed over time as I have time.
1. It's a way to tell the predictive engine how the dates are ordered (from oldest to newest). It should not be up to the end user to do this; unfortunately it is (but not in SAP Analytics Cloud).
2. This is not correct, please refer to the detailed blogs I shared earlier on. We only use some regression techniques to determine fluctuations. The time series algorithm first de-trends the signal (identifies trends, if any), then de-cycles it (identifies cycles, if any), then determines possible fluctuations. This is the overall logic, with some recently introduced nuances (exponential smoothing is part of the internal competition).
3. No. Use influencing variables. Let the automated technique identify the relevant ones. Prune the useless ones.
4. & 5. I need some more time
6. It controls whether influencers (additional variables) are used in the trend identification step.
7. & 8. I need some more time
9. do you mean formula for AR? something else?
10. Only one.
11. You are not correct - see 2.
12. I am not sure I understand the question, sorry.
13. You can refer to this to have a general understanding of HW MAPE calculation. Please plan some time to go through this https://blogs.sap.com/2021/04/21/understand-accuracy-measure-of-time-series-forecasting-models/
14. These are the variables you have in your data model for time series
15. No
16. & 17. & 18 more time needed.
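To illustrate the overall logic in point 2 (de-trend, then de-cycle, then model the fluctuations), here is a minimal Python sketch. It is not the product's algorithm: the linear trend, the fixed period, the AR(1) fit, and all names are assumptions chosen to keep the example short. It only shows how a signal splits into Trend + Cycle + AR + residual.

```python
import numpy as np

def decompose(y, period):
    """Split y into trend + cycle + fluctuation + residual (illustrative only)."""
    y = np.asarray(y, dtype=float)
    t = np.arange(y.size)

    # 1. De-trend: fit a straight line in time (assumed trend shape).
    slope, intercept = np.polyfit(t, y, deg=1)
    trend = intercept + slope * t
    detrended = y - trend

    # 2. De-cycle: average the de-trended signal at each position in the period.
    positions = t % period
    profile = np.array([detrended[positions == p].mean() for p in range(period)])
    cycle = profile[positions]
    remainder = detrended - cycle

    # 3. Fluctuations: AR(1) coefficient fitted by least squares on the remainder.
    prev, curr = remainder[:-1], remainder[1:]
    phi = float(prev @ curr / (prev @ prev))
    fluctuation = np.concatenate(([0.0], phi * prev))

    residual = remainder - fluctuation
    return trend, cycle, fluctuation, residual

# Synthetic monthly-like series (hypothetical data).
rng = np.random.default_rng(1)
t = np.arange(48)
y = 200 + 1.5 * t + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 2, size=t.size)

trend, cycle, fluctuation, residual = decompose(y, period=12)
print(np.allclose(trend + cycle + fluctuation + residual, y))
```

The four components add back up to the original signal by construction. In the product, the candidate trends, cycles and fluctuation models are selected automatically rather than fixed as they are here.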
Not sure whether your client is aware that SAP Predictive Analytics is currently in mainstream maintenance mode (ending at the end of 2022) with no active feature development. Just in case, this is the official blog: https://blogs.sap.com/2020/01/27/sap-predictive-analytics-maintenance-policy/?preview_id=1037565. I would actively encourage your customer to explore complementary options in the SAP product portfolio to support their predictive use cases. In my humble opinion, SAP Analytics Cloud might be a good fit for the future.
Best regards,
Antoine
Hello,
9. Yes formula for AR and also, if possible, formula for trend, cycles, including their coefficients
12. If I choose "3 forecasts" at Summary of Modeling Parameters, and if the last training date is 2017-04-30,
it will forecast three values for 2017-05-01, 2017-05-02, 2017-05-03.
And the forecast value of 2017-05-02 depends on the forecast value of 2017-05-01
the forecast value of 2017-05-03 depends on the forecast values of 2017-05-01 and 2017-05-02.
Because those forecast values are estimates, not true values, the estimation errors should theoretically increase from 2017-05-01 to 2017-05-03,
but I found that those errors do not increase. So I guess the forecast algorithm uses some more elaborate theory than simple time series forecasting.
So I want to know what algorithm Time Series uses when forecasting.