on 02-08-2016 8:42 PM
Hi,
I am trying to run a time-series analysis on a large dataset. When I reach the "Selecting Variables" screen (where one chooses the time and target variables), the system shows "Guessing default variables" in a separate window with a progress bar moving back and forth. Because of the huge number of records, this takes a very long time.
Question: Can I switch off that guessing? I already know what the time and target variables should be.
Thanks,
Ingo
Hi Antoine,
Just take the electric power consumption data from https://archive.ics.uci.edu/ml/datasets/Individual+household+electric+power+consumption.
It contains approximately 2 million records with a per-minute timestamp.
When Automated Analytics reaches the variable selection, you will see the "Guessing default variables" message for several minutes.
Since I already know which variables to choose as time and target, this step is superfluous and only causes long waiting times.
Regards,
Ingo
Instead of clicking Analyze (the button that guesses the data description), you can click Open Description, which reads the metadata from a text file (see the example below). Alternatively, use a file containing a subset of the data (the first 200 rows, for instance), run Analyze, save the description file, go back, select the file with all the data, and click Open Description.
RANK | NAME | STORAGE | VALUETYPE | KEYLEVEL | ORDERLEVEL | MISSINGSTRING | GROUPNAME | DESCRIPTION |
---|---|---|---|---|---|---|---|---|
0 | Date | datetime | continuous | 1 | 1 | | | |
1 | WorkingDaysIndices | number | continuous | 0 | 0 | | | |
2 | ReverseWorkingDaysIndices | number | continuous | 0 | 0 | | | |
3 | MondayMonthInd | number | ordinal | 0 | 0 | | | |
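The subset workaround described above needs a small file with the same header and layout as the full data. A minimal Python sketch for producing one is shown here; the function name `write_sample` and the file names in the usage note are hypothetical, and it assumes the data is a plain-text file with a single header line.

```python
from itertools import islice
from pathlib import Path


def write_sample(source: Path, sample: Path, n_rows: int = 200) -> int:
    """Copy the header line plus the first n_rows data lines to a new file.

    Returns the number of data rows actually written.
    Assumption: the source file is text with exactly one header line.
    """
    with source.open("r", encoding="utf-8", newline="") as src, \
            sample.open("w", encoding="utf-8", newline="") as dst:
        # The header is copied unchanged so the sample has the same columns.
        dst.write(src.readline())
        written = 0
        # islice stops after n_rows lines, so the large file is never fully read.
        for line in islice(src, n_rows):
            dst.write(line)
            written += 1
    return written
```

For example, `write_sample(Path("household_power_consumption.txt"), Path("sample_200.txt"))` would create the small file to run Analyze on; the saved description can then be reused with the full file via Open Description.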
Hi Marc,
I actually do use a description file to describe the data.
The issue I have occurs one step later, when it comes to selecting the variables (see the screenshot: target, weight, excluded variables). I do not even get a chance to skip the guessing: as soon as I click Next on the variable description screen, the variable guessing starts.
This happens regardless of whether I let the system analyze the data or use a description file.
Regards,
Ingo
I can see the Initialization message when using the very large dataset: Electric Power Consumption. The message appears just for a few seconds on my laptop with SAP Predictive Analytics 2.4. I presume this time is used to check the data in order to set up the Last Training Date and Line information at the bottom left of the screen. I don't think this initialization step can be deactivated.
Looking now at the business problem to solve (the forecast): should the user run a forecast directly on the raw data, which is a series of more than 2 million points? Or should that data be aggregated first before performing the forecast?
By the way, as an SAP employee, you can use the SAP Predictive internal forum.
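As a rough illustration of what aggregating first could look like, the Python sketch below averages per-minute readings into hourly values before any forecasting is done. It is only a sketch under assumptions not confirmed in this thread: that the file is semicolon-separated, has `Date` (dd/mm/yyyy) and `Time` (hh:mm:ss) columns, uses `?` for missing values, and that `Global_active_power` is the target. The function names are hypothetical.

```python
import csv
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, Mapping


def hourly_mean(rows: Iterable[Mapping[str, str]],
                value_col: str = "Global_active_power") -> Dict[datetime, float]:
    """Average per-minute readings into one value per hour.

    Assumptions: each row has 'Date' (dd/mm/yyyy) and 'Time' (hh:mm:ss),
    and missing readings are empty or marked with '?'.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        raw = row.get(value_col)
        if raw is None or raw.strip() in ("", "?"):
            continue  # skip missing readings instead of treating them as zero
        stamp = datetime.strptime(f"{row['Date']} {row['Time']}",
                                  "%d/%m/%Y %H:%M:%S")
        hour = stamp.replace(minute=0, second=0)  # truncate to the hour
        sums[hour] += float(raw)
        counts[hour] += 1
    return {hour: sums[hour] / counts[hour] for hour in sorted(sums)}


def hourly_mean_from_file(path: str) -> Dict[datetime, float]:
    """Read a semicolon-separated file and aggregate it to hourly means."""
    with open(path, newline="", encoding="utf-8") as handle:
        return hourly_mean(csv.DictReader(handle, delimiter=";"))
```

Aggregating to hours would shrink roughly 2 million per-minute rows by a factor of about 60; whether hourly, daily, or some other granularity is appropriate depends on the forecast horizon that is actually needed.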