Guessing default settings

i033659
Advisor

Hi,

I am trying to run a time-series analysis on a large dataset. When I enter the "Selecting Variables" screen (i.e. where one chooses the time and target variables), the system shows "Guessing default variables" in a separate window with a bar moving back and forth. Because of the huge number of records, this takes a very long time.

Question: Can I switch off this guessing? I already know what the time and target variables should be.

Thanks,

Ingo

Accepted Solutions (1)

achab
Product and Topic Expert

Good question. I do not know this message. As you are from SAP, can you please provide us with a way to test this in-house?

Thanks & regards

Antoine

i033659
Advisor

Hi Antoine,

Just take the electric power consumption data from https://archive.ics.uci.edu/ml/datasets/Individual+household+electric+power+consumption.

This is approximately 2 million records with per-minute timestamps.

When Automated Analytics gets to the variable selection, you will see the "Guessing default variables" message for several minutes.

Since I already know which variables to use as time and target, this step is superfluous and only causes long waiting times.

Regards,

Ingo

marc_daniau
Advisor

Instead of clicking Analyze (the button that guesses the data description), one can click Open Description, which reads the metadata from a text file (see the example below). An alternative is to use a file that is a subset of the data (the first 200 rows, for instance): run Analyze on it, save the description file, go back, select the file with all the data, and click Open Description.

RANK  NAME                       STORAGE   VALUETYPE   KEYLEVEL  ORDERLEVEL  MISSINGSTRING  GROUPNAME  DESCRIPTION
0     Date                       datetime  continuous  1         1
1     WorkingDaysIndices         number    continuous  0         0
2     ReverseWorkingDaysIndices  number    continuous  0         0
3     MondayMonthInd             number    ordinal     0         0
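
The subset file mentioned above (a header plus the first 200 rows) could be produced with a few lines of Python. This is only a sketch: the function name `write_sample` and the file names in the usage comment are hypothetical, and it assumes the data is a text file with one header line.

```python
from itertools import islice


def write_sample(src_path, dst_path, n_rows=200):
    """Copy the header line plus the first n_rows data lines to dst_path.

    Assumption: src_path is a text file whose first line is a header.
    """
    with open(src_path, "r", encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        # islice stops after n_rows + 1 lines, so the large file
        # is never read in full.
        dst.writelines(islice(src, n_rows + 1))


# Hypothetical usage (file names are placeholders):
# write_sample("full_data.txt", "sample_200.txt")
```

The sample file is then used only to run Analyze and save the description; the full file is loaded afterwards with Open Description.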
i033659
Advisor

Hi Marc,

Actually, I already use a description file to describe the data.

The issue I have occurs one step later, when it comes to selecting the variables (target, weight, excluded variables; see the screenshot). I have no way to skip the guessing: as soon as I click Next on the variable description screen, the variable guessing comes up.

This happens regardless of whether I let the system analyze the data or use a description file.

Regards,

Ingo

marc_daniau
Advisor

I can see the Initialization message when using the very large dataset: Electric Power Consumption. The message appears just for a few seconds on my laptop with SAP Predictive Analytics 2.4. I presume this time is used to check the data in order to set up the Last Training Date and Line information at the bottom left of the screen. I don't think this initialization step can be deactivated.

Looking now at the business problem to solve (the forecast): should the user run a forecast directly on the raw data, a series of more than 2 million points? Or should that data be aggregated first, before performing the forecast?

By the way, as an SAP employee, you can use the SAP Predictive internal forum.

i033659
Advisor

Hi Marc,

Thanks for the answer: it cannot be deactivated. That is what I wanted to know.

As for the business question: no, the 2 million records will definitely not be used for the forecast directly. In fact, I aggregated them up to days and then made a forecast for one week.

Thanks,

Ingo
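
For readers who want to reproduce the daily aggregation described above, here is a minimal sketch using only the Python standard library. The thread does not say how the aggregation was done, so everything here is an assumption: the daily mean as the aggregate, the column names `Date` and `Global_active_power`, the `dd/mm/yyyy` date format, `?` as the missing-value marker, the `;` delimiter, and the file name in the usage comment.

```python
from collections import defaultdict
from datetime import datetime


def daily_means(rows, value_col="Global_active_power"):
    """Aggregate per-minute records to one mean value per day.

    rows: iterable of dicts with a 'Date' key (dd/mm/yyyy, assumed)
    and a value_col key. Missing values ('' or '?') are skipped.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        raw = row[value_col]
        if raw in ("", "?"):
            continue  # skip missing measurements
        day = datetime.strptime(row["Date"], "%d/%m/%Y").date()
        totals[day] += float(raw)
        counts[day] += 1
    # Return the days in chronological order.
    return {day: totals[day] / counts[day] for day in sorted(totals)}


# Hypothetical usage (file name and delimiter are assumptions):
# import csv
# with open("household_power_consumption.txt", newline="") as f:
#     daily = daily_means(csv.DictReader(f, delimiter=";"))
```

The resulting daily series is small enough that the variable-guessing step is no longer noticeable.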