Skip to Content
0
Former Member
Jun 15, 2016 at 01:29 PM

HANA update of table slowly, looking for improvments of update procedure

424 Views

Hello,

I'm facing problem with very slow table updates. Currently I perform several 1000 updates on a table sequential. I would like to know some improvements for a better performance.

I will give a short introduce on what I'm doing exactly:

My starting situation is a table with sequences for different properties. Each value of the sequence represents one measured value for one day.

SEQUENCE PROPERTY 22;25;30;24;20 Temperature 23;25;30;20;18 Temperature 60;70;50;40;60 Humidity ... ...

On this I perform a sequential pattern mining using the SPADE implementation in R, getting as result a frequent sequence and the support of it. Let's say the result looks like following one:


ID FREQ-SEQUENCE SUPPORT PROPERTY 1 25;30 1 Temperature 2 23;25 0.5 Temperature 3 60;70;50 0.33 Humidity .. .. .. ..


Now I have a procedure which creates dynamical a table where each frequent sequence represents one column. The column name consist's of the property and the ID of the frequent sequence it represents (e.g "Temperature_1" stands for the freq. sequence "25:30"). Additional there are the columns sequence and property.

This table represents if a frequent sequence occurs in a sequence or not. Therefore I fill them with 1 (occurs) or 0 (occurs not). E.g. as I mentioned column "Temperature_1" stands for the freq. sequence "25;30" this occurs in the sequence "22;25;30;24;20" so it will be "1" as it occurs also in "23;25;30;20;18" the second row in column "Temperature_1" will be also "1" but the third will be "0" as it's a totally other property.


Temperature_1 Temperature_2 Humidity_3 SEQUENCE PROPERTY 1 0 0 22;25;30;24;20 Temperature 1 1 0 23;25;30;20;18 Temperature 0 0 1 60;70;50;40;60 Humidity ... ... ... ... ...


Filling the table is also where I facing the performance problems.


In a first step I just insert the sequences and properties in the related columns.


To fill the table now with "0" or "1" for occurrence or absence of a frequent sequence in a sequence I use the following algorithm:


-- iteration over all columns (frequent sequences)
FOR freqSequenceID IN 1..:maxFreqSequenceID DO


  --get the propertie and the frequent sequence which the current column represents
  SELECT "PROPERTY" INTO property FROM  FREQUENT_SEQUENCE_TBL WHERE "ID" = :freqSequenceID;
  SELECT "SEQUENCE" INTO freqSequence FROM  FREQUENT_SEQUENCE_TBL WHERE "ID" = :freqSequenceID;

  -- concatenate the name of current column
  colName := CONCAT(:property,'_');
  colName := CONCAT(:colName,freqSequenceID);

  -- update all rows in the column with "0"
  insertVar := CONCAT ('UPDATE OCCURENCE_TBL SET  "', :colName);
  insertVar := CONCAT (insertVar, '" = 0');
  EXECUTE IMMEDIATE (:insertVar);

  -- update all rows where the current frequent sequence occurs in the sequence with one
  --      UPDATE OCCURENCE_TBL SET  OCCURENCE_TBL.colName = 1 FROM OCCURENCE_TBL
  --      INNER JOIN "SEQUENCE_TBL" ON OCCURENCE_TBL."SEQUENCE_ID" = SEQUENCE_TBL."SEQUENCE_ID" AND OCCURENCE_TBL."PROPERTY" = :property
  --      WHERE LOCATE ( SEQUENCE_TBL."SEQUENCE", :freqSequence) >= 1 AND SEQUENCE_TBL."PROPERTY" = :property

  insertVar := CONCAT ('UPDATE O SET  O."', :colName);
  insertVar := CONCAT (insertVar, '" = 1 FROM OCCURENCE_TBL AS O INNER JOIN "SEQUENCE_TBL" AS S ON O."SEQUENCE_ID" = S."SEQUENCE_ID" AND O."PROPERTY" = ''');
  insertVar := CONCAT (insertVar, :property);
  insertVar := CONCAT (insertVar, '''WHERE LOCATE ( S."SEQUENCE", ''');
  insertVar := CONCAT (insertVar, :freqSequence);
  insertVar := CONCAT (insertVar,''') >= 1 AND S."PROPERTY" = ''');
  insertVar := CONCAT (insertVar, :property);
  insertVar := CONCAT (insertVar, '''');
  EXECUTE IMMEDIATE (:insertVar);
  
END FOR;


I think that the slow performance of it is because a update is sequential done for each column and not together at all in the end.

I tried already to collect all this update statement and execute them after the loop but it seems that with EXEC / EXECUTE IMMEDIATE only one statement is possible.

Has anyone an idea how to improve the update process of this table?

When it's possible it could be also a virtual table, it must be not persistent. In the next step it would be used for a classification with CART.

Thank you in advance

Matthias