cancel
Showing results for 
Search instead for 
Did you mean: 

HANA loads the right partition for range-partitioned table only for queries in SQL plan cache

andrey_ryzhkov
Participant
0 Kudos

Hi!

I've noticed a strange behavior of HANA (SPS 08 and 09) with tables loading in memory. I have a big column table partitioned by HASH on the first level and by RANGE (calendar period) on the second level. For clean test I unload the table comletely from memory. After that I run an SQL query with the restriction of calendar period in the way that only single or very few partitions must be analyzed. If that query is missing in SQL plan cache HANA loads all partitions in memory, but only the columns involved in the query. After that I unload the table from memory again and repeat the query. At this moment a query is already in SQL plan cache and only relevant partitions are loaded. This behavior is normal for HANA? The most recent revisions work the same way? If it's normal, please, provide me to official documentation which desrcibe this.

Thanks, Andrey.

Accepted Solutions (0)

Answers (3)

Answers (3)

0 Kudos

Andrey,

Congratulations for the tests.

In our SPS09, we face the same scenario: unload priority affects loading behavior exactly as you detailed.

Also, loading only works right concerning columns in the first level of the partitioning. If a select is executed with a where clause on a column in the second leval of the partitioning, all partitions are loaded into memory.

andrey_ryzhkov
Participant
0 Kudos

Hi!

Just for info. A behaviour when HANA loads all the partitions in memory at the time of SQL statement preparation (instead of loading only the relevant partitions according to SQL query restriction) is expected:


When all partitions are unloaded and you prepare an SQL statement, HANA SQL optimizer first collects statistics from most columns of ALL partitions to be able to porivde optimal execution plans for the later queries.

So that's how it is. Actually, it means that partition pruning is not working during SQL statement preparation. But this is true only for tables with unload priority values 0 - 5 (btw, the default value for most tables is 5). But not everything is so bad, you can change this behavior:


If you want to change the behavior that for prepared statements most

of the columns for all partions are loaded then you need to change the

unload priority for the table from default value 5 to 7.Then only

columns from some larger partitions are loaded, with the burden that

SQL execution plans for the statement may not be optimal dependent

on filter values which are later used to execute the SQL, after the

preparation.

Actually, we noticed that even priority value 6 provides loading of almost only the necessary partitions and columns (you can find in my message of Apr 20, 2016 10:05 PM that HANA still loading very few unnecessary columns and partitions). We did not notice any performance degradation with changing priority from 5 to 6, but the memory is well saved and the queries run faster. Unfortunately, the dependence of loading behavior on the priority values is not documented:


The different unload priority values are not officially documented

and explained in detail, but the value range is from 0-9, with 5 being

default.

So just need to have this in mind.

lbreddemann
Active Contributor
0 Kudos

Once again: you have been observing an effect and mis-attributed the cause.

The table unload priority has no influence on when a column gets loaded back to memory.

And all dramatic writing in bold and over-generalisation of single observation to a general processing scheme doesn't change that.

andrey_ryzhkov
Participant
0 Kudos

Hi!

Don't agree. We have a response from support that unload priority affects (would like to highlight it, but the drama is not welcome) the behaviour of columns/partition loading during statement preparation.

In my message from Apr 20, 2016 10:05 PM () I gave an example how to reproduce the influence of unloading priority to columns/partitions loading. Anyone checked it in a real system except me? (in HANA Cloud platform for example) Or the conclusions were made by the text description? I pay special attention that all these three test steps are performed by the three completely new SQL requests to eliminate the SQL cache influence. The uniqueness is provided here by different comment strings /* ..... */.

And returning to my example: if (as you wrote) "The optimiser sampling results, which lead to the initial column load in your example with unload priority 5, are cached and kept." why we observe that during the last request with unloading priority 5 all partitions/columns are loaded again (the test step #3)? The optimizer results are supposed to be cached already, right?

lbreddemann
Active Contributor
0 Kudos

Ok, I once again spend some time to look into this.

The software you're using on HCP is no different than the on premise version, which means they work the same.

Not sure, what confirmation you received from out support on this matter, but the fact is: UNLOAD priority doesn't change whether or not column are loaded during preparation.

In your test, you took the status LOADED after your preparation as an indicator for that partitions had been loaded into memory for the query preparation. And in your example the partitions for the table with UNLOAD PRIORITY 6 weren't loaded when you checked after you prepared the statements.

The problem here is, that by simply checking the loaded state you're dismissing the possibility that the columns might have been loaded, sampled and unloaded early due to the UNLOAD priority.

I repeated my tests and took the effort to actually trace the loading/unloading as well as the partition pruning and can assure you: the UNLOAD PRIORITY has no influence on the column loading during statement preparation.

Concerning the "inexplicable" load behavior you observed: have you ensured that the table partitions don't get loaded by other processes, e.g. the merge dog? In my tests, the "weird" load states disappeared when I ensured the testes tables where excluded from the auto merge.

As a guidance on how to you also can create a test scenario that doesn't mix up all these different complex concepts, here's the list of steps for that:

1. create and fill the tables - all the same, but not the same table:

drop table pruning_seed;

drop table pruning_testUP1;

[...]

drop table pruning_testUP9;

CREATE column TABLE pruning_seed (

DOCNR NVARCHAR(5),

CALMONTH NVARCHAR(6),

SUMM DECIMAL(17,2),

PRIMARY KEY (DOCNR)) UNLOAD PRIORITY 1;

/* PARTITIONING */

ALTER TABLE pruning_seed PARTITION BY HASH ("DOCNR") PARTITIONS 3, RANGE ("CALMONTH")  (

  partition VALUE = '201401',

  partition VALUE = '201402',

  partition VALUE = '201403',

  partition VALUE = '201404',

  partition VALUE = '201405',

  partition VALUE = '201406',

  partition VALUE = '201407',

  partition VALUE = '201408',

  partition VALUE = '201409',

  partition VALUE = '201410',

  partition VALUE = '201411',

  partition VALUE = '201412',

  partition VALUE = '201501',

  partition VALUE = '201502',

  partition VALUE = '201503',

  partition VALUE = '201504',

  partition VALUE = '201505',

  partition VALUE = '201506',

  partition VALUE = '201507',

  partition VALUE = '201508',

  partition VALUE = '201509',

  partition VALUE = '201510',

  partition VALUE = '201511',

  partition VALUE = '201512',

  partition OTHERS);

 

INSERT INTO pruning_seed VALUES ('1', '201501', '100.50');

INSERT INTO pruning_seed VALUES ('2', '201502', '100.50');

INSERT INTO pruning_seed VALUES ('3', '201503', '100.50');

INSERT INTO pruning_seed VALUES ('4', '201504', '100.50');

INSERT INTO pruning_seed VALUES ('5', '201505', '100.50');

INSERT INTO pruning_seed VALUES ('6', '201506', '100.50');

INSERT INTO pruning_seed VALUES ('7', '201507', '100.50');

INSERT INTO pruning_seed VALUES ('8', '201501', '100.50');

INSERT INTO pruning_seed VALUES ('9', '201501', '100.50');

INSERT INTO pruning_seed VALUES ('11', '201501', '100.50');

INSERT INTO pruning_seed VALUES ('12', '201502', '100.50');

INSERT INTO pruning_seed VALUES ('13', '201503', '100.50');

INSERT INTO pruning_seed VALUES ('14', '201504', '100.50');

INSERT INTO pruning_seed VALUES ('15', '201505', '100.50');

INSERT INTO pruning_seed VALUES ('16', '201506', '100.50');

INSERT INTO pruning_seed VALUES ('17', '201507', '100.50');

INSERT INTO pruning_seed VALUES ('18', '201501', '100.50');

INSERT INTO pruning_seed VALUES ('19', '201501', '100.50');

INSERT INTO pruning_seed VALUES ('21', '201501', '100.50');

INSERT INTO pruning_seed VALUES ('22', '201502', '100.50');

INSERT INTO pruning_seed VALUES ('23', '201503', '100.50');

INSERT INTO pruning_seed VALUES ('24', '201504', '100.50');

INSERT INTO pruning_seed VALUES ('25', '201505', '100.50');

INSERT INTO pruning_seed VALUES ('26', '201506', '100.50');

INSERT INTO pruning_seed VALUES ('27', '201507', '100.50');

INSERT INTO pruning_seed VALUES ('28', '201501', '100.50');

INSERT INTO pruning_seed VALUES ('29', '201501', '100.50');

merge delta of pruning_seed; 

alter table pruning_seed disable automerge;

--> now we have a seeding table, let's create the actual test tables WITHOUT AUTO MERGE!

create column table pruning_testUP1 like pruning_seed with data WITHOUT AUTO MERGE;

[...]

create column table pruning_testUP9 like pruning_seed with data WITHOUT AUTO MERGE;

--> now set the unload priorities and issue a delta merge

alter table pruning_testUP1 unload priority 1;

[...]

alter table pruning_testUP9 unload priority 9;

merge delta of pruning_testUP1; 

[...]

merge delta of pruning_testUP9; 

Here we are with identical tables, that are excluded from delta merge and that have all different unload priorities.

Unload them from memory:

unload pruning_testUP1;

[...]

unload pruning_testUP9;

Check the load status for all columns and all partitions:

with load_stat as (select table_name,  part_id,

           sum (case loaded

         when 'TRUE' then 1

         else 0

      end )load_cnt

  from m_cs_all_columns

  where table_name like 'PRUNING_TESTUP%'

  group by table_name,  part_id

  order by table_name,   part_id)

select table_name, string_agg (load_cnt, '/' order by part_id) as load_meter

from load_stat

group by table_name;

TABLE_NAME      LOAD_METER                                                                                                                                          

PRUNING_TESTUP1 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP2 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP3 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP4 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP5 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP6 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP7 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP8 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP9 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0

No partition loaded - as expected.

Prepare the statements:

select /* test stmt #1*/ calmonth from pruning_testUP1 where calmonth = '201502' ;

select /* test stmt #2*/ calmonth from pruning_testUP2 where calmonth = '201502' ;

select /* test stmt #3*/ calmonth from pruning_testUP3 where calmonth = '201502' ;

select /* test stmt #4*/ calmonth from pruning_testUP4 where calmonth = '201502' ;

select /* test stmt #5*/ calmonth from pruning_testUP5 where calmonth = '201502' ;

select /* test stmt #6*/ calmonth from pruning_testUP6 where calmonth = '201502' ;

select /* test stmt #7*/ calmonth from pruning_testUP7 where calmonth = '201502' ;

select /* test stmt #8*/ calmonth from pruning_testUP8 where calmonth = '201502' ;

select /* test stmt #9*/ calmonth from pruning_testUP9 where calmonth = '201502' ;

And check the Plan Cache:

select plan_id, statement_hash, statement_string, accessed_table_names, reference_count,

       preparation_count, execution_count, total_table_load_time_during_preparation

from M_SQL_PLAN_CACHE

where statement_string like '%test stmt #%';

PLAN_ID     STATEMENT_HASH                      STATEMENT_STRING                                                                    ACCESSED_TABLE_NAMES        REFERENCE_COUNT PREPARATION_COUNT   EXECUTION_COUNT TOTAL_TABLE_LOAD_TIME_DURING_PREPARATION

44580002    94d2b5937b0b60c61bdfabf374d609f7    select /* test stmt #9*/ calmonth from pruning_testUP9 where calmonth = '201502'    DEVDUDE.PRUNING_TESTUP9(18) 0               1                   0               1310                                   

44570002    0ea60eb42735799128a00008b41858f0    select /* test stmt #8*/ calmonth from pruning_testUP8 where calmonth = '201502'    DEVDUDE.PRUNING_TESTUP8(18) 0               1                   0               1364                                   

44560002    fffa5043402a3992ce9f22a0a3105b4d    select /* test stmt #7*/ calmonth from pruning_testUP7 where calmonth = '201502'    DEVDUDE.PRUNING_TESTUP7(18) 0               1                   0               1355                                   

44550002    536920ec921299930c52a3a59c59a756    select /* test stmt #6*/ calmonth from pruning_testUP6 where calmonth = '201502'    DEVDUDE.PRUNING_TESTUP6(18) 0               1                   0               1403                                   

44540002    8e7964926f235ffb0e6b25131e3689ba    select /* test stmt #5*/ calmonth from pruning_testUP5 where calmonth = '201502'    DEVDUDE.PRUNING_TESTUP5(18) 0               1                   0               1299                                   

44530002    454a1b96bd35c9bdfb69e5e803ff422e    select /* test stmt #4*/ calmonth from pruning_testUP4 where calmonth = '201502'    DEVDUDE.PRUNING_TESTUP4(18) 0               1                   0               1350                                   

44520002    198caff798de0c26b0b8572ee13ef535    select /* test stmt #3*/ calmonth from pruning_testUP3 where calmonth = '201502'    DEVDUDE.PRUNING_TESTUP3(18) 0               1                   0               1922                                   

44510002    4cea6e058ee10a030e00be82baadf8b7    select /* test stmt #2*/ calmonth from pruning_testUP2 where calmonth = '201502'    DEVDUDE.PRUNING_TESTUP2(18) 0               1                   0               1339                                   

44500002    d6cefc1f24249e32aff53f54e22176b7    select /* test stmt #1*/ calmonth from pruning_testUP1 where calmonth = '201502'    DEVDUDE.PRUNING_TESTUP1(18) 0               1                   0               1350                                   

*/

Each table gets touched, each query had been prepared once and we see a very similar LOAD TIME DURING PREPARATION - irrespective of the UNLOAD PRIORITY.

Checking the load status now produces:

TABLE_NAME      LOAD_METER                                                                                                                                          

                                  1                   2                   3                   4                   5                   6                   7 

                1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5                                                                                                                                       

PRUNING_TESTUP1 0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP2 0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP3 0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP4 0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP5 0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP6 0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP7 0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP8 0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP9 0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0                                         

All tables (with all the different unload priority settings) have the exact same partitions loaded to memory right now.

In a test before, I could see that the tables with unload priority >5 had already been UNLOADED again.

These observations can be backed by the UNLOAD/LOAD traces.

Feel free to use this example and perform the adequate tracing yourself.

Or not and believe what you want to believe.

andrey_ryzhkov
Participant
0 Kudos

Hi, Lars!

Thank you very much for such a detailed test scenario. Just checked it very precisely on our SP82 system. There are my results:

TABLE_NAME;LOAD_METER

PRUNING_TESTUP1;2/2/2/2/2/2/2/2/2/2/2/2/3/3/3/3/3/3/3/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/3/3/3/2/3/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/3/3/2/3/2/3/3/2/2/2/2/2/2

PRUNING_TESTUP2;2/2/2/2/2/2/2/2/2/2/2/2/3/3/3/3/3/3/3/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/3/3/3/2/3/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/3/3/2/3/2/3/3/2/2/2/2/2/2

PRUNING_TESTUP3;2/2/2/2/2/2/2/2/2/2/2/2/3/3/3/3/3/3/3/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/3/3/3/2/3/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/3/3/2/3/2/3/3/2/2/2/2/2/2

PRUNING_TESTUP4;2/2/2/2/2/2/2/2/2/2/2/2/3/3/3/3/3/3/3/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/3/3/3/2/3/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/3/3/2/3/2/3/3/2/2/2/2/2/2

PRUNING_TESTUP5;2/2/2/2/2/2/2/2/2/2/2/2/3/3/3/3/3/3/3/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/3/3/3/2/3/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/3/3/2/3/2/3/3/2/2/2/2/2/2

PRUNING_TESTUP6;0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP7;0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP8;0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP9;0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0

This is exactly what I was writing about all the time I can not understand why do you get a different result. I will check it on HCP.

Just one remark: our SP82 did not like a syntax of:

so I removed "order by...".

UPDATE: just checked it on HCP trial dev. account (SP97), the results are very similar to mine (the syntax "order by..." was incorrect also):

by text:

TABLE_NAME;LOAD_METER

PRUNING_TESTUP1;3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3

PRUNING_TESTUP2;3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3

PRUNING_TESTUP3;3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3

PRUNING_TESTUP4;3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3

PRUNING_TESTUP5;3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3

PRUNING_TESTUP6;0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP7;0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP8;0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0

PRUNING_TESTUP9;0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0

UPDATE: checked it on HCP SP102: HANA MDC (<trial>) 1.00.102.03.1449674847

The syntax was correct at this time, but the results are the same as on SP97: for tables 1 - 5 all partitions are loaded. For 6 - 9 - only three partitions.

lbreddemann
Active Contributor
0 Kudos

Concerning the syntax: I'm using a recent SPS11 HANA. SPS8 is outdated and not further developed. Without the ORDER BY clause in the STRING_AGG function, the result positions may be wrong - so, the output might be misleading.

As mentioned a couple of times:

Do enable tracing for loading, unloading and partitioning to see which partitions actually get touched due to your queries and which do due to other system activity.

By looking for which partitions are still in memory afterwards it's not possible to distinguish between columns that had been touched and left in memory and those who had been touched and dropped from memory again.

If you suspect a bug in your SAP HANA instance and it's in a current revision, then opening a support incident to investigate this would be the right move.

andrey_ryzhkov
Participant
0 Kudos

I tested and traced it on our different SPS9 (sp96) system. I turned on load*, unload*, partitioning, optimizer*, query*, sqlopt*, sqlquery, stat* and table* traces. I clearly see that for UP5 and UP6 trace results are pretty the same, partition pruning seems to be working for all test tables. For example:

[4534]{434643}[187/-1] 2016-08-03 12:40:20.581883 d partitioning     Pruning.cpp(00098) : Pruning for index RYZHKOV:PRUNING_TESTUP5 (162281): considering part(s) 14, 39, 64 only. Partition spec is HASH 3 DOCNR; RANGE CALMONTH.....

[4534]{434643}[187/-1] 2016-08-03 12:40:21.429779 d partitioning     Pruning.cpp(00098) : Pruning for index RYZHKOV:PRUNING_TESTUP6 (162282): considering part(s) 14, 39, 64 only. Partition spec is HASH 3 DOCNR; RANGE CALMONTH.....

The .loads. trace file also shows that allegedly only DOCNR, CALMONTH and $rowid$ columns are loaded for all the *UP1 - *UP9 tables.

The only difference that I noticed in trace files is the Query rewriting time. It's considerably (10x times) bigger for UP1-UP5 tables:

[...]

UP4     433.425 ms

UP5     436.699 ms

UP6     33.758 ms

UP7     32.494 ms

[...]

As I wrote before, SAP support confirmed to us that for tables with priority 1 - 5 all partitions are loaded by (or for) the HANA SQL optimizer during SQL statement preparation. As I see in my tests this is true for SPS8, 9 and 10. But I cannot see in trace files what exactly causes the loading of all partitions and which columns are involved. In the same time your test on SPS11 shows that probably starting from SPS11 this behaviour is different. Could you test this example on SPS9 or SPS10? I combined complete test scenario in attached file for convenience.


Concerning this:


Lars Breddemann wrote:

By looking for which partitions are still in memory afterwards it's not possible to distinguish between columns that had been touched and left in memory and those who had been touched and dropped from memory again.

I clearly see that for UP5 table these columns are still in memory (on SPS9):

DOCNR for 16 partitions;

CALMONTH, $trex_udiv$ and $rowid$ for all partitions.

For UP6 only columns DOCNR, CALMONTH and $rowid$ are still in memory only in 3 partitions.

lbreddemann
Active Contributor
0 Kudos

I've done my part of testing here and not going to extend this.

You may provide the incident no. where support provided you the confirmation.

Instead of my own test script in a single file it would have made sense to provide the trace files instead.

Also, by actually executing the statement instead of just preparing them, you made the test cases more complex than necessary.

andrey_ryzhkov
Participant
0 Kudos

Incident no.

(0000235413 2016)

andrey_ryzhkov
Participant
0 Kudos

Hi!

I got a chance to test it on the SPS11 system (SP111). Well, it works how it really should work - only relevant partitions are touched in all cases. It seems that something was fixed in SPS11 concerning partitions loading during statements preparartion, but I cannot find what exactly.

For example, table PRUNING_TESTUP1.

SPS9:

SPS11:

UPD: guys, pls, could someone check any of the test cases from this thread (txt files with SQL script) on both SPS11(or 12) and on SPS<=10. Do you observe the same unwanted partitions loading on SPS<=10? Thanks in advance.

lbreddemann
Active Contributor
0 Kudos

Sounds like you provided the partition information in your where clause in the wrong data type. You might want to check with the partitioning trace which partition conditions were understood.

Check the SAP note FAQ partitioning, search this forum (this has been discussed before) or read my book.

The SQL plan cache does not have an influence on partition loading.

andrey_ryzhkov
Participant
0 Kudos

Thanks for answer, Lars!

There's something different. I observe direct influense of the SQL plan cache on partition loading (pruning). So I repeated an experiment:

1. Unloaded table from memory.

2. Run SQL-query via ABAP with CALMONTH restrictions in WHERE for using partition pruning (table is RANGE partititioned on 2nd level by CALMONTH). And I sure the query is not in cache.

select count( * ) UP TO 1 rows FROM /bic/azuucalc00  WHERE calmonth = '201701' and /bic/miedivsnd = '1234'.

3. HANA prepares statement (cause it's missing in cache) and during preparation I see that ALL partitions are loaded into memory. At end of preparation (it tooks about 2 minutes) all partitions are loaded:

4. After that I unload table again and repeat the same query. This time the query is in cache already, HANA executes query instantly and loads only the relevant partition:

What I see in SQL plan cache: execution count 3 times (3rd time - test case with wrong data type in WHERE). Preparation - 1 times (about 2 minutes). Execution itself is very quick and tooks about 40 millisec.

That behavior is on SPS 08 and 09. SPS 10 not tested yet.

Also I tested a case with the wrong data type in WHERE condition (I used WHERE calmonth = 201701 (without quotes)) - and it works exactly the same way as described above.


Actually, exactly my case is described here: Musings of an IT Implementor: HANA OOM Error Tip #1 - Partition Tables Correctly (indexes are not important here).

lbreddemann
Active Contributor
0 Kudos

Nope, you are not observing the direct impact of the SQL cache here. Are you debugging SAP HANA kernel code? Are you looking at trace files?

You are really just putting together two observations of yours and construct the connection.

Your claim, expressed more explicitly is:

Parsing a query in SAP HANA executes a full load of ALL partitions of a table.

That's not correct.

In order to find an explanation for your example, it's necessary that we see the exact definition of the table including the partitioning definition and the exact query text, too.

My ad hoc assumption would be that the table had been partition based on CALMONTH (integer value) and now gets queries with CALMONTH (string value).

SAP HANA doesn't convert this automatically and won't use the partition pruning in this case.

For the claim of parsing -> partition loading, this can be easily disproved by manually unloading the table in SAP HANA studio and preparing any statement against the table (right click, prepare or explain will do).

Then check M_CS_ALL_COLUMNS to see which columns are in memory.

andrey_ryzhkov
Participant
0 Kudos

Thanks!

Yes, the table is partitioned by integer values of CALMONTH:


HASH 3 FISCPER3,FISCYEAR,FISCVARNT,/BIC/MIESKODNM; RANGE CALMONTH 201000-201100,201100-201200,201200-201300,201300-201400,201401,201402.... (until 2017)

BW attribute CALMONTH however has type NUMC(6).

Is the partitioning definition is wrong in this case  and must be reworked on char values of CALMONTH for using pruning?

Are there any options to make pruning to work without repartitioning (some settings or upgrading to more recent HANA revision)?


UPDATE: and if it's no influence of SQL plan cache on pruning or loading, why there is a different loading behavior for cached and not-cached queries?


UPDATE 2: did I understand correctly, that for write-optimized DSO in BW we never get pruning benefits on range partitioning, because all BW attributes are character like and in RANGE definition we can only set integer values?

lbreddemann
Active Contributor
0 Kudos

There's no setting to change that. If you want to have partition pruning work for NUMC values you will have to re-partition correctly.

As for your example: this is because actual preparation of the query included bind variables.

In that case, the partition pruning cannot work properly as the values are not known yet.

However, to prepare the statement some size and cardinality estimations must take place.

So, in this case

a) statement not in cache yet

b) table not in memory yet

c) no values for the partition pruning available

SAP HANA has to load some of the columns into memory in order to prepare the statement.

So, even though there is no direct impact, there you have your connection: parsing/optimizing a statement requires some information from some of the columns, so in the worst case, you will trigger column loading just for the prepare.

As long as the statement is reusable in cache, this doesn't need to be done again, even if the table meanwhile has been unloaded.

andrey_ryzhkov
Participant
0 Kudos

Lars, thank you very much for participation!

I tried two versions of query (non-cached) directly from Studio in order to exclude the case with bind variables: (... where calmonth = '201701' ) and (... where calmonth = 201701 ) and pruning did not work for both cases.

Could you provide, please, how to activate partitioning trace for deeper analysis? I did not find it nor in SAP HANA Administration Guide, nor in your book (thanks for the advice).

lbreddemann
Active Contributor
0 Kudos

Ok, there are SAP notes, there is the Administration Guide from the official documentation set and there is Richard's and my book.

In all of these sources it's explained how to setup a trace in SAP HANA.

In my book on page 375 I also cover which traces to activate to see the partitioning pruning at work.

In newer SAP HANA versions, even the explain plan covers which partitions are considered.

andrey_ryzhkov
Participant
0 Kudos

On the page 375 it's written: "...it is possible to activate the PARTITIONING trace with trace level INFO or DEBUG."

I did not find nothing about partitioning trace on "Trace configuration" tab in HANA Studio. There are only Database, SQL, Plan trace and others, but no Partitioning. I thought it might be as component of the Database trace, but there'se nothing like "partitioning". Google for SAP "partitioning trace" returns only 3 relevant results. All of them are SCN threads including this one where no explanation of how to activate it.

Anyway, I decided to run a pure test, so I created new empty table with simplified structure and similar partitioning scheme, filled it with some data and checked the queries. All of them are in the attachment.

In case of running NON-CACHED query on single CALMONTH I see that ALL non-empty partions are loaded, but only the involved column and the primary key column:

I checked this query in PlanViz - it's showing that only three relevant partitions are scanned:

So is it the partition pruning at work? If yes, why HANA is loading all partitions at preparation? Or does it mean that partition pruning doesn't work at preparation, but only at execution of the queries?

In case of the same but CACHED query only the relevant partitions and only the involved column are loaded:

lbreddemann
Active Contributor
0 Kudos

How to enable specific traces is explained in the admin guide. Not going to repeat it here.

To your test:

/* CREATE */

CREATE column TABLE pruning_test (

DOCNR NVARCHAR(5),

CALMONTH NVARCHAR(6),

SUMM DECIMAL(17,2),

PRIMARY KEY (DOCNR))

/* PARTITIONING */

ALTER TABLE pruning_test PARTITION BY HASH ("DOCNR") PARTITIONS 3, RANGE ("CALMONTH")  (

  partition VALUE = '201401',

  partition VALUE = '201402',

...

You partition by HASH-RANGE with 3 HASH partitions, which means that there are three partitions for each CALMONTH value.

Then you load data for CALMONTHS between 201501 and 201507.

That's 6 different values for CALMONTH and 18 possibly matching partitions for those.

Now I unload all columns.

unload pruning_test;

Let's check what's loaded:

select column_name, part_id, loaded

from m_cs_all_columns where table_name='PRUNING_TEST'

order by loaded desc;

COLUMN_NAMEPART_IDLOADED
SUMM      52    FALSE
CALMONTH  52    FALSE
DOCNR      52    FALSE
$rowid$    51    FALSE
$trex_udiv$51    FALSE
SUMM      51    FALSE
CALMONTH  51    FALSE
DOCNR      51    FALSE
$rowid$    50    FALSE

...

LOADED = TRUE would come first, so here ALL columns are unloaded.

Let's check the plan cache:

select top 10 statement_hash,plan_id, left (statement_string, 30) as cmd_start,

is_valid, preparation_count, execution_count, max_preparation_time, max_table_load_time_during_preparation

from "PUBLIC"."M_SQL_PLAN_CACHE" where statement_string like '%RUN #%';

STATEMENT_HASH                  PLAN_ID  CMD_START                    IS_VALIDPREPARATION_COUNTEXECUTION_COUNTMAX_PREPARATION_TIMEMAX_TABLE_LOAD_TIME_DURING_PREPARATION
257c5cc3b41666c03d6607e999c36f8925,550,002select top 10 statement_hash,pTRUE    1                0              15,671              0                                    

It's just us at the moment - all good.

Now, let's create a new statement to be parsed and check what happens.

select /*RUN #1*/ calmonth from pruning_test where calmonth = '201502' ;

Right click on Prepare Statement... and check plan cache again.

STATEMENT_HASH                  PLAN_ID  CMD_START                    IS_VALIDPREPARATION_COUNTEXECUTION_COUNTMAX_PREPARATION_TIMEMAX_TABLE_LOAD_TIME_DURING_PREPARATION
257c5cc3b41666c03d6607e999c36f8925,550,002select top 10 statement_hash,pTRUE    1                1              15,671              0                                    
286e61d6c8b1459dfdd57043604f38f025,560,002select /*RUN #1*/ calmonth froTRUE    1                0              67,377              1,260                                

Statement Run #1 has been prepared, but not executed and apparently some of the table columns got loaded...

Let's check:

COLUMN_NAMEPART_IDLOADED
DOCNR      39    TRUE 
CALMONTH  39    TRUE 
$rowid$    39    TRUE 
$rowid$    14    TRUE 
$rowid$    64    TRUE 
CALMONTH  14    TRUE 
CALMONTH  64    TRUE 
DOCNR      64    TRUE 
DOCNR      14    TRUE 
DOCNR      52    FALSE
$rowid$    51    FALSE
$trex_udiv$51    FALSE
SUMM      51    FALSE

The internal columns for the record management ($rowid$) and the two columns involved with partitioning CALMONTH and DOCNR have been loaded for 3 different partitions (14, 39, 64).

These three partitions are exactly the partitions where the data can possibly be.

All other partitions remain off memory. Partition pruning works also during preparation.

So far so good. The statement is prepared (or cached as you say).

Now, let's unload the table and re-prepare the statement:

STATEMENT_HASH                  PLAN_ID  CMD_START                    IS_VALIDPREPARATION_COUNTEXECUTION_COUNTMAX_PREPARATION_TIMEMAX_TABLE_LOAD_TIME_DURING_PREPARATION
257c5cc3b41666c03d6607e999c36f8925,550,002select top 10 statement_hash,pTRUE    1                2              15,671              0                                    
286e61d6c8b1459dfdd57043604f38f025,560,002select /*RUN #1*/ calmonth froTRUE    1                0              67,377              1,260                                

COLUMN_NAMEPART_IDLOADED
SUMM      52    FALSE
CALMONTH  52    FALSE
DOCNR      52    FALSE
$rowid$    51    FALSE
$trex_udiv$51    FALSE

...

As we see: nothing happens! The plan is still valid, no new preparation is required and no columns get loaded this time.

Now I run the query...

STATEMENT_HASH                  PLAN_ID  CMD_START                    IS_VALIDPREPARATION_COUNTEXECUTION_COUNTMAX_PREPARATION_TIMEMAX_TABLE_LOAD_TIME_DURING_PREPARATION
257c5cc3b41666c03d6607e999c36f8925,550,002select top 10 statement_hash,pTRUE    1                3              15,671              0                                    
286e61d6c8b1459dfdd57043604f38f025,560,002select /*RUN #1*/ calmonth froTRUE    1                1              67,377              1,260                                

COLUMN_NAMEPART_IDLOADED
CALMONTH  39    TRUE 
$rowid$    14    TRUE 
$rowid$    39    TRUE 
CALMONTH  14    TRUE 
$rowid$    64    TRUE 
CALMONTH  64    TRUE 
CALMONTH  52    FALSE

...

Again no preparation, but execution count increased by 1. As we still only have one part of the partitioning condition, of course all 3 partitions have to be touched again.

Now, let's unload again and prepare another statement...

select /*RUN #2*/ calmonth from pruning_test where calmonth = ?

STATEMENT_HASH                  PLAN_ID  CMD_START                    IS_VALIDPREPARATION_COUNTEXECUTION_COUNTMAX_PREPARATION_TIMEMAX_TABLE_LOAD_TIME_DURING_PREPARATION
257c5cc3b41666c03d6607e999c36f8925,550,002select top 10 statement_hash,pTRUE    1                5              15,671              0                                    
286e61d6c8b1459dfdd57043604f38f025,560,002select /*RUN #1*/ calmonth froTRUE    1                1              67,377              1,260                                
a58d94ab23fdee706c00e5a3a9529c0d25,570,002select /*RUN #2*/ calmonth froTRUE    1                0              561,580            438,071                              

KAPOW!

For RUN #2 the preparation cannot make use of any partition pruning, because this information is not available yet.

But let's assume the main underlying idea of the in-memory database holds true and the stuff actually is in memory all the time. Then this is not a problem at all.

And if this prepared statement remains in cache and that it doesn't need to get re-prepared but some of the columns really weren't used at all and got unloaded. Would that lead to constant re-loading of those colunms?

unload, execute, check...

STATEMENT_HASH                  PLAN_ID  CMD_START                    IS_VALIDPREPARATION_COUNTEXECUTION_COUNTMAX_PREPARATION_TIMEMAX_TABLE_LOAD_TIME_DURING_PREPARATION
257c5cc3b41666c03d6607e999c36f8925,550,002select top 10 statement_hash,pTRUE    1                6              15,671              0                                    
286e61d6c8b1459dfdd57043604f38f025,560,002select /*RUN #1*/ calmonth froTRUE    1                1              67,377              1,260                                
a58d94ab23fdee706c00e5a3a9529c0d25,570,002select /*RUN #2*/ calmonth froTRUE    1                1              561,580            438,071                              

COLUMN_NAMEPART_IDLOADED
CALMONTH  39    TRUE 
$rowid$    14    TRUE 
$rowid$    39    TRUE 
CALMONTH  14    TRUE 
$rowid$    64    TRUE 
CALMONTH  64    TRUE 
CALMONTH  52    FALSE
DOCNR      52    FALSE
$rowid$    51    FALSE
$trex_udiv$51   

FALSE

...

Nope: alll as we expect.

With the statement being prepared and a value for the bind variable provided only the partitions that cannot be excluded are loaded.

Remember: partition pruning is all about being able to leave out partitions that can safely be ignored with the current set of constraints. If the constraint is left unspecific, e.g. because it's a bind variable, it cannot be used for any partition pruning.

andrey_ryzhkov
Participant
0 Kudos

Thanks for testing!

That's really strange, I repeated exactly your actions: unloaded the table, checked in M_CS_COLUMNS that the table is unloaded

COLUMN_NAMEPART_IDLOADED
SUMM52FALSE
CALMONTH52FALSE
DOCNR52FALSE
$rowid$51FALSE
$trex_udiv$51FALSE
SUMM51FALSE
CALMONTH51FALSE
DOCNR51FALSE
$rowid$50FALSE
$trex_udiv$50FALSE



after that hit "Prepare statement" in Studio for

select /*RUN #1*/ calmonth from pruning_test where calmonth = '201502' ;

And I see this picture in M_CS_COLUMNS:

COLUMN_NAMEPART_IDLOADED
CALMONTH63TRUE
DOCNR63TRUE
$rowid$62TRUE
$rowid$20TRUE
$rowid$21TRUE
$rowid$61TRUE
$rowid$22TRUE
$rowid$60TRUE
$rowid$23TRUE
$rowid$59TRUE
$rowid$24TRUE
$rowid$58TRUE
$rowid$25TRUE
$rowid$26TRUE
$rowid$57TRUE
$rowid$27TRUE
$rowid$56TRUE
CALMONTH15TRUE
$rowid$15TRUE
DOCNR16TRUE

All non-empty partitions are loaded, but only the columns: CALMONTH, DOCNR. $rowid$. All as I said before. The other strange thing that I tested this case on three different systems and different releases (SPS 08, 09 and 10) and behavior is exactly the same.

I repeated test many times but all remains the same. Here is SQL cache:

STATEMENT_HASHPLAN_IDCMD_STARTIS_VALIDPREPARATION_COUNTEXECUTION_COUNTMAX_PREPARATION_TIMEMAX_TABLE_LOAD_TIME_DURING_PREPARATION
286e61d6c8b1459dfdd57043604f38f02 995 400 002select /*RUN #1*/ calmonth froTRUE10763 925316
bbb26f2490a2156520d6debaf5dc97442 995 430 002select /*RUN #3*/ calmonth froTRUE10291 493308
e0e787daf339d949650c7dda719178dd2 995 420 002select /*RUN #2*/ calmonth froTRUE10474 523307
b2485c758d7c472d494f5820502d2aec2 995 440 002select top 10 statement_hash,pTRUE1330 3690
2088c58eb6a9b37fb0c4a574a46900762 995 460 002select /*RUN #5*/ calmonth froTRUE10366 770308
64737b8c2199d5291db26e6699141b8b2 995 450 002select /*RUN #4*/ calmonth froTRUE10284 762302

Can it be that there'se something is not configured correctly here, or some component is missing, or some process is not started? Please advice further steps for analysys.

lbreddemann
Active Contributor
0 Kudos

That's indeed odd.

Also the particular long preparation time while the load time during preparation is very short doesn't seem to make sense.

To me this looks like there is something else going on in your systems.

If this is on a supported setup, I recommend to have this checked by support.

andrey_ryzhkov
Participant
0 Kudos

Hi! I checked partitionong trace, there's OK:

2016-04-04 10:09:14.871951 d partitioning     Pruning.cpp(00510) : Pruning for index RYZHKOV:PRUNING_TEST (95566): considering part(s) 14, 39, 64 only. Partition spec is HASH 3 DOCNR; RANGE CALMONTH 201401,201402,201403,201404,201405,201406,201407,201408,201409,201410,201411,201412,201501,201502,201503,201504,201505,201506,201507,201508,201509,201510,201511,201512,*. Query values are ((<UNKNOWN> == 201502)).

But all non-empty partitions are loaded nevertheless.

lbreddemann
Active Contributor
0 Kudos

I suspect something else triggers loading the partitions then. You may want to activate the load/unload tracing.

andrey_ryzhkov
Participant
0 Kudos

Checked the case with  SAP HANA Cloud Platform (Trial Developer Account, HANA XS (<shared>)  1.00.102.03.1449674847).

And I see exactly the same as my results - after preparing of the very first statement


select calmonth from pruning_test where calmonth = '201502';

on the newly created, partitioned, filled, merged and unloaded table I get this:

What's wrong?

lbreddemann
Active Contributor
0 Kudos

Hmm... so you just executed the statement

select calmonth from pruning_test where calmonth = '201502';


right?


The thing is, that SAP HANA Studio by default always prepares and then executes the statement.

You can switch this off in the settings and then you should be able to distinguish between prepare and execute.

I recommend to also make use of JDBC tracing of SAP HANA studio, so that you have a better grip on what commands are actually send to SAP HANA.

andrey_ryzhkov
Participant
0 Kudos

Hi, Lars!

I ran "Prepare statement" in studio for

select calmonth from pruning_test where calmonth = '201502' /* some comment to ensure uniqueness */;

I unchecked checkbox "Prepare statements before executions", but nothing changed.

When the new statement is prepared or executed (before execution of the new query it must be pepared, right?) HANA loads almost all partitions as shown in the screenshot.

So what we have:

1. Every new statement must be prepared before first execution;

2. You say that Partition pruning works also during preparation. And on your test case it can be clearly seen that it is so - only partitions 14, 39, 64 are loaded during preparation of the new query;

3. When I try exactly the same test case on the HANA Cloud Platform dev. ed. rel. 102 (to eliminate the influence factor of the third-party system) I see that almost all partitions are loaded.


Could you try the same case on the HANA Cloud Platform dev. ed?

andrey_ryzhkov
Participant
0 Kudos

Hi!

I found that partition pruning behaviour during SQL preparation depends on UNLOAD PRIORITY table setting. With values 0 - 5 (my table has default setting - 5) partitioning is not taken into account during SQL statement preparation: mostly all partitions are loaded. But between values 6 and 9 pruning starts working also during preparation: only relevant partitions are loaded.

It's a bit confusing because according to official "help":

UNLOAD PRIORITY specifies that priority of table to be unloaded from memory. 

No word on how that affects the loading behaviour.

lbreddemann
Active Contributor
0 Kudos

Nope, that's not correct. Unload priority is not even evaluated for partition pruning or column loading.

You are looking at a coincidence there.

andrey_ryzhkov
Participant
0 Kudos

Ok. Then, how to explain the following example? I run it on the SAP HANA Cloud Platform developer edition.

1.

alter table pruning_test unload priority 5;

unload pruning_test;

select calmonth from pruning_test where calmonth = '201502' /* 20.04.16 #01 */;

select column_name, m.part_id, range, loaded

from m_cs_all_columns as m

join M_CS_PARTITIONS as p

  on m.schema_name = p.schema_name

and m.table_name = p.table_name

and m.part_id = p.part_id

where m.table_name = 'PRUNING_TEST'

order by loaded desc, m.part_id, column_name desc;

all partitions are touched:

2.

alter table pruning_test unload priority 6;

unload pruning_test;

select calmonth from pruning_test where calmonth = '201502' /* 20.04.16 #02 */;

select column_name, m.part_id, range, loaded

from m_cs_all_columns as m

join M_CS_PARTITIONS as p

  on m.schema_name = p.schema_name

and m.table_name = p.table_name

and m.part_id = p.part_id

where m.table_name = 'PRUNING_TEST'

order by loaded desc, m.part_id, column_name desc;

much better! the relevant partitions 14, 39 and 64 are loaded! But what are these columns from partitions 22, 30 and 63? I did not request them! Why they have been loaded?

3. (the same as the stage 1)

alter table pruning_test unload priority 5;

unload pruning_test;

select calmonth from pruning_test where calmonth = '201502' /* 20.04.16 #03 */;

select column_name, m.part_id, range, loaded

from m_cs_all_columns as m

join M_CS_PARTITIONS as p

  on m.schema_name = p.schema_name

and m.table_name = p.table_name

and m.part_id = p.part_id

where m.table_name = 'PRUNING_TEST'

order by loaded desc, m.part_id, column_name desc;

Again, everything is bad.

lbreddemann
Active Contributor
0 Kudos

Ok, there's some mixup here.

The optimiser sampling results, which lead to the initial column load in your example with unload priority 5, are cached and kept.

When you run your second test with unload priority 6 these sampling results got reused and didn't trigger a column reload.

Once again: unload priority is evaluated only when SAP HANA determines column unloading. It's not used anywhere else, especially not in the query evaluation.