on 11-20-2012 11:42 AM
After completing a large insert into a table I noticed thee two threads running in HANA, consuming CPU.
I'm guessing that these threads are HANA optimizing the structure of the table I've just loaded.
I'd like to know more about what goes on here. Any information available anywhere?
Cheers,
Chris
Status: null
Host: imdbhdb
Port: 30003
Service: indexserver
Hierarchy: 0/-166626
Connection ID: 0
Thread ID: 5072
Thread Type: MergedogMonitor
Thread Method: optimize compression
Thread Detail: 1 of 1: COUNTRY_DAILY_SALES_BOOK
Duration [ms]: 92832
Status: null
Host: imdbhdb
Port: 30003
Service: indexserver
Hierarchy: 0/167501
Connection ID: 0
Thread ID: 5089
Thread Type: ParallelDispatcher
Thread Method: working
Thread Detail: PrepareRenumberJob: table=PUBS3:COUNTRY_DAILY_SALES_BOOKen, column=$rowid$: prepareRenumber
Hi there,
the other commentators are right: there is a delta merge going on.
This is a standard process and it's usually OK that this happens automatically.
While the delta merge is running the table can be used without restriction to the end user (select/insert/update/delete all work) - the table is not locked on SQL level.
To get a closer look you may check the
[Administration Perspective] -> [Performance] -> [Job Progress]
information.
In there you usually find progress information for each single column that is currently merged.
- Lars
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi Lars,
with the HANA and merging I do not understand:
If for exmaple, one record is inserted, commited
to a Col. Table does this mean
that the entire table is build up again in the Delta and
made to the new (TREX) Main Memory ?
What if this Col Table is a 100 Mio DS Table ?
Some SAP papers suggest that a Col Table
query loads only the dedicated, querried Columns
to the Main- (TREX-) Memory.
What happens with the other columns
if than an insert occures ?
Thank You
Martin
Hello Martin,
for an insert, we need to have in memory:
1. the delta log (obviously, because it will contain all changed data AND the delta log is in memory all the time anyhow. In fact it only exists in memory...)
2. the main stores for the columns required to check primary/uniqueness constraints.
All the remaining columns are not loaded into memory, if they are not already present anyhow.
On the other hand: if the columns are there already, we just don't touch them.
Once it's time to perform the merge, ALL columns are loaded into memory and new main stores for each column are created by combining the information from the old main stores and the delta store.
Finally, the table is switched to the new main stores and by that the merge is basically done.
By the way: the merge operation is not directly coupled to the transaction management.
It doesn't lock the table or records. It's happening online in the background on the data container level (not on your application data level).
Hope that helps a bit.
-Lars
Lars,
Thanks. Got it. Will try it on our HANA.
In turn this means:
Having the famous 1000 Million Row Table ( currently not in Memory, unloaded), inserting andcommittig 1 Record (into Delta), all 1000 Million Records are pulled into MAIN ? and a New MAIN (=TREX / InMemory Structure) with 1000 Million +1 is build.
?
Second )
The Delta Memory is a part of the Memory (organsiation structure) where the Row Tables also go, is an equivalent to Oracle's Shared Memory / Shared Pool ?
Thanks ?
Martin
Hi again.
First of all again: commit or rollback have nothing to do with whether table columns are loaded or not.
These commands work on a different processing level, the transaction management, "above" the data containers.
Now to your question.
If we insert data, we need to check primary/unique constraints. So the columns for these need to be loaded.
When we load columns we read the main store persistence from the data area.
If index structures have been defined on any of the columns, these are rebuild in memory on the fly.
Indexes are not persisted.
In parallel we read the redo log information from the log area and rebuild the delta log in memory.
Then the insert is processed (checking of constraints included) and the new data is appended to the delta log.
Once you commit, a log flush is triggered to save the changed data in the redo log (log area).
Important to understand here: delta log != redo log.
From here, the insert can be recovered but it's not part of the main store yet.
This happens during the delta merge process.
Let's try out some steps for this:
create sequence ilt increment by 1;
drop table insert_load_test;
create column table insert_load_test ( id int primary key, name varchar (150), comment_text varchar(2000));
insert into insert_load_test (id, name )
(select ilt.nextval, table_name from tables);
select * from insert_load_test limit 10;
/*
ID NAME COMMENT_TEXT
939 P_INDEXCOLUMNS_ NULL
940 P_OBJECTDEPENDENCY_ NULL
941 P_PROCSOURCES_ NULL
942 P_SQLSCRIPTTRACE_ NULL
943 P_RTABLES_ NULL
944 P_UTABLES_ NULL
945 P_QTABLES_ NULL
946 P_DATATYPES_ NULL
947 P_QUERYPLANS NULL
948 CACHE_P_USERS_ NULL
*/
select table_name, loaded, is_delta_loaded, record_count, raw_record_count_in_main,raw_record_count_in_delta
from m_cs_tables
where schema_name = 'LARS'
and table_name ='INSERT_LOAD_TEST';
/*
TABLE_NAME LOADED IS_DELTA_LOADED RECORD_COUNT RAW_RECORD_COUNT_IN_MAIN RAW_RECORD_COUNT_IN_DELTA
INSERT_LOAD_TEST FULL TRUE 895 0 895
*/
select column_name, count, distinct_count, loaded, index_type, index_loaded
from m_cs_columns
where schema_name = 'LARS'
and table_name ='INSERT_LOAD_TEST';
/*
COLUMN_NAME COUNT DISTINCT_COUNT LOADED INDEX_TYPE INDEX_LOADED
ID 895 0 TRUE FULL LOADED
NAME 895 0 TRUE NONE NOT APPLICABLE
COMMENT_TEXT 895 0 TRUE NONE NOT APPLICABLE
*/
/* Now let's unload the data from memory */
UNLOAD INSERT_LOAD_TEST;
/* check if this was done correctly
TABLE_NAME LOADED IS_DELTA_LOADED RECORD_COUNT RAW_RECORD_COUNT_IN_MAIN RAW_RECORD_COUNT_IN_DELTA
INSERT_LOAD_TEST NO FALSE 895 0 895
=> both table data and DELTA are NOT loaded
=> what about the columns?
COLUMN_NAME COUNT DISTINCT_COUNT LOADED INDEX_TYPE INDEX_LOADED
ID 0 0 FALSE NONE NOT APPLICABLE
NAME 0 0 FALSE NONE NOT APPLICABLE
COMMENT_TEXT 0 0 FALSE NONE NOT APPLICABLE
=> see: when the columns are not loaded, also the index type is not known!
=> let's go and insert another row but not the 3rd column!
*/
insert into insert_load_test (id, name )
(select ilt.nextval, 'MY TEST ROW' from dummy);
/* what has been loaded now??
TABLE_NAME LOADED IS_DELTA_LOADED RECORD_COUNT RAW_RECORD_COUNT_IN_MAIN RAW_RECORD_COUNT_IN_DELTA
INSERT_LOAD_TEST FULL TRUE 896 0 896
COLUMN_NAME COUNT DISTINCT_COUNT LOADED INDEX_TYPE INDEX_LOADED
ID 896 0 TRUE FULL LOADED
NAME 896 0 TRUE NONE NOT APPLICABLE
COMMENT_TEXT 896 0 TRUE NONE NOT APPLICABLE
==> so, also the "untouched" column was loaded.
==> but did we really NOT the column? effectively, we told HANA to place a NULL value in that column, and that needs to be stored as well.
==> also see: ALL data is currently in the DELTA log - so the main store is very small and doesn't take much time/memory when loaded
=> what if the column has a default value?
*/
alter table INSERT_LOAD_TEST add (another_comment varchar (20) default 'nothing');
unload INSERT_LOAD_TEST;
insert into insert_load_test (id, name )
(select ilt.nextval, 'MY TEST ROW' from dummy);
/*
COLUMN_NAME COUNT DISTINCT_COUNT LOADED INDEX_TYPE INDEX_LOADED
ID 897 0 TRUE FULL LOADED
NAME 897 0 TRUE NONE NOT APPLICABLE
COMMENT_TEXT 897 0 TRUE NONE NOT APPLICABLE
ANOTHER_COMMENT 897 0 TRUE NONE UNLOADED
=> even with a default value we need to load the column
=> because basically every row is inserted into each column (well, the respective part only of the row of course).
Obvious next question: what if we update columns??
*/
unload INSERT_LOAD_TEST;
update INSERT_LOAD_TEST set another_comment ='TATAAAAAA';
/*
COLUMN_NAME COUNT DISTINCT_COUNT LOADED INDEX_TYPE INDEX_LOADED
ID 1794 0 TRUE FULL LOADED
NAME 1794 0 TRUE NONE NOT APPLICABLE
COMMENT_TEXT 1794 0 TRUE NONE NOT APPLICABLE
ANOTHER_COMMENT 1794 0 TRUE NONE UNLOADED
now how about reading and merging ??
TABLE_NAME LOADED IS_DELTA_LOADED RECORD_COUNT RAW_RECORD_COUNT_IN_MAIN RAW_RECORD_COUNT_IN_DELTA
INSERT_LOAD_TEST FULL TRUE 897 0 1794
==> good, nothing merged so far...
==> let's unload the table and select some data!
*/
unload INSERT_LOAD_TEST;
select id, another_comment from INSERT_LOAD_TEST limit 10;
/*
ID ANOTHER_COMMENT
939 TATAAAAAA
940 TATAAAAAA
941 TATAAAAAA
942 TATAAAAAA
943 TATAAAAAA
944 TATAAAAAA
945 TATAAAAAA
946 TATAAAAAA
947 TATAAAAAA
948 TATAAAAAA
=> check
COLUMN_NAME COUNT DISTINCT_COUNT LOADED INDEX_TYPE INDEX_LOADED
ID 1794 0 TRUE FULL LOADED
NAME 1794 0 TRUE NONE NOT APPLICABLE
COMMENT_TEXT 1794 0 TRUE NONE NOT APPLICABLE
ANOTHER_COMMENT 1794 0 TRUE NONE UNLOADED
=> okay - all columns are loaded, but distinct_count still 0 ??
=> let's unload, merge and check again!
*/
unload INSERT_LOAD_TEST;
merge delta of INSERT_LOAD_TEST;
unload INSERT_LOAD_TEST;
select id, another_comment from INSERT_LOAD_TEST limit 10;
/*
ID ANOTHER_COMMENT
939 TATAAAAAA
940 TATAAAAAA
941 TATAAAAAA
942 TATAAAAAA
943 TATAAAAAA
944 TATAAAAAA
945 TATAAAAAA
946 TATAAAAAA
947 TATAAAAAA
948 TATAAAAAA
TABLE_NAME LOADED IS_DELTA_LOADED RECORD_COUNT RAW_RECORD_COUNT_IN_MAIN RAW_RECORD_COUNT_IN_DELTA
INSERT_LOAD_TEST PARTIALLY TRUE 898 897 1
COLUMN_NAME COUNT DISTINCT_COUNT LOADED INDEX_TYPE INDEX_LOADED
ID 898 897 TRUE FULL LOADED
NAME 0 0 FALSE NONE NOT APPLICABLE
COMMENT_TEXT 898 0 TRUE NONE NOT APPLICABLE
ANOTHER_COMMENT 898 1 TRUE NONE NOT APPLICABLE
==> why is COMMENT_TEXT loaded here? -> no idea, to be honest... but as it contains only NULLs until now, it may be due to
an internal optimization heuristic
*/
For the second question: the memory organisation of Oracle DBMS is a lot different to HANA and comparing it doesn't really make sense.
The row store and the column store (to which main and delta log belong to) both have their own memory structures.
SELECT scm.component, scm.used_memory_size
FROM M_SERVICE_COMPONENT_MEMORY scm join m_services s
on (scm.host, scm.port) = (s.host, s.port)
and s.service_name ='indexserver'
where upper(scm.component) like '%ROW%'
or upper(scm.component) like '%COLUMN%'
COMPONENT USED_MEMORY_SIZE
Row store tables + Indexes 763227181
Auxiliary Row Store Components 401931274
Column Store Tables 347135392
Column Store Intermediate Results
and Shared Datastructures 115912
- Lars
Hi,
this seems to be the merge dog runnning - Please compare:
Thank You
Martin
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi Chris,
To my knowledge, the MergedogMonitor is related to the Delta merge process. It is the process which checks the Delta memory and triggers the Delta Merge process based on the internal algorithm and the setting in administrator tab.
I guess we should expect more information on this from the SAP internal forum members.
Regards,
Ravi
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
User | Count |
---|---|
101 | |
13 | |
13 | |
11 | |
11 | |
7 | |
6 | |
5 | |
4 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.