
SAP HANA Persistent Layer

Former Member
0 Kudos

Hello Everyone

Could anyone please clarify the below for me?

When I read about the SAP HANA architecture, it says that the data is stored in-memory, where the reporting is done, and that when a power failure or disk failure occurs, the data is retrieved from the persistence layer.

So, from HANA Studio, where do I see this persistence layer? I am assuming that the tables under the Content folder are in-memory and all the views built in the Catalog folders access the tables in the Content folder. So when a power failure occurs, from where are these tables recovered?

It might be a simple question, but I am a bit confused about this. Please explain as clearly as possible.

Thank you.

Regards

Prashanth

Accepted Solutions (1)

tomas-krojzl
Active Contributor
0 Kudos

Hello,

you can open a table (right-click + Open Definition) and then switch to the "Runtime Information" tab to see information related to memory vs. disk. Also notice the information on the Parts tab (listing partitions) and the Columns tab (listing individual columns). You can add additional columns to this view by right-clicking the table to get more detailed information.

In general you do not need to worry about a table being persisted. A table is always on disk, and it can be fully or partially loaded into memory, where it is operated on.
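If you prefer SQL over the Studio UI, the same numbers are exposed through monitoring views. A minimal sketch, assuming the M_CS_TABLES and M_TABLE_PERSISTENCE_STATISTICS monitoring views of your revision (schema and table names are placeholders):

    -- in-memory footprint of a column table (main + delta)
    SELECT TABLE_NAME,
           LOADED,                 -- FULL / PARTIALLY / NO
           MEMORY_SIZE_IN_TOTAL,   -- bytes currently held in memory
           MEMORY_SIZE_IN_MAIN,
           MEMORY_SIZE_IN_DELTA
      FROM M_CS_TABLES
     WHERE SCHEMA_NAME = 'MYSCHEMA' AND TABLE_NAME = 'MYTABLE';

    -- size of the same table in the persistence (data volume)
    SELECT TABLE_NAME, DISK_SIZE
      FROM M_TABLE_PERSISTENCE_STATISTICS
     WHERE SCHEMA_NAME = 'MYSCHEMA' AND TABLE_NAME = 'MYTABLE';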

I hope this helps...

Tomas

rindia
Active Contributor
0 Kudos

Hi Tomas,

I thought that all of the data is stored in-memory and that the hard disk is only used for taking backups and for recovery when the HANA database crashes or otherwise fails.

Following what you said, I created a small column-store table with only 5 records in it and looked at the run-time information. Below is the screenshot.

[Screenshot: run-time information of the test table]

As per this, most of the data resides in memory and only a small percentage is on disk.

Estimated max size = size in memory + size on disk

Size in memory = main storage size + delta storage size

Also, can you please throw some light on the main storage and delta storage sizes?

Regards

Raj


tomas-krojzl
Active Contributor
0 Kudos

Hello,

discussions of memory vs. disk are mostly rhetorical... here is perhaps a clearer explanation:

- memory is volatile = you will lose it in case of an outage

- to protect the data you need to "persist" the information - you need to write the log to disk as part of the commit to ensure transactional consistency --> direct impact on performance = we need a fast medium for the logs (usually flash technology)

- if you have the logs you are protected - the data in the data files does not need to be written instantly and can be persisted later as part of the savepoint operation

So now you have everything in both memory and on disk... Now, if you restart SAP HANA:

- when you stop - all data in memory is gone (volatile medium)

- during startup only row-based tables are loaded - column-based tables are loaded only if marked with the preload flag

- the remaining column-based tables are loaded when first used - and not completely, but only those columns that are required (lazy loading, allowing faster startup)

Now you have everything on disk, but only the relevant part that is really required in memory...

Where is SAP HANA operating? In memory (for inserts, updates and deletes also on flash disk, in the log files)...

Is data loaded into memory completely? No... only what is required is loaded... SAP is also working on a mechanism to push data out of memory (like PSA tables) because these do not need to be kept in memory...

Is the data on disk required? Yes - the logs are super-important for consistency, and the data files are needed so there is something from which memory can be populated when data needs to be loaded (during startup or later)...

As you can see, it is just a matter of perspective...

Regarding your image - notice that you have the status LOADED = PARTIALLY - this means that the table is not completely in memory...

Also, regarding why it is bigger in memory - not everything is persisted on disk - some memory structures exist only in memory and are populated during the load operation.
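As a side note, the load state can also be influenced directly from SQL; a small sketch using statements from the SAP HANA SQL reference ("MYSCHEMA"."MYTABLE" is a placeholder):

    LOAD "MYSCHEMA"."MYTABLE" ALL;                 -- force all columns of the table into memory
    UNLOAD "MYSCHEMA"."MYTABLE";                   -- push the table out of memory (it stays on disk)
    ALTER TABLE "MYSCHEMA"."MYTABLE" PRELOAD ALL;  -- mark the table to be loaded during startup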

I hope this explains it... In the next post I will cover the delta merge concept (need to run now) 🙂

Tomas


rindia
Active Contributor
0 Kudos

Hi Tomas,

That was a great explanation. I like it.

tomas-krojzl
Active Contributor
0 Kudos

Hello,

now the second part - regarding main storage and delta storage...

Just a quick (simplified) recap of what the delta storage is - SAP HANA column-based tables (main storage) are optimized for read performance - that means the data for each column is stored in a sorted dictionary, and just the dictionary positions form the column...

Adding a new record to this is quite an expensive operation - you need to check whether a dictionary entry exists, and if not, you need to create a new entry and sort the dictionary again... and again for the next modification... and again...

To avoid repeating this expensive operation, it is more logical to have a temporary area (the delta store) where you queue the modified records and then process them in a batch...

That is all - the delta area can be seen as a temporary area for modified records before they are "merged" into the main storage - which is called the delta merge operation...

In your case you inserted rows into the table, but the database has not yet performed a delta merge (which is done automatically when certain conditions are met) - therefore you can see that the main store is empty while the delta is full...

...you can trigger the delta merge manually (see the SQL guide) and then you will see that the delta is empty and the main store is occupied...
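For illustration, the manual trigger is a single statement (placeholder names; the MERGE DELTA statement and the M_CS_TABLES view are documented in the SQL and system views references):

    MERGE DELTA OF "MYSCHEMA"."MYTABLE";

    -- afterwards the delta store should be (near) empty:
    SELECT MEMORY_SIZE_IN_MAIN, MEMORY_SIZE_IN_DELTA
      FROM M_CS_TABLES
     WHERE SCHEMA_NAME = 'MYSCHEMA' AND TABLE_NAME = 'MYTABLE';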

Hope this helps...

Tomas

Former Member
0 Kudos

Hi Tomas

If you are saying that the tables we create are on the persistence layer, then why are we logging all the changes and creating savepoints regularly? Since the data is already on the persistence layer, and the in-memory data is loaded from the persistence layer only, we don't require any backup. Right? Also, all the data that is changed in the source system will be replicated to the tables, i.e. those stored in the persistence layer.

Regards

Prashanth.

rindia
Active Contributor
0 Kudos

Hi Tomas,

Yes, I got all the explanation I needed from you. But unfortunately I am not the owner of this thread.

This is really helpful, and many thanks for the time you spent clarifying my question.

Regards

Raj

tomas-krojzl
Active Contributor
0 Kudos

Hello,

> If you are saying that the tables we create are on the persistence layer,

yes - all tables must be on the persistence level so that in case of a power outage you do not lose data... (but they are not updated instantly - rather later, asynchronously)

> ...then why are we logging all the changes and creating savepoints regularly...

we log the changes to be covered in case of a power outage (when the content of memory is lost) - in that case the data is loaded from the data files and then post-processed using the information from the log files (which contain every transaction that was ever committed)

savepoints are required so that every once in a while (usually once per 5 minutes) there is a consistent image of the data on disk, and to ensure that we never need more than the last 5 minutes of log information (otherwise we might need A LOT of log files to process)
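For reference, this interval is configurable; a sketch assuming the savepoint_interval_s parameter in the persistence section of global.ini (default 300 seconds - verify against your revision's documentation):

    -- inspect the current savepoint interval
    SELECT * FROM M_INIFILE_CONTENTS
     WHERE FILE_NAME = 'global.ini'
       AND SECTION   = 'persistence'
       AND KEY       = 'savepoint_interval_s';

    -- change it system-wide (use with care)
    ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
      SET ('persistence', 'savepoint_interval_s') = '300' WITH RECONFIGURE;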

> ...Since the data is already on the persistence layer, and the in-memory data is loaded from the persistence layer only, we don't require any backup. Right?

Not at all - backup is vital and super-important - if there is a failure and you lose or corrupt a data file (I have never seen this, but it might hypothetically happen) then it is very likely that SAP HANA will fail (and you will lose the information in memory) - and even if it does not, SAP HANA has no mechanism to reconstruct a data file from the information in memory...

If this were to happen, you would need the data file and all log files to restore operation - alternatively you might switch to a second data center, in case a disaster recovery solution is implemented (but even in this case backups are important)...

You might also encounter logical errors or human errors - for example, someone makes a mistake and deletes a table, and you might need a one-week-old backup to undo the damage...

The backup strategy is just as important as with other databases...

> And all the data that is changed in the source system will be replicated to the tables, i.e. those stored in the persistence layer.

If you are using SAP HANA in a side-car scenario (you replicate data from other systems and do not have any native data), there are still other objects that should be preserved - for example modeling content, users / roles, etc...

Hope it helps...

Tomas

Former Member
0 Kudos

Hi Tomas

Thanks for explanation.

So, whenever a table is created, it will be on the persistence layer and also in memory. And whatever we access, i.e. when we right-click on a table and select Open Content, it picks the data from memory and displays it to us. Right?

Regards

Prashanth

tomas-krojzl
Active Contributor
0 Kudos

Hello,

more or less, yes - when you open a table in Studio, Studio sends an SQL query to the server and the server processes the query - if the required table columns are not in memory, they are loaded into memory, the data is extracted from memory, and the query result is returned to Studio, where it is displayed.
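You can watch this lazy, column-wise loading yourself; a small sketch, assuming the M_CS_COLUMNS monitoring view (placeholder names):

    -- which columns of a table are currently held in memory
    SELECT COLUMN_NAME, LOADED, MEMORY_SIZE_IN_TOTAL
      FROM M_CS_COLUMNS
     WHERE SCHEMA_NAME = 'MYSCHEMA' AND TABLE_NAME = 'MYTABLE';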

Tomas

Former Member
0 Kudos

Hello Tomas

I guess I have understood it a bit now. But one more doubt: when any updates or insertions happen, the data is first written to main memory and then to disk, right? Do you have any idea of the frequency?

Regards

Prashanth

tomas-krojzl
Active Contributor
0 Kudos

Hello,

when you make modifications, they are written to memory (immediately); when the transaction is committed, it is also written to disk (to the log files); and later, asynchronously, as part of the periodic savepoint (by default once per 5 minutes), it is also written to the data files on disk...
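Written as an annotated sequence (a sketch of the write path just described; the table is hypothetical):

    -- hypothetical table, for illustration only
    CREATE COLUMN TABLE demo_writes (id INT, payload NVARCHAR(100));

    INSERT INTO demo_writes VALUES (1, 'hello');  -- applied in memory (delta store) immediately
    COMMIT;                                       -- redo log entry written synchronously to the log volume
    -- the data volume is only updated later, by the next savepoint
    -- (default: every 5 minutes), asynchronously to the transaction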

Tomas

Former Member
0 Kudos

Excellent !

Thank you very much Tomas.

Former Member
0 Kudos

Hi Tomas,

Thank you for all the valuable information you have shared regarding BW on HANA.

I have a question regarding the below statement which you made a while ago:

Is data loaded into memory completely? No... only what is required is loaded... SAP is also working on a mechanism to push data out of memory (like PSA tables) because these do not need to be kept in memory...


I have a basic question: what is the difference between In-Memory vs. HANA-Optimized? (I am thinking both are the same.)

You are right that PSA data doesn't need to be in memory, but does that mean the PSA data will not be in the HANA database (because the entire HANA database is in-memory)? Is my understanding correct?

Regards

Shankar

tomas-krojzl
Active Contributor
0 Kudos

Hello,

I think your question is explained here:

Tomas

Former Member
0 Kudos

Hello Tomas,

In the case of SAP HANA, many software innovations have been made. One of those innovations is that SAP HANA avoids expensive database operations. What does that mean? Can you please elaborate on it?

My second question: in the SAP HANA column store, updates are performed by inserting a new entry into the delta storage. What does that mean? At the database level I execute update, delete and insert statements - so how do these get converted into insert statements into the delta storage?

Thanks,

Bhupesh Pant

tomas-krojzl
Active Contributor
0 Kudos

Hello,

"innovation is SAP HANA avoids expensive database operation"

This can be interpreted in many ways - for example, that SAP HANA stores data in memory (and not as cached blocks in a data cache like other databases, but in a very structured way optimized for fast processing), thus avoiding the need to go to disk when the data is not in the cache... etc...

It can also be interpreted to mean that SAP HANA stores data in columnar tables - therefore, for statements where you need to analyze a large number of rows but need only a few fields, you do not have to retrieve the complete row and throw everything else away; you retrieve only the required fields, because you work only with the required columns... etc...

In other words this statement is very generic...

Regarding the second part - storing data in columnar tables gives a huge boost to select performance (in particular for selects of a few fields against a huge number of rows) - however, the drawback is that "row-based" types of operations (in particular insert, update, delete) become more expensive, since they would require a rebuild of the complete table - so SAP developed a way to mitigate the degradation of these operations, using the "delta store"...

Every columnar table is equipped with a delta store (you can imagine it as an additional internal table that is completely transparent to all operations = it is one logical table composed of these two parts) which is optimized for row-based operations, and all row-based operations are done there (therefore there is no need to rebuild the complete column table each time a row operation is performed).

Therefore an update statement will not adjust the values in the column-based table directly - instead it inserts a new record into the delta store, invalidating the "original" record in the column store. When you then run a select statement, both parts (column table and delta store) are processed and the correct result is returned...

Every once in a while (when exactly is controlled either by internal SAP HANA rules or, optionally, by the application, such as BW itself) an operation called "delta merge" is triggered, which takes the content of the delta store and includes it in the column table itself (rebuilding the table) - this internal operation is online and invisible to the application (there is no disruption).
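A minimal end-to-end sketch of this behavior (hypothetical table; the MERGE DELTA statement and M_CS_TABLES columns as documented in the SQL and system views references):

    -- hypothetical table, for illustration only
    CREATE COLUMN TABLE sales (id INT PRIMARY KEY, amount DECIMAL(10,2));

    INSERT INTO sales VALUES (1, 100.00);            -- new row goes to the delta store
    UPDATE sales SET amount = 150.00 WHERE id = 1;   -- old version invalidated, new version inserted into the delta store

    -- before the merge: the data sits in the delta store
    SELECT MEMORY_SIZE_IN_MAIN, MEMORY_SIZE_IN_DELTA
      FROM M_CS_TABLES WHERE TABLE_NAME = 'SALES';

    MERGE DELTA OF sales;                            -- fold the delta store into the main store (online)

    -- after the merge: delta (near) empty, main store populated
    SELECT MEMORY_SIZE_IN_MAIN, MEMORY_SIZE_IN_DELTA
      FROM M_CS_TABLES WHERE TABLE_NAME = 'SALES';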

I hope this clarifies the concept.

Answers (0)