Former Member

Regarding compression?

Hi friends,

Can anyone tell me clearly about compression of data targets?

For which data targets can we do compression?

Thanks & regards,




3 Answers

  • Best Answer
    Former Member
    Posted on Feb 22, 2008 at 02:03 PM


    Check the link below for more on compression.

    You can do compression only for InfoCubes.


  • Former Member
    Posted on Feb 22, 2008 at 02:06 PM


    When the same data is split across different requests, the records in each request are identical except for the request ID, and maintaining these request IDs takes extra space.

    So cubes are compressed, which makes all the data from the different requests appear under the single request ID '0'. This data is moved from the F fact table to the E fact table. When new data arrives, it goes into the F table first, and you need to compress those new requests as well to consolidate them.

    Hope this clears it up for you.




  • Posted on Feb 22, 2008 at 02:09 PM


    Compressing InfoCubes


    When you load data into the InfoCube, entire requests can be inserted at the same time. Each of these requests has its own request ID, which is included in the fact table in the packet dimension. This makes it possible to pay particular attention to individual requests. One advantage of the request ID concept is that you can subsequently delete complete requests from the InfoCube.

    However, the request ID concept can also cause the same data record (all characteristics agree, with the exception of the request ID) to appear more than once in the fact table. This unnecessarily increases the volume of data, and reduces performance in reporting, as the system has to perform aggregation using the request ID every time you execute a query.

    Using compression, you can eliminate these disadvantages and bring data from different requests together into one single request (request ID 0).
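    As a rough illustration (plain Python, not the actual BW implementation), compression sums all key figures over the logical key, i.e. all characteristics except the request ID, and writes the result under request ID 0:

```python
from collections import defaultdict

# Simplified fact-table rows: (request_id, material, month, amount).
f_fact_table = [
    (1, "M01", "2008-01", 100),
    (2, "M01", "2008-01", 50),   # same logical key as the first row, different request
    (2, "M02", "2008-01", 70),
]

def compress(rows):
    """Aggregate key figures over the logical key (everything except the request ID)."""
    totals = defaultdict(int)
    for _request_id, material, month, amount in rows:
        totals[(material, month)] += amount
    # All compressed rows carry request ID 0, as in the E fact table.
    return [(0, material, month, amount)
            for (material, month), amount in sorted(totals.items())]

print(compress(f_fact_table))
# [(0, 'M01', '2008-01', 150), (0, 'M02', '2008-01', 70)]
```

    The two identical M01 rows collapse into one, which is exactly why the request-ID aggregation at query time becomes unnecessary after compression.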

    This function is critical, as the compressed data can no longer be deleted from the InfoCube using its request ID. You must be absolutely certain that the data loaded into the InfoCube is correct.


    You can choose request IDs and release them to be compressed. You can start the function immediately or schedule it in the background, and you can also include it in a process chain.

    Compressing one request takes approx. 2.5 ms per data record.

    With non-cumulative InfoCubes, compression has an additional effect on query performance. Also, the marker for non-cumulatives in non-cumulative InfoCubes is updated. This means that, on the whole, less data is read for a non-cumulative query, and the reply time is therefore reduced. See also Modeling of Non-Cumulatives with Non-Cumulative Key Figures.

    If you run the compression for a non-cumulative InfoCube, the summarization time (including the time to update the markers) will be about 5 ms per data record.
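    The two rule-of-thumb figures above (approx. 2.5 ms per record without non-cumulative key figures, approx. 5 ms including the marker update) make it easy to estimate how long a compression run will take; a quick back-of-the-envelope sketch:

```python
def estimated_runtime_minutes(num_records, ms_per_record):
    """Rough compression duration from the per-record rule of thumb."""
    return num_records * ms_per_record / 1000 / 60

# One million records in a cumulative cube at ~2.5 ms per record:
print(round(estimated_runtime_minutes(1_000_000, 2.5), 1))  # 41.7 (minutes)

# The same request in a non-cumulative cube at ~5 ms per record:
print(round(estimated_runtime_minutes(1_000_000, 5.0), 1))  # 83.3 (minutes)
```

    These are planning figures only; actual throughput depends heavily on the hardware and database.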

    If you are using an Oracle database as your BW database, you can also carry out a report using the relevant InfoCube in reporting while the compression is running. With other manufacturers’ databases, you will see a warning if you try to execute a query on an InfoCube while the compression is running. In this case you can execute the query once the compression has finished executing.

    If you want to avoid the InfoCube containing entries whose key figures are zero values (in reverse posting for example) you can run a zero-elimination at the same time as the compression. In this case, the entries where all key figures are equal to 0 are deleted from the fact table.
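    As a minimal sketch (illustrative Python, assuming all key figures aggregate with SUM), zero-elimination simply drops compressed rows whose key figures are all zero, e.g. a posting cancelled by a reverse posting:

```python
def zero_eliminate(rows):
    """Keep only rows in which at least one key figure is non-zero."""
    return [row for row in rows if any(kf != 0 for kf in row["key_figures"])]

compressed = [
    {"material": "M01", "key_figures": (100, 2)},
    {"material": "M02", "key_figures": (0, 0)},  # posting and reversal net to zero
]
print(zero_eliminate(compressed))  # only the M01 row remains
```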

    Zero-elimination is permitted only for InfoCubes in which all key figures have the aggregation behavior 'SUM'. In particular, you are not permitted to run zero-elimination with non-cumulative values.

    For non-cumulative InfoCubes, you can ensure that the non-cumulative marker is not updated by setting the indicator No Marker Updating. You have to use this option if you are loading historic non-cumulative value changes into an InfoCube after an initialization has already taken place with the current non-cumulative. Otherwise the results produced in the query will not be correct. For performance reasons, you should compress subsequent delta requests.

    Compression is done to improve performance. When data is loaded into the InfoCube, it is loaded request by request, and each request ID is stored in the fact table in the packet dimension. This unnecessarily increases the volume of data and reduces reporting performance, as the system has to aggregate over the request ID every time you execute a query. When you compress the requests of a cube, the data is moved from the F fact table to the E fact table, and data from different requests is brought together into one single request (request ID 0): all data is then stored at the record level and no individual requests remain. This also removes the package SIDs, so there is one less join when data is fetched.

    The compressed data can no longer be deleted from the InfoCube using its request ID. You must be absolutely certain that the data loaded into the InfoCube is correct before compressing.

    Note 407260 - FAQs: Compression of InfoCubes



    This note gives some explanation of the compression of InfoCubes with ORACLE as the database platform.

    Compression on other database platforms might differ from this.

    Other terms

    InfoCubes, Compression, Aggregates, F-table, E-table, partitioning,

    ora-4030, ORACLE, Performance, Komprimierung

    Reason and Prerequisites


    1. What is the extent of compression we should expect from the portion we are loading?

    2. When the compression is stopped, will we have lost any data from the cube?

    3. What is the optimum size of a chunk of data to be compressed?

    4. Does compression lock the entire fact table, even if only selected records are being compressed?

    5. Should compression run with the indexes on or off?

    6. What can I do if the performance of the compression is bad or becomes bad? Or what can I do if query performance after compression is bad?


    In general:

    First of all you should check whether the P-index on the E fact table exists. If this index is missing, compression will be practically impossible. If it does not exist, you can recreate it by activating the cube again. Please check the activation log to see whether the creation was successful.

    There is one exception to this rule: if only one request is chosen for compression and it is the first request to be compressed for that cube, then the P-index is dropped and, after the compression, the index is recreated automatically. This is done for performance reasons.


    1. The compression ratio is completely determined by the data you are loading. Compression only means that data tuples which have an identical 'logical' key in the fact table (the logical key includes all dimension identities except the 'technical' package dimension) are combined into a single record.

    So, for example, if you are loading data on a daily basis but your cube contains only the month as its finest time characteristic, you might get a compression ratio of 1/30.
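    A hypothetical calculation of this 1/30 ratio: if 30 daily loads all map to the same month (the cube's finest time characteristic), the 30 F-table rows per logical key collapse into a single E-table row:

```python
def compression_ratio(rows_before, rows_after):
    """Ratio of compressed rows to original rows; 1.0 means no compression at all."""
    return rows_after / rows_before

# 30 daily requests with one row per material and day, monthly granularity in the cube:
materials, days = 1000, 30
rows_before = materials * days   # rows in the F fact table
rows_after = materials           # one compressed row per material and month
print(compression_ratio(rows_before, rows_after))  # 0.0333... ≈ 1/30
```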

    At the other extreme, if every record you are loading is different from the records you have loaded before (e.g. your records contain a sequence number), then the compression ratio will be 1, which means there is no compression at all. Nevertheless, even in this case you should compress the data if you are using partitioning on the E fact table, because partitioning is used only for compressed data. Please see css-note 385163 for more details about partitioning.

    If you are absolutely sure that there are no duplicates in the records, you can consider the optimization described in css-note 0375132.

    2. The data should never become inconsistent by running a compression. Even if you stop the process manually, a consistent state should be reached. Whether the requests (or at least some of them) end up compressed, or whether the changes are rolled back, depends on the phase the compression was in when it was canceled.

    The compression of a single request can be divided into two main phases.

    a) In the first phase the following actions are executed:

    Insert or update every row of the request that should be compressed into the E fact table

    Delete the entry for the corresponding request out of the package dimension of the cube

    Change the 'compr-dual'-flag in the table rsmdatastate

    Finally, a COMMIT is executed.

    b) In the second phase the remaining data in the F fact table is deleted.

    This is done either by a 'DROP PARTITION' or by a 'DELETE'. As this data is no longer accessible in queries (the entry in the package dimension has been deleted), it does not matter if this phase is terminated.

    Concluding this:

    If the process is terminated while the compression of a request is in phase (a), the data is rolled back; but if the compression is terminated in phase (b), no rollback is executed. The only problem then is that the F fact table might contain unusable data. This data can be deleted with the function module RSCDS_DEL_OLD_REQUESTS. To run this function module you only have to enter the name of the InfoCube. If you want, you can also specify the dimension ID of the request you want to delete (if you know this ID); if no ID is specified, the module deletes all entries without a corresponding entry in the package dimension.
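    The two phases and the cleanup module can be modeled in a few lines of Python (a simplified sketch, not the actual condenser logic): phase (a) ends with the COMMIT, so a termination during phase (b) only leaves orphaned F-table rows that are invisible to queries and can be removed later:

```python
def compress_request(e_table, f_table, package_dim, request_id, fail_in_phase_b=False):
    """Simplified model of compressing one request in two phases."""
    # Phase (a): insert/update the request's rows in the E table under request 0
    # and delete the request's entry from the package dimension, then COMMIT.
    moved = [row for row in f_table if row["request"] == request_id]
    e_table.extend({**row, "request": 0} for row in moved)
    package_dim.discard(request_id)   # the request is no longer visible to queries
    # -- COMMIT point --
    if fail_in_phase_b:
        return                        # orphaned rows remain in the F table
    # Phase (b): delete the remaining F-table data (DROP PARTITION / DELETE).
    f_table[:] = [row for row in f_table if row["request"] != request_id]

def cleanup_orphans(f_table, package_dim):
    """Model of RSCDS_DEL_OLD_REQUESTS: drop F rows without a package-dimension entry."""
    f_table[:] = [row for row in f_table if row["request"] in package_dim]

e_table, f_table, package_dim = [], [{"request": 1, "amount": 10}], {1}
compress_request(e_table, f_table, package_dim, 1, fail_in_phase_b=True)
# The orphaned row is still in f_table, but queries no longer see it;
# the cleanup pass removes it:
cleanup_orphans(f_table, package_dim)
print(e_table, f_table)  # [{'request': 0, 'amount': 10}] []
```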

    If you are compressing several requests in a single run and the process breaks during the compression of request x, all smaller requests are committed, and only request x is handled as described above.

    3. The only size limitation for the compression is that the complete rollback information for the compression of a single request must fit into the rollback segments. For every record in the request to be compressed, either the corresponding record in the E fact table is updated or the record is newly inserted. As a 'DROP PARTITION' is normally used for the deletion, the deletion is not critical for the rollback. Since both operations are not expensive in terms of space, this should not be critical.

    Performance is heavily dependent on the hardware. As a rule of thumb, you might expect to compress about 2 million rows per hour if the cube does not contain non-cumulative key figures, and about 1 million rows per hour if it does.

    4. It is not allowed to run two compressions concurrently on the same cube. But, for example, loading into a cube while a compression is running is possible, as long as you do not try to compress requests which are still in the phase of loading/updating data into the cube.

    5. Compression is forbidden while a selective deletion is running on the cube, and while an attribute/hierarchy change run is active.

    6. It is very important that either the 'P' index or the primary index '0' on the E fact table exists during the compression.

    Please verify the existence of this index with transaction DB02. Without one of these indexes the compression will not run!

    If you are running queries parallel to the compression you have to leave the secondary indexes active.

    If you encounter the error ORA-4030 during the compression, you should drop the secondary indexes on the E fact table. This can be achieved using transaction SE14. If you use the tabstrip in the Administrator Workbench, the secondary indexes on the F fact table will be dropped, too. (If there are requests which are smaller than 10 percent of the F fact table, the indexes on the F fact table should stay active, because reading the requests can then be sped up by using the secondary index on the package dimension.) After that you should start the compression again.

    Deleting the secondary indexes on the E fact table of an InfoCube that is to be compressed may be useful (sometimes even necessary) to prevent resource shortages on the database. Since the secondary indexes are needed for reporting (not for compression), queries may take much longer while the secondary E table indexes are not there.

    If you want to delete the secondary indexes only on the E fact table, you should use the function RSDU_INFOCUBE_INDEXES_DROP (and specify the parameter I_INFOCUBE = ). If you want to rebuild the indexes, use the function RSDU_INFOCUBE_INDEXES_REPAIR (same parameter as above).

    To check which indexes exist, you may use transaction RSRV and select there the elementary database check for the indexes of an InfoCube and its aggregates. That check is more informative than the lights on the performance tabstrip in the InfoCube maintenance.

    7. As already stated above, it is absolutely necessary that a concatenated index over all dimensions exists. This index normally has the suffix 'P'. Without this index a compression is not possible! If the index does not exist, the compression tries to build it. If that fails (for whatever reason), the compression terminates.

    If you do not normally drop the secondary indexes during compression, these indexes might degenerate after some compression runs, and you should therefore rebuild them from time to time. Otherwise you might see performance degradation over time.

    As the distribution of data between the E fact table and the F fact table is changed by a compression, query performance can be influenced significantly. Normally compression should lead to better performance, but you have to make sure that the statistics are up to date so that the optimizer can choose an appropriate access path. This means that after the first compression of a significant amount of data, the E fact table of the cube should be analyzed, because otherwise the optimizer still assumes that this table is empty. For the same reason, you should not analyze the F fact table when all requests are compressed, because then the optimizer would assume that the F fact table is empty. Instead, analyze the F fact table when a normal amount of uncompressed requests is in the cube.

    Header Data

    Release Status: Released for Customer

    Released on: 05-17-2005 09:30:44

    Priority: Recommendations/additional info

    Category: Consulting

    Primary Component: BW-BEX-OT-DBIF-CON Condensor

    Secondary Components: BW-SYS-DB-ORA BW ORACLE



