Skip to Content

Data Cluster Mistery: Data Cluster vs. RAWSTRING

Hi,
In NW 7.51 SAP converted the "classic" structure of data cluster tables from multiple rows of datatype LRAW to single row of the "new" datatype RAWSTRING (ABAP News for Release 7.51 – New Structure for Export/Import Tables).

Now the question left is what would be the best solution (size efficient) to save hugh datasets in DB in legacy system (earlier NW versions than 7.51):
Using old (multiple rows) data cluster table or a simple DB table with single RAWSTRING field?

I have done some tests and the results were quite confusing.
I’ve saved an internal table of MARA with 5000 records.
For conversion to XSTRING I used the EXPORT TO DATA BUFFER (COMPRESSION ON) command.

The results were:

  1. Data cluster table for ITAB (not converted to XSTRING) – 664KB.
  2. Data cluster table for converted XSTRING – 272KB.
  3. DB table with RAWSTRING – 536KB.

Is there any explanation for it?

Add comment
10|10000 characters needed characters exceeded

  • Get RSS Feed

2 Answers

  • Aug 29, 2017 at 07:22 AM

    According to the following program, the length of the data clusters is exactly the same for an internal table stored in a data buffer or in the CLUSTD fields of the two possible export/import table structures, for compression on and off.

    How have you counted? Don't forget, that besides the cluster itself, there are also the administrative columns in export/import table structures. Of course, this overhead is higher for the old tabular storage.

    DATA(rnd) = cl_abap_random_int=>create(
                   seed = CONV i( sy-uzeit )
                   min = 1
                   max = 1000 ).
    TYPES itab TYPE TABLE OF i WITH EMPTY KEY.
    DATA(itab) = VALUE itab( FOR i = 1 UNTIL i > 100000
                             ( rnd->get_next( ) ) ).
    
    DATA(out) = cl_demo_output=>new( ).
    
    out->next_section( 'Compression off' ).
    
    DATA data_buffer TYPE xstring.
    EXPORT itab = itab TO DATA BUFFER data_buffer
                       COMPRESSION OFF.
    
    EXPORT itab = itab TO DATABASE demo_indx_blob(rn)
                          ID 'RandomNumbers' COMPRESSION OFF.
    SELECT SINGLE clustd
           FROM demo_indx_blob
           WHERE relid = 'RN' AND
                 id    = 'RandomNumbers'
           INTO  @DATA(indx_blob_clustd).
    
    EXPORT itab = itab TO DATABASE demo_indx_table(rn)
                       ID 'RandomNumbers' COMPRESSION OFF.
    SELECT *
           FROM demo_indx_table
           WHERE relid = 'RN' AND
                 id    = 'RandomNumbers'
           ORDER BY srtf2
           INTO TABLE @DATA(indx_table).
    DATA(indx_table_clustd) = VALUE xstring( ).
    LOOP AT indx_table INTO DATA(indx_table_wa).
      CONCATENATE indx_table_clustd
                  indx_table_wa-clustd(indx_table_wa-clustr) 
                  INTO indx_table_clustd
                  IN BYTE MODE.
    ENDLOOP.
    
    out->write( xstrlen( data_buffer )
      )->write( xstrlen( indx_blob_clustd )
      )->write( xstrlen( indx_table_clustd ) ).
    
    CLEAR data_buffer.
    DELETE FROM DATABASE demo_indx_table(rn) ID 'RandomNumbers'.
    DELETE FROM DATABASE demo_indx_blob(rn)  ID 'RandomNumbers'.
    
    out->next_section( 'Compression on' ).
    
    EXPORT itab = itab TO DATA BUFFER data_buffer
                       COMPRESSION ON.
    
    EXPORT itab = itab TO DATABASE demo_indx_blob(rn)
                          ID 'RandomNumbers' COMPRESSION ON.
    SELECT SINGLE clustd
           FROM demo_indx_blob
           WHERE relid = 'RN' AND
                 id    = 'RandomNumbers'
           INTO  @indx_blob_clustd.
    
    EXPORT itab = itab TO DATABASE demo_indx_table(rn)
                       ID 'RandomNumbers' COMPRESSION ON.
    SELECT *
           FROM demo_indx_table
           WHERE relid = 'RN' AND
                 id    = 'RandomNumbers'
           ORDER BY srtf2
           INTO TABLE @indx_table.
    indx_table_clustd = VALUE xstring( ).
    LOOP AT indx_table INTO indx_table_wa.
      CONCATENATE indx_table_clustd
                  indx_table_wa-clustd(indx_table_wa-clustr) 
                  INTO indx_table_clustd
                  IN BYTE MODE.
    ENDLOOP.
    
    out->write( xstrlen( data_buffer )
      )->write( xstrlen( indx_blob_clustd )
      )->write( xstrlen( indx_table_clustd ) ).
    
    DELETE FROM DATABASE demo_indx_table(rn) ID 'RandomNumbers'.
    DELETE FROM DATABASE demo_indx_blob(rn)  ID 'RandomNumbers'.
    
    out->display( ).
    
    Add comment
    10|10000 characters needed characters exceeded

    • Just to be complete:

      "all 3 variables should contain exactly the same value (regardless the way they are stored in DB)"

      Not really. As a rule, the size is about the same. But for optimization reasons, the binary content of the data cluster can differ between those that are stored in one string and those that are splitted over several lines.

      Therefore, you must always use the appropriate IMPORT.

  • Aug 30, 2017 at 05:49 AM

    To summarize:

    • We speak about export/import tables. Those are transparent database tables, not to be mixed up with the special cluster tables.
    • The statement EXPORT serializes ABAP data into a binary format called data cluster, with or without compression
    • A data cluster can be stored in different media.
    • For storing in an export/import table there are two ways: the whole cluster in one RAWSTRING or the data cluster split up into several LRAW fields.

    You claim that storing one and the same binary data in one RAWSTRING uses more space on the database than splitting it in several LRAW fields. If it is true, that finding should be independent from data clusters. You should be able to reproduce it by writing any xstring to a RAWSTRING or to several LRAW fields. Then it is a question to the database. If you can't reproduce it, use Open SQL to examine your data clusters in order to find the difference.

    Add comment
    10|10000 characters needed characters exceeded