04-03-2019 10:04 AM
Hi,
CL_ABAP_GZIP_TEXT_STREAM requires an output handler that implements the interface method if_abap_gzip_text_handler~use_out_buf, which is defined as static.
This works fine when I need a single compression stream. E.g. looping through a single table, writing data out using a single gzip stream.
However, when I need multiple independent concurrent compression streams (writing records to multiple files concurrently), this does not work: the user_outbuf class has static variables / methods, so the buffer values would get mixed together. E.g. looping through one table and writing its records out, while querying another table for related records and writing those out to a different compression stream that uses the same user_outbuf type.
Is there a cleaner way to solve this problem, rather than redefining the same class with a different name over and over again to get concurrency support? e.g. (replicate the class definitions / implementations), then reference:
uref1 TYPE REF TO user_outbuf1,
uref2 TYPE REF TO user_outbuf2,
etc
Thanks,
Jay 🙂
Example code:
REPORT TEST.
" Define a buffer handler class
CLASS user_outbuf DEFINITION.
PUBLIC SECTION.
INTERFACES if_abap_gzip_text_handler.
CLASS-DATA:
buffer TYPE x LENGTH 1000, " predefine size of the buffer
buffer_len TYPE i VALUE -1. " -1 means the total length of buffer
ENDCLASS.
CLASS user_outbuf IMPLEMENTATION.
METHOD if_abap_gzip_text_handler~use_out_buf.
WRITE: / buffer.
ENDMETHOD.
ENDCLASS.
START-OF-SELECTION.
DATA:
uref TYPE REF TO user_outbuf,
uref2 TYPE REF TO user_outbuf,
csref TYPE REF TO CL_ABAP_GZIP_TEXT_STREAM.
" create an instance of the buffer handler
CREATE OBJECT uref.
" create an instance of the gzip compression stream
CREATE OBJECT csref
EXPORTING CONVERSION = 'DEFAULT'
OUTPUT_HANDLER = uref.
" setup the buffer
csref->set_out_buf(
IMPORTING
out_buf = uref->buffer
out_buf_len = uref->buffer_len
).
" compress some data
CALL METHOD csref->compress_text_stream
EXPORTING
TEXT_IN = 'Some text'
TEXT_IN_LEN = -1.
CALL METHOD csref->compress_text_stream_end
EXPORTING TEXT_IN = 'Last text'
TEXT_IN_LEN = -1.
" create a second handler instance; because
" if_abap_gzip_text_handler~use_out_buf is static,
" the buffer variables also have to be static, so they are shared
" and cannot be used concurrently by another compression instance
CREATE OBJECT uref2.
WRITE: / uref2->buffer.
04-03-2019 1:55 PM
04-03-2019 1:33 PM
I'm not sure that I understand your problem, but I think this code could help you solve it.
report zzzjc_test01.
" Define a buffer handler class
class user_outbuf definition.
public section.
interfaces if_abap_gzip_text_handler.
class-data:
buffer type x length 1000, " predefine size of the buffer
buffer_len type i value -1. " -1 means the total length of buffer
endclass.
class user_outbuf implementation.
method if_abap_gzip_text_handler~use_out_buf.
write: / buffer.
endmethod.
endclass.
start-of-selection.
types begin of lty_list_buffers.
types uref type ref to user_outbuf.
types csref type ref to cl_abap_gzip_text_stream.
types end of lty_list_buffers.
types lty_list_buffers_tt type standard table of lty_list_buffers with empty key.
data lt_list_buffers type lty_list_buffers_tt.
" your input text as an internal table
data lt_input_text type standard table of string with empty key.
loop at lt_input_text assigning field-symbol(<ls_input_text>).
append initial line to lt_list_buffers assigning field-symbol(<ls_list_buffers>).
<ls_list_buffers>-uref = new user_outbuf( ).
<ls_list_buffers>-csref
= new cl_abap_gzip_text_stream(
conversion = 'DEFAULT'
output_handler = <ls_list_buffers>-uref ).
<ls_list_buffers>-csref->set_out_buf(
importing
out_buf = <ls_list_buffers>-uref->buffer
out_buf_len = <ls_list_buffers>-uref->buffer_len
).
" compress the whole input in one final call; passing the same text to
" compress_text_stream first would compress it twice
<ls_list_buffers>-csref->compress_text_stream_end(
exporting
text_in = <ls_input_text>
text_in_len = -1
).
endloop.
loop at lt_list_buffers assigning <ls_list_buffers>.
write / <ls_list_buffers>-uref->buffer.
endloop.
04-03-2019 1:55 PM
04-04-2019 10:18 AM
An example to demonstrate point #2, using a static internal table GZIPPERS. My code probably looks complex, but I tried to make it reusable (in fact, I don't consider it a good reusable class yet; I'd like to rewrite it completely but I don't have time right now ;-)).
" PART 1 : REUSABLE CODE
INTERFACE lif_gzip_output_handler_new.
METHODS use_out_buf
IMPORTING
out_buf TYPE xsequence
out_buf_len TYPE i DEFAULT 0
part TYPE i.
METHODS get_out_buf EXPORTING ref_out_buf TYPE REF TO data ref_out_buf_len TYPE REF TO data.
ENDINTERFACE.
CLASS lcl_gzip_text_stream_new DEFINITION.
PUBLIC SECTION.
INTERFACES if_abap_gzip_text_handler.
TYPES : BEGIN OF ty_zipper,
gzip_stream TYPE REF TO cl_abap_gzip_text_stream,
output_handler TYPE REF TO lif_gzip_output_handler_new,
END OF ty_zipper,
ty_zippers TYPE HASHED TABLE OF ty_zipper WITH UNIQUE KEY gzip_stream.
METHODS constructor
IMPORTING
ref_string TYPE REF TO string
in_buf_len TYPE i DEFAULT 200
compress_level TYPE i DEFAULT 6
conversion TYPE abap_encod DEFAULT 'DEFAULT'
use_outbuf TYPE REF TO lif_gzip_output_handler_new.
METHODS next_chunk.
DATA: done TYPE abap_bool READ-ONLY.
PRIVATE SECTION.
CLASS-DATA:
gzippers TYPE ty_zippers.
DATA:
ref_string TYPE REF TO string,
in_buf_len TYPE i,
in_offset TYPE i,
gzipper TYPE REF TO cl_abap_gzip_text_stream.
ENDCLASS.
CLASS lcl_gzip_text_stream_new IMPLEMENTATION.
METHOD constructor.
me->ref_string = ref_string.
me->in_buf_len = in_buf_len.
done = abap_false.
gzipper = NEW cl_abap_gzip_text_stream(
compress_level = compress_level
conversion = conversion
output_handler = me ).
use_outbuf->get_out_buf( IMPORTING ref_out_buf = DATA(ref_out_buf) ref_out_buf_len = DATA(ref_out_buf_len) ).
ASSIGN ref_out_buf->* TO FIELD-SYMBOL(<out_buf>).
ASSIGN ref_out_buf_len->* TO FIELD-SYMBOL(<out_buf_len>).
gzipper->set_out_buf( IMPORTING out_buf = <out_buf> out_buf_len = <out_buf_len> ).
INSERT VALUE ty_zipper( gzip_stream = gzipper output_handler = use_outbuf ) INTO TABLE gzippers.
ENDMETHOD.
METHOD next_chunk.
CHECK done = abap_false.
ASSIGN ref_string->* TO FIELD-SYMBOL(<string>).
IF in_offset >= strlen( <string> ).
done = abap_true.
RETURN.
ENDIF.
IF in_offset + in_buf_len < strlen( <string> ).
DATA(chunk) = <string>+in_offset(in_buf_len).
gzipper->compress_text_stream( text_in = chunk ).
ADD in_buf_len TO in_offset.
ELSE.
chunk = <string>+in_offset.
gzipper->compress_text_stream_end( text_in = chunk ).
in_offset = strlen( <string> ).
done = abap_true.
ENDIF.
ENDMETHOD.
METHOD if_abap_gzip_text_handler~use_out_buf.
DATA(user_outbuf) = gzippers[ gzip_stream = gzip_stream ]-output_handler.
user_outbuf->use_out_buf(
out_buf = out_buf
out_buf_len = out_buf_len
part = part ).
ENDMETHOD.
ENDCLASS.
" PART 2 : DEMO CODE
CLASS lcl_use_outbuf DEFINITION.
PUBLIC SECTION.
INTERFACES lif_gzip_output_handler_new.
DATA: gzip_data TYPE xstring READ-ONLY,
out_buf TYPE x LENGTH 100,
out_buf_len TYPE i VALUE -1.
ENDCLASS.
CLASS lcl_use_outbuf IMPLEMENTATION.
METHOD lif_gzip_output_handler_new~use_out_buf.
gzip_data = gzip_data && out_buf(out_buf_len).
ENDMETHOD.
METHOD lif_gzip_output_handler_new~get_out_buf.
ref_out_buf = REF #( out_buf ).
ref_out_buf_len = REF #( out_buf_len ).
ENDMETHOD.
ENDCLASS.
START-OF-SELECTION.
DATA(a) = `Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod `
&& `tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim ven`
&& `iam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea com`
&& `modo consequat. Duis aute irure dolor in reprehenderit in voluptate veli`
&& `t esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat c`
&& `upidatat non proident, sunt in culpa qui officia deserunt mollit anim id`
&& ` est laborum.`.
DATA(use_outbuf1) = NEW lcl_use_outbuf( ).
DATA(use_outbuf2) = NEW lcl_use_outbuf( ).
DATA(compressor1) = NEW lcl_gzip_text_stream_new( ref_string = REF #( a ) in_buf_len = 90 use_outbuf = use_outbuf1 ).
DATA(compressor2) = NEW lcl_gzip_text_stream_new( ref_string = REF #( a ) in_buf_len = 60 use_outbuf = use_outbuf2 ).
WHILE compressor1->done = abap_false OR compressor2->done = abap_false.
compressor1->next_chunk( ).
DO 2 TIMES.
compressor2->next_chunk( ).
ENDDO.
ENDWHILE.
DATA a_again TYPE string.
cl_abap_gzip=>decompress_text( EXPORTING gzip_in = use_outbuf1->gzip_data IMPORTING text_out = a_again ).
ASSERT a_again = a.
cl_abap_gzip=>decompress_text( EXPORTING gzip_in = use_outbuf2->gzip_data IMPORTING text_out = a_again ).
ASSERT a_again = a.
04-25-2019 9:03 AM
An interesting approach using a "shim" layer in between to select the correct stream / buffer, which does seem to work! Thanks for the example 😄
04-03-2019 4:30 PM
Hi Sandra, Juan,
Thanks for responding, much appreciated. 🙂
>> point 1 - code doesn't reflect the question
The code was meant to demonstrate that a new buffer reference uref2 (please note the 2 at the end of the name), created after uref has been used, returns the same buffer value, thereby demonstrating that the two can't be used concurrently due to the static variable buffer. If two or more compression streams were created concurrently, their data would intertwine and become corrupt.
>> point 3 - why do I need to compress in parallel
Say I need to extract specific records from both BKPF (Journal Headers) and BSEG (Journal Entries). I first query BKPF and loop through each record, and for each record I query the associated journal lines from BSEG for the given BKPF header. Both tables are streamed simultaneously (concurrently) to different files aptly named BKPF.csv.gz and BSEG.csv.gz...
>> Juan's example
Yes, your example works when writing serially to independent files, but would not work if we need to write simultaneously to different files as they all use the same class user_outbuf which uses a static class-data buffer, meaning all instances share the same buffer variable value (as demonstrated by my example - last 2 lines).
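The sharing described above can be reproduced without any gzip classes. This is only a minimal sketch: the report name z_static_buffer_demo and the class lcl_shared_buf are hypothetical (not from the thread) and merely mirror the CLASS-DATA buffer of user_outbuf.

```abap
REPORT z_static_buffer_demo. " hypothetical report name

" Hypothetical class: only mirrors the CLASS-DATA buffer of user_outbuf
CLASS lcl_shared_buf DEFINITION.
  PUBLIC SECTION.
    " static attribute: one copy per class, not one per instance
    CLASS-DATA buffer TYPE x LENGTH 4.
ENDCLASS.

START-OF-SELECTION.
  DATA expected TYPE x LENGTH 4 VALUE 'DEADBEEF'.

  DATA(o1) = NEW lcl_shared_buf( ).
  DATA(o2) = NEW lcl_shared_buf( ).

  " write through the first reference ...
  o1->buffer = expected.

  " ... and the second reference sees the same bytes
  ASSERT o2->buffer = expected.
  WRITE: / o2->buffer.
```

Creating more instances therefore does not give more buffers, which is why two streams that both point at user_outbuf end up writing into the same memory.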
>> Sandra's point 2 - the SAP design is weird
Yes, I agree. It looks like it's designed to be used by only one stream at a time with the same buffer class.
>> Thoughts / options thus far:
1) Define a buffer class for every instance we intend to use concurrently (bkpf_outbuf, bseg_outbuf, etc). This does not feel like good OO design, especially if we don't know how many levels (classes) we will need until runtime.
2) Take a local copy of the SAP class CL_ABAP_GZIP_TEXT_STREAM and tweak the interface so that it's not static, then this problem should go away (I hope).
Thanks,
Jay 🙂
04-04-2019 2:35 PM
Quick update on this: I attempted to clone the CL_ABAP_GZIP_TEXT_STREAM class, but it references kernel methods, which require editing the abkmeth.seg file (https://help.sap.com/doc/abapdocu_752_index_htm/7.52/en-US/abenkernel_methods.htm) (definitions can be viewed through RSKMETH) before they are enabled in the copy. Unfortunately, this does not appear to be transport-friendly.
Instead, will try the other suggested option, defining a buffer class per table name. 😞