kilian_kilger
Active Participant

New ABAP expressions for generic and dynamic programming in ABAP Platform 2021:


Part I - Dynamic Access to (maybe generic) references


Do you use the new ABAP expressions like constructor operators or table selectors in your coding? But do you often find that when using generic programming, i.e. data types like REF TO DATA, DATA or ANY, you fall back to the programming style of the 70s? Then the new ABAP platform 2021 (which shipped with kernel 7.85 last week) has some new features to help you clean up your coding.

The main mantra of the new release is: "Get rid of field-symbols!"

This is part of a series of multiple blog posts. Please revisit this page as it might point to the sequel in a few weeks or if new topics concerning generic programming in ABAP may arise.

1. The old days: how to handle generic data references classically?


When using non-generic references in ABAP you could always write the following:
DATA foo TYPE REF TO i.
...
foo->* = 5.

Here and in the following the CREATE DATA statement or NEW operator has been omitted.

But when using generically typed references this was not possible:
DATA foo TYPE REF TO data.
...
foo->* = 5. " Syntax error: No dereferencing of generic reference possible

The only way to access the data that "foo" points to was to use a field-symbol.
DATA foo TYPE REF TO data.
...
ASSIGN foo->* TO FIELD-SYMBOL(<fs>).
<fs> = 5.

This makes the code uglier and more difficult to read. It also makes dereferencing the reference impossible inside ABAP expressions, as there is no expression variant of ASSIGN. Another disadvantage is the tricky error handling of ASSIGN. You can have subtle bugs when the error handling is forgotten.

2. Dereferencing generic references is now possible: (nearly) everywhere!


We have now lifted the above restriction. You can now use the dereferencing operator in most places in ABAP where you can use generically typed ABAP variables. A simple example would be:
DATA foo TYPE REF TO data.
...
my_object->meth( foo->* ).

If FOO is the initial reference, then a runtime error will occur, as in the non-generic case. So no error handling is necessary in most cases.

Of course this also works in ABAP SQL, as follows:
DATA ref TYPE REF TO data.
...
SELECT * FROM T100
INTO TABLE @ref->*.

This, however, immediately leads to a new question: The variable REF is a "REF TO DATA", not a reference to an internal table type.

The latter is not possible in ABAP yet. There simply is no "REF TO TABLE" - type.
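To make the ABAP SQL example above complete, here is a minimal sketch that includes the CREATE DATA statement omitted so far. It assumes a system where the table T100 is accessible; the WHERE condition and the row limit are only there to keep the illustration small. The static type of REF stays "REF TO DATA", and only at runtime does it point to an internal table.

```abap
DATA ref TYPE REF TO data.

" The static type stays generic; the table type is chosen at runtime only.
CREATE DATA ref TYPE STANDARD TABLE OF t100 WITH EMPTY KEY.

" Read a few rows directly into the anonymous internal table.
SELECT * FROM t100
  WHERE sprsl = 'E'
  INTO TABLE @ref->*
  UP TO 10 ROWS.

IF lines( ref->* ) > 0.
  " At least one row was read into the table behind REF.
ENDIF.
```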

3. Generic References and Internal Tables


In the past in many circumstances you could not use field-symbols of type ANY or variables of type DATA to access internal tables.
FIELD-SYMBOLS <any> TYPE any.
...
READ TABLE <any> ASSIGNING FIELD-SYMBOL(<line>)
WITH TABLE KEY (dyn_key) = value. " Syntax error: <any> is no internal table

Note that I am using a dynamic key specification here.

You had to manually "reassign" the field-symbol as follows:
FIELD-SYMBOLS <any> TYPE any.
FIELD-SYMBOLS <table> TYPE any table.
...

ASSIGN <any> TO <table>.
IF sy-subrc <> 0.
... " error handling!
ENDIF.

READ TABLE <table> ASSIGNING FIELD-SYMBOL(<line>)
WITH TABLE KEY (dyn_key) = value.

This makes the coding at least 5 lines longer, because of the error handling and the check for sy-subrc. It is also error-prone, as you might forget the error handling, which yields all kinds of funny results if you do this inside a loop and the field-symbol of the last loop iteration is still assigned.

You can now use variables and field-symbols of type ANY and DATA directly in LOOP and READ statements. This gives many new possibilities:
DATA ref TYPE REF TO data.
...
LOOP AT ref->* ASSIGNING FIELD-SYMBOL(<fs>). " now possible
ENDLOOP.

READ TABLE ref->* ASSIGNING FIELD-SYMBOL(<line>) " now possible
WITH KEY (dyn_key) = value.

It also makes it possible to directly dereference a reference and apply a table selector.
DATA itab_ref TYPE REF TO data.
...
itab_ref->*[ (dyn_key) = key_value ] = value.

The same mechanism has been applied to internal table functions like LINES:
DATA itab_ref TYPE REF TO data.
...
IF lines( itab_ref->* ) > 0.
...
ENDIF.

If ITAB_REF does not point to an internal table at runtime, the new runtime error ITAB_ILLEGAL_OPERAND occurs.

There is however a serious limitation to this. You can still not access variables of type DATA or ANY by index. You will still need a field-symbol of type INDEX TABLE.
DATA itab_ref TYPE REF TO data.
...
itab_ref->*[ 1 ] = value. " syntax error
itab_ref->*[ (dyn_key) = key_value ] = value. " ok
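For index access, the classic detour mentioned above could look like the following sketch. It assumes the dictionary table type STRING_TABLE is available in your system; the names <index_tab> and <first> are chosen for illustration only.

```abap
DATA itab_ref TYPE REF TO data.
FIELD-SYMBOLS <index_tab> TYPE INDEX TABLE.

" Anonymous standard table of strings behind a generic reference.
itab_ref = NEW string_table( ( `first` ) ( `second` ) ).

" Index access still needs a field-symbol typed as INDEX TABLE.
ASSIGN itab_ref->* TO <index_tab>.
IF sy-subrc = 0.
  READ TABLE <index_tab> INDEX 1 ASSIGNING FIELD-SYMBOL(<first>).
  IF sy-subrc = 0.
    <first> = `changed`.
  ENDIF.
ENDIF.
```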

4. Introducing Dynamic Reference Expressions


The previous paragraphs are only one part of the solution. What if the target of your reference is a structure type, which you do not know exactly at compile time? How to access the individual components of the structure?

In the past you would have done something like this:
DATA struct_ref TYPE REF TO data.
...
ASSIGN struct_ref->* TO FIELD-SYMBOL(<fs>).
IF sy-subrc <> 0.
" error handling 1
ENDIF.
ASSIGN COMPONENT 'COMP' OF STRUCTURE <fs> TO FIELD-SYMBOL(<fs2>).
IF sy-subrc <> 0.
" error handling 2
ENDIF.
<fs2> = value.

Some more knowledgeable colleagues even know the completely dynamic ASSIGN, where you could do this all in one step:
DATA struct_ref TYPE REF TO data.
...
ASSIGN ('STRUCT_REF->COMP') TO FIELD-SYMBOL(<fs>).
IF sy-subrc <> 0.
" error handling
ENDIF.
<fs> = value.

This has of course serious drawbacks:

  • You can not do this inside expressions

  • Everything is dynamic. If you change the name of the variable STRUCT_REF, you will only know at runtime if there is an error

  • ASSIGN is dangerous, because of the sy-subrc error handling you might forget

  • You need many lines of code


There is also a little-known variant of ASSIGN you could use:
DATA struct_ref TYPE REF TO data.
...
ASSIGN STRUCT_REF->('COMP') TO FIELD-SYMBOL(<fs>).
IF sy-subrc <> 0.
" error handling
ENDIF.
<fs> = value.

We now decided that this gives a good hint for a new kind of ABAP expression, which you can use in many places in ABAP platform 2021. You can now write:
DATA foo TYPE REF TO data.
DATA comp_name TYPE string VALUE `comp`.
...
my_object->meth( foo->(comp_name) ).

You can use this new kind of expression in most places where you can use expressions and generically typed variables.

But what to do, if the component is again a reference to another structure or a reference to a simple type? You have two possibilities.

  • The component name can be an arbitrary assign expression, like: COMP->* or
    COMP-COMP2 or COMP->COMP2

  • You can use chaining on this new kind of expression.


DATA foo TYPE REF TO data.
...
" assign expression:
my_object->meth( foo->('comp1->comp2->*') ).

" assign expression with structures:
my_object->meth( foo->('comp1-comp2->*') ).

" new kind of daisy-chaining:
my_object->meth( foo->('comp1')->('comp2')->* ).

Of course you always get nice exceptions if the components do not exist or are not assigned.

No sy-subrc is set, of course!
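To try the two variants yourself, you need a reference chain behind FOO. The following is a minimal sketch under stated assumptions: the types ty_inner and ty_outer are invented for illustration, and the standard method cl_abap_typedescr=>describe_by_data merely stands in for any method with a generically typed importing parameter.

```abap
TYPES: BEGIN OF ty_inner,
         comp2 TYPE REF TO i,
       END OF ty_inner,
       BEGIN OF ty_outer,
         comp1 TYPE REF TO ty_inner,
       END OF ty_outer.

DATA foo TYPE REF TO data.

" Build the chain: foo -> ty_outer, comp1 -> ty_inner, comp2 -> integer.
foo = NEW ty_outer( comp1 = NEW ty_inner( comp2 = NEW i( 42 ) ) ).

" Variant 1: one dynamic component specification with an assign expression.
DATA(descr1) = cl_abap_typedescr=>describe_by_data( foo->('COMP1->COMP2->*') ).

" Variant 2: daisy-chaining of dynamic reference expressions.
DATA(descr2) = cl_abap_typedescr=>describe_by_data( foo->('COMP1')->('COMP2')->* ).
```

Both calls are meant to pass the same integer at the end of the chain to the generically typed parameter.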

The last example with daisy-chaining does not work with structures yet. So you can't write:
DATA foo TYPE REF TO data.
...
my_object->meth( foo->('COMP')-('COMP2') ).

This might be a possible improvement in later ABAP releases.

Of course this new feature makes most sense, if you do not know the target type exactly. If you know the target type exactly at compile time, you can always do a:
DATA foo TYPE REF TO data.
...
CAST concrete_type( foo )->comp = 5.

This makes of course even more sense, if you need to access many components of the structure in one method.

So the rule of thumb would be:

  • If you know the type of the reference exactly at compile time and need to access multiple components, you should opt for CAST #( ).

  • If you don't know the type of the reference exactly, but only know that the target type contains a column named BLA, use REF->('BLA')

  • If you just need a single component from REF but you know the target type of REF statically, both methods are possible. It depends on the context if you should prefer REF->('BLA') or
    CAST concrete_type( ref )->bla. I would probably use the CAST #( ) more often in this case, as you will get syntax errors when the component is deleted or renamed in the original structure. This also enables "where used" functionality. Of course you will get a stronger coupling to the specific type in this case, which might not be desired in all cases.
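The rule of thumb can be illustrated side by side. This is only a sketch: the structure type ty_struct with its component BLA is invented for the example, and cl_abap_typedescr=>describe_by_data again just serves as a consumer with a generically typed parameter.

```abap
TYPES: BEGIN OF ty_struct,
         bla TYPE i,
       END OF ty_struct.

DATA ref TYPE REF TO data.
ref = NEW ty_struct( bla = 1 ).

" Type known at compile time: cast once, then use static component access.
" Renaming or deleting BLA leads to a syntax error here.
DATA(typed_ref) = CAST ty_struct( ref ).
typed_ref->bla = 5.

" Only the component name is known: dynamic reference expression.
" A wrong name is only detected at runtime.
DATA(comp_name) = `BLA`.
DATA(descr) = cl_abap_typedescr=>describe_by_data( ref->(comp_name) ).
```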


5. Dereferencing fully generic types


In the past you could only dereference variables which are explicitly typed as references. But to allow daisy-chaining, this restriction had to be lifted. Here is why:
DATA foo TYPE REF TO data.
...
FOO->(COMP)->(COMP2) = 5.
" ^__________ the result of FOO->(COMP) is *no* reference,
" but of type DATA.

It is now also possible to dereference completely generic types. An exception is thrown at runtime, if no reference is assigned to the variable.
METHODS foo IMPORTING value TYPE data.
...
METHOD foo.
value->* = 5. " possible, runtime error if value is not a reference.
ENDMETHOD.

From a language-theoretic point of view, the principle is the following: If something cannot be checked at compile time but could be valid at runtime, we should not disallow it, but postpone the check to runtime.

In ASSIGN, it was always possible to dereference a fully generic variable. But this is not really any safer:
METHODS foo IMPORTING value TYPE data.
...
METHOD foo.
FIELD-SYMBOLS <fs> TYPE REF TO i.
ASSIGN value->* TO <fs>.
IF sy-subrc <> 0.
... " error handling
ENDIF.
ENDMETHOD.

6. Performance


Many people ask about performance when new ABAP expressions are introduced.

The new ABAP expressions described in this document have a very small performance penalty in comparison with their non-generic counterparts. The following holds regarding performance:

  • The new expressions are at least as fast as the ASSIGN command. If you do proper error handling in ASSIGN, the new expressions will be faster.

  • When you use the same expression many times in one ABAP method, using the old ASSIGN with a field symbol and continuously using that single field symbol is still a bit faster.


Both assertions are not new: they apply (in a similar way) to nearly every other kind of ABAP expression. Also be aware regarding performance: measure, don't guess! Prefer the kind of coding which is cleanest. Resort to less clean coding only if you actually have performance problems.

This immediately leads to another question: What is faster?
object->meth( BLA->('BLUB->BLOB->*') )

vs.

object->meth( BLA->('BLUB')->('BLOB')->* )

i.e. old style ASSIGN-expressions vs. the new kind of daisy chaining.

The answer, of course, depends. Generally we assume that old style ASSIGN expressions are a very very tiny bit faster if the chain is short. This (nearly unmeasurable) performance benefit should diminish for longer chains.

If you need ABAP coding to produce the string 'BLUB->BLOB->*', then the new daisy chaining will have an advantage. So it might depend on the context which of the two variants will perform better. But here, too, we would stick to the coding which is clearer and more easily understandable.

7. Outlook


The new expressions provide an easier way to handle fully generic variables or references in ABAP. They can be used in expressions, throw exceptions and do not set any sy-subrc. They can be combined to form even more powerful constructs.

As of ABAP platform 2021 still not every combination of the expressions is possible at every position in the ABAP coding. This leaves room for (possible) improvements in later releases.

What additional stuff does not work at the moment?

7.1. Possible Improvement: Daisy-Chaining in ASSIGN for "simple" variables


Using chaining of generic reference component access in ASSIGN does not work at the moment, i.e. the following does not work yet:
DATA foo TYPE REF TO data.
...
" Syntax error in ABAP platform 2021:
ASSIGN foo->(comp_name)->* TO FIELD-SYMBOL(<fs>).

This only works outside of ASSIGN. The reason is the sy-subrc semantics, as inside ASSIGN the sy-subrc must be set and no exception should be thrown.

Due to implementation choices in the original ASSIGN implementation, it does work when using a table selector though:
DATA itab TYPE TABLE OF REF TO data.
...
"No syntax error in ABAP platform 2021:
ASSIGN itab[ 1 ]->('BLA')->* TO FIELD-SYMBOL(<fs>).

The latter sets sy-subrc to 4 if the itab is empty, but will throw a RABAX if there is no component BLA or references are not assigned. This is in sync with the behaviour in previous ABAP releases.

7.2. Possible Improvement: Dynamic access to structure components


At the moment the new dynamic expressions can only be used when the starting variable is a reference. If the starting variable is of type DATA or ANY or of structure type and points to a variable of structure type at runtime they provide no benefit.

The following would be imaginable:
METHODS foo IMPORTING value TYPE data.
...
METHOD foo.
value-(comp_name) = 5. " Syntax error in ABAP platform 2021
value->(comp_name)-(struc_comp) = 5. " Syntax error in ABAP platform 2021
ENDMETHOD.

This could also be a possible future improvement.

8. Résumé


In this article we described how you:

  • can write better code when using generic data references by using the arrow operator ->* directly inside expressions

  • use the new dynamic reference expressions to access components of generic data or object references


Usage of these new tools will make generic coding ready for the 21st century and lead to much shorter, more concise code with fewer field symbols.

Please give your feedback in the comments below. The ABAP language is strongly influenced by your input.

Please ask questions also in the corresponding Q&A forums.
16 Comments
oberon_ntpl
Explorer
data ref1 type ref to data.
data ref2 type ref to some_specific_type.
move-corresponding ref1->* to ref2->*.

Does this also become possible?

Also I really hope, in case of structured types, there will be no mass excitement about the fancier way to do things wrong.
If you already know the exact type to cast to, please, cast first and then use ordinary references like ref_of_spec_type->field.

kilian_kilger
Active Participant

For the move-corresponding: Yes, this is also possible.

For the other suggestion: good point! If you know the correct type to cast to, you can always use the CAST #( ) - operator. I adjusted the blog post to include your recommendation.

The generic reference expressions are mostly useful for generic coding where you do not know the target type exactly, say coding which processes multiple similar but different types. These occur 
very often in framework coding nowadays, especially with RAP.

It is no replacement for CAST #( ), but a replacement for the cases where you used ASSIGN and field-symbols beforehand. 

JPT
Participant

Really nice job. Thanks.

  • If you just need a single component from REF but you know the target type of REF statically, both methods are possible. It depends on the context if you should prefer REF->(‘BLA’) or
    CAST concrete_type( ref )->bla. I would probably use the CAST #( ) more often in this case.

Also assume that with cast you also get the cross-reference link to the element whereas when you would use the new way with ref->('bla') that there is no cross-reference.

kilian_kilger
Active Participant
Correct. You also get an error message, when the component "COMP" is deleted or renamed in the original structure.
Daniil
Active Contributor

Hi Kilian, since it is “Get rid of field-symbols!” maybe it would be possible to enhance CORRESPONDING operator, to make it possible adjust some fields of the table.

Here is an example:

Data definition and initialization

TYPES:
BEGIN OF struct1,
col1 TYPE i,
END OF struct1,

BEGIN OF struct2.
INCLUDE TYPE struct1.
TYPES col2 TYPE string.
TYPES END OF struct2.
DATA:
itab1 TYPE STANDARD TABLE OF struct1 WITH EMPTY KEY,
itab2 TYPE STANDARD TABLE OF struct2 WITH EMPTY KEY.

itab1 = VALUE #( ( col1 = 1 ) ( col1 = 2 ) ( col1 = 3 ) ).

So if I need to fill bigger table, now I have to do something like this:

" From time to time it is required to add some constant / or to replace something with other value.
itab2 = VALUE #( FOR <fs> IN itab1 ( col1 = <fs>-col1 col2 = 'ABAP' ) ).

"OR if big structure is used, nobody wants to write all the fields in FOR / IN operator.
itab2 = CORRESPONDING #( itab1 ).
LOOP AT itab2 ASSIGNING FIELD-SYMBOL(<fsi2>).
<fsi2>-col2 = 'ABAP'.
ENDLOOP.
" One more option is available, but it is too long from my PoV

How cool would it be to use something like this:

itab2 = CORRESPONDING #( itab1 CONSTANTS col2 = 'ABAP' ).
" OR Using
itab2 = CORRESPONDING #( itab1 USING col2 = 'ABAP' ).

 

BR,

Daniil

 

JPT
Participant

Hey d.m.

One possible way to do this is:

itab2 = value #( for <x> in itab1 ( value #( base CORRESPONDING #( <x> ) col2 = 'ABAP' ) ) ).

Daniil
Active Contributor
Hi Jan Pascal Tschudy,

thanks for your hint, yes this one is too long, and it is a little bit hard to read. And as soon as variables are named more descriptively you easily get over 100 characters. The old way (with LOOP and APPEND) seems more readable to me in such a case.
hatrigt
Participant
Really happy to see where ABAP is going forward in the future. This really eases off my work on dynamic programming. Thank you for bringing such a nice blog.
kilian_kilger
Active Participant
This is exactly the right way to do it.
Prasenjitsbist
Participant

Well everything that was nice is now ugly and everything new that looks insane is pretty. Where is ABAP going ? There is something called readable code the new syntax changes are mostly ridiculous and worthless except a few that are welcome.

Save lines and make code junk and unreadable well then SAP you are doing a great job .

sh4il3sh
Participant
I feel like stabbing myself everytime I write ASSIGN COMPONENT with inline/expression based codes.
What a beautiful blog, made me feel I know nothing.

I could even pass lra_tab->* to the exporting param of a FM. wow!
MichiFr
Participant
Slightly OT, nevertheless an almost daily requirement: I wonder if it will be possible one day to use expressions like:
IF LINE_EXISTS( GET_PARTNERS( )[ PARVW = 'AG' ] ).

Currently this and similar expressions are not possible and would mean to use an intermediate variable.
Timo_John
Active Participant
I like that expression!
Again and again, direct access to tables resulting from a method would be great:

 

 Get_parameters( )[ 'AG' ]-partnerNumber 

 

I would use that a lot.

On the other hand, you often have to ask yourself if it would not be beneficial to create new methods like:

if partnerExists( 'AG' ) . 

 

which will of course help with reusability and readability.
MichiFr
Participant
The latter expression looks more like a functional method, which I'm using a lot (as a static method, for example).

Which means in turn to pass the parameter to the method, do the appropriate calculation inside, and return ABAP_TRUE or ABAP_FALSE.

That said, if your return parameter is of type ABAP_BOOL, you can immediately use that method's return value in an IF expression like the one you mentioned.
shais
Participant
If we are already talking about improvements,

are there any plans for optional chaining/Null-conditional operator, like in other languages ("?." in Javascript, for example).

This should save some time (and lines of code) for handling of unbound values.

e.g.
DATA(text) = columns->get_column('MYCOLUMN')?->get_text( )

instead of
DATA(column) = columns->get_column('MYCOLUMN').
IF column IS BOUND.
DATA(text) = column->get_text( ).
ENDIF.

This is relevant also for access of internal table's record, like
columns[ fieldname = 'MYCOLUMN' ]?-text
pokrakam
Active Contributor

This is all great, but one thing I'm really missing is to do similar things with objects, especially dynamic casting.

Dynamic objects types can only be declared as TYPE REF TO OBJECT. But this causes an issue when passing as a parameter as ABAP fails it even if it is the correct type:

 

CLASS lcl DEFINITION.
  PUBLIC SECTION.
    METHODS run IMPORTING o TYPE REF TO lcl.
...
"Caller:
    DATA o TYPE REF TO object.
    CREATE OBJECT o TYPE lcl.
    CALL METHOD obj->('RUN')
      EXPORTING
        o = o.   "Fail

 

So we need a dynamic cast.

Interfaces might help, but type ref to <interface> cannot be created dynamically as objects, only as TYPE REF TO DATA.

I've managed to cast a generic object to a field symbol which has the correct type in the debugger, but cannot be used to call methods (error message you cannot use field symbols as object references).

Example: 

 

DATA obj TYPE REF TO object.
DATA iref TYPE REF TO data.
FIELD-SYMBOLS <iref> TYPE any.
CREATE DATA iref TYPE REF TO ('LIF').
ASSIGN iref->* TO <iref>.   "Yay, <iref> is type Ref to LIF
CREATE OBJECT obj TYPE ('LCL').
<iref> ?= obj.  "Correct type, but can't call methods of field symbols
data obj2 type ref to object.
obj2 = <iref>.  "obj2 has type ref to LCL 

 

 
