RFC Call - Parameter value truncated

matt
Active Contributor

Edited, with duplicable code.

I have two RFC function modules:

FUNCTION Z_MATT_TEST1.
*"----------------------------------------------------------------------
*"*"Local Interface:
*"----------------------------------------------------------------------

ENDFUNCTION.

and

FUNCTION Z_MATT_TEST2.
*"----------------------------------------------------------------------
*"*"Local Interface:
*" IMPORTING
*" VALUE(I_REASON) TYPE STRING
*"----------------------------------------------------------------------


ENDFUNCTION.

and a report that calls both:

REPORT ymabitest2.


CALL FUNCTION 'Z_MATT_TEST1'
DESTINATION 'OTHER_SYSTEM'.

CALL FUNCTION 'Z_MATT_TEST2'
DESTINATION 'OTHER_SYSTEM'
EXPORTING
i_reason = 'The reason'.


These FMs exist in OTHER_SYSTEM, and Z_MATT_TEST2 contains an endless loop so that I can debug it:


DATA x.
WHILE x IS INITIAL.
ENDWHILE.

When I run ymabitest2, in debug on OTHER_SYSTEM, I can see that i_reason contains 'T'.

If I comment out the call to Z_MATT_TEST1, in debug on OTHER_SYSTEM, I can see that i_reason contains 'The reason'.

This only happens if

a) Both FMs are called

b) i_reason is supplied with a type C variable or literal. If it is supplied with a string literal or variable, it's fine.

1 ACCEPTED SOLUTION

Ulrich_Schmidt
Product and Topic Expert

The reason for this is the following:

The sending system does not know the datatype of the parameter I_REASON in the receiving system. Therefore it simply uses the datatype of the value provided in the CALL FUNCTION statement.

In the CALL FUNCTION statement, you are passing a literal in single quotes, which is of type CHAR (two-byte Unicode). Therefore the kernel of the sending system sends the value as a Unicode CHAR type, which has the binary representation

T     h     e           r     e     a     s     o     n
54 00 68 00 65 00 20 00 72 00 65 00 61 00 73 00 6F 00 6E 00

The kernel of the receiving system, however, knows that the parameter is of type STRING (zero-terminated UTF-8). So it takes the first byte of the Unicode representation of T (54 00) and maps it to the UTF-8 representation of T (54). Then it reads the second byte, sees that it is 00, and takes this for the zero termination of the value, so it stops. The remaining bytes are discarded (as usual in the RFC protocol).

If in your report ymabitest2 you use a temporary variable of type STRING and pass that into the CALL FUNCTION statement,

DATA temp TYPE STRING.
temp = 'The reason'. " <-- This statement converts the string literal from
" Unicode CHAR to UTF-8 STRING!

CALL FUNCTION 'Z_MATT_TEST2'
DESTINATION 'OTHER_SYSTEM'
EXPORTING
i_reason = temp.

then the sending kernel uses the datatype of temp for sending the data, which is zero-terminated UTF-8:

T  h  e     r  e  a  s  o  n
54 68 65 20 72 65 61 73 6F 6E 00

The receiving kernel also uses zero-terminated UTF-8 for interpreting the data, and everything is fine...
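
The two byte sequences above can be replayed outside of SAP. The following is only a rough sketch in Python, not the kernel's actual code: the helper read_zero_terminated_utf8 is a made-up name, and it assumes the receiver simply stops at the first 00 byte and decodes what came before as UTF-8, as described in this answer.

```python
def read_zero_terminated_utf8(wire: bytes) -> str:
    """Hypothetical receiver: decode everything before the first 0x00 as UTF-8."""
    end = wire.find(b"\x00")
    payload = wire if end == -1 else wire[:end]
    return payload.decode("utf-8")


value = "The reason"

# Sender passes a CHAR value: two bytes per character (UTF-16, little-endian).
as_char = value.encode("utf-16-le")

# Sender passes a STRING value: UTF-8 plus a terminating zero byte.
as_string = value.encode("utf-8") + b"\x00"

print(as_char.hex(" "))                      # hex dump of the CHAR payload
print(repr(read_zero_terminated_utf8(as_char)))
print(repr(read_zero_terminated_utf8(as_string)))
```

Under these assumptions the CHAR payload is cut off after the first character, while the STRING payload arrives complete.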

However, I have absolutely no explanation for why the effect only happens if you call the empty function module first?! Looks like a bug to me.

PS: ok, during lunch break, the explanation for the second symptom suddenly occurred to me...:

The first function call on a freshly opened RFC connection is always performed with codepage 1100 (ISO-Latin-1), because at that time, the sending system does not yet know the partner's codepage, and codepage 1100 is supported by every SAP system, even old non-Unicode systems. Then during this first call, the RFC handshake is performed, and the two partners exchange each other's codepage information, and for all following calls the "best fitting" codepage is used, which is normally Unicode these days.

So when you call Z_MATT_TEST2 as the very first FM, the data for I_REASON is sent in ISO-Latin-1, which for ASCII characters is almost identical to UTF-8, except for the missing terminating zero:

T  h  e     r  e  a  s  o  n
54 68 65 20 72 65 61 73 6F 6E

The receiving system again interprets it as UTF-8 (and then probably adds the missing zero at the end), so everything looks fine.

However, when that empty FM is called as the first FM, the handshake takes place already here, the connection is switched from codepage 1100 to 4102/4103 (Unicode), and the second call on that connection is then performed with Unicode as codepage, leading to the problem/mismatch as described above... 🙂
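
To make the order dependence easier to follow, here is a toy model of the connection state in Python. It is a sketch under assumptions, not the RFC implementation: RfcConnectionModel and receive_as_string are invented names, Latin-1 stands in for codepage 1100, UTF-16LE stands in for 4103, and the switch is assumed to happen as soon as the first call has been made.

```python
from typing import Optional


class RfcConnectionModel:
    """Toy model: Latin-1 for the first call, UTF-16LE for all later calls."""

    def __init__(self) -> None:
        self.codepage = "latin-1"  # stands in for SAP codepage 1100

    def call(self, char_param: Optional[str] = None) -> bytes:
        # Encode the CHAR parameter (if any) in the codepage currently in use.
        wire = b"" if char_param is None else char_param.encode(self.codepage)
        # Assumption: the handshake during the first call switches to Unicode.
        self.codepage = "utf-16-le"  # stands in for SAP codepage 4103
        return wire


def receive_as_string(wire: bytes) -> str:
    """Hypothetical receiver: read up to the first zero byte as UTF-8."""
    return wire.split(b"\x00", 1)[0].decode("utf-8")


# Scenario 1: Z_MATT_TEST2 is the very first call on the connection.
only_test2 = RfcConnectionModel()
first = receive_as_string(only_test2.call("The reason"))

# Scenario 2: the empty Z_MATT_TEST1 is called first, then Z_MATT_TEST2.
both = RfcConnectionModel()
both.call()
second = receive_as_string(both.call("The reason"))

print(repr(first), repr(second))
```

In this model the first scenario delivers the full text and the second delivers only the first character, which matches what was seen in the debugger.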

22 REPLIES

FredericGirod
Active Contributor

Could it be the variable already contains T before the call ? (global memory of the function group)

matt
Active Contributor

After a little more experimenting, I've found if "The reason" is a string, rather than a character, it works fine.

But why there is different behaviour if one FM is called rather than two seems very very strange.

matt
Active Contributor

frdric.girod No global variables are in use.

In any case, the FM calls are supposed to be asynchronous, aren't they?

Sandra_Rossi
Active Contributor

I can reproduce the behavior on my 7.52 system, calling destination 'NONE', but there it is consistent: it happens with one call as well as with two.

matt
Active Contributor

sandra.rossi

I thought you might try it out! I've updated the question with real code. The strange thing is that if I comment out the first call, the character data is complete.

matt
Active Contributor

The take-home from this is:

If you call an RFC FM with a string parameter, make sure the value is a string, not a character type.


Thanks! Do you mean that the RFC information is not considered for STRING type, UTF-8 is always used?

Yes, ABAP type STRING is always UTF-8, in the ABAP runtime as well as in RFC, independent of the current codepage (or the target system codepage).

And so, just for fun, if the RFC destination is defined with code page 4102 (UTF-16BE), I guess the remote server would get 0 character instead of 1 😄 (if sending a type C literal "The reason" and the remote server's function is defined with type STRING).

T     h     e           r     e     a     s     o     n
00 54 00 68 00 65 00 20 00 72 00 65 00 61 00 73 00 6F 00 6E
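
That guess can be checked against the same simplified picture of the receiver (stop at the first 00 byte). This is only a sketch: chars_before_first_zero is a hypothetical helper, and whether a real system behaves this way with a 4102 destination has not been tested here.

```python
value = "The reason"

little_endian = value.encode("utf-16-le")  # 54 00 68 00 ... (like 4103)
big_endian = value.encode("utf-16-be")     # 00 54 00 68 ... (like 4102)


def chars_before_first_zero(wire: bytes) -> int:
    """Count the bytes a zero-terminated reader would accept."""
    return len(wire.split(b"\x00", 1)[0])


print(chars_before_first_zero(little_endian))
print(chars_before_first_zero(big_endian))
```

With little-endian bytes one character survives; with big-endian bytes the very first byte is already 00, so none does.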

matt
Active Contributor

And how was I supposed to figure that out!

😄

More to the point - how did you figure it out? Amazing! Thank you.

> More to the point - how did you figure it out? Amazing! Thank you.

I first stumbled into this "behavior" in 2007, when writing sample and test programs and documentation for the NW RFC SDK. If you activate the RFC trace, the bytes that the two communication partners exchange over the network are written to the trace in hexdump form. If you then know how the RFC protocol works on the binary level, you can deduce from this trace exactly which bytes are sent for which FM parameter, and draw the right conclusions.

But it certainly helps if you have spent the last 25 years working on tools that interpret the incoming RFC data stream from ABAP and convert it into data types of other programming languages (like C, Java and Microsoft .NET)... 🙂


Wow, great description of the issue!

-- Tomas --

matt
Active Contributor

Well, thank you very much for sharing your knowledge. I didn't think of activating the RFC trace. I'll definitely bear that in mind if I encounter any other weird behaviour.


Just saw your additional comment about the RFC handshake with codepage 1100. In my test on 7.52 with RFC destination 'NONE', the parameter was already truncated in the first call, so I guess it didn't use codepage 1100 but my Unicode codepage (which is 4203). So there must be another condition (maybe simply that 'NONE' is the same server, so there is no need to use codepage 1100).

Yes, I would guess the two exceptions to this are NONE (which is already "known") and the case where you select "Non-Unicode" with an explicit codepage in SM59. Even for Unicode destinations, the handshake is necessary, because the sending system doesn't know yet whether the other side is little-endian (4103) or big-endian (4102)...


Thanks Ulrich. It would be fun (again) to test what happens during the handshake if the argument is of type STRING and contains a null character: would it be truncated, so that only "A" is received? 😉

DATA(null) = cl_abap_char_utilities=>minchar.
DATA(reason) = |A{ null }reason|.

I'm pretty sure it will work well but who knows. I'll test it this afternoon.

Ulrich_Schmidt
Product and Topic Expert

As far as I know, there is no "string literal" in the ABAP language. Every literal is automatically type CHAR?! So the workaround with the temporary variable described below, is the only solution for this problem that I know.

Ulrich_Schmidt
Product and Topic Expert

> These FM exist in OTHER_SYSTEM but Z_MATT_TEST2 has an endless loop so I can debug

You can also debug by setting an "external breakpoint" in the receiving system. No need for an endless loop. (Of course the FM needs to have at least one statement, where you can set the breakpoint...)

matt
Active Contributor

ulrich.schmidt I know... but then I'd have to change the user ID in the RFC destination to a dialog user. I was just too lazy. 😄

matt
Active Contributor

When I used

 `this is a string literal` 

it worked,

I think...

Maybe

|This would work too?|