
Character conversion: Unicode to non-Unicode

Nov 02, 2017 at 06:55 AM



There are a lot of posts on this subject, but none of them makes clear to me how it works.

Data in SAP is maintained in Unicode format. But some external partners require the data to be converted into a different non-Unicode character set.


In SAP we have a character ß = C39F.

Our interface partner expects character set CP437. In this character set the code for character ß = E1.

So I expect SAP Unicode character C39F to be converted into character E1.

How can I accomplish this?

Regards Jack

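DATA: lo_xml            TYPE REF TO if_ixml,
      lo_encoding       TYPE REF TO if_ixml_encoding,
      lo_document       TYPE REF TO if_ixml_document,
      lo_stream_factory TYPE REF TO if_ixml_stream_factory,
      lo_stream         TYPE REF TO if_ixml_ostream,
      lo_renderer       TYPE REF TO if_ixml_renderer,
      lo_element        TYPE REF TO if_ixml_element,
      lv_data           TYPE xstring.

" Request CP437 as the output encoding of the rendered XML
lo_xml            = cl_ixml=>create( ).
lo_encoding       = lo_xml->create_encoding( byte_order = 0 character_set = 'CP437' ).
lo_document       = lo_xml->create_document( ).
lo_stream_factory = lo_xml->create_stream_factory( ).
lo_stream         = lo_stream_factory->create_ostream_xstring( lv_data ).
lo_stream->set_encoding( lo_encoding ).
lo_renderer       = lo_xml->create_renderer( ostream = lo_stream document = lo_document  ).
" Build a document with one element whose attribute contains the test characters
lo_element        = lo_document->create_element_ns( `data` ).
lo_element->set_attribute_ns( name = 'attribute' value = `AAAAßßßß` ).
lo_document->append_child( lo_element ).
" Render into lv_data using the encoding set on the stream
lo_renderer->render( ).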

2 Answers

Best Answer
Sandra Rossi Nov 02, 2017 at 07:43 AM

Your code is correct, but a few code pages are not fully described in table TCP00A (Relationship between standardized name and SAP code page number). For instance, it would work with code page "iso-8859-1".

To complete TCP00A, see the GB18030 example in SAP note 1901768 (code page undefined for gb18030).

Go to transaction SCP. Enter code page 1107 (SAP code page for CP437).

Change the code page.

Add attribute H 0001 CP437 and SAVE.

Try your program again.


EDIT: the Eszett character is the Unicode character U+00DF; C39F is its encoding in UTF-8. When you specify an encoding that is invalid or unknown (in TCP00A), it is ignored, so rendering to an XSTRING variable uses UTF-8.
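To make the difference between the code point and its encodings concrete, here is a minimal sketch in Python rather than ABAP. Python's built-in `utf-8` and `cp437` codecs are used only as a stand-in for the SAP code page conversion, on the assumption that they describe the same character sets.

```python
# Illustrative only: Python's built-in codecs stand in for the SAP conversion.
eszett = "\u00df"  # the character ß, Unicode code point U+00DF

utf8_bytes = eszett.encode("utf-8")    # UTF-8 needs two bytes for ß
cp437_bytes = eszett.encode("cp437")   # CP437 needs a single byte

print(f"U+{ord(eszett):04X}")          # U+00DF
print(utf8_bytes.hex().upper())        # C39F
print(cp437_bytes.hex().upper())       # E1
```

So C39F and E1 are two byte representations of the same character U+00DF, which is why the conversion has to know the target code page.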

J. Graus Nov 02, 2017 at 08:29 AM

That solves the problem.

That also explains to me the missing link between code page and character set.

Most characters are now correctly converted into the destination character set. Characters that do not exist there are converted into the replacement character, which is '#' by default. That is all fine.
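The replacement behaviour described here can be imitated outside SAP. The following is a rough Python sketch, not the SAP mechanism: the handler name `hash_replace` is hypothetical, and Python's `cp437` codec is assumed to match the partner's character set.

```python
import codecs


def hash_replace(err):
    """Encoding error handler: emit '#' for every character that cannot be encoded."""
    if isinstance(err, UnicodeEncodeError):
        return ("#" * (err.end - err.start), err.end)
    raise err


# 'hash_replace' is a made-up name for this sketch, not a built-in handler.
codecs.register_error("hash_replace", hash_replace)

# ß exists in CP437 (byte E1); uppercase Ë does not, so it becomes '#'.
encoded = "AAAAßßßßË".encode("cp437", errors="hash_replace")
print(encoded)
```

Python's own `errors="replace"` would emit `?` instead, which is why a custom handler is registered to mirror the '#' default mentioned above.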

A further conversion in the 'nice to have' category would be for the character Ë. It is not available in the destination character set, so it gets converted into #. It would be nice to have it converted into E instead. Do you know whether this is possible?

Thanks and regards
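One general way to get E instead of # for Ë is to strip the accent before giving up on a character. The sketch below shows the idea in Python, not ABAP; the function `to_cp437` is hypothetical, and whether the SAP conversion offers an equivalent option is not established in this thread. A comparable pre-processing step could be applied to the text before it is rendered.

```python
import unicodedata


def to_cp437(text: str, replacement: bytes = b"#") -> bytes:
    """Encode to CP437, falling back to the unaccented base letter, then to '#'."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode("cp437")  # character exists in CP437
            continue
        except UnicodeEncodeError:
            pass
        # Decompose (Ë -> E + combining diaeresis) and drop the combining marks.
        base = "".join(
            c for c in unicodedata.normalize("NFD", ch)
            if not unicodedata.combining(c)
        )
        try:
            out += base.encode("cp437") if base else replacement
        except UnicodeEncodeError:
            out += replacement  # still not representable
    return bytes(out)


print(to_cp437("Ëß€"))
```

Characters that CP437 already contains, such as ß or lowercase ë, keep their own byte; only unrepresentable characters are reduced to their base letter, and anything without a usable base letter (for example €) still becomes '#'.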
