How to determine if source file is Unicode or non-Unicode in ABAP

kimmo_sirpoma · 01-16-2008

Scenario: An ABAP program has to read a file on the application server. The source files come from various systems, some of which are Unicode systems and some of which are not. The system where the ABAP program runs is a Unicode-enabled SAP system.

Reading the file content with OPEN DATASET FOR INPUT IN TEXT MODE ENCODING DEFAULT and READ DATASET sometimes causes the runtime error "CX_SY_CONVERSION_CODEPAGE" at the READ DATASET statement, when the non-Unicode file contains "illegal" characters such as German umlauts.

I know how to avoid the runtime error by using different options of the OPEN DATASET statement (for example IN LEGACY TEXT MODE with CODE PAGE <codepage>), so hints of that kind are not necessary.

But I want to determine at runtime whether the source file is Unicode or non-Unicode. How can I do that?

I remember seeing sample source code somewhere in this forum, but I cannot find it anymore. The code I am looking for read the first 2 bytes of the input file and checked their hex values. I did find a sample program in this forum that checks the byte order mark of the file by looking for the hex value 'FFFE', but my programs never find such a byte order mark in the input files, even if the file was created on a Unicode system. As far as I remember, the sample code I am looking for checked whether the first byte was '00', which I think should be the case if the file is from a non-Unicode system. Depending on that check, OPEN DATASET was then performed either with FOR INPUT IN LEGACY TEXT MODE or with FOR INPUT IN TEXT MODE ENCODING DEFAULT.

Does anybody remember the sample program I am looking for, or could you provide sample code yourself?

br: Kimmo

uwe_schieferstein · ‎01-16-2008

Hello Kimmo

Have a look at the blog [Unicode File Handling in ABAP|https://www.sdn.sap.com/irj/sdn/weblogs?blog=/pub/wlg/2194] [original link is broken].

If class CL_ABAP_FILE_UTILITIES is available on your system, you can use its static method CHECK_FOR_BOM to determine the encoding as well as the endianness.

Regards,

Uwe
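
For systems where CL_ABAP_FILE_UTILITIES is not available, the byte order mark check discussed above can also be done by hand. The following is only a minimal sketch under stated assumptions, not the sample program Kimmo was looking for: the report name, the parameter P_FILE and the constant names are made up for illustration. It reads the first three bytes in binary mode and compares them with the standard BOM values (EF BB BF for UTF-8, FE FF for UTF-16 big endian, FF FE for UTF-16 little endian). Files written with ENCODING DEFAULT on a Unicode system are normally UTF-8 without a BOM, which would explain why 'FFFE' is never found, so when no BOM is present the sketch falls back to a trial read as UTF-8 and treats CX_SY_CONVERSION_CODEPAGE as a sign of a non-Unicode file.

```abap
REPORT z_detect_file_encoding.

" Hypothetical parameter: path of the file on the application server
PARAMETERS p_file TYPE c LENGTH 128 LOWER CASE.

" Standard byte order mark values
CONSTANTS: c_bom_utf8    TYPE x LENGTH 3 VALUE 'EFBBBF',
           c_bom_utf16be TYPE x LENGTH 2 VALUE 'FEFF',
           c_bom_utf16le TYPE x LENGTH 2 VALUE 'FFFE'.

DATA: lv_head TYPE x LENGTH 3,
      lv_line TYPE string.

START-OF-SELECTION.

  " Step 1: read the first three bytes without any conversion
  OPEN DATASET p_file FOR INPUT IN BINARY MODE.
  IF sy-subrc <> 0.
    WRITE: / 'File could not be opened.'.
    RETURN.
  ENDIF.
  READ DATASET p_file INTO lv_head MAXIMUM LENGTH 3.
  CLOSE DATASET p_file.

  " Step 2: compare with the known byte order marks
  IF lv_head = c_bom_utf8.
    WRITE: / 'UTF-8 byte order mark found.'.
  ELSEIF lv_head(2) = c_bom_utf16be.
    WRITE: / 'UTF-16 big endian byte order mark found.'.
  ELSEIF lv_head(2) = c_bom_utf16le.
    WRITE: / 'UTF-16 little endian byte order mark found.'.
  ELSE.
    " Step 3: no BOM, so try to read the whole file as UTF-8
    TRY.
        OPEN DATASET p_file FOR INPUT IN TEXT MODE ENCODING UTF-8.
        DO.
          READ DATASET p_file INTO lv_line.
          IF sy-subrc <> 0.
            EXIT.
          ENDIF.
        ENDDO.
        CLOSE DATASET p_file.
        WRITE: / 'No BOM, but the file reads cleanly as UTF-8.'.
      CATCH cx_sy_conversion_codepage.
        " Invalid UTF-8 bytes: most likely a non-Unicode code page
        CLOSE DATASET p_file.
        WRITE: / 'Not valid UTF-8, reopen IN LEGACY TEXT MODE.'.
    ENDTRY.
  ENDIF.
```

Note that a file containing only 7-bit ASCII characters reads cleanly in either mode, so the trial read cannot (and does not need to) tell the two cases apart for such files.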
