Hi,
We did some tests regarding to UTF-8 and UTF-16 encoding using file adapters. Our conclusion so far is (when using Windows OS):
1. Inbound adapter can handle UTF-8 and UTF-16 correctly, but do not specify the encoding!
2. XI mappings will set the XML encoding to UTF-8 correctly when sending an UTF-16 file to XI.
3. Outbound adapter can only handle UTF-8 (and US-ACSII and ISO-8859-1) correctly.
The exact test results are:
>>Outbound file adapter bug.
If no encoding is specified in the outbound file adapter, UTF-8 and UTF-16 are handled correctly. However if the encoding is set to UTF-16, XI mapping will fail with the error:
During the application mapping com/sap/xi/tf/_CHRIS_OUTBOUND_TO_INBOUND_ a com.sap.aii.utilxi.misc.api.BaseRuntimeException was thrown: Fatal Error: com.sap.engine.lib.xml.parser.Parser~
Part of the trace:
com.sap.aii.ibrun.server.mapping.MappingRuntimeException: Runtime exception occurred during execution of application mapping program com/sap/xi/tf/_CHRIS_OUTBOUND_TO_INBOUND_: com.sap.aii.utilxi.misc.api.BaseRuntimeException; Fatal Error: com.sap.engine.lib.xml.parser.ParserException: XMLParser: No data allowed here: (hex) a0d, a0d, 6e3c(:main:, row:3, col:2) at com.sap.aii.ibrun.server.mapping.JavaMapping.executeStep(JavaMapping.java:72) at com.sap.aii.ibrun.server.mapping.Mapping.execute(Mapping.java:91) at com.sap.aii.ibrun.server.mapping.MappingHandler.run(MappingHandler.java:78) at com.sap.aii.ibrun.sbeans.mapping.MappingRequestHandler.handleMappingRequest
>>Inbound file adapter bug.
If the encoding of an inbound file adapter is set to UTF-16 everything works ok (except the XML encoding is not set correctly, but this may be a mapping issue and not an adapter issue). However the default UTF-16 encoding seems to be UTF-16BE, where I would expect UTF-16LE since this is the most commonly used encoding.
If the encoding UTF-16LE or UTF-16BE the characterset used in the message is correct, except the BOM of the file. The BOM is empty which means UTF-8 encoded file. Since the file is UTF-16BE or UTF-16LE encoded, this is wrong and the correct BOM should be added by the adapter.
Encodings like US-ASCII and ISO-8859-1 are handled correctly.
>>Mapping bug
When we send in a message encoded in UTF-8 and want to send it out as a UTF-16 encoded message, we need to set the XML encoding to UTF-16. Normally this is done by an XSLT mapping using the <xsl:output encoding=UTF-16/> command.
The UTF-8 message will get processed by the XSLT and any special character will be converted to its UTF-16 value. However the output message is not UTF-16 encoded (1 byte in-stead off 2 bytes).
When this 1 byte message is send to the inbound adapter (encoding is set to UTF-16) the message will be translated from 1 byte to 2 byte (UTF-8 to UTF-16). The characters that were converted from UTF-8 to UTF-16 will be read as single byte characters and will be converted again. This will result in an incorrect message with illegal characters.
So basically characters will be converted to UTF-16 2 times, which is incorrect.
Maybe someone can confirm this on another XI system (maybe different OS). If you need test files or mapping, please let me know.
Kind regards,
Christiaan Schaake.