Former Member
Apr 05, 2011 at 12:18 PM

Message mapping : UDF parameter string type versus default UTF-8 encoding



I'm facing an issue with character encoding when using a UDF to transform a value into base64 encoding.

Having thought about it, I'm not 100% sure whether it's possible to get it to work correctly:

Given :

-The input XML is encoded in UTF-8 (and contains a special character)

-The UDF is generated with the Java parameter type 'String' (= a sequence of 16-bit Unicode characters)

Doubts :

-What is supposed to happen when the content of a node (from a message encoded in UTF-8) is used as input for the UDF's String parameter? Does PI automatically decode the node content into the internal 16-bit Unicode string on input, and encode it back correctly on output?

( I would assume yes )

-Is the default charset of the underlying JVM relevant? Or does PI always use explicit charsets when encoding/decoding?

( I would assume it's not relevant )
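To make the second doubt concrete, here is a minimal sketch in plain Java of where the JVM default charset can matter. It says nothing about what PI does internally; the class and method names are hypothetical and only illustrate the standard behaviour of `String.getBytes()`.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class CharsetDependenceSketch {

    // No charset argument: the result depends on the JVM default charset.
    static byte[] implicitBytes(String s) {
        return s.getBytes();
    }

    // Explicit charset: the result is the same on every JVM.
    static byte[] explicitBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String special = "\u00e9"; // e with acute accent
        System.out.println("JVM default charset: " + Charset.defaultCharset());
        System.out.println("implicit length: " + implicitBytes(special).length);
        System.out.println("explicit UTF-8 length: " + explicitBytes(special).length);
    }
}
```

If a UDF only ever passes Strings around, neither call is involved; the default charset could only come into play where custom code converts between Strings and bytes without naming a charset.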

The UDF Java code treats the string as an array of chars while processing it. It uses the methods .length() and .charAt() on the input string.

The result is that I end up with an ISO-8859 encoded string (after decoding it back from the base64)!
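To illustrate the symptom, here is a minimal sketch of the kind of char-by-char loop described above. It is not the actual UDF code: the class and method names are hypothetical, `java.util.Base64` (Java 8+) stands in for whatever base64 routine the UDF uses, and the cast of each char to a byte is only an assumption about where the encoding might get lost.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch, not the real UDF.
class Base64EncodingSketch {

    // Assumed pattern: walk the String with length()/charAt() and narrow
    // each char to a byte. The cast keeps only the low 8 bits, which for
    // characters up to U+00FF is exactly their ISO-8859-1 byte value.
    static String encodeCharByChar(String input) {
        byte[] bytes = new byte[input.length()];
        for (int i = 0; i < input.length(); i++) {
            bytes[i] = (byte) input.charAt(i);
        }
        return Base64.getEncoder().encodeToString(bytes);
    }

    // Alternative: ask for the UTF-8 bytes explicitly before encoding.
    static String encodeUtf8(String input) {
        byte[] bytes = input.getBytes(StandardCharsets.UTF_8);
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        String input = "caf\u00e9"; // "cafe" with an accented e
        System.out.println(encodeCharByChar(input)); // Y2Fm6Q== (4 bytes, ISO-8859-1)
        System.out.println(encodeUtf8(input));       // Y2Fmw6k= (5 bytes, UTF-8)
    }
}
```

Decoding the first result yields the single byte 0xE9 for the accented character, i.e. ISO-8859-1, while the second yields the two-byte UTF-8 sequence 0xC3 0xA9. If the real UDF does something similar, that would match what I'm seeing.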

What could cause this ?



PS: If I simply use the standard functions (concat, etc.), the resulting XML stays correctly encoded.