String UTF-16 conversion

Hello,

I call an RFC (via XI) and get the result from R/3 as a flat XML String.

This string contains English and Hebrew characters. The English is fine, but the Hebrew value nodes look like " ׳ ׳�׳™ ׳�׳™׳ ׳™׳�׳ยจ׳׳˜/׳™׳’׳�׳� "

How can I convert the string so that the Hebrew characters display correctly?

I think the source encoding is UTF-8 and that I need to convert it to UTF-16, but I'm not sure about that, nor how to do it.

Thanks for your help.

Roni.

3 Answers

  • Best Answer
    Mar 14, 2007 at 12:24 PM

    Hi Roni,

    If your RFC returns an XML file as a java.lang.String object, the String itself is by definition UTF-16. To see if the XML file is correct, you could write it to disk and open it with your favourite text editor. Something like this.

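    // Needs java.io.File, FileOutputStream, IOException and OutputStream.
    private void save(String xml) throws IOException {
      OutputStream out = new FileOutputStream(new File("/tmp/result.xml"));
      try {
        // Encode explicitly instead of relying on the platform default charset.
        out.write(xml.getBytes("UTF-8"));
      } finally {
        out.close(); // close the file even if the write fails
      }
    }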
    

    Or use any other encoding that both Java and your text editor support, e.g. ISO-8859-8 (Latin/Hebrew alphabet); see http://java.sun.com/j2se/1.4.2/docs/guide/intl/encoding.doc.html

    Obviously, to correctly see the characters you need the necessary fonts too.

    In case the above doesn't work, it means the xml String you receive from the RFC call is already corrupt. However, if it does work, the RFC as such works correctly.
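
    To illustrate what "already corrupt" can mean, here is a minimal sketch of one common way such garbage arises: UTF-8 bytes get decoded with a different charset somewhere along the way. The names MojibakeDemo, misdecode and repair are made up for this example, and ISO-8859-1 is used as the "wrong" charset only because Java maps every byte to a character with it; it is an assumption, not a statement about which charset is involved in your landscape.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Decode UTF-8 bytes with the wrong charset, as a misconfigured layer might do.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Undo the wrong decoding. This only works because ISO-8859-1 keeps every byte intact.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String hebrew = "\u05E9\u05DC\u05D5\u05DD"; // four Hebrew letters
        String garbled = misdecode(hebrew);
        System.out.println(garbled.length());               // each letter became two chars
        System.out.println(repair(garbled).equals(hebrew)); // round trip restores the text
    }
}
```

    If the wrong decoding went through a charset that drops or replaces bytes, the original text cannot be recovered this way and the fix has to happen at the point where the bytes are first decoded.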

    The next step is likely to parse the XML to extract the information you need and store it in your Web Dynpro context. An XML parser usually expects bytes (a java.io.InputStream) as input, which means you need to convert the String to bytes and by doing that, you need to choose a character encoding. It could be something like the following.

    SAXParserFactory.newInstance().newSAXParser().parse(new ByteArrayInputStream(xml.getBytes("UTF-8")), handler);
    

    Note that the character encoding you specify here does make a difference. So as not to "confuse" the XML parser, it should be the same as the one named in the XML declaration at the top of the document, e.g. <?xml version="1.0" encoding="UTF-8"?>
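
    As an alternative sketch, the byte conversion can be avoided altogether by handing the parser a character stream through org.xml.sax.InputSource. The characters are then already decoded, so no byte encoding has to be chosen for the String. The class name StringXmlParseSketch, the method textOf and the text-collecting handler are placeholders for this example; your real handler would go in its place.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class StringXmlParseSketch {
    // Collects all character data in the document; a stand-in for a real handler.
    static String textOf(String xml) throws Exception {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        // A Reader supplies decoded characters, so no getBytes(...) call is needed.
        InputSource source = new InputSource(new StringReader(xml));
        SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                   + "<name>\u05E9\u05DC\u05D5\u05DD</name>";
        System.out.println(textOf(xml).equals("\u05E9\u05DC\u05D5\u05DD"));
    }
}
```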

    BTW, what exactly do you do with the received XML to obtain the value nodes?

    Kind regards,

    Sigiswald

  • Former Member
    Mar 13, 2007 at 05:15 AM

    I think the links below should help

    link 1: http://www.jguru.com/faq/view.jsp?EID=137049

    link 2: http://unicode.org/faq/utf_bom.html#45

    or you may also try

    byte[] utf16 = theString.getBytes("UTF-16");
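
    For reference, a small sketch of what that call actually produces. It encodes the String into a new byte array and leaves the String itself unchanged, so it only helps if the receiver expects UTF-16 bytes. The class name Utf16BytesSketch and the sample text are made up for this example.

```java
import java.nio.charset.StandardCharsets;

class Utf16BytesSketch {
    // Encodes to UTF-16; Java's "UTF-16" encoder writes a big-endian byte order mark first.
    static byte[] toUtf16(String s) {
        return s.getBytes(StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        String hebrew = "\u05E9\u05DC\u05D5\u05DD"; // four Hebrew letters
        byte[] utf16 = toUtf16(hebrew);
        byte[] utf8 = hebrew.getBytes(StandardCharsets.UTF_8);
        System.out.println(utf16.length); // 2 bytes of BOM plus 2 bytes per letter
        System.out.println(utf8.length);  // 2 bytes per Hebrew letter, no BOM
        // Decoding with the same charset returns the same text.
        System.out.println(new String(utf16, StandardCharsets.UTF_16).equals(hebrew));
    }
}
```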

    Regards

    sid

  • Apr 01, 2007 at 02:48 PM

    It seems the solution is to decode the response stream explicitly as UTF-8:

    new InputStreamReader(httpConn.getInputStream(),"UTF-8");
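
    A slightly fuller sketch of the same idea, reading the whole stream into a String with an explicit UTF-8 decoder. The class Utf8ReaderSketch and the method readAll are hypothetical names; in the real code the stream would come from httpConn.getInputStream().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

class Utf8ReaderSketch {
    // Reads all bytes from the stream and decodes them as UTF-8.
    static String readAll(InputStream in) throws IOException {
        Reader reader = new BufferedReader(new InputStreamReader(in, "UTF-8"));
        try {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } finally {
            reader.close(); // also closes the underlying stream
        }
    }
}
```

    Without the charset argument, InputStreamReader falls back to the platform default encoding, which is a likely source of garbled text when that default is not UTF-8.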
