
JCo encoding problem

Former Member
0 Kudos

Hi,

I have a problem where a login over JCo fails when the password contains the character \u20AC (the Euro sign).

This is a bit strange, because the JCo Client itself has the UTF-16 encoding, and the password is UTF-8. The password is UTF-8 because it also contains Ä which works.

I set up a test password with umlauts, and all umlauts seem to work. Every time I have a \u20AC or even &, the login fails.

I tried setting "jco.client.unicode" to 1, but this didn't seem to work. I am not even sure the property "jco.client.unicode" even exists, although there is a "jco.server.unicode" which in my case is not helpful - I'm creating a client, not a server.

The remote SAP System is a Unicode system by the way.

Any ideas? Oh, I am using JCo 2.1.8.

T00th

Accepted Solutions (1)


Former Member
0 Kudos

This is a bit strange, because the JCo Client itself has the UTF-16 encoding, and the password is UTF-8.

Ok, so I'm assuming this means that your passwords are stored as UTF-8 hashes on the application server (i.e. profile parameter login/password_charset=2). Note that the default value points to Latin-1 encoding, which contains Ä, but no \u20AC (so per your comment it's not 100% clear if you're really using UTF-8 on the server).

First of all, are you sure that your password string is correct? I.e. the Euro symbol is Unicode code point U+20AC, so when debugging your Java application you should be able to validate that. This way you ensure that you don't read the password from some other location (e.g. database, file) with a possibly wrong encoding (e.g. system code page) and thus end up with a different character.
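One way to do that check without relying on a console or editor is to dump the code points of the string. This is only a minimal sketch; the class name, the method name, and the sample password are made up for illustration and are not part of JCo.

```java
public class CodePointDump {
    /** Returns the code points of s in U+XXXX notation, separated by spaces. */
    public static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            // Advance by one or two chars, depending on whether cp is a surrogate pair.
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical test password: 'a', A-umlaut, Euro sign.
        String password = "a\u00C4\u20AC";
        System.out.println(dump(password)); // prints: U+0061 U+00C4 U+20AC
    }
}
```

If the last entry is anything other than U+20AC, the string was already damaged before it reached the connection code.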

You're right, there's no Unicode boolean/flag for JCo client programs. However, you do have an attribute for the code page used by the client, i.e. jco.client.codepage. It's a bit confusing to me, because the Java Strings are anyhow UTF-16 (see also OSS note 794411), but when tracing I can clearly see that it's set to my system code page. When I change the code page to UTF-16LE, i.e. 4103 I can see that reflected in the trace (logon succeeds), but later my RFC fails. So maybe you can experiment with setting the client codepage.

Maybe somebody else could explain if and if so why the client code page matters. Java is using UTF-16 strings, so I really wonder why the client code page would have any impact?!

Cheers, harald

p.s.: Tracing can be enabled via the following JCo parameters that can be passed directly to Java: Trace level is controlled via -Djco.trace_level=<n>, where 0<=n<=10 and trace directory via -Djco.trace_path=<path>.

If all of this doesn't help, I'd personally do the following: Try testing the character passing by using a user with a working password and pass the special characters (like Euro sign) as a parameter (ideally you have some coding with an appropriate function for that, otherwise you could quickly create a test class). Debug your example and see if the characters ending up in SAP are correct.

Former Member
0 Kudos

Hi Harald,

first off, thanks for your answer. A lot of helpful tips there.

I already checked the encoding of the password at multiple stages (as it is passed through the application down to the part where my component finally establishes the connection) and it is in UTF-8. Nowhere is the password tampered with; it is readable, in the correct encoding, and is written out correctly in the log files every time (just as a check; of course it won't always be logged).

Your tip about jco.client.codepage is what I'm trying out now, I hope that nails it.

What really bugs me though, is the following sentence in the SAP Note you specified (thanks for that too BTW!)

"The SAP Java Connector is doing all codepage conversions from and to the SAP codepages internally. You cannot influence the conversion process with a switch or a JCo API call. The appropriate codepage converter is automatically selected by relying on the codepage information returned from the partner SAP backend system..."

If I understand this correctly, it means, no matter what codepage I set clientside, the password will be converted to the codepage which the remote SAP System uses!!

The problem is, when I break it down into steps, it looks like this -

1) I get the password (UTF-8)

2) I set the password in my client object (password stays UTF-8)

3) I set the codepage for my client ( question - will the password be in this case automatically converted to the codepage I set at this point?)

4) I try to connect.

Edited by: Sameer Jagirdar on Mar 24, 2010 10:59 AM --> Too long a message, had to split in two parts

Former Member
0 Kudos

(RESPONSE CONTINUED FROM POST ABOVE)

I don't understand where my codepage would have any influence over my UTF-8 password in terms of conversion from the original encoding to the new encoding...

BTW, my test with the code page just finished, and I have the following glorious information from the remote SAP System -

 
<tag><![CDATA[com.encoway.xxx.xxx.xxx
 2010-03-24 10:47:55,921
 com.encoway.xxx.xxx.xxx: (103) RFC_ERROR_LOGON_FAILURE: F
	at com.encoway.xxx.xxx.xxx.getConnection(SAPConnectorImpl.java:136)
	at org.apache.jsp.sapTestLoginAction_jsp._jspService(sapTestLoginAction_jsp.java:97)
	at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:94)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
	at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:324)
	at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:292)
	at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:236)

What kind of a message is "F"?

This message is different from what I had earlier ("Name or Password incorrect, repeat logon"), so I guess setting the codepage to what you suggested did do something, but I'm not sure what exactly. I am not sure of the profile parameters, I will need to check with the SAP Admin.

Any further ideas?

T00th

Former Member
0 Kudos

One additional remark:

You have to set the password through a java.lang.String value, of course.

In Java up to release 1.4 this is always UCS-2.

If you have your password stored as a UTF-8 string, you have to convert it either beforehand or when creating the java.lang.String.

Try

new String(byte[], "UTF-8")

for example or use an appropriate java.io.InputStreamReader for this.
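A minimal sketch of that conversion, assuming the password arrives as raw UTF-8 bytes (the class name and the sample bytes are made up for illustration). It uses the String(byte[], Charset) overload, available from Java 6, which avoids the checked exception of the charset-name variant:

```java
import java.nio.charset.Charset;

public class DecodeUtf8 {
    /** Decodes raw UTF-8 bytes into a java.lang.String. */
    public static String decode(byte[] utf8Bytes) {
        return new String(utf8Bytes, Charset.forName("UTF-8"));
    }

    public static void main(String[] args) {
        // The three UTF-8 bytes of the Euro sign.
        byte[] raw = { (byte) 0xE2, (byte) 0x82, (byte) 0xAC };
        String password = decode(raw);
        System.out.println(password.length());                        // prints: 1
        System.out.println(Integer.toHexString(password.charAt(0)));  // prints: 20ac
    }
}
```

This only makes sense at the point where bytes enter the application; once the value is a String, there is nothing left to decode.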

And to confuse a little bit more:

Even UCS-2 and UTF-16 are not the same...

UTF-16 support has been added to the Java language since J2SE 5.0 (which does not mean that there was no charset converter included for this, in fact there was an UTF-16 encoding converter even before version 5.0).

Well, codepages are really a difficult area.

Edit: Sorry, should be marked as a reply to Sameer, and not to Harald, of course.

Edited by: Stefan Gass on Mar 24, 2010 1:27 PM

Former Member
0 Kudos

Thanks to you both again. Soooo, what I did was basically the following (based on Stefan's suggestion) -


           String new_password = new String(password.getBytes(), "UTF-8");
            props.put(SAPConnector.SAP_PASSWORD, new_password);

The properties "props" I use to create my JCo Client. This did not work unfortunately. The following is the log (a little edited) -


<entry>
 <timestamp>2010-03-24 14:37:01,308</timestamp>
 <message><![CDATA[Set the password :xxxxxxx?.
]]>

The "?" is supposed to be the "\u20AC".

And I got the same error as I reported before. Stefan, your information about the codepage in JCo being set to 1100 is invaluable. Could you send me a link to where you found it? I would like to add it to my "collection" of information about JCo

Am still playing about with the codepages, I hope I find the solution soon...

T00th

Edited by: Sameer Jagirdar on Mar 24, 2010 2:49 PM --> Added The "?" is supposed to be the "\u20AC".

Former Member
0 Kudos

Oh what I forgot to add -

I am using Java 1.6 and JCo is 2.1.8. The remote SAP System is a Unicode system.

Regards

Former Member
0 Kudos

And what type is password in your coding? I guess it is already a java.lang.String, so your coding is useless.

Furthermore we do not know if UTF-8 is your default system encoding, so it is unclear what password.getBytes() is returning in your case.

So either your information to have the password as UTF-8 data is wrong or you have to deal with it earlier (when reading or when creating your original password instance).
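To illustrate why re-encoding an existing String cannot help, here is a small sketch. The platform default charset is passed in explicitly so the effect is reproducible; windows-1252 is used as an assumed default purely for illustration, and the class and method names are invented.

```java
import java.nio.charset.Charset;

public class RoundTrip {
    /**
     * Same shape as new String(password.getBytes(), "UTF-8"),
     * but with the platform default charset made explicit.
     */
    public static String reencode(String s, String platformCharsetName) {
        byte[] bytes = s.getBytes(Charset.forName(platformCharsetName));
        return new String(bytes, Charset.forName("UTF-8"));
    }

    public static void main(String[] args) {
        String password = "xxxxxxx\u20AC";
        // Encoded as windows-1252, the Euro sign is the single byte 0x80,
        // which is not valid UTF-8, so decoding cannot restore it.
        System.out.println(reencode(password, "windows-1252").equals(password)); // prints: false
        // If the default happens to be UTF-8, the round trip is merely a no-op.
        System.out.println(reencode(password, "UTF-8").equals(password));        // prints: true
    }
}
```

So the round trip is at best a no-op and at worst destroys the character, depending on the default charset.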

Regarding the 1100 codepage you may simply trust me or alternatively analyze the RFC traces. I do not have an official source to refer to.

And finally: JCo 2.1 is not supported for your JRE. You will definitely get in trouble when dealing with BCD types (ABAP type p).

Please see note [549268|https://service.sap.com/sap/support/notes/549268] for a list of supported platforms and JRE releases.

Former Member
0 Kudos

Hi,

my password is in UTF-8 because the web application sends in the UTF-8 format. Furthermore, when I look at it in a text editor set to the UTF-8 encoding, I see the password correctly, not a jumble of characters.

The password is a java.lang.String.

And furthermore,


getBytes() 
Encodes this String into a sequence of bytes using the platform's default charset, storing the result into a new byte array

- comes from the Java Documentation. Would "Platform" here mean the JVM or my OS (incidentally Win7)? I know I have a 64-bit system, but the Java is 32-bit, just in case you need that information...

I will go with your suggestion of codepage 1100. Out of desperation I even tried

new_password = new String(password.getBytes("UCS-2"), "UTF-8")

which as I expected does not work.

I am about 15 seconds away from starting to bang my head against the wall....

T00th

Former Member
0 Kudos

As I tried to explain, if you have a java.lang.String this is already UCS-2. So if it was UTF-8 previously, the conversion has already taken place. So if System.out.println(password) produces correct output, it has been converted correctly and you may forget all that UTF-8 stuff.

Looking at it with a text editor doesn't help, because you are then displaying some (text) file, and creating this file presumably involved yet another codepage conversion.

Regarding your default encoding: With JRE 1.6 you may get this with java.nio.charset.Charset.defaultCharset().name() .
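A short sketch of that check (class and method names are invented). It also shows that the no-argument getBytes() is the same as passing the default charset explicitly, which is what "platform's default charset" refers to in the Java documentation:

```java
import java.nio.charset.Charset;
import java.util.Arrays;

public class DefaultCharsetCheck {
    /** Name of the charset that String.getBytes() uses when no charset is given. */
    public static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    /** True if getBytes() and getBytes(default charset) agree for s. */
    public static boolean usesDefault(String s) {
        return Arrays.equals(s.getBytes(), s.getBytes(Charset.defaultCharset()));
    }

    public static void main(String[] args) {
        // The printed name depends on the JVM and OS settings of the machine.
        System.out.println("Default charset: " + defaultCharsetName());
        System.out.println(usesDefault("\u00C4\u20AC")); // prints: true
    }
}
```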

And my suggestion was not to use codepage 1100. My suggestion was to use jco.client.codepage=4103.

I only explained how JCo works.

And finally to your error message:

Don't know if it is being cut off. Please have a look at the file dev_rfc.trc. There will surely be the whole error message.

Former Member
0 Kudos

Hi,

As I said, I tried UCS-2 as a desperate attempt, which I, like you, never expected to work... it was just a shot in the dark.

I have used the codepage 4103 and not 1100, I meant that I trust you when you say that JCo uses 1100 by default

I tried the charset test already with this in the log - "The default charset is: windows-1252"

The error message is unfortunately also not cut-off, it is the exact message which I get also in the dev-trc file...it is all the SAP System is sending me.

The console cannot write out \u20AC in my password because it does not support the encoding. I had that problem, and that's the reason I used a text editor (not the normal Notepad) to try to open it in the original encoding. But I think what you mentioned might have happened: the text editor MAY have (I'm not sure about it) opened the log file in the wrong encoding. It is supposed to open it in the original encoding, but I cannot be sure that it did.

Am still on the problem, will get back if I find something new.

Bye!

T00th.

Former Member
0 Kudos

Well, with windows-1252 you should be able to print the Euro sign (\u20AC).

But on windows-1252 this character is mapped to 0x80 whereas in Unicode it is U+20AC.

And to be complete: the Euro sign is not contained in the SAP codepage 1100 at all (codepage 1100 is equivalent to ISO-8859-1).

Now this will get really tricky.

The question is, what does your password really look like on the ABAP side?

How did you create it? Using a SAPGUI on Windows?

Former Member
0 Kudos

Please have a look at SAP note [735356|https://service.sap.com/sap/support/notes/735356].

I therefore suggest NOT to use any national language characters within your passwords.

If you do nevertheless you should then definitely use the jco.client.codepage=4103 logon parameter.

Furthermore please check if your password Java string was created correctly.

If your password consisted of only a Euro sign, then password.getBytes("UTF-8") should return the byte[] { 0xE2, 0x82, 0xAC }; consistently, password.getBytes("windows-1252") should return the byte[] { 0x80 }.

password.getBytes("ISO-8859-1") would return the byte[] { 0x3F }, which is the replacement character (question mark: ?) for unknown/undefined characters in this codepage.
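The byte values for the three charsets mentioned here can be checked with a few lines of Java. This is just a sketch with invented class and method names:

```java
import java.nio.charset.Charset;

public class EuroBytes {
    /** Encodes s with the named charset and returns the bytes as hex, e.g. "E2 82 AC". */
    public static String hex(String s, String charsetName) {
        byte[] bytes = s.getBytes(Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to get the unsigned value of the byte.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String euro = "\u20AC";
        System.out.println(hex(euro, "UTF-8"));        // prints: E2 82 AC
        System.out.println(hex(euro, "windows-1252")); // prints: 80
        System.out.println(hex(euro, "ISO-8859-1"));   // prints: 3F
    }
}
```

Running the same method on the real password string (in a debugger, not a log file) shows immediately whether the Euro sign is still intact.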

Former Member
0 Kudos

Hi Stefan,

Can you shed some light on that codepage parameter jco.client.codepage? I'm really confused by this one, as I don't understand why it would be important.

As you mentioned, Java uses UTF-16 for character sequences. From that perspective I'd say there should be no client code page, because any possible Unicode character can be represented in Java. So when JCo connects to a Unicode SAP system, which on application server corresponds to usage of UTF-16, there should be at most a conversion between little endian versus big endian encoding. If I connect to a non-Unicode system I might be in trouble, because some of my Unicode characters in Java might not be part of the codepage used in SAP.

You mentioned:

JCo uses by default the codepage 1100 for doing the logon itself because it does not know which codepage the communication partner is running on and this is the lowest common denominator of all systems.

Codepage 1100, i.e. ISO-8859-1, does not seem like the lowest common denominator, for that part US-ASCII would qualify much better. However, I think this is also the default for SAPgui, if you don't change anything.

As far as I know all code page names can be expressed in US-ASCII, especially for SAP using those 4-digit numbers. So part of opening the connection setup should be to exchange used code pages and then continue with the communication using that knowledge.

Anyhow, that might be again wishful thinking. So just to be clear, are you saying that codepage 1100 is always used or system code page for logon? What is the parameter jco.client.codepage for, just for logon (and even for logon, wouldn't you still somehow need to communicate the actual codepage used and why then not use Java's UTF-16)? What happens once you logged on and exchange data?

Appreciate your feedback.

Cheers, harald

Former Member
0 Kudos

By default JCo uses codepage 1100 for doing the logon. That's the short answer.

This is because you don't know anything of your partner system before doing the logon. You do not know which codepage it is running on or if it is unicode or not. So the default codepage 1100 is used for doing the logon unless you know better by specifying the jco.client.codepage parameter. This property/parameter is used for converting the logon parameters only!

After the logon, JCo knows the partner codepage and switches to the correct one automatically.

If JCo would do the logon with UTF-16/Unicode by default, you won't be able to logon to a Non-Unicode system without specifying the jco.client.codepage parameter. But on the other hand, you may use codepage 1100 for logging on to a Unicode system. So the question is just what should be the default, and SAP decided to use codepage 1100 as the lowest common denominator (amongst all SAP AS ABAP server codepages to choose between).

Regarding the connection setup process, it currently doesn't work as you described it. There is no additional roundtrip to get the codepage data before the logon; it is only one step (the additional roundtrip would also cost performance). Maybe one can think of some additional parameter telling JCo to detect automatically which codepage the partner is running on before doing the logon, but for performance reasons this should not be the default, as there really might be lots of logons in real scenarios. Furthermore, nearly all users limit their UserIDs and passwords to the characters defined in note [735356|https://service.sap.com/sap/support/notes/735356]. So the default codepage 1100 works almost every time.

If you definitely know that the communication partner system is Unicode, simply specify jco.client.codepage=4103 and everything is fine for you: supporting all Unicode characters with no additional roundtrip via network.
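As a rough sketch, the logon properties for a Unicode partner could be assembled like this. Only jco.client.codepage and the value 4103 come from this thread; the other property keys and all values are assumptions or placeholders and should be checked against the JCo documentation for your release.

```java
import java.util.Properties;

public class LogonProperties {
    /** Builds logon properties that ask for codepage 4103 during the logon. */
    public static Properties unicodeLogon(String user, String password) {
        Properties props = new Properties();
        // Assumed key names and placeholder values, not taken from this thread.
        props.setProperty("jco.client.ashost", "sap.example.com");
        props.setProperty("jco.client.sysnr", "00");
        props.setProperty("jco.client.client", "000");
        props.setProperty("jco.client.user", user);
        props.setProperty("jco.client.passwd", password);
        // From this thread: convert the logon parameters with a Unicode codepage.
        // Only do this when the partner is known to be a Unicode system.
        props.setProperty("jco.client.codepage", "4103");
        return props;
    }

    public static void main(String[] args) {
        Properties props = unicodeLogon("TESTUSER", "xxxxxxx\u20AC");
        System.out.println(props.getProperty("jco.client.codepage")); // prints: 4103
    }
}
```

The resulting Properties object would then be handed to whatever call creates the JCo client.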

Former Member
0 Kudos

Stefan,

thanks a lot for the explanation, now we know at least what happens. As this seems so crappy though, I cannot help but rant a little...

If JCo would do the logon with UTF-16/Unicode by default, you won't be able to logon to a Non-Unicode system without specifying the jco.client.codepage parameter.

Not really as long as a proper code page conversion from UTF-16 to the code page used by the system would've been implemented on the server side. Instead of choosing codepage 1100 the more natural choice seems UTF-8.

As far as I know since application server 6.10 the server "knows" about Unicode (e.g. see [here|http://help.sap.com/saphelp_nw04/helpdata/en/cb/56453c3ff4110ee10000000a11405a/frameset.htm]). Anyhow, if I understood you correctly you can actually set the code page for the logon data using jco.client.codepage, but what happens when you use Unicode codepages on non-Unicode systems?

OSS note [794411 - Supported codepages of SAP Java Connector 2.1 and 6.x|https://service.sap.com/sap/support/notes/794411] seems to list the supported codepages for JCo 2.1.x (stops at 2.1.5 though, maybe no changes after that). UTF-8 is not listed (i.e. 4110), yet no error occurs when I specify this on JCo 2.1.8 when connecting to a non-Unicode SAP 4.7 system (kind of expected an exception).

Finally OSS note [975768 - Deprecation of Java features with non-Unicode Backend|https://service.sap.com/sap/support/notes/975768] seems to put an end to all that madness. Well let's hope JCo follows and deprecates the jco.client.codepage parameter as soon as possible...

Here's something funny though: When using JCo 3.0.5 I get client codepage 1100 when connecting to a non-Unicode 4.7 system and client codepage 4102 when connecting to a Unicode ECC 6.0 system (without setting jco.client.codepage). So maybe this is just a concern for JCo 2.x, but still it seems odd that we have codepage references in JCo 3.0.5 (DestinationDataProvider.JCO_CODEPAGE, JCoCustomDestination.setCodepage(String) and others).

Cheers, harald

Former Member
0 Kudos

Ok, I think I found the missing link that explains the codepage 1100 default in JCo 2.x: OSS note [1021459 - Conversion behavior of the RFC library|https://service.sap.com/sap/support/notes/1021459].

Not sure how far that note applies in JCo context, but at least it's clear that JCo 2.x uses the non-Unicode RFC library (one of those mysteries to me), e.g. see the configuration document delivered with the JCo:

Note: For JCo you will always need a Non-Unicode version of the RFC library, even if you would like to connect to a Unicode SAP System

Interestingly enough the note claims:

The code pages 4102 and 4103 are used by the Unicode RFC library only. These code pages are selected regardless of the processor architecture and must not be set using parameters. You must not use the code pages 4102 and 4103 in the non-Unicode RFC library.

That leaves one wondering if this also applies to JCo 2.x and the jco.client.codepage parameter. Or is this referring to a different code page usage?

Former Member
0 Kudos

Hi,

I must say this was a very very interesting discussion. The SAP Notes and other information which you both mentioned, have been very helpful. Thanks to you both again for that.

As a summary, I think it is crazy that JCo 2.1.x (I'm not sure yet about 3.x) uses the codepage 1100 by default. The Note 735356 I will use as the main reference for informing why 2.1.x has problems. This note confuses me a bit where it says -

Caution: Using login/password_charset = 2, passwords are stored in a form (code version "D") that cannot be interpreted by systems using older kernels. Therefore, you should only set the profile parameter to the value 2 if you have first made sure that all participating systems support the new password source code.

Confuses me because I am not sure how this would help with JCo, which would even then use 1100, which as we know now, does not contain \u20AC.

The solution should have been that JCo finds out what codepage the SAP System uses before sending the login details, so that it can do it in the right way instead of sticking with 1100. Forcing it to use 4103 would naturally lead to problems when the remote System isn't Unicode. This should therefore, in my opinion, be done automatically by JCo.

This problem is brought out in the Note 975768 -

Users can enter characters on a Java frontend which are not supported by the non-Unicode backend system and which therefore cannot be converted. Since Java applications can handle all Unicode characters, such an error is not detected before the data have been sent to the backend already.

I am not completely familiar with 3.x, as I have not used it yet, so I cannot say if the solution in my case would be actually to switch to v3.x.

As far as my problem is concerned, I will be using the 3 SAP Notes you guys specified to show why the problem with non-ASCII characters crops up. So that part is closed. I will however test it with 3.x in my spare time to see how that works out. I would love it though if any of you (or anyone else reading this thread) would put his/her experiences, thoughts, ideas and tips on this thread. For that reason, I will mark this thread as closed for now, but keep a watch on it.

Thank you both again.

T00th

Former Member
0 Kudos

Sameer,

please edit your post and split your lines. The resulting format is awful to read.

Better choice would be to use

instead. Thanks.

Former Member
0 Kudos

Yeah, after I saw what the result was, I did want to edit it, but the edit button for a couple of my posts seems to have vanished... maybe a system glitch or something. I wanted to edit it anyway, and will do so when I see the edit button.

T00th

Former Member
0 Kudos

Harald,

1. Note 1021459 is about the classic RFC library only. It is addressed to C developers. Although it explains some basic SAP codepage functionality, please disregard this note in the context of JCo.

2. Code page 4110 is not listed in note 794411 as a supported code page because it simply is not a system or server code page. You cannot run an application server on code page 4110. Currently I am not sure if specifying jco.client.codepage=4110 will work under all circumstances. At least it would definitely fail for old R/3 releases. 4110 is also a Unicode code page and old releases cannot handle this. Please always remember that JCo features RFC communication back to R/3 release 3.1.

3. If you connect to a non-Unicode system with code page 4102/4103, you will get a logon failure with no error text because the communication cannot be established at all. The non-Unicode system doesn't understand the Unicode code page from the partner. So this is no option for the default.

4. It is NOT the solution to query the partner code page before every logon. I don't know how many users would complain about dramatically decreased performance in real scenarios in this case. I think you are underestimating this. The partner code page is a known quantity for the user/developer and can be defined as a logon parameter if you are not satisfied with the default 1100. Additionally, changing defaults is always critical: think of developers who did not offer the codepage parameter to the application users -> these applications would no longer be able to log on to non-Unicode systems.

Former Member
0 Kudos

It's not a bug it's a feature... The Edit button disappears once somebody replied to your post, which makes sense.

Former Member
0 Kudos

Stefan, let me try to comment on your points and then I promise to move on...

ad 1. JCo 2.1.9 is shipped with RFC library 6.40, so it's a classic RFC library, not the new NW one (which I think would be at least 7.x). Maybe you can substitute it with the new NW RFC library; I haven't seen any comments on this though. Since it's using the classic RFC library, I'm assuming that some of the remarks should also apply to JCo (though it's questionable which parts).

ad 2. True, but I thought there's a reason software has an end of life and you cannot always be backward compatible in the course of software development (e.g. JCo 3.x API broke with 2.x for obvious reasons). Considering current systems, I'd say OSS note [975768|https://service.sap.com/sap/support/notes/975768] is on the right track...

Note that I can actually log on to a non-Unicode 4.7 system using JCo 2.1.8 and codepage 4110 (trace shows that client code page was set); 4102/4103 both fail with a nicely garbled logon failure message, as you said.

ad 3. Maybe you need to define what you mean by non-Unicode. I think in SAP the term is used to describe systems that actually process and store all data as Unicode, so the database has to use Unicode as well. However, from the application server perspective the system might be Unicode capable, thus handling Unicode to non-Unicode codepage conversions (I thought that as of basis release 6.1 SAP could actually handle Unicode).

ad 4. Querying the partner code page would only be required if the client has to do codepage conversion up-front, not if the server handles that. That's why, for logon information, a default like UTF-8 seemed a natural choice to me, giving freedom for new systems with crazy Unicode passwords and backwards compatibility for older systems that should be ok with US-ASCII. To me the parameter jco.client.codepage actually shows that there's no need to query the partner code page; the client just informs the server which codepage it uses.

Anyhow, enough blabbering on this subject. I'm really looking forward to some more expert insights in future JCo postings/topics from you, so thanks a lot for sharing your knowledge, Stefan.

Former Member
0 Kudos

ad 1. Yes, that's why I said: the note explains some basic codepage functionality. But the classic RFC library has been modified for exclusive usage by the SAP connectors. So not all the information within this note is valid for JCo (for example the usage of code pages 4102/4103). And no, you cannot substitute the classic RFC library with the NetWeaver version. But you may use the classic RFC library of newer releases for JCo 2.1 (classic RFC Library version 7.00, 7.01, 7.10, 7.11, etc.).

ad 2. Yes, as a developer you would always like to design from scratch and not care about compatibility and legacy stuff. But unfortunately business works differently. Maybe the next JCo version will take this step forward.

Regarding code page 4110, I will do some more testing. I think it will work back to release 4.6 with most characters (I think the codepage converter did not offer full support for code page 4110 in this release; try Chinese or Japanese for example). 4.7 should work though.

ad 3. With non-Unicode I meant all application servers not running on a UTF-16 code page.

ad 4. Yes, you are right. But as the old R/3 releases have already been shipping for a long time, it works as it is: the initiating RFC partner has to choose some code page that the other RFC partner is capable of handling. Please see ad 2. again.

Answers (1)


Former Member

Hi Sameer,

Harald almost gave the complete solution and you already did some testing and found out the most.

The missing part is the following:

JCo uses by default the codepage 1100 for doing the logon itself because it does not know which codepage the communication partner is running on and this is the lowest common denominator of all systems.

After the logon and therefore knowing the partner's codepage JCo then automatically switches to the appropriate communication codepage.

So you may get in trouble if some of your logon parameters are not part of codepage 1100.

Usually it works for all characters in the range up to character code U+00FF, as these codes are not converted with codepage 1100. That's the reason why it works with the "umlauts" but does not work with the Euro sign, as this is character code U+20AC and definitely must be converted.
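Since codepage 1100 is equivalent to ISO-8859-1 (as mentioned elsewhere in this thread), a client could check up front whether a UserID or password survives the default logon codepage. This is only a sketch with invented names, using Java's ISO-8859-1 encoder as a stand-in for codepage 1100:

```java
import java.nio.charset.Charset;

public class Latin1Check {
    /** True if every character of s can be encoded in ISO-8859-1. */
    public static boolean fitsLatin1(String s) {
        // A fresh encoder per call, since CharsetEncoder is not thread-safe.
        return Charset.forName("ISO-8859-1").newEncoder().canEncode(s);
    }

    public static void main(String[] args) {
        System.out.println(fitsLatin1("pass\u00C4")); // A-umlaut, prints: true
        System.out.println(fitsLatin1("pass\u20AC")); // Euro sign, prints: false
    }
}
```

If the check fails, the logon parameters need a codepage that contains the characters in question.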

So the solution to your problem is to specify the additional logon parameter "jco.client.codepage" which is really only used for the logon and not for the following real RFC communication.

You must define a codepage fitting to the used characters within your UserID and password. If your partner is a unicode system you may define 4102 or 4103 and it will always work with all characters - BUT only with unicode systems, of course.

By the way, it doesn't matter if you specify 4102 or 4103; both will work even if the endianness doesn't match.