
Invalid CESU-8 sequence

0 Kudos

Hi,

We are currently having issues when running a query against SAP HANA to retrieve records from a Java application. The technologies used in the application are:

  • JDK 1.8.x.
  • Spring Core 4.3.19.RELEASE.
  • Spring Boot 1.5.16.RELEASE.
  • Database SAP HANA 2.00.034.00.1539746999 (production).
  • Database SAP HANA 2.00.042.00.1564994110 (development).
  • JDBC driver ngdbc 2.3.62.
  • HDB client 2.3.144.

There are around 20,000,000 records in the database, so the application reads the data in pages. The page size is 1,000,000 records and the fetchSize of the ResultSet is set to pageSize/4, that is, 250,000 records kept in memory. Each page that is read is written to the file FILE_NAME.TXT. The first eight pages (8,000,000 records) appear to be read without problems, but at some point the JDBC driver starts throwing this exception:

com.sap.db.jdbc.exceptions.Cesu8ConversionException: Invalid CESU-8 sequence: [49, 54, 49, 54, 50, 48, 49, 57, 49, 49, 49, 50, 48, 57, 49, 54, 49, 54, 2, 48, 51, 2, 48, 48, -1, -105, -102, -52, 50, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 64, 48, 12, 57, 55, 51, 53, 51, 54, 51, 49, 55, 56, 54, 51, 20, 48, 57]
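To narrow down where the reported bytes stop looking like text, a small probe such as the following could be run against the array from the exception message. This is only a sketch: the class and method names are hypothetical, and the check is a simplified structural test of CESU-8 (one- to three-byte sequences only, no overlong or surrogate-pair validation), not the driver's own logic.

```java
// Hypothetical helper, not part of ngdbc: finds the first byte that cannot
// start or continue a CESU-8 sequence (simplified structural check).
class Cesu8Probe {

    // Returns the offset of the first structurally invalid byte, or -1 if none.
    static int firstInvalidOffset(byte[] data) {
        int i = 0;
        while (i < data.length) {
            int b = data[i] & 0xFF;
            int extra; // number of continuation bytes expected after the lead byte
            if (b < 0x80) {
                extra = 0;                       // single-byte (ASCII range)
            } else if (b >= 0xC2 && b <= 0xDF) {
                extra = 1;                       // two-byte sequence
            } else if (b >= 0xE0 && b <= 0xEF) {
                extra = 2;                       // three-byte sequence
            } else {
                return i;                        // 0x80-0xC1 and 0xF0-0xFF cannot lead in CESU-8
            }
            for (int k = 1; k <= extra; k++) {
                // Continuation bytes must have the form 10xxxxxx.
                if (i + k >= data.length || (data[i + k] & 0xC0) != 0x80) {
                    return i;
                }
            }
            i += extra + 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Bytes copied from the exception message above.
        byte[] reported = {49, 54, 49, 54, 50, 48, 49, 57, 49, 49, 49, 50, 48, 57, 49, 54, 49, 54,
                2, 48, 51, 2, 48, 48, -1, -105, -102, -52, 50, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 64,
                48, 12, 57, 55, 51, 53, 51, 54, 51, 49, 55, 56, 54, 51, 20, 48, 57};
        System.out.println("First invalid offset: " + firstInvalidOffset(reported));
    }
}
```

For the array above the probe stops at offset 24, the byte -1 (0xFF), which can never appear in CESU-8 or UTF-8 text. Everything before it is in the ASCII range.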

To be clear, the database tables do not have any CLOB or BLOB columns. All columns are of type VARCHAR, DECIMAL, or INTEGER.

04705881D294DAE216006600109AB107;09875634;985309;2019-11-16;20;XX;XXX;?;00;X;4673625091268437;12345678922222222222;09;00;?;59076123689;098767890987;09876543212345678909;?;?;X;0;0;80458127
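For readers who want to reproduce the setup, the paged read described above could be sketched roughly as follows. This is not the application's actual code: the class and method names are hypothetical, and it assumes the paging is done with a query that ends in `LIMIT ? OFFSET ?` over a stable `ORDER BY`, writing one semicolon-separated line per row.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical sketch of the paged export; names and paging strategy are assumptions.
class PagedExport {
    static final int PAGE_SIZE = 1_000_000;       // records per page, as in the post
    static final int FETCH_SIZE = PAGE_SIZE / 4;  // 250,000 rows requested per fetch

    // Number of pages needed to cover totalRows (rounded up).
    static long pageCount(long totalRows, int pageSize) {
        return (totalRows + pageSize - 1) / pageSize;
    }

    // Reads one page and appends it to the writer.
    // Assumes sql ends with "LIMIT ? OFFSET ?" and has a deterministic ORDER BY.
    static void exportPage(Connection con, String sql, long page, BufferedWriter out)
            throws SQLException, IOException {
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setFetchSize(FETCH_SIZE);          // a hint to the driver, not a hard limit
            ps.setInt(1, PAGE_SIZE);
            ps.setLong(2, page * PAGE_SIZE);
            try (ResultSet rs = ps.executeQuery()) {
                int cols = rs.getMetaData().getColumnCount();
                while (rs.next()) {
                    StringBuilder line = new StringBuilder();
                    for (int i = 1; i <= cols; i++) {
                        if (i > 1) {
                            line.append(';');
                        }
                        String value = rs.getString(i);
                        line.append(value == null ? "" : value);
                    }
                    out.write(line.toString());
                    out.newLine();
                }
            }
        }
    }
}
```

With 20,000,000 rows, `pageCount` gives 20 pages. Lowering `FETCH_SIZE` may make it easier to see which batch of rows triggers the exception.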

I hope you can help me.

Regards.

lbreddemann
Active Contributor
0 Kudos

Ok, you mentioned that the CESU-8 error has been resolved (probably by the workaround I proposed).

This is the time to mark the question as answered.

For the new problem, rather write a new question, so that the two issues don't get mixed up.

Accepted Solutions (0)

Answers (3)

Good day, Lars.

I could not change the column definitions in the database because the owner has not authorized the change. In the meantime, do you know what effect the function "STRTOBIN" would have on a column? In other words, would applying "STRTOBIN" in my query resolve the problem? For example:


SELECT COLUMN_1, STRTOBIN(COLUMN_2, 'UTF-8'), COLUMN_3 FROM TABLE_1;

I am not sure, because I do not understand how the driver works when it reads the data. What do you think?

Regards.

lbreddemann
Active Contributor
0 Kudos

Not sure how the STRTOBIN function would fit in here.

What I would do, if I cannot change the table, is simply cast the datatype in the query.

SELECT 
TO_NVARCHAR(COLUMN_2) as COLUMN_2
FROM
TABLE_1;

That way you can easily check whether the conversion solves the issue, and the overhead of doing the conversion every time is relatively small. What is important here is that the JDBC driver sees that this is an NVARCHAR column in the result set and maps it to java.lang.String correctly.
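On the Java side, the cast can be applied when the query text is built. The following is a minimal sketch with hypothetical names (`NvarcharSelect`, `build`). It concatenates identifiers directly, so it assumes the table and column names come from trusted configuration rather than user input.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper: wraps selected columns in TO_NVARCHAR and keeps their names.
class NvarcharSelect {

    // columns: all columns to select, in order.
    // castColumns: the VARCHAR columns suspected of holding multi-byte data.
    static String build(String table, List<String> columns, Set<String> castColumns) {
        String projection = columns.stream()
                .map(c -> castColumns.contains(c) ? "TO_NVARCHAR(" + c + ") AS " + c : c)
                .collect(Collectors.joining(", "));
        return "SELECT " + projection + " FROM " + table;
    }
}
```

The resulting string can be passed to `Connection.prepareStatement` as usual; the aliases keep the result set column names unchanged, so the rest of the reading code does not need to change.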

0 Kudos

Lars, thanks for your answer. Something important is that our issue does not seem to be consistent. In other words, we can reproduce this issue at different ranges of records:

- At around 2.5 million records.

- At around 5 million records.

- At around 10 million records.

- And at around 390 thousand records.

Given this, do you think the cause of the problem is the same?


Regards.

lbreddemann
Active Contributor
0 Kudos

There seems to be multi-byte character data in one of the VARCHAR columns. The JDBC driver tries to convert the data into a Java String, which is Unicode (UTF-16 internally), but it expects the data to be single-byte (due to the VARCHAR column data type).

To fix this, change those columns to NVARCHAR.

Note that HANA doesn’t stop multi-byte data from getting into a VARCHAR column; the column data type simply determines the decode logic for the character byte stream.
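The general effect of decoding multi-byte data with single-byte logic can be illustrated in plain Java. This is only an analogy with hypothetical names: it uses UTF-8 and ISO-8859-1 from the JDK and does not reproduce what the HANA driver does internally.

```java
import java.nio.charset.StandardCharsets;

// Illustration only: one character, two bytes, and what a single-byte decoder makes of them.
class ByteWidthDemo {

    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Encodes as UTF-8, then decodes each byte as its own character.
    static String decodeAsSingleByte(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "\u00f1"; // LATIN SMALL LETTER N WITH TILDE
        System.out.println(utf8Length(text));
        System.out.println(decodeAsSingleByte(text).length());
    }
}
```

Both lines print 2: the single character occupies two bytes, and a decoder that assumes one byte per character turns it into two characters instead of one.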

0 Kudos

Thanks Lars, I will change the columns to NVARCHAR and report the result to you.

Have a great day.

0 Kudos

But we are running SUM, which automatically adds NVARCHAR.