Skip to Content

How to get Datalength in HANA ?


I want to get the string length as BYTE in HANA.

HANA seems to have no function like Datalength in MSSQL.

How can I get the length of a string in bytes?


Add comment
10|10000 characters needed characters exceeded

  • Follow
  • Get RSS Feed

3 Answers

  • Best Answer
    Jan 09 at 11:56 PM

    This one is a bit tricky, as the behaviour of the LENB function in Excel is "weird".

    The idea is that the function would return the number of bytes used for an input string. That already tells us, that this is about how Excel (!) stores the strings - not about strings per se.

    So, for LENB the deal is: if the default character set of the operating system (!) supports double-byte characters (DBCS) then the LENB will return 2 times the number of characters in the string as the number of bytes.

    If the default character set of the OS does not support DBCS LENB behaves like LEN, so for each character in the string, only one byte is counted.

    See LEN, LENB functions for reference.

    Now, the problem here is that this has not much to do with how HANA (or other DBMS) store bytes.

    HANA uses CESU-8 string encoding which allows for storing only the significant bytes (e.g. characters from the ASCII range don't need more than 1 byte, while characters with larger Unicode point numbers, like the Korean(?) ones you gave as an example, take two or three bytes).

    The point here is that by using the

    LENGTH(to_binary (... )) 

    approach you get the actual number of bytes used in CESU-8 to represent the characters.

    Two differences here between Excel and HANA:
    Excel doesn't use CESU-8 but the DBCS character encoding scheme of the OS.
    Due to that, the byte-counting functions for strings are different: CESU-8 can have between 1 and 3 bytes and need to be checked for each character while the Excel version simply assumes two bytes for each character in the string.

    Obviously, if you want to emulate the Excel behaviour in HANA, just use

    LENGTH (...) *2 as EXCEL_LENB

    and you're done.

    Add comment
    10|10000 characters needed characters exceeded

    • Now that's a completely different question then!

      How different DBMS store data differs widely between setups. Your example, Oracle, allows for many different storage settings, including UTF versions (which are used with Unicode SAP systems). On top of that, the length() semantics change depending on whether a fixed-byte character set is used or not. For Unicode systems, length() commonly returns the number of characters - not bytes.

      Finally, what specific logic do you believe you need to change in your BI solution because of this difference?
      Anyhow - this question and the discussions are already at the limit of what can be done in this forum format. I recommend opening a new question addressing your actual problem.

  • Jan 09 at 05:28 AM

    Another thread with similar discussion

    Hope it helps


    Add comment
    10|10000 characters needed characters exceeded

  • Jan 09 at 06:01 AM

    maybe you can try this

    SELECT LENGTH (to_binary('length in char')) as length FROM DUMMY;
    Add comment
    10|10000 characters needed characters exceeded