Check for Unicode Categories

Hi,

I am trying to check, in ABAP, whether a given URL contains characters from certain Unicode categories.

For example, I want to check whether an input character belongs to category [L], which consists of the subcategories Ll, Lm, Lo, Lt and Lu.

There are thousands of characters in these categories.

In Perl, for example, a regex that checks for a lowercase letter would simply be \p{Ll}, but that does not work in ABAP regex.

Has anyone had this problem before, or even found a solution for it in ABAP?

Thanks, Otto

1 Answer

  • Nov 11, 2015 at 11:49 AM

    The only current solution in ABAP is to use the static method cl_icu_character=>get_property_value.

    Input:

    im_c = <character>

    im_property = 'General_Category'

    Output:

    ex_value = 'Lowercase_Letter', 'Uppercase_Letter', 'Titlecase_Letter', etc.

    You may map the returned string to a "Perl-like" category name.

    Example:

    Lowercase_Letter -> Ll

    Uppercase_Letter -> Lu

    Titlecase_Letter -> Lt

    etc.
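
    The call and the mapping described above could be sketched roughly as follows. This is only a minimal, untested sketch: the parameter names (im_c, im_property, ex_value) are taken from this answer, but their exact types are an assumption, as is the exact spelling of the values returned for Lm and Lo ('Modifier_Letter', 'Other_Letter', following the Unicode long names). The report name and the variables lv_long and lv_short are hypothetical.

    ```abap
    REPORT z_unicode_category_sketch.

    " Assumption: ex_value can be received into a STRING and im_c accepts
    " a single-character literal; check the method signature in SE24.
    DATA lv_long  TYPE string.
    DATA lv_short TYPE string.

    cl_icu_character=>get_property_value(
      EXPORTING
        im_c        = 'a'
        im_property = 'General_Category'
      IMPORTING
        ex_value    = lv_long ).

    " Map the long property value to the two-letter name used in \p{..}.
    " Only the letter subcategories are mapped; everything else stays empty.
    CASE lv_long.
      WHEN 'Lowercase_Letter'.
        lv_short = 'Ll'.
      WHEN 'Uppercase_Letter'.
        lv_short = 'Lu'.
      WHEN 'Titlecase_Letter'.
        lv_short = 'Lt'.
      WHEN 'Modifier_Letter'.
        lv_short = 'Lm'.
      WHEN 'Other_Letter'.
        lv_short = 'Lo'.
      WHEN OTHERS.
        CLEAR lv_short.
    ENDCASE.

    " Category [L] is the union of the five subcategories mapped above.
    IF lv_short IS NOT INITIAL.
      WRITE: / 'Character belongs to category L:', lv_short.
    ELSE.
      WRITE: / 'Character is not in category L:', lv_long.
    ENDIF.
    ```

    To check a whole URL, the same call would have to be repeated for each character of the string, since the method takes one character at a time.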