Regex - to identify non printable characters

In the online REGEX tester, I am able to see that [^[:print:]] regex is able to correctly identify TAB as a non-printable character.

But when I use the same regex in ABAP, it doesn't treat TAB as a non-printable character. I wrote this simple program to loop through all Unicode characters in the range 0000 to FFFF and see how many of them ABAP identifies as "printable".

constants: hex type char16 value '0123456789ABCDEF'.

data: p   type i,
      q   type i,
      r   type i,
      s   type i,
      str type char4,
      val type string,
      rpl type string.

do 16 times.
  p = sy-index - 1.
  do 16 times.
    q = sy-index - 1.
    do 16 times.
      r = sy-index - 1.
      do 16 times.
        s = sy-index - 1.
        str = |{ hex+p(1) }{ hex+q(1) }{ hex+r(1) }{ hex+s(1) }|.
        rpl = val = cl_abap_conv_in_ce=>uccp( str ).
        replace all occurrences of regex '[^[:print:]]' in rpl with ` `.
        if rpl = val.
          write:/ str, val.
        endif.
      enddo.
    enddo.
  enddo.
enddo.

The program listed many characters that the regex did not match, meaning ABAP treats them all as printable.

TAB:

Many other non-printable characters:

I understand that some of these characters may appear as [] on my system because of a missing font. Am I right?


But what about the TAB character? Is this a bug? ABAP correctly identifies newline characters as non-printable.


I also tried the regex [:cntrl:] and the result was worse, as shown below. It caught neither TAB nor NEWLINE.



Inviting Former Member, @Michael Kozlowski, Former Member

1 Answer

  • Best Answer
    Apr 14, 2016 at 02:03 PM

    This works for me:

    DATA: l_text TYPE string.
    l_text = 'A' && cl_abap_char_utilities=>horizontal_tab && 'B'.
    REPLACE ALL OCCURRENCES OF REGEX '[^[:print:]]' IN l_text WITH space.

    Hexadecimal value of l_text before replace is:

    410942

    after:

    4142
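
    One way to reproduce the hexadecimal check above is sketched here. It assumes that converting the string to UTF-8 with cl_abap_codepage=>convert_to is an acceptable way to inspect the bytes; the answer does not say how its hex values were obtained, and the report name z_show_hex is hypothetical.

    ```abap
    REPORT z_show_hex. " hypothetical report name

    DATA: l_text TYPE string,
          l_hex  TYPE xstring.

    l_text = 'A' && cl_abap_char_utilities=>horizontal_tab && 'B'.

    " Assumption: UTF-8 bytes are a convenient view of the string content.
    l_hex = cl_abap_codepage=>convert_to( source = l_text codepage = `UTF-8` ).
    WRITE: / 'before:', l_hex.

    REPLACE ALL OCCURRENCES OF REGEX '[^[:print:]]' IN l_text WITH space.

    " If the TAB (hex 09) was matched, it no longer appears in this dump.
    l_hex = cl_abap_codepage=>convert_to( source = l_text codepage = `UTF-8` ).
    WRITE: / 'after: ', l_hex.
    ```

    Comparing the two dumps shows directly whether the regex matched the TAB on a given system.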


    • Naimesh Patel Juwin Pallipat Thomas

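      Actually, [:print:] is the union of [:graph:] and [:blank:]. If you run the two individually, you will notice that they behave as opposites of each other: TAB is matched by [:blank:] but not by [:graph:]. Because TAB belongs to [:blank:], it is also part of [:print:], so [^[:print:]] doesn't replace anything and leaves the string as is.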

      • [:print:]
        Set of all displayable characters (union of [:graph:] and [:blank:])

      DATA: l_text TYPE string.
      l_text = 'A' && cl_abap_char_utilities=>horizontal_tab && 'B'.
      WRITE: l_text.
      REPLACE ALL OCCURRENCES OF REGEX '[^[:blank:]]' IN l_text WITH ` `.
      WRITE: / l_text.


      output

      A#B

      #

      l_text = 'A' && cl_abap_char_utilities=>horizontal_tab && 'B'.
      WRITE: / l_text.
      REPLACE ALL OCCURRENCES OF REGEX '[^[:graph:]]' IN l_text WITH ` `.
      WRITE: / l_text.


      output

      A#B

      A B

      The help also says that these character classes depend heavily on language and local platform. That could be the reason it's not working for either of us but works for @Tomas Buryanek.

      Within sets for single characters defined using [ ], predefined character classes can be specified for certain sets for single characters whose behavior can, however, depend on the language and platform.

      Regards,
      Naimesh Patel
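
      Since the predefined classes can behave differently across platforms, a possible workaround is to name the character explicitly instead of relying on [:print:]. This is only a sketch for the TAB case discussed here, not a general test for non-printable characters. The report name z_replace_tab is hypothetical.

      ```abap
      REPORT z_replace_tab. " hypothetical report name

      DATA l_text TYPE string.

      l_text = 'A' && cl_abap_char_utilities=>horizontal_tab && 'B'.

      " Plain substring replacement, no regex: it does not depend on
      " how the platform defines [:print:], [:graph:] or [:blank:].
      REPLACE ALL OCCURRENCES OF cl_abap_char_utilities=>horizontal_tab
        IN l_text WITH ` `.

      WRITE: / l_text.
      ```

      Other control characters exposed as constants in cl_abap_char_utilities, such as newline or cr_lf, could be handled the same way.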