Hi all,
I have a problem using regexes with word boundaries.
First of all, my goal: I want to remove all words in my text that are shorter than, let's say, 5 characters. As usual, I started by playing with DEMO_REGEX_TOY. First try: \<\S{1,5}\>. This should match every word with 1 to 5 characters. But in the sample program (text: Cathy's black cat…) it matches Cathy with the FIRST OCCURRENCE option, and Cathy and s with all occurrences. The documentation for \< and \> says that words are defined as uninterrupted strings of alphanumeric characters. Alphanumeric characters are letters and digits, so the apostrophe splits the word, and that part is my mistake. But if I change the maximum length to 7 or more, then, all of a sudden, the whole word is matched (Cathy's). How is that possible? What am I missing here? It works the same way when I change \S to [[:graph:]].
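For anyone who wants to reproduce this outside of the SAP system, here is a rough Python sketch of what I see. It is only an approximation: Python has no \< and \> operators, so I emulate them with \b plus a lookahead/lookbehind, and Python's \w also counts the underscore as a word character. The function name find_words and the shortened sample string are my own choices for illustration, not anything taken from DEMO_REGEX_TOY.

```python
import re

def find_words(text, max_len):
    """Return all matches of an emulated \\<\\S{1,max_len}\\> pattern."""
    # \b(?=\w)  approximates \<  (boundary followed by a word character)
    # \b(?<=\w) approximates \>  (boundary preceded by a word character)
    pattern = r"\b(?=\w)\S{1,%d}\b(?<=\w)" % max_len
    return re.findall(pattern, text)

# Only the fragment of the sample text quoted above.
sample = "Cathy's black cat"

print(find_words(sample, 5))  # ['Cathy', 's', 'black', 'cat']
print(find_words(sample, 7))  # ["Cathy's", 'black', 'cat']
```

With a maximum length of 5 or 6 the emulated pattern gives Cathy and s separately, and from 7 on it gives Cathy's as one match, which is the same behaviour I get in the demo program.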
I would very much appreciate any clues. Many thanks in advance!
Best regards
Michał