Skip to Content

Remove all HTML flags from text field

Oct 31, 2017 at 04:56 PM


avatar image

Hi Expert,

I have a text field, and I need to remove all HTML relevant flags in SAP HANA sql script (such as :<>,<h1>,<h2> etc), as there are a lot of flags , I don't want to remove them one by one, is there any easy way to remove them .

Many Thanks,

10 |10000 characters needed characters left characters exceeded
* Please Login or Register to Answer, Follow or Comment.

1 Answer

Best Answer
Florian Pfeffer
Nov 01, 2017 at 07:15 AM

Have you tried it with the REPLACE_REGEXPR function?

Following a little quick and dirty coding which replaces the HTML tags in a string with the help of that function and a regular expression. Of course you maybe need to invest a little bit more to find a more accurate regular expression to consider "<", "</" and ">" characters which are not part of an HTML tag.

    declare htmlString nvarchar(255) := '<html><head>Test Head</head><body>Test Body</body></html>';
    result = select replace_regexpr('[</?.*>]' in "htmlString" with '' occurrence all) as "cleanedString" 
        from ( select :htmlString as "htmlString" from dummy );
    select * from :result;


Show 1 Share
10 |10000 characters needed characters left characters exceeded

This might work to remove regular tags, especially when the tags are closed correctly etc.

For a general solution, that persistently does un-html all text, it might not exactly be sufficient, but more often than not, good enough is, well, good enough.

On the entertaining side of this, I felt reminded of the famous SO post (RegEx match open tags except XHTML self-contained tags). :)