Data Quality Regex

Former Member
0 Kudos

Hello, community,

I am creating rules for profiling data in Information Steward using regular expressions, and it is proving quite time-consuming. Could you please share the regexes you use?

Accepted Solutions (0)

Answers (2)

adrian_storen
Active Participant
0 Kudos

Erlan,

The thing with regex is that rules are likely to be unique to your data (i.e. a name starts with XXX or YYY, or contains or ends in some value), which makes it hard to know what you're after. However, as regex is a widely used standard, you can search online for techniques and adapt them to your needs.

Some examples that may suit you are:

Email (RFC 5322): match_regex(rtrim(lower($email_address), ' '), '[a-z0-9!#$%&*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?', NULL)

Landline Phone: match_regex(rtrim($landline_number, ' '), '[0-9]{2}\s[0-9]{4}\s[0-9]{4}', NULL) OR match_regex(rtrim($landline_number, ' '), '[0-9]{2}\s[0-9]{8}', NULL)

Mobile/Cell Phone: match_regex(rtrim($mobile_number, ' '), '04[0-9]{2}\s[0-9]{3}\s[0-9]{3}', NULL) OR match_regex(rtrim($mobile_number, ' '), '04[0-9]{2}\s[0-9]{6}', NULL)
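
If you want to try these patterns against sample values before building the rules, a rough Python sketch along the following lines may help. It is not Information Steward code: the function names are made up for illustration, and it assumes that match_regex requires the whole value to match, which is what `re.fullmatch` does here. Python's regex flavor may also differ in small details from the one used by the product.

```python
import re

# Patterns copied from the rules above; the constant names are illustrative only.
EMAIL = re.compile(
    r"[a-z0-9!#$%&*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&*+/=?^_`{|}~-]+)*"
    r"@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"
)
# The two OR-ed match_regex calls are folded into one alternation per field.
LANDLINE = re.compile(r"[0-9]{2}\s[0-9]{4}\s[0-9]{4}|[0-9]{2}\s[0-9]{8}")
MOBILE = re.compile(r"04[0-9]{2}\s[0-9]{3}\s[0-9]{3}|04[0-9]{2}\s[0-9]{6}")


def is_valid_email(value: str) -> bool:
    # Mirrors rtrim(lower($email_address), ' '): lowercase, strip trailing spaces.
    return EMAIL.fullmatch(value.lower().rstrip(" ")) is not None


def is_valid_landline(value: str) -> bool:
    # Mirrors rtrim($landline_number, ' '): only trailing spaces are removed.
    return LANDLINE.fullmatch(value.rstrip(" ")) is not None


def is_valid_mobile(value: str) -> bool:
    # Mirrors rtrim($mobile_number, ' ').
    return MOBILE.fullmatch(value.rstrip(" ")) is not None


if __name__ == "__main__":
    for sample in ["02 9876 5432", "0298765432", "0412 345 678"]:
        print(repr(sample), is_valid_landline(sample), is_valid_mobile(sample))
```

Note that only trailing spaces are trimmed, as in the rules above, so a value with a leading space would still fail.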

regards

Adrian

Former Member
0 Kudos

Hi, Adrian,

Thank you for your reply. Yes, that's what I had figured out. I have created some general rules, such as checks for whitespace, symbols, punctuation, and letter case. I will then get the requirements for the unique rules from the business users.
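
For anyone looking for a starting point, general checks of this kind could be sketched in Python roughly as below. This is only an illustration and not the actual rules from my project: the check names and patterns are assumptions, and they would still need to be translated into Information Steward rule expressions.

```python
import re

# Hypothetical generic profiling checks; names and patterns are illustrative.
CHECKS = {
    # Whitespace at the very start or very end of the value.
    "has_outer_whitespace": re.compile(r"^\s+|\s+$"),
    # Two or more consecutive space characters anywhere in the value.
    "has_repeated_spaces": re.compile(r" {2,}"),
    # Any character that is not a letter, digit, underscore, or whitespace.
    "has_punctuation_or_symbol": re.compile(r"[^\w\s]"),
    # Any ASCII lowercase letter, e.g. to flag fields expected in upper case.
    "has_lowercase": re.compile(r"[a-z]"),
}


def profile_value(value: str) -> dict:
    """Return a flag per check telling whether the pattern occurs in value."""
    return {name: bool(pattern.search(value)) for name, pattern in CHECKS.items()}


if __name__ == "__main__":
    for sample in [" ACME  Corp.", "ACME"]:
        print(repr(sample), profile_value(sample))
```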

Thank you!

Kevin_SAP
Advisor
0 Kudos

What product is this for?

Regards,

Kevin

Former Member
0 Kudos

SAP Information Steward. However, regexes are largely independent of the product, i.e. the core syntax is much the same everywhere, although individual regex flavors can differ in the details.