replace_regex

Description

Replaces all instances of a matched regex pattern in the input string with a provided value.

The first parameter specifies the input string, the second parameter specifies the regex pattern, the third parameter specifies the replacement. If the replacement is empty all matched patterns are removed. The fourth parameter specifies which occurrences are to be replaced. If 0 is specified all occurrences are replaced. The fifth parameter specifies optional regex flags:

  • c - Enables case-sensitive matching (default.)

  • i - Enables case-insensitive matching.

  • m - Enables multi-line mode (i.e. meta-characters ^ and $ match the beginning and end of any line of the input string.) By default, multi-line mode is disabled (i.e. ^ and $ match the beginning and end of the entire input string.)

  • s - Enables the POSIX wildcard character . to match \n (newline.) By default, . does not match \n.

For more about syntax, see POSIX extended regular expressions.

This function returns the input without modification for the empty regular expression "". However, for the empty group regular expression "()", it adds the replacement string at the specified occurrences. It returns null in case of null or invalid arguments, or any other error.

Return type

string

Domain

This is a scalar function (calculates a single output value for a single input row.)

Categories

Usage

replace_regex(input_string, pattern, replacement, [ occurrence ], [ flags ])

Argument

Type

Optional

Repeatable

Restrictions

input_string

string

no

no

none

pattern

regex

no

no

constant

replacement

string

no

no

none

occurrence

int64

yes

no

constant

flags

string

yes

no

constant

Examples

make_col date:"2001-31-12"
make_col new_date:replace_regex(date, /^.*([0-9]{4,4})-([0-9]{1,2})-([0-9]{1,2}).*$/,'\\3/\\2/\\1', 0, 's')

Use parenthesis to encapsulate groups and refer to them via the double backslash and their index, starting from 1.

Aliases

regex_replace (deprecated)