get_regex_all

Description

Returns an array containing every part of the input string that matches the given regular expression. The array is empty if nothing matches. There are two optional parameters: a capture group parameter that specifies which group to capture, and a string parameter that specifies flags:

  • c - Enables case-sensitive matching (default).

  • i - Enables case-insensitive matching.

  • m - Enables multi-line mode, in which the meta-characters ^ and $ match the beginning and end of any line of the input string. By default, multi-line mode is disabled, and ^ and $ match only the beginning and end of the entire input string.

  • s - Enables the POSIX wildcard character . to match \n (newline). By default, . does not match \n.

For more about regular expression syntax, see POSIX extended regular expressions.

This function returns null for the empty regular expression "" for all inputs. However, for the empty group regular expression "()", it returns an array containing the empty string "" found at each position between characters, for all valid inputs except null. It returns null if any argument is null or invalid, or if any other error occurs.

Return type

array

Domain

This is a scalar function (it calculates a single output value for a single input row).

Categories

Usage

get_regex_all(input_string, pattern, [ group ], [ flags ])

Argument      Type     Optional  Repeatable  Restrictions
input_string  variant  no        no          none
pattern       regex    no        no          constant
group         int64    yes       no          constant
flags         string   yes       no          constant

Examples

make_col dotted_quads:get_regex_all(log, /[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/)

Extract all occurrences of four numbers (each 1 to 3 digits long) separated by periods from the input column log into an array column named dotted_quads. If the input log contains no such quads, the result is an empty array.
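The example above uses neither optional argument. As a minimal sketch of how the group and flags arguments might be combined, following the Usage signature: the code=<digits> log format and the error_codes column name are hypothetical, chosen only for illustration.

```opal
// Hypothetical log format: entries such as "code=404" or "CODE=500" inside the log column.
// Group 1 is the parenthesized digit run; "i" requests case-insensitive matching.
make_col error_codes:get_regex_all(log, /code=([0-9]+)/, 1, "i")
```

Under these assumptions, error_codes would be expected to hold only the text captured by group 1 (the digits) for each match, and the i flag would let code= also match CODE=. Both group and flags are constants, per the argument table.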

Aliases

match_regex_all (deprecated)