count_regex_matches

Description

Returns the number of times the given regular expression pattern occurs in the argument string.

There are two optional parameters:

  • a position parameter that specifies how many characters from the beginning of the string the function starts searching for matches

  • a string parameter to specify flags:

    • c - Enables case-sensitive matching (default).

    • i - Enables case-insensitive matching.

    • m - Enables multi-line mode, in which the meta-characters ^ and $ match the beginning and end of any line of the input string. By default, multi-line mode is disabled, and ^ and $ match only the beginning and end of the entire input string.

    • s - Enables the POSIX wildcard character . to match \n (newline). By default, . does not match \n.

For more about regular expression syntax, see POSIX extended regular expressions.
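The effect of the optional parameters can be sketched outside the query language. The Python function below is a hypothetical stand-in, not the actual implementation: it uses Python's re module, whose syntax differs in places from POSIX extended regular expressions, and it assumes that position is a zero-based character offset and that flag letters may be combined in one string (for example "im").

```python
import re


def count_regex_matches(text, pattern, position=0, flags="c"):
    """Illustrative emulation of the documented behavior (an assumption-laden sketch)."""
    re_flags = 0
    for flag in flags:
        if flag == "c":
            re_flags &= ~re.IGNORECASE  # case-sensitive matching (the default)
        elif flag == "i":
            re_flags |= re.IGNORECASE   # case-insensitive matching
        elif flag == "m":
            re_flags |= re.MULTILINE    # ^ and $ match at every line boundary
        elif flag == "s":
            re_flags |= re.DOTALL       # . also matches \n
        else:
            raise ValueError(f"unsupported flag: {flag!r}")
    compiled = re.compile(pattern, re_flags)
    # finditer yields non-overlapping matches, starting the search at `position`.
    return sum(1 for _ in compiled.finditer(text, position))


log = "Error: disk\nerror: net\nERROR: cpu"
print(count_regex_matches(log, r"error"))             # 1 (case-sensitive)
print(count_regex_matches(log, r"error", 0, "i"))     # 3
print(count_regex_matches(log, r"^error", 0, "im"))   # 3 (one per line)
print(count_regex_matches("a\nb", r"a.b", 0, "s"))    # 1
```

In this sketch, matches are counted without overlap, and a match that begins before position is not counted.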

Return type

int64

Domain

This is a scalar function (calculates a single output value for a single input row).

Categories

Usage

count_regex_matches(input, pattern, [ position ], [ flags ])

Argument    Type        Optional    Repeatable    Restrictions

input       storable    no          no            none

pattern     regex       no          no            constant

position    int64       yes         no            constant

flags       string      yes         no            constant

Examples

make_col dotted_quads_count:count_regex_matches(log, /[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/)

Counts all occurrences of four numbers (each 1 to 3 digits long) separated by periods in the input column log, and stores the result in an int64 column named dotted_quads_count. If the log value contains no such quads, the function returns 0.
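To see what this pattern counts on concrete inputs, here is a rough Python equivalent. It is only an illustration: the function name count_dotted_quads and the sample strings are hypothetical, and Python's re module stands in for the POSIX engine.

```python
import re

# Same pattern as in the example above: four groups of 1-3 digits separated by periods.
DOTTED_QUAD = re.compile(r"[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}")


def count_dotted_quads(log):
    """Count non-overlapping dotted quads in one log line (hypothetical helper)."""
    return len(DOTTED_QUAD.findall(log))


print(count_dotted_quads("10.0.0.1 -> 192.168.1.20 ok"))  # 2
print(count_dotted_quads("no addresses here"))            # 0
print(count_dotted_quads("999.999.999.999"))              # 1
```

The last input shows that the pattern checks only the shape of the text, not whether each number falls in the 0-255 range of a valid IPv4 address.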