January 27, 2023 Release Notes

Features

You may encounter errors when ingesting data into Observe. Use the System datastream to help diagnose issues with incoming data. This datastream contains many different types of observations about activity in your workspace. To investigate ingest issues, open the System datastream and filter for observations whose OBSERVATION_KIND is ingest_error, or click View in the last error column for an ingest token on the Datastream Details page. See Troubleshooting Data Ingestion.

OPAL

join on

join on (same([email protected])), name:@container.name

Perform an inner join between the default input of the verb and the input container, on the condition that the default input’s container_id column contains the same value as the other input’s id column. The same function compares the two columns.

join on(value > @right.min and value < @right.max), name:@right.name

Perform an inner join between the default input of the verb and the input @right, on the condition that value is strictly between that input’s min and max columns.
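The range condition can be easier to follow as a nested loop. The following is a minimal Python sketch of the join semantics only, not OPAL and not how Observe executes the join; the sample rows are hypothetical and the column names mirror the example above.

```python
# Hypothetical rows; column names mirror the OPAL example above.
left = [{"value": 5}, {"value": 10}, {"value": 42}]
right = [
    {"name": "low", "min": 0, "max": 10},
    {"name": "high", "min": 10, "max": 100},
]


def range_join(left_rows, right_rows):
    """Inner join: keep each left/right pair where min < value < max."""
    joined = []
    for row in left_rows:
        for other in right_rows:
            if row["value"] > other["min"] and row["value"] < other["max"]:
                # Corresponds to name:@right.name, which adds the
                # matching right-hand name as a new column.
                joined.append({**row, "name": other["name"]})
    return joined


result = range_join(left, right)
```

Because both comparisons are strict, a value that sits exactly on a boundary (10 in this sample) matches neither range and is dropped from the inner-join result.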

join on(container_id=@container.id, frame(back:5s, ahead:1s))

Perform an inner-join between the default input and input @container. The container_id from the default input must be equal to the id column from the input @container, and the timestamp from the input @container must overlap with a window of [t - 5s, t + 1s] where t is the timestamp of the default input.
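As an illustration of the frame condition, here is a rough Python sketch. It assumes each row carries a single point timestamp in seconds, which is a simplification, and the function name frame_join and the sample rows are hypothetical. It is not OPAL and does not reflect how Observe implements the join.

```python
def frame_join(left_rows, right_rows, back=5.0, ahead=1.0):
    """Pair rows with equal ids whose right-hand timestamp lies in [t - back, t + ahead]."""
    joined = []
    for row in left_rows:
        t = row["timestamp"]
        for other in right_rows:
            same_id = row["container_id"] == other["id"]
            in_window = t - back <= other["timestamp"] <= t + ahead
            if same_id and in_window:
                joined.append((row, other))
    return joined


# Hypothetical sample data, timestamps in seconds.
events = [{"container_id": "c1", "timestamp": 100.0}]
containers = [
    {"id": "c1", "timestamp": 96.0},   # 4s before t: inside the window
    {"id": "c1", "timestamp": 101.5},  # 1.5s after t: outside the window
    {"id": "c2", "timestamp": 100.0},  # inside the window, but a different id
]
matches = frame_join(events, containers)
```

Only the first container row is paired with the event: the second falls outside the window and the third has a different id.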

unsort

unsort

Removes any column sort-order metadata that the sort verb may have inserted. The removal applies to the rest of the current OPAL pipeline.

extract_regex

Add one or more columns by matching the named capture groups of a regular expression against a given source expression. Each named capture group becomes a column, and regex extractions create string columns. Named capture groups are an extension to POSIX extended regular expressions.

If the column already exists, and it is of type string, the original value is replaced with the matched text. If the regular expression does not match anything, the original value is preserved.

If the column already exists, but it is not a string column, extract_regex returns an error.

See also: make_col.

The flags argument specifies optional regex flags:

  • c - Enables case-sensitive matching (default).

  • i - Enables case-insensitive matching.

  • m - Enables multi-line mode, in which the meta-characters ^ and $ match the beginning and end of any line of the input string. By default, multi-line mode is disabled, and ^ and $ match only the beginning and end of the entire input string.

  • s - Enables the POSIX wildcard character . to match \n (newline). By default, . does not match \n.

For more about syntax, see POSIX extended regular expressions.
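To see the effect of each flag, the Python sketch below uses the closest equivalents in Python's re module (re.IGNORECASE for i, re.MULTILINE for m, re.DOTALL for s). This is an analogy only: Python's regex engine is not the one extract_regex uses, and the sample text is hypothetical.

```python
import re

text = "first line\nERROR second line"

# c (default): case-sensitive, so lowercase "error" does not match "ERROR".
case_sensitive = re.search(r"error", text)
# i: case-insensitive matching.
case_insensitive = re.search(r"error", text, re.IGNORECASE)

# Without m, ^ anchors only at the start of the entire string.
single_line = re.search(r"^ERROR", text)
# m: ^ and $ also match at the beginning and end of each line.
multi_line = re.search(r"^ERROR", text, re.MULTILINE)

# Without s, "." does not match the newline between the two lines.
no_dotall = re.search(r"line.ERROR", text)
# s: "." also matches "\n".
dotall = re.search(r"line.ERROR", text, re.DOTALL)
```

In each pair, only the second search finds a match.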

extract_regex also supports typecasting capture group columns using the following syntax: (?P<value::float64>). Here the named capture group column “value” is cast to float64 using the float64 typecast function. The currently supported typecast functions are float64, int64, string, parse_isotime (typecast to timestamp), duration, duration_ms, duration_sec, duration_min, duration_hr and parse_json.
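The idea behind the name::type suffix can be sketched in Python as follows. This is an illustrative emulation under stated assumptions, not Observe's implementation: the helper extract_with_casts and the CASTS table are hypothetical, only three of the listed cast names are mapped, and Python's re module stands in for the real regex engine.

```python
import re

# Assumed mapping from a few of the cast names to Python callables.
CASTS = {"float64": float, "int64": int, "string": str}

# Matches a named-group opener carrying a cast suffix, e.g. "(?P<value::float64>".
CAST_GROUP = re.compile(r"\(\?P<(\w+)::(\w+)>")


def extract_with_casts(pattern, text):
    """Return a dict of named captures, converting groups declared as name::type."""
    casts = {}

    def strip_cast(m):
        # Remember the requested cast and emit a plain named group.
        casts[m.group(1)] = CASTS[m.group(2)]
        return f"(?P<{m.group(1)}>"

    plain = CAST_GROUP.sub(strip_cast, pattern)
    match = re.search(plain, text)
    if match is None:
        return None
    return {
        name: casts.get(name, str)(value) if value is not None else None
        for name, value in match.groupdict().items()
    }


row = extract_with_casts(r"(?P<name>[a-z]+)=(?P<value::float64>[0-9.]+)", "latency=12.5")
```

In this sketch, groups without a suffix stay strings, which matches the statement that regex extractions create string columns by default.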

extract_regex message, /status=(?P<statuscode>\d+)/

Create the column statuscode by matching status= followed by one or more digits in the message column.

extract_regex inputcol, /(?P<sensor>[^|]*)\|count:(?P<counts>[^|]*)\|env:(?P<env>[^|]*)/

Given an input column value like “studio-aqi|count:654 201 28 0 0 0|env:3 4 4a”, generate three output columns: “sensor” with the value “studio-aqi”, “counts” with the value “654 201 28 0 0 0”, and “env” with the value “3 4 4a”.
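The same pattern can be checked outside Observe, because Python's re module accepts the (?P<name>...) named-group syntax. The sketch below uses Python rather than OPAL and only shows how the three groups split the sample value.

```python
import re

# Same pattern as the extract_regex example above.
pattern = re.compile(
    r"(?P<sensor>[^|]*)\|count:(?P<counts>[^|]*)\|env:(?P<env>[^|]*)"
)
value = "studio-aqi|count:654 201 28 0 0 0|env:3 4 4a"

match = pattern.search(value)
# Each named capture group becomes one output column.
columns = match.groupdict() if match else {}
```

Each [^|]* group stops at the next pipe character, so the spaces inside the counts and env fields are preserved.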

extract_regex message, /(?P<date::parse_isotime>[0-9:TZ.]+) (?P<name>[a-z]+)=(?P<value::float64>[0-9.]+)/

Given an input column called message, generate three output columns called date, name and value from the corresponding capture groups. The date column is cast to the timestamp type using parse_isotime, the name column keeps the default string type, and the value column is cast to float64 using float64.