Examples¶
Filtering¶
One of the most common OPAL operations is searching for data matching, or not matching, a condition. The filter verb accepts a filter expression, and returns all matching events in the query time window. Additional verbs provide specialized matching conditions such as uniqueness, existence or non-existence, and top values.
Filter expressions¶
Filter expressions consist of Boolean expressions that can use any supported OPAL functions and operators, including the special ~ inexact search operator. Some examples of the simplest filter expressions include the following:
// keep only rows where "is_connected" field is "true"
filter is_connected
// keep only rows where "severity" field is not "info"
filter not severity = "info"
// keep only rows where "severity" field is not "DEBUG"
filter severity != "DEBUG"
// keep only rows with temperature out of range
filter temperature < 97 or temperature > 99
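Because a filter expression is an ordinary Boolean expression, the conditions above can be combined. The following is a minimal sketch that reuses the illustrative is_connected, severity, and temperature fields from the examples above; parentheses are added so the grouping does not depend on operator precedence.

```opal
// keep connected devices whose temperature is out of range
filter is_connected and (temperature < 97 or temperature > 99)
// keep disconnected devices, ignoring "DEBUG" rows
filter (not is_connected) and severity != "DEBUG"
```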
The ~ (tilde) operator supports inexact matching, and can match one or all of the fields. Some examples of inexact matches include:
// keep rows where "log" field contains "hello", case-insensitive
filter log ~ hello
// keep rows where any field contains "hello", case-insensitive
filter * ~ hello
Those OPAL examples use hello as a search term. Search terms can have any of the following:
Letters, digits, or underscores.
Any other symbols, such as spaces, quotes, colons, or dashes, which must be quoted using single ' or double " quote strings. You can add any characters inside the quotes, and you can slash-escape quote characters. Quoted search term segments are case-sensitive. For example, "fig\"bar" matches fig"bar, but not Fig"bar.
The glob character *, which matches 0 or more characters of any type, including newlines. For example, fig*bar matches fig123bar and fIgBaR.
Glob * can also anchor text to the beginning or end of the string when used at the beginning or end of the search term. For example, fig* only matches strings beginning with fig.
An * inside a quoted string searches for a literal asterisk character, such as "* list item".
You can use globs together with quoted strings, for example: "fig:"*":bar"
Multiple glob * characters can be used in a single search, such as: "fig:"*":bar:"*"baz"
Leading newlines and spaces in the source data are ignored.
A search term may optionally start with - to invert the match: -foo matches any string which does not contain foo.
Non-quoted search terms always match case-insensitively (though quoted search terms are case sensitive). For example:
// data, search, match
us-west-2, filter source ~ WEST, TRUE
us-west-2, filter source ~ "us-WEST-2", FALSE
us-west-2, filter source ~ US"-"WEST"-"2, TRUE
More search term examples:
test
allowed_characters'Disallowed\'#:%Characters!'123
hello*world
line_start*anything*
*anything*line_end
-negative_term
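The search terms above can be placed directly on the right side of ~. The following sketch applies several of them to the log field used in the earlier examples; the field name and the searched values are illustrative only.

```opal
// "fig:" and ":bar" are matched case-sensitively, with anything in between
filter log ~ "fig:"*":bar"
// trailing glob: only values beginning with "hello", case-insensitive
filter log ~ hello*
// inverted term: values that do not contain "debug"
filter log ~ -debug
// the asterisk inside quotes is a literal character, not a glob
filter log ~ "* list item"
```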
Multiple search terms can be combined using Boolean expressions. For example:
filter log ~ fig AND log ~ bar AND log ~ baz
// "log" field must include all 3 words in no particular order
filter search(log, "fig", "bar", "baz")
// this will match rows where log starts with `foo` and doesn't contain `bar`
filter log ~ foo* AND log ~ -bar
// "or" and other boolean operators can be used between `~` expressions:
filter log ~ foo OR log ~ bar
// parenthetical expressions can be grouped
filter (log ~ foo AND log ~ bar) OR log ~ baz
The tilde ~ operator also accepts POSIX extended regular expressions and IPv4 CIDRs.
// matching on a regular expression
filter log ~ /foo|bar/
// same as
filter match_regex(log, /foo|bar/)
// IP matching
filter ip ~ 192.168.0.0/16
// can also use wild cards
filter ip ~ 192.168.*.*
// or even shorter. At least two segments with two dots are required
filter ip ~ 192.168.*
The left side of the ~ expression can be any field (converted to a string if necessary), a JSON payload, or *, which means that the condition must be matched by at least one field.
// any field contains "error", case insensitive
filter * ~ error
// none of the fields contain "error"
filter * !~ error
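As a sketch of the other left-side cases described above, the operand can also be a non-string column or a JSON payload. The status_code and FIELDS column names here are assumptions for illustration, not part of any particular dataset.

```opal
// numeric column: the value is converted to a string before matching
filter status_code ~ 50*
// JSON payload column: keep rows whose payload contains "timeout"
filter FIELDS ~ timeout
```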
Unicode characters¶
There are several ways to use non-ASCII text with filter:
Text containing Unicode characters may be typed or pasted into the OPAL console like any other text.
Examples:
filter <हर दिन>
filter @."ввод" < 5
// These are equivalent
filter "😀"
filter "\x{1F600}"
Unicode or special characters in a regular expression may be either a character or a hex value, but you must also specify the columns to search with ~:
Examples:
filter message ~ /😀/
filter message ~ /\x{1F600}/
filter message ~ /\x{000d}\x{000a}/
filter message + name ~ /\x{000d}\x{000a}/
filter (message ~ /\x{000d}\x{000a}/) or (name ~ /\x{000a}/)
Handling null values¶
In OPAL, null values always have a type, but they are not handled in the same way as regular values. This is particularly important in comparisons.
This statement returns events with a severity not equal to DEBUG, but only for events that have a severity value:
filter not severity="DEBUG"
An event that does not have a severity (in other words, the value is null) will never match. Use is_null or if_null to explicitly include such events:
// exclude "DEBUG" but include null
filter not severity="DEBUG" or is_null(severity)
// replace null with empty string, then check
filter if_null(severity, '') != "DEBUG"
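The effect of null on these comparisons can be summarized row by row. The comment table below restates the behavior described above for three illustrative severity values; it is a reading aid, not output from a real query.

```opal
// severity | filter not severity="DEBUG" | ... or is_null(severity)
// "INFO"   | kept                        | kept
// "DEBUG"  | dropped                     | dropped
// null     | dropped                     | kept
filter not severity="DEBUG" or is_null(severity)
```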
For filter expressions using contains(), ensure what filter compares against (the result of the contains()) isn’t null:
// This filter expression suppresses null values,
// because contains(field_with_nulls, "string") returns null
filter not contains(severity, "DEBUG")
// These filter expressions include null values,
// because potential null values are handled
filter is_null(severity) or not contains(severity, "DEBUG")
filter not contains(if_null(severity, ""), "DEBUG")
For some comparisons, you may also compare with a null value of the appropriate type.
make_col positive_or_null:case(value > 0, value, true, int64_null())
Specialized filter verbs¶
In addition to filter, OPAL uses several additional verbs for different types of filter operations. See the OPAL filter verbs documentation for details. (Note that several of these verbs need a frame to be streamable.)
Fields¶
Change a field type¶
To change the type of an existing field, create a new field with the desired type. Use a new name to keep both, or replace the existing one by giving it the same name. This is useful when creating metrics, which require numeric fields to be float64.
Example:
make_col temperature:float64(temperature)
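The two options described above, keeping both fields or replacing the original, differ only in the name given to the new column. A short sketch, where temperature_f64 is a hypothetical column name:

```opal
// keep both: the original column stays and a float64 copy is added
make_col temperature_f64:float64(temperature)
// replace: reusing the name overwrites the column with the converted value
make_col temperature:float64(temperature)
```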
Extract from JSON¶
Reference properties in a JSON payload with either the dot or bracket operators:
make_col data:string(FIELDS.data), kind:string(FIELDS["name"])
Quote the string if the property name has special characters:
make_col userName:someField["user name"]
make_col userCity:someField."user city"
make_col requestStatus:someField.'request.status'
You may also combine methods:
// Sample data: {"fields": {"deviceStatus": {"timestamp": "2019-11-15T00:00:06.984Z"}}}
make_col timestamp1:fields.deviceStatus.timestamp
make_col timestamp2:fields["deviceStatus"]["timestamp"]
make_col timestamp3:fields.deviceStatus.["timestamp"]
make_col timestamp4:parsejson(string(fields.deviceStatus)).timestamp
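Extracted properties are typically converted to a concrete type before they are compared. The sketch below assumes a payload shaped like the sample in its first comment and assumes that float64() accepts a value pulled from JSON; the property names are illustrative.

```opal
// Assumed sample data: {"device": {"temperature": "98.6", "state": "ok"}}
make_col temperature:float64(FIELDS.device.temperature)
make_col state:string(FIELDS.device.state)
// the typed columns can now be used in ordinary filter expressions
filter state != "ok" or temperature > 99
```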
Extract and modify values using replace_regex():
make_col state:replace_regex(string(FIELDS.device.date), /^.*([0-9]{4,4})-([0-9]{1,2})-([0-9]{1,2}).*$/, '\\3/\\2/\\1', 1)
make_col state:replace_regex(string(FIELDS.device.state), /ошибка/, "error", 0)
make_col state:replace_regex(string(FIELDS.device.manufacturer), /\x{2122}/, "TM", 0)
Extract with a regex¶
Use extract_regex to extract fields from a string.
extract_regex data, /(?P<deviceid>[^|]*)\|count:(?P<counts>[^|]*)\|env:(?P<env>[^|]*)/
Note
extract_regex allows named capture groups, unlike filter expressions.
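To make the capture groups concrete, the sketch below walks one assumed input value through the pattern. The sample string is hypothetical, and the follow-up conversion assumes the captured columns are strings.

```opal
// Assumed sample value of "data": abc123|count:42|env:prod
extract_regex data, /(?P<deviceid>[^|]*)\|count:(?P<counts>[^|]*)\|env:(?P<env>[^|]*)/
// Captured values for that sample: deviceid = abc123, counts = 42, env = prod
// Convert the captured text before using it as a number
make_col counts:float64(counts)
filter env = "prod"
```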
Metrics¶
Registering with set_metric¶
set_metric registers a single metric. It accepts an options object containing details of its type, unit, how it should be aggregated, and other options.
set_metric options(label:"Temperature", type:"gauge", unit:"C", rollup:"avg", aggregate:"avg", interval:5m), "temperature"
set_metric options(label:"Power", description:"Power in watts", type:"gauge", rollup:"avg", aggregate:"avg"), "power"
The type of a metric determines how its values are interpreted.
| Metric type | Description |
|---|---|
| cumulativeCounter | A monotonically increasing total over the life of the metric. A cumulativeCounter value is never negative. |
| delta | The difference between the current metric value and its previous value. |
| gauge | A measurement at a single point in time. |
A metric rollup method determines how multiple data points for the same metric are summarized over time. A single value is created for multiple values in each rollup time window.
| Rollup method | Description |
|---|---|
| avg | The average (arithmetic mean) of all values in the window. |
| count | The number of non-null values in the window. |
| max | The largest value. |
| min | The smallest value. |
| rate | The rate of change across the window, which may be negative for delta and gauge types. A negative rate for a cumulativeCounter is treated as a reset. |
| sum | The sum of all values in the window. |
The aggregate type determines how values are aggregated across multiple metrics of the same type, for example, temperature metrics from multiple devices. Aggregate types correspond to the aggregate function of the same name.
| Aggregate type | Description |
|---|---|
| any | An arbitrary value from the window, nondeterministically selected. Useful if you need a representative value. It may be, but is not guaranteed to be, faster to calculate than other methods. |
| any_not_null | Like any, but guaranteed to be not null. |
| avg | The average (arithmetic mean). |
| count | The number of non-null values. |
| countdistinct | An estimate of the number of unique values in the window. Faster than countdistinctexact. |
| countdistinctexact | The number of unique values in the window, slower but more accurate than countdistinct. |
| max | The largest value in the window. |
| median | An approximation of the median value, faster than medianexact. |
| medianexact | The median value across the window. |
| min | The smallest value in the window. |
| stddev | The standard deviation across all values in the window. |
| sum | The sum of all values in the window. |
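Putting the metric type, rollup method, and aggregate type together, a counter-style metric might be registered as sketched below. The metric name bytes_sent, its label, and the unit string are assumptions for illustration; the option keys are the same ones used in the set_metric examples above.

```opal
// type "cumulativeCounter": the total only increases over the life of the metric
// rollup "rate": each time window is summarized as a rate of change
// aggregate "sum": values from multiple sources are added together
set_metric options(label:"Bytes sent", type:"cumulativeCounter", unit:"B", rollup:"rate", aggregate:"sum"), "bytes_sent"
```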
Note
For more about units, see Introduction to Metrics.
Links¶
Observe represents foreign keys through a concept called links. Links consist of two pieces: a mapping of columns in the current dataset to columns in a target dataset, and a name to refer to the link itself.
Creating with set_link¶
Use set_link to create a link, and define the mapping between columns in the current and target datasets.
set_link ^Cluster, clusterUid:@"K8s Cluster".uid
This defines a link named ^Cluster between the current dataset and @"K8s Cluster", and maps the column clusterUid to the uid column of @"K8s Cluster".
Links may also be defined using a composite key, which can be composed of more than one column:
set_link ^Container,
containerName:@Container.name,
podName:@Container.podName,
clusterUid:@Container.clusterUid,
namespace:@Container.namespace
Four local columns, namely containerName, podName, clusterUid and namespace, link to their counterparts in @Container. This is necessary because the @Container dataset defines the primary key in terms of those four columns.
Link Labels¶
Because links are represented in the dataset by the key components, it’s often not clear at a glance what value is pointed to by a link. For example, a cluster might have a primary key which is a UUID like 4ef39c4f-7685-11e8-9d40-02ab4d0e1e2e; while precise and unambiguous, it is not very useful for humans examining the data. Instead, Observe hides these key columns by default, and presents a column of type link, which is rendered using the label column of the target dataset.
For example, with the ^Cluster link defined above, although the local dataset stores only a clusterUid column containing the UUID, this is displayed as a column named ^Cluster with the value of @"K8s Cluster".name inlined. The displayed value comes from the matched row in the target dataset, taken from the column that set_label designates as the target dataset’s label.
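For a label to be shown, the target dataset needs a designated label column. The sketch below assumes that the K8s Cluster dataset has a name column and that set_label takes that column as its argument; check the exact signature against the set_label reference before relying on it.

```opal
// in the definition of the target "K8s Cluster" dataset:
// designate the (assumed) name column as the dataset's label
set_label name
```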
Sometimes, it is more helpful to filter by the value displayed for a link column rather than the local column values themselves. The label() function can be used on links to retrieve the label for the linked resource, which can then be used as a normal string value:
filter label(^Cluster) ~ /prod.*/i
This filters out any rows whose ^Cluster link does not have a label matching the regular expression /prod.*/i.
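Because label() returns an ordinary string, the inexact search terms from the Filtering section should apply to it as well. A sketch under that assumption, with illustrative search terms:

```opal
// search term instead of a regular expression:
// labels beginning with "prod", case-insensitive
filter label(^Cluster) ~ prod*
// inverted term: drop rows whose cluster label contains "test"
filter label(^Cluster) ~ -test
```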
Joining With Links¶
Since links define a mapping from the current dataset to another input dataset, a simple equijoin between the two datasets can be written succinctly:
leftjoin ^Container, state:^Container.state
This is equivalent to an explicit equijoin between the source and target columns defined by the link:
leftjoin on(
containerName=@Container.name and
podName=@Container.podName and
clusterUid=@Container.clusterUid and
namespace=@Container.namespace
),
state:@Container.state
This syntax mirrors the natural join syntax wherein the first argument is the dataset, and works for many join verbs, namely: join, full_join, leftjoin and lookup.
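As a sketch of the same shorthand with two of the other verbs named above, the link again takes the place of the explicit on(...) condition. The state column is reused from the leftjoin example; how each verb treats unmatched rows is described in its own reference.

```opal
// the same link shorthand with join
join ^Container, state:^Container.state
// and with lookup
lookup ^Container, state:^Container.state
```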
Grouping By Links¶
It’s frequently useful to aggregate over all rows related to a linked resource, e.g., to summarize logs grouped by the container they come from:
statsby numLines:count(), group_by(^Container...)
The ... is the “unpack” operator. Conceptually, it acts as if you replaced the ^Container argument with a listing of each of the columns in the local dataset. So, the above is equivalent to:
statsby numLines:count(), group_by(containerName, podName, clusterUid, namespace)
which will bin all events in the current dataset by the container they come from, returning the count of events for each container.
Note that you can also group by the label of a link, though this has subtly different semantics.
statsby numLines:count(), group_by(label(^Container))
The above bins rows by the label of the linked resource, that is, the string value presented to you in the UI. Depending on how the @Container dataset is defined, this may or may not give you the same results as grouping by ^Container..., because label values are not guaranteed to be unique. For example, if containers are simply named after the service they run, then group_by(label(^Container)) gives you bins that combine same-named containers across all clusters they appear in. Make sure you select the right grouping behavior for the query you want to perform.