Observe Performance Cookbook: Using Filter instead of Ever
Problem
A query that uses `ever` to search resources in OPAL is taking more time or credits than desired.
Solution
Use `filter` instead of `ever` where the trade-offs explained below are acceptable.
Explanation
The `filter` and `ever` verbs have subtly different semantics when used on Resources. A single resource is actually implemented as a set of rows, where each row has the same primary key (the identifier of the resource), and each row stores the state of the resource during some time interval. Whenever the state of a resource changes, that is, whenever one of its fields changes, a new row is created.
The `ever` verb checks whether, for a given resource, there has ever been a moment in time when the given predicate was true. It does this by looking at all the interval rows for that resource; if the predicate evaluates to true for any of these rows, then all the rows for that resource become part of the result, including those where the predicate was not true. These semantics require a relational (self-)join, which is an expensive operation. The cost quickly adds up when multiple `ever` verbs are stacked.
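The semantics described above can be modeled in a few lines of Python. This is only an illustrative sketch, not OPAL and not how Observe implements the verb: the `Row` type, the `status` field, and the sample data are all hypothetical.

```python
from collections import namedtuple

# Hypothetical interval row: resource key, validity interval, one mutable field.
Row = namedtuple("Row", ["key", "valid_from", "valid_to", "status"])

rows = [
    Row("vm-1", 0, 10, "running"),
    Row("vm-1", 10, 20, "stopped"),
    Row("vm-2", 0, 20, "stopped"),
]

def ever(rows, predicate):
    """Keep every row of any resource that has at least one matching row."""
    # Pass 1: keys of resources for which the predicate held at some point.
    matching_keys = {row.key for row in rows if predicate(row)}
    # Pass 2: join the keys back against the full row set (the self-join).
    return [row for row in rows if row.key in matching_keys]

ever_result = ever(rows, lambda row: row.status == "running")
print(ever_result)
```

In this model, both rows of `vm-1` are returned, including the `stopped` interval, because the resource was running at some point. The second pass over the full row set is what makes the operation comparatively expensive.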
In contrast, `filter` treats the resource as a plain interval dataset. It simply checks the predicate for each interval row individually, and returns the interval rows where the predicate is true.
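A comparable Python sketch of the row-by-row behavior follows. As before, this is an assumed model for illustration only: `Row`, `filter_rows`, and the sample data are hypothetical names, not part of OPAL.

```python
from collections import namedtuple

# Hypothetical interval row: resource key, validity interval, one mutable field.
Row = namedtuple("Row", ["key", "valid_from", "valid_to", "status"])

rows = [
    Row("vm-1", 0, 10, "running"),
    Row("vm-1", 10, 20, "stopped"),
    Row("vm-2", 0, 20, "stopped"),
]

def filter_rows(rows, predicate):
    """Evaluate the predicate on each interval row independently."""
    # A single pass with no join: a row is kept only if it matches itself.
    return [row for row in rows if predicate(row)]

filter_result = filter_rows(rows, lambda row: row.status == "running")
print(filter_result)
```

Only the single `running` interval of `vm-1` survives. No information about the other rows of the same resource is needed, which is why a single pass suffices.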
The results of `filter` and `ever` are the same if all the fields referenced by the predicate are time-immutable. If all the fields are immutable, the predicate either holds for every interval row of a given resource or holds for none of them.
However, if the fields are not immutable, `filter` may leave holes in the result. The interval rows that make up an individual resource may no longer be gap-free, which can cause UI artifacts or subtly incorrect query results in some cases.
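To make the notion of a hole concrete, the sketch below filters a resource whose mutable field flips back and forth, then looks for gaps between the remaining intervals. It is a hypothetical Python model; `Row`, `find_gaps`, and the data are assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical interval row: resource key, validity interval, one mutable field.
Row = namedtuple("Row", ["key", "valid_from", "valid_to", "status"])

rows = [
    Row("vm-1", 0, 10, "running"),
    Row("vm-1", 10, 20, "stopped"),
    Row("vm-1", 20, 30, "running"),
]

def find_gaps(rows):
    """Return (key, gap_start, gap_end) for each hole within a resource."""
    ordered = sorted(rows, key=lambda r: (r.key, r.valid_from))
    gaps = []
    # Compare each row with its successor; a hole exists when the next
    # interval of the same resource starts after the previous one ends.
    for prev, cur in zip(ordered, ordered[1:]):
        if prev.key == cur.key and prev.valid_to < cur.valid_from:
            gaps.append((prev.key, prev.valid_to, cur.valid_from))
    return gaps

# Row-by-row filtering on a mutable field drops the middle interval.
kept = [row for row in rows if row.status == "running"]
gaps_before = find_gaps(rows)
gaps_after = find_gaps(kept)
print(gaps_before, gaps_after)
```

The unfiltered rows cover the whole timeline without gaps, while the filtered rows leave `vm-1` with no state between times 10 and 20.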
That said, for many interactive queries, the result of `filter` is close enough to that of `ever` to be worth the (often significant) performance gains.
Note that the OPAL compiler will automatically rewrite `ever` to `filter` if it can deduce that all the fields of the predicate are immutable. But that requires the metadata to be set correctly (see `set_col_immutable`).