Observe Performance Cookbook: Prefer Lead and Lag to First and Last¶
Problem¶
I want to efficiently propagate non-null values forward or backward in time to ensure that they are usable throughout a transaction.
Solution¶
Using the OPAL editor, find first_not_null
and last_not_null
usage. Replace with lead_not_null
and lag_not_null
window functions and re-verify that the use cases are still working as desired.
Explanation¶
The two methods have similar semantics, but lead_not_null
and lag_not_null
are much more efficient than first_not_null
and last_not_null
. This is because Observe can share all the lead/lag calculation for all frames, while have to compute the first/last for each frame individually.
Better
make_col container_id:window(lag_not_null(container_id), frame(back:10m), group_by(key))
Less Good
make_col container_id:window(last_not_null(container_id), frame(back:10m), group_by(key))