Prefer lead and lag to first and last
Problem
I want to efficiently propagate non-null values forward or backward in time to ensure that they are usable throughout a transaction.
Solution
Using the OPAL editor, find every use of first_not_null and last_not_null. Replace them with the lead_not_null and lag_not_null window functions, then verify that each use case still behaves as intended.
Explanation
The two methods have similar semantics, but lead_not_null and lag_not_null are much more efficient than first_not_null and last_not_null. This is because Observe can share the lead/lag calculation across all frames, whereas it has to compute first/last for each frame individually.
Better
make_col container_id:window(lag_not_null(container_id), frame(back:10m), group_by(key))
Less Good
make_col container_id:window(last_not_null(container_id), frame(back:10m), group_by(key))
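The efficiency argument in the Explanation can be illustrated with a small conceptual model. The Python sketch below is not Observe's implementation and is not OPAL. The function names fill_by_rescanning and fill_by_carrying are hypothetical, the frame is simplified to a count of rows rather than a time range such as 10m, there is no group_by, and the current row is assumed to be part of its own frame (the real OPAL functions may treat the current row differently). It only shows why a calculation that is shared across rows does less work than one that is repeated for every frame.

```python
from typing import List, Optional, Sequence


def fill_by_rescanning(values: Sequence[Optional[str]], window: int) -> List[Optional[str]]:
    """Per-frame approach: every row scans back over its own frame.

    The frame is the current row plus the (window - 1) rows before it.
    Cost grows with len(values) * window in the worst case.
    """
    out: List[Optional[str]] = []
    for i in range(len(values)):
        found = None
        # Walk backwards from row i over at most `window` rows.
        for j in range(i, max(i - window, -1), -1):
            if values[j] is not None:
                found = values[j]
                break
        out.append(found)
    return out


def fill_by_carrying(values: Sequence[Optional[str]], window: int) -> List[Optional[str]]:
    """Shared approach: one pass that remembers where the last non-null value was.

    Each row reuses the state left by the previous row, so the cost
    grows with len(values) only.
    """
    out: List[Optional[str]] = []
    last_idx: Optional[int] = None  # index of the most recent non-null value
    for i, v in enumerate(values):
        if v is not None:
            last_idx = i
        # Use the remembered value only if it is still inside the frame.
        if last_idx is not None and i - last_idx < window:
            out.append(values[last_idx])
        else:
            out.append(None)
    return out


if __name__ == "__main__":
    container_ids = [None, "c1", None, None, None, "c2", None]
    print(fill_by_rescanning(container_ids, 3))
    print(fill_by_carrying(container_ids, 3))
```

Under these assumptions both functions return the same filled column, but the second one never revisits earlier rows. That is the intuition behind preferring the lag/lead variants; the actual cost difference in Observe depends on its query engine.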