Observe Performance Cookbook: Prefer Lead and Lag to First and Last


I want to efficiently propagate non-null values forward or backward in time to ensure that they are usable throughout a transaction.


Using the OPAL editor, find every use of first_not_null and last_not_null. Replace them with the lead_not_null and lag_not_null window functions, then re-verify that each use case still behaves as intended.


The two methods have similar semantics, but lead_not_null and lag_not_null are much more efficient than first_not_null and last_not_null. This is because Observe can share the lead/lag calculation across all frames, whereas it has to compute first/last for each frame individually.
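The cost difference can be illustrated with a rough Python sketch. This is not Observe's implementation and does not model OPAL's exact frame handling. It assumes a row-count frame instead of a time-based one, and the function names per_frame_last_not_null and shared_lag_not_null are hypothetical. The first function rescans its frame for every row, while the second carries one running value across all rows in a single pass.

```python
from typing import List, Optional, Sequence


def per_frame_last_not_null(
    values: Sequence[Optional[str]], frame_rows: int
) -> List[Optional[str]]:
    """Hypothetical per-frame model: each row rescans its own frame.

    The frame is the current row plus the rows before it, up to
    `frame_rows` rows in total, so the work grows with rows * frame size.
    """
    out: List[Optional[str]] = []
    for i in range(len(values)):
        found: Optional[str] = None
        # Walk backward from the current row to the start of its frame.
        for j in range(i, max(i - frame_rows, -1), -1):
            if values[j] is not None:
                found = values[j]
                break
        out.append(found)
    return out


def shared_lag_not_null(values: Sequence[Optional[str]]) -> List[Optional[str]]:
    """Hypothetical shared model: one pass, one running value for all rows."""
    out: List[Optional[str]] = []
    latest: Optional[str] = None
    for v in values:
        if v is not None:
            latest = v  # remember the most recent non-null value
        out.append(latest)
    return out


if __name__ == "__main__":
    container_ids = [None, "c1", None, None, "c2", None]
    print(per_frame_last_not_null(container_ids, frame_rows=3))
    print(shared_lag_not_null(container_ids))
```

In this simplified model the two functions agree whenever every gap between non-null values fits inside the frame, and they diverge when a gap is longer than the frame. That kind of edge case is one reason to re-verify each use case after switching functions.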


Good

make_col container_id:window(lag_not_null(container_id), frame(back:10m), group_by(key))

Less Good

make_col container_id:window(last_not_null(container_id), frame(back:10m), group_by(key))