Observe Performance Cookbook: Limit Resource Time Windows
Problem
Queries against a dataset that uses resources are timing out or consuming a lot of query credits.
Solution
Edit the resource dataset, find the make_resource line, and shorten its expiry option. Verify that the resource use case still works, then save the resource dataset definition.
Explanation
Verbs like make_resource and set_valid_from can introduce time dilation to the input query window. Observe looks back to the beginning of the expiry time range to ensure that it is correctly defining a single resource, which can affect queries that use this resource. Even if the query time window is only 15 minutes, the query may still need to scan input data that falls into a window of 15 minutes plus the expiry value. By default, the make_resource verb uses 24 hours of input events to compute the state of each resource. Therefore, a 15-minute query with a resource that has the default expiry time range of 24 hours may actually read 24 hours and 15 minutes of input data. Observe optimizes this behavior in the transformation used for published datasets, so a large time dilation at query time should not affect transform performance.
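The window arithmetic above can be sketched in a few lines of Python. This only illustrates the reasoning and is not Observe's implementation: the function name effective_scan_window is hypothetical, and it assumes the data scanned is simply the query window plus the full expiry.

```python
from datetime import timedelta


def effective_scan_window(query_window: timedelta, expiry: timedelta) -> timedelta:
    """Approximate span of input data a query may need to read.

    Assumption (not Observe's actual planner logic): the lookback is the
    query window extended by the full expiry value.
    """
    return query_window + expiry


# A 15-minute query against a resource with the default 24-hour expiry.
window = effective_scan_window(timedelta(minutes=15), timedelta(hours=24))
print(window)  # 1 day, 0:15:00
```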
A 24-hour expiry is correct for long-lived resources that expect relatively infrequent changes, such as an EC2 instance or a shipping bill of lading. It can be excessive for ephemeral resources, such as a container instance in GKE. Set the expiry value just large enough to capture the defining change events for your resource event set. For instance, if your resource is an IP address in a DHCP pool with a 1-hour lease duration, make_resource options(expiry:duration_min(75)), col1:col1, primary_key(pk1, pk2) would be more appropriate than make_resource options(expiry:duration_hr(24)), col1:col1, primary_key(pk1, pk2). An even better approach is to use Intervals for ephemeral things.
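To get a rough sense of what the DHCP example saves, the sketch below compares the input data a 15-minute query might read under the 24-hour expiry and under the 75-minute expiry. The helper scan_minutes is hypothetical, and it assumes the scan is the query window plus the expiry; actual savings depend on data volume and on how Observe plans the query.

```python
from datetime import timedelta


def scan_minutes(query_window: timedelta, expiry: timedelta) -> float:
    """Minutes of input data read, assuming scan = query window + expiry."""
    return (query_window + expiry).total_seconds() / 60


query = timedelta(minutes=15)

# Default 24-hour expiry versus the 75-minute expiry from the DHCP example.
default_scan = scan_minutes(query, timedelta(hours=24))
tuned_scan = scan_minutes(query, timedelta(minutes=75))

print(f"default expiry: {default_scan:.0f} min, tuned expiry: {tuned_scan:.0f} min")
print(f"reduction: {default_scan / tuned_scan:.1f}x")
```

Under these assumptions the shorter expiry reads 90 minutes of input instead of 1455, which is why trimming expiry matters most for short query windows.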