Observe Performance Cookbook: Limit Resource Time Windows

Problem

Queries against a dataset that uses resources are timing out or consuming a lot of query credits.

Solution

Edit the resource dataset, find the make_resource line, and shorten the expiry option. Verify that the resource use case still works, then save the resource dataset definition.

Explanation

Verbs like make_resource and set_valid_from can dilate the time window of the input query. To be sure that it is correctly defining each resource, Observe looks back to the beginning of the expiry time range, which affects every query that uses the resource. Even if the query time window is only 15 minutes, the query may still need to scan input data that falls into a window of 15 minutes plus the expiry value. By default, the make_resource verb uses 24 hours of input events to compute the state of each resource. Therefore, a 15 minute query against a resource with the default 24 hour expiry may actually read 24 hours and 15 minutes of input data. Observe optimizes this behavior in the transformation used for published datasets, so a large time dilation at query time should not affect transform performance.
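The arithmetic above can be sketched in a few lines of Python. This is only a simplified model of the description in this section, not Observe's actual query planner, and the function name effective_scan_window is hypothetical.

```python
from datetime import timedelta

def effective_scan_window(query_window: timedelta, expiry: timedelta) -> timedelta:
    """Approximate span of input data a query may read:
    the query window plus the expiry lookback (simplified model)."""
    return query_window + expiry

# A 15 minute query against a resource with the default 24 hour expiry.
default_expiry = timedelta(hours=24)
scan = effective_scan_window(timedelta(minutes=15), default_expiry)
print(scan)  # 1 day, 0:15:00
```

Under this model the expiry dominates the cost whenever it is much larger than the query window, which is why shortening it is the first thing to try.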

A 24 hour expiry is correct for long-lived resources that expect relatively infrequent changes, such as an EC2 instance or a shipping bill-of-lading. It can be excessive for ephemeral resources, such as a container instance in GKE. Set the expiry value just large enough to capture the defining change events for your resource event set. For instance, if your resource is an IP address in a DHCP pool with a 1 hour lease duration, make_resource options(expiry:duration_min(75)), col1:col1, primary_key(pk1, pk2) would be more appropriate than make_resource options(expiry:duration_hr(24)), col1:col1, primary_key(pk1, pk2). For ephemeral things, an even better approach is to use Intervals.
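To put rough numbers on the DHCP example, the sketch below compares how much input a 15 minute query might read under each expiry setting. It assumes the simplified "query window plus expiry" model described in the Explanation; the 15 minute query window and the variable names are illustrative choices, and real query cost depends on more than the scanned time range.

```python
from datetime import timedelta

query_window = timedelta(minutes=15)  # illustrative query window

# The two expiry settings from the DHCP example.
expiries = {
    "75 minute expiry": timedelta(minutes=75),
    "24 hour expiry": timedelta(hours=24),
}

# Minutes of input data read under the simplified model.
scanned_minutes = {
    label: (query_window + expiry).total_seconds() / 60
    for label, expiry in expiries.items()
}

for label, minutes in scanned_minutes.items():
    print(f"{label}: about {minutes:.0f} minutes of input")
```

Under these assumptions the shorter expiry reads about 90 minutes of input instead of about 1455 minutes.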