Get dataset attribute statistics

GET https://{customerid}.observeinc.com/v1/datasets/stats

Returns aggregated statistics — distinct value count and the top-K
most frequent value/count pairs — for one or more dataset
attributes, optionally restricted to a CEL-filtered subset of
datasets.

Each attributes entry is a CEL expression that produces a
comparable value (string, int, bool, timestamp); compound
expressions such as coalesce(_package, label) are allowed. The
CEL environment matches that of GET /v1/datasets; see that
endpoint's description for the field mapping, type discrepancies,
and available functions.

Each multiValueAttributes entry is a CEL expression that produces
a list<comparable>; the list is flattened across all matching
datasets and counted as total occurrences, so a single
dataset whose expression evaluates to a 3-element list
contributes 3 increments. Counts for multiValueAttributes may
therefore exceed meta.filteredDatasets. For attributes, each
count is the number of datasets with that value.
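
The difference between the two counting modes can be sketched locally. This is only an illustration of the semantics described above, not the service's implementation; the dataset records and their values are hypothetical, with _package standing in for an attributes expression and primaryKey for a multiValueAttributes expression.

```python
from collections import Counter

# Hypothetical evaluated results, one entry per matching dataset.
# "_package" plays the role of an `attributes` expression (one value per dataset);
# "primaryKey" plays the role of a `multiValueAttributes` expression (a list per dataset).
datasets = [
    {"_package": "kubernetes", "primaryKey": ["cluster", "namespace", "pod"]},
    {"_package": "kubernetes", "primaryKey": ["cluster"]},
    {"_package": "aws", "primaryKey": []},
]

# attributes: each dataset contributes exactly one increment.
single_counts = Counter(d["_package"] for d in datasets)

# multiValueAttributes: lists are flattened, one increment per element.
multi_counts = Counter(key for d in datasets for key in d["primaryKey"])

filtered_datasets = len(datasets)
print(single_counts.most_common())
print(multi_counts.most_common())
```

Here the attributes counts sum to the number of matching datasets, while the flattened multiValueAttributes counts sum to 4, which exceeds the 3 matching datasets.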

At least one of attributes or multiValueAttributes must be
non-empty; passing neither returns 400.

Stat values are returned as JSON strings regardless of the
expression's CEL type:

string — verbatim.
int — decimal digits (e.g. "41000234").
bool — "true" or "false".
timestamp — RFC 3339 (e.g. "2024-01-01T00:00:00Z").
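
Because every stat value arrives as a JSON string, a client that needs typed values has to convert them itself. The helper below is a minimal sketch: decode_stat_value is a hypothetical name, and it assumes the caller already knows the CEL type of the expression it requested, since the text above does not say whether the response carries the type.

```python
from datetime import datetime


def decode_stat_value(raw: str, cel_type: str):
    """Convert a stat value string back to a Python value (hypothetical helper)."""
    if cel_type == "string":
        return raw  # returned verbatim
    if cel_type == "int":
        return int(raw)  # decimal digits, e.g. "41000234"
    if cel_type == "bool":
        if raw not in ("true", "false"):
            raise ValueError(f"not a bool stat value: {raw!r}")
        return raw == "true"
    if cel_type == "timestamp":
        # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11,
        # so normalise it to an explicit UTC offset first.
        if raw.endswith("Z"):
            raw = raw[:-1] + "+00:00"
        return datetime.fromisoformat(raw)
    raise ValueError(f"unsupported CEL type: {cel_type!r}")
```

On Python versions before 3.11, fromisoformat is stricter about fractional seconds than RFC 3339, so a dedicated RFC 3339 parser may be a safer choice in production code.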

The response meta block reports totalDatasets (before the
filter) and filteredDatasets (after) so callers can compute
coverage.
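
Coverage here is the fraction of datasets that passed the filter. A minimal sketch, assuming meta has been parsed from the response body into a dict and that both fields are numbers or numeric strings (the exact wire type is not stated above); the coverage function name is hypothetical.

```python
def coverage(meta: dict) -> float:
    """Fraction of all datasets that matched the filter (hypothetical helper)."""
    # int() tolerates both JSON numbers and numeric strings; the wire type is an assumption.
    total = int(meta["totalDatasets"])
    filtered = int(meta["filteredDatasets"])
    if total == 0:
        return 0.0  # nothing to cover; avoid division by zero
    return filtered / total


print(coverage({"totalDatasets": 200, "filteredDatasets": 50}))
```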

Query Params

filter

string

CEL expression restricting which datasets contribute to the
statistics. Must evaluate to bool. Must be URL-encoded.
If omitted, statistics are computed across all readable
datasets. See GET /v1/datasets for the CEL primer and
filter examples.

attributes

string

CSV list of CEL expressions, each producing a comparable
value (string, int, bool, timestamp). Each expression
identifies one stat to compute over the filtered dataset
set. Maximum 10 expressions per request. See the operation
description for the wire format of stat values.

Either attributes or multiValueAttributes must be non-empty;
both may be supplied together.

CSV-escape first, then URL-encode. Wrap an item in " if it
contains a comma or "; double a literal " to "" inside
a quoted item.
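
The two-step encoding can be sketched as follows. The helper names csv_escape and encode_csv_param are hypothetical; the quoting rule is the one stated above, and the percent-encoding uses urllib.parse.quote from the Python standard library.

```python
from urllib.parse import quote


def csv_escape(item: str) -> str:
    """Quote an item only if it contains a comma or a double quote."""
    if "," in item or '"' in item:
        # Double any literal quote, then wrap the whole item in quotes.
        return '"' + item.replace('"', '""') + '"'
    return item


def encode_csv_param(expressions) -> str:
    """CSV-escape each expression, join with commas, then URL-encode the result."""
    csv_value = ",".join(csv_escape(e) for e in expressions)
    # safe="" percent-encodes every reserved character, including "," and '"'.
    return quote(csv_value, safe="")


print(encode_csv_param(["label", "coalesce(_package, label)"]))
```

The second expression contains a comma, so it is quoted before the items are joined; without that step the server would see three CSV items instead of two.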

multiValueAttributes

string

CSV list of CEL expressions, each producing a list<comparable>
(e.g. interfaces.map(i, i.path), correlationTags.map(t, t.tag),
primaryKey, fieldList.map(f, f.name)). For each matching
dataset, the expression's list is flattened into a per-expression
bag and counts are total occurrences — a single dataset
contributes one increment per element. The flattened bag for
each expression is capped at 10,000 values across all matching
datasets; exceeding the cap returns 400.

Maximum 10 expressions. Either attributes or multiValueAttributes
must be non-empty; both may be supplied together. topK
applies independently to each entry of either array.

CSV-escape first, then URL-encode. Wrap an item in " if it
contains a comma or "; double a literal " to "" inside
a quoted item.

topK

int64

required

1 to 100

Number of top values to return per attribute, sorted by count
descending. Range 1–100. The total distinct count for each
attribute is always reported via distinctCount, even when it
exceeds topK.
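
The limits stated in the parameter descriptions can be checked on the client before a request is sent, which avoids a round trip that would end in a 400. This is a sketch only: validate_stats_params is a hypothetical helper, it takes the expressions as Python lists before CSV encoding, and it does not cover server-side checks such as CEL type checking or the 10,000-value cap.

```python
def validate_stats_params(attributes, multi_value_attributes, top_k):
    """Return a list of problems found; an empty list means the limits are met."""
    errors = []
    if not attributes and not multi_value_attributes:
        errors.append("attributes or multiValueAttributes must be non-empty")
    if len(attributes) > 10:
        errors.append("attributes: maximum 10 expressions")
    if len(multi_value_attributes) > 10:
        errors.append("multiValueAttributes: maximum 10 expressions")
    if not 1 <= top_k <= 100:
        errors.append("topK must be between 1 and 100")
    return errors


print(validate_stats_params(["label"], ["primaryKey"], 25))
```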

Responses

200 Dataset statistics retrieved successfully

400 Bad request

401 Unauthorized

403 Forbidden

429 Rate limit reached

5XX Internal server error