statsby [groupby: col storable]*, [groupOrAggregateFunction: expression]+

Computes aggregate expressions over the input, optionally partitioned by group_by(...) columns, and emits one row per distinct group key.

Grouping and aggregates

You must supply at least one grouping column or at least one name: aggregateExpr binding.

Every column referenced in an aggregate binding must either be a grouping column or appear inside an aggregate function; otherwise compilation fails.
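
The grouping behavior described above can be modeled outside the query language. The following Python sketch is illustrative only and is not how statsby is implemented: the helper name statsby_sketch, the sample rows, and the (function, column) shape of each aggregate are all assumptions made for this example. It shows that the output has one row per distinct group key, and that an empty grouping set collapses the input into a single row.

```python
from collections import defaultdict

def statsby_sketch(rows, group_cols, aggregates):
    """Hypothetical model of grouped aggregation: one output row per distinct key.

    rows       -- list of dicts, one per input row
    group_cols -- column names forming the group key (may be empty)
    aggregates -- maps output name to a (function, input column) pair
    """
    groups = defaultdict(list)
    for row in rows:
        # The group key is the tuple of grouping-column values for this row.
        key = tuple(row[c] for c in group_cols)
        groups[key].append(row)

    result = []
    for key, members in groups.items():
        # Grouping columns are carried into the output row unchanged.
        out = dict(zip(group_cols, key))
        # Each aggregate sees only the values belonging to this group.
        for name, (func, col) in aggregates.items():
            out[name] = func([m[col] for m in members])
        result.append(out)
    return result

# Made-up sample data, not taken from any real dataset.
rows = [
    {"station_id": "A", "temp_c": 10.0},
    {"station_id": "A", "temp_c": 14.0},
    {"station_id": "B", "temp_c": 7.0},
]

per_station = statsby_sketch(
    rows,
    group_cols=["station_id"],
    aggregates={"RowCount": (len, "temp_c"), "MaxTemp": (max, "temp_c")},
)

# With no grouping columns, every row shares the empty key.
overall = statsby_sketch(rows, group_cols=[], aggregates={"RowCount": (len, "temp_c")})
```

Here per_station holds two rows, one for each station_id, while overall holds a single row that counts all three inputs.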

Temporal shape of the result

The shape of the result depends on whether the grouping set includes the input’s time columns. If every time column present on the input is grouped, in one of the combinations the compiler treats as “carried forward,” the result keeps its Valid From / Valid To metadata and its dataset kind, and aggregation uses no global time window. Otherwise, the verb aggregates globally over the query window and the result is a table without Valid From / Valid To metadata.

For per-timestamp aggregates while keeping the timeline, use timestats. For fixed-width time buckets, use timechart.

Categories

Accelerable

statsby is never accelerable. A dataset that only uses accelerable verbs can be accelerated, making queries on the dataset respond faster.

Examples

statsby RowCount: count(1), group_by(station_id)

Aggregates measurements into one row per station_id with a count of contributing input rows.

statsby AvgTemp: avg(temp_c), MaxTemp: max(temp_c), group_by(station_id, month)

Computes per-station monthly temperature summaries from raw measurements using two aggregates and two grouping columns.

statsby SumTemp: sum(temp_c), group_by(month)

Rolls up all measurements into one row per calendar month by summing temp_c, illustrating grouping without station granularity.