aggregate

Aliases: reaggregate (deprecated).

aggregate [groupby: col storable]*, [groupOrAggregateFunction: expression]+

Apply aggregate functions over metric columns, collapsing rows down to the grouping keys you choose.

Use it on time-aligned metric data, typically after align or timechart, when you want to change tag or dimension groupings without merging different time steps. On inputs that already carry an aligned bucket column, rollups stay inside each bucket, so tumbling windows do not collapse together when you change dimensions. The output dataset keeps the same kind as the input (for example, a resource-shaped metric stream stays a resource).
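As a sketch of the aligned-bucket behavior (the metric name requests_total, the 5m resolution, and the req and clusterUid names are illustrative assumptions, not taken from this page):

// Align to 5-minute buckets, then re-group by cluster only; each bucket rolls up separately.
align 5m, req:sum(m("requests_total")) | aggregate req_total:sum(req), group_by(clusterUid)

Because the input carries an aligned bucket column, the second stage sums within each 5-minute window rather than across the whole query time range.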

You must supply at least one grouping column (via group_by) or at least one name:expression aggregate binding. Each expression may call aggregate functions, but aggregate functions must not nest inside one another. Any column or link reference that appears outside an aggregate function must refer to a grouping column, and a link reference cannot stand alone as the aggregated value.
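A minimal sketch of these rules (duration and service are hypothetical column names):

// Valid: duration appears only inside aggregate functions; service is a grouping column.
aggregate spread:max(duration) - min(duration), group_by(service)

Combining the results of two aggregate calls in one expression is allowed, as above; nesting one aggregate call inside another, such as sum(avg(duration)), is not.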

You cannot group on the dataset valid_from or valid_to columns, and you cannot bind aggregate outputs to names that collide with those interval endpoints.

When an aggregate expression references an input column that is marked as a metric, the output column is also marked as a metric; metadata such as unit and default aggregate method are inferred from the first metric path and the first aggregate function the compiler sees in that expression.
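For instance, assuming a metric input column cpu_avg that carries a unit, grouped by a hypothetical host column (neither name is taken from this page):

// Output column cpu is also marked as a metric; its unit and default aggregate
// method are inferred from cpu_avg and from avg, the first aggregate function seen.
aggregate cpu:avg(cpu_avg), group_by(host)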

For statistics on raw events without this metric-oriented path, statsby is usually the better fit; for per-timestamp rollups keyed by the dataset primary key, consider timestats.
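A hedged contrast, assuming an event dataset with a status_code column (an illustrative setup, not from this page):

// Raw-event statistics: statsby collapses events without the metric-oriented handling above.
statsby error_count:count(status_code), group_by(status_code)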

Chaining aggregate onto a dataset that is already treated as aggregated can reduce how well later pipeline stages accelerate, because the acceleration model then classifies the result as a heavier shape than a simple insert-time aggregate.

Categories

Accelerable

aggregate is always accelerable if the input is accelerable. A dataset that only uses accelerable verbs can be accelerated, making queries on the dataset respond faster.

Examples

// Input has metric column tx_bytes and primary keys podName, namespace, clusterUid.
aggregate total_tx:sum(tx_bytes), group_by(podName, namespace, clusterUid)

Sums tx_bytes within each combination of podName, namespace, and clusterUid, producing one row per group with the new total_tx column alongside the grouping keys.

// Input has metric columns cpu_avg and disk fields used in the ratio.
aggregate cpu_sum:sum(cpu_avg), disk_ratio:avg(disk_used) / avg(disk_total), group_by(tag, status_code)

Computes two aggregate columns in one step: a straight sum, and a ratio built from separate avg aggregates. Grouping by tag and status_code gives each pair its own output row.

// Derive a synthetic dimension from a tag, then roll the metric up onto it.
colmake node: tag.node | aggregate cpu_sum:sum(cpu_avg), group_by(node)

Shows reshaping aligned metrics onto a new grouping key built with colmake, so each derived node label gets its own summed cpu_avg row while preserving the metric pipeline.