bottomk k: int64, [score: expression]?, [groupby: col storable]?

Retains rows that belong to the lowest-ranked groups in the current query window and adds an int64 _c_rank column giving each kept row its group’s rank.

k must be a compile-time non-negative int64. Grouping comes from the active group_by columns when supplied; otherwise the input’s primary key columns define the groups. With k == 0, no rows are emitted (the rank column is still part of the schema). With an empty grouping set and k >= 1, every row sits in one group, so the group is passed through unchanged and _c_rank is 1.

You may pass a score expression after k. It must evaluate to a scalar storable type; aggregate functions inside the score are evaluated per group, and nested aggregates are rejected. Columns referenced in the score must be either grouping columns or inside aggregates.

If you omit score, groups are ordered by the first non–grouping-key column whose type is strictly numeric or duration and that is not a reserved name. If no such column exists, ordering uses all grouping keys; if any grouping key is not sortable that way, compilation fails unless you supply an explicit score.

When several groups tie on the chosen score, ties are broken by the grouping key columns. By default, rows order by _c_rank descending (whereas topk orders _c_rank ascending), then by the dataset time column when one exists.

For the highest-ranked groups instead of the lowest, use topk.

Categories

Accelerable

bottomk is never accelerable. A dataset that only uses accelerable verbs can be accelerated, making queries on the dataset respond faster.

Examples

bottomk 5

Keeps the five worst-ranked groups using default scoring over a non-key numeric column with primary-key grouping.

bottomk 3, group_by(station_id)

Ranks groups explicitly by station_id instead of the dataset primary key when choosing the bottom three groups by default numeric scoring.

bottomk 10, min(reading)

Selects the ten groups whose aggregate min(reading) in the window is smallest, using an explicit per-group score expression.

bottomk 1, group_by()

With an empty grouping set every row shares one group, so bottomk passes the input through and assigns rank 1 to each row.

bottomk 5, avg(reading), group_by(station_id)

Combines an explicit group_by with avg(reading) so only the five stations with the lowest mean readings remain.