topk
Type of operation: Filter
Description
Selects all data for each of the top k ranked groups. If no rank method is provided, a default one is used. If no grouping is specified, the set of primary key columns is used as the grouping.
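The selection logic described above can be sketched in Python. This is only an illustration of the semantics, not how the verb is implemented: the function names, the sample rows, the use of `zlib.crc32` as the default hash, and the default grouping columns are all assumptions made for the example, and treating a higher rank value as "more top" is inferred from the restart-count example on this page.

```python
import zlib
from collections import defaultdict


def default_rank(key, rows):
    # Assumption: stand-in for the unspecified hash of the group identifiers.
    return zlib.crc32(repr(key).encode("utf-8"))


def topk(rows, k, rank=default_rank, group_by=("clusterUid", "namespace")):
    # Assumption: the default group_by plays the role of the primary key columns.
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in group_by)
        groups[key].append(row)
    # Rank each group once, then keep the keys of the k highest-ranked groups.
    ranked = sorted(groups, key=lambda g: rank(g, groups[g]), reverse=True)
    keep = set(ranked[:k])
    # Return every row that belongs to a kept group, in the original order.
    return [row for row in rows if tuple(row[col] for col in group_by) in keep]


def max_restarts(key, rows):
    # Hypothetical custom rank: the largest restartCount seen in the group.
    return max(r["restartCount"] for r in rows)


rows = [
    {"clusterUid": "c1", "namespace": "web", "restartCount": 7},
    {"clusterUid": "c1", "namespace": "web", "restartCount": 0},
    {"clusterUid": "c1", "namespace": "db", "restartCount": 2},
    {"clusterUid": "c2", "namespace": "web", "restartCount": 1},
]

selected = topk(rows, 1, rank=max_restarts)
```

Note that `selected` contains both rows of the `c1`/`web` group, including the one with zero restarts: the rank picks groups, and all data for each picked group is returned.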
Usage
topk k [ , rank ] [ , groupby ]
Argument | Type | Required | Multiple |
---|---|---|---|
k | int64 | Required | Only one |
rank | expression | Optional | Only one |
groupby | fieldref | Optional | Only one |
Accelerable
topk is never accelerable. A dataset that uses only accelerable verbs can be accelerated, making queries on the dataset respond faster.
Examples
topk 100
Select the top 100 groups using the default rank method: the hash of the group identifiers (the set of primary key columns).
topk 100, group_by(clusterUid, namespace)
Similar to the first example, but with the grouping specified explicitly.
topk 100, max(restartCount)
Similar to the first example, but using a custom rank method to find the groups with the most restarts.
topk 1, group_by()
This topk operates on an empty grouping, where all rows belong to the same group, so all rows are selected.
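Why an empty grouping selects everything can be seen in a small Python sketch of the grouping step alone. The `group_rows` helper and the sample rows are hypothetical and exist only for this illustration; they are not part of the verb.

```python
from collections import defaultdict


def group_rows(rows, group_by):
    # Hypothetical helper: bucket rows by the tuple of their grouping columns.
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in group_by)
        groups[key].append(row)
    return dict(groups)


rows = [
    {"clusterUid": "c1", "namespace": "web"},
    {"clusterUid": "c1", "namespace": "db"},
    {"clusterUid": "c2", "namespace": "web"},
]

# With no grouping columns, every row maps to the same empty key.
groups = group_rows(rows, ())
```

With a single group present, keeping the top 1 group keeps that group, and therefore every row.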