dedup [columnname: expression]*

Collapses duplicate rows while preserving the input column layout and dataset kind.

No arguments

With no arguments, any two rows that match in every column are merged into one row.

Grouping columns

With arguments, each must be a plain column reference on the default dataset (not an expression, not a path into a structured column). Rows that agree on all listed columns are merged into one row.
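The grouping rule for both forms can be modeled outside the query language. The following is a minimal Python sketch, not the verb's implementation: the helper names `dedup_key` and `count_groups` and the columns `host`, `region`, and `status` are hypothetical, and rows are represented as plain dictionaries.

```python
def dedup_key(row, columns=None):
    """Return the grouping key for a row.

    With no columns listed, every column takes part in the key,
    which models argumentless dedup.
    """
    names = list(columns) if columns else sorted(row)
    return tuple((name, row[name]) for name in names)


def count_groups(rows, columns=None):
    """Number of rows that would remain after collapsing on the key."""
    return len({dedup_key(row, columns) for row in rows})


# Hypothetical sample data, for illustration only.
rows = [
    {"host": "a", "region": "us", "status": "ok"},
    {"host": "a", "region": "us", "status": "ok"},
    {"host": "a", "region": "eu", "status": "ok"},
    {"host": "b", "region": "us", "status": "down"},
]

print(count_groups(rows))                      # 3: only the exact duplicate collapses
print(count_groups(rows, ["host"]))            # 2: one group per host
print(count_groups(rows, ["host", "region"]))  # 3: the key is the (host, region) tuple
```

With several grouping columns the key is the tuple of their values, so two rows collapse only when they agree on all of them.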

Event and interval time columns

On event or interval inputs, the active valid_from and valid_to columns are always part of the grouping key, even if you omit them from the argument list, so rows at different times stay distinct.
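As a rough illustration of that rule, the effective grouping key can be thought of as the listed columns plus any active time column that was left out. This Python sketch is an assumption-laden model, not the compiler's logic: the function name `effective_key` is hypothetical, and the caller supplies the list of active time columns for the input.

```python
def effective_key(listed_columns, active_time_columns):
    """Listed grouping columns, plus any active time column the caller omitted."""
    key = list(listed_columns)
    for column in active_time_columns:
        if column not in key:
            key.append(column)
    return key


# Both time columns active, neither listed: both are appended to the key.
print(effective_key(["host"], ["valid_from", "valid_to"]))
# ['host', 'valid_from', 'valid_to']

# A time column that is already listed is not added twice.
print(effective_key(["host", "valid_from"], ["valid_from", "valid_to"]))
# ['host', 'valid_from', 'valid_to']

# No active time columns: the key is exactly what was listed.
print(effective_key(["host"], []))
# ['host']
```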

Values in other columns

The active time columns are never merged from competing values; they act only as grouping keys when required. Every other column outside the grouping key is reduced to a single value per group by a merge that prefers non-null values. When several non-null values disagree, which one survives is not guaranteed.
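One way to picture the merge is a per-column pass that fills nulls from other rows in the group. The Python sketch below is a model under stated assumptions, not the verb's implementation: `merge_group` is a hypothetical helper, `None` stands in for null, and it keeps the first non-null value in input order, which is only one of the outcomes the verb is allowed to produce.

```python
def merge_group(rows, key_columns):
    """Collapse one group of rows into a single row.

    Non-key columns take a non-null value when one exists. This sketch
    keeps the first non-null value seen; the real verb makes no such
    ordering promise when non-null values disagree.
    """
    merged = dict(rows[0])
    for row in rows[1:]:
        for name, value in row.items():
            if name in key_columns:
                continue  # key columns already agree within a group
            if merged.get(name) is None and value is not None:
                merged[name] = value
    return merged


# Hypothetical group that agrees on the key column "host".
group = [
    {"host": "a", "owner": None, "zone": "z1"},
    {"host": "a", "owner": "ops", "zone": "z2"},
]

merged = merge_group(group, ["host"])
print(merged["owner"])  # ops: the null is replaced by the non-null value
print(merged["zone"])   # z1 in this sketch; the verb could equally keep z2
```

Because the surviving value is unspecified when non-null values conflict, queries should not depend on which one is kept.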

Resources

On resource inputs, only argumentless dedup is allowed; providing grouping columns is rejected at compile time.

distinct is an alias for dedup and behaves identically.

Accelerable

dedup is accelerable whenever its input is accelerable. A dataset that uses only accelerable verbs can be accelerated, which makes queries on the dataset respond faster.

Examples

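Demonstrates keyed dedup with a single grouping column: rows that agree on that column collapse to one row, and values in the other columns are merged with any-style semantics (a non-null value is preferred, and the choice among conflicting non-null values is not guaranteed).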

Demonstrates argumentless dedup, which merges only rows that agree on every column; any column difference keeps both rows.

Demonstrates keyed dedup with several grouping columns so the collapse key is the tuple of those columns, not each column in isolation.