dedup¶
Type of operation: Aggregate
Description¶
dedup collapses all rows in a dataset that share the same timestamp and have identical values in the specified columns into a single row.
dedup works on any Table, Event, or Interval dataset. For Interval datasets, rows are collapsed only if they have the same “Valid From” and the same “Valid To” time. For the remaining columns, an arbitrary value from the collapsed rows is picked, with non-null values preferred.
When no column names are given, dedup collapses rows with identical values in all the columns to a single row.
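The collapsing rule described above can be modelled outside of OPAL. The following Python sketch is only an illustration of the documented semantics, not how dedup is implemented. The function name `dedup`, the `time_columns` parameter, and the column names `timestamp`, `message`, and `host` are all hypothetical. It also assumes that "an arbitrary value, preferring non-null" can be modelled as "keep the first non-null value seen".

```python
from typing import Any, Optional

Row = dict[str, Any]


def dedup(
    rows: list[Row],
    columns: Optional[list[str]] = None,
    time_columns: tuple[str, ...] = ("timestamp",),  # hypothetical column name
) -> list[Row]:
    """Illustrative model of the documented dedup semantics (not the real implementation)."""
    merged: dict[tuple, Row] = {}
    for row in rows:
        # With no columns given, every non-time column takes part in the comparison.
        compare = columns if columns else sorted(c for c in row if c not in time_columns)
        key = tuple((c, row.get(c)) for c in (*time_columns, *compare))
        if key not in merged:
            merged[key] = dict(row)
            continue
        kept = merged[key]
        # Assumption: prefer non-null by filling gaps with the first non-null value seen.
        for col, value in row.items():
            if kept.get(col) is None and value is not None:
                kept[col] = value
    return list(merged.values())


rows = [
    {"timestamp": 1, "message": "disk full", "host": None},
    {"timestamp": 1, "message": "disk full", "host": "web-1"},
    {"timestamp": 2, "message": "disk full", "host": "web-2"},
]

by_message = dedup(rows, ["message"])  # first two rows collapse; host becomes "web-1"
all_columns = dedup(rows)              # nothing collapses, because host differs
```

For an Interval dataset, the same model would be called with something like `time_columns=("valid_from", "valid_to")` (again hypothetical names), so that rows collapse only when both bounds match.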
Usage¶
dedup [ columnname_1, columnname_2, ... ]
Argument | Type | Optional | Repeatable | Restrictions |
---|---|---|---|---|
columnname | expression | yes | yes | none |
Accelerable¶
dedup is always accelerable if the input is accelerable. A dataset that only uses accelerable verbs can be accelerated, making queries on the dataset respond faster.
Examples¶
dedup vf, message
Collapse rows that have identical timestamps and identical values in the vf and message columns into a single row.
dedup
Remove duplicate rows from the input dataset.
Aliases¶
distinct