dedup

Type of operation: Aggregate

Description

dedup collapses all rows in a dataset with the same timestamp and identical values in specified columns to one row.

dedup works on any Table, Event, or Interval dataset. For Interval datasets, rows are collapsed only if they have the same “Valid From” and the same “Valid To” times. For columns not named in the argument list, dedup picks an arbitrary value from the collapsed rows, preferring non-null values.

When no column names are given, dedup collapses rows with identical values in all the columns to a single row.
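The collapsing rule described above can be modeled in a few lines of Python. This is only a sketch of the documented behavior, not how dedup is implemented: the function name, the `time_columns` parameter, and the column names `timestamp` and `host` are assumptions made for illustration, and "keep the first non-null value seen" is just one valid way to pick an arbitrary value.

```python
from typing import Any, Dict, Iterable, List, Optional, Sequence

Row = Dict[str, Any]


def dedup(
    rows: Iterable[Row],
    key_columns: Optional[Sequence[str]] = None,
    time_columns: Sequence[str] = ("timestamp",),  # hypothetical column name
) -> List[Row]:
    """Collapse rows that share the time columns and the key columns."""
    merged: Dict[tuple, Row] = {}
    for row in rows:
        # With no key columns given, every column takes part in the comparison.
        cols = list(key_columns) if key_columns else sorted(row)
        key = tuple(row.get(c) for c in list(time_columns) + cols)
        # The first row seen for a key becomes the surviving row.
        kept = merged.setdefault(key, dict(row))
        # For the remaining columns, prefer a non-null value when one exists.
        for col, value in row.items():
            if kept.get(col) is None and value is not None:
                kept[col] = value
    return list(merged.values())


rows = [
    {"timestamp": 1, "vf": "a", "message": "ok", "host": None},
    {"timestamp": 1, "vf": "a", "message": "ok", "host": "h1"},
    {"timestamp": 2, "vf": "a", "message": "ok", "host": "h2"},
]

# Same timestamp, vf, and message: the first two rows collapse into one.
result = dedup(rows, ["vf", "message"])

# No columns given: only rows identical in every column collapse.
all_columns = dedup(rows + [dict(rows[2])])
```

Here `result` holds two rows, and the collapsed row keeps `host` set to `"h1"` rather than the null value. To model an Interval dataset under the same assumptions, pass both boundary columns as `time_columns`, for example `("valid_from", "valid_to")`.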

Usage

dedup [ columnname_1, columnname_2, ... ]

| Argument | Type | Optional | Repeatable | Restrictions |
|---|---|---|---|---|
| columnname | expression | yes | yes | none |

Accelerable

dedup is always accelerable if the input is accelerable. A dataset that only uses accelerable verbs can be accelerated, making queries on the dataset respond faster.

Examples

dedup vf, message

Collapse rows that have identical timestamps and identical values in the vf and message columns into a single row.

dedup

Remove duplicate rows from the input dataset.

Aliases

  • distinct