dedup

Type of operation: Aggregate

Description

dedup collapses all rows in an event dataset that have identical values in the specified columns and identical timestamps into a single row. For each remaining column, it picks an arbitrary value from the collapsed rows, preferring non-null values.

When no column names are given, dedup collapses rows with identical values in all the columns to a single row.
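
The behavior described above can be modeled outside the query language. The following is a minimal Python sketch, not the actual implementation: the function name, the timestamp column name ("timestamp"), and the use of None for null are assumptions made for illustration. Where the description says an arbitrary value is picked, this sketch keeps the first non-null value it sees, which is only one of the permitted outcomes.

```python
from typing import Any, Dict, Iterable, List, Optional, Sequence

Row = Dict[str, Any]


def dedup(
    rows: Iterable[Row],
    columns: Optional[Sequence[str]] = None,
    timestamp_column: str = "timestamp",  # assumed column name
) -> List[Row]:
    """Illustrative model of dedup; cell values must be hashable."""
    groups: Dict[tuple, Row] = {}
    for row in rows:
        if columns:
            # Named columns plus the timestamp form the grouping key.
            key_columns = list(columns) + [timestamp_column]
        else:
            # No columns given: every column takes part in the key.
            key_columns = sorted(row)
        key = tuple((name, row.get(name)) for name in key_columns)
        merged = groups.setdefault(key, dict(row))
        # For the remaining columns, prefer a non-null value when one exists.
        for name, value in row.items():
            if merged.get(name) is None and value is not None:
                merged[name] = value
    return list(groups.values())


# Hypothetical sample data: the first two rows share vf, message and timestamp.
rows = [
    {"timestamp": 1, "vf": "a", "message": "x", "host": None},
    {"timestamp": 1, "vf": "a", "message": "x", "host": "h1"},
    {"timestamp": 2, "vf": "a", "message": "x", "host": "h2"},
]

by_columns = dedup(rows, ["vf", "message"])  # two rows; the first keeps host "h1"
by_all = dedup(rows)                         # three rows; host differs in each
```

In this model, the two rows at timestamp 1 collapse when vf and message are the key, and the non-null host value survives. With no columns given, the rows differ in host and none of them collapse.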

Usage

dedup [ columnname ] ...

Argument      Type          Required    Multiple
columnname    expression    Optional    Can be multiple

Accelerable

dedup is always accelerable if the input is accelerable. A dataset that only uses accelerable verbs can be accelerated, making queries on the dataset respond faster.

Examples

dedup vf, message

Collapse rows that have identical values in the vf and message columns and identical timestamps into a single row.

dedup

Remove duplicate rows from the input dataset.

Aliases

  • distinct