pivot

Type of operation: Aggregate

Description

The pivot verb rotates a table by transforming rows into columns.

This verb can be used to transform a narrow table (e.g. timestamp, year, product, sales) into a wider table (e.g. timestamp, year, laptop_sales, tablet_sales, smartphone_sales).

The arguments are as follows:

  • options(align:x): Optional argument to specify alignment, which denotes the period for aggregating nearby data points together onto a time grid

  • name: Expression used to select rows for generating the name_value rows

  • value: Expression used to populate the name_value columns

  • name_value_0, ..., name_value_n: Columns to generate. These arguments can be specified either using the name:expression syntax or by just providing an expression. If name:expression syntax is used, the name of the generated column is set to name. If only expression is passed, pivot will try to auto-generate a name. The rows of each name_value column is set to:

    • Any non-null result of evaluating the value argument, if the value of name argument equals name_value and no alignment is specified

    • Last non-null result of evaluating the value argument, if the value of name argument equals name_value and an alignment is specified using options(align:x)

    • Null, otherwise

  • group_by: Optional grouping. Default grouping is all columns not referenced in the preceding arguments. When no alignment is specified, there is also an implicit grouping on the timestamp column, similar to how the timestats verb treats timestamp columns.

Additional details:

  • None of the arguments can contain an aggregation function

  • pivot is not supported for interval and resource datasets

  • order_by specification is not allowed

Usage

pivot [ options ], name, value, name_value_1, name_value_2, ..., [ groupby ]

Argument

Type

Optional

Repeatable

Restrictions

options

options

yes

no

constant

name

storable

no

no

column

value

storable

no

no

column

name_value

expression

no

yes

none

groupby

grouping

yes

no

constant

Options

Option

Type

Meaning

align

duration

Specifies the period for aggregating nearby data points together onto a time grid

Accelerable

pivot is sometimes accelerable, depending on options used. A dataset that only uses accelerable verbs can be accelerated, making queries on the dataset respond faster.

Examples

pivot product, sales, laptop_sales:"laptop", tablet_sales:"tablet", smartphone_sales:"smartphone", group_by(year)

Consider the following input table A:

year

product

sales

2021

“laptop”

100

2021

“tablet”

200

2022

“laptop”

150

2022

“tablet”

250

2022

“smartphone”

300

2023

“smartphone”

350

The above statement transforms table A to:

timestamp

year

laptop_sales

tablet_sales

smartphone_sales

10/31/2023 13:51:27

2021

100

200

NULL

10/31/2023 15:57:03

2022

150

250

300

10/31/2023 15:60:12

2023

NULL

NULL

NULL

pivot product, sales, "laptop", "tablet", "smartphone", group_by(year)

The above statement transforms table A to:

year

laptop

tablet

smartphone

2021

100

200

NULL

2022

150

250

300

2023

NULL

NULL

NULL

pivot in this case auto-generates the column names.

pivot product, sales, "laptop", "tablet", "smartphone"

The above statement transforms table A to:

year

laptop

tablet

smartphone

2021

100

200

NULL

2022

150

250

300

2023

NULL

NULL

NULL

The group_by optional and in this case is inferred to be group_by(year).

pivot concat_strings(product, "item"), sales * 2, "laptop_item", "tablet_item", "smartphone_item", group_by(year)

The above statement transforms table A to:

year

laptop_item

tablet_item

smartphone_item

2021

200

400

NULL

2022

300

500

600

2023

NULL

NULL

NULL

pivot accepts any scalar expression not containing an aggregate function as an argument.

pivot product, sales, "laptop", "tablet", "smartphone", group_by(year)

Consider the following input table B:

year

product

sales

2021

“laptop”

100

2021

“laptop”

500

2021

“tablet”

200

2022

“laptop”

150

2022

“tablet”

250

2022

“tablet”

375

2022

“smartphone”

300

2023

“smartphone”

350

The result of the above statement for this input is non-deterministic because two groups (namely “laptop” for the year 2021 and “tablet” for the year 2022) contains multiple non-null value results. In such cases, pivot will pick any non-null result.

One possible output is:

year

laptop_sales

tablet_sales

smartphone_sales

2021

100

200

NULL

2022

150

250

300

2023

NULL

NULL

NULL

Again, consider another example, input table C:

timestamp

year

product

sales

10/31/2023 13:51:27

2021

“laptop”

100

10/31/2023 13:52:29

2021

“tablet”

200

10/31/2023 15:57:03

2022

“laptop”

150

10/31/2023 15:58:13

2022

“tablet”

250

10/31/2023 15:59:35

2022

“smartphone”

300

10/31/2023 16:00:12

2023

“smartphone”

350

The result of the above statement for this input, however, is not non-deterministic, despite the timestamp column having conflicting values per group. This is because the timestamp column is treated specially: it is implicitly used as a grouping column, similar to how the timestats verb treats timestamp columns.

The output is:

timestamp

year

laptop_sales

tablet_sales

smartphone_sales

10/31/2023 13:51:27

2021

100

NULL

NULL

10/31/2023 13:52:29

2021

NULL

200

NULL

10/31/2023 15:57:03

2022

150

NULL

NULL

10/31/2023 15:58:13

2022

NULL

250

NULL

10/31/2023 15:59:35

2022

NULL

NULL

300

10/31/2023 16:00:12

2023

NULL

NULL

350

For inputs with conflicting timestamps, it is recommended that you use the options(align:x) argument, as demonstrated in the next example.

pivot options(align:3m), product, sales, "laptop", "tablet", "smartphone", group_by(year)

The above statement, for the window 10/31/2023 15:00:00 to 10/31/2023 17:00:00, transforms table C to:

_c_bucket

_c_valid_from

_c_valid_to

year

laptop

tablet

smartphone

9437697

10/31/2023 13:51:00

10/31/2023 13:54:00

2021

100

200

NULL

9437739

10/31/2023 15:57:00

10/31/2023 16:00:00

2022

150

250

300

9437740

10/31/2023 16:00:00

10/31/2023 16:03:00

2023

NULL

NULL

350