pivot [options(...)]?, name: col storable, value: col storable, [name_value: expression]+, [group_by(...)]?

The pivot verb rotates a table by transforming rows into columns.

This verb can be used to transform a narrow table (e.g. timestamp, year, product, sales) into a wider table (e.g. timestamp, year, laptop_sales, tablet_sales, smartphone_sales).

The arguments are as follows:

  • options(align:x): Optional argument to specify alignment, which denotes the period for aggregating nearby data points together onto a time grid
  • name: Expression used to select rows for generating the name_value rows
  • value: Expression used to populate the name_value columns
  • name_value_0, ..., name_value_n: Columns to generate. These arguments can be specified either using the name:expression syntax or by just providing an expression. If name:expression syntax is used, the name of the generated column is set to name. If only expression is passed, pivot will try to auto-generate a name. The rows of each name_value column is set to:
    • Any non-null result of evaluating the value argument, if the value of name argument equals name_value and no alignment is specified
    • Last non-null result of evaluating the value argument, if the value of name argument equals name_value and an alignment is specified using options(align:x)
    • Null, otherwise
  • group_by: Optional grouping. Default grouping is all columns not referenced in the preceding arguments. When no alignment is specified, there is also an implicit grouping on the timestamp column, similar to how the timestats verb treats timestamp columns.

Additional details:

  • None of the arguments can contain an aggregation function
  • pivot is not supported for interval and resource datasets
  • order_by specification is not allowed

Options

OptionTypeMeaning
aligndurationSpecifies the period for aggregating nearby data points together onto a time grid

Categories

Accelerable

pivot is sometimes accelerable, depending on options used. A dataset that only uses accelerable verbs can be accelerated, making queries on the dataset respond faster.

Examples

pivot product, sales, laptop_sales:"laptop", tablet_sales:"tablet", smartphone_sales:"smartphone", group_by(year)

Consider the following input table A:

yearproductsales
2021"laptop"100
2021"tablet"200
2022"laptop"150
2022"tablet"250
2022"smartphone"300
2023"smartphone"350

The above statement transforms table A to:

timestampyearlaptop_salestablet_salessmartphone_sales
10/31/2023 13:51:272021100200NULL
10/31/2023 15:57:032022150250300
10/31/2023 15:60:122023NULLNULLNULL
pivot product, sales, "laptop", "tablet", "smartphone", group_by(year)

The above statement transforms table A to:

yearlaptoptabletsmartphone
2021100200NULL
2022150250300
2023NULLNULLNULL

pivot in this case auto-generates the column names.

pivot product, sales, "laptop", "tablet", "smartphone"

The above statement transforms table A to:

yearlaptoptabletsmartphone
2021100200NULL
2022150250300
2023NULLNULLNULL

The group_by optional and in this case is inferred to be group_by(year).

pivot concat_strings(product, "item"), sales * 2, "laptop_item", "tablet_item", "smartphone_item", group_by(year)

The above statement transforms table A to:

yearlaptop_itemtablet_itemsmartphone_item
2021200400NULL
2022300500600
2023NULLNULLNULL

pivot accepts any scalar expression not containing an aggregate function as an argument.

pivot product, sales, "laptop", "tablet", "smartphone", group_by(year)

Consider the following input table B:

yearproductsales
2021"laptop"100
2021"laptop"500
2021"tablet"200
2022"laptop"150
2022"tablet"250
2022"tablet"375
2022"smartphone"300
2023"smartphone"350

The result of the above statement for this input is non-deterministic because two groups (namely "laptop" for the year 2021 and "tablet" for the year 2022) contains multiple non-null value results. In such cases, pivot will pick any non-null result.

One possible output is:

yearlaptop_salestablet_salessmartphone_sales
2021100200NULL
2022150250300
2023NULLNULLNULL

Again, consider another example, input table C:

timestampyearproductsales
10/31/2023 13:51:272021"laptop"100
10/31/2023 13:52:292021"tablet"200
10/31/2023 15:57:032022"laptop"150
10/31/2023 15:58:132022"tablet"250
10/31/2023 15:59:352022"smartphone"300
10/31/2023 16:00:122023"smartphone"350

The result of the above statement for this input, however, is not non-deterministic, despite the timestamp column having conflicting values per group. This is because the timestamp column is treated specially: it is implicitly used as a grouping column, similar to how the timestats verb treats timestamp columns.

The output is:

timestampyearlaptop_salestablet_salessmartphone_sales
10/31/2023 13:51:272021100NULLNULL
10/31/2023 13:52:292021NULL200NULL
10/31/2023 15:57:032022150NULLNULL
10/31/2023 15:58:132022NULL250NULL
10/31/2023 15:59:352022NULLNULL300
10/31/2023 16:00:122023NULLNULL350

For inputs with conflicting timestamps, it is recommended that you use the options(align:x) argument, as demonstrated in the next example.

pivot options(align:3m), product, sales, "laptop", "tablet", "smartphone", group_by(year)

The above statement, for the window 10/31/2023 15:00:00 to 10/31/2023 17:00:00, transforms table C to:

_c_bucket_c_valid_from_c_valid_toyearlaptoptabletsmartphone
943769710/31/2023 13:51:0010/31/2023 13:54:002021100200NULL
943773910/31/2023 15:57:0010/31/2023 16:00:002022150250300
943774010/31/2023 16:00:0010/31/2023 16:03:002023NULLNULL350