pivot¶

Type of operation: Aggregate

Description¶

The pivot verb rotates a table by transforming rows into columns.

This verb can be used to transform a narrow table (e.g. timestamp, year, product, sales) into a wider table (e.g. timestamp, year, laptop_sales, tablet_sales, smartphone_sales).

The arguments are as follows:

options(align:x): Optional argument to specify alignment, which denotes the period for aggregating nearby data points together onto a time grid
name: Expression used to select rows for generating the name_value rows
value: Expression used to populate the name_value columns
name_value_0, ..., name_value_n: Columns to generate. These arguments can be specified either using the name:expression syntax or by just providing an expression. If name:expression syntax is used, the name of the generated column is set to name. If only expression is passed, pivot will try to auto-generate a name. The rows of each name_value column is set to:
- Any non-null result of evaluating the value argument, if the value of name argument equals name_value and no alignment is specified
- Last non-null result of evaluating the value argument, if the value of name argument equals name_value and an alignment is specified using options(align:x)
- Null, otherwise
group_by: Optional grouping. Default grouping is all columns not referenced in the preceding arguments. When no alignment is specified, there is also an implicit grouping on the timestamp column, similar to how the timestats verb treats timestamp columns.

Additional details:

None of the arguments can contain an aggregation function
pivot is not supported for interval and resource datasets
order_by specification is not allowed

Usage¶

pivot [ options ], name, value, name_value_1, name_value_2, ..., [ groupby ]

Argument	Type	Optional	Repeatable	Restrictions
options	options	yes	no	constant
name	storable	no	no	column
value	storable	no	no	column
name_value	expression	no	yes	none
groupby	grouping	yes	no	constant

Options¶

Option	Type	Meaning
align	duration	Specifies the period for aggregating nearby data points together onto a time grid

Accelerable¶

pivot is sometimes accelerable, depending on options used. A dataset that only uses accelerable verbs can be accelerated, making queries on the dataset respond faster.

Examples¶

pivot product, sales, laptop_sales:"laptop", tablet_sales:"tablet", smartphone_sales:"smartphone", group_by(year)

Consider the following input table A:

year	product	sales
2021	“laptop”	100
2021	“tablet”	200
2022	“laptop”	150
2022	“tablet”	250
2022	“smartphone”	300
2023	“smartphone”	350

The above statement transforms table A to:

timestamp	year	laptop_sales	tablet_sales	smartphone_sales
10/31/2023 13:51:27	2021	100	200	NULL
10/31/2023 15:57:03	2022	150	250	300
10/31/2023 15:60:12	2023	NULL	NULL	NULL

pivot product, sales, "laptop", "tablet", "smartphone", group_by(year)

The above statement transforms table A to:

year	laptop	tablet	smartphone
2021	100	200	NULL
2022	150	250	300
2023	NULL	NULL	NULL

pivot in this case auto-generates the column names.

pivot product, sales, "laptop", "tablet", "smartphone"

The above statement transforms table A to:

year	laptop	tablet	smartphone
2021	100	200	NULL
2022	150	250	300
2023	NULL	NULL	NULL

The group_by optional and in this case is inferred to be group_by(year).

pivot concat_strings(product, "item"), sales * 2, "laptop_item", "tablet_item", "smartphone_item", group_by(year)

The above statement transforms table A to:

year	laptop_item	tablet_item	smartphone_item
2021	200	400	NULL
2022	300	500	600
2023	NULL	NULL	NULL

pivot accepts any scalar expression not containing an aggregate function as an argument.

pivot product, sales, "laptop", "tablet", "smartphone", group_by(year)

Consider the following input table B:

year	product	sales
2021	“laptop”	100
2021	“laptop”	500
2021	“tablet”	200
2022	“laptop”	150
2022	“tablet”	250
2022	“tablet”	375
2022	“smartphone”	300
2023	“smartphone”	350

The result of the above statement for this input is non-deterministic because two groups (namely “laptop” for the year 2021 and “tablet” for the year 2022) contains multiple non-null value results. In such cases, pivot will pick any non-null result.

One possible output is:

year	laptop_sales	tablet_sales	smartphone_sales
2021	100	200	NULL
2022	150	250	300
2023	NULL	NULL	NULL

Again, consider another example, input table C:

timestamp	year	product	sales
10/31/2023 13:51:27	2021	“laptop”	100
10/31/2023 13:52:29	2021	“tablet”	200
10/31/2023 15:57:03	2022	“laptop”	150
10/31/2023 15:58:13	2022	“tablet”	250
10/31/2023 15:59:35	2022	“smartphone”	300
10/31/2023 16:00:12	2023	“smartphone”	350

The result of the above statement for this input, however, is not non-deterministic, despite the timestamp column having conflicting values per group. This is because the timestamp column is treated specially: it is implicitly used as a grouping column, similar to how the timestats verb treats timestamp columns.

The output is:

timestamp	year	laptop_sales	tablet_sales	smartphone_sales
10/31/2023 13:51:27	2021	100	NULL	NULL
10/31/2023 13:52:29	2021	NULL	200	NULL
10/31/2023 15:57:03	2022	150	NULL	NULL
10/31/2023 15:58:13	2022	NULL	250	NULL
10/31/2023 15:59:35	2022	NULL	NULL	300
10/31/2023 16:00:12	2023	NULL	NULL	350

For inputs with conflicting timestamps, it is recommended that you use the options(align:x) argument, as demonstrated in the next example.

pivot options(align:3m), product, sales, "laptop", "tablet", "smartphone", group_by(year)

The above statement, for the window 10/31/2023 15:00:00 to 10/31/2023 17:00:00, transforms table C to:

_c_bucket	_c_valid_from	_c_valid_to	year	laptop	tablet	smartphone
9437697	10/31/2023 13:51:00	10/31/2023 13:54:00	2021	100	200	NULL
9437739	10/31/2023 15:57:00	10/31/2023 16:00:00	2022	150	250	300
9437740	10/31/2023 16:00:00	10/31/2023 16:03:00	2023	NULL	NULL	350