pivot¶
Type of operation: Aggregate
Description¶
The pivot
verb rotates a table by transforming rows into columns.
This verb can be used to transform a narrow table (e.g. timestamp
, year
, product
, sales
) into a wider table (e.g. timestamp
, year
, laptop_sales
, tablet_sales
, smartphone_sales
).
The arguments are as follows:
options(align:x)
: Optional argument to specify alignment, which denotes the period for aggregating nearby data points together onto a time gridname
: Expression used to select rows for generating thename_value
rowsvalue
: Expression used to populate thename_value
columnsname_value_0, ..., name_value_n
: Columns to generate. These arguments can be specified either using thename:expression
syntax or by just providing anexpression
. Ifname:expression
syntax is used, the name of the generated column is set toname
. If onlyexpression
is passed,pivot
will try to auto-generate a name. The rows of eachname_value
column is set to:Any non-null result of evaluating the
value
argument, if the value ofname
argument equalsname_value
and no alignment is specifiedLast non-null result of evaluating the
value
argument, if the value ofname
argument equalsname_value
and an alignment is specified usingoptions(align:x)
Null, otherwise
group_by
: Optional grouping. Default grouping is all columns not referenced in the preceding arguments. When no alignment is specified, there is also an implicit grouping on the timestamp column, similar to how thetimestats
verb treats timestamp columns.
Additional details:
None of the arguments can contain an aggregation function
pivot
is not supported for interval and resource datasetsorder_by
specification is not allowed
Usage¶
pivot [ options ], name, value, name_value_1, name_value_2, ..., [ groupby ]
Argument |
Type |
Optional |
Repeatable |
Restrictions |
---|---|---|---|---|
options |
options |
yes |
no |
constant |
name |
storable |
no |
no |
column |
value |
storable |
no |
no |
column |
name_value |
expression |
no |
yes |
none |
groupby |
grouping |
yes |
no |
constant |
Options¶
Option |
Type |
Meaning |
---|---|---|
align |
duration |
Specifies the period for aggregating nearby data points together onto a time grid |
Accelerable¶
pivot is sometimes accelerable, depending on options used. A dataset that only uses accelerable verbs can be accelerated, making queries on the dataset respond faster.
Examples¶
pivot product, sales, laptop_sales:"laptop", tablet_sales:"tablet", smartphone_sales:"smartphone", group_by(year)
Consider the following input table A:
year |
product |
sales |
---|---|---|
2021 |
“laptop” |
100 |
2021 |
“tablet” |
200 |
2022 |
“laptop” |
150 |
2022 |
“tablet” |
250 |
2022 |
“smartphone” |
300 |
2023 |
“smartphone” |
350 |
The above statement transforms table A to:
timestamp |
year |
laptop_sales |
tablet_sales |
smartphone_sales |
---|---|---|---|---|
10/31/2023 13:51:27 |
2021 |
100 |
200 |
NULL |
10/31/2023 15:57:03 |
2022 |
150 |
250 |
300 |
10/31/2023 15:60:12 |
2023 |
NULL |
NULL |
NULL |
pivot product, sales, "laptop", "tablet", "smartphone", group_by(year)
The above statement transforms table A to:
year |
laptop |
tablet |
smartphone |
---|---|---|---|
2021 |
100 |
200 |
NULL |
2022 |
150 |
250 |
300 |
2023 |
NULL |
NULL |
NULL |
pivot
in this case auto-generates the column names.
pivot product, sales, "laptop", "tablet", "smartphone"
The above statement transforms table A to:
year |
laptop |
tablet |
smartphone |
---|---|---|---|
2021 |
100 |
200 |
NULL |
2022 |
150 |
250 |
300 |
2023 |
NULL |
NULL |
NULL |
The group_by
optional and in this case is inferred to be group_by(year)
.
pivot concat_strings(product, "item"), sales * 2, "laptop_item", "tablet_item", "smartphone_item", group_by(year)
The above statement transforms table A to:
year |
laptop_item |
tablet_item |
smartphone_item |
---|---|---|---|
2021 |
200 |
400 |
NULL |
2022 |
300 |
500 |
600 |
2023 |
NULL |
NULL |
NULL |
pivot
accepts any scalar expression not containing an aggregate function as an argument.
pivot product, sales, "laptop", "tablet", "smartphone", group_by(year)
Consider the following input table B:
year |
product |
sales |
---|---|---|
2021 |
“laptop” |
100 |
2021 |
“laptop” |
500 |
2021 |
“tablet” |
200 |
2022 |
“laptop” |
150 |
2022 |
“tablet” |
250 |
2022 |
“tablet” |
375 |
2022 |
“smartphone” |
300 |
2023 |
“smartphone” |
350 |
The result of the above statement for this input is non-deterministic because two groups (namely “laptop” for the year 2021 and “tablet” for the year 2022) contains multiple non-null value
results. In such cases, pivot
will pick any non-null result.
One possible output is:
year |
laptop_sales |
tablet_sales |
smartphone_sales |
---|---|---|---|
2021 |
100 |
200 |
NULL |
2022 |
150 |
250 |
300 |
2023 |
NULL |
NULL |
NULL |
Again, consider another example, input table C:
timestamp |
year |
product |
sales |
---|---|---|---|
10/31/2023 13:51:27 |
2021 |
“laptop” |
100 |
10/31/2023 13:52:29 |
2021 |
“tablet” |
200 |
10/31/2023 15:57:03 |
2022 |
“laptop” |
150 |
10/31/2023 15:58:13 |
2022 |
“tablet” |
250 |
10/31/2023 15:59:35 |
2022 |
“smartphone” |
300 |
10/31/2023 16:00:12 |
2023 |
“smartphone” |
350 |
The result of the above statement for this input, however, is not non-deterministic, despite the timestamp column having conflicting values per group. This is because the timestamp column is treated specially: it is implicitly used as a grouping column, similar to how the timestats
verb treats timestamp columns.
The output is:
timestamp |
year |
laptop_sales |
tablet_sales |
smartphone_sales |
---|---|---|---|---|
10/31/2023 13:51:27 |
2021 |
100 |
NULL |
NULL |
10/31/2023 13:52:29 |
2021 |
NULL |
200 |
NULL |
10/31/2023 15:57:03 |
2022 |
150 |
NULL |
NULL |
10/31/2023 15:58:13 |
2022 |
NULL |
250 |
NULL |
10/31/2023 15:59:35 |
2022 |
NULL |
NULL |
300 |
10/31/2023 16:00:12 |
2023 |
NULL |
NULL |
350 |
For inputs with conflicting timestamps, it is recommended that you use the options(align:x)
argument, as demonstrated in the next example.
pivot options(align:3m), product, sales, "laptop", "tablet", "smartphone", group_by(year)
The above statement, for the window 10/31/2023 15:00:00
to 10/31/2023 17:00:00
, transforms table C to:
_c_bucket |
_c_valid_from |
_c_valid_to |
year |
laptop |
tablet |
smartphone |
---|---|---|---|---|---|---|
9437697 |
10/31/2023 13:51:00 |
10/31/2023 13:54:00 |
2021 |
100 |
200 |
NULL |
9437739 |
10/31/2023 15:57:00 |
10/31/2023 16:00:00 |
2022 |
150 |
250 |
300 |
9437740 |
10/31/2023 16:00:00 |
10/31/2023 16:03:00 |
2023 |
NULL |
NULL |
350 |