flatten
Type of operation: Semistructured
Description
Given an object or array input, recursively flatten all child elements into key-value columns, with null values for intermediate paths.
The key and value child element columns are named ‘_c_NAME_path’ and ‘_c_NAME_value’, where NAME is replaced with the original column name.
flatten is a more expensive operation than flatten_leaves or flatten_single. By default, column types are not suggested (‘suggesttypes’ = ‘false’). See also flatten_leaves, flatten_single, and flatten_all.
Usage
flatten pathexpression, [ suggesttypes ]
Argument | Type | Optional | Repeatable | Restrictions |
---|---|---|---|---|
pathexpression | array or object | no | no | column |
suggesttypes | bool | yes | no | none |
Accelerable
flatten is always accelerable if the input is accelerable. A dataset that only uses accelerable verbs can be accelerated, making queries on the dataset respond faster.
Examples
flatten foo
Given this JSON in column foo:
foo |
---|
|
flatten produces:
_c_foo_path | _c_foo_value |
---|---|
 | 3 |
 | 1 |
 | 2 |
 | null |
 | null |
 | null |
 | null |
 | null |
flatten recurses through the JSON object and produces new columns containing every possible path and its corresponding value. Intermediate key paths get null values, so the full tree is returned. Column ‘foo’ is removed.
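The recursion can be illustrated outside OPAL with a minimal Python sketch. It is not Observe's implementation: the sample document, the path syntax (`a.b[0]`), and the row order are all assumptions made for illustration, and the real verb may format paths and order rows differently. It only shows the idea of emitting one row per node, with a null value for every intermediate node.

```python
from typing import Any, Iterator, Tuple


def flatten(value: Any, path: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield one (path, value) row for every node below `value`.

    Objects and arrays are emitted with a None value (the "null
    intermediates"); leaves are emitted with their own value.
    The path syntax used here is hypothetical.
    """
    if isinstance(value, dict):
        children = [(f"{path}.{k}" if path else str(k), v) for k, v in value.items()]
    elif isinstance(value, list):
        children = [(f"{path}[{i}]", v) for i, v in enumerate(value)]
    else:
        return  # a leaf has no children to visit
    for child_path, child in children:
        is_container = isinstance(child, (dict, list))
        yield child_path, (None if is_container else child)
        yield from flatten(child, child_path)


# Hypothetical input, not the JSON from the example above.
sample = {"a": {"b": [1, 2]}, "c": 3}
rows = list(flatten(sample))
for row_path, row_value in rows:
    print(row_path, row_value)
```

Each tuple corresponds to one output row, with the first element playing the role of ‘_c_foo_path’ and the second the role of ‘_c_foo_value’.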
flatten foo, true
Given this JSON in column foo:
foo |
---|
|
flatten produces:
_c_foo_path | _c_foo_value | _c_foo_type |
---|---|---|
 | 3 | int64 |
 | 1 | int64 |
 | 2 | int64 |
 | null | null |
 | null | null |
 | null | null |
 | null | null |
 | null | null |
flatten recurses through the JSON object and produces new columns containing every possible path and its corresponding value. Intermediate key paths get null values, so the full tree is returned. Because ‘suggesttypes’ is ‘true’, flatten also attempts to determine each value’s type, creating a third (hidden) column named ‘_c_foo_type’. Column ‘foo’ is removed.
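The effect of the type suggestion on each row can be sketched in Python as follows. This is an illustration under assumptions, not Observe's logic: the rows and their paths are hypothetical, and only the ‘int64’ label is taken from the output above. Any other value, including null intermediates, is left without a suggestion rather than given a guessed label.

```python
from typing import Any, List, Optional, Tuple


def suggest_type(value: Any) -> Optional[str]:
    """Return a type label for a leaf value, or None when unknown.

    Only "int64" appears in the documented output, so it is the only
    label produced here; other labels are deliberately not invented.
    """
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, int) and not isinstance(value, bool):
        return "int64"
    return None


def with_types(rows: List[Tuple[str, Any]]) -> List[Tuple[str, Any, Optional[str]]]:
    """Append a suggested type to each (path, value) row."""
    return [(path, value, suggest_type(value)) for path, value in rows]


# Hypothetical flattened rows; the path syntax is an assumption.
flat_rows = [("a", None), ("a.b[0]", 1), ("c", 3)]
typed_rows = with_types(flat_rows)
for typed_row in typed_rows:
    print(typed_row)
```

The third element of each tuple plays the role of ‘_c_foo_type’: intermediate rows carry null in both the value and the type position, matching the shape of the table above.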