flatten

Type of operation: Semistructured

Description

Given an object or array input, recursively flatten all child elements into key-value columns, with null intermediates.

The key and value child element columns are named ‘c_NAME_path’ and ‘_c_NAME_value’. NAME is replaced with the original column name.

Flatten is a more expensive operation than flatten_leaves or flatten_single. The default is to not suggest column types (‘suggesttypes’ = ‘false’.) See also flatten_leaves, flatten_single, and flatten_all.

Usage

flatten pathexpression, [ suggesttypes ]

Argument

Type

Optional

Repeatable

Restrictions

pathexpression

array or object

no

no

column

suggesttypes

bool

yes

no

none

Accelerable

flatten is always accelerable if the input is accelerable. A dataset that only uses accelerable verbs can be accelerated, making queries on the dataset respond faster.

Examples

flatten foo

Given this JSON in column foo:

foo

{"a":{"aa":1},"b":{"bb":[{"bb1":2},{"bb2":3}]}}

flatten produces:

_c_foo_path

_c_foo_value

b.bb[1].bb2

3

a.aa

1

b.bb[0].bb1

2

a

null

b.bb[1]

null

b

null

b.bb[0]

null

b.bb

null

It recurses the JSON object to produce new columns that contain every possible path and its corresponding value, with null values for intermediate key paths so the full tree is returned. Column ‘foo’ will be removed.

flatten foo, true

Given this JSON in column foo:

foo

{"a":{"aa":1},"b":{"bb":[{"bb1":2},{"bb2":3}]}}

flatten produces:

_c_foo_path

_c_foo_value

_c_foo_type

b.bb[1].bb2

3

int64

a.aa

1

int64

b.bb[0].bb1

2

int64

a

null

null

b.bb[1]

null

null

b

null

null

b.bb[0]

null

null

b.bb

null

null

It recurses the JSON object to produce new columns that contain every possible path and its corresponding value, with null values for intermediate key paths so the full tree is returned. It will also attempt to determine the value’s type, creating a third (hidden) column named ‘_c_foo_type’. Column ‘foo’ will be removed.