flatten_leaves

Type of operation: Semistructured

Description

Given an object or array input, recursively flatten all child elements into key-value columns, returning only leaf values.

The key and value child element columns are named ‘_c_NAME_path’ and ‘_c_NAME_value’. NAME is replaced with the original column name.

flatten_leaves is a less expensive operation than flatten or flatten_all. By default, column types are not suggested (‘suggesttypes’ = ‘false’). See also flatten, flatten_single, and flatten_all.

Usage

flatten_leaves pathexpression, [ suggesttypes ]

| Argument | Type | Optional | Repeatable | Restrictions |
|---|---|---|---|---|
| pathexpression | array or object | no | no | column |
| suggesttypes | bool | yes | no | none |

Accelerable

flatten_leaves is always accelerable if the input is accelerable. A dataset that only uses accelerable verbs can be accelerated, making queries on the dataset respond faster.

Examples

flatten_leaves foo

Given this JSON in column foo:

foo

{"a":{"aa":1},"b":{"bb":[{"bb1":2},{"bb2":3}]}}

flatten_leaves produces:

| _c_foo_path | _c_foo_value |
|---|---|
| a.aa | 1 |
| b.bb[0].bb1 | 2 |
| b.bb[1].bb2 | 3 |

It recurses through the JSON object and produces new columns that contain every leaf path and its corresponding value, without intermediate objects. Column ‘foo’ will be removed.
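The recursion described above can be sketched in Python. This is only an illustration of the path/value output shown in the example, not the verb's actual implementation; the function name `sketch_flatten_leaves` is hypothetical, and the handling of empty objects and arrays (they yield no rows here) is an assumption.

```python
import json

def sketch_flatten_leaves(value, path=""):
    """Yield (path, leaf_value) pairs for every leaf of a nested JSON value."""
    if isinstance(value, dict):
        # Object: extend the path with ".key" (no leading dot at the root).
        for key, child in value.items():
            child_path = f"{path}.{key}" if path else key
            yield from sketch_flatten_leaves(child, child_path)
    elif isinstance(value, list):
        # Array: extend the path with "[index]".
        for index, child in enumerate(value):
            yield from sketch_flatten_leaves(child, f"{path}[{index}]")
    else:
        # Leaf: emit the accumulated path and the value itself.
        yield path, value

foo = json.loads('{"a":{"aa":1},"b":{"bb":[{"bb1":2},{"bb2":3}]}}')
rows = list(sketch_flatten_leaves(foo))
for leaf_path, leaf_value in rows:
    print(leaf_path, leaf_value)
```

Each tuple in `rows` corresponds to one output row, with the first element playing the role of ‘_c_foo_path’ and the second the role of ‘_c_foo_value’.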

flatten_leaves foo, true

Given this JSON in column foo:

foo

{"a":{"aa":1},"b":{"bb":[{"bb1":2},{"bb2":3}]}}

flatten_leaves produces:

| _c_foo_path | _c_foo_value | _c_foo_type |
|---|---|---|
| a.aa | 1 | int64 |
| b.bb[0].bb1 | 2 | int64 |
| b.bb[1].bb2 | 3 | int64 |

It recurses through the JSON object and produces new columns that contain every leaf path and its corresponding value, without intermediate objects. It will also attempt to determine each value’s type, creating a third (hidden) column named ‘_c_foo_type’. Column ‘foo’ will be removed.
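As a rough Python illustration of the type-suggesting variant, the sketch below attaches a type label to each leaf. It is not the verb's actual implementation: the helper names `walk_leaves` and `suggest_type` are hypothetical, and only the ‘int64’ label comes from the example above; the other labels are assumed placeholders.

```python
import json

def walk_leaves(value, path=""):
    """Yield (path, leaf_value) pairs for every leaf of a nested JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from walk_leaves(child, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from walk_leaves(child, f"{path}[{index}]")
    else:
        yield path, value

def suggest_type(value):
    """Return a type label for a leaf value (labels other than int64 are assumed)."""
    # bool must be checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "bool"      # assumed label
    if isinstance(value, int):
        return "int64"     # label shown in the example above
    if isinstance(value, float):
        return "float64"   # assumed label
    if isinstance(value, str):
        return "string"    # assumed label
    return "unknown"       # assumed fallback, e.g. for JSON null

foo = json.loads('{"a":{"aa":1},"b":{"bb":[{"bb1":2},{"bb2":3}]}}')
typed_rows = [(p, v, suggest_type(v)) for p, v in walk_leaves(foo)]
for row in typed_rows:
    print(row)
```

The third element of each tuple plays the role of the ‘_c_foo_type’ column.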

Aliases

  • flattenleaves (deprecated)