width_bucket

Description

Construct equi-width histogram by dividing up the range between min and max with num_buckets buckets, and compute an integer bucket ID for each value. The value range is left-inclusive and right-exclusive, which is usually represented as [min, max). The bucket IDs would start from 1 and end with num_buckets.

For out-of-bound values, special bucket IDs are returned: if a value is less than min, its bucket ID is 0. If a value is greater than or equal to max, its bucket ID is num_buckets + 1.

The function returns null if any input is null or when min is greater than or equal to max (the value range is empty).

Return type

int64

Domain

This is a scalar function (calculates a single output value for a single input row.)

Categories

Usage

width_bucket(value, min, max, num_buckets)

Argument

Type

Optional

Repeatable

Restrictions

value

numeric

no

no

none

min

numeric

no

no

none

max

numeric

no

no

none

num_buckets

int64

no

no

none

Examples

make_col bucket:width_bucket(value, 1000, 2000, 10)
statsby count(1), group_by(bucket)

Compute the histogram by assigning the value to 10 buckets ranging from 1000 to 2000.

make_col bucket:width_bucket(value, window(min(value)), window(max(value)), 10)
statsby count(1), group_by(bucket)

Compute the histogram using the dynamic range determined by the min and max values. Note that the bucket value can go from 1 to 11, because the values that match the max will be assigned to bucket 11.