width_bucket¶
Description¶
Construct equi-width histogram by dividing up the range between min
and max
with num_buckets
buckets, and compute an integer bucket ID for each value. The
value range is left-inclusive and right-exclusive, which is usually represented
as [min, max)
. The bucket IDs would start from 1
and end with num_buckets
.
For out-of-bound values, special bucket IDs are returned: if a value is less
than min
, its bucket ID is 0
. If a value is greater than or equal to max
,
its bucket ID is num_buckets + 1
.
The function returns null if any input is null or when min
is greater than or
equal to max
(the value range is empty).
Return type¶
int64
Domain¶
This is a scalar function (calculates a single output value for a single input row.)
Categories¶
Usage¶
width_bucket(value, min, max, num_buckets)
Argument |
Type |
Optional |
Repeatable |
Restrictions |
---|---|---|---|---|
value |
numeric |
no |
no |
none |
min |
numeric |
no |
no |
none |
max |
numeric |
no |
no |
none |
num_buckets |
int64 |
no |
no |
none |
Examples¶
make_col bucket:width_bucket(value, 1000, 2000, 10)
statsby count(1), group_by(bucket)
Compute the histogram by assigning the value to 10 buckets ranging from 1000 to 2000.
make_col bucket:width_bucket(value, window(min(value)), window(max(value)), 10)
statsby count(1), group_by(bucket)
Compute the histogram using the dynamic range determined by the min and max
values. Note that the bucket
value can go from 1 to 11, because the values
that match the max will be assigned to bucket 11.