prom_quantile
Description
Calculates an approximate percentile value of the distribution in a histogram metric generated by a Prometheus data source.
This function is generally used in the aggregate verb, after the metric has been time-aligned with rate. Aggregation across key dimensions is done internally with sum, so do not specify it separately.
Because of how Prometheus histograms are generated, the metric must have a name ending in _bucket, and there must be a tag named le that holds the bucket boundaries. Additionally, the Grafana collection agent must be configured not to drop these metrics. Dropping them is a common default, because a Prometheus histogram is much larger than a regular counter.
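This page does not spell out how the estimate is computed. As a rough illustration only, the Python sketch below shows the linear interpolation that Prometheus's own histogram_quantile applies to cumulative le buckets. Whether prom_quantile matches it in every edge case is an assumption, and the function name estimate_quantile and the sample bucket counts are hypothetical.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: (upper_bound, cumulative_count) pairs, one per `le` value,
    including the +Inf bucket. Assumption: this mirrors Prometheus's
    histogram_quantile, not necessarily prom_quantile itself.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    buckets = sorted(buckets, key=lambda b: b[0])
    if not buckets or not math.isinf(buckets[-1][0]):
        raise ValueError("buckets must include the +Inf bucket")

    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations, nothing to estimate

    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    for i, (upper, count) in enumerate(buckets):
        if count >= rank:
            break

    if math.isinf(upper):
        # The quantile lies in the overflow bucket, so report the
        # highest finite boundary instead of infinity.
        return buckets[-2][0] if len(buckets) > 1 else math.nan

    # The lowest bucket is assumed to start at 0.
    lower, below = (0.0, 0.0) if i == 0 else buckets[i - 1]
    if i == 0 and upper <= 0:
        return upper
    if count == below:
        return upper
    # Interpolate linearly inside the selected bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)


# Hypothetical cumulative counts for le = 0.1, 0.5, 1.0 and +Inf.
sample = [(0.1, 10), (0.5, 60), (1.0, 90), (math.inf, 100)]
median = estimate_quantile(0.5, sample)   # falls inside the 0.1..0.5 bucket
p95 = estimate_quantile(0.95, sample)     # falls inside the +Inf bucket
```

The result can only be as precise as the bucket boundaries: every value inside a bucket is treated as evenly spread between its lower and upper bound, which is why the function is described as approximate.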
An additional restriction is that prom_quantile must be at the top level of an expression (i.e., it must be the output column value). To do further operations on this value, put it into a column and do the additional calculations in a subsequent step.
Return type
float64
Domain
This is an aggregate function (it aggregates rows over a group in aggregate verbs).
Categories
Usage
prom_quantile(prom_bucket, quantile, [ le_val ])
Argument | Type | Optional | Repeatable | Restrictions |
---|---|---|---|---|
prom_bucket | numeric | no | no | none |
quantile | numeric | no | no | constant |
le_val | float64 | yes | no | none |
Examples
align 5m, rate(m("request_latency_bucket"))
aggregate
p95:prom_quantile(request_latency_bucket, 0.95, tags.le),
group_by(tags.service)
The request latency histogram is aligned to 5-minute windows, and rate is applied to get the rate of samples in each window (needed because the histogram counters are cumulative). The 95th percentile is then estimated with prom_quantile(), grouped by the service tag, based on the le bucket tags generated by the histogram source.
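To make the role of rate and the internal sum more concrete, here is a minimal Python sketch of the arithmetic under stated assumptions. The helper names bucket_rates and sum_by_le, the sample counts, and the counter-reset handling are all hypothetical; the source does not describe how rate treats resets or missing samples.

```python
from collections import defaultdict


def bucket_rates(first, last, window_seconds):
    """Per-second increase of each cumulative bucket counter over a window.

    first/last map (series_id, le) to the cumulative count at the start
    and end of the window.
    """
    rates = {}
    for key, end_count in last.items():
        # Assumption: a series absent at the window start counts from 0.
        start_count = first.get(key, 0)
        increase = end_count - start_count
        if increase < 0:
            # Assumption: a drop means the counter was reset, so the
            # end value is taken as the increase.
            increase = end_count
        rates[key] = increase / window_seconds
    return rates


def sum_by_le(rates):
    """Sum the rates of all series that share the same `le` boundary."""
    totals = defaultdict(float)
    for (_series, le), rate in rates.items():
        totals[le] += rate
    return dict(totals)


# Two hypothetical series of one service, sampled 300 s (5 minutes) apart.
start = {("a", "0.5"): 100, ("a", "+Inf"): 120,
         ("b", "0.5"): 50, ("b", "+Inf"): 50}
end = {("a", "0.5"): 400, ("a", "+Inf"): 420,
       ("b", "0.5"): 350, ("b", "+Inf"): 650}

per_le = sum_by_le(bucket_rates(start, end, 300))
```

Without the rate step, the bucket counts would reflect every observation since each process started, so the percentile would describe the whole lifetime of the series rather than the 5-minute window.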
align 5m, rate(m("request_latency_bucket")), rate(m("request_throughput_bucket"))
aggregate
p95_lat:prom_quantile(request_latency_bucket, 0.95),
p95_thru:prom_quantile(request_throughput_bucket, 0.95),
group_by(tags.service)
make_col bandwidth95_latency95_product:p95_lat * p95_thru
Using the “le” tag in the “label” object column (or another suitably named column), this example estimates the p95 of the request latency and of the request throughput from the predefined Prometheus histogram buckets. It then calculates the “bandwidth delay product” of these two estimates. Because prom_quantile must be at the top level of an expression, this calculation must be done in its own subsequent operation.