make_metric

Type of operation: Metadata, Metrics

Description

Creates a metric dataset from a dataset with a precomputed time grid.

make_metric must specify one or more metric arguments, which are columns of numeric type, and may optionally specify a groupby argument, which takes the tag columns for the metric dataset.

The metric argument only allows these types: int64, float64. The groupby argument accepts any existing columns of the dataset, as long as the column is not used as one of the metric arguments and is not the valid_from or valid_to column of the dataset.
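
For example (a minimal sketch; the latency_ms and region columns are hypothetical), a numeric column can be published as a metric tagged by region:

timechart 5m, avg_latency: avg(latency_ms), group_by(region)
make_metric avg_latency, group_by(region)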

The produced dataset will be a metric dataset with this schema:

  • a timestamp column storing the reporting time for each metric point

  • a metric column storing the metric name for each metric point

  • a value column storing the metric value for each metric point

  • one or more tag columns (as specified in groupby) storing the tags for each metric point

All other columns will be dropped as a result of this verb.
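
For illustration (hypothetical values, using the columns from the first example below), a single time-gridded input row with two metric columns is reshaped into one output row per metric:

Input (time-gridded, one metric column per measurement):

  Valid From           error  total  cluster
  2024-01-01 00:00:00  3      120    prod

Output of make_metric error, total, group_by(cluster):

  timestamp            metric  value  cluster
  2024-01-01 00:00:00  error   3      prod
  2024-01-01 00:00:00  total   120    prod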

Ultimately, this verb helps users avoid the tedious steps involved in publishing a metric dataset, where historically a user would need to manually reshape the dataset using the pick_col, unpivot, make_object, flatten, and interface verbs.

This verb specifically helps publish computed metrics on a time grid, which are typically produced by timechart or align. It is NOT intended to be used to publish arbitrary metric data. For that, use interface "metric" instead.
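
For reference, publishing arbitrary metric data through the metric interface might look roughly like the sketch below; metricName, metricValue, and latency_ms are hypothetical columns, and the exact field mappings required by the interface may differ:

make_col metricName: "request_latency", metricValue: float64(latency_ms)
interface "metric", metric: metricName, value: metricValue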

Usage

make_metric metric_1, metric_2, ..., [ groupby ]

Argument   Type       Optional   Repeatable   Restrictions
metric     numeric    no         yes          column
groupby    grouping   yes        no           constant

Accelerable

make_metric is always accelerable if the input is accelerable. A dataset that only uses accelerable verbs can be accelerated, making queries on the dataset respond faster.

Examples

make_col error: if(status!=200, 1, 0)
timechart 1m, error: sum(error), total: count(), group_by(cluster)
make_metric error, total, group_by(cluster)

From the input dataset, the OPAL first creates an error column that is 1 when status is not 200 and 0 otherwise.

Afterwards, the timechart verb is applied to produce time-gridded data with a 1 minute interval.

Then we apply the make_metric verb to get the following metric dataset:

  • The “Valid From” column of the timechart output becomes the timestamp column

  • The names of the error and total columns become the metric names of the metric points within the metric column

  • The values within the error and total columns become the values of the metric points within the value column

  • The cluster column becomes the tag column of each metric point

  • The rest of the columns are dropped

The above OPAL can be published directly as a metric dataset, and the metrics will show up in the Metrics Explorer. It is NOT necessary to further add interface "metric" or set_metric. You can still optionally use set_metric to provide more metadata; see the example below for details.

align 1m,
  memory_used: avg(m("container_memory_usage_bytes")),
  memory_requested: avg(m("kube_pod_container_resource_requests_memory_bytes"))
aggregate
  pod_memory_utilization: sum(memory_used) / sum(memory_requested),
  group_by(cluster, namespace, podName)
make_metric pod_memory_utilization, group_by(cluster, namespace, podName)

From the input metric dataset, the OPAL applies the align verb, which aligns the metric points to a 1 minute time grid. For each 1 minute bin, it calculates the average of the “container_memory_usage_bytes” metric into the memory_used column and the average of the “kube_pod_container_resource_requests_memory_bytes” metric into the memory_requested column.

Afterwards, it applies the aggregate verb to create the pod_memory_utilization column, which represents the memory utilization ratio for each pod, by dividing the sum of memory_used by the sum of memory_requested, grouped by pod (i.e. cluster, namespace, podName). aggregate preserves the time grid, so the produced dataset still has a 1 minute time grid.

Then we apply the make_metric verb to get the following metric dataset:

  • The “Valid From” column of the aggregate output becomes the timestamp column

  • The name of the pod_memory_utilization column becomes the name of each metric point within the metric column

  • The value within the pod_memory_utilization column becomes the value of each metric point within the value column

  • The cluster, namespace, and podName columns become the tag columns of each metric point

  • The rest of the columns are dropped

timechart 1m, request_payload_size: sum(strlen(http_request)), group_by(cluster, endpoint)
make_metric request_payload_size, group_by(cluster, endpoint)
set_metric options(unit: "B", description: "total HTTP request payload", type: "delta", interval: 1m), "request_payload_size"

From the input dataset, the OPAL applies the timechart verb to produce the request_payload_size column, which is the sum of the string lengths of the http_request column’s values, grouped by the cluster and endpoint columns.

Then we apply the make_metric verb to get the following metric dataset:

  • The “Valid From” column of the timechart verb becomes the timestamp

  • The name of the request_payload_size column becomes the name of each metric point within the metric column

  • The value within the request_payload_size column becomes the value of each metric point within the value column

  • The cluster and endpoint columns become the tag columns of each metric point

  • The rest of the columns are dropped

Since the make_metric verb only produces the metric points themselves and does not carry any additional metric metadata, the set_metric verb can be applied to the metric dataset to add the following metadata:

  • Add a unit of “Bytes” for the “request_payload_size” metric, so when we plot it the unit is displayed automatically

  • Add a description to the “request_payload_size” metric stating that it is the “total HTTP request payload”

  • Specify that the “request_payload_size” metric is a “delta” metric, so when we plot it the right alignment method (i.e. sum()) is chosen

  • Most importantly, mark that the “request_payload_size” metric is reported every 1 minute, so when we plot it a good default resolution is selected