make_metric¶
Type of operation: Metadata, Metrics
Description¶
Creates a metric dataset from a dataset with a precomputed time grid.
make_metric must specify one or more metric arguments, which are columns of numeric type, and may optionally specify a groupby argument, which takes the tag columns for the metric dataset.
The metric argument only allows these types: int64, float64. The groupby argument takes any existing column of the dataset, as long as the column is not used as one of the metric arguments and is not a valid_from or valid_to column of the dataset.
All the metrics are defined as a gauge by default. To define a metric differently, use set_metric afterward.
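For example, here is a minimal sketch (using a hypothetical input with a numeric bytes_sent column and a host tag column) that publishes one metric and then redefines it as a delta:
timechart 1m, bytes_sent: sum(bytes_sent), group_by(host)
make_metric bytes_sent, group_by(host)
set_metric options(type: "delta"), "bytes_sent"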
The produced dataset will be a metric dataset with this schema:
- a timestamp column storing the reporting time for each metric point
- a metric column storing the metric name for each metric point
- a value column storing the metric value for each metric point
- a few tag columns (as specified in groupby) storing the tags for each metric point
All other columns will be dropped as a result of this verb.
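For instance, a dataset with metric columns cpu_usage and mem_usage and a host tag column (hypothetical names) would produce rows shaped like this:
| timestamp | metric | value | host |
|---|---|---|---|
| 10:00:00 | cpu_usage | 0.42 | host-a |
| 10:00:00 | mem_usage | 0.81 | host-a |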
Ultimately, this verb helps users avoid the tedious steps involved in publishing a metric dataset; historically, a user would need to manually reshape the dataset using the pick_col, unpivot, make_object, flatten, and interface verbs.
This verb specifically helps publish computed metrics on a time grid, which are typically produced by timechart or align. It is NOT intended for publishing arbitrary metric data; for that, use interface "metric" instead.
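As a rough sketch of that alternative path (the column names metric_name, metric_value, and host are hypothetical), metric data that is already shaped as one row per metric point can be declared as a metric dataset directly:
// assumes one row per metric point, with the name and value in their own columns
pick_col timestamp, metric: metric_name, value: metric_value, host
interface "metric", metric:metric, value:value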
Usage¶
make_metric metric_1, metric_2, ..., [ groupby ]
| Argument | Type | Optional | Repeatable | Restrictions | 
|---|---|---|---|---|
| metric | numeric | no | yes | column | 
| groupby | grouping | yes | no | constant | 
Accelerable¶
make_metric is always accelerable if the input is accelerable. A dataset that only uses accelerable verbs can be accelerated, making queries on the dataset respond faster.
Examples¶
make_col error: if(status!=200, 1, 0)
timechart 1m, error: sum(error), total: count(), group_by(cluster)
make_metric error, total, group_by(cluster)
From the input dataset, the first statement creates an error column that flags whether status is not 200.
Afterwards, the timechart verb is applied to produce time grid data with a 1 minute interval.
Then we apply the make_metric verb to get the following metric dataset:
- The “Valid From” column of the timechart becomes the timestamp column
- The names of the error and total columns become the names of the metric points within the metric column
- The values within the error and total columns become the values of the metric points within the value column
- The cluster column becomes the tag column of each metric point
- The rest of the columns are dropped (see the illustrative rows below)
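Concretely, one hypothetical timechart output row (all values illustrative):
| Valid From | cluster | error | total |
|---|---|---|---|
| 10:00:00 | prod | 3 | 120 |
would be reshaped by make_metric into:
| timestamp | metric | value | cluster |
|---|---|---|---|
| 10:00:00 | error | 3 | prod |
| 10:00:00 | total | 120 | prod |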
The above OPAL can be published directly as a metric dataset, and the metrics will show up in the metric explorer. It is NOT necessary to additionally apply interface "metric" or set_metric. You can still optionally use set_metric to provide more metadata; see the example below for details.
align 1m,
  memory_used: avg(m("container_memory_usage_bytes")),
  memory_requested: avg(m("kube_pod_container_resource_requests_memory_bytes"))
aggregate
  pod_memory_utilization: sum(memory_used) / sum(memory_requested),
  group_by(cluster, namespace, podName)
make_metric pod_memory_utilization, group_by(cluster, namespace, podName)
From the input metric dataset, it applies the align verb to align the metric points to a time grid of 1 minute. For each 1 minute time bin, it calculates the average of the “container_memory_usage_bytes” metric into the memory_used column and the average of the “kube_pod_container_resource_requests_memory_bytes” metric into the memory_requested column.
Afterwards, it applies the aggregate verb to create the pod_memory_utilization column, which represents the memory utilization ratio for each pod, by dividing the sum of memory_used by the sum of memory_requested, grouped by pod (i.e. cluster, namespace, podName). aggregate preserves the time grid, so the produced dataset still has a time grid of 1 minute.
Then we apply the make_metric verb to get the following metric dataset:
- The “Valid From” column of the aggregate verb becomes the timestamp column
- The name of the pod_memory_utilization column becomes the name of each metric point within the metric column
- The value within the pod_memory_utilization column becomes the value of each metric point within the value column
- The cluster, namespace, and podName columns become the tag columns of each metric point
- The rest of the columns are dropped
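Once published, the new metric can itself be queried like any other; a hedged sketch of consuming it in a downstream query with m() and a gauge-style avg() alignment:
align 1m, utilization: avg(m("pod_memory_utilization"))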
timechart 1m, request_payload_size: sum(strlen(http_request)), group_by(cluster, endpoint)
make_metric request_payload_size, group_by(cluster, endpoint)
set_metric options(unit: "B", description: "total HTTP request payload", type: "delta", interval: 1m), "request_payload_size"
From the input dataset, it applies the timechart verb to produce the request_payload_size column, which is the sum of the string lengths of the http_request column’s values, grouped by the cluster and endpoint columns.
Then we apply the make_metric verb to get the following metric dataset:
- The “Valid From” column of the timechart verb becomes the timestamp column
- The name of the request_payload_size column becomes the name of each metric point within the metric column
- The value within the request_payload_size column becomes the value of each metric point within the value column
- The cluster and endpoint columns become the tag columns of each metric point
- The rest of the columns are dropped
Since the make_metric verb only preserves how each metric point is reported at every 1 minute interval, set_metric can be applied to the metric dataset to add the following metadata:
- Add a unit of “Bytes” to the “request_payload_size” metric, so the unit is displayed automatically when it is plotted
- Add a description noting that the “request_payload_size” metric is the “total HTTP request payload”
- Specify that the “request_payload_size” metric is a “delta” metric, so the right alignment method (i.e. sum()) is chosen when it is plotted
- Most importantly, mark that the “request_payload_size” metric is reported every 1 minute, so a good resolution is selected by default when it is plotted
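Because the metric is now marked as a delta reported at a 1 minute interval, a downstream query would typically align it with sum(); a minimal sketch:
align 1m, payload_bytes: sum(m("request_payload_size"))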