Introduction to Metrics

A metric is any value you can measure over time. It can be blocks used on a filesystem, the number of nodes in a cluster, or a temperature reading. Observe reports Metrics in the form of a time series: a set of values in time order. Each point in a time series represents a measurement from a single resource, with its name, value, and tags.

Types of Metrics Collected by Observe

Each metric collected by Observe should be of a specific type. The type affects how metric values display when queried. You can view the metric type on the Metrics page by hovering over the name of a Metric Dataset and reviewing the information displayed on the left.

Figure 1 - Displaying Details about aws/Synthetic Canary Metrics 2XX

From the details displayed on the card, you can see in the Properties that this metric has the Type gauge.

gauge - The gauge metric type represents a snapshot of events in one time interval. This representative snapshot value is the last value submitted to Observe during the interval. Use a gauge to measure data that reports continuously, such as available disk space or memory used.

For example, when you submit a gauge metric such as temperature from the Open Weather app, the app sends the values 81, 81, 81, 81, 81.5 during a flush time interval, and the last reported value, 81.5, is submitted as the gauge metric value.
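
The "last value wins" behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Observe's implementation; the function name `gauge_value` is hypothetical.

```python
from typing import Sequence


def gauge_value(samples: Sequence[float]) -> float:
    """Return the value a gauge reports for one flush interval: the last sample."""
    if not samples:
        raise ValueError("a gauge needs at least one sample in the interval")
    return samples[-1]


# Temperature readings sent during one flush interval (values from the text above).
readings = [81, 81, 81, 81, 81.5]
print(gauge_value(readings))  # 81.5
```

Earlier readings in the interval are discarded, which is why a gauge suits values where only the current state matters.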

cumulativeCounter - Represents a running count of occurrences and measures the change in the value of the metric from the previous data point. The value can only increase, or reset to zero when restarted.

delta - Represents the overall change to a value since the initial measurement. For example, if CPU utilization initially measures 50% and the next measurement reads 80%, the delta is 30 percentage points.
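
The running-count and overall-change behaviors described above can be sketched in Python as follows. The function names are hypothetical, and treating any decrease as a restart from zero is an assumption made for this sketch rather than a statement of how Observe handles resets.

```python
from typing import List, Sequence


def counter_increments(values: Sequence[float]) -> List[float]:
    """Convert running-count readings into the increase since the previous point.

    A reading lower than its predecessor is treated as a restart: the counter
    is assumed to have reset to zero, so the new reading is itself the increase.
    """
    increments = []
    for previous, current in zip(values, values[1:]):
        if current >= previous:
            increments.append(current - previous)
        else:
            increments.append(current)
    return increments


def delta_since_initial(values: Sequence[float]) -> float:
    """Return the overall change between the first and the latest measurement."""
    return values[-1] - values[0]


print(counter_increments([10, 15, 22, 3, 8]))  # [5, 7, 3, 5]
print(delta_since_initial([50, 80]))           # 30
```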

In certain operations, different metric types may be treated differently. For instance, rate() and deriv() alignment functions behave differently for gauge, cumulativeCounter, and delta metrics. See set_metric for more details.

In addition to its type, each metric carries the following information:

  • Measurement type - describes the type of data reported in each data point. Currently, Observe only supports the float64 type.

  • Unit - describes the unit of measurement such as kb or kB/s.

  • Description - detailed information about the metric.

  • Tags - For a time series, tags describe and differentiate the measurements. You can use them to identify individual time series during metric computations such as align and aggregate.

Figure 1 - Pod memory usage metrics on the Pod dashboard

A Metrics Dataset contains metric data recognized by Observe. Observe optimizes the metric dataset for scalable ingestion and queries, supporting a large number of metrics. A metric dataset has the following properties:

  • Each row in the dataset table describes one point in a time series.

  • It contains a string type metric name column named metric.

  • It contains a float64 metric value column named value.

  • It contains a valid_from column with the measurement time for each data point.

  • The OPAL metric interface designates the dataset as a metric dataset.

  • All columns other than metric, value, and valid_from contain metric tags.
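
As a rough illustration of these properties, the following Python sketch checks one row and returns its tags. It models a row as a plain dictionary, which is an assumption for illustration only; `RESERVED_COLUMNS` and `split_tags` are hypothetical names, not part of Observe.

```python
from typing import Any, Dict

# Columns with a fixed meaning in a metric dataset; every other column is a tag.
RESERVED_COLUMNS = {"valid_from", "metric", "value"}


def split_tags(row: Dict[str, Any]) -> Dict[str, Any]:
    """Validate one metric row and return its tag columns."""
    missing = RESERVED_COLUMNS - set(row)
    if missing:
        raise ValueError(f"row is missing required columns: {sorted(missing)}")
    if not isinstance(row["metric"], str):
        raise TypeError("metric must be a string")
    if not isinstance(row["value"], float):
        raise TypeError("value must be a float (float64)")
    return {name: v for name, v in row.items() if name not in RESERVED_COLUMNS}


row = {
    "valid_from": "00:00:00",
    "metric": "disk_used_bytes",
    "value": 20000000.0,
    "device": "sda1",
}
print(split_tags(row))  # {'device': 'sda1'}
```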

A Metrics Dataset is always an Event Dataset, and its data is either inherited from an upstream Metrics Dataset or created using the OPAL interface verb:

interface "metric", metric:my_name, value:the_reading

Now that you understand the types of Metrics collected by Observe and the Observe apps, use the Metrics Explorer feature to easily view and model Metrics Datasets.

Note

Metrics use OPAL in a worksheet to transform the raw data, add metadata, and create relationships between datasets. If you are not familiar with OPAL, please see OPAL — Observe Processing and Analysis Language

Metrics Video Tour

Figure 2 - Video tour of Observe Metrics Landing page

A metric dataset contains one metric per row - a single data point containing a timestamp, name, value, and zero or more tags. For example, the following table contains values for two metrics:

valid_from  metric            value     tags
00:00:00    disk_used_bytes   20000000  {"device":"sda1"}
00:00:00    disk_total_bytes  50000000  {"device":"sda1"}
00:01:00    disk_used_bytes   10000000  {"device":"sda1"}
00:01:00    disk_total_bytes  50000000  {"device":"sda1"}
00:02:00    disk_used_bytes   40000000  {"device":"sda1"}
00:02:00    disk_total_bytes  50000000  {"device":"sda1"}

Some systems generate this by default, or you can shape other data into the correct form with OPAL.
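
To make the relationship between rows and time series concrete, here is a minimal Python sketch that groups the rows of the example table into one series per metric name and tag set. The tuple layout and the name `group_series` are assumptions for illustration.

```python
from collections import defaultdict

# (valid_from, metric, value, tags) rows copied from the example table.
rows = [
    ("00:00:00", "disk_used_bytes", 20000000.0, {"device": "sda1"}),
    ("00:00:00", "disk_total_bytes", 50000000.0, {"device": "sda1"}),
    ("00:01:00", "disk_used_bytes", 10000000.0, {"device": "sda1"}),
    ("00:01:00", "disk_total_bytes", 50000000.0, {"device": "sda1"}),
    ("00:02:00", "disk_used_bytes", 40000000.0, {"device": "sda1"}),
    ("00:02:00", "disk_total_bytes", 50000000.0, {"device": "sda1"}),
]


def group_series(rows):
    """Group narrow metric rows into time series keyed by (metric, tags)."""
    series = defaultdict(list)
    for valid_from, metric, value, tags in rows:
        key = (metric, tuple(sorted(tags.items())))
        series[key].append((valid_from, value))
    for points in series.values():
        points.sort()  # keep each series in time order
    return dict(series)


for key, points in group_series(rows).items():
    print(key, points)
```

Six rows collapse into two series, one for each metric reported by the device sda1.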

Note

Metric values must be float64. If you need to convert from another type, see the float64() function.

Metric Interfaces

The interface verb maps fields to a metric interface, so subsequent operations know which fields contain the metric names and values. This metadata-only operation prepares a dataset for use as metrics.

Example:

interface "metric"

The data you see doesn’t change, but registering or implementing the metric interface establishes the following conditions:

  • Each row represents one point in a time series

  • A field named metric contains the metric names

  • A field named value contains the metric values

If the metric names and values are already in fields called metric and value, interface discovers them automatically. (See the documentation for interface for more about fields with nonstandard names.)

Modeling Metadata into Metric Datasets

Every metric in Observe can have a type, unit, and description associated with it. Defining a metric using set_metric lets you set values for all three of these.

Observe uses the metadata of a metric to visualize metrics appropriately. For example, rate rollup is chosen for metrics of the type cumulativeCounter, and the Y-axis of metric visualizations selects the unit based on the unit of the chosen metric.

Some metric providers, such as Google Cloud Platform (GCP), AWS, and OTEL, provide metric metadata with every metric point. The metric dataset looks similar to the following table:

Metric  Value  Tag  Type  Unit  Description  Tags

Other providers, such as Prometheus, send metadata separately from the metric, and the metadata does not contain any information about the tags.

A sample observation of a metric point sent by Prometheus with the OBSERVATION_KIND set to prometheus:

{
  "__name__": "collector_request_duration_seconds_bucket",  // <- metric name
  "clusterUid": "23e17bad-48da-427e-9585-ead2231bbcae",     //---+
  "container": "collector",                                 //   |
  "endpoint": "/chronicle/eu-west1/dmgprotect",             //   |
  "instance": "172.20.68.149:11000",                        //   |
  "job": "integrations/kubernetes/pods",                    //   |<- metric tags
  "le": "0.05",                                             //   |
  "namespace": "prod-eu-1",                                 //   |
  "node": "ip-172-20-83-237.eu-central-1.compute.internal", //   |
  "pod": "collector-6dbb6787f9-5bqdn",                      //   |
  "status": "202"                                           //---+
}

A sample observation of metric metadata sent by Prometheus with the OBSERVATION_KIND set to prom_md:

{
  "help": "HTTP request latency in seconds",                    
  "metric_family_name": "collector_request_duration_seconds",  // <- metric name
  "type": "HISTOGRAM",
  "unit": ""
}
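
Note that the metric point above is named collector_request_duration_seconds_bucket, while the metadata record uses the family name collector_request_duration_seconds. Prometheus histograms expose their series with the suffixes _bucket, _sum, and _count, so matching a point to its metadata means accounting for those suffixes. The following Python sketch shows one way to do that lookup; it is an illustration under that assumption, not a description of how Observe joins the two, and `find_metadata` is a hypothetical name.

```python
from typing import Dict, Optional

# Suffixes Prometheus appends to the series of a histogram family.
HISTOGRAM_SUFFIXES = ("_bucket", "_sum", "_count")


def find_metadata(metric_name: str, metadata_by_family: Dict[str, dict]) -> Optional[dict]:
    """Look up the metadata record for a metric point by name."""
    # Exact match first: most metric types use the family name directly.
    if metric_name in metadata_by_family:
        return metadata_by_family[metric_name]
    # Otherwise try stripping a histogram suffix.
    for suffix in HISTOGRAM_SUFFIXES:
        if metric_name.endswith(suffix):
            record = metadata_by_family.get(metric_name[: -len(suffix)])
            if record is not None and record["type"] == "HISTOGRAM":
                return record
    return None


metadata = {
    "collector_request_duration_seconds": {
        "help": "HTTP request latency in seconds",
        "metric_family_name": "collector_request_duration_seconds",
        "type": "HISTOGRAM",
        "unit": "",
    }
}

record = find_metadata("collector_request_duration_seconds_bucket", metadata)
print(record["type"], "-", record["help"])
```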

Modeling Metadata

Observe provides two ways of modeling metric metadata:

  • GCP, AWS, and OTEL metrics, where each metric point includes the metadata

  • Prometheus metrics, where the metrics and metadata are sent separately

For metrics with included metadata, use the following OPAL:

interface "metric", metric:metric, value:value, metricType:type, metricUnit:unit, metricDescription:description

metricType, metricUnit, and metricDescription are optional parameters for the metric interface. When they are defined, Observe finds the metric metadata in the columns mapped to these parameters.

To model metadata sent separately from metrics, store the metrics and metadata in separate datasets. Datasets containing metrics should implement the metric interface, and the dataset containing metadata should implement the metric_metadata interface.

Metric  Type  Unit  Description

Define the metric_metadata interface using the following OPAL:

interface "metric_metadata", metric:metric, metricType:type, metricUnit:unit, metricDescription:description

Tutorial: Shaping Metrics

To show how this works, use the following example of creating metrics from process data. Since you perform most of the shaping with OPAL, the walkthrough focuses on verbs and functions rather than UI actions.

Use a shell script to send data from ps to Observe every five seconds, into a datastream called "metrics-test". Before you convert it to JSON, the original ps output looks like this:

PID   RSS     TIME %CPU COMMAND
  1 12752        1  2.0 systemd
  2     0        0  0.0 kthreadd
  3     0        0  0.0 rcu_gp

Field    Description
PID      Process ID
RSS      Resident set size (memory used, kb)
TIME     Accumulated CPU time
%CPU     Percent CPU utilization
COMMAND  Process name

As Observe ingests the data into a datastream, it adds a timestamp, an ingest type, and metadata about the datastream. In this example, the process data is in FIELDS as a JSON object:

The metrics-test event dataset, opened in a new worksheet. Fields shown in the data table include BUNDLE_TIMESTAMP, FIELDS, and EXTRA. The Inspect tab of the OPAL console shows part of the value for the highlighted FIELDS row. It is a large JSON object containing the process ID, Resident Set Size, and name for each sampled process.

Figure 3 - Process data in a worksheet.

The first step to converting the datastream into metrics is shaping the data using OPAL.

  1. Open a new worksheet for the existing metrics-test event dataset.

  2. In the OPAL console, extract the necessary fields with flatten_leaves, pick_col, and extract_regex.

   // flatten_leaves creates a new row for each set of process data,
   // corresponding to one row in the original output.
   // It creates _c_FIELDS_stdout_value containing each string
   // and _c_FIELDS_stdout_path for its position in the JSON object.
   flatten_leaves FIELDS.stdout

   // Select the field that contains the data you want, renaming it in the process.
   // pick_col must include a timestamp, even if you aren't explicitly using it.
   pick_col BUNDLE_TIMESTAMP, ps:string(_c_FIELDS_stdout_value)

   // Extract fields from the ps string output with a regex.
   extract_regex ps, /^\s+(?P<pid>\d+)\s+(?P<rss>\d+)\s+(?P<cputimes>\d+)\s+(?P<pcpu>\d+\.\d+)\s+(?P<command>\S+)\s*$/

The reformatted data now looks like the following:

BUNDLE_TIMESTAMP       ps                     command   pcpu  cputimes  rss    pid
02/24/21 16:14:03.151  1 12752 1 2.0 systemd  systemd   2.0   1         12752  1
02/24/21 16:14:03.151  2 0 0 0.0 kthreadd     kthreadd  0.0   0         0      2
02/24/21 16:14:03.151  3 0 0 0.0 rcu_gp       rcu_gp    0.0   0         0      3

Note that you could also extract fields with a regex from the UI by selecting Extract from text from the column menu and using the Custom regular expression method. The other steps still require writing OPAL statements.
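
If you want to try the extraction pattern outside Observe, Python's re module accepts the same named-group syntax. The sketch below applies an equivalent pattern to the sample ps output, with the decimal point in the pcpu group escaped so that it matches only a literal dot. It is a local test aid, and `parse_ps_line` is a hypothetical helper name.

```python
import re

# Same named groups as the tutorial's pattern, written as a Python regex.
PS_LINE = re.compile(
    r"^\s+(?P<pid>\d+)\s+(?P<rss>\d+)\s+(?P<cputimes>\d+)"
    r"\s+(?P<pcpu>\d+\.\d+)\s+(?P<command>\S+)\s*$"
)


def parse_ps_line(line):
    """Return the extracted fields as a dict, or None if the line does not match."""
    match = PS_LINE.match(line)
    return match.groupdict() if match else None


sample = """PID   RSS     TIME %CPU COMMAND
  1 12752        1  2.0 systemd
  2     0        0  0.0 kthreadd
  3     0        0  0.0 rcu_gp"""

# The header line does not match the pattern, so it is dropped.
parsed = [fields for fields in map(parse_ps_line, sample.splitlines()) if fields]
for fields in parsed:
    print(fields)
```

All extracted values are strings at this point, which is why the later shaping step converts the metric values to float64.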

  1. Shape into narrow metrics:

    // Create a new object containing the desired values,
    // along with more verbose metric names
    make_col metrics:make_object("resident_set_size":rss, "cumulative_cpu_time":cputimes, "cpu_utilization":pcpu)
    
    // Flatten that metrics object to create one row for each value
    flatten_leaves metrics
    
    // Select the desired fields, renaming in the process
    // Also convert the value to float64, necessary for metric values
    pick_col valid_from:BUNDLE_TIMESTAMP,
      pid, command,
      metric:string(_c_metrics_path), value:float64(_c_metrics_value)
    

    After shaping, it appears like this:

    valid_from             pid  command   metric               value
    02/24/21 16:14:03.151  1    systemd   cpu_utilization      2.0
    02/24/21 16:14:03.151  1    systemd   resident_set_size    12752
    02/24/21 16:14:03.151  1    systemd   cumulative_cpu_time  1
    02/24/21 16:14:03.151  2    kthreadd  cpu_utilization      0.0
    02/24/21 16:14:03.151  2    kthreadd  resident_set_size    0
    02/24/21 16:14:03.151  2    kthreadd  cumulative_cpu_time  0
    02/24/21 16:14:03.151  3    rcu_gp    cpu_utilization      0.0
    02/24/21 16:14:03.151  3    rcu_gp    resident_set_size    0
    02/24/21 16:14:03.151  3    rcu_gp    cumulative_cpu_time  0

  2. Register an interface to identify this dataset as containing metrics data:

    // Metric names are in field "metric", values in "value"
    interface "metric"
    

    An interface "metric" statement tells Observe several important things about a dataset:

    • This is a narrow metric dataset, with each row representing one metric point.

    • The values in field metric contain the metric names, such as cpu_utilization.

    • The values in field value contain the metric values, such as 2.0.

    • The values in valid_from contain the time of the observation.

    • The other fields (pid and command) contain tags that provide additional context.

  3. Save the shaped data as a new dataset.

    After shaping the data, save the results by publishing a new event stream. This creates a new dataset containing the metric events and allows them to be used by other datasets and worksheets.

    In the right menu, click Publish New Event Stream and give the new dataset a name. For this example, name it process/linux-process-metrics to create the dataset in a new process package. Click Publish to save.

    Right menu with the Publish Event Stream dialog open.

    Figure 4 - The Publish Event Stream dialog.

  4. View the metrics in Observe.

    Now that you have identified the dataset as containing metrics, Observe discovers the individual metrics without further shaping. This process takes a few minutes, after which you can find the new metrics in the Metrics tab of the Explore page.

  5. Search for the package process to view only the metrics for this package. Click on a metric to view the details:

    The Metrics tab, displaying the resident_set_size metric. The summary card on the left shows the name, type, the dataset this metric belongs to, and the description "Auto Detected Metric." On the right is additional information about this metric, including a chart of values.

    Figure 5 - The Metrics tab.
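
The wide-to-narrow shaping step of this tutorial can also be pictured outside OPAL. The following Python sketch turns one extracted ps row into three narrow metric rows, mirroring the make_object, flatten_leaves, and pick_col sequence. It is only an analogy: the dictionary layout and the names `METRIC_NAMES` and `to_narrow` are assumptions for illustration.

```python
# Map the short extracted column names to the more verbose metric names.
METRIC_NAMES = {
    "rss": "resident_set_size",
    "cputimes": "cumulative_cpu_time",
    "pcpu": "cpu_utilization",
}


def to_narrow(wide_row):
    """Turn one wide row into one narrow metric row per measured value."""
    return [
        {
            "valid_from": wide_row["BUNDLE_TIMESTAMP"],
            "pid": wide_row["pid"],          # tag
            "command": wide_row["command"],  # tag
            "metric": metric,
            "value": float(wide_row[column]),  # metric values must be floats
        }
        for column, metric in METRIC_NAMES.items()
    ]


wide = {
    "BUNDLE_TIMESTAMP": "02/24/21 16:14:03.151",
    "pid": "1",
    "command": "systemd",
    "rss": "12752",
    "cputimes": "1",
    "pcpu": "2.0",
}

for narrow_row in to_narrow(wide):
    print(narrow_row)
```

Each narrow row repeats the timestamp and the tags, which is what lets every row stand alone as one point in a time series.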

Advanced metric shaping

If auto-detected metrics incorrectly handle your data, you may also explicitly define the metrics of interest with the set_metric verb.

Example:

set_metric options(label:"Ingress Bytes", type:"cumulativeCounter", unit:"bytes", description:"Ingress reported from somewhere", rollup:"rate", aggregate:"sum"), "ingress_bytes"

This statement registers the metric ingress_bytes as a cumulativeCounter, aggregating over time as a rate and across multiple tags as a sum. To learn more about allowed values for the rollup and aggregate options, see the OPAL verb documentation for set_metric.
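
To build intuition for what "rate over time, sum across tags" means, here is a small Python sketch. It is a conceptual model only and makes no claim about how Observe computes rollups or aggregates; the tag values eth0 and eth1, the sample numbers, and the function names are all hypothetical.

```python
from collections import defaultdict


def rate(points):
    """Per-second rate between consecutive (seconds, cumulative_value) points.

    A decrease is treated as a counter reset to zero (an assumption of this sketch).
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        increase = v1 - v0 if v1 >= v0 else v1
        rates.append((t1, increase / (t1 - t0)))
    return rates


def sum_across(series_rates):
    """Add up the rates of several series at each timestamp."""
    totals = defaultdict(float)
    for rates in series_rates:
        for timestamp, value in rates:
            totals[timestamp] += value
    return dict(sorted(totals.items()))


# Two ingress_bytes series that differ only by a hypothetical tag value.
eth0 = [(0, 0), (60, 600), (120, 1800)]
eth1 = [(0, 100), (60, 400), (120, 400)]

print(rate(eth0))                            # [(60, 10.0), (120, 20.0)]
print(sum_across([rate(eth0), rate(eth1)]))  # {60: 15.0, 120: 20.0}
```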

Note

set_metric units use standard SI unit names from the math.js library, with the exceptions noted below. You can combine them for compound units like rates and ratios. Other units may not scale appropriately in charts. Please contact us if you require help with an unusual or custom unit.

You may use the unit names or abbreviations, and most names can be either singular (hour) or plural (hours). Please see the math.js documentation for details.

Observe recommends using full names for clarity. Note that both names and abbreviations are case-sensitive. The metric does not contain a unit if you omit unit:. You may also use unit:"" to indicate values without units.

Examples of data units:

Name             Abbreviation
bits             b
bytes            B
kilobytes        kB
gigabytes        GB
terabytes        TB
bytes/second     B/s
megabits/second  Mb/s

Exceptions:

  • m denotes minutes. Use meter for length.

  • C denotes degrees Celsius. Use coulomb for electric charge.

  • F denotes degrees Fahrenheit. Use farad for capacitance.

Units based on B scale by a factor of 1000 on inboard cards. The metric value displays with larger units as its value increases. For example, 1,000 B is 1 kB.

To scale by 1024, use By units: By, KiB, MiB, GiB, or TiB.
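
The difference between the two scaling families can be illustrated with a short Python sketch. It is not Observe's charting code; the function name `scale_bytes` is hypothetical, and MB is included as the standard SI step between kB and GB even though the table above does not list it.

```python
def scale_bytes(value, binary=False):
    """Format a byte count using decimal (B) or binary (By) unit scaling."""
    base = 1024 if binary else 1000
    if binary:
        units = ["By", "KiB", "MiB", "GiB", "TiB"]
    else:
        units = ["B", "kB", "MB", "GB", "TB"]
    index = 0
    # Move to the next larger unit while the value is at least one full step.
    while abs(value) >= base and index < len(units) - 1:
        value /= base
        index += 1
    return f"{value:g} {units[index]}"


print(scale_bytes(1000))               # 1 kB
print(scale_bytes(1000, binary=True))  # 1000 By
print(scale_bytes(1024, binary=True))  # 1 KiB
```

The same raw value can display differently depending on the unit family, so choose the one that matches how the source system counts bytes.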