Prometheus Target Allocator

The Observe Agent can deploy the OpenTelemetry Target Allocator (TA) to scale Prometheus scraping. The Target Allocator provides two capabilities:

  • Sharding — distributes scrape targets across multiple prometheus-scraper replicas using consistent hashing, so you can scale scraping horizontally without producing duplicate samples.
  • CRD discovery — optionally discovers scrape targets from Prometheus Operator ServiceMonitor and PodMonitor custom resources, in addition to (or instead of) the chart's annotation-based Prometheus autodiscovery.

Prerequisites

The Target Allocator runs on the dedicated prometheus-scraper deployment. It requires the following values:

  • application.prometheusScrape.enabled: true
  • application.prometheusScrape.independentDeployment: true
  • node.metrics.cadvisor.separate_pipeline: false

The chart fails to render if these are not set when the Target Allocator is enabled.
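
One way to catch a missing prerequisite before upgrading is to render the chart locally. The following is a minimal sketch, assuming the observe-agent release name, observe/agent chart reference, and observe namespace used in the redeploy command later on this page. The helper name render_check is hypothetical.

```shell
# Hypothetical helper: render the chart locally with a values file and
# discard the manifests. A non-zero exit status means rendering failed.
# $1 = path to the values file (for example target-allocator.yaml)
render_check() {
  helm template observe-agent observe/agent \
    --namespace observe \
    --values "$1" > /dev/null
}
```

Calling render_check target-allocator.yaml should print a Helm error and return non-zero if a required value is missing. Rendering uses the chart defaults plus the given file, so it does not reflect values that are already set on a live release.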

Shard Prometheus scraping across replicas

By default, a single prometheus-scraper pod scrapes every target. Running more than one replica without the Target Allocator causes every replica to scrape every target, producing duplicate samples. Enable the Target Allocator so that the replicas shard targets among themselves.

Create target-allocator.yaml with the following configuration:

application:
  prometheusScrape:
    enabled: true
    # the Target Allocator runs on the dedicated prometheus-scraper deployment
    independentDeployment: true
    targetAllocator:
      enabled: true
      # how often each scraper polls the Target Allocator for its assigned
      # targets. Lower values pick up short-lived targets (such as Kubernetes
      # Jobs) sooner, at the cost of more requests.
      interval: 30s

node:
  metrics:
    cadvisor:
      # required to be false when the Target Allocator is enabled
      separate_pipeline: false

# scale the scraper horizontally; targets are sharded across the replicas
prometheus-scraper:
  replicaCount: 3

With this configuration, the chart deploys the opentelemetry-target-allocator subchart, and each prometheus-scraper replica polls it to learn which targets it owns.
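
To see which targets a given replica owns, you can query the Target Allocator's HTTP API directly. The sketch below assumes the allocator is reachable at TA_URL, for example through a port-forward to its Service. The Service name and port depend on your release and are not specified here, and the helper names are hypothetical.

```shell
# Assumption: the Target Allocator is reachable at $TA_URL, for example via
#   kubectl port-forward -n observe svc/<service-name> 8080:<service-port>
# Look up the actual Service name and port with: kubectl get svc -n observe
TA_URL="${TA_URL:-http://localhost:8080}"

# List the scrape jobs the Target Allocator currently serves.
ta_jobs() {
  curl -fsS "${TA_URL}/jobs"
}

# List the targets of one job that are assigned to one scraper.
# $1 = job name (URL-encode it if it contains slashes)
# $2 = collector ID, typically the scraper pod name
ta_targets() {
  curl -fsS "${TA_URL}/jobs/$1/targets?collector_id=$2"
}
```

Comparing the output of ta_targets for each scraper pod gives a rough picture of how one job's targets are split across replicas.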

Discover targets from ServiceMonitor and PodMonitor CRDs

If your cluster uses the Prometheus Operator, the Target Allocator can discover scrape targets directly from ServiceMonitor and PodMonitor custom resources. This lets the Observe Agent reuse the scrape configuration you already maintain for those workloads.

📘

Note

This requires the Prometheus Operator ServiceMonitor and PodMonitor CRDs to be installed in the cluster. The Target Allocator subchart's ClusterRole already grants the necessary read access to the monitoring.coreos.com API group.
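
Before enabling CRD discovery, it can help to confirm that both CRDs exist. A minimal sketch, assuming kubectl is pointed at the target cluster; the helper name have_prometheus_crds is hypothetical.

```shell
# Hypothetical helper: report whether the Prometheus Operator CRDs used for
# discovery exist. Prints one line per missing CRD and returns non-zero if
# any is missing.
have_prometheus_crds() {
  missing=0
  for crd in servicemonitors.monitoring.coreos.com podmonitors.monitoring.coreos.com; do
    if ! kubectl get crd "$crd" > /dev/null 2>&1; then
      echo "missing CRD: $crd"
      missing=1
    fi
  done
  return "$missing"
}
```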

To enable CRD discovery, set prometheusCR.enabled: true:

application:
  prometheusScrape:
    enabled: true
    independentDeployment: true
    targetAllocator:
      enabled: true
      prometheusCR:
        enabled: true

node:
  metrics:
    cadvisor:
      separate_pipeline: false

By default, CRD-discovered targets are scraped in addition to the chart's static pod-annotation jobs.

Use pure-CRD discovery

To scrape only the targets discovered from ServiceMonitor and PodMonitor CRDs—dropping the chart's static pod-metrics and cAdvisor scrape jobs—set includeStaticScrapeConfigs: false:

application:
  prometheusScrape:
    enabled: true
    independentDeployment: true
    targetAllocator:
      enabled: true
      includeStaticScrapeConfigs: false
      prometheusCR:
        enabled: true

🚧

Warning

Do not set both prometheusCR.enabled: false and includeStaticScrapeConfigs: false. With neither source enabled, the Target Allocator has no scrape configs to serve, and the chart fails to render.

Filter which ServiceMonitors and PodMonitors are discovered

By default the Target Allocator discovers every ServiceMonitor and PodMonitor in every namespace. On multi-tenant clusters, use the selector values to restrict discovery—for example, to pick up only the resources labeled for Observe:

application:
  prometheusScrape:
    targetAllocator:
      prometheusCR:
        enabled: true
        # label selectors — empty (default) matches all
        serviceMonitorSelector:
          matchLabels:
            release: observe
        podMonitorSelector:
          matchLabels:
            release: observe
        # namespace selectors — empty (default) matches all namespaces
        serviceMonitorNamespaceSelector:
          matchLabels:
            kubernetes.io/metadata.name: my-app
        podMonitorNamespaceSelector: {}
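
To preview what a selector would match before applying it, you can run the same label queries with kubectl. This is a sketch that uses the example labels above; the helper names are hypothetical, and it only approximates the allocator's behavior because the resource and namespace selectors are evaluated separately here.

```shell
# Hypothetical helper: list ServiceMonitors and PodMonitors in all
# namespaces that carry the given label.
# $1 = label selector, for example release=observe
preview_monitors() {
  kubectl get servicemonitors,podmonitors --all-namespaces -l "$1"
}

# Hypothetical helper: list the namespaces a namespace selector would match.
# $1 = label selector, for example kubernetes.io/metadata.name=my-app
preview_namespaces() {
  kubectl get namespaces -l "$1"
}
```

For the configuration above, a ServiceMonitor is discovered only if it appears in the output of preview_monitors release=observe and lives in a namespace listed by preview_namespaces kubernetes.io/metadata.name=my-app.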

Set a default scrape interval

A ServiceMonitor or PodMonitor that does not declare its own interval is scraped at the Target Allocator's built-in default of 30 seconds. Use scrapeInterval to change that default without editing each custom resource:

application:
  prometheusScrape:
    targetAllocator:
      prometheusCR:
        enabled: true
        # default interval for SMs/PMs that omit their own `interval`.
        # Leave empty to use the Target Allocator's built-in 30s default.
        scrapeInterval: 60s

This only sets the default. Any ServiceMonitor or PodMonitor that declares its own interval keeps that value.
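
To find out which resources a new default would affect, you can list the interval each ServiceMonitor and PodMonitor declares. A minimal sketch, assuming the standard Prometheus Operator fields spec.endpoints and spec.podMetricsEndpoints; the helper name monitor_intervals is hypothetical.

```shell
# Hypothetical helper: print the declared scrape intervals of every
# ServiceMonitor and PodMonitor. Resources that show no interval fall back
# to the default described above.
monitor_intervals() {
  kubectl get servicemonitors --all-namespaces \
    -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,INTERVAL:.spec.endpoints[*].interval'
  kubectl get podmonitors --all-namespaces \
    -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,INTERVAL:.spec.podMetricsEndpoints[*].interval'
}
```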

Redeploy the Observe Agent

Run the following command to redeploy the Observe Agent in the observe namespace with the Target Allocator configuration.

helm upgrade --reuse-values observe-agent observe/agent -n observe --values target-allocator.yaml

Restart the pods

Run the following commands to restart the pods with the updated configuration.

kubectl rollout restart deployment -n observe
kubectl rollout restart daemonset -n observe

Run the following command to make sure the Observe Agent has been redeployed successfully.

kubectl get pods -o wide -n observe

Monitor the Target Allocator

The Observe Agent's Monitor scrapes the Target Allocator's own metrics, so you can observe target distribution and shard balance. The following opentelemetry_allocator_* metrics are available:

Metric                                          Description
opentelemetry_allocator_targets                 The number of targets known to the allocator.
opentelemetry_allocator_collectors_allocatable  The number of scraper collectors that targets can be assigned to.
opentelemetry_allocator_targets_per_collector   The number of targets each collector is assigned, which indicates shard balance.
opentelemetry_allocator_time_to_allocate        How long an allocation pass takes.
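
For a quick spot check outside of Observe, these metrics can also be read from the allocator's own metrics endpoint. The sketch below assumes the allocator is reachable at TA_URL, for example through a port-forward to its Service. The Service name and port depend on your release, and the helper name is hypothetical.

```shell
# Assumption: the Target Allocator is reachable at $TA_URL, for example via
#   kubectl port-forward -n observe svc/<service-name> 8080:<service-port>
TA_URL="${TA_URL:-http://localhost:8080}"

# Hypothetical helper: print only the allocator's own metric samples,
# skipping HELP/TYPE comment lines and unrelated runtime metrics.
ta_allocator_metrics() {
  curl -fsS "${TA_URL}/metrics" | grep '^opentelemetry_allocator_'
}
```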

Further reading