Prometheus Target Allocator
The Observe Agent can deploy the OpenTelemetry Target Allocator (TA) to scale Prometheus scraping. The Target Allocator does the following things:
- Sharding: distributes scrape targets across multiple prometheus-scraper replicas using consistent hashing, so you can scale scraping horizontally without producing duplicate samples.
- CRD discovery: optionally discovers scrape targets from Prometheus Operator ServiceMonitor and PodMonitor custom resources, in addition to (or instead of) the chart's annotation-based Prometheus autodiscovery.
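To see why consistent hashing gives each target exactly one owner and keeps most assignments stable when you add a replica, consider the following minimal sketch. It is a generic hash ring for illustration only, not the Target Allocator's actual implementation; the replica names, target addresses, and the `vnodes` parameter are hypothetical.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit integer (same result on every run)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring; replica names are hypothetical."""

    def __init__(self, replicas, vnodes=100):
        # Each replica gets several virtual points on the ring to even out load.
        self._points = sorted(
            (_hash(f"{replica}#{i}"), replica)
            for replica in replicas
            for i in range(vnodes)
        )
        self._keys = [point for point, _ in self._points]

    def owner(self, target: str) -> str:
        """Return the single replica that owns `target`."""
        # First ring point clockwise from the target's hash, wrapping around.
        index = bisect.bisect_right(self._keys, _hash(target)) % len(self._keys)
        return self._points[index][1]


targets = [f"10.0.0.{i}:9090" for i in range(1, 201)]  # hypothetical endpoints

three = HashRing(["scraper-0", "scraper-1", "scraper-2"])
four = HashRing(["scraper-0", "scraper-1", "scraper-2", "scraper-3"])

before = {t: three.owner(t) for t in targets}
after = {t: four.owner(t) for t in targets}

# Targets whose owner changed when the fourth replica joined.
moved = [t for t in targets if before[t] != after[t]]
print(f"{len(moved)} of {len(targets)} targets moved to a new owner")
```

In this sketch every target maps to one replica, so no sample is scraped twice, and the only targets that change owner are the ones picked up by the newly added replica.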
Prerequisites
The Target Allocator runs on the dedicated prometheus-scraper deployment. It requires the following values:
- application.prometheusScrape.enabled: true
- application.prometheusScrape.independentDeployment: true
- node.metrics.cadvisor.separate_pipeline: false
The chart fails to render if these are not set when the Target Allocator is enabled.
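If you want to catch a missing prerequisite before running Helm, for example in a CI step, the three requirements above can be checked with a few lines of code. The sketch below is a hypothetical pre-flight helper, not part of the chart. It assumes `values` is the fully merged values mapping (chart defaults included) already loaded into a Python dictionary, and that you only call it when the Target Allocator is enabled.

```python
def missing_prerequisites(values: dict) -> list[str]:
    """Return the documented prerequisites that `values` does not satisfy."""
    required = {
        ("application", "prometheusScrape", "enabled"): True,
        ("application", "prometheusScrape", "independentDeployment"): True,
        ("node", "metrics", "cadvisor", "separate_pipeline"): False,
    }
    problems = []
    for path, expected in required.items():
        node = values
        for key in path:
            # Walk the nested mapping; a missing key counts as "not set".
            node = node.get(key) if isinstance(node, dict) else None
        if node is not expected:
            problems.append(f"{'.'.join(path)} must be {str(expected).lower()}")
    return problems


# Hypothetical values with independentDeployment left at false.
values = {
    "application": {"prometheusScrape": {"enabled": True, "independentDeployment": False}},
    "node": {"metrics": {"cadvisor": {"separate_pipeline": False}}},
}
print(missing_prerequisites(values))
# ['application.prometheusScrape.independentDeployment must be true']
```

The chart's own render-time validation remains the source of truth; this helper only mirrors the three values listed in this section.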
Shard Prometheus scraping across replicas
By default a single prometheus-scraper pod scrapes every target. Running more than one replica without the Target Allocator causes every replica to scrape every target, producing duplicate samples. Enable the Target Allocator so replicas can shard targets between themselves.
Create target-allocator.yaml with the following configuration:
application:
  prometheusScrape:
    enabled: true
    # the Target Allocator runs on the dedicated prometheus-scraper deployment
    independentDeployment: true
    targetAllocator:
      enabled: true
      # how often each scraper polls the Target Allocator for its assigned
      # targets. Lower values pick up short-lived targets (such as Kubernetes
      # Jobs) sooner, at the cost of more requests.
      interval: 30s
node:
  metrics:
    cadvisor:
      # required to be false when the Target Allocator is enabled
      separate_pipeline: false
# scale the scraper horizontally; targets are sharded across the replicas
prometheus-scraper:
  replicaCount: 3

With this configuration, the chart deploys the opentelemetry-target-allocator subchart, and each prometheus-scraper replica polls it to learn which targets it owns.
Discover targets from ServiceMonitor and PodMonitor CRDs
If your cluster uses the Prometheus Operator, the Target Allocator can discover scrape targets directly from ServiceMonitor and PodMonitor custom resources. This lets the Observe Agent reuse the scrape configuration you already maintain for those workloads.
Note: This requires the Prometheus Operator ServiceMonitor and PodMonitor CRDs to be installed in the cluster. The Target Allocator subchart's ClusterRole already grants the necessary read access to the monitoring.coreos.com API group.
To enable CRD discovery, set prometheusCR.enabled: true:
application:
  prometheusScrape:
    enabled: true
    independentDeployment: true
    targetAllocator:
      enabled: true
      prometheusCR:
        enabled: true
node:
  metrics:
    cadvisor:
      separate_pipeline: false

By default, CRD-discovered targets are scraped in addition to the chart's static pod-annotation jobs.
Use pure-CRD discovery
To scrape only the targets discovered from ServiceMonitor and PodMonitor CRDs—dropping the chart's static pod-metrics and cAdvisor scrape jobs—set includeStaticScrapeConfigs: false:
application:
  prometheusScrape:
    enabled: true
    independentDeployment: true
    targetAllocator:
      enabled: true
      includeStaticScrapeConfigs: false
      prometheusCR:
        enabled: true
Warning: Do not set both prometheusCR.enabled: false and includeStaticScrapeConfigs: false. With neither source enabled, the Target Allocator has no scrape configs to serve, and the chart fails to render.
Filter which ServiceMonitors and PodMonitors are discovered
By default the Target Allocator discovers every ServiceMonitor and PodMonitor in every namespace. On multi-tenant clusters, use the selector values to restrict discovery—for example, to pick up only the resources labeled for Observe:
application:
  prometheusScrape:
    targetAllocator:
      prometheusCR:
        enabled: true
        # label selectors: empty (default) matches all
        serviceMonitorSelector:
          matchLabels:
            release: observe
        podMonitorSelector:
          matchLabels:
            release: observe
        # namespace selectors: empty (default) matches all namespaces
        serviceMonitorNamespaceSelector:
          matchLabels:
            kubernetes.io/metadata.name: my-app
        podMonitorNamespaceSelector: {}

Set a default scrape interval
A ServiceMonitor or PodMonitor that does not declare its own interval is scraped at the Target Allocator's built-in default of 30 seconds. Use scrapeInterval to change that default without editing each custom resource:
application:
  prometheusScrape:
    targetAllocator:
      prometheusCR:
        enabled: true
        # default interval for SMs/PMs that omit their own `interval`.
        # Leave empty to use the Target Allocator's built-in 30s default.
        scrapeInterval: 60s

This only sets the default. Any ServiceMonitor or PodMonitor that declares its own interval keeps that value.
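The precedence described in this section can be summarized in a short sketch. The function and constant names are hypothetical and only restate the rules above; they are not part of the chart or the Target Allocator.

```python
from typing import Optional

# The Target Allocator's built-in default, as stated in this section.
BUILT_IN_DEFAULT = "30s"


def effective_interval(monitor_interval: Optional[str],
                       chart_scrape_interval: Optional[str]) -> str:
    """Resolve the scrape interval for one ServiceMonitor or PodMonitor."""
    if monitor_interval:
        # An interval declared on the custom resource always wins.
        return monitor_interval
    if chart_scrape_interval:
        # Otherwise fall back to scrapeInterval from the chart values.
        return chart_scrape_interval
    # scrapeInterval left empty: use the built-in default.
    return BUILT_IN_DEFAULT


print(effective_interval("15s", "60s"))  # 15s
print(effective_interval(None, "60s"))   # 60s
print(effective_interval(None, ""))      # 30s
```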
Redeploy the Observe Agent
Run the following command to redeploy the Observe Agent in the observe namespace with the Target Allocator configuration.
helm upgrade --reuse-values observe-agent observe/agent -n observe --values target-allocator.yaml

Restart the pods
Run the following commands to restart the pods with the updated configuration.
kubectl rollout restart deployment -n observe
kubectl rollout restart daemonset -n observe

Run the following command to make sure the Observe Agent has been redeployed successfully.

kubectl get pods -o wide -n observe

Monitor the Target Allocator
The Observe Agent's Monitor scrapes the Target Allocator's own metrics, so you can observe target distribution and shard balance. The following opentelemetry_allocator_* metrics are available:
| Metric | Description |
|---|---|
| opentelemetry_allocator_targets | The number of targets known to the allocator. |
| opentelemetry_allocator_collectors_allocatable | The number of scraper collectors that targets can be assigned to. |
| opentelemetry_allocator_targets_per_collector | The number of targets each collector is assigned, which indicates shard balance. |
| opentelemetry_allocator_time_to_allocate | How long an allocation pass takes. |
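One way to turn opentelemetry_allocator_targets_per_collector into a single shard-balance number is to compare the busiest collector with the average. The sketch below assumes you have already read the per-collector values into a dictionary; the collector names and counts are made-up sample data, and the ratio is just one possible heuristic.

```python
from statistics import mean


def shard_imbalance(targets_per_collector: dict[str, int]) -> float:
    """Ratio of the busiest collector's target count to the average (1.0 = even)."""
    counts = list(targets_per_collector.values())
    if not counts or mean(counts) == 0:
        # No collectors or no targets: nothing is out of balance.
        return 1.0
    return max(counts) / mean(counts)


# Hypothetical sample: three scraper replicas and their assigned target counts.
sample = {"scraper-a": 40, "scraper-b": 35, "scraper-c": 45}
print(round(shard_imbalance(sample), 3))  # 1.125
```

A value close to 1.0 means targets are spread evenly across replicas; a value well above 1.0 means one replica carries noticeably more targets than the rest.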
Further reading
- Prometheus autodiscovery
- Helm Chart components
- Tune service resource requests and limits
- Target Allocator sharding example in the Helm charts documentation.
- ServiceMonitor / PodMonitor CRD discovery example in the Helm charts documentation.