Tuning Service Resource Requests and Limits¶
When deploying the Observe Agent using the Helm chart, you can customize the CPU and memory resources allocated to each component. This allows you to optimize resource usage based on your cluster’s capacity and the specific needs of your environment.
Identifying Current Usage¶
By default, the services instantiated by the Observe Agent Helm Chart are provisioned to fit expected workloads: up to 300m of CPU and 512Mi of memory for some pods, and less for others (see the default values below). These are the minimum default values that we recommend. Because estimating resource consumption is difficult and varies widely with the properties of your Kubernetes cluster and the applications you instrument, we recommend first installing the Helm chart, then examining the actual resource consumption and tuning from there. To find your current usage, you can use the K8s Explorer: navigate to the corresponding Workload page for each of the Deployments and DaemonSets created by the Observe Agent Helm Chart. Specifically, the services to look at are:
Daemonsets:
observe-node-logs-metrics-agent
observe-forwarder-agent
Deployments:
observe-cluster-events
observe-cluster-metrics
observe-agent-monitor
For each of these services, select the pod with the highest CPU and memory utilization. Using the charts on the Pod page, determine the peak CPU and memory utilization over the last week. Based on those values, we recommend the following settings:
resources:
  requests:
    cpu: 1.25 * ${MAX_CPU_USAGE}
    memory: 1.25 * ${MAX_MEMORY_USAGE}
  limits:
    memory: 1.25 * ${MAX_MEMORY_USAGE}
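As a quick command-line complement to the K8s Explorer charts, you can also spot-check the current usage of the agent pods directly. This is a minimal sketch that assumes the Kubernetes metrics-server is available in your cluster; it only shows point-in-time usage rather than the peak over the last week.

# Spot-check live CPU/memory usage of the agent pods (requires metrics-server).
kubectl top pods -n observe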
Example pod resource utilization¶
Let’s take a look at a real-world example of a pod in the K8s Explorer Pod view and how we can use it to determine the appropriate resource requests and limits.

Figure 1 - Example Pod View including resource utilization charts
For the example pod above, we can see that CPU utilization averages around 20m. Since this is below the minimum value of 250m, we’ll keep 250m as the CPU request. For memory, utilization averages around 220Mi but spikes up to 300Mi. To provide a healthy overhead, we’ll set the memory request and limit to 375Mi. So our new values based on the above Pod view would be:
resources:
  requests:
    cpu: 250m
    memory: 375Mi
  limits:
    memory: 375Mi
Estimating resource requirements by data rate¶
We performed benchmarking tests to arrive at resource recommendations based on data ingest rates. All recommendations below are based on testing performed on a 10-node Kubernetes cluster. All data rates are at the cluster-wide level.
† The resources necessary for cluster metrics can vary greatly depending on how your applications use Prometheus metrics. If your pods expose large numbers of Prometheus metrics, we recommend verifying the sizing for this deployment and tuning it to your individual usage patterns.
100 DPS¶
This scenario is based on ingesting 100 logs per second (each 1KiB), 100 instrumented app metrics per second, 100 spans per second, and an application with around 25 active pods.
| Service | CPU Request | Memory Request | Memory Limit |
| --- | --- | --- | --- |
|  | 100m | 128Mi | 128Mi |
|  | 100m | 128Mi | 128Mi |
|  | 100m | 128Mi | 128Mi |
|  | 100m | 128Mi | 128Mi |
|  | 100m | 128Mi | 128Mi |
1,000 DPS¶
This scenario is based on ingesting 1000 logs per second (each 1KiB), 1000 instrumented app metrics per second, 1000 spans per second, and an application with around 250 active pods.
| Service | CPU Request | Memory Request | Memory Limit |
| --- | --- | --- | --- |
|  | 100m | 128Mi | 128Mi |
|  | 100m | 128Mi | 128Mi |
|  | 100m | 128Mi | 128Mi |
|  | 100m | 128Mi | 128Mi |
|  | 100m | 384Mi | 384Mi |
10,000 DPS¶
This scenario is based on ingesting 10k logs per second (each 1KiB), 10k instrumented app metrics per second, 10k spans per second, and an application with around 500 active pods.
These are the default values in our Helm chart.
| Service | CPU Request | Memory Request | Memory Limit |
| --- | --- | --- | --- |
|  | 100m | 128Mi | 128Mi |
|  | 100m | 256Mi | 256Mi |
|  | 300m | 512Mi | 512Mi |
|  | 100m | 128Mi | 128Mi |
|  | 200m | 512Mi | 512Mi |
Modifying Resource Allocations¶
If the calculated values are lower than the default of 250m for CPU and 256Mi for memory, then use the default values. The resource requests and limits may differ for each of the services and should be tuned individually. They can be set in the values.yaml file for each of the services.
| Service | SERVICE_NAME |
| --- | --- |
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
${SERVICE_NAME}:
  resources:
    requests:
      cpu: 250m
      memory: 256Mi
    limits:
      memory: 256Mi
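For illustration only, here is a sketch of what a values.yaml override for two services might look like; the placeholder keys are hypothetical and should be replaced with the actual SERVICE_NAME values from the table above.

# Hypothetical sketch: replace the placeholder keys with SERVICE_NAME values
# from the table above. Each service can be tuned independently.
${SERVICE_NAME_1}:
  resources:
    requests:
      cpu: 250m
      memory: 256Mi
    limits:
      memory: 256Mi
${SERVICE_NAME_2}:
  resources:
    requests:
      cpu: 300m
      memory: 512Mi
    limits:
      memory: 512Mi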
Apply the new resource configurations¶
Redeploy the Observe Agent
Run the following command to redeploy the Observe Agent in the observe namespace with the new resource configuration.
helm upgrade --reset-values observe-agent observe/agent -n observe --values values.yaml
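If you would like to preview the change before applying it, Helm’s standard --dry-run flag renders the manifests without installing them; this optional step is a sketch using the same command as above.

# Optional: render the manifests without applying them to review the change first.
helm upgrade --reset-values observe-agent observe/agent -n observe --values values.yaml --dry-run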
Restart pods
Run the following commands to restart the pods with the updated configuration.
kubectl rollout restart deployment -n observe
kubectl rollout restart daemonset -n observe
Check pod status
Run the following command to make sure the Observe Agent has been redeployed successfully.
kubectl get pods -o wide -n observe
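To confirm that the new requests and limits actually took effect, you can also inspect one of the workloads directly; the deployment name below is taken from the list earlier on this page, and the jsonpath expression is standard kubectl.

# Optional: print the resources configured on one of the deployments
# (observe-cluster-metrics is used as an example; substitute any other workload).
kubectl get deployment observe-cluster-metrics -n observe -o jsonpath='{.spec.template.spec.containers[*].resources}'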