Tuning Service Resource Requests and Limits

When deploying the Observe Agent using the Helm chart, you can customize the CPU and memory resources allocated to each component. This allows you to optimize resource usage based on your cluster’s capacity and the specific needs of your environment.

Identifying Current Usage

By default, the services instantiated by the Observe Agent Helm chart are provisioned with 250m of CPU and 256Mi of memory. These are the minimum values that we recommend. Because resource consumption is difficult to estimate and varies widely with the properties of your Kubernetes cluster and the applications being instrumented, we recommend first installing the Helm chart with the defaults and then tuning based on observed consumption. To find your current usage, use the K8s Explorer: navigate to the Workload page for each of the deployments and daemonsets created by the Observe Agent Helm chart (a quick way to list these workloads from the command line is sketched after this list). Specifically, the services to look at are:

Daemonsets:

  • observe-node-logs-metrics-agent

Deployments:

  • observe-cluster-events

  • observe-cluster-metrics

  • observe-agent-monitor
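
As referenced above, you can also list these workloads from the command line to confirm their names in your cluster. This sketch assumes the chart was installed into the observe namespace, as in the commands later in this section.

kubectl get daemonsets,deployments -n observe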

For each of these services, select the pod with the highest CPU and memory utilization. Using the charts on the Pod page, determine its peak CPU and memory utilization over the last week. Based on those values, we recommend the following settings:

resources:
  requests:
    cpu: 1.25 * ${MAX_CPU_USAGE}
    memory: 1.25 * ${MAX_MEMORY_USAGE}
  limits:
    memory: 1.25 * ${MAX_MEMORY_USAGE}
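
If you prefer a quick point-in-time reading from the command line instead of the K8s Explorer charts, the Kubernetes Metrics API can report current usage per container. Note that this is only a snapshot, not the peak over the last week, and it assumes the metrics-server is installed in your cluster.

kubectl top pods -n observe --containers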

Example pod resource utilization

Let’s take a look at a real-world example of a pod in the K8s Explorer Pod view and how we can use it to determine the appropriate resource requests and limits.

Figure 1 - Example Pod View including resource utilization charts

For the example pod above, CPU utilization averages around 20m. Since this is below the minimum value of 250m, we’ll keep 250m as the CPU request. Memory utilization averages around 220Mi but spikes up to 300Mi. To provide healthy overhead, we’ll set both the memory request and limit to 375Mi (1.25 * 300Mi). So, our new values based on the pod view above would be:

resources:
  requests:
    cpu: 250m
    memory: 375Mi
  limits:
    memory: 375Mi

Modifying Resource Allocations

If the calculated values are lower than the defaults of 250m CPU and 256Mi memory, use the defaults. The resource requests and limits may differ between services and should be tuned individually; they can be set per service in the values.yaml file, using the SERVICE_NAME keys shown in the table below.

Service                          | SERVICE_NAME
---------------------------------|------------------
observe-node-logs-metrics-agent  | node-logs-metrics
observe-cluster-events           | cluster-events
observe-cluster-metrics          | cluster-metrics
observe-agent-monitor            | monitor

${SERVICE_NAME}:
  resources:
    requests:
      cpu: 250m
      memory: 256Mi
    limits:
      memory: 256Mi
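
As an illustration, if the example pod above belonged to the node logs and metrics daemonset, the corresponding values.yaml entry would use the node-logs-metrics key from the table above together with the values we calculated earlier:

node-logs-metrics:
  resources:
    requests:
      cpu: 250m
      memory: 375Mi
    limits:
      memory: 375Mi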

Apply the new resource configurations:

  1. Redeploy the Observe Agent

Run the following command to redeploy the Observe Agent in the observe namespace with the new resource configuration.

helm upgrade --reset-values observe-agent observe/agent -n observe --values values.yaml
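
If you only need to change a value or two and are not relying on other overrides in values.yaml, one alternative sketch is to set the values directly on the command line; the key paths below assume the same values.yaml layout shown above.

helm upgrade --reset-values observe-agent observe/agent -n observe \
  --set node-logs-metrics.resources.requests.memory=375Mi \
  --set node-logs-metrics.resources.limits.memory=375Mi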

  2. Restart pods

Run the following commands to restart the pods with the updated configuration.

kubectl rollout restart deployment -n observe
kubectl rollout restart daemonset -n observe

  3. Check pod status

Run the following command to make sure the Observe Agent has been redeployed successfully.

kubectl get pods -o wide -n observe
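
To double-check that the new requests and limits were applied to the running pods, you can also print the resources block straight from the pod specs; the jsonpath below is one way to do that.

kubectl get pods -n observe -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].resources}{"\n"}{end}'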