Node affinity, taints, and tolerations
Node affinity is a property of Kubernetes pods that attracts them to a set of nodes, either as a preference or a hard requirement. Taints are the opposite: they allow a node to repel a set of pods.
Tolerations are applied to pods. A toleration allows the scheduler to place a pod on a node with a matching taint. Tolerations permit scheduling but do not guarantee it: the scheduler still weighs other factors, such as resource availability, when choosing a node.
Taints and tolerations work together to ensure that pods are not scheduled onto inappropriate nodes. One or more taints are applied to a node; this marks that the node should not accept any pods that do not tolerate the taints.
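As a minimal sketch of how the two mechanisms look side by side in a pod spec (this fragment is illustrative only and not part of the Observe Agent chart; the dedicated key and observe value are hypothetical), the affinity term restricts the pod to nodes carrying a matching label, while the toleration lets it run on nodes tainted dedicated=observe:NoSchedule:
# Hypothetical pod spec fragment illustrating both mechanisms.
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: dedicated          # assumed label key on the target nodes
                operator: In
                values: [observe]       # assumed label value
  tolerations:
    - key: "dedicated"                  # matches a taint dedicated=observe:NoSchedule
      operator: "Equal"
      value: "observe"
      effect: "NoSchedule"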
Apply node affinity configurations
Let's suppose you have a Kubernetes cluster with the following nine nodes:
gke-cluster-1-default-pool-020b00c1-9s2c
gke-cluster-1-default-pool-020b00c1-czlf
gke-cluster-1-default-pool-020b00c1-xhq2
gke-cluster-1-default-pool-7a5f9d4f-tjs6
gke-cluster-1-default-pool-7a5f9d4f-z4hq
gke-cluster-1-default-pool-7a5f9d4f-zxbt
gke-cluster-1-default-pool-ad91df86-1r4c
gke-cluster-1-default-pool-ad91df86-b0hv
gke-cluster-1-default-pool-ad91df86-rw8f

You'd like to assign pods to a specific node, gke-cluster-1-default-pool-020b00c1-9s2c. Perform the following steps:
- Label your nodes.
kubectl label nodes gke-cluster-1-default-pool-020b00c1-9s2c node-type=useme

kubectl label nodes gke-cluster-1-default-pool-020b00c1-czlf \
  gke-cluster-1-default-pool-020b00c1-xhq2 \
  gke-cluster-1-default-pool-7a5f9d4f-tjs6 \
  gke-cluster-1-default-pool-7a5f9d4f-z4hq \
  gke-cluster-1-default-pool-7a5f9d4f-zxbt \
  gke-cluster-1-default-pool-ad91df86-1r4c \
  gke-cluster-1-default-pool-ad91df86-b0hv \
  gke-cluster-1-default-pool-ad91df86-rw8f \
  node-type=nouseme
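To confirm the labels were applied as intended, you can list the nodes with the label shown as a column (the -L flag of kubectl get adds a column for the given label key):
# Shows each node with its node-type label value.
kubectl get nodes -L node-type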
- Create affinity-values.yaml with the following configuration. The &affinityBase anchor defines the affinity block once, and each *affinityBase alias reuses it for one of the agent's components. Because the rule uses requiredDuringSchedulingIgnoredDuringExecution, it is a hard requirement; Kubernetes also offers preferredDuringSchedulingIgnoredDuringExecution for soft preferences.

# 1) Define an anchor for the repeated affinity block.
affinityBase: &affinityBase
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: node-type
              operator: In
              values: [useme]
            - key: observeinc.com/unschedulable
              operator: DoesNotExist
            - key: kubernetes.io/os
              operator: NotIn
              values: [windows]

cluster-events:
  affinity: *affinityBase
cluster-metrics:
  affinity: *affinityBase
node-logs-metrics:
  affinity: *affinityBase
monitor:
  affinity: *affinityBase

# # Uncomment these if you are using Agent Chart v0.41+
# # observe-forward
# forwarder:
#   affinity: *affinityBase

- Run the following command to redeploy the Observe Agent in the observe namespace with the affinity configuration.
helm upgrade --reuse-values observe-agent observe/agent -n observe --values affinity-values.yaml

- Run the following command to make sure the Observe Agent has been redeployed successfully.
kubectl get pods -o wide -n observe
All pods are now assigned to the gke-cluster-1-default-pool-020b00c1-9s2c node as expected.
$ kubectl get pods -o wide -n observe
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
observe-agent-cluster-events-7bd9ccddcf-m6svd 1/1 Running 0 54s 10.248.8.6 gke-cluster-1-default-pool-020b00c1-9s2c <none> <none>
observe-agent-cluster-metrics-7fc5987bcb-v95rl 1/1 Running 0 54s 10.248.8.5 gke-cluster-1-default-pool-020b00c1-9s2c <none> <none>
observe-agent-monitor-b9f8c59c-9gpx2 1/1 Running 0 53s 10.248.8.8 gke-cluster-1-default-pool-020b00c1-9s2c <none> <none>
observe-agent-node-logs-metrics-agent-j59kh 1/1 Running 0 54s 10.248.8.7 gke-cluster-1-default-pool-020b00c1-9s2c <none> <none>
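If the pods do not land where you expect, a quick sanity check (assuming the release name observe-agent used above) is to confirm that Helm actually recorded the override:
# Prints the user-supplied values for the release, which should include the affinity stanza.
helm get values observe-agent -n observe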
Apply taints and tolerations configurations
Let's suppose you have a Kubernetes cluster with the following nine nodes:
gke-cluster-1-default-pool-7fbc1d97-pt9k
gke-cluster-1-default-pool-7fbc1d97-w262
gke-cluster-1-default-pool-7fbc1d97-wshj
gke-cluster-1-default-pool-8dab22e2-2mhx
gke-cluster-1-default-pool-8dab22e2-lrj8
gke-cluster-1-default-pool-8dab22e2-rkd7
gke-cluster-1-default-pool-c192ec15-1wnx
gke-cluster-1-default-pool-c192ec15-3717
gke-cluster-1-default-pool-c192ec15-gt34

You'd like to assign pods to a specific node, gke-cluster-1-default-pool-7fbc1d97-pt9k, using taints and tolerations. Perform the following steps:
- Taint the node:

kubectl taint nodes gke-cluster-1-default-pool-7fbc1d97-pt9k deployObserve=notAllowed:NoSchedule

- Check the taints:

kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints

As you can see below, gke-cluster-1-default-pool-7fbc1d97-pt9k is tainted.
$ kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
NAME                                       TAINTS
gke-cluster-1-default-pool-7fbc1d97-pt9k   [map[effect:NoSchedule key:deployObserve value:notAllowed]]
gke-cluster-1-default-pool-7fbc1d97-w262   <none>
gke-cluster-1-default-pool-7fbc1d97-wshj   <none>
gke-cluster-1-default-pool-8dab22e2-2mhx   <none>
gke-cluster-1-default-pool-8dab22e2-lrj8   <none>
gke-cluster-1-default-pool-8dab22e2-rkd7   <none>
gke-cluster-1-default-pool-c192ec15-1wnx   <none>
gke-cluster-1-default-pool-c192ec15-3717   <none>
gke-cluster-1-default-pool-c192ec15-gt34   <none>
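Alternatively, to inspect just the tainted node, you can print its taints with a JSONPath query (standard kubectl syntax; the node name matches the one tainted above):
# Prints the taints array of the single node as JSON.
kubectl get node gke-cluster-1-default-pool-7fbc1d97-pt9k -o jsonpath='{.spec.taints}'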
- Create taints-tolerations-values.yaml with the following configuration. As before, the &tolerationsBase anchor defines the tolerations array once and each *tolerationsBase alias reuses it for one of the agent's components.

# 1) Define the anchor for the repeated tolerations array:
tolerationsBase: &tolerationsBase
  - key: "deployObserve"
    operator: "Equal"
    value: "notAllowed"
    effect: "NoSchedule"

cluster-events:
  tolerations: *tolerationsBase
cluster-metrics:
  tolerations: *tolerationsBase
node-logs-metrics:
  tolerations: *tolerationsBase
monitor:
  tolerations: *tolerationsBase

# # Uncomment these if you are using Agent Chart v0.41+
# # observe-forward
# forwarder:
#   tolerations: *tolerationsBase
- Run the following command to redeploy the Observe Agent in the observe namespace with the tolerations configuration.

helm upgrade --reuse-values observe-agent observe/agent -n observe --values taints-tolerations-values.yaml
- Run the following command to make sure the Observe Agent has been redeployed successfully.
kubectl get pods -o wide -n observe
As you can see below, pods can now be scheduled onto the tainted gke-cluster-1-default-pool-7fbc1d97-pt9k node as expected: the cluster-events pod landed there, and the per-node node-logs-metrics agent (one pod per node) now covers it as well. Note that tolerations permit scheduling on the tainted node but do not force it, so the remaining pods are placed on other nodes.
$ kubectl get pods -o wide -n observe
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
observe-agent-cluster-events-7c95b84b6-xgx9v 1/1 Running 0 2m3s 10.232.0.6 gke-cluster-1-default-pool-7fbc1d97-pt9k <none> <none>
observe-agent-cluster-metrics-c84d7c769-v8gmr 1/1 Running 0 2m3s 10.232.4.5 gke-cluster-1-default-pool-c192ec15-gt34 <none> <none>
observe-agent-monitor-855569455b-5dqgf 1/1 Running 0 2m2s 10.232.5.5 gke-cluster-1-default-pool-c192ec15-3717 <none> <none>
observe-agent-node-logs-metrics-agent-6lnpg 1/1 Running 0 5m11s 10.232.1.5 gke-cluster-1-default-pool-7fbc1d97-w262 <none> <none>
observe-agent-node-logs-metrics-agent-ch2np 1/1 Running 0 87s 10.232.8.6 gke-cluster-1-default-pool-8dab22e2-lrj8 <none> <none>
observe-agent-node-logs-metrics-agent-g6lw6 1/1 Running 0 5m11s 10.232.7.5 gke-cluster-1-default-pool-8dab22e2-rkd7 <none> <none>
observe-agent-node-logs-metrics-agent-jxp94 1/1 Running 0 5m12s 10.232.6.12 gke-cluster-1-default-pool-8dab22e2-2mhx <none> <none>
observe-agent-node-logs-metrics-agent-kfgjc 1/1 Running 0 50s 10.232.5.6 gke-cluster-1-default-pool-c192ec15-3717 <none> <none>
observe-agent-node-logs-metrics-agent-lfwcj 1/1 Running 0 5m12s 10.232.2.5 gke-cluster-1-default-pool-7fbc1d97-wshj <none> <none>
observe-agent-node-logs-metrics-agent-lqdx7 1/1 Running 0 2m3s 10.232.0.5 gke-cluster-1-default-pool-7fbc1d97-pt9k <none> <none>
observe-agent-node-logs-metrics-agent-n8vh4 0/1 Running 0 14s 10.232.4.6 gke-cluster-1-default-pool-c192ec15-gt34 <none> <none>
observe-agent-node-logs-metrics-agent-rx4jb 1/1 Running 0 5m12s 10.232.3.4 gke-cluster-1-default-pool-c192ec15-1wnx <none> <none>

For more examples, see Example deployment scenarios in the Observe Helm chart documentation.
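One related tip: if you later want the node to accept untolerated pods again, you can remove the taint by appending a trailing hyphen to the same taint specification (standard kubectl syntax):
# Removes the deployObserve taint from the node.
kubectl taint nodes gke-cluster-1-default-pool-7fbc1d97-pt9k deployObserve=notAllowed:NoSchedule-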