Node Affinity, Taints, and Tolerations¶
Node affinity is a property of Pods that attracts them to a set of nodes (either as a preference or a hard requirement). Taints are the opposite – they allow a node to repel a set of pods.
Tolerations are applied to pods. Tolerations allow the scheduler to schedule pods with matching taints. Tolerations allow scheduling but don’t guarantee scheduling: the scheduler also evaluates other parameters as part of its function.
Taints and tolerations work together to ensure that pods are not scheduled onto inappropriate nodes. One or more taints are applied to a node; this marks that the node should not accept any pods that do not tolerate the taints.
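For example, a node tainted with a hypothetical key `dedicated=observability:NoSchedule` repels every pod except those carrying a matching toleration. A pod spec that tolerates that taint would include a fragment like this (the key, value, and effect here are illustrative, not part of the Observe configuration below):

```yaml
# Hypothetical pod-spec fragment: tolerates a dedicated=observability:NoSchedule taint
tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "observability"
    effect: "NoSchedule"
```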
Apply Node Affinity Configurations¶
Let’s suppose you have a nine-node Kubernetes cluster.
gke-cluster-1-default-pool-020b00c1-9s2c
gke-cluster-1-default-pool-020b00c1-czlf
gke-cluster-1-default-pool-020b00c1-xhq2
gke-cluster-1-default-pool-7a5f9d4f-tjs6
gke-cluster-1-default-pool-7a5f9d4f-z4hq
gke-cluster-1-default-pool-7a5f9d4f-zxbt
gke-cluster-1-default-pool-ad91df86-1r4c
gke-cluster-1-default-pool-ad91df86-b0hv
gke-cluster-1-default-pool-ad91df86-rw8f
You’d like to assign pods to a specific node, gke-cluster-1-default-pool-020b00c1-9s2c.
Label your nodes.
kubectl label nodes gke-cluster-1-default-pool-020b00c1-9s2c node-type=useme
kubectl label nodes gke-cluster-1-default-pool-020b00c1-czlf \
gke-cluster-1-default-pool-020b00c1-xhq2 \
gke-cluster-1-default-pool-7a5f9d4f-tjs6 \
gke-cluster-1-default-pool-7a5f9d4f-z4hq \
gke-cluster-1-default-pool-7a5f9d4f-zxbt \
gke-cluster-1-default-pool-ad91df86-1r4c \
gke-cluster-1-default-pool-ad91df86-b0hv \
gke-cluster-1-default-pool-ad91df86-rw8f \
node-type=nouseme
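Before redeploying, you can confirm the labels landed where you expect. The `-L` flag on `kubectl get nodes` adds a column showing each node’s value for the given label key:

```shell
# Show the node-type label value for every node (blank if unlabeled)
kubectl get nodes -L node-type
```

Only gke-cluster-1-default-pool-020b00c1-9s2c should show useme; the other eight nodes should show nouseme.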
Create affinity-values.yaml with the following configuration.
cluster-events:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node-type
                operator: In
                values: [useme]
              - key: observeinc.com/unschedulable
                operator: DoesNotExist
              - key: kubernetes.io/os
                operator: NotIn
                values: [windows]
cluster-metrics:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node-type
                operator: In
                values: [useme]
              - key: observeinc.com/unschedulable
                operator: DoesNotExist
              - key: kubernetes.io/os
                operator: NotIn
                values: [windows]
node-logs-metrics:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node-type
                operator: In
                values: [useme]
              - key: observeinc.com/unschedulable
                operator: DoesNotExist
              - key: kubernetes.io/os
                operator: NotIn
                values: [windows]
monitor:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node-type
                operator: In
                values: [useme]
              - key: observeinc.com/unschedulable
                operator: DoesNotExist
              - key: kubernetes.io/os
                operator: NotIn
                values: [windows]
Redeploy the Observe Agent
Run the following command to redeploy the Observe Agent in the observe namespace with the affinity configuration.
helm upgrade --reuse-values observe-agent observe/agent -n observe --values affinity-values.yaml
Run the following command to make sure the Observe Agent has been redeployed successfully.
kubectl get pods -o wide -n observe
As you can see below, all pods are assigned to the gke-cluster-1-default-pool-020b00c1-9s2c node as expected.
$ kubectl get pods -o wide -n observe
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
observe-agent-cluster-events-7bd9ccddcf-m6svd 1/1 Running 0 54s 10.248.8.6 gke-cluster-1-default-pool-020b00c1-9s2c <none> <none>
observe-agent-cluster-metrics-7fc5987bcb-v95rl 1/1 Running 0 54s 10.248.8.5 gke-cluster-1-default-pool-020b00c1-9s2c <none> <none>
observe-agent-monitor-b9f8c59c-9gpx2 1/1 Running 0 53s 10.248.8.8 gke-cluster-1-default-pool-020b00c1-9s2c <none> <none>
observe-agent-node-logs-metrics-agent-j59kh 1/1 Running 0 54s 10.248.8.7 gke-cluster-1-default-pool-020b00c1-9s2c <none> <none>
Apply Taints and Tolerations Configurations¶
Let’s suppose you have a nine-node Kubernetes cluster.
gke-cluster-1-default-pool-7fbc1d97-pt9k
gke-cluster-1-default-pool-7fbc1d97-w262
gke-cluster-1-default-pool-7fbc1d97-wshj
gke-cluster-1-default-pool-8dab22e2-2mhx
gke-cluster-1-default-pool-8dab22e2-lrj8
gke-cluster-1-default-pool-8dab22e2-rkd7
gke-cluster-1-default-pool-c192ec15-1wnx
gke-cluster-1-default-pool-c192ec15-3717
gke-cluster-1-default-pool-c192ec15-gt34
You’d like to assign pods to a specific node, gke-cluster-1-default-pool-7fbc1d97-pt9k, using taints and tolerations.
Taint a node
kubectl taint nodes gke-cluster-1-default-pool-7fbc1d97-pt9k deployObserve=notAllowed:NoSchedule
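If you later need to remove this taint (for example, to let untolerated pods schedule onto the node again), append a trailing hyphen to the same taint specification:

```shell
# Remove the taint applied above; the trailing '-' deletes it
kubectl taint nodes gke-cluster-1-default-pool-7fbc1d97-pt9k deployObserve=notAllowed:NoSchedule-
```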
Check taints
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
As you can see below, gke-cluster-1-default-pool-7fbc1d97-pt9k
is tainted.
$ kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
NAME TAINTS
gke-cluster-1-default-pool-7fbc1d97-pt9k [map[effect:NoSchedule key:deployObserve value:notAllowed]]
gke-cluster-1-default-pool-7fbc1d97-w262 <none>
gke-cluster-1-default-pool-7fbc1d97-wshj <none>
gke-cluster-1-default-pool-8dab22e2-2mhx <none>
gke-cluster-1-default-pool-8dab22e2-lrj8 <none>
gke-cluster-1-default-pool-8dab22e2-rkd7 <none>
gke-cluster-1-default-pool-c192ec15-1wnx <none>
gke-cluster-1-default-pool-c192ec15-3717 <none>
gke-cluster-1-default-pool-c192ec15-gt34 <none>
Create taints-tolerations-values.yaml with the following configuration.
cluster-events:
  tolerations:
    - key: "deployObserve"
      operator: "Equal"
      value: "notAllowed"
      effect: "NoSchedule"
cluster-metrics:
  tolerations:
    - key: "deployObserve"
      operator: "Equal"
      value: "notAllowed"
      effect: "NoSchedule"
node-logs-metrics:
  tolerations:
    - key: "deployObserve"
      operator: "Equal"
      value: "notAllowed"
      effect: "NoSchedule"
monitor:
  tolerations:
    - key: "deployObserve"
      operator: "Equal"
      value: "notAllowed"
      effect: "NoSchedule"
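Note that tolerations only allow pods onto the tainted node; they don’t pin pods to it. To force a component onto gke-cluster-1-default-pool-7fbc1d97-pt9k, you would pair each toleration with a node affinity rule. A sketch for one component, assuming the node has also been labeled node-type=useme as in the previous section:

```yaml
# Sketch: toleration plus node affinity pins cluster-events to the labeled, tainted node
cluster-events:
  tolerations:
    - key: "deployObserve"
      operator: "Equal"
      value: "notAllowed"
      effect: "NoSchedule"
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node-type
                operator: In
                values: [useme]
```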
Redeploy the Observe Agent
Run the following command to redeploy the Observe Agent in the observe namespace with the tolerations configuration.
helm upgrade --reuse-values observe-agent observe/agent -n observe --values taints-tolerations-values.yaml
Run the following command to make sure the Observe Agent has been redeployed successfully.
kubectl get pods -o wide -n observe
As you can see below, some pods are assigned to the tainted gke-cluster-1-default-pool-7fbc1d97-pt9k node as expected. Because tolerations permit, rather than require, scheduling onto the tainted node, other tolerating pods may still land on untainted nodes.
$ kubectl get pods -o wide -n observe
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
observe-agent-cluster-events-7c95b84b6-xgx9v 1/1 Running 0 2m3s 10.232.0.6 gke-cluster-1-default-pool-7fbc1d97-pt9k <none> <none>
observe-agent-cluster-metrics-c84d7c769-v8gmr 1/1 Running 0 2m3s 10.232.4.5 gke-cluster-1-default-pool-c192ec15-gt34 <none> <none>
observe-agent-monitor-855569455b-5dqgf 1/1 Running 0 2m2s 10.232.5.5 gke-cluster-1-default-pool-c192ec15-3717 <none> <none>
observe-agent-node-logs-metrics-agent-6lnpg 1/1 Running 0 5m11s 10.232.1.5 gke-cluster-1-default-pool-7fbc1d97-w262 <none> <none>
observe-agent-node-logs-metrics-agent-ch2np 1/1 Running 0 87s 10.232.8.6 gke-cluster-1-default-pool-8dab22e2-lrj8 <none> <none>
observe-agent-node-logs-metrics-agent-g6lw6 1/1 Running 0 5m11s 10.232.7.5 gke-cluster-1-default-pool-8dab22e2-rkd7 <none> <none>
observe-agent-node-logs-metrics-agent-jxp94 1/1 Running 0 5m12s 10.232.6.12 gke-cluster-1-default-pool-8dab22e2-2mhx <none> <none>
observe-agent-node-logs-metrics-agent-kfgjc 1/1 Running 0 50s 10.232.5.6 gke-cluster-1-default-pool-c192ec15-3717 <none> <none>
observe-agent-node-logs-metrics-agent-lfwcj 1/1 Running 0 5m12s 10.232.2.5 gke-cluster-1-default-pool-7fbc1d97-wshj <none> <none>
observe-agent-node-logs-metrics-agent-lqdx7 1/1 Running 0 2m3s 10.232.0.5 gke-cluster-1-default-pool-7fbc1d97-pt9k <none> <none>
observe-agent-node-logs-metrics-agent-n8vh4 0/1 Running 0 14s 10.232.4.6 gke-cluster-1-default-pool-c192ec15-gt34 <none> <none>
observe-agent-node-logs-metrics-agent-rx4jb 1/1 Running 0 5m12s 10.232.3.4 gke-cluster-1-default-pool-c192ec15-1wnx <none> <none>
For more examples, see Example deployment scenarios.