Core Workload Resources: Pods, Deployments, and ReplicaSets, Lesson 2.4

Kubernetes resource requests and limits: how to prevent noisy neighbors


Requests vs Limits

Figure: Kubernetes resource requests and limits.

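Requests are what the scheduler uses to find a node with enough free capacity. "Free" here means unreserved, not unused: the scheduler compares the Pod's request with the node's allocatable capacity minus the requests of the Pods already placed there, and ignores actual usage. A Pod that requests memory: 256Mi will only be scheduled to a node with at least 256Mi still unreserved.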
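To see the scheduling rule in action, a minimal sketch is a Pod whose request is deliberately larger than any node can offer. The Pod name, the nginx image, and the 1000Gi figure are illustrative assumptions, not part of the lesson; the only point is that the request exceeds every node's allocatable memory.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: too-big            # hypothetical name
spec:
  containers:
  - name: app
    image: nginx           # placeholder image
    resources:
      requests:
        memory: "1000Gi"   # assumed to exceed the allocatable memory of every node
```

On a typical cluster this Pod should stay in the Pending state, and kubectl describe pod too-big should show a FailedScheduling event that mentions insufficient memory.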

Limits are enforced at runtime. If a container exceeds its memory limit, the kernel kills it and Kubernetes reports the termination reason as OOMKilled. If it tries to use more CPU than its limit, it is throttled: slowed down, not killed.
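The memory limit can be observed with a container that intentionally allocates more than it is allowed. The sketch below follows a pattern used in the Kubernetes documentation; the Pod name is hypothetical, and it assumes the polinux/stress image can be pulled and that its stress binary accepts the flags shown.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: oom-demo           # hypothetical name
spec:
  containers:
  - name: stress
    image: polinux/stress  # assumed to be pullable from your registry
    resources:
      requests:
        memory: "50Mi"
      limits:
        memory: "100Mi"    # hard ceiling enforced at runtime
    command: ["stress"]
    # Try to allocate about 250M, well above the 100Mi limit.
    args: ["--vm", "1", "--vm-bytes", "250M", "--vm-hold", "1"]
```

After applying this manifest, kubectl describe pod oom-demo should report OOMKilled as the reason for the container's last termination, and the container is restarted according to the Pod's restart policy.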

Setting Resources

containers:
- name: api
  image: my-api:2.0
  resources:
    requests:
      memory: "256Mi"    # 256 mebibytes
      cpu: "250m"         # 250 millicores = 0.25 vCPU
    limits:
      memory: "512Mi"
      cpu: "500m"

Quality of Service Classes

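Kubernetes assigns each Pod a QoS class based on the resources set on its containers. Guaranteed: every container sets both CPU and memory requests and limits, and each request equals its limit. Burstable: at least one container sets a request or limit, but the Pod does not meet the Guaranteed criteria (for example, limits higher than requests). BestEffort: no container sets any requests or limits. When a node runs low on memory, BestEffort Pods are generally evicted first, then Burstable Pods that are using more than they requested, and Guaranteed Pods last.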
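The three classes can be illustrated with one Pod each. This is a sketch under stated assumptions: the Pod names are hypothetical, nginx is only a placeholder image, and the namespace is assumed to apply no default requests or limits (defaults would change the resulting class).

```yaml
# Guaranteed: CPU and memory requests both equal their limits.
apiVersion: v1
kind: Pod
metadata:
  name: qos-guaranteed     # hypothetical name
spec:
  containers:
  - name: app
    image: nginx           # placeholder image
    resources:
      requests:
        memory: "256Mi"
        cpu: "250m"
      limits:
        memory: "256Mi"
        cpu: "250m"
---
# Burstable: something is set, but the Guaranteed criteria are not met.
apiVersion: v1
kind: Pod
metadata:
  name: qos-burstable      # hypothetical name
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "256Mi"
      limits:
        memory: "512Mi"    # higher than the request; no CPU values set
---
# BestEffort: no requests or limits on any container.
apiVersion: v1
kind: Pod
metadata:
  name: qos-besteffort     # hypothetical name
spec:
  containers:
  - name: app
    image: nginx
```

Each Pod's status.qosClass field should then read Guaranteed, Burstable, and BestEffort respectively.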

# See QoS class of a pod
kubectl get pod my-pod -o jsonpath='{.status.qosClass}'