Running production workloads on Kubernetes requires more than just deploying your services; it demands a deep understanding of the cluster’s internals and how to keep it healthy under load.
While working with Oracle Cloud Infrastructure (OCI) and its managed Kubernetes service, Oracle Kubernetes Engine (OKE), I found that tuning resource reservations on worker nodes can significantly improve stability and resilience.
The Challenge: Resource Contention and Node Stability
As my workloads and cluster usage patterns evolved, I started noticing occasional instability. After careful monitoring and debugging, the issue pointed to resource contention on the worker nodes.
Sometimes pods were starved of CPU or memory, but more critically, essential node-level components such as the kubelet and system daemons were also competing for resources.
This situation, commonly known as node choking, isn’t specific to OCI or OKE; it’s a general risk in any Kubernetes cluster where resource reservations aren’t explicitly defined. When pods are allowed to claim nearly all of a node’s resources, the operating system and control components can be starved, resulting in unscheduled pods, degraded node health, and delayed cluster operations.
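A quick way to check whether a node has any headroom set aside is to compare its Capacity with its Allocatable values; Kubernetes derives Allocatable by subtracting kube-reserved, system-reserved, and the eviction threshold from Capacity. A minimal check (the node name is a placeholder):
# Compare what the node physically has with what the scheduler may hand out to pods.
# If the two are nearly identical, no meaningful reservations are in place.
kubectl describe node <node-name> | grep -A 5 -E "^(Capacity|Allocatable):"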
The Solution: Reserve Resources for the System and Kubelet
To mitigate this, we implemented explicit resource reservations using the --kube-reserved and --system-reserved flags passed to the kubelet during initialization.
This is an example of how we configure it on our OKE worker node pools via cloud-init:
#!/bin/bash
# Fetch and run the standard OKE bootstrap script, passing extra kubelet flags
# that reserve CPU and memory for Kubernetes components and the operating system.
curl --fail -H "Authorization: Bearer Oracle" -L0 \
  http://169.254.169.254/opc/v2/instance/metadata/oke_init_script \
  | base64 --decode > /var/run/oke-init.sh
bash /var/run/oke-init.sh \
  --kubelet-extra-args "--kube-reserved=cpu=1000m,memory=2Gi --system-reserved=cpu=1000m,memory=1000Mi"
Explanation:
- --kube-reserved: reserves CPU and memory for Kubernetes internal components such as the kubelet, cAdvisor, and others.
- --system-reserved: reserves resources for OS-level processes including logging, monitoring agents, and other system services.
By doing this, I ensured that critical processes remain unaffected even during high-load scenarios.
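One way to confirm the flags actually reached the kubelet is to read a node’s live kubelet configuration through the API server’s node proxy endpoint and check the reservation fields (here <node-name> is a placeholder, and jq is only used for readability):
# Dump the running kubelet configuration and pick out the reservation fields;
# they should match the values passed via --kubelet-extra-args.
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/configz" \
  | jq '.kubeletconfig | {kubeReserved, systemReserved}'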
The Impact: Resilience and Predictability
Since adopting this setup, I’ve observed a significant improvement in node stability and workload reliability:
- No kubelet OOM errors even under peak load
- Faster node response during autoscaling and upgrades
- More reliable pod scheduling, especially during resource-constrained periods
- Lower operational overhead due to fewer node-related issues
The most valuable gain, however, was improved predictability. With system needs explicitly reserved, I can better model node capacity and performance under real-world load.
Lessons Learned from the Field
OKE offers a solid managed Kubernetes platform, and with a few thoughtful configurations, you can further improve its robustness. Explicit resource reservation is one of those hidden gems that pays off over time.
That said, there is a tradeoff.
By reserving a portion of each node’s resources for system and Kubernetes operations, you effectively reduce the capacity available for running workloads. This can mean needing additional nodes to handle the same application footprint, which translates into higher infrastructure costs. It’s important to weigh this added cost against the operational stability gained.
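To put rough numbers on that, here is the arithmetic for a purely hypothetical 8-vCPU, 64 GiB worker node, using the reservation values from the example above and the kubelet’s default hard eviction threshold:
CPU:    8000m capacity - 1000m kube-reserved - 1000m system-reserved = 6000m allocatable (~25% reserved)
Memory: 64Gi capacity - 2Gi kube-reserved - 1000Mi system-reserved - 100Mi eviction threshold ≈ 60.9Gi allocatable (~5% reserved)
On smaller node shapes the same fixed reservations consume a proportionally larger share, so the cost impact depends heavily on the shapes you run.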
In my case, the improvement in reliability and predictability has justified the higher monthly bill. But like any optimization, it depends on your specific workload patterns and availability requirements.
If you’re running OKE in production and haven’t yet reserved system resources on your nodes, it’s worth considering. It only takes a small adjustment to prevent resource starvation and dramatically boost stability.
Note: Starting from Kubernetes version 1.30 on OKE, Oracle Cloud Infrastructure applies default resource reservations for system and Kubernetes components on managed nodes. If you're using an image from June 2024 or newer, these settings may already be configured automatically. That said, it's still a good idea to verify the defaults and adjust them to fit your workload.
Reference: https://docs.oracle.com/en-us/iaas/Content/ContEng/Concepts/contengaboutk8sversions.htm