Calculate optimal Kubernetes container CPU/memory requests and limits with YAML output and node count estimates.
Last verified: May 2026
Kubernetes CPU and memory requests and limits are the difference between a cluster that bin-packs efficiently and one that wastes 40% of node capacity. The Container Resource Calculator takes your container's measured or estimated usage profile and produces request/limit values that follow the conventional rules (requests sized to typical load, limits sized to absorb spikes), then translates the result into ready-to-paste YAML and estimates how many nodes a given replica count will consume on common node sizes.
Always set memory limits — exceeding them causes an OOMKill, which is the correct behavior for a misbehaving container. CPU limits are more nuanced: setting them aggressively causes throttling that can hurt latency-sensitive workloads, so many teams set requests but omit CPU limits and let the kernel scheduler manage contention. Decide per workload, not by default.
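As a rough illustration of that rule, the sketch below builds a resources block in which the memory limit is mandatory and the CPU limit is optional. The function names `build_resources` and `to_yaml` are hypothetical helpers written for this example, not part of the calculator or of any Kubernetes client library.

```python
def build_resources(cpu_request_m, mem_request_mi, mem_limit_mi, cpu_limit_m=None):
    """Return a Kubernetes-style resources dict (hypothetical helper).

    The memory limit is always present; the CPU limit is emitted only
    when the caller decides this workload needs one.
    """
    resources = {
        "requests": {"cpu": f"{cpu_request_m}m", "memory": f"{mem_request_mi}Mi"},
        "limits": {"memory": f"{mem_limit_mi}Mi"},
    }
    if cpu_limit_m is not None:
        resources["limits"]["cpu"] = f"{cpu_limit_m}m"
    return resources


def to_yaml(resources):
    """Render the dict as an indented YAML fragment using only the stdlib."""
    lines = ["resources:"]
    for section in ("requests", "limits"):
        lines.append(f"  {section}:")
        for key, value in resources[section].items():
            lines.append(f"    {key}: {value}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Latency-sensitive service: memory limit set, CPU limit deliberately omitted.
    print(to_yaml(build_resources(300, 400, 600)))
```

Passing `cpu_limit_m=1000` adds a `cpu: 1000m` entry under `limits`, which makes the per-workload decision an explicit argument rather than a default.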
The request should cover steady-state usage; the limit should cover realistic spikes. A 1:1 ratio (Guaranteed QoS, which requires requests to equal limits for both CPU and memory) is appropriate for latency-critical workloads that need predictable performance. A 1:3 or 1:4 ratio (Burstable QoS) is appropriate for workloads with occasional bursts. Anything wider than 1:5 is a sign the workload is poorly characterized: measure before guessing.
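The ratio guidance above can be expressed as a small check. This is only a sketch: `classify_ratio` and its labels are hypothetical, and the 1:5 cutoff is taken from the paragraph above rather than from any Kubernetes rule.

```python
def classify_ratio(request, limit):
    """Classify one request:limit pair (same unit for both values).

    Labels are illustrative, not Kubernetes API values.
    """
    if request <= 0 or limit < request:
        raise ValueError("expected 0 < request <= limit")
    ratio = limit / request
    if ratio == 1:
        return "guaranteed"   # request equals limit
    if ratio <= 5:
        return "burstable"    # headroom for occasional bursts
    return "too-wide"         # wider than 1:5: measure the workload first


if __name__ == "__main__":
    for request, limit in [(500, 500), (300, 1000), (100, 2000)]:
        print(f"{request} -> {limit}: {classify_ratio(request, limit)}")
```

A pod only receives the Guaranteed QoS class when both its CPU and its memory pass the 1:1 test, so the check would be run once per resource.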
Your team's latency dashboard shows the API service's p99 spiking at exactly the times the cluster autoscaler adds a node. You profile the service: 250m CPU at steady state, 700m at peak; 300 MiB memory at steady state, 450 MiB at peak. Current config: requests 100m/200Mi, limits 2000m/1Gi. The 100m request meant the scheduler packed 30 pods per 8-vCPU node, and they all contended for the same cores when load came in. You bump to requests 300m/400Mi, limits 1000m/600Mi. Node count rises slightly; p99 drops by half. Trade made deliberately.
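The arithmetic behind this scenario can be worked through in a few lines. The sketch uses only the figures quoted above and ignores reserved node overhead for simplicity; `node_load` is a hypothetical helper.

```python
NODE_CPU_M = 8000   # 8-vCPU node, in millicores
PODS_ON_NODE = 30   # pods the scheduler packed onto one node


def node_load(pods, per_pod_m, node_m=NODE_CPU_M):
    """Fraction of node CPU consumed if every pod uses per_pod_m millicores."""
    return pods * per_pod_m / node_m


requested = node_load(PODS_ON_NODE, 100)   # what the scheduler accounted for
steady = node_load(PODS_ON_NODE, 250)      # measured steady-state demand
peak = node_load(PODS_ON_NODE, 700)        # demand if every pod peaks at once
new_max_pods = NODE_CPU_M // 300           # CPU-bound pod cap with 300m requests

if __name__ == "__main__":
    print(f"requested {requested:.1%}, steady {steady:.1%}, peak {peak:.1%}")
    print(f"max pods per node at 300m request: {new_max_pods}")
```

The scheduler saw the node as 37.5% committed while steady-state demand already sat near 94%, so any burst pushed demand past the available cores. With 300m requests at most 26 pods fit on the node by CPU, and each one's request covers its 250m steady state.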
The calculator takes the typical and peak CPU/memory values you supply, applies the chosen QoS strategy (Guaranteed, Burstable narrow, Burstable wide) to derive request and limit, and computes node count by dividing the total replica resource demand by per-node capacity (after subtracting kube-reserved and system-reserved overheads). Output is generated as both a resources block ready to paste into a Deployment and a per-replica/per-node summary table.
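A minimal sketch of that pipeline follows. The strategy rules in `derive` (Guaranteed sizes both values to the peak, narrow Burstable limits at the peak, wide Burstable adds 50% headroom) are assumptions made for illustration; the calculator's actual multipliers are not stated here. The reserved overheads in the usage example are also hypothetical.

```python
import math


def derive(typical, peak, strategy):
    """Return (request, limit) for one resource under an ASSUMED strategy rule."""
    if strategy == "guaranteed":
        return peak, peak                        # request == limit
    if strategy == "burstable_narrow":
        return typical, peak                     # limit just covers the peak
    if strategy == "burstable_wide":
        return typical, math.ceil(peak * 1.5)    # assumed 50% headroom
    raise ValueError(f"unknown strategy: {strategy}")


def ceil_div(a, b):
    """Integer ceiling division, avoiding float rounding."""
    return (a + b - 1) // b


def estimate_nodes(replicas, cpu_request_m, mem_request_mi,
                   node_cpu_m, node_mem_mi, reserved_cpu_m=0, reserved_mem_mi=0):
    """Divide total requested resources by per-node allocatable capacity.

    Takes the larger of the CPU-bound and memory-bound node counts. This
    ignores fragmentation (a pod cannot be split across nodes), so treat
    the result as a lower bound.
    """
    alloc_cpu = node_cpu_m - reserved_cpu_m
    alloc_mem = node_mem_mi - reserved_mem_mi
    by_cpu = ceil_div(replicas * cpu_request_m, alloc_cpu)
    by_mem = ceil_div(replicas * mem_request_mi, alloc_mem)
    return max(by_cpu, by_mem)


if __name__ == "__main__":
    cpu_req, cpu_lim = derive(250, 700, "burstable_narrow")
    mem_req, mem_lim = derive(300, 450, "burstable_narrow")
    nodes = estimate_nodes(40, cpu_req, mem_req, 8000, 32768,
                           reserved_cpu_m=500, reserved_mem_mi=2048)
    print(f"cpu {cpu_req}m/{cpu_lim}m, memory {mem_req}Mi/{mem_lim}Mi, nodes: {nodes}")
```

Node count is driven by requests, not limits, because the scheduler places pods according to what they request.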
Set the memory request near the workload's working set, not its peak. A pod with a 4 GiB request and occasional spikes to 6 GiB bin-packs well and still absorbs the spikes, provided its limit is at least 6 GiB; a pod with a 6 GiB request and constant 4 GiB usage strands 2 GiB of schedulable memory for every replica.
On a 32 GiB node, kube-reserved and system-reserved take 1 to 2 GiB before any user pod is scheduled, so with 4 GiB pod requests only seven pods fit, not eight. Account for that overhead when calculating max pods per node, or you will be left wondering why pods stay Pending on a 'half-empty' node.
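The effect of the reserved overhead on pod count is easy to verify. The sketch below uses the 32 GiB node and 4 GiB request from the paragraph above; `max_pods_by_memory` is a hypothetical helper that considers memory only.

```python
GIB = 1024  # MiB per GiB


def max_pods_by_memory(node_mem_mi, pod_request_mi, reserved_mi):
    """Pods that fit by memory request once reserved overhead is subtracted."""
    allocatable = node_mem_mi - reserved_mi
    return allocatable // pod_request_mi


naive = max_pods_by_memory(32 * GIB, 4 * GIB, 0)                # ignores overhead
with_1gib = max_pods_by_memory(32 * GIB, 4 * GIB, 1 * GIB)      # 1 GiB reserved
with_2gib = max_pods_by_memory(32 * GIB, 4 * GIB, 2 * GIB)      # 2 GiB reserved

if __name__ == "__main__":
    print(f"naive: {naive}, 1 GiB reserved: {with_1gib}, 2 GiB reserved: {with_2gib}")
```

Either reservation drops the count from eight pods to seven, which leaves 2 to 3 GiB of allocatable memory that no 4 GiB pod can use.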
Disclaimer: This tool runs entirely in your browser. No data is sent to our servers. Always verify outputs before using them in production. AWS, Azure, and GCP are trademarks of their respective owners.