Container Resource Calculator

Calculate optimal Kubernetes container CPU/memory requests and limits with YAML output and node count estimates.

Last verified: May 2026

Container Usage Data

Inputs: average CPU (millicores), peak CPU (millicores), average memory (MB), peak memory (MB), replica count, and a safety margin from 0% to 100% (the slider is shown at 20%).

How It Helps

Kubernetes CPU and memory requests and limits are the difference between a cluster that bin-packs efficiently and one that wastes 40% of node capacity. The Container Resource Calculator takes your container's measured or estimated usage profile and produces request/limit values that follow the conventional rules — requests sized to typical load, limits sized to absorb spikes — then translates the result into ready-to-paste YAML and estimates how many nodes a given replica count will consume on common node sizes.

Things Engineers Ask

Should CPU and memory limits be set, or just requests?

Always set memory limits — exceeding them causes an OOMKill, which is the correct behavior for a misbehaving container. CPU limits are more nuanced: setting them aggressively causes throttling that can hurt latency-sensitive workloads, so many teams set requests but omit CPU limits and let the kernel scheduler manage contention. Decide per workload, not by default.
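
As a minimal sketch of that policy, the helper below renders a resources block that always carries a memory limit and includes a CPU limit only when one is passed in. The function name render_resources and its parameters are hypothetical, chosen for illustration; they are not the calculator's own code.

```python
def render_resources(cpu_request_m, mem_request_mi, mem_limit_mi, cpu_limit_m=None):
    """Render a Kubernetes `resources:` block as YAML text.

    The memory limit is always emitted; the CPU limit is emitted only
    when the caller supplies one (hypothetical helper, illustration only).
    """
    lines = [
        "resources:",
        "  requests:",
        f"    cpu: {cpu_request_m}m",
        f"    memory: {mem_request_mi}Mi",
        "  limits:",
    ]
    if cpu_limit_m is not None:
        lines.append(f"    cpu: {cpu_limit_m}m")
    lines.append(f"    memory: {mem_limit_mi}Mi")
    return "\n".join(lines)


# Requests only for CPU, request plus limit for memory.
print(render_resources(300, 400, 600))
```

Passing cpu_limit_m=1000 adds the CPU limit line for workloads where a hard ceiling is preferred.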

How do I pick the right ratio between request and limit?

Request should cover steady-state usage; limit should cover realistic spikes. A 1:1 ratio (Guaranteed QoS) is appropriate for latency-critical workloads that need predictable performance. A 1:3 or 1:4 ratio (Burstable QoS) is appropriate for workloads with occasional bursts. Anything wider than 1:5 is a sign the workload is poorly characterized — measure before guessing.
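
The rule of thumb above can be sketched as a small check. The labels and the describe_ratio name are assumptions made for this example. Note that Kubernetes assigns the actual QoS class from the requests and limits of every container in the pod, for both CPU and memory, so a single ratio is only a rough indicator.

```python
def describe_ratio(request, limit):
    """Classify a request:limit pair using the thresholds from the text.

    Returns (ratio, label). Labels are illustrative, not Kubernetes output.
    """
    if request <= 0 or limit < request:
        raise ValueError("need 0 < request <= limit")
    ratio = limit / request
    if limit == request:
        label = "guaranteed-style"      # 1:1
    elif ratio <= 5:
        label = "burstable"             # up to 1:5
    else:
        label = "too-wide"              # wider than 1:5: measure first
    return ratio, label


print(describe_ratio(300, 300))
print(describe_ratio(250, 1000))
print(describe_ratio(100, 2000))
```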

In Practice

Your team's latency dashboard shows the API service's p99 spiking at exactly the times the cluster autoscaler adds a node. You profile the service: 250m CPU at steady state, 700m at peak; 300 MiB of memory at steady state, 450 MiB at peak. Current config: requests 100m/200Mi, limits 2000m/1Gi. The 100m request meant the scheduler packed 30 pods per 8-vCPU node, and they all competed for the same CPU when load came in. You bump to requests 300m/400Mi, limits 1000m/600Mi. Node count rises slightly; p99 drops by half. Trade made deliberately.
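
A rough sketch of the arithmetic behind this scenario, using only the figures quoted above. It assumes the whole 8 vCPU is available to pods (reserved overhead is ignored) and looks at CPU only; the function names are made up for the example.

```python
NODE_CPU_M = 8000      # 8-vCPU node from the scenario, reserved overhead ignored
STEADY_CPU_M = 250     # measured steady-state usage per pod


def steady_utilization(pods_per_node):
    """Fraction of node CPU consumed when every pod runs at steady state."""
    return pods_per_node * STEADY_CPU_M / NODE_CPU_M


def max_pods_by_cpu_request(cpu_request_m):
    """Upper bound on pods per node that the CPU request alone allows."""
    return NODE_CPU_M // cpu_request_m


before = steady_utilization(30)                 # 7500m of 8000m
pods_after = max_pods_by_cpu_request(300)       # at most 26 pods
after = steady_utilization(pods_after)          # 6500m of 8000m

print(f"before: 30 pods/node, steady-state CPU at {before:.2%}")
print(f"after: at most {pods_after} pods/node, steady-state CPU at {after:.2%}")
```

With 30 pods the node is almost full before any spike arrives; the larger request caps density and leaves more room per pod, which is the trade the scenario describes.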

Practical Applications

1. Right-sizing a new microservice deployment based on a few hours of soak-test profiling instead of guessing.
2. Auditing an existing deployment whose pods routinely OOMKill or get throttled because limits were set carelessly.
3. Estimating the node count needed to host a horizontally-scaled service before signing off on a cluster capacity plan.
4. Generating a starting ResourceQuota and LimitRange for a new namespace based on the workloads it will host.

Behind the Scenes

The calculator takes the typical and peak CPU/memory values you supply, applies the chosen QoS strategy (Guaranteed, Burstable narrow, Burstable wide) to derive request and limit, and computes node count by dividing the total replica resource demand by per-node capacity (after subtracting kube-reserved and system-reserved overheads). Output is generated as both a resources block ready to paste into a Deployment and a per-replica/per-node summary table.
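
The description above can be approximated with the sketch below. It is not the tool's actual implementation: the exact formulas are not published here, so this assumes the request is the average plus the safety margin, the limit is the peak plus the safety margin, and Guaranteed simply sets request equal to limit. The narrow and wide Burstable variants are not modeled, and all names and the reserved figures in the usage line are hypothetical.

```python
def ceil_div(a, b):
    """Integer ceiling division, avoiding float rounding."""
    return -(-a // b)


def size_container(avg, peak, margin_pct=20, guaranteed=False):
    """Return (request, limit) in the same unit as the inputs.

    Assumed rule: limit = peak + margin; request = avg + margin,
    or request = limit when a Guaranteed-style result is wanted.
    """
    limit = ceil_div(peak * (100 + margin_pct), 100)
    request = limit if guaranteed else ceil_div(avg * (100 + margin_pct), 100)
    return request, limit


def estimate_nodes(replicas, cpu_request_m, mem_request_mi,
                   node_cpu_m, node_mem_mi,
                   reserved_cpu_m=0, reserved_mem_mi=0):
    """Divide total requested CPU and memory by per-node capacity after
    reserved overhead, and take whichever resource needs more nodes."""
    alloc_cpu = node_cpu_m - reserved_cpu_m
    alloc_mem = node_mem_mi - reserved_mem_mi
    by_cpu = ceil_div(replicas * cpu_request_m, alloc_cpu)
    by_mem = ceil_div(replicas * mem_request_mi, alloc_mem)
    return max(by_cpu, by_mem)


cpu_req, cpu_lim = size_container(250, 700)     # millicores
mem_req, mem_lim = size_container(300, 450)     # Mi
nodes = estimate_nodes(60, cpu_req, mem_req, 8000, 32768,
                       reserved_cpu_m=200, reserved_mem_mi=2048)
print(cpu_req, cpu_lim, mem_req, mem_lim, nodes)
```

Dividing total demand by capacity gives a lower bound; real scheduling can need more nodes because pods cannot be split across them.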

Things the Docs Don’t Tell You

TIP

Set the memory request near the workload's working set, not its peak. A pod with a 4 GiB request and occasional spikes to 6 GiB bin-packs well and can still absorb those spikes, provided its limit and the node's free memory allow them; a pod with a 6 GiB request and constant 4 GiB usage strands 2 GiB per replica that the scheduler cannot hand to anything else.

TIP

If your node has 32 GiB and your pods request 4 GiB each, kube-reserved and system-reserved take 1-2 GiB before user pods get scheduled, so only 7 pods fit rather than the 8 that simple division suggests. Account for that overhead when calculating max-pods-per-node, or you will be confused about why pods stay pending on a 'half-empty' node.
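
The arithmetic in this tip, as a small sketch. The function name is illustrative, and the model is simplified: the kubelet also subtracts an eviction threshold when computing allocatable memory, which is not modeled here.

```python
def pods_that_fit(node_gib, reserved_gib, pod_request_gib):
    """Pods that fit by memory request once reserved overhead is removed."""
    allocatable = node_gib - reserved_gib
    return int(allocatable // pod_request_gib)


print(pods_that_fit(32, 0, 4))   # naive view: no overhead
print(pods_that_fit(32, 1, 4))   # 1 GiB reserved
print(pods_that_fit(32, 2, 4))   # 2 GiB reserved
```

Even 1 GiB of reserved memory is enough to cost a full 4 GiB pod slot on this node size.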

Disclaimer: This tool runs entirely in your browser. No data is sent to our servers. Always verify outputs before using them in production. AWS, Azure, and GCP are trademarks of their respective owners.