Estimate monthly Cloud Run costs based on requests, CPU, and memory allocation.
Last verified: April 2026
Free tier includes 2M requests/month
Max concurrent requests each instance handles (default 80, max 1000)
Instances kept warm to avoid cold starts. Set to 0 for scale-to-zero.
The GCP Cloud Run Cost Estimator calculates monthly costs based on request volume, CPU allocation, memory allocation, and execution time. Cloud Run offers both request-based and instance-based billing models with a generous free tier. The tool factors in CPU throttling settings (allocated only during requests vs. always allocated), minimum instances for cold start mitigation, and concurrency settings.
Cloud Run provides 2 million requests, 180,000 vCPU-seconds, 360,000 GiB-seconds, and 1 GB of outbound data per month free. The allowance applies per billing account, not per service, and covers most low-to-moderate traffic services entirely. The free tier makes Cloud Run one of the most cost-effective serverless container platforms for small workloads.
With 'CPU allocated only during request processing,' you only pay for CPU when handling requests, and CPU is throttled between requests. With 'CPU always allocated,' you pay for CPU as long as the instance is running (including between requests), enabling background processing, WebSocket connections, and consistent performance.
Higher concurrency (up to 1,000) means each instance handles more simultaneous requests, reducing the number of instances needed and total cost. Lower concurrency may be needed for CPU-intensive or memory-intensive request processing. The default concurrency of 80 works well for most web workloads.
You're deploying a notification service that processes webhook callbacks. Traffic is bursty: 5,000 requests during the peak hour and near-zero traffic overnight. The estimator shows Cloud Run at $8/month with CPU-during-request mode and min-instances=0, compared to a minimum of $50/month for an equivalent always-on Compute Engine VM. Even with 3-second cold starts on the first request after idle periods, the webhook callers have a 30-second timeout, so the delay is a non-issue. You save $42/month, or about $500/year, and no longer manage servers yourself.
The estimator models Cloud Run costs by calculating billable vCPU-seconds and GiB-seconds based on request volume, concurrency, and average request duration. For 'CPU allocated during request processing' mode, billable time is (requests / concurrency) × average_duration. For 'CPU always allocated' mode, billable time is instance_count × uptime. It subtracts the monthly free tier (180,000 vCPU-seconds, 360,000 GiB-seconds, 2M requests), applies per-unit pricing, and adds minimum instance idle charges if configured.
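The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the estimator's actual source: the function and parameter names are hypothetical, the per-unit rates are assumed placeholder values that should be checked against the current Cloud Run pricing page, the same rates are used for both billing modes for simplicity, and idle charges for minimum instances are left out.

```python
from dataclasses import dataclass

# Monthly free tier as stated in the text.
FREE_VCPU_SECONDS = 180_000
FREE_GIB_SECONDS = 360_000
FREE_REQUESTS = 2_000_000


@dataclass(frozen=True)
class Rates:
    # ASSUMPTION: illustrative USD prices, not authoritative quotes.
    per_vcpu_second: float = 0.000024
    per_gib_second: float = 0.0000025
    per_million_requests: float = 0.40


def estimate_monthly_cost(requests, avg_duration_s, concurrency, vcpu, memory_gib,
                          always_allocated=False, instance_count=1, uptime_s=0.0,
                          rates=Rates()):
    """Return an estimated monthly cost in USD under the simplified model."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")

    if always_allocated:
        # 'CPU always allocated': billed for every second instances are up.
        billable_s = instance_count * uptime_s
    else:
        # 'CPU allocated during request processing':
        # (requests / concurrency) x average_duration.
        billable_s = (requests / concurrency) * avg_duration_s

    # Subtract the free tier, never going below zero.
    vcpu_s = max(0.0, billable_s * vcpu - FREE_VCPU_SECONDS)
    gib_s = max(0.0, billable_s * memory_gib - FREE_GIB_SECONDS)
    paid_requests = max(0, requests - FREE_REQUESTS)

    return (vcpu_s * rates.per_vcpu_second
            + gib_s * rates.per_gib_second
            + paid_requests / 1_000_000 * rates.per_million_requests)


# A small service that stays inside the free tier.
small = estimate_monthly_cost(requests=1_000_000, avg_duration_s=0.5,
                              concurrency=80, vcpu=1, memory_gib=0.5)
print(f"${small:.2f}")
```

Passing a different `Rates` instance is the intended way to plug in region-specific or updated prices without touching the formula.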
Cloud Run's concurrency setting is the single biggest cost lever. At the default concurrency of 80, one instance handles 80 simultaneous requests. If you lower it to 1 (for CPU-heavy work), Cloud Run needs 80 instances to serve those same 80 simultaneous requests, and you pay for all 80. Only reduce concurrency below 80 if your request handler genuinely cannot share CPU with other requests.
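The relationship between concurrency and instance count comes down to a ceiling division. The sketch below uses a hypothetical helper name and assumes requests pack perfectly onto instances, which real autoscaling only approximates.

```python
import math


def instances_needed(simultaneous_requests: int, concurrency: int) -> int:
    """Instances required if each one serves up to `concurrency` requests at once."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    return math.ceil(simultaneous_requests / concurrency)


# The same 80 simultaneous requests at two concurrency settings.
print(instances_needed(80, 80))  # 1
print(instances_needed(80, 1))   # 80
```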
Setting minimum instances to 0 lets the service scale to zero, so an idle service incurs no instance charges, but it introduces cold starts of 1-5 seconds. Setting min-instances to 1 eliminates cold starts for traffic the warm instance can absorb, but costs ~$15-20/month even with zero traffic (1 vCPU always allocated). For user-facing APIs, min-instances=1 is almost always worth the cost. For webhooks or async processors that tolerate latency, min-instances=0 is fine.
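The idle charge for warm instances is straightforward to approximate: instances multiplied by seconds in the month multiplied by the per-second resource rates. The following sketch assumes a 30-day month, ignores the free tier, and takes the rates as arguments rather than hard-coding them, since idle pricing depends on region and billing mode. All names here are hypothetical.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumes a 30-day month


def min_instance_idle_cost(min_instances, vcpu, memory_gib, cpu_rate, mem_rate):
    """Monthly cost of keeping `min_instances` warm with no traffic.

    cpu_rate is USD per vCPU-second and mem_rate is USD per GiB-second;
    both must come from the current pricing page.
    """
    per_second = vcpu * cpu_rate + memory_gib * mem_rate
    return min_instances * SECONDS_PER_MONTH * per_second


# Scale-to-zero costs nothing while idle, whatever the rates are.
print(min_instance_idle_cost(0, vcpu=1, memory_gib=0.5,
                             cpu_rate=0.00001, mem_rate=0.000001))
```

The rates in the usage line are arbitrary placeholders chosen only to make the call runnable.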
Cloud Run charges for CPU and memory independently. A common mistake is allocating 4 GB memory with 1 vCPU — you pay the full memory rate even though most apps can't use 4 GB with only 1 CPU. Match your memory allocation to actual usage (check container metrics) and only increase CPU when you see request latency rising under load.
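Because the two charges are independent, over-allocating memory raises the memory line item while leaving the CPU line item unchanged. A rough sketch, with a hypothetical function name and placeholder rates that are assumptions rather than quoted prices:

```python
def resource_charges(billable_seconds, vcpu, memory_gib, cpu_rate, mem_rate):
    """Return (cpu_cost, memory_cost) in USD; each is billed on its own."""
    cpu_cost = billable_seconds * vcpu * cpu_rate
    mem_cost = billable_seconds * memory_gib * mem_rate
    return cpu_cost, mem_cost


# Same workload and CPU, two memory allocations (placeholder rates).
cpu_a, mem_a = resource_charges(1_000_000, vcpu=1, memory_gib=4,
                                cpu_rate=0.00002, mem_rate=0.000002)
cpu_b, mem_b = resource_charges(1_000_000, vcpu=1, memory_gib=1,
                                cpu_rate=0.00002, mem_rate=0.000002)
print(f"memory charge ratio: {mem_a / mem_b:.1f}")
```

Under this model the memory charge scales linearly with the allocation, so trimming 4 GiB to the 1 GiB an app actually uses cuts the memory portion to a quarter.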
Disclaimer: This tool runs entirely in your browser. No data is sent to our servers. Always verify outputs before using them in production. AWS, Azure, and GCP are trademarks of their respective owners.