GCP Cloud Run Cost Estimator

Estimate monthly Cloud Run costs based on requests, CPU, and memory allocation.

Last verified: April 2026

Cloud Run Configuration

Requests per Month

Free tier includes 2M requests/month

Average Duration (ms)

CPU Allocation (vCPU)

Memory (MB)

Concurrent Requests per Instance

Max concurrent requests each instance handles (default 80, max 1000)

Minimum Instances

Instances kept warm to avoid cold starts. Set to 0 for scale-to-zero.

CPU Always Allocated ($0.000018/vCPU-sec vs $0.000024/vCPU-sec request-based)

Include Free Tier (180K vCPU-sec + 360K GiB-sec + 2M requests/month)

How It Helps

The GCP Cloud Run Cost Estimator calculates monthly costs based on request volume, CPU allocation, memory allocation, and execution time. Cloud Run offers both request-based and instance-based billing models with a generous free tier. The tool factors in CPU throttling settings (allocated only during requests vs. always allocated), minimum instances for cold start mitigation, and concurrency settings.

Things Engineers Ask

What is included in the Cloud Run free tier?

Cloud Run provides 2 million requests, 180,000 vCPU-seconds, 360,000 GiB-seconds, and 1 GB of outbound data per month free. The allowance is per account and covers most low-to-moderate traffic services entirely. The free tier makes Cloud Run one of the most cost-effective serverless container platforms for small workloads.
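To make the free tier concrete, here is a minimal sketch that reports what fraction of each allowance a workload would consume. The function names are hypothetical, and the sketch assumes every request is billed for its full duration on its own (concurrency of 1), which overstates usage for services that handle requests concurrently.

```python
# Monthly free-tier allowances as stated on this page.
FREE_REQUESTS = 2_000_000
FREE_VCPU_SECONDS = 180_000
FREE_GIB_SECONDS = 360_000


def free_tier_usage(requests, avg_duration_ms, vcpu, memory_mb):
    """Return the fraction of each free-tier allowance a workload consumes.

    Assumption: each request is billed for its whole duration by itself,
    so this is a pessimistic (upper-bound) estimate.
    """
    busy_seconds = requests * avg_duration_ms / 1000
    return {
        "requests": requests / FREE_REQUESTS,
        "vcpu_seconds": busy_seconds * vcpu / FREE_VCPU_SECONDS,
        "gib_seconds": busy_seconds * (memory_mb / 1024) / FREE_GIB_SECONDS,
    }


def fits_free_tier(usage):
    """True when no allowance is exceeded."""
    return all(fraction <= 1.0 for fraction in usage.values())


usage = free_tier_usage(requests=500_000, avg_duration_ms=200, vcpu=1, memory_mb=512)
for name, fraction in usage.items():
    print(f"{name}: {fraction:.0%} of free tier")
print("fits free tier:", fits_free_tier(usage))
```

A fraction above 1.0 for any entry means that dimension would start incurring charges.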

What is the difference between CPU allocation modes?

With 'CPU allocated only during request processing,' you only pay for CPU when handling requests, and CPU is throttled between requests. With 'CPU always allocated,' you pay for CPU as long as the instance is running (including between requests), enabling background processing, WebSocket connections, and consistent performance.
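A rough way to compare the two modes is to ask how busy an instance has to be before the lower always-allocated rate wins. The sketch below uses only the two vCPU rates quoted on this page and ignores memory, request fees, and idle minimum-instance charges, so treat it as an illustration rather than a billing calculation. The function name is hypothetical.

```python
# vCPU rates quoted on this page, in dollars per vCPU-second.
REQUEST_BASED_RATE = 0.000024     # billed only while requests are being handled
ALWAYS_ALLOCATED_RATE = 0.000018  # billed for the whole instance lifetime

# Fraction of time an instance must be busy for both modes to cost the same.
BREAK_EVEN_BUSY_FRACTION = ALWAYS_ALLOCATED_RATE / REQUEST_BASED_RATE


def cpu_cost_per_instance_hour(busy_fraction, vcpu=1.0):
    """Return (request_based, always_allocated) CPU cost for one instance-hour.

    busy_fraction is the share of the hour spent handling at least one request.
    """
    seconds = 3600
    request_based = REQUEST_BASED_RATE * vcpu * seconds * busy_fraction
    always_allocated = ALWAYS_ALLOCATED_RATE * vcpu * seconds
    return request_based, always_allocated


for busy in (0.25, 0.5, 0.9):
    rb, aa = cpu_cost_per_instance_hour(busy)
    cheaper = "request-based" if rb < aa else "always allocated"
    print(f"busy {busy:.0%}: request-based ${rb:.4f}, always ${aa:.4f} -> {cheaper}")
```

Under these rates the break-even sits at about 75% busy time: below that, request-based CPU is cheaper; above it, always-allocated CPU is cheaper.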

How does concurrency affect cost?

Higher concurrency (up to 1,000) means each instance handles more simultaneous requests, reducing the number of instances needed and total cost. Lower concurrency may be needed for CPU-intensive or memory-intensive request processing. The default concurrency of 80 works well for most web workloads.
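The relationship between concurrency and instance count can be sketched with a simple steady-state estimate: requests in flight equal the arrival rate times the average duration, and each instance absorbs up to its concurrency limit. This is an assumption-laden approximation (even traffic, no autoscaler headroom), and the function name is hypothetical.

```python
import math


def instances_needed(requests_per_second, avg_duration_ms, concurrency=80):
    """Estimate how many instances are needed at steady state.

    Assumes perfectly even traffic and that every instance is filled to its
    concurrency limit; real autoscaling keeps some headroom.
    """
    in_flight = requests_per_second * avg_duration_ms / 1000
    return math.ceil(in_flight / concurrency)


# 400 requests/second at 200 ms each keeps about 80 requests in flight.
for limit in (80, 10, 1):
    count = instances_needed(400, 200, concurrency=limit)
    print(f"concurrency {limit}: {count} instance(s)")
```

With the same traffic, dropping the concurrency limit from 80 to 1 raises the estimate from 1 instance to 80.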

In Practice

You're deploying a notification service that processes webhook callbacks: bursty traffic with 5,000 requests during the peak hour and near-zero traffic overnight. The estimator shows Cloud Run at $8/month with CPU-during-request mode and min-instances=0, compared to a minimum of $50/month for an equivalent always-on Compute Engine VM. Even with 3-second cold starts on the first request after idle periods, the webhook callers have a 30-second timeout, so it's a non-issue. You save about $500/year ($42/month x 12) and eliminate all server management.

Practical Applications

1. Estimating monthly Cloud Run costs for a containerized API with projected request volume and average response time.
2. Comparing request-based vs. always-allocated CPU pricing for different traffic patterns.
3. Calculating the cost impact of setting minimum instances to eliminate cold starts.
4. Modeling free tier coverage for a low-traffic service to determine if Cloud Run is effectively free.

Behind the Scenes

The estimator models Cloud Run costs by calculating billable vCPU-seconds and GiB-seconds based on request volume, concurrency, and average request duration. For 'CPU allocated during request processing' mode, billable time is approximately (requests / concurrency) x average_duration, with the duration converted to seconds; this assumes requests overlap fully up to the concurrency limit. For 'CPU always allocated' mode, billable time is instance_count x uptime. Billable time is multiplied by the allocated vCPU count and by memory in GiB to get vCPU-seconds and GiB-seconds. The estimator then subtracts the monthly free tier (180,000 vCPU-seconds, 360,000 GiB-seconds, 2M requests), applies per-unit pricing, and adds minimum instance idle charges if configured.
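The calculation described above can be sketched roughly as follows. This is not the tool's actual source, only an illustration of the same steps. The vCPU rates and free-tier figures come from this page; the memory rates and the per-request fee are assumptions added for the example and should be checked against current pricing. Minimum-instance idle charges are left out, and the request fee is applied in both modes for simplicity.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumes a 30-day month

# vCPU rates from this page ($ per vCPU-second).
CPU_RATE = {"request": 0.000024, "always": 0.000018}
# ASSUMED memory rates ($ per GiB-second); not stated on this page.
MEM_RATE = {"request": 0.0000025, "always": 0.000002}
# ASSUMED request fee ($ per request); not stated on this page.
REQUEST_FEE = 0.40 / 1_000_000

# Free tier from this page.
FREE_VCPU_SECONDS = 180_000
FREE_GIB_SECONDS = 360_000
FREE_REQUESTS = 2_000_000


def estimate_monthly_cost(requests, avg_duration_ms, vcpu, memory_mb,
                          concurrency=80, always_allocated=False,
                          instance_count=1, include_free_tier=True):
    """Return an approximate monthly cost in dollars (idle charges excluded)."""
    if always_allocated:
        mode = "always"
        billable_seconds = instance_count * SECONDS_PER_MONTH
    else:
        mode = "request"
        billable_seconds = (requests / concurrency) * avg_duration_ms / 1000

    vcpu_seconds = billable_seconds * vcpu
    gib_seconds = billable_seconds * memory_mb / 1024
    billable_requests = requests

    if include_free_tier:
        vcpu_seconds = max(0, vcpu_seconds - FREE_VCPU_SECONDS)
        gib_seconds = max(0, gib_seconds - FREE_GIB_SECONDS)
        billable_requests = max(0, billable_requests - FREE_REQUESTS)

    return (vcpu_seconds * CPU_RATE[mode]
            + gib_seconds * MEM_RATE[mode]
            + billable_requests * REQUEST_FEE)


cost = estimate_monthly_cost(requests=10_000_000, avg_duration_ms=200,
                             vcpu=1, memory_mb=512)
print(f"estimated monthly cost: ${cost:.2f}")
```

In this example the compute usage (25,000 billable seconds) stays inside the free tier, so the estimate is driven entirely by the assumed per-request fee on the 8 million requests above the free allowance.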

Things the Docs Don’t Tell You

TIP

Cloud Run's concurrency setting is the single biggest cost lever. At the default concurrency of 80, one instance handles 80 simultaneous requests. If you lower it to 1 (for CPU-heavy work), Cloud Run spins up 80 instances for the same traffic — and you pay for all 80. Only reduce concurrency below 80 if your request handler genuinely cannot share CPU with other requests.

TIP

Setting minimum instances to 0 lets the service scale to zero, so idle periods cost nothing and low-traffic services can stay within the free tier, but it introduces cold starts of 1-5 seconds. Setting min-instances to 1 eliminates cold starts but costs ~$15-20/month even with zero traffic (1 vCPU always allocated). For user-facing APIs, min-instances=1 is almost always worth the cost. For webhooks or async processors that tolerate latency, min-instances=0 is fine.

TIP

Cloud Run charges for CPU and memory independently. A common mistake is allocating 4 GB memory with 1 vCPU — you pay the full memory rate even though most apps can't use 4 GB with only 1 CPU. Match your memory allocation to actual usage (check container metrics) and only increase CPU when you see request latency rising under load.

Disclaimer: This tool runs entirely in your browser. No data is sent to our servers. Always verify outputs before using them in production. AWS, Azure, and GCP are trademarks of their respective owners.