
Database Connection Pooling in the Cloud: RDS Proxy, PgBouncer, and Serverless Gotchas

Why serverless kills connection limits, RDS Proxy internals and costs, PgBouncer pool modes, Azure built-in pooling, and Cloud SQL Auth Proxy patterns.

CloudToolStack Team · March 20, 2026 · 13 min read

Why Serverless Broke Database Connections

The first time I deployed a Lambda function connected to an RDS PostgreSQL instance, everything worked perfectly in testing. We had a handful of concurrent invocations, each opening a database connection, running a query, and closing the connection. Then we went to production, and within 30 minutes our database was rejecting connections. The PostgreSQL log was full of "FATAL: too many connections for role" errors. Our db.t3.micro instance had a max_connections limit of about 112, and Lambda had blown through it in minutes.

This is the fundamental tension between serverless compute and relational databases. Serverless scales connections horizontally -- every function invocation can create a new database connection. Relational databases scale connections vertically -- they have a fixed maximum based on instance size and available memory. A single Lambda function with 1,000 concurrent invocations needs 1,000 database connections. An RDS db.r6g.large instance supports about 1,600 max connections. That sounds like headroom until you realize you probably have more than one Lambda function, and each connection consumes about 10 MB of PostgreSQL memory.

Connection pooling solves this mismatch by sitting between your application and the database, multiplexing many application connections over a smaller number of database connections. But not all connection pooling solutions work the same way, and choosing the wrong approach can introduce more problems than it solves.

Connection Limits by Instance Size

Before discussing pooling solutions, you need to understand how many connections your database can actually handle. These numbers surprise most people.

AWS RDS PostgreSQL: max_connections defaults to LEAST(DBInstanceClassMemory/9531392, 5000). For a db.t3.micro (1 GB RAM), that is about 112 connections. For a db.r6g.xlarge (32 GB RAM), it is about 3,500. For a db.r6g.16xlarge (512 GB RAM), it hits the 5,000 cap.

AWS RDS MySQL: max_connections defaults to DBInstanceClassMemory/12582880. A db.t3.micro gets about 85 connections. A db.r6g.xlarge gets about 2,700.
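The two RDS default formulas above are easy to sanity-check in a few lines of Python. One caveat: DBInstanceClassMemory is slightly less than the instance's nominal RAM (AWS subtracts OS and management overhead), so real-world defaults land a little below these figures.

```python
def pg_max_connections(memory_bytes: int) -> int:
    """RDS PostgreSQL default: LEAST(DBInstanceClassMemory/9531392, 5000)."""
    return min(memory_bytes // 9531392, 5000)

def mysql_max_connections(memory_bytes: int) -> int:
    """RDS MySQL default: DBInstanceClassMemory/12582880."""
    return memory_bytes // 12582880

GIB = 2**30
print(pg_max_connections(1 * GIB))       # ~112 (db.t3.micro class memory)
print(pg_max_connections(512 * GIB))     # capped at 5000 (db.r6g.16xlarge)
print(mysql_max_connections(1 * GIB))    # ~85
```

Running the formula against your own instance class is a quick way to check whether a proposed connection budget even fits before you provision anything.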

Azure Database for PostgreSQL Flexible Server: max_connections varies by compute tier. A Burstable B1ms (2 GB RAM) supports 50 connections. A General Purpose D4s_v3 (16 GB RAM) supports 859. A Memory Optimized E16s_v3 (128 GB RAM) supports 5,000.

Google Cloud SQL for PostgreSQL: the max_connections default is set based on the instance's memory. A db-f1-micro gets about 25. A db-custom-4-26624 (4 vCPUs, 26 GB RAM) gets about 400. A db-custom-96-614400 gets about 4,000.

The max_connections lie

Just because your instance supports 3,500 max connections does not mean it can handle 3,500 active connections well. Each idle PostgreSQL connection consumes about 5 to 10 MB of memory. Each active connection running a query can consume significantly more. At 3,500 connections, you are using 17 to 35 GB just for connection overhead, leaving less memory for shared_buffers, work_mem, and the OS page cache. In practice, most PostgreSQL instances perform well with no more than 200 to 400 active connections, regardless of what max_connections allows.

RDS Proxy: AWS's Managed Solution

Amazon RDS Proxy sits between your application and your RDS or Aurora database, maintaining a pool of database connections and multiplexing application connections over them. It was designed specifically for the Lambda connection explosion problem, and it works well for that use case, but it has important limitations.

How RDS Proxy Works

RDS Proxy maintains a pool of connections to your database and reuses them across application requests. When a Lambda function (or any client) connects to the proxy, it borrows a connection from the pool, executes queries, and returns the connection when the client session ends. The proxy handles connection initialization, authentication, and session state management transparently.

Connection multiplexing is the key feature. If 1,000 Lambda invocations connect to the proxy, but only 50 are actively querying at any given moment, the proxy only needs 50 backend connections to the database. This dramatically reduces the connection pressure on the database.

RDS Proxy Limitations

Cost: RDS Proxy is priced per vCPU of the associated RDS instance, per hour -- currently about $0.015 per vCPU-hour, with a two-vCPU minimum. For a db.r6g.xlarge (4 vCPUs), that is about $0.06/hour, or roughly $44/month. For larger instances, the cost increases linearly. This is not trivial -- it can add roughly 10 to 15 percent to your database costs.

Latency: RDS Proxy adds 1 to 5ms of latency per query. For most web applications, this is negligible. For high-frequency trading or sub-millisecond workloads, it is significant.

Session pinning: If your application uses session-level features like prepared statements, temporary tables, advisory locks, or SET commands, the proxy "pins" the application connection to a specific backend connection, which defeats the purpose of multiplexing. The proxy logs pinning events, and you should monitor the DatabaseConnectionsCurrentlySessionPinned CloudWatch metric. High pinning rates mean you are paying for RDS Proxy without getting the multiplexing benefit.

Engine support: RDS Proxy supports MySQL 5.7+, PostgreSQL 10+, and MariaDB 10.3+ on RDS and Aurora. It does not support SQL Server, Oracle, or other engines.

Avoiding session pinning

The most common cause of session pinning in PostgreSQL is prepared statements. If you use an ORM like Prisma, Sequelize, or Django with prepared statements enabled, every connection will be pinned. To avoid this, disable prepared statements in your ORM configuration when using RDS Proxy. In Prisma, set pgbouncer=true in your connection string (yes, this flag works with RDS Proxy too, not just PgBouncer). In Django, set DISABLE_SERVER_SIDE_CURSORS = True.
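For Django, the documented knob is DISABLE_SERVER_SIDE_CURSORS, set at the top level of the database dict. A minimal sketch, with hypothetical endpoint and credentials (the Prisma equivalent is shown as a comment):

```python
# settings.py fragment -- endpoint and credentials are placeholders
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "HOST": "my-proxy.proxy-abc.us-east-1.rds.amazonaws.com",  # RDS Proxy endpoint (hypothetical)
        "PORT": "5432",
        "NAME": "appdb",
        "USER": "app_user",
        "PASSWORD": "change-me",
        # Server-side cursors carry per-session state, which forces RDS Proxy
        # to pin the client session to one backend connection.
        "DISABLE_SERVER_SIDE_CURSORS": True,
    }
}

# Prisma equivalent: append pgbouncer=true to the connection string, e.g.
# postgresql://app_user:pw@my-proxy.proxy-abc.us-east-1.rds.amazonaws.com:5432/appdb?pgbouncer=true
```

After deploying a change like this, watch the DatabaseConnectionsCurrentlySessionPinned metric to confirm the pinning rate actually drops.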

PgBouncer: The Self-Managed Standard

PgBouncer has been the standard PostgreSQL connection pooler for over 15 years. It is lightweight (typically using less than 50 MB of memory), fast (sub-millisecond overhead), and extremely reliable. If you are on a platform that does not offer a managed proxy, or if you need more control than RDS Proxy provides, PgBouncer is the answer.

Pool Modes: Transaction vs Session vs Statement

PgBouncer offers three pooling modes, and choosing the right one is critical.

Transaction mode releases the backend connection back to the pool after each transaction completes. This gives you the best multiplexing ratio because connections are only held during active transactions, not during idle time between queries. This is the mode you want for serverless workloads and most web applications. The catch is that per-session features (prepared statements, SET commands, LISTEN/NOTIFY) do not work because your next transaction might use a different backend connection.

Session mode holds a backend connection for the entire duration of the client session, only returning it when the client disconnects. This supports all PostgreSQL features but provides minimal multiplexing benefit. You are essentially just adding a connection queue in front of your database. Use this mode only if you need prepared statements or other session-level features and cannot disable them.

Statement mode releases the backend connection after each individual statement. This provides the most aggressive multiplexing but breaks multi-statement transactions. It is only useful for simple, single-query workloads -- which describes many read-heavy APIs but not much else.
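The three modes map to a single pgbouncer.ini setting. A minimal configuration sketch for transaction mode -- the host, database, and pool sizes here are illustrative placeholders, not tuned recommendations:

```ini
[databases]
appdb = host=db.internal port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction    ; alternatives: session, statement
default_pool_size = 20     ; backend connections per database-user pair
max_client_conn = 1000     ; client connections PgBouncer will accept
server_idle_timeout = 600  ; close idle backend connections after 10 minutes
```

Note the asymmetry between max_client_conn and default_pool_size: that ratio is the multiplexing you are buying.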

PgBouncer Deployment Patterns

There are two common deployment patterns for PgBouncer in cloud environments.

Sidecar pattern: Run PgBouncer as a sidecar container in the same pod (Kubernetes) or as a companion container in the same task (ECS). Each application instance gets its own PgBouncer. This eliminates network hops between the application and the pooler, and scaling the application automatically scales the pooling layer. The downside is that each PgBouncer maintains its own pool, so the total backend connections equal pool_size multiplied by the number of application instances. With 20 pods and a pool_size of 10, you need 200 backend connections.

Centralized pattern: Run PgBouncer as a standalone service (one or two instances with a load balancer). All application instances connect to the central PgBouncer. This gives better connection multiplexing because all application traffic shares one pool, but it introduces a single point of failure and a network hop. Deploy at least two instances behind a Network Load Balancer for high availability.

Azure Flexible Server Built-in Pooling

Azure Database for PostgreSQL Flexible Server includes a built-in PgBouncer instance that you can enable with a single setting. This is by far the easiest connection pooling option across any cloud provider -- no additional infrastructure to manage, no additional cost, and no separate endpoint.

To enable it, set the pgbouncer.enabled server parameter to true. The built-in PgBouncer listens on port 6432 (the standard PgBouncer port) while your database continues to listen on port 5432. You can connect directly to the database for admin tasks while your application uses the pooler.

The built-in pooler defaults to transaction mode with a pool_size of 50 connections per database-user pair. You can tune these via server parameters: pgbouncer.pool_mode, pgbouncer.default_pool_size, pgbouncer.max_client_conn, and others. The limitation is that you cannot run custom PgBouncer configurations -- you are limited to the parameters Azure exposes. For most workloads, the default configuration works well.
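Assuming you manage the server with the Azure CLI, enabling and tuning the pooler looks roughly like this (resource group and server name are placeholders; the parameter names are the pgbouncer.* ones Azure exposes):

```shell
# Enable the built-in PgBouncer (placeholder resource names)
az postgres flexible-server parameter set \
  --resource-group my-rg --server-name my-pg \
  --name pgbouncer.enabled --value true

# Optionally tune the pool via the exposed parameters
az postgres flexible-server parameter set \
  --resource-group my-rg --server-name my-pg \
  --name pgbouncer.default_pool_size --value 100
```

Remember to point your application at port 6432 after enabling it; connections to 5432 bypass the pooler entirely.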

Azure built-in pooling performance

In my testing, the built-in PgBouncer on Azure Flexible Server adds about 0.5 to 1ms of latency per query, which is lower than RDS Proxy. It supports up to 2,000 client connections by default, multiplexed over the configured pool size. For an Azure Function app with hundreds of concurrent invocations, this turns a connection exhaustion scenario into a non-issue with zero additional infrastructure.

Cloud SQL Auth Proxy: Google's Approach

The Cloud SQL Auth Proxy (formerly Cloud SQL Proxy) deserves mention here, though it is technically not a connection pooler. It provides secure, authenticated connectivity to Cloud SQL instances using IAM-based authentication and encrypted tunnels. Every Cloud SQL tutorial recommends it, but many teams misunderstand what it does and does not do.

The Auth Proxy does not pool connections. It creates a one-to-one mapping between client connections and database connections. If your application opens 100 connections through the Auth Proxy, there are 100 connections to the Cloud SQL instance. The proxy handles TLS encryption and IAM authentication, not connection multiplexing.

For connection pooling on GCP, you need to combine the Auth Proxy with a separate pooler. The common pattern is to run PgBouncer between your application and the Auth Proxy: your application connects to PgBouncer, PgBouncer connects to the Auth Proxy, and the Auth Proxy connects to Cloud SQL. This works but adds two hops of latency. Alternatives are Cloud SQL's managed connection pooling feature (where available for your instance) or deploying PgBouncer on Cloud Run as a centralized pooling service.

For Cloud Run and Cloud Functions workloads connecting to Cloud SQL, Google recommends using the Cloud SQL connector libraries (available for Go, Java, Python, and Node.js) combined with application-level connection pooling through your ORM or database driver. This eliminates the need for the Auth Proxy sidecar while providing the same security benefits.

Serverless-Specific Connection Strategies

Serverless functions create unique connection management challenges that go beyond basic pooling.

Lambda Connection Reuse

Lambda execution environments persist between invocations (warm starts). You can create a database connection outside the handler function, and it will be reused across invocations on the same execution environment. This dramatically reduces connection churn. However, each concurrent execution environment gets its own connection, so at 500 concurrent invocations, you still have 500 connections -- just not the thousands you would have if each invocation created a new connection.

The trap is that Lambda execution environments can be frozen for minutes between invocations. During this time, the database connection is idle but held open. If the environment is eventually recycled, the connection may be closed abruptly without a proper disconnect. Configure your database client with a connection timeout and implement reconnection logic in your handler.
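A sketch of the reuse-plus-reconnect pattern for a Python Lambda. The connection factory is injected as a parameter so the caching logic is visible and testable; in real code it would be something like psycopg2.connect bound to your DSN (an assumption here, not something prescribed above):

```python
_conn = None  # module level: survives warm starts on the same execution environment

def _get_connection(connect):
    """Return the cached connection, dialing a new one if none exists or it died."""
    global _conn
    if _conn is None or getattr(_conn, "closed", True):
        _conn = connect()
    return _conn

def handler(event, context, connect=None):
    global _conn
    try:
        conn = _get_connection(connect)
        with conn.cursor() as cur:
            cur.execute("SELECT 1")
            return {"ok": cur.fetchone()[0] == 1}
    except Exception:
        # A frozen-then-thawed environment can hold a dead socket:
        # drop the cached connection and retry once with a fresh one.
        _conn = None
        conn = _get_connection(connect)
        with conn.cursor() as cur:
            cur.execute("SELECT 1")
            return {"ok": cur.fetchone()[0] == 1}
```

Pair this with a short connect timeout on the driver so a dead socket fails fast instead of hanging the invocation.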

Connection Limits as a Scaling Bottleneck

Here is a real scenario. You have a Lambda function with a reserved concurrency of 500. It connects to an RDS db.r6g.large with max_connections of 1,600. Sounds fine. But you also have three other Lambda functions connecting to the same database, each with 200 to 300 reserved concurrency. Plus an ECS service with 10 tasks, each maintaining a pool of 20 connections. Your total connection demand is 500 + 200 + 300 + 250 + 200 = 1,450 connections -- dangerously close to the limit. Add a traffic spike or a deployment that briefly doubles Lambda concurrency, and you are in trouble.

The solution is to put RDS Proxy (or PgBouncer) in front of the database and set reserved concurrency on all Lambda functions to a value that, combined, stays well below the proxy's backend pool size. This requires coordination across teams, which is why I recommend documenting your connection budget in a shared runbook that lists every service, its expected connection count, and the database's total capacity.
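The budget itself is simple arithmetic, and scripting it keeps the runbook honest. A sketch using hypothetical service names matching the scenario above:

```python
# Hypothetical services and their worst-case connection demand
connection_budget = {
    "orders-lambda (reserved concurrency)": 500,
    "emails-lambda": 200,
    "search-lambda": 300,
    "reports-lambda": 250,
    "api-ecs (10 tasks x pool of 20)": 200,
}

max_connections = 1600  # db.r6g.large default
total_demand = sum(connection_budget.values())
utilization = total_demand / max_connections
print(f"demand {total_demand}/{max_connections} ({utilization:.0%})")
```

Anything above roughly 80 percent utilization in this tally is a signal to add a pooler or trim concurrency before the next traffic spike does it for you.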


Choosing the Right Pooling Solution

Here is a decision framework based on your platform and requirements.

  • AWS + Lambda + RDS/Aurora: Use RDS Proxy. The cost is justified by the connection management headache it eliminates. Disable prepared statements in your ORM to avoid session pinning.
  • AWS + ECS/EKS + RDS/Aurora: Use PgBouncer as a sidecar in transaction mode. You have more control, lower latency, and no additional per-vCPU cost.
  • Azure + Functions + Flexible Server: Enable the built-in PgBouncer. There is no reason to deploy additional infrastructure when the database includes a pooler for free.
  • Azure + AKS/Container Apps: Use the built-in PgBouncer on Flexible Server and connect directly from your containers. Add a PgBouncer sidecar only if you need per-pod pool management.
  • GCP + Cloud Run/Functions + Cloud SQL: Use the Cloud SQL connector library with application-level pooling. For high-concurrency workloads, add a centralized PgBouncer.
  • GCP + GKE + Cloud SQL: Run the Cloud SQL Auth Proxy as a sidecar and PgBouncer as a centralized service. Connect applications to PgBouncer, which connects through the Auth Proxy to Cloud SQL.

Monitoring Connection Pool Health

Regardless of which pooling solution you choose, monitor these metrics.

  • Active backend connections: Should stay well below max_connections. Alert at 80 percent.
  • Waiting clients: Clients queued because no backend connections are available. Should be zero in steady state. Any sustained waiting means your pool is too small.
  • Connection wait time: How long clients wait for a connection. Alert if p99 exceeds 100ms.
  • Connection errors: Failed connection attempts. Should be zero. Any non-zero value indicates pool exhaustion or database issues.
  • Session pinning rate (RDS Proxy only): Percentage of connections that are pinned. Above 10 percent means you are losing multiplexing efficiency.

For PgBouncer, connect to the admin console (psql -p 6432 pgbouncer) and run SHOW POOLS and SHOW STATS. For RDS Proxy, check CloudWatch metrics prefixed with RDSProxy. For Azure built-in PgBouncer, check the pgbouncer category in Azure Monitor metrics.

Connection pooling is infrastructure that you configure once and then mostly forget about -- until it breaks. When it breaks, it takes down every service connected to that database simultaneously. Invest the time upfront to choose the right solution, configure it correctly, and set up monitoring that catches problems before they cascade.

Written by CloudToolStack Team

Cloud architects with 15+ years of production experience across AWS, Azure, GCP, and OCI. We build free tools and write practical guides to help engineers navigate multi-cloud infrastructure.

Disclaimer: This article is for informational purposes. Cloud services and pricing change frequently; always verify with official provider documentation. AWS, Azure, GCP, and OCI are trademarks of their respective owners.