
Redis Caching Patterns in the Cloud: ElastiCache vs Azure Cache vs Memorystore

Cache-aside, write-through, and read-through patterns explained with eviction policies, cluster mode guidance, and specific sizing and cost comparisons across ElastiCache, Azure Cache for Redis, and Memorystore.

CloudToolStack Team · March 3, 2026 · 14 min read

Why Your Cache Strategy Matters More Than Your Cache Size

Redis is the default caching layer for cloud applications, and every major cloud provider offers a managed service: AWS ElastiCache, Azure Cache for Redis, and GCP Memorystore. Teams spin up a Redis cluster, throw cache-aside logic into their application code, and call it done. Six months later, they are dealing with cache stampedes during deployments, stale data bugs that take hours to reproduce, and a monthly bill that keeps climbing because nobody right-sized the cluster after launch.

The problem is rarely Redis itself. Redis is fast, well-understood, and battle-tested. The problem is that teams do not think carefully about their caching patterns, eviction policies, or cluster topology before going to production. A cache that returns stale data is worse than no cache at all because it introduces a class of bugs that are invisible to your monitoring and infuriating to debug.

This guide covers the caching patterns that actually work in production, the managed Redis offerings across all three clouds, and specific sizing and cost guidance based on real workloads.

Caching Patterns: Choosing the Right One

There are three fundamental caching patterns, and most teams only know the first one. Each has different consistency guarantees, failure modes, and implementation complexity.

Cache-Aside (Lazy Loading)

Cache-aside is the most common pattern and the one most developers implement by default. The application checks the cache first. On a miss, it reads from the database, writes the result to the cache, and returns it to the caller. The application is responsible for all cache management -- the cache itself has no awareness of the database.

When to use it: Read-heavy workloads where occasional stale data is acceptable. Product catalogs, user profiles, configuration data, and any read path where you can tolerate a TTL-based staleness window.

The trap: Cache-aside gives you stale data by design. When the database is updated, the cache still holds the old value until the TTL expires or the entry is explicitly invalidated. Many teams set TTLs of 5 to 15 minutes and assume that is good enough. It is not good enough for inventory counts, pricing data, or anything where users can see two different values depending on whether their request hits the cache or not.

The correct implementation of cache-aside includes explicit invalidation on writes. When you update a record in the database, you also delete (not update) the corresponding cache key. Deleting rather than updating avoids a race condition where two concurrent writes can leave the cache with a value that never existed in the database.
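
Here is a minimal sketch of cache-aside with delete-on-write invalidation. The `DictCache` class is an in-memory stand-in for a redis-py-style client (same `get`/`set`/`delete` shape) so the example runs without a server; `get_user`, `update_user`, and the key format are illustrative, not a prescribed API.

```python
import json

class DictCache:
    """In-memory stand-in for a redis-py client (get/set/delete only)."""
    def __init__(self):
        self._store = {}
    def get(self, key):
        return self._store.get(key)
    def set(self, key, value, ex=None):   # ex = TTL in seconds, ignored by the stub
        self._store[key] = value
    def delete(self, key):
        self._store.pop(key, None)

def get_user(cache, db, user_id):
    """Cache-aside read: check the cache, fall back to the DB, populate."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    row = db[user_id]                        # miss: read the source of truth
    cache.set(key, json.dumps(row), ex=300)  # populate with a TTL
    return row

def update_user(cache, db, user_id, row):
    """Cache-aside write: update the DB, then DELETE (not update) the key."""
    db[user_id] = row
    cache.delete(f"user:{user_id}")          # next read repopulates from the DB
```

Deleting rather than re-setting the key on write is what closes the concurrent-writer race described above: a stale delete is harmless, but a stale set can pin a value that never existed in the database.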

Write-Through

In a write-through pattern, every write goes to both the cache and the database. The application (or a proxy layer) writes to the cache first, and the cache layer synchronously writes to the database. This guarantees that the cache is always consistent with the database.

When to use it: Workloads where read-after-write consistency is critical. Shopping carts, session stores, and any flow where a user makes a change and expects to immediately see that change reflected.

The trap: Write-through adds latency to every write because you are writing to two places synchronously. It also fills your cache with data that may never be read. If you write 100,000 user profile updates per day but only 10,000 of those users log in, you are caching 90,000 records for nothing. Combine write-through with a TTL to evict records that are not being read.
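
A write-through sketch for the shopping-cart case. Here `cache` and `db` are plain dicts standing in for a Redis client and a database; with redis-py the cache write would be `cache.set(key, json.dumps(cart), ex=ttl)`. The function names are hypothetical.

```python
import json

def put_cart(cache, db, cart_id, cart, ttl=3600):
    """Write-through: every write goes to both stores synchronously,
    so a read immediately after this call always sees the new value.
    The TTL still evicts carts that are written but never read again."""
    db[cart_id] = cart                       # source of truth first
    cache[f"cart:{cart_id}"] = json.dumps(cart)

def get_cart(cache, db, cart_id):
    """Reads hit the cache first; after put_cart this is always a hit."""
    cached = cache.get(f"cart:{cart_id}")
    if cached is not None:
        return json.loads(cached)
    return db.get(cart_id)
```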

Read-Through

Read-through moves the database-fetching logic into the cache layer itself. The application asks the cache for a key, and if the cache does not have it, the cache fetches from the database, stores the result, and returns it. The application never talks to the database directly for cached queries.

When to use it: When you want to decouple your application from database-fetching logic. Read-through works well with write-through or write-behind as a complete caching abstraction layer.

The trap: Redis does not natively support read-through. You need a proxy layer like RedisGears, a sidecar, or an application library that implements the pattern. Most teams that think they want read-through actually want cache-aside with a well-structured repository layer.

Write-behind for high-throughput writes

Write-behind (also called write-back) writes to the cache immediately and asynchronously flushes to the database in batches. This dramatically reduces database write load but introduces the risk of data loss if the cache node fails before the flush. Use write-behind for analytics counters, page view counts, or any data where losing a few seconds of writes is acceptable. Never use it for financial transactions or order data.
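
The batching trade-off can be sketched with a counter that buffers increments and flushes them to the database in one batch. The class name and flush threshold are illustrative; the dict `db` stands in for the real database.

```python
class WriteBehindCounter:
    """Write-behind sketch for page-view counters: increments land in an
    in-memory buffer immediately, and flush() drains the buffer to the DB
    in one batch. Anything buffered but not yet flushed is lost if the
    process dies -- the trade-off the pattern accepts."""
    def __init__(self, db, flush_every=100):
        self.db = db                 # dict standing in for the database
        self.buffer = {}
        self.flush_every = flush_every
        self.pending = 0

    def incr(self, page):
        self.buffer[page] = self.buffer.get(page, 0) + 1
        self.pending += 1
        if self.pending >= self.flush_every:
            self.flush()

    def flush(self):
        for page, count in self.buffer.items():
            # One batched write per page instead of one per increment
            self.db[page] = self.db.get(page, 0) + count
        self.buffer.clear()
        self.pending = 0
```

In production the buffer would live in Redis itself (for example INCR on a counter key) with a background job doing the flush, so a single application instance crashing loses nothing.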

Eviction Policies: Getting This Wrong Tanks Your Hit Rate

When Redis runs out of memory, it needs to decide which keys to evict. Open-source Redis defaults to noeviction, which means Redis returns errors when memory is full, and managed services such as ElastiCache and Azure Cache for Redis default to volatile-lru instead. Neither default is a deliberate choice for your workload, so set the policy explicitly before going to production.

allkeys-lru is the right choice for most caching workloads. It evicts the least recently used key across all keys, regardless of whether they have a TTL. This is the closest to an ideal cache behavior -- frequently accessed data stays, infrequently accessed data gets evicted.

volatile-lru only evicts keys that have a TTL set. If you mix cached data (with TTLs) and persistent data (session tokens, rate limiting counters) in the same Redis instance, volatile-lru protects your persistent data from eviction. But this only works if you are disciplined about setting TTLs on cached data.

allkeys-lfu (least frequently used) was added in Redis 4.0 and is superior to LRU for most workloads. LRU can evict a popular key that just happened to not be accessed in the last few milliseconds. LFU tracks access frequency and evicts keys that are accessed least often. If your Redis version supports it, prefer allkeys-lfu over allkeys-lru.

Monitor your eviction rate. If Redis is evicting thousands of keys per second, your instance is too small or you are caching too aggressively. A healthy cache should evict occasionally, not constantly. On ElastiCache, watch the Evictions CloudWatch metric. On Azure Cache, check evictedkeys. On Memorystore, check redis.googleapis.com/stats/evicted_keys.

Managed Redis Across Clouds: The Real Differences

AWS ElastiCache for Redis

ElastiCache is the most mature managed Redis offering. It supports Redis 7.x, cluster mode with up to 500 shards, and both encryption at rest and in transit. The node types range from cache.t4g.micro (0.5 GB) to cache.r7g.16xlarge (419 GB).

Pricing: A cache.r7g.large (13.07 GB) node in us-east-1 costs approximately $0.166 per hour on-demand, or about $120 per month. A typical production cluster with 3 shards, each with one replica, runs 6 nodes -- roughly $720 per month for a 78 GB cluster. Reserved nodes with a 1-year all-upfront commitment save about 33 percent, bringing that to approximately $480 per month.

What I like: ElastiCache supports Global Datastore for cross-region replication with sub-second latency, which is essential for multi-region applications. The auto-scaling for cluster mode scales shards based on CPU or memory utilization without downtime.

What catches people: ElastiCache runs inside your VPC, so you need VPC peering or Transit Gateway to access it from other accounts. There is no public endpoint option. Also, failover to a read replica takes 30 to 60 seconds, during which your cache is unavailable for writes.


Azure Cache for Redis

Azure offers three tiers: Basic (no replication, no SLA -- development only), Standard (replicated, 99.9 percent SLA), and Premium (clustering, persistence, VNet injection). There is also an Enterprise tier powered by Redis Enterprise that supports RediSearch, RedisJSON, and RedisTimeSeries modules.

Pricing: A Standard C2 (13 GB) instance costs approximately $168 per month in East US. A Premium P2 (13 GB per shard) with 3 shards and replication runs about $1,512 per month. The Enterprise E10 (12 GB) starts at approximately $344 per month. Azure pricing is generally 10 to 20 percent higher than ElastiCache for equivalent capacity.

What I like: The Enterprise tier gives you Redis modules without managing Redis Enterprise yourself. RediSearch for full-text search and RedisTimeSeries for time-series data eliminate the need for separate services. Zone redundancy on Premium and Enterprise tiers distributes replicas across availability zones automatically.

What catches people: The Standard tier does not support clustering, so you are limited to a single node's memory (53 GB max). If you need more than 53 GB, you must use Premium or Enterprise. Also, scaling operations on Standard and Premium tiers cause 10 to 30 minutes of elevated latency because Azure migrates data to new nodes.


GCP Memorystore for Redis

Memorystore offers two tiers: Basic (no replication) and Standard (automatic failover with one replica). It supports Redis 7.x and instance sizes from 1 GB to 300 GB. Cluster mode launched in GA and supports up to 25 shards.

Pricing: A Standard tier 13 GB instance in us-central1 costs approximately $0.065 per GB per hour for the primary and $0.034 per GB per hour for the replica, totaling about $95 per month. Memorystore is consistently the cheapest managed Redis option -- 20 to 40 percent less than ElastiCache for equivalent capacity.

What I like: The pricing. For teams running on GCP, Memorystore offers excellent value. The managed patching and automatic failover work reliably. AUTH support and in-transit encryption are straightforward to enable.

What catches people: Memorystore does not support Redis modules, so no RediSearch or RedisTimeSeries. Cross-region replication is not available -- if you need multi-region caching on GCP, you are running separate Memorystore instances and managing replication at the application level. The 25-shard cluster limit is also restrictive for very large caching workloads.


Cluster Mode: When You Need It and When You Do Not

Redis cluster mode shards your data across multiple nodes. Each shard holds a subset of the keyspace (determined by hash slots), and you can scale horizontally by adding more shards. This is how you get beyond the memory limit of a single node.

You need cluster mode when: Your dataset exceeds the largest available single-node memory (about 400 GB on ElastiCache, 300 GB on Memorystore, 120 GB on Azure Premium). Or when your write throughput exceeds what a single node can handle -- roughly 100,000 to 200,000 operations per second depending on operation complexity.

You do not need cluster mode when: Your dataset fits in a single node and your throughput is within bounds. Cluster mode adds operational complexity. Multi-key operations (MGET, transactions, Lua scripts that touch multiple keys) only work on keys that hash to the same slot. You have to use hash tags to colocate related keys, and that introduces hotspot risks.
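
The hash-tag constraint can be shown with a small key-building helper. In cluster mode only the substring inside `{braces}` is hashed to choose the slot, so keys sharing that substring land on the same shard. The key names here are illustrative.

```python
def cart_keys(user_id: int) -> list[str]:
    """Build related keys that share a hash tag so they hash to the same
    slot and can be used together in MGET, transactions, or Lua scripts.
    The downside: all of one user's keys now live on one shard, which is
    the hotspot risk mentioned above."""
    tag = f"{{user:{user_id}}}"
    return [f"{tag}:cart", f"{tag}:saved", f"{tag}:recent"]
```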

A common anti-pattern is enabling cluster mode with 2 shards when the data fits in a single 13 GB node. You are paying twice as much for no benefit and adding constraints on multi-key operations. Start with a single shard plus a replica, and only add shards when monitoring shows you are approaching memory or throughput limits.

Cache Stampede Prevention

A cache stampede happens when a popular key expires and dozens (or thousands) of concurrent requests all miss the cache simultaneously, all query the database, and all write back to the cache. For a key that is accessed 1,000 times per second, a TTL expiry triggers 1,000 simultaneous database queries. This can overwhelm your database.

Solution 1: Probabilistic early expiration. Instead of all requests seeing the key expire at the same moment, have each request independently decide whether to refresh the key slightly before the TTL expires. The probability increases as the TTL approaches zero. This spreads the refresh load across time so only one or two requests actually hit the database.
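
One common formulation of this idea (sometimes called XFetch) can be sketched as a pure function. `recompute_cost` is roughly how long the database fetch takes in seconds, `beta > 1` makes early refreshes more eager, and `rng` is injectable so the behavior can be tested deterministically; all names are illustrative.

```python
import math
import random
import time

def should_refresh(expiry_ts, recompute_cost, beta=1.0, now=None,
                   rng=random.random):
    """Probabilistic early expiration: each request independently decides
    to recompute the value before the TTL expires, with a probability
    that rises as expiry approaches. -log(rng()) is an exponentially
    distributed draw >= 0, so most requests wait until near expiry and
    only a few refresh early."""
    now = time.time() if now is None else now
    return now - recompute_cost * beta * math.log(rng()) >= expiry_ts
```

On a cache hit the application calls this; if it returns True, that one request refreshes the key while everyone else keeps serving the still-valid cached value.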

Solution 2: Locking. When a request sees a cache miss, it sets a short-lived lock key in Redis (using SET NX with a TTL). If the lock is acquired, that request fetches from the database and populates the cache. Other requests that see the lock either wait briefly and retry, or return a slightly stale value from a separate stale cache.
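
A sketch of the locking approach. `StubCache` mimics the redis-py client just enough to run standalone; against real Redis the lock line is `cache.set(lock_key, "1", nx=True, ex=lock_ttl)`, and `get_guarded` plus its parameters are illustrative.

```python
import json
import time

class StubCache:
    """Minimal stand-in for a redis-py client; TTLs are accepted but ignored."""
    def __init__(self):
        self._d = {}
    def get(self, k):
        return self._d.get(k)
    def set(self, k, v, ex=None, nx=False):
        if nx and k in self._d:
            return None          # redis-py returns None when SET NX fails
        self._d[k] = v
        return True
    def delete(self, k):
        self._d.pop(k, None)

def get_guarded(cache, fetch_db, key, ttl=300, lock_ttl=10,
                wait=0.01, retries=50):
    """On a miss, try to take a short-lived lock with SET NX. The winner
    fetches from the DB and fills the cache; losers poll the cache until
    the winner has written it, then fall back to a direct fetch."""
    for _ in range(retries):
        val = cache.get(key)
        if val is not None:
            return json.loads(val)
        if cache.set(f"lock:{key}", "1", nx=True, ex=lock_ttl):
            try:
                row = fetch_db()                      # only one caller gets here
                cache.set(key, json.dumps(row), ex=ttl)
                return row
            finally:
                cache.delete(f"lock:{key}")
        time.sleep(wait)                              # lost the race: wait, re-check
    return fetch_db()                                 # lock never cleared: go direct
```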

Solution 3: Background refresh. A background job refreshes popular keys before they expire. The cache never actually expires for hot keys -- it is always fresh. This is the most reliable approach for keys with very high read rates but requires you to identify which keys are hot.

TTL jitter is not optional

If you set the same TTL on every key (for example, 300 seconds), a large batch of keys written at the same time will all expire at the same time, causing a synchronized stampede. Always add random jitter to your TTLs. Instead of 300 seconds, use 270 to 330 seconds (300 plus or minus 10 percent). This simple change prevents synchronized expiration and is one of the highest-impact improvements you can make to a caching layer.
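
The jitter itself is a one-liner; `rng` is injectable only so the bounds can be tested, and the defaults match the 300-second example above.

```python
import random

def jittered_ttl(base=300, jitter=0.10, rng=random.uniform):
    """Return the base TTL +/- jitter percent, so keys written in the
    same batch do not all expire in the same instant."""
    return int(rng(base * (1 - jitter), base * (1 + jitter)))
```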

Sizing and Cost Optimization

Most teams over-provision their Redis clusters because they do not know how much memory they actually need. Here is a practical approach to sizing.

Step 1: Calculate your working set. Identify the data you actually need to cache. If your database has 50 million rows but only 2 million are accessed in any given hour, your working set is 2 million rows. Multiply by the average serialized size of each cached value (including the key). For JSON objects, this is typically 500 bytes to 5 KB. For a 2-million-record working set at 2 KB average, you need about 4 GB.

Step 2: Add overhead. Redis uses memory for internal data structures, replication buffers, and fragmentation. Plan for 25 to 30 percent overhead. That 4 GB working set needs a 5 to 5.5 GB instance.
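
The two steps above reduce to one line of arithmetic; the helper below is a sketch using the article's own numbers (2 million records at about 2 KB each, 30 percent overhead).

```python
def cache_size_gb(working_set_rows, avg_entry_bytes, overhead=0.30):
    """Sizing estimate: working set x average serialized size of each
    entry (key plus value), plus headroom for Redis internals,
    replication buffers, and fragmentation. Returns GiB."""
    raw = working_set_rows * avg_entry_bytes
    return raw * (1 + overhead) / 1024**3
```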

Step 3: Choose the right node type. For caching workloads, memory-optimized nodes (r-family on AWS, Memory Optimized on Azure) give you the best GB-per-dollar. For session stores or rate limiting where throughput matters more than capacity, general-purpose nodes (m-family) provide better CPU.

Step 4: Use reserved pricing. If your Redis cluster runs 24/7 (and it probably does), reserved instances save 30 to 40 percent. On ElastiCache, 1-year all-upfront reservations are almost always worth it. On Azure, reserved capacity works similarly. Memorystore offers committed use discounts on GCP.

Quick Cost Comparison for a 50 GB Caching Cluster

For a 50 GB cluster with high availability (replication enabled), running 3 shards with replicas:

  • ElastiCache: 6x cache.r7g.xlarge (26 GB each) -- approximately $1,440 per month on-demand, $960 with 1-year reserved
  • Azure Cache Premium: P3 (26 GB) x 3 shards with replication -- approximately $2,268 per month, $1,590 with 1-year reserved
  • Memorystore Standard: 3x 17 GB shards with replicas -- approximately $660 per month, $460 with committed use

GCP Memorystore is consistently the most affordable, but it lacks the advanced features (modules, Global Datastore) that ElastiCache and Azure Enterprise offer. Choose based on your requirements, not just price.

Monitoring That Actually Catches Problems

The four metrics that matter for Redis caching:

  • Hit rate. A healthy cache has a hit rate above 90 percent. Below 80 percent, your cache is not providing enough value to justify its cost. Check your TTLs, eviction policy, and whether you are caching the right data.
  • Eviction rate. A sustained eviction rate above 100 keys per second usually means you need a larger instance. Occasional spikes during batch operations are fine.
  • Memory utilization. Keep this below 80 percent of the node's maxmemory. Above 80 percent, Redis starts evicting aggressively and performance becomes unpredictable. Above 95 percent with noeviction policy, Redis returns OOM errors.
  • Connection count. Redis handles connections efficiently, but each connection uses memory. If your connection count is growing unbounded, you likely have a connection leak in your application. ElastiCache defaults to a 65,000 connection limit per node.
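
Hit rate is derived from two cumulative counters Redis exposes through INFO stats (with redis-py, the dict returned by `r.info("stats")`); a small helper makes the calculation explicit. Counters reset on restart or CONFIG RESETSTAT, so compute the rate over deltas between samples when trending it.

```python
def hit_rate(stats: dict) -> float:
    """Cache hit rate from the keyspace_hits / keyspace_misses counters
    in Redis INFO stats output. Returns 0.0 when there is no traffic."""
    hits = stats.get("keyspace_hits", 0)
    misses = stats.get("keyspace_misses", 0)
    total = hits + misses
    return hits / total if total else 0.0
```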

Common Production Mistakes

After troubleshooting Redis issues across dozens of production environments, these are the mistakes I see most often.

  • Using Redis as a primary database. Redis is a cache and a session store. It is not a primary database. If your application cannot function when Redis is down, you have a Redis dependency, not a Redis cache. Design your application to degrade gracefully when the cache is unavailable -- slower, but functional.
  • Not setting maxmemory-policy. The default noeviction policy causes Redis to return errors when full. Set allkeys-lfu or allkeys-lru and configure maxmemory to 75 percent of available node memory.
  • Storing large objects. Redis is single-threaded, so values larger than 100 KB add latency for every client while they are being read, written, and transmitted. If you are caching large JSON blobs, consider compressing them (gzip reduces JSON by 70 to 90 percent) or splitting them into smaller keys.
  • Not using connection pooling. Creating a new Redis connection per request adds 1 to 2 milliseconds of overhead and wastes memory. Use a connection pool with 10 to 50 connections per application instance.
  • Skipping encryption in transit. All three managed Redis services support TLS. Enable it. The performance impact is typically less than 5 percent, and unencrypted Redis traffic in a shared VPC is a security audit finding waiting to happen.

The thundering herd during deployments

Rolling deployments restart application instances, which drops cached connections and local in-process caches. When all instances come up simultaneously with cold local caches, they all hit Redis, which then hits the database for cache misses. To mitigate this, stagger your deployment so instances restart one at a time, and implement a cache warmup step that pre-populates critical keys on application startup.

Choosing Between Clouds

If you are already running on one cloud, use that cloud's managed Redis. The network latency benefit of having your cache in the same VPC as your application far outweighs any feature or cost differences between providers.

If you are starting fresh or running multi-cloud: ElastiCache is the safest choice with the most features and the largest community knowledge base. Azure Cache Enterprise is compelling if you need Redis modules. Memorystore wins on cost but lags on features.

Regardless of which provider you choose, the fundamentals are the same: choose the right caching pattern for your consistency requirements, set an appropriate eviction policy, prevent cache stampedes with TTL jitter and locking, right-size your cluster based on your working set, and monitor hit rate and eviction rate to catch problems before users do.

Written by CloudToolStack Team

Cloud architects with 15+ years of production experience across AWS, Azure, GCP, and OCI. We build free tools and write practical guides to help engineers navigate multi-cloud infrastructure.

Disclaimer: This article is for informational purposes. Cloud services and pricing change frequently; always verify with official provider documentation. AWS, Azure, GCP, and OCI are trademarks of their respective owners.