Cosmos DB Consistency Levels
Understand the five Cosmos DB consistency levels and their trade-offs for global distribution.
Prerequisites
- Azure subscription with Cosmos DB access
- Understanding of distributed database concepts
- Basic knowledge of CAP theorem
Cosmos DB Consistency Levels Explained
Azure Cosmos DB is a globally distributed, multi-model database service designed for applications that need single-digit millisecond response times at any scale, anywhere in the world. One of its most distinctive and powerful features is the ability to choose from five well-defined consistency levels, each representing a different trade-off between data freshness, availability, latency, and throughput.
Understanding these consistency levels is critical for building applications that behave correctly while maximizing performance and minimizing cost. Choosing the wrong consistency level can result in applications that show stale data (causing user confusion or business logic errors), or applications that pay a significant performance and cost penalty for guarantees they don't actually need. This guide explains each level in depth, provides practical code examples, and offers a framework for selecting the right consistency level for your specific use case.
The Consistency Spectrum
Traditional databases typically offer only two options: strong consistency (relational databases with synchronous replication) or eventual consistency (most NoSQL databases). Cosmos DB fills in the spectrum with three additional levels between these extremes, giving you fine-grained control over the consistency-performance trade-off.
| Level | Guarantee | Read Latency | Write Latency | Throughput | Read RU Cost |
|---|---|---|---|---|---|
| Strong | Linearizable (always latest committed write) | Highest | Highest | Lowest | 2x |
| Bounded Staleness | Reads lag by at most K versions or T time | High | High | Low | 2x |
| Session | Read-your-own-writes within a session | Low | Low | High | 1x |
| Consistent Prefix | Reads never see out-of-order writes | Low | Low | High | 1x |
| Eventual | No ordering guarantee; convergence over time | Lowest | Lowest | Highest | 1x |
Session Consistency Is the Default
Cosmos DB defaults to Session consistency, and Microsoft recommends it as the starting point for most applications. Session consistency provides read-your-own-writes within a client session (using a session token), which is what most applications actually need. It keeps latency low, throughput high, and reads cost 1x RU, half the cost of Strong or Bounded Staleness reads. Unless you have a specific requirement for stronger or weaker guarantees, start with Session.
Strong Consistency
Strong consistency provides linearizable reads. A read is guaranteed to return the most recent committed write. This is the equivalent of a single-region, single-replica database in terms of data freshness: readers always see the latest state, regardless of which replica they connect to.
How Strong Consistency Works
When you read with Strong consistency, Cosmos DB contacts a quorum of replicas to determine the latest committed version before returning the result. This quorum check adds latency and consumes additional RUs, but guarantees that you never read stale data.
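The quorum read can be pictured with a toy simulation. This is not the actual Cosmos DB replication protocol (replica counts, version numbers, and quorum sizes here are illustrative), but it shows why a majority read always observes the latest committed write: any read majority overlaps any write majority.

```javascript
// Toy model: each replica holds a value plus a monotonically
// increasing version (analogous to a log sequence number).
// A strong read consults a majority quorum and returns the
// value with the highest version seen within that quorum.
function quorumRead(replicas) {
  const quorumSize = Math.floor(replicas.length / 2) + 1;
  const quorum = replicas.slice(0, quorumSize); // any majority works
  return quorum.reduce((latest, r) =>
    r.version > latest.version ? r : latest
  );
}

// Version 7 was committed by a write majority (3 of 5 replicas),
// so only 2 replicas can still lag at version 6. Every read
// majority of 3 must therefore include at least one v7 replica.
const replicas = [
  { version: 7, value: "v7" },
  { version: 6, value: "v6" }, // lagging replica
  { version: 7, value: "v7" },
  { version: 6, value: "v6" }, // lagging replica
  { version: 7, value: "v7" },
];

console.log(quorumRead(replicas).value); // "v7"
```

This overlap between read and write majorities is what the extra latency and the 2x read RU charge pay for.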
Trade-offs and Constraints
- 2x RU cost for reads: Each read operation consumes twice the RUs compared to Session, Consistent Prefix, or Eventual consistency. For read-heavy workloads, this doubles your throughput cost.
- Higher write latency: Writes must be replicated to a quorum of replicas before being acknowledged, increasing write latency compared to weaker consistency levels.
- Not available with multi-region writes: Strong consistency is only available with single-region write configurations. If you enable multi-region writes, the strongest available option is Bounded Staleness.
- Reduced availability during outages: Quorum reads may fail during regional outages because the required number of replicas may not be reachable.
When to Use Strong Consistency
- Financial transactions where reading stale data could cause incorrect calculations
- Inventory management where overselling must be prevented
- Leader election or distributed locking scenarios
- Any application where correctness is more important than performance
# Set account-level consistency to Strong
az cosmosdb update \
--name mycosmosdb \
--resource-group myRG \
--default-consistency-level Strong
# Verify the consistency configuration
az cosmosdb show \
--name mycosmosdb \
--resource-group myRG \
--query '{ConsistencyLevel:consistencyPolicy.defaultConsistencyLevel, MaxStalenessPrefix:consistencyPolicy.maxStalenessPrefix, MaxIntervalInSeconds:consistencyPolicy.maxIntervalInSeconds}' \
--output table
# Note: Strong consistency requires single-region writes
# Verify write region configuration
az cosmosdb show \
--name mycosmosdb \
--resource-group myRG \
--query '{EnableMultipleWriteLocations:enableMultipleWriteLocations, WriteLocations:writeLocations[].locationName, ReadLocations:readLocations[].locationName}' \
--output json
Bounded Staleness
Bounded Staleness guarantees that reads lag behind writes by at most K versions (operations) or T time interval, whichever is reached first. Within the write region, Bounded Staleness behaves identically to Strong consistency. In secondary (read) regions, the staleness bound applies, meaning reads may be up to K versions or T seconds behind the latest write.
Configuring Staleness Bounds
The staleness bound is configured with two parameters:
- maxStalenessPrefix (K): Maximum number of versions (write operations) by which reads can lag behind writes. For multi-region accounts, Microsoft recommends at least 100,000.
- maxIntervalInSeconds (T): Maximum time in seconds that reads can lag. For multi-region accounts, Microsoft recommends at least 300 seconds (5 minutes).
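The interaction of the two bounds can be sketched with a small helper (illustrative only; the real enforcement happens inside the replication protocol). A secondary-region read satisfies Bounded Staleness only while its lag stays under both K operations and T seconds:

```javascript
// Toy check: does a secondary replica's lag still satisfy the
// configured Bounded Staleness window? The bound is violated as
// soon as EITHER limit (K versions or T seconds) is exceeded,
// at which point reads must wait for the replica to catch up.
function withinStalenessBounds(lag, bounds) {
  return (
    lag.versionsBehind <= bounds.maxStalenessPrefix &&
    lag.secondsBehind <= bounds.maxIntervalInSeconds
  );
}

const multiRegionBounds = { maxStalenessPrefix: 100000, maxIntervalInSeconds: 300 };

console.log(withinStalenessBounds({ versionsBehind: 500, secondsBehind: 12 }, multiRegionBounds));    // true
console.log(withinStalenessBounds({ versionsBehind: 500, secondsBehind: 301 }, multiRegionBounds));   // false: time bound exceeded
console.log(withinStalenessBounds({ versionsBehind: 100001, secondsBehind: 12 }, multiRegionBounds)); // false: version bound exceeded
```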
# Set consistency to Bounded Staleness with recommended multi-region values
az cosmosdb update \
--name mycosmosdb \
--resource-group myRG \
--default-consistency-level BoundedStaleness \
--max-staleness-prefix 100000 \
--max-interval 300
# For single-region accounts, tighter bounds are acceptable:
# --max-staleness-prefix 10 --max-interval 5
# This means reads are at most 10 versions or 5 seconds behind
# Bounded Staleness is the strongest level compatible with multi-region writes
az cosmosdb update \
--name mycosmosdb \
--resource-group myRG \
--enable-multiple-write-locations true \
--default-consistency-level BoundedStaleness
Bounded Staleness Cost Impact
Like Strong consistency, Bounded Staleness charges 2x RU for reads because Cosmos DB must check version information across replicas. In multi-region deployments with heavy read traffic, this can significantly increase costs. Model your read/write ratio carefully before choosing this level. If you need strong guarantees only within a single user session (which covers most web and mobile applications), Session consistency achieves this at 1x RU cost, a 50% savings on reads.
Session Consistency
Session consistency is the sweet spot for the vast majority of applications. It guarantees that within a single client session (identified by a session token), a client will always see its own writes and all writes that happened before its own writes. Other clients may see slightly stale data, but individual users experience fully consistent behavior within their own session.
Why Session Is Usually the Right Choice
Consider a typical web application: User A updates their profile name. With Session consistency, User A immediately sees their new name on the next page load (read-your-own-writes). User B might see the old name for a brief moment, but this is acceptable in virtually all real-world scenarios. User B doesn't know User A just changed their name and won't notice the brief delay.
How Session Tokens Work
- When a client performs a write, Cosmos DB returns a session token in the x-ms-session-token response header.
- The client passes this token with subsequent read requests to ensure it reads at least the data from its own writes.
- Azure SDKs (Java, .NET, Python, JavaScript) handle session token management automatically when using a single CosmosClient instance.
- In distributed architectures where reads and writes may go through different service instances (e.g., a microservices architecture with separate read and write services), you must propagate the session token explicitly between services.
const { CosmosClient } = require("@azure/cosmos");
const client = new CosmosClient({
endpoint: process.env.COSMOS_ENDPOINT,
key: process.env.COSMOS_KEY,
consistencyLevel: "Session"
});
const container = client.database("mydb").container("orders");
// --- Single Service Pattern (automatic token management) ---
// When using a single CosmosClient instance, session tokens
// are managed automatically by the SDK
async function createAndReadOrder() {
// Write an order
const { resource: order } = await container.items.create({
id: "order-123",
customerId: "cust-456",
total: 99.99,
status: "pending"
});
// This read is guaranteed to see the write above
// (SDK passes session token automatically)
const { resource: readOrder } = await container
.item("order-123", "cust-456")
.read();
console.log(readOrder.status); // Guaranteed: "pending"
}
// --- Distributed Service Pattern (manual token propagation) ---
// When reads and writes happen in different service instances,
// pass the session token explicitly
async function writeOrder(orderData) {
const { resource, headers } = await container.items.create(orderData);
const sessionToken = headers["x-ms-session-token"];
// Return the session token to the caller (e.g., via HTTP header)
return { order: resource, sessionToken };
}
async function readOrder(orderId, partitionKey, sessionToken) {
// Pass the session token from the write service
const { resource } = await container
.item(orderId, partitionKey)
.read({ sessionToken });
return resource; // Guaranteed to see the write
}
const express = require("express");
const { CosmosClient } = require("@azure/cosmos");
const app = express();
const client = new CosmosClient({ /* config */ });
const container = client.database("mydb").container("orders");
// POST /orders - Create an order and return session token
app.post("/orders", async (req, res) => {
const { resource, headers } = await container.items.create(req.body);
// Pass session token back to the client via response header
res.set("x-cosmos-session-token", headers["x-ms-session-token"]);
res.json(resource);
});
// GET /orders/:id - Read with optional session token for consistency
app.get("/orders/:id", async (req, res) => {
const options = {};
// If client passes a session token, use it for read-your-own-writes
const sessionToken = req.headers["x-cosmos-session-token"];
if (sessionToken) {
options.sessionToken = sessionToken;
}
const { resource } = await container
.item(req.params.id, req.query.partitionKey)
.read(options);
res.json(resource);
});
Session Token in Web Applications
In a typical web application with a single backend, the Cosmos DB SDK manages session tokens automatically, so no extra code is needed. You only need manual session token propagation in specific architectures: CQRS patterns with separate read/write services, serverless functions where each invocation may use a different SDK instance, or load-balanced services where requests from the same user may hit different backends. In these cases, pass the session token through HTTP headers or a shared cache (Redis).
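For the shared-cache approach, one simple pattern is a per-user token store. The sketch below uses an in-memory Map as a stand-in for Redis; the store, the user IDs, and the token value are illustrative, not part of the Cosmos SDK:

```javascript
// Stand-in for a shared cache such as Redis: maps userId -> the
// session token from that user's most recent write. Any backend
// instance can then attach the token to reads for that user.
const tokenStore = new Map();

function saveSessionToken(userId, token) {
  tokenStore.set(userId, token);
}

function readOptionsFor(userId) {
  const sessionToken = tokenStore.get(userId);
  // Only attach the option when a token exists; without one the
  // read simply uses the account's default behavior.
  return sessionToken ? { sessionToken } : {};
}

// After a write handled by backend instance A:
saveSessionToken("user-42", "0:-1#105"); // token is an opaque string; value is made up

// A later read handled by backend instance B picks the token up:
console.log(readOptionsFor("user-42")); // { sessionToken: "0:-1#105" }
console.log(readOptionsFor("user-99")); // {}
```

In production you would also set a TTL on the cache entry, since session tokens only matter for the short window in which a user might read their own recent write.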
Consistent Prefix
Consistent Prefix guarantees that reads never see out-of-order writes. If writes occur in the order A, B, C, a reader will see one of: nothing, A, A+B, or A+B+C. It will never see A+C (skipping B) or B+A (reordered). This is a weaker guarantee than Session (no read-your-own-writes) but stronger than Eventual (which has no ordering guarantee).
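The guarantee can be restated as: every observable state is a prefix of the global write order. A tiny checker makes this concrete (illustrative, not an SDK API):

```javascript
// Consistent Prefix: a replica's observed state must be some
// prefix of the global write order, never a reordering or a gap.
function isValidPrefixState(writeOrder, observed) {
  return observed.every((item, i) => item === writeOrder[i]);
}

const writes = ["A", "B", "C"];

console.log(isValidPrefixState(writes, []));         // true  (nothing replicated yet)
console.log(isValidPrefixState(writes, ["A", "B"])); // true  (a prefix)
console.log(isValidPrefixState(writes, ["A", "C"])); // false (skips B)
console.log(isValidPrefixState(writes, ["B", "A"])); // false (reordered)
```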
Use Cases for Consistent Prefix
- Activity feeds and timelines: Social media feeds where the order of posts matters but a slight delay in seeing the latest post is acceptable.
- Event sourcing: Event streams where consumers must process events in order but do not need to see the latest event immediately.
- Change data capture: Systems that read change feeds for data synchronization where ordering is essential but real-time freshness is not.
- Multi-step workflows: Where workflow state transitions must be observed in order (e.g., Pending → Processing → Completed, never Pending → Completed).
Eventual Consistency
Eventual consistency provides no ordering guarantees whatsoever. Replicas will eventually converge to the same state, but a read at any given moment may return any previously acknowledged write, possibly an older version. This level offers the lowest latency and highest throughput because reads can be served from any replica without checking version information.
When Eventual Is Appropriate
- Counters and aggregates: View counts, like counts, and statistics where approximate values are acceptable
- Non-critical telemetry: IoT sensor readings, application metrics, and log data
- Social media interactions: Likes, reactions, and shares where slight inconsistency is invisible to users
- Search indexes: Full-text search or recommendation systems that tolerate brief staleness
- Analytics queries: Analytical workloads that aggregate large datasets where individual record freshness is less important
Eventual Does Not Mean Slow
In practice, eventual consistency converges very quickly, typically within milliseconds for single-region accounts and within a few seconds for multi-region accounts. The term “eventual” refers to the guarantee (or lack thereof), not the actual time it takes. For many read-heavy workloads, Eventual consistency provides effectively the same user experience as Session while maximizing throughput and minimizing cost.
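Convergence can be illustrated with a toy anti-entropy loop: each round, replicas adopt any newer version they see from a neighbor, and after a handful of exchanges all replicas agree. This is a deliberate simplification of real replication, meant only to show why "eventual" arrives quickly:

```javascript
// Toy anti-entropy: each round, every replica adopts its
// neighbor's state if the neighbor has a newer version.
function gossipRound(replicas) {
  for (let i = 0; i < replicas.length; i++) {
    const neighbor = replicas[(i + 1) % replicas.length];
    if (neighbor.version > replicas[i].version) {
      replicas[i] = { ...neighbor };
    }
  }
}

const replicas = [
  { version: 3, value: "new" },
  { version: 1, value: "old" },
  { version: 1, value: "old" },
];

let rounds = 0;
while (new Set(replicas.map(r => r.version)).size > 1) {
  gossipRound(replicas);
  rounds++;
}
console.log(rounds);                                 // 2
console.log(replicas.every(r => r.value === "new")); // true: all replicas converged
```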
Per-Request Consistency Overrides
A powerful but often overlooked feature is the ability to override the default consistency level on individual read requests. The account-level setting is the default, but you can weaken it (never strengthen it) per operation. This enables scenarios where different operations within the same application use different consistency levels based on their specific requirements.
// Account default: Session consistency
// Critical operation: Use account default (Session)
// User reads their own order - needs read-your-own-writes
var orderResponse = await container.ReadItemAsync<Order>(
orderId,
new PartitionKey(customerId)
// No override - uses Session default
);
// Analytics query: Override to Eventual for better throughput
var analyticsOptions = new QueryRequestOptions
{
ConsistencyLevel = ConsistencyLevel.Eventual,
MaxItemCount = 100
};
var analyticsQuery = container.GetItemQueryIterator<PageView>(
"SELECT * FROM c WHERE c.type = 'pageview' AND c.timestamp > @since",
requestOptions: analyticsOptions
);
// Reads at 1x RU cost with maximum throughput
// Feed display: Override to Consistent Prefix
var feedOptions = new QueryRequestOptions
{
ConsistencyLevel = ConsistencyLevel.ConsistentPrefix,
MaxItemCount = 50
};
var feedQuery = container.GetItemQueryIterator<FeedItem>(
"SELECT * FROM c WHERE c.feedId = @feedId ORDER BY c.timestamp DESC",
requestOptions: feedOptions
);
// Guarantees in-order reads without per-session tracking
| Account Default | Can Override To | Cannot Override To |
|---|---|---|
| Strong | Bounded Staleness, Session, Consistent Prefix, Eventual | N/A (strongest level) |
| Bounded Staleness | Session, Consistent Prefix, Eventual | Strong |
| Session | Consistent Prefix, Eventual | Strong, Bounded Staleness |
| Consistent Prefix | Eventual | Strong, Bounded Staleness, Session |
| Eventual | N/A (weakest level) | Strong, Bounded Staleness, Session, Consistent Prefix |
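The override matrix above reduces to a single ordering rule: a per-request level is valid only if it is equal to or weaker than the account default. A small helper captures this (illustrative; the SDK and service perform this validation themselves):

```javascript
// Consistency levels ordered strongest -> weakest. A request may
// only override the account default to an equal or WEAKER level.
const LEVELS = ["Strong", "BoundedStaleness", "Session", "ConsistentPrefix", "Eventual"];

function canOverride(accountDefault, requested) {
  return LEVELS.indexOf(requested) >= LEVELS.indexOf(accountDefault);
}

console.log(canOverride("Session", "Eventual"));         // true  (weakening is allowed)
console.log(canOverride("Session", "Strong"));           // false (strengthening is not)
console.log(canOverride("BoundedStaleness", "Session")); // true
```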
Cost Optimization with Per-Request Overrides
Set your account default to Session (which covers most needs) and use Eventual or Consistent Prefix for specific read operations that do not need session guarantees. This is particularly effective for analytics queries, dashboard displays, and reporting workloads where slightly stale data is acceptable. The 2x RU penalty for Strong and Bounded Staleness reads makes per-request overrides a significant cost optimization lever for mixed workloads.
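Because Strong and Bounded Staleness double the read RU charge, the impact on provisioned throughput is easy to model. The figures below (1 RU per read, 6 RU per write for 1 KB documents) are illustrative estimates, not quoted prices:

```javascript
// Estimate average RU/s for a daily workload, applying the 2x
// read multiplier charged by Strong and Bounded Staleness reads.
function averageRuPerSecond({ readsPerDay, writesPerDay, readRu, writeRu, consistency }) {
  const readMultiplier =
    consistency === "Strong" || consistency === "BoundedStaleness" ? 2 : 1;
  const totalRu = readsPerDay * readRu * readMultiplier + writesPerDay * writeRu;
  return totalRu / 86400; // seconds per day
}

const workload = { readsPerDay: 10e6, writesPerDay: 1e6, readRu: 1, writeRu: 6 };

const strong = averageRuPerSecond({ ...workload, consistency: "Strong" });
const session = averageRuPerSecond({ ...workload, consistency: "Session" });

console.log(Math.round(strong));  // 301 RU/s average
console.log(Math.round(session)); // 185 RU/s average
console.log(((1 - session / strong) * 100).toFixed(0) + "% fewer RU/s"); // "38% fewer RU/s"
```

The read-heavier the workload, the larger this gap grows, since only the read term carries the multiplier.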
Consistency and Global Distribution
Consistency levels become especially important in globally distributed Cosmos DB accounts where data must be replicated across regions. The interaction between consistency choice and multi-region configuration directly affects availability SLAs, write latency, and cost.
Multi-Region Write Compatibility
| Consistency Level | Single-Region Write | Multi-Region Write | Conflict Resolution Needed |
|---|---|---|---|
| Strong | Yes | No (not supported) | N/A |
| Bounded Staleness | Yes | Yes (strongest option) | Yes (LWW or custom) |
| Session | Yes | Yes | Yes (LWW or custom) |
| Consistent Prefix | Yes | Yes | Yes (LWW or custom) |
| Eventual | Yes | Yes | Yes (LWW or custom) |
# Create a multi-region Cosmos DB account with Session consistency
az cosmosdb create \
--name mycosmosdb-global \
--resource-group myRG \
--default-consistency-level Session \
--locations regionName=eastus2 failoverPriority=0 isZoneRedundant=true \
--locations regionName=westeurope failoverPriority=1 isZoneRedundant=true \
--locations regionName=southeastasia failoverPriority=2 isZoneRedundant=true
# Enable multi-region writes (with conflict resolution)
az cosmosdb update \
--name mycosmosdb-global \
--resource-group myRG \
--enable-multiple-write-locations true
# Create a container with Last Write Wins conflict resolution
az cosmosdb sql container create \
--account-name mycosmosdb-global \
--database-name mydb \
--name orders \
--partition-key-path "/customerId" \
--throughput 4000 \
--conflict-resolution-policy '{"mode": "LastWriterWins", "conflictResolutionPath": "/_ts"}'
# Enable autoscale throughput for variable workloads
az cosmosdb sql container throughput migrate \
--account-name mycosmosdb-global \
--database-name mydb \
--name orders \
--throughput-type autoscale
Choosing the Right Consistency Level
| Scenario | Recommended Level | Reasoning |
|---|---|---|
| Financial transactions, inventory | Strong | Cannot tolerate stale reads; correctness is critical |
| Multi-region with strong guarantees | Bounded Staleness | Strongest option compatible with multi-region writes |
| User-facing web/mobile apps | Session | Users see their own changes; best cost/consistency balance |
| Event streams, activity feeds | Consistent Prefix | Order matters; slight delay is acceptable |
| Telemetry, analytics, social likes | Eventual | Stale reads are fine; maximize throughput and minimize cost |
| Mixed workload (API + analytics) | Session (with per-request overrides) | Session for user operations; Eventual for analytics queries |
| Gaming leaderboards | Eventual | Approximate scores are acceptable; high write throughput needed |
| E-commerce product catalog | Session | Sellers see their updates; buyers can tolerate brief delay |
Request Units (RU) and Cost Impact
The consistency level you choose directly impacts your Cosmos DB bill because Strong and Bounded Staleness reads consume 2x RUs compared to Session, Consistent Prefix, and Eventual reads. For read-heavy workloads, this cost difference is substantial.
Scenario: E-commerce application
- 10 million reads per day (1 KB average document)
- 1 million writes per day
- Read RU cost per operation: ~1 RU (for 1 KB document)
- Write RU cost per operation: ~6 RU (for 1 KB document)
Strong/Bounded Staleness reads:
Reads: 10M x 2 RU = 20M RU/day
Writes: 1M x 6 RU = 6M RU/day
Total: 26M RU/day = ~300 RU/s average
Session/Consistent Prefix/Eventual reads:
Reads: 10M x 1 RU = 10M RU/day
Writes: 1M x 6 RU = 6M RU/day
Total: 16M RU/day = ~185 RU/s average
Cost difference: ~38% reduction in provisioned RU/s
At $0.008 per 100 RU/s per hour:
Strong: 300 RU/s = ~$0.58/day = ~$17.50/month
Session: 185 RU/s = ~$0.36/day = ~$10.80/month
Savings: ~$6.70/month (38%)
For applications with higher read-to-write ratios, the savings
from Session vs Strong consistency are even more dramatic.
Autoscale Provisioned Throughput
Use autoscale throughput instead of manual provisioned throughput for workloads with variable traffic patterns. Autoscale automatically adjusts between 10% and 100% of your maximum configured RU/s, and you only pay for the throughput actually consumed. This is particularly effective when combined with the lower RU cost of Session or Eventual consistency, as your effective cost scales down proportionally during low-traffic periods.
Key Takeaways
1. Cosmos DB offers five consistency levels: Strong, Bounded Staleness, Session, Consistent Prefix, and Eventual.
2. Session consistency is the default and most popular, guaranteeing read-your-own-writes per session.
3. Stronger consistency reduces availability and increases latency; weaker consistency improves both.
4. Bounded Staleness provides strong consistency guarantees with configurable lag tolerance.
5. Choose consistency per operation based on application requirements, not as a one-size-fits-all global setting.
6. Multi-region writes support every level except Strong; Strong consistency requires a single write region.
Written by CloudToolStack Team
Cloud engineers and architects with hands-on experience across AWS, Azure, and GCP. We write guides based on real-world production patterns, not just documentation rewrites.
Disclaimer: This guide is for educational purposes. Cloud services change frequently; always refer to official documentation for the latest information. AWS, Azure, and GCP are trademarks of their respective owners.