DynamoDB Design Patterns: Single-Table, GSI Overloading, and When to Use What
Production-tested DynamoDB patterns for single-table design, GSI overloading, capacity optimization, and real examples from e-commerce, user profiles, and IoT.
DynamoDB Is Not a Relational Database (Stop Treating It Like One)
The single biggest mistake I see teams make with DynamoDB is approaching it with a relational mindset. They normalize their data, create multiple tables, and then wonder why their queries are slow and expensive. DynamoDB is a fundamentally different tool. It is optimized for known access patterns at any scale, not for ad-hoc queries across normalized data. If you design your DynamoDB tables the way you design PostgreSQL schemas, you will end up with a slow, expensive database that is harder to use than the relational database you could have picked instead.
This guide covers the design patterns that actually work in production: single-table design, GSI overloading, capacity mode selection, and cost optimization. We will use real examples from e-commerce, user profile management, and IoT time-series workloads to illustrate each pattern. The goal is to help you decide when DynamoDB is the right choice and, once you have committed to it, how to design your data model so it performs well and costs less than you expect.
When DynamoDB Is the Right Choice
DynamoDB excels in specific scenarios and is a poor fit for others. Understanding this distinction saves months of frustration.
DynamoDB is the right choice when you have well-defined access patterns that are known at design time, when you need consistent single-digit millisecond latency at any scale, when your workload is write-heavy or has high throughput requirements, and when you want operational simplicity -- no patching, no failover management, no connection pool tuning.
DynamoDB is the wrong choice when you need complex joins across multiple entity types for analytics, when your access patterns are unpredictable or frequently changing, when you need full-text search (use OpenSearch instead), or when your data model has deep, many-to-many relationships that require traversal (consider Neptune or a relational database). It is also a poor fit for workloads under 25 GB with simple CRUD operations -- RDS or even SQLite on EFS will be simpler and cheaper.
Single-Table Design: The Core Pattern
Single-table design stores multiple entity types in one DynamoDB table, using generic partition key (PK) and sort key (SK) attributes to support all access patterns. This sounds wrong to anyone with a relational background, but it is the recommended approach from DynamoDB's own team, and the reasons are practical.
First, capacity is managed per table: provisioned throughput must be configured, scaled, and monitored for each table, so fewer tables means simpler capacity management. Second, related entities stored under the same partition key can be fetched together in a single Query -- something multiple tables cannot do without multiple round trips. (DynamoDB transactions can technically span tables in the same account and Region, but they are limited to 100 items and consume double capacity, so they are no substitute for good key design.) Third, Global Secondary Indexes (GSIs) project data from the base table, and the default quota is 20 GSIs per table. Using one table with well-designed GSIs is more flexible than using multiple tables each with their own GSIs.
E-Commerce Example
Consider an e-commerce application with three entity types: customers, orders, and order items. In a relational database, these would be three normalized tables with foreign keys. In DynamoDB single-table design, all three entities live in one table:
- Customer: PK = "CUSTOMER#12345", SK = "PROFILE"
- Order: PK = "CUSTOMER#12345", SK = "ORDER#2024-03-15#98765"
- Order Item: PK = "ORDER#98765", SK = "ITEM#001"
This structure supports the primary access patterns efficiently:
- Get customer profile: GetItem with PK = "CUSTOMER#12345", SK = "PROFILE"
- List customer orders (newest first): Query with PK = "CUSTOMER#12345", SK begins_with "ORDER#", ScanIndexForward = false
- Get all items in an order: Query with PK = "ORDER#98765", SK begins_with "ITEM#"
- Get customer profile and recent orders in one call: Query with PK = "CUSTOMER#12345" (returns both the profile and all orders)
Notice how the sort key for orders includes the date in ISO format. This makes chronological ordering natural -- DynamoDB sorts string sort keys lexicographically, and ISO dates sort correctly as strings. The order ID at the end ensures uniqueness.
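As a minimal sketch of this key scheme (the helper names are ours, not part of any SDK), the lexicographic-ordering property is easy to verify locally:

```python
from datetime import date

# Hypothetical key-builder helpers for the scheme above. DynamoDB sorts
# string sort keys lexicographically, so ISO dates sort chronologically.
def customer_pk(customer_id: str) -> str:
    return f"CUSTOMER#{customer_id}"

def order_sk(order_date: date, order_id: str) -> str:
    # ISO date first so a plain string sort yields chronological order;
    # the order ID at the end guarantees uniqueness
    return f"ORDER#{order_date.isoformat()}#{order_id}"

sks = [
    order_sk(date(2024, 3, 15), "98765"),
    order_sk(date(2024, 1, 2), "11111"),
    order_sk(date(2024, 7, 4), "55555"),
]
# sorted() reproduces the order DynamoDB returns with ScanIndexForward=True
print(sorted(sks))
```

Reversing the sort (ScanIndexForward = false) gives the "newest first" listing from the access patterns above.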
GSI Overloading: Enabling Additional Access Patterns
Your base table's primary key supports one access pattern efficiently. Global Secondary Indexes enable additional access patterns by projecting the same data with different key structures. GSI overloading takes this further: you reuse generic GSI key attributes (GSI1PK, GSI1SK) across entity types to serve multiple access patterns from a single GSI.
Continuing the E-Commerce Example
Suppose you need these additional access patterns: list orders by status (for fulfillment dashboard), look up orders by order ID (for customer service), and list all orders in a date range (for reporting).
Add a GSI with GSI1PK and GSI1SK attributes:
- Order entity: GSI1PK = "STATUS#SHIPPED", GSI1SK = "2024-03-15#ORDER#98765"
- Order entity: GSI1PK = "ORDER#98765", GSI1SK = "ORDER#98765" (for direct lookup)
Wait -- an item can only have one value for GSI1PK. You need to choose which access pattern is more important for this GSI, or add a second GSI (GSI2PK, GSI2SK) for the other pattern. This is where the design requires careful thought about access pattern priority.
In practice, the order-by-status pattern is more valuable as a GSI because it requires a Query operation (list all shipped orders). The order-by-ID lookup can be handled by storing the order ID in a well-known GSI key pattern. The date-range query for reporting is typically better served by exporting data to S3 and querying with Athena, since reporting queries do not need single-digit millisecond latency.
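To make the overloaded-key idea concrete, here is a sketch (attribute names GSI1PK/GSI1SK follow the text; the record-builder function and the in-memory "query" are our illustration, not a DynamoDB API):

```python
def order_record(customer_id: str, order_id: str, order_date: str, status: str) -> dict:
    """Build an order item whose GSI1 keys serve the orders-by-status pattern."""
    return {
        "PK": f"CUSTOMER#{customer_id}",
        "SK": f"ORDER#{order_date}#{order_id}",
        "GSI1PK": f"STATUS#{status}",                 # GSI partition: one per status
        "GSI1SK": f"{order_date}#ORDER#{order_id}",   # GSI sort: chronological within status
        "status": status,
    }

orders = [
    order_record("12345", "98765", "2024-03-15", "SHIPPED"),
    order_record("12345", "98766", "2024-03-16", "PENDING"),
    order_record("67890", "98767", "2024-03-14", "SHIPPED"),
]

# In-memory stand-in for: Query on GSI1 with GSI1PK = "STATUS#SHIPPED"
shipped = sorted(
    (o for o in orders if o["GSI1PK"] == "STATUS#SHIPPED"),
    key=lambda o: o["GSI1SK"],
)
print([o["SK"] for o in shipped])
```

Note that the GSI partitions orders from all customers by status, which is exactly what a fulfillment dashboard needs.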
GSI design rule of thumb
Design your GSIs for your highest-throughput read access patterns. Low-frequency queries (reporting, admin dashboards, one-off lookups) are often better served by Scans with filters, PartiQL, or exporting to an analytics system. Do not burn GSI slots on patterns that run a few times per day.
User Profile Pattern: Sparse GSIs
Consider a user management system where users can be looked up by user ID (primary), email address, or phone number. Not all users have phone numbers, and some users have multiple email addresses.
The base table uses PK = "USER#uuid" and SK = "PROFILE" for the primary user record. Create a GSI on an "email" attribute. Only items that have the email attribute will appear in the GSI -- this is a sparse GSI, and it is one of DynamoDB's most useful features. The GSI only contains the items you care about, keeping it lean and cheap.
For users with multiple emails, store each email as a separate item: PK = "USER#uuid", SK = "EMAIL#jeff@example.com", with the email attribute set to "jeff@example.com". The GSI on the email attribute lets you query by email, and the base table Query by PK lets you list all emails for a user. This pattern handles the one-to-many relationship without a join.
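A small sketch of the item layout (the `user_items` helper is hypothetical) shows why the GSI stays sparse -- only items carrying the `email` attribute are projected into it:

```python
def user_items(user_id: str, name: str, emails: list[str]) -> list[dict]:
    """One profile item plus one item per email address for a user."""
    items = [{"PK": f"USER#{user_id}", "SK": "PROFILE", "name": name}]
    for addr in emails:
        # Only these items carry the "email" attribute, so only they
        # appear in a sparse GSI keyed on "email". The profile item
        # omits the attribute and stays out of the index entirely.
        items.append({"PK": f"USER#{user_id}", "SK": f"EMAIL#{addr}", "email": addr})
    return items

items = user_items("uuid-1", "Jeff", ["jeff@example.com", "j@example.org"])
print([i["SK"] for i in items if "email" in i])  # items visible in the sparse GSI
```

A base-table Query on PK = "USER#uuid-1" with SK begins_with "EMAIL#" lists all of a user's addresses; a GSI Query on the email value resolves an address back to its user.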
IoT Time-Series Pattern
Time-series data from IoT devices is one of DynamoDB's strongest use cases, but it requires careful partitioning to avoid hot partitions. The naive approach of PK = "DEVICE#123" and SK = timestamp works for devices that write infrequently, but for devices writing multiple times per second, you will hit the partition throughput limit of 1,000 WCU per partition.
Time-Based Partition Bucketing
The proven pattern for high-throughput time-series is to bucket partitions by time period:
- PK: "DEVICE#123#2024-03-15-14" (device ID + hour bucket)
- SK: "2024-03-15T14:30:45.123Z" (full ISO timestamp)
This distributes writes across hourly partitions. To query the last 24 hours of data for a device, you issue 24 parallel Query operations (one per hour bucket) and merge the results client-side. This is more code than a single query, but each individual query is fast and the parallel execution keeps total latency low.
For retention management, use DynamoDB's Time To Live (TTL) feature. Add a "ttl" attribute with a Unix timestamp, and DynamoDB will automatically delete expired items within 48 hours of the TTL time (usually much sooner). TTL deletes are free -- they do not consume write capacity units. For IoT data with a 30-day retention requirement, set the TTL to 30 days from the write time, and you never need to run cleanup jobs.
Capacity Modes: On-Demand vs. Provisioned
DynamoDB offers two capacity modes, and choosing the wrong one is one of the most common cost mistakes.
On-Demand Mode
On-demand mode charges per request: $1.25 per million write request units and $0.25 per million read request units (US East pricing). There is no capacity planning, no throttling from under-provisioned capacity, and no wasted spend from over-provisioned capacity. It scales automatically from zero to millions of requests per second.
On-demand is the right choice for: new tables where you do not know the traffic pattern yet, workloads with unpredictable spikes (event-driven architectures, viral content), development and staging environments, and tables with low or zero traffic most of the time.
Provisioned Mode with Auto Scaling
Provisioned mode charges per hour for provisioned capacity: approximately $0.00065 per WCU-hour and $0.00013 per RCU-hour. With auto scaling enabled, DynamoDB adjusts provisioned capacity based on actual utilization, targeting a utilization percentage you specify (typically 70 percent).
Provisioned mode is cheaper than on-demand for predictable workloads. A table that consistently handles 1,000 WCU costs approximately $0.65 per hour provisioned ($468/month) versus $1.25 per million writes on-demand. If you are doing 1,000 writes per second consistently, that is 2.59 billion writes per month, costing $3,240 on-demand. The provisioned cost is 86 percent less.
The breakeven point is roughly: if your table has consistent traffic for more than 18 to 20 hours per day, provisioned mode with auto scaling is cheaper. If traffic is bursty with long idle periods, on-demand wins.
Reserved capacity for big savings
For tables with predictable baseline throughput, DynamoDB Reserved Capacity provides an additional 53 to 76 percent discount over provisioned pricing with a one-year or three-year commitment. A table that costs $468/month provisioned drops to roughly $160/month with a three-year reserved commitment. The catch: reserved capacity commits to a specific number of WCU/RCU in a specific region and cannot be adjusted.
Cost Optimization Strategies
1. Use Eventually Consistent Reads
Eventually consistent reads cost half as much as strongly consistent reads (0.5 RCU per 4 KB versus 1 RCU per 4 KB). For most read workloads -- product catalogs, user profiles, dashboards -- eventual consistency is fine. Data is typically consistent within milliseconds. Only use strongly consistent reads when you absolutely need read-after-write consistency, such as immediately after a financial transaction.
2. Project Only Needed Attributes
DynamoDB charges read capacity based on the size of the item read, not the attributes returned. A ProjectionExpression reduces network transfer but not RCU cost. However, for GSIs, choosing which attributes to project (KEYS_ONLY, INCLUDE specific attributes, or ALL) directly affects GSI storage cost and write throughput. Project only the attributes you need in your GSI -- every extra attribute increases storage cost and write amplification.
3. Batch Operations
BatchGetItem retrieves up to 100 items in a single API call, and BatchWriteItem processes up to 25 puts or deletes. Use these for bulk operations instead of individual GetItem/PutItem calls. The RCU/WCU cost is identical -- both modes bill per item, not per API call -- but batching cuts network round trips and client-side latency dramatically, which matters for any bulk load or fan-out read.
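Since BatchWriteItem caps each request at 25 items, bulk writes need chunking; a sketch (the generator is ours -- each chunk would be passed to a client call such as boto3's `batch_write_item`, with any unprocessed items from the response retried with backoff):

```python
from itertools import islice

BATCH_WRITE_LIMIT = 25  # BatchWriteItem maximum items per request

def batches(items: list[dict], size: int = BATCH_WRITE_LIMIT):
    """Yield chunks sized for BatchWriteItem."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk

chunks = list(batches([{"n": i} for i in range(60)]))
print([len(c) for c in chunks])  # [25, 25, 10]
```

Always check the UnprocessedItems field in the real response -- under throttling, DynamoDB may accept only part of a batch.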
4. Use DAX for Read-Heavy Workloads
DynamoDB Accelerator (DAX) is an in-memory cache that sits in front of DynamoDB. For read-heavy workloads (10:1 read-to-write ratio or higher), DAX can reduce read costs by 90 percent or more while delivering microsecond response times. A dax.t3.small node costs approximately $0.04 per hour ($29/month). If your table's read costs exceed $100/month and your read patterns have temporal locality (the same items are read repeatedly), DAX almost certainly pays for itself.
5. Compress Large Items
DynamoDB items can be up to 400 KB, but larger items consume more RCU/WCU. If you store JSON blobs, logs, or other compressible data, gzip-compress the attribute value before writing and decompress after reading. This can reduce item sizes by 60 to 80 percent, directly reducing capacity consumption. Store the compressed data as a Binary attribute type.
Common Anti-Patterns to Avoid
Scan as a Query Strategy
A Scan reads every item in the table and filters client-side. For a table with 10 million items, a full scan reads all 10 million items and charges you for every single read, even if the filter returns only 100 items. If you find yourself writing Scans with FilterExpressions, your table design does not support the access pattern you need. Add a GSI, restructure your keys, or export the data to a system designed for analytical queries.
Too Many Tables
Teams from relational backgrounds create one table per entity type: Users table, Orders table, Products table, Inventory table. This creates several problems: transactions cannot span tables, capacity management multiplies by the number of tables, and you miss opportunities for efficient single-query access patterns. Start with single-table design and only split into multiple tables when you have genuinely independent access patterns with no transactional relationship.
Low-Cardinality Partition Keys
Using a high-cardinality attribute like a UUID as the partition key is generally good for distribution. But using a low-cardinality attribute like "status" (with values like "active" and "inactive") creates hot partitions. If 95 percent of your items have status = "active", 95 percent of your traffic hits the same partition. Design partition keys to distribute traffic evenly.
Practical Recommendations
Start every DynamoDB project by listing your access patterns. Write them down as concrete queries: "Get user by ID," "List orders for user sorted by date," "Find all orders with status PENDING." Design your table keys and GSIs to support these patterns, then verify with sample data before writing application code. If you cannot support a critical access pattern with Query operations (as opposed to Scans), reconsider whether DynamoDB is the right database for this workload.
Use on-demand capacity for new tables. Switch to provisioned with auto scaling once your traffic patterns are stable and predictable, typically after 4 to 8 weeks in production. Enable DynamoDB Streams if you need change data capture for downstream processing, and use TTL for any data with a natural expiration to avoid paying to store data you no longer need.
Finally, do not over-invest in single-table design for simple applications. If your application has three entity types with straightforward CRUD operations and no transactional requirements across entities, three simple tables with clear keys are easier to understand and maintain than a single table with complex key overloading. Single-table design pays off at scale and complexity -- for a small service with two developers, simplicity is more valuable than optimization.
The access pattern spreadsheet
Before writing any DynamoDB code, create a spreadsheet with columns for access pattern name, query type (GetItem/Query/Scan), key condition, filter condition, and expected throughput. Fill in every access pattern your application needs. This spreadsheet becomes your table design document and your test plan. If you cannot fill it in, you do not understand your data model well enough to build it.
Written by CloudToolStack Team
Cloud architects with 15+ years of production experience across AWS, Azure, GCP, and OCI. We build free tools and write practical guides to help engineers navigate multi-cloud infrastructure.
Disclaimer: This article is for informational purposes. Cloud services and pricing change frequently; always verify with official provider documentation. AWS, Azure, GCP, and OCI are trademarks of their respective owners.