
VPC Architecture Patterns

Common VPC designs including multi-tier, hub-and-spoke, and transit gateway architectures.

CloudToolStack Team | 28 min read | Published Feb 22, 2026

Understanding VPC Fundamentals

An Amazon Virtual Private Cloud (VPC) is a logically isolated section of the AWS cloud where you launch resources in a virtual network that you define. Think of it as your own private data center within AWS, complete with subnets, route tables, gateways, and security controls. Every VPC spans a single AWS region but can cover all Availability Zones within that region.

You assign a CIDR block (for example, 10.0.0.0/16) when creating a VPC, which defines the IP address space available for your subnets and resources. This seemingly simple decision has lasting consequences: your CIDR allocation affects how many resources you can deploy, whether you can peer VPCs, and how your network integrates with on-premises infrastructure. Getting the network foundation right is critical because a VPC's primary CIDR block cannot be changed after creation. If you outgrow it, your options are adding secondary CIDR blocks or migrating workloads to a new VPC.

This guide explores the most common and effective VPC architecture patterns, from simple single-VPC deployments to complex multi-account, multi-region topologies. Each pattern includes specific guidance on when to use it, how to implement it, and what trade-offs to consider.

CIDR Planning Is Foundational

Plan your CIDR blocks carefully before creating VPCs. Overlapping CIDR ranges prevent VPC peering and Transit Gateway connectivity. Use a /16 for production VPCs (65,536 addresses) and reserve non-overlapping ranges across all environments, accounts, and regions. Create a centralized IP address management plan using Amazon VPC IP Address Manager (IPAM) or, at minimum, a spreadsheet. One common scheme encodes the account and environment in the second and third octets (10.<account>.<env>.0/24). Note that this yields /24 blocks, so when each VPC needs a full /16, encode the allocation in the second octet only (10.<n>.0.0/16).
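
Checking a proposed allocation for overlaps before creating anything is easy to script. The following is a minimal sketch in POSIX shell, assuming well-formed IPv4 CIDR input; the function names (`ip_to_int`, `cidr_bounds`, `cidrs_overlap`) are illustrative helpers, not part of any AWS tool.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Print the first and last address (as integers) of a CIDR block.
cidr_bounds() {
  addr=${1%/*}
  prefix=${1#*/}
  size=$(( 1 << (32 - prefix) ))
  base=$(ip_to_int "$addr")
  start=$(( base / size * size ))   # align down to the block boundary
  echo "$start $(( start + size - 1 ))"
}

# Succeed (exit 0) when two CIDR blocks share at least one address.
cidrs_overlap() {
  set -- $(cidr_bounds "$1") $(cidr_bounds "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

cidrs_overlap 10.0.0.0/16 10.0.128.0/20 && echo "overlap"      # prints "overlap"
cidrs_overlap 10.0.0.0/16 10.1.0.0/16 || echo "no overlap"     # prints "no overlap"
```

Running every pair of planned VPC ranges through a check like this catches conflicts before they block peering or Transit Gateway routing.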

The Three-Tier Architecture

The most common and well-understood VPC pattern separates resources into three tiers: public, private (application), and data. Each tier uses dedicated subnets with different routing and access controls. This separation creates defense-in-depth by ensuring that even if one tier is compromised, the attacker cannot directly access resources in other tiers without traversing additional security controls.

Tier Definitions

| Tier | Subnet Type | Internet Access | Typical Resources |
|---|---|---|---|
| Public | Public subnet (route to IGW) | Direct inbound and outbound via Internet Gateway | ALB, NAT Gateway, Bastion hosts, VPN endpoints |
| Application | Private subnet (route to NAT GW) | Outbound only via NAT Gateway | EC2, ECS tasks, Lambda (VPC-attached), application servers |
| Data | Isolated subnet (no internet route) | None, completely isolated from the internet | RDS, ElastiCache, Redshift, OpenSearch, EFS |

The key principle is that each tier only communicates with adjacent tiers. The public tier receives internet traffic and forwards it to the application tier. The application tier processes requests and reads/writes to the data tier. The data tier never communicates directly with the internet. Security groups enforce these communication boundaries at the instance level.

CloudFormation Implementation

Deploy subnets across at least two Availability Zones for high availability. Each AZ should have its own public, private, and data subnet, which gives you a minimum of six subnets for a production VPC. For three-AZ deployments (recommended for production), you need nine subnets.

three-tier-vpc.yaml
AWSTemplateFormatVersion: '2010-09-09'
Description: Three-tier VPC with public, private, and data subnets across 2 AZs

Resources:
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16
      EnableDnsSupport: true
      EnableDnsHostnames: true
      Tags:
        - Key: Name
          Value: production-vpc

  InternetGateway:
    Type: AWS::EC2::InternetGateway

  IGWAttachment:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      VpcId: !Ref VPC
      InternetGatewayId: !Ref InternetGateway

  # Public Subnets
  PublicSubnetA:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: 10.0.1.0/24
      AvailabilityZone: !Select [0, !GetAZs '']
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          Value: public-a

  PublicSubnetB:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: 10.0.2.0/24
      AvailabilityZone: !Select [1, !GetAZs '']
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          Value: public-b

  # Private (Application) Subnets
  PrivateSubnetA:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: 10.0.10.0/24
      AvailabilityZone: !Select [0, !GetAZs '']
      Tags:
        - Key: Name
          Value: private-a

  PrivateSubnetB:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: 10.0.11.0/24
      AvailabilityZone: !Select [1, !GetAZs '']
      Tags:
        - Key: Name
          Value: private-b

  # Data (Isolated) Subnets
  DataSubnetA:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: 10.0.20.0/24
      AvailabilityZone: !Select [0, !GetAZs '']
      Tags:
        - Key: Name
          Value: data-a

  DataSubnetB:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: 10.0.21.0/24
      AvailabilityZone: !Select [1, !GetAZs '']
      Tags:
        - Key: Name
          Value: data-b

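  # NAT Gateway for private subnets
  # The EIP declares a dependency on the gateway attachment, which CloudFormation
  # requires when the VPC and its Internet Gateway live in the same template.
  NATGatewayEIP:
    Type: AWS::EC2::EIP
    DependsOn: IGWAttachment
    Properties:
      Domain: vpc

  NATGateway:
    Type: AWS::EC2::NatGateway
    Properties:
      AllocationId: !GetAtt NATGatewayEIP.AllocationId
      SubnetId: !Ref PublicSubnetA

  # Note: route tables and routes (IGW for public subnets, NAT Gateway for
  # private subnets, local-only for data subnets) are omitted from this excerpt
  # and must be added before traffic will flow as described.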

NAT Gateway High Availability

A single NAT Gateway is a single point of failure. For production workloads, deploy a NAT Gateway in each AZ and configure route tables so each AZ's private subnets route through their local NAT Gateway. This ensures that an AZ failure does not take down outbound internet access for resources in the remaining AZs. Be aware that each NAT Gateway costs approximately $32/month plus $0.045/GB data processing.
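
To see what per-AZ NAT Gateways cost, the figures above can be put into a small calculation. This is a rough sketch that assumes a 730-hour month and the rates quoted in this section ($0.045/hour and $0.045/GB); `nat_monthly_cost` is a hypothetical helper name, and actual pricing varies by region.

```shell
#!/bin/sh
# Estimated monthly NAT Gateway cost in USD.
# $1 = number of NAT Gateways, $2 = GB processed per month
# Assumes $0.045/hour, $0.045/GB, and a 730-hour month.
nat_monthly_cost() {
  awk -v n="$1" -v gb="$2" \
    'BEGIN { printf "%.2f\n", n * 0.045 * 730 + gb * 0.045 }'
}

nat_monthly_cost 1 0      # one gateway, no traffic: prints 32.85
nat_monthly_cost 3 1000   # one per AZ in three AZs, 1 TB: prints 143.55
```

The hourly charge scales with the number of gateways, while the data processing charge depends only on total traffic, so adding gateways for availability raises the fixed cost but not the per-GB cost.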

Subnet Sizing Guidelines

AWS reserves 5 IP addresses in every subnet (network address, VPC router, DNS server, future use, and broadcast). Plan subnet sizes based on the resources you expect to run:

| CIDR Prefix | Total IPs | Usable IPs | Recommended For |
|---|---|---|---|
| /24 | 256 | 251 | Most workloads, good default |
| /22 | 1,024 | 1,019 | EKS clusters (pods consume IPs aggressively) |
| /20 | 4,096 | 4,091 | Large EKS clusters, Lambda-heavy workloads |
| /28 | 16 | 11 | Minimal subnets (NAT GW, firewall endpoints) |
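
The usable counts in the table follow directly from the prefix length: a subnet holds 2^(32 - prefix) addresses, and AWS reserves five of them. A short sketch (the `usable_ips` name is illustrative):

```shell
#!/bin/sh
# Usable addresses in an AWS subnet: 2^(32 - prefix) minus the 5 reserved.
usable_ips() {
  echo $(( (1 << (32 - $1)) - 5 ))
}

for prefix in 28 24 22 20; do
  echo "/$prefix -> $(usable_ips "$prefix") usable addresses"
done
# /28 -> 11 usable addresses
# /24 -> 251 usable addresses
# /22 -> 1019 usable addresses
# /20 -> 4091 usable addresses
```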

Hub-and-Spoke with Transit Gateway

As your environment grows beyond a few VPCs, VPC peering becomes unmanageable due to its non-transitive nature. With N VPCs, peering requires N*(N-1)/2 connections. Five VPCs need 10 peering connections; ten VPCs need 45. AWS Transit Gateway (TGW) solves this by acting as a central hub that all your VPCs and on-premises networks connect to, simplifying routing and management to a hub-and-spoke model.
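
The growth in peering connections is easy to tabulate. The sketch below evaluates N*(N-1)/2 for a few VPC counts; `peering_connections` is an illustrative name.

```shell
#!/bin/sh
# Number of peering connections needed for a full mesh of N VPCs.
peering_connections() {
  echo $(( $1 * ($1 - 1) / 2 ))
}

for n in 3 5 10 50; do
  echo "$n VPCs -> $(peering_connections "$n") peering connections"
done
# 3 VPCs -> 3 peering connections
# 5 VPCs -> 10 peering connections
# 10 VPCs -> 45 peering connections
# 50 VPCs -> 1225 peering connections
```

With Transit Gateway, the same environments need one attachment per VPC, so the count grows linearly rather than quadratically.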

In a hub-and-spoke model, a shared-services VPC hosts common resources like DNS resolvers, Active Directory, centralized logging, and CI/CD infrastructure. Application VPCs connect through Transit Gateway and route traffic to shared services or the internet through the hub. This pattern scales to hundreds of VPCs while keeping the network architecture manageable.

When to Use Transit Gateway vs VPC Peering

| Criteria | VPC Peering | Transit Gateway |
|---|---|---|
| Number of VPCs | 2-3 VPCs | 3+ VPCs |
| Transitivity | Non-transitive (direct connections only) | Transitive (hub-and-spoke) |
| Cost | Free (data transfer only) | $0.05/hour per attachment + $0.02/GB |
| Bandwidth | No limit | 50 Gbps per VPC attachment |
| Cross-region | Yes (inter-region peering) | Yes (TGW peering) |
| Route tables | One route per peering | Centralized, segmented route tables |
| VPN/DX integration | No | Yes, native integration |

Transit Gateway Costs

Transit Gateway charges per attachment ($0.05/hour, approximately $36/month) and per GB of data processed ($0.02/GB). For simple two-VPC connectivity, VPC peering is cheaper since it has no hourly charge. Transit Gateway becomes cost-effective at three or more VPCs, and essential when you need network segmentation, centralized egress, or hybrid connectivity with VPN or Direct Connect.
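
As a rough illustration of how these charges add up, the sketch below estimates a monthly Transit Gateway bill from the rates quoted above. It assumes a 730-hour month, and `tgw_monthly_cost` is a hypothetical helper; check current regional pricing before relying on the numbers.

```shell
#!/bin/sh
# Estimated monthly Transit Gateway cost in USD.
# $1 = number of attachments, $2 = GB processed per month
# Assumes $0.05/hour per attachment, $0.02/GB, and a 730-hour month.
tgw_monthly_cost() {
  awk -v n="$1" -v gb="$2" \
    'BEGIN { printf "%.2f\n", n * 0.05 * 730 + gb * 0.02 }'
}

tgw_monthly_cost 1 0      # a single idle attachment: prints 36.50
tgw_monthly_cost 3 500    # three VPCs, 500 GB: prints 119.50
```

VPC peering has no equivalent hourly term, which is why it stays cheaper for a simple two-VPC connection.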

transit-gateway.yaml
# Transit Gateway with hub-and-spoke architecture
Resources:
  TransitGateway:
    Type: AWS::EC2::TransitGateway
    Properties:
      DefaultRouteTableAssociation: disable
      DefaultRouteTablePropagation: disable
      Description: Central hub for all VPCs
      DnsSupport: enable
      VpnEcmpSupport: enable
      Tags:
        - Key: Name
          Value: central-tgw

  # Shared Services VPC Attachment
  SharedServicesAttachment:
    Type: AWS::EC2::TransitGatewayAttachment
    Properties:
      TransitGatewayId: !Ref TransitGateway
      VpcId: !Ref SharedServicesVPC
      SubnetIds:
        - !Ref SharedTransitSubnetA
        - !Ref SharedTransitSubnetB
      Tags:
        - Key: Name
          Value: shared-services-attachment

  # Production VPC Attachment
  ProdVpcAttachment:
    Type: AWS::EC2::TransitGatewayAttachment
    Properties:
      TransitGatewayId: !Ref TransitGateway
      VpcId: !Ref ProdVPC
      SubnetIds:
        - !Ref ProdTransitSubnetA
        - !Ref ProdTransitSubnetB
      Tags:
        - Key: Name
          Value: prod-attachment

  # Route Tables for network segmentation
  SharedServicesRouteTable:
    Type: AWS::EC2::TransitGatewayRouteTable
    Properties:
      TransitGatewayId: !Ref TransitGateway
      Tags:
        - Key: Name
          Value: shared-services-rt

  ProdRouteTable:
    Type: AWS::EC2::TransitGatewayRouteTable
    Properties:
      TransitGatewayId: !Ref TransitGateway
      Tags:
        - Key: Name
          Value: prod-rt

  # Associate and propagate routes
  SharedServicesAssociation:
    Type: AWS::EC2::TransitGatewayRouteTableAssociation
    Properties:
      TransitGatewayRouteTableId: !Ref SharedServicesRouteTable
      TransitGatewayAttachmentId: !Ref SharedServicesAttachment

  ProdToSharedPropagation:
    Type: AWS::EC2::TransitGatewayRouteTablePropagation
    Properties:
      TransitGatewayRouteTableId: !Ref ProdRouteTable
      TransitGatewayAttachmentId: !Ref SharedServicesAttachment

Network Segmentation with TGW Route Tables

Transit Gateway route tables enable powerful network segmentation. By creating separate route tables for different environments, you can control exactly which VPCs can communicate with each other. This is the primary mechanism for enforcing network isolation between production and development environments.

| Route Table | Associations | Propagations | Purpose |
|---|---|---|---|
| Shared Services RT | Shared Services VPC | All VPCs | Shared services can reach all VPCs |
| Production RT | Production VPCs | Shared Services + Prod VPCs | Prod can reach shared services and other prod |
| Development RT | Dev VPCs | Shared Services only | Dev can only reach shared services, not prod |
| Egress RT | Egress VPC | All VPCs | Centralized internet egress for all VPCs |

Security Layers: Security Groups and NACLs

VPC security operates at two levels: security groups (stateful, instance-level) and network ACLs (stateless, subnet-level). Understanding the difference between these two mechanisms is critical for effective network security. They serve complementary purposes and should be used together as part of a defense-in-depth strategy.

| Feature | Security Group | Network ACL |
|---|---|---|
| Level | ENI (instance/task level) | Subnet level |
| Statefulness | Stateful, return traffic auto-allowed | Stateless, return traffic must be explicitly allowed |
| Default behavior | Deny all inbound, allow all outbound | Default NACL allows all inbound and outbound (custom NACLs deny all until rules are added) |
| Rule type | Allow rules only | Allow and deny rules |
| Rule evaluation | All rules evaluated together | Rules evaluated in numerical order |
| SG references | Can reference other security groups | Can only reference CIDR blocks |

Security Group Best Practices

Security groups should be your primary line of defense. Their stateful nature makes them easier to manage and less error-prone than NACLs. Follow these practices:

  • Reference other security groups: Instead of using CIDR blocks, reference the security group of the calling tier. For example, allow the ALB security group to access port 8080 on your application security group.
  • One security group per tier/role: Create separate security groups for ALBs, application servers, and databases. This makes rules clear and auditable.
  • Minimize outbound rules: The default allows all outbound traffic. For sensitive workloads, restrict outbound to only necessary destinations.
  • Use prefix lists: AWS managed prefix lists (like com.amazonaws.us-east-1.s3) let you reference AWS service IP ranges without hardcoding them.
security-groups.yaml
# Three-tier security group chain
ALBSecurityGroup:
  Type: AWS::EC2::SecurityGroup
  Properties:
    GroupDescription: ALB - allows internet HTTPS traffic
    VpcId: !Ref VPC
    SecurityGroupIngress:
      - IpProtocol: tcp
        FromPort: 443
        ToPort: 443
        CidrIp: 0.0.0.0/0
        Description: HTTPS from internet

AppSecurityGroup:
  Type: AWS::EC2::SecurityGroup
  Properties:
    GroupDescription: App tier - only accepts traffic from ALB
    VpcId: !Ref VPC
    SecurityGroupIngress:
      - IpProtocol: tcp
        FromPort: 8080
        ToPort: 8080
        SourceSecurityGroupId: !Ref ALBSecurityGroup
        Description: Application port from ALB

DatabaseSecurityGroup:
  Type: AWS::EC2::SecurityGroup
  Properties:
    GroupDescription: Database - only accepts traffic from app tier
    VpcId: !Ref VPC
    SecurityGroupIngress:
      - IpProtocol: tcp
        FromPort: 5432
        ToPort: 5432
        SourceSecurityGroupId: !Ref AppSecurityGroup
        Description: PostgreSQL from app tier

Do Not Rely Solely on NACLs

NACLs are stateless, meaning you must explicitly allow return traffic on ephemeral ports (1024-65535). They are best used as a coarse-grained backup or for blocking specific known-bad IP addresses, not as your primary security mechanism. A common pattern is to keep NACLs at their default (allow all) and rely on security groups for fine-grained access control. Use NACLs only when you need subnet-level deny rules that security groups cannot provide.

VPC Endpoints for Private Connectivity

VPC endpoints allow your resources to communicate with AWS services without traversing the internet. This improves security by keeping traffic on the AWS private network, reduces NAT Gateway costs by bypassing the NAT for AWS service traffic, and can lower latency by avoiding the extra hop through a NAT Gateway and Internet Gateway. There are two types of VPC endpoints, with different use cases and cost models.

Gateway Endpoints

Gateway endpoints are free and available for S3 and DynamoDB only. They are implemented as route table entries that direct traffic to the AWS service through the AWS private network. Gateway endpoints are the preferred choice for S3 and DynamoDB access because they have zero cost and no bandwidth limitations.

Interface Endpoints (PrivateLink)

Interface endpoints are powered by AWS PrivateLink and available for most other AWS services. They create Elastic Network Interfaces (ENIs) in your subnets that serve as private entry points for the service. Interface endpoints cost $0.01/hour per AZ (approximately $7.20/month per AZ) plus $0.01/GB of data processed.

| Feature | Gateway Endpoint | Interface Endpoint |
|---|---|---|
| Services | S3, DynamoDB only | Most AWS services (100+) |
| Cost | Free | $0.01/hour/AZ + $0.01/GB |
| Implementation | Route table entry | ENI in your subnet |
| Security | Endpoint policy (IAM) | Endpoint policy + security groups |
| DNS | Not applicable | Optional private DNS override |
| Access from on-premises | No | Yes (over Direct Connect or VPN, resolved via DNS) |
vpc-endpoints.yaml
# Free gateway endpoint for S3
S3GatewayEndpoint:
  Type: AWS::EC2::VPCEndpoint
  Properties:
    ServiceName: !Sub com.amazonaws.${AWS::Region}.s3
    VpcId: !Ref VPC
    RouteTableIds:
      - !Ref PrivateRouteTableA
      - !Ref PrivateRouteTableB
      - !Ref DataRouteTable
    VpcEndpointType: Gateway
    PolicyDocument:
      Version: '2012-10-17'
      Statement:
        - Effect: Allow
          Principal: '*'
          Action:
            - 's3:GetObject'
            - 's3:PutObject'
            - 's3:ListBucket'
          Resource:
            - 'arn:aws:s3:::my-app-bucket/*'
            - 'arn:aws:s3:::my-app-bucket'

# Free gateway endpoint for DynamoDB
DynamoDBGatewayEndpoint:
  Type: AWS::EC2::VPCEndpoint
  Properties:
    ServiceName: !Sub com.amazonaws.${AWS::Region}.dynamodb
    VpcId: !Ref VPC
    RouteTableIds:
      - !Ref PrivateRouteTableA
      - !Ref PrivateRouteTableB
    VpcEndpointType: Gateway

# Interface endpoint for Secrets Manager
SecretsManagerEndpoint:
  Type: AWS::EC2::VPCEndpoint
  Properties:
    ServiceName: !Sub com.amazonaws.${AWS::Region}.secretsmanager
    VpcId: !Ref VPC
    SubnetIds:
      - !Ref PrivateSubnetA
      - !Ref PrivateSubnetB
    SecurityGroupIds:
      - !Ref EndpointSecurityGroup
    VpcEndpointType: Interface
    PrivateDnsEnabled: true

# Interface endpoint for ECR (needed for Fargate image pulls)
ECRApiEndpoint:
  Type: AWS::EC2::VPCEndpoint
  Properties:
    ServiceName: !Sub com.amazonaws.${AWS::Region}.ecr.api
    VpcId: !Ref VPC
    SubnetIds:
      - !Ref PrivateSubnetA
      - !Ref PrivateSubnetB
    SecurityGroupIds:
      - !Ref EndpointSecurityGroup
    VpcEndpointType: Interface
    PrivateDnsEnabled: true

ECRDockerEndpoint:
  Type: AWS::EC2::VPCEndpoint
  Properties:
    ServiceName: !Sub com.amazonaws.${AWS::Region}.ecr.dkr
    VpcId: !Ref VPC
    SubnetIds:
      - !Ref PrivateSubnetA
      - !Ref PrivateSubnetB
    SecurityGroupIds:
      - !Ref EndpointSecurityGroup
    VpcEndpointType: Interface
    PrivateDnsEnabled: true

Cost Savings with S3 Gateway Endpoints

Data flowing through NAT Gateway to S3 costs $0.045/GB for NAT processing plus standard data transfer rates. An S3 gateway endpoint is completely free. If your workloads regularly access S3 (logs, data processing, backups), this can save hundreds or thousands of dollars per month. Always deploy S3 and DynamoDB gateway endpoints as part of your baseline VPC configuration.
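
To put a number on the savings, the following sketch estimates the NAT data processing charge that an S3 gateway endpoint avoids. It assumes the $0.045/GB rate quoted above and a 30-day month by default; `s3_nat_monthly_savings` is an illustrative name, and the daily volumes are made-up inputs.

```shell
#!/bin/sh
# NAT data processing charge avoided by routing S3 traffic through a
# gateway endpoint instead. Assumes $0.045/GB NAT processing.
# $1 = GB of S3 traffic per day, $2 = days per month (default 30)
s3_nat_monthly_savings() {
  awk -v gb="$1" -v days="${2:-30}" \
    'BEGIN { printf "%.2f\n", gb * days * 0.045 }'
}

s3_nat_monthly_savings 100    # 100 GB/day: prints 135.00
s3_nat_monthly_savings 500    # 500 GB/day: prints 675.00
```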

Essential VPC Endpoints for Common Workloads

Depending on your workloads, consider deploying these commonly needed interface endpoints to avoid NAT Gateway charges and improve security:

  • ECS/Fargate: ecr.api, ecr.dkr, s3 (gateway), logs, secretsmanager, ssm
  • Lambda (VPC-attached): s3 (gateway), dynamodb (gateway), sqs, sns, secretsmanager
  • EKS: ecr.api, ecr.dkr, s3 (gateway), sts, elasticloadbalancing, autoscaling
  • Systems Manager: ssm, ssmmessages, ec2messages (all three required for Session Manager)
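
Unlike gateway endpoints, interface endpoints are billed per endpoint per AZ, so a full set adds up. The sketch below estimates the monthly cost using the $0.01/hour/AZ and $0.01/GB rates from this guide. It assumes a 730-hour month (which gives $7.30 per endpoint-AZ rather than the $7.20 quoted earlier for a 720-hour month), and `interface_endpoint_monthly_cost` is a hypothetical helper.

```shell
#!/bin/sh
# Estimated monthly cost of interface endpoints in USD.
# $1 = number of interface endpoints, $2 = AZs per endpoint,
# $3 = GB processed per month
# Assumes $0.01/hour per endpoint per AZ, $0.01/GB, 730-hour month.
interface_endpoint_monthly_cost() {
  awk -v e="$1" -v az="$2" -v gb="$3" \
    'BEGIN { printf "%.2f\n", e * az * 0.01 * 730 + gb * 0.01 }'
}

# ECS/Fargate set from the list above: ecr.api, ecr.dkr, logs,
# secretsmanager, ssm = 5 interface endpoints across 2 AZs, 100 GB/month
interface_endpoint_monthly_cost 5 2 100   # prints 74.00
```

Comparing this figure with the NAT Gateway data processing charge the endpoints would avoid helps decide which interface endpoints are worth deploying.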

Centralized Egress Architecture

Rather than deploying NAT Gateways in every VPC (at $0.045/hour each plus data processing), a centralized egress pattern routes all internet-bound traffic from spoke VPCs through a shared egress VPC. This consolidates NAT Gateways, enables centralized logging of internet-bound traffic, and allows insertion of network firewalls for traffic inspection.

Architecture Overview

In a centralized egress architecture, spoke VPCs have no NAT Gateways or Internet Gateways. All outbound internet traffic flows through Transit Gateway to the egress VPC, which hosts the NAT Gateways and optionally AWS Network Firewall for inspection. This pattern significantly reduces costs when you have many VPCs while providing a single point for traffic monitoring and policy enforcement.

centralized-egress-routes.sh
# Spoke VPC route table: send all non-local traffic to Transit Gateway
aws ec2 create-route \
  --route-table-id rtb-spoke-private \
  --destination-cidr-block 0.0.0.0/0 \
  --transit-gateway-id tgw-0123456789abcdef0

# Egress VPC Transit Gateway subnet: route internet traffic to NAT Gateway
aws ec2 create-route \
  --route-table-id rtb-egress-tgw-subnet \
  --destination-cidr-block 0.0.0.0/0 \
  --nat-gateway-id nat-0123456789abcdef0

# Egress VPC public subnet: route spoke CIDR blocks back to TGW
aws ec2 create-route \
  --route-table-id rtb-egress-public \
  --destination-cidr-block 10.0.0.0/8 \
  --transit-gateway-id tgw-0123456789abcdef0

# TGW spoke route table: default route to egress VPC attachment
aws ec2 create-transit-gateway-route \
  --transit-gateway-route-table-id tgw-rtb-spoke \
  --destination-cidr-block 0.0.0.0/0 \
  --transit-gateway-attachment-id tgw-attach-egress-vpc

Appliance Mode for Firewalls

When routing through a network appliance (AWS Network Firewall, third-party IDS/IPS), enable appliance mode on the Transit Gateway VPC attachment. Without it, TGW may route request and response traffic through different Availability Zones, causing the firewall to see asymmetric flows and drop packets. This is one of the most common and frustrating TGW troubleshooting issues.
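
Appliance mode is an option on the VPC attachment itself. A minimal sketch of enabling it with the AWS CLI follows; the wrapper function name and the attachment ID in the usage comment are placeholders, and the command assumes credentials with permission to modify the attachment.

```shell
#!/bin/sh
# Enable appliance mode on a Transit Gateway VPC attachment so that both
# directions of a flow are kept in the same Availability Zone.
# $1 = Transit Gateway attachment ID of the inspection/egress VPC
enable_appliance_mode() {
  aws ec2 modify-transit-gateway-vpc-attachment \
    --transit-gateway-attachment-id "$1" \
    --options ApplianceModeSupport=enable
}

# Usage (placeholder ID):
#   enable_appliance_mode tgw-attach-0123456789abcdef0
```

Only the attachment for the VPC that hosts the appliance needs this setting; spoke attachments can stay at their defaults.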

Multi-Region VPC Architecture

For applications that require low-latency access from global users or need multi-region disaster recovery, you must connect VPCs across AWS regions. There are two primary mechanisms for cross-region VPC connectivity:

Inter-Region VPC Peering

VPC peering works across regions with encrypted traffic traversing the AWS backbone. It is the simplest option for connecting a small number of VPCs across regions but suffers from the same non-transitive limitation as intra-region peering. Data transfer between regions varies by region pair but is typically $0.02/GB.

Transit Gateway Inter-Region Peering

Transit Gateway peering connects TGWs in different regions, creating a full mesh of hub-and-spoke networks. Each regional TGW manages its local VPCs, and the peering connection enables cross-region traffic to flow. This is the recommended approach for organizations with multiple VPCs in multiple regions.

tgw-peering.yaml
# Regional TGW in us-east-1 peers with eu-west-1 TGW
TGWPeeringAttachment:
  Type: AWS::EC2::TransitGatewayPeeringAttachment
  Properties:
    TransitGatewayId: !Ref USEast1TGW
    PeerTransitGatewayId: tgw-eu-west-1-id
    PeerRegion: eu-west-1
    PeerAccountId: !Ref AWS::AccountId
    Tags:
      - Key: Name
        Value: us-east-1-to-eu-west-1

# Static route to the peer region CIDR via peering attachment
CrossRegionRoute:
  Type: AWS::EC2::TransitGatewayRoute
  Properties:
    TransitGatewayRouteTableId: !Ref ProdRouteTable
    DestinationCidrBlock: 10.1.0.0/16
    TransitGatewayAttachmentId: !Ref TGWPeeringAttachment

VPC Design for EKS and Container Workloads

Amazon EKS and containerized workloads place unique demands on VPC architecture. The AWS VPC CNI plugin assigns each Kubernetes pod a real IP address from your subnet, meaning pod density is limited by available IP addresses. A node running 30 pods consumes 30 IPs (plus the node's primary IP), which can exhaust a /24 subnet quickly.

IP Address Planning for EKS

  • Use /20 or larger subnets: A /24 subnet supports only about 251 IPs, which limits you to approximately 8 nodes with 30 pods each. Use /20 subnets (4,091 IPs) for production EKS clusters.
  • Enable secondary CIDR blocks: If your primary CIDR is too small, add a secondary CIDR block (from the 100.64.0.0/10 range) for pod networking while keeping node IPs on the primary CIDR.
  • Consider prefix delegation: VPC CNI prefix delegation assigns /28 prefixes to nodes instead of individual IPs, significantly increasing pod density per node.
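
The node limits quoted above come from simple division. The sketch below estimates how many nodes fit in a subnet when every pod takes its own IP. It ignores warm IP pools, prefix delegation, and any other resources sharing the subnet, and `max_nodes` is an illustrative name.

```shell
#!/bin/sh
# Rough upper bound on EKS nodes per subnet without prefix delegation.
# $1 = subnet prefix length, $2 = pods per node
max_nodes() {
  usable=$(( (1 << (32 - $1)) - 5 ))   # AWS reserves 5 addresses per subnet
  echo $(( usable / ($2 + 1) ))        # each node needs its own IP plus one per pod
}

max_nodes 24 30   # prints 8
max_nodes 20 30   # prints 131
```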
eks-vpc-secondary-cidr.sh
# Add secondary CIDR for EKS pod networking
aws ec2 associate-vpc-cidr-block \
  --vpc-id vpc-0123456789abcdef0 \
  --cidr-block 100.64.0.0/16

# Create subnets from secondary CIDR for pods
aws ec2 create-subnet \
  --vpc-id vpc-0123456789abcdef0 \
  --cidr-block 100.64.0.0/19 \
  --availability-zone us-east-1a \
  --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=eks-pods-a}]'

aws ec2 create-subnet \
  --vpc-id vpc-0123456789abcdef0 \
  --cidr-block 100.64.32.0/19 \
  --availability-zone us-east-1b \
  --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=eks-pods-b}]'

# Configure VPC CNI to use custom networking
kubectl set env daemonset aws-node -n kube-system \
  AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG=true \
  ENI_CONFIG_LABEL_DEF=topology.kubernetes.io/zone

Monitoring and Troubleshooting

Network issues are notoriously difficult to diagnose. AWS provides several tools to help you monitor VPC health, troubleshoot connectivity problems, and maintain visibility into network traffic patterns.

VPC Flow Logs

Enable VPC Flow Logs on every VPC to capture network traffic metadata. Flow Logs record source and destination IPs, ports, protocols, packet counts, byte counts, and whether the traffic was accepted or rejected. They are invaluable for troubleshooting connectivity issues, detecting anomalous traffic, and meeting compliance requirements.

enable-flow-logs.sh
# Enable VPC Flow Logs to S3 (recommended for cost).
# --max-aggregation-interval accepts 60 or 600 seconds; 60 (1 minute) is used here.
aws ec2 create-flow-logs \
  --resource-type VPC \
  --resource-ids vpc-0123456789abcdef0 \
  --traffic-type ALL \
  --log-destination-type s3 \
  --log-destination arn:aws:s3:::my-flow-logs-bucket/vpc-logs/ \
  --max-aggregation-interval 60 \
  --log-format '${version} ${account-id} ${interface-id} ${srcaddr} ${dstaddr} ${srcport} ${dstport} ${protocol} ${packets} ${bytes} ${start} ${end} ${action} ${log-status} ${vpc-id} ${subnet-id} ${az-id} ${sublocation-type} ${sublocation-id} ${pkt-srcaddr} ${pkt-dstaddr} ${region} ${pkt-src-aws-service} ${pkt-dst-aws-service} ${flow-direction} ${traffic-path}'

# Query the delivered logs with Athena for analysis. Filtering on
# action = 'REJECT' surfaces traffic blocked by security groups or NACLs.

Reachability Analyzer

Use Reachability Analyzer to diagnose connectivity issues between two endpoints in your VPC without sending any actual traffic. It analyzes your configuration (route tables, security groups, NACLs, endpoint policies) and identifies the specific blocking component. Each analysis costs $0.10 but can save hours of manual troubleshooting.

Network Manager and Traffic Mirroring

  • AWS Network Manager: Provides a global view of your network across regions, including Transit Gateway topologies, VPN connections, and Direct Connect links.
  • Traffic Mirroring: Copies network traffic from ENIs to a target for deep packet inspection using tools like Suricata or Zeek. Available on Nitro-based instances.
  • Network Access Analyzer: Identifies unintended network access paths, such as resources reachable from the internet or cross-VPC paths that should not exist.

Flow Logs Cost Optimization

VPC Flow Logs to CloudWatch Logs cost $0.50/GB ingestion plus storage. For high-volume VPCs, send Flow Logs to S3 instead ($0.023/GB storage) and query them with Athena. Use the 10-minute aggregation interval for most use cases to reduce log volume. Only use the 1-minute interval when actively troubleshooting or for security-critical VPCs.
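
The gap between the two destinations is easiest to see side by side. This sketch compares only the two rates quoted above (CloudWatch Logs ingestion at $0.50/GB and S3 storage at $0.023/GB). It deliberately ignores log delivery charges, CloudWatch storage, and Athena query costs, and `flow_log_costs` is a hypothetical helper.

```shell
#!/bin/sh
# Compare CloudWatch Logs ingestion cost with S3 storage cost for the
# same monthly flow log volume.
# $1 = GB of flow logs per month
# Prints: <cloudwatch ingestion USD> <s3 storage USD>
flow_log_costs() {
  awk -v gb="$1" 'BEGIN { printf "%.2f %.2f\n", gb * 0.50, gb * 0.023 }'
}

flow_log_costs 1000   # prints 500.00 23.00
```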

VPC Architecture Decision Framework

Use this decision framework to select the right VPC architecture pattern for your environment:

  • Single application, single account → Three-tier VPC with public, private, and data subnets across 2-3 AZs
  • Multiple applications, single account → Separate VPCs per application with VPC peering for shared services
  • Multi-account (3-10 accounts) → Transit Gateway hub-and-spoke with shared services VPC
  • Enterprise (10+ accounts) → Transit Gateway with network segmentation, centralized egress, and AWS Network Firewall
  • Hybrid (on-premises connectivity) → Transit Gateway with Direct Connect and VPN attachments
  • Multi-region → Regional Transit Gateways with inter-region peering

Architecture Summary

Start with a three-tier VPC for most workloads. Use Transit Gateway when connecting three or more VPCs or when you need network segmentation. Always deploy across multiple AZs for high availability. Deploy S3 and DynamoDB gateway endpoints from day one (they are free). Enable VPC Flow Logs immediately for troubleshooting and security visibility. Plan CIDR blocks carefully before deployment because changing them later requires migrating workloads. For EKS workloads, use /20 or larger subnets and consider secondary CIDR blocks for pod networking.


Key Takeaways

  1. Plan CIDR blocks carefully because a VPC's primary CIDR block cannot be changed after creation.
  2. Multi-tier architectures separate public, private, and data subnets.
  3. Hub-and-spoke with Transit Gateway centralizes routing for multi-VPC environments.
  4. Use VPC endpoints and PrivateLink to keep traffic off the public internet.
  5. Deploy across multiple AZs for high availability in every architecture pattern.
  6. Network ACLs are stateless; Security Groups are stateful. Use both in layers.

Frequently Asked Questions

What is the maximum number of VPCs per AWS region?
The default limit is 5 VPCs per region, but this can be increased to hundreds via a support request. Each VPC can have up to 200 subnets. Plan your multi-VPC strategy around Transit Gateway for scalability.

Should I use VPC peering or Transit Gateway?
Use VPC peering for simple point-to-point connections between two or three VPCs. Use Transit Gateway as the number of VPCs grows beyond that, or whenever you need centralized routing or transitive connectivity. Transit Gateway supports up to 5,000 attachments.

What CIDR range should I use for my VPC?
Use private RFC 1918 ranges: 10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16. Size your VPC for growth; a /16 gives 65,536 IPs. Avoid overlapping ranges with on-premises networks or other VPCs you plan to peer.

What is the difference between public and private subnets?
Public subnets have a route to an Internet Gateway, so resources can have public IPs. Private subnets route through a NAT Gateway for outbound internet access. Database and application tiers should always be in private subnets.

How do I connect my VPC to an on-premises network?
Use AWS Site-to-Site VPN for encrypted connectivity over the internet, or AWS Direct Connect for dedicated private connectivity. For hybrid architectures, combine Direct Connect with VPN as a backup path through Transit Gateway.
