Zero Trust Networking on AWS, Azure, and GCP: A Practical Implementation Guide

Identity-based access, micro-segmentation, PrivateLink, Private Endpoints, and VPC Service Controls -- real implementation patterns across all three major clouds.

CloudToolStack Team -- April 5, 2026 -- 16 min read

Beyond the Perimeter: Why Zero Trust Is Not Optional

The traditional network security model -- trust everything inside the perimeter, block everything outside -- was already failing before cloud adoption. In the cloud, it is fundamentally broken. There is no perimeter. Your workloads run across VPCs, regions, and providers. Your users connect from home networks, coffee shops, and mobile devices. Your partners access APIs from their own cloud accounts. The attack surface is everywhere, and a compromised credential or misconfigured security group gives an attacker lateral movement across your entire environment.

Zero trust networking replaces implicit trust with explicit verification. Every request, every connection, every API call must prove its identity and authorization before being allowed. This is not a product you buy. It is an architecture you build, layer by layer, using the native services each cloud provider offers. This article covers practical implementation patterns on AWS, Azure, and GCP -- not the marketing version, but the engineering version with real configurations and hard-won lessons.

The Core Principles

Zero trust is built on four pillars, and skipping any one of them undermines the entire model.

  • Identity-based access: Every entity (user, service, device) has a verified identity. Network location does not confer trust. Being inside the VPC does not mean you are authorized.
  • Least-privilege authorization: Every identity gets the minimum permissions needed and nothing more. This applies to IAM policies, security group rules, and API scopes.
  • Micro-segmentation: Network segments are small and isolated. Services can only reach the specific endpoints they need, not entire subnets or VPCs.
  • Continuous verification: Trust is not a one-time check. Sessions are validated continuously. Tokens expire. Certificates rotate. Anomalous behavior triggers re-authentication.

AWS: Building Zero Trust with VPC and IAM

Private connectivity with VPC Endpoints and PrivateLink

The first step in AWS zero trust is eliminating unnecessary internet exposure. VPC Interface Endpoints (powered by AWS PrivateLink) create private connections between your VPC and AWS services without traversing the public internet. Instead of your EC2 instance calling the S3 API over the internet through a NAT Gateway, the traffic stays on the AWS network via a private endpoint in your VPC.

Deploy VPC endpoints for every AWS service your workloads use: S3, DynamoDB, SQS, SNS, Secrets Manager, KMS, ECR, CloudWatch Logs, and SSM at minimum. Each endpoint costs approximately $7.30 per month per AZ plus data processing charges, but this is typically cheaper than NAT Gateway data processing at $0.045 per GB. More importantly, it removes the internet path entirely. No internet path means no internet-based attack vector.
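To see where the break-even sits, here is a back-of-envelope comparison of routing AWS API traffic through a NAT Gateway versus an interface endpoint. The NAT data-processing rate comes from the figure above; the endpoint hourly and per-GB rates are assumptions based on typical us-east-1 list prices, so verify against current AWS pricing before relying on them.

```python
NAT_DATA_PER_GB = 0.045      # NAT Gateway data processing, $/GB (from the article)
ENDPOINT_HOURLY = 0.01       # interface endpoint, $/hour/AZ (assumed; ~$7.30/month)
ENDPOINT_DATA_PER_GB = 0.01  # interface endpoint data processing, $/GB (assumed)
HOURS_PER_MONTH = 730

def monthly_cost(gb_per_month: float, azs: int = 2) -> dict:
    """Monthly cost of each path for a given traffic volume to AWS APIs."""
    # Ignores the NAT Gateway hourly charge, which is usually shared with other traffic.
    nat = gb_per_month * NAT_DATA_PER_GB
    endpoint = azs * ENDPOINT_HOURLY * HOURS_PER_MONTH + gb_per_month * ENDPOINT_DATA_PER_GB
    return {"nat_gateway": round(nat, 2), "vpc_endpoint": round(endpoint, 2)}

print(monthly_cost(1000))  # 1 TB/month across 2 AZs
```

At these assumed rates, the endpoint wins well before 1 TB per month of API traffic, and the security argument holds regardless of the arithmetic.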

For service-to-service communication across accounts, AWS PrivateLink lets you expose services behind a Network Load Balancer as private endpoints in other VPCs. This is how you give your partner or customer account access to your API without opening security groups to IP ranges or peering entire VPCs. The consumer creates an interface endpoint in their VPC, and traffic flows over the AWS backbone. The provider never sees the consumer's network, and the consumer never sees the provider's network.

Micro-segmentation with security groups

AWS security groups are the primary tool for micro-segmentation. The critical pattern is referencing security groups by ID rather than by CIDR range. Instead of allowing inbound traffic from 10.0.0.0/16 (the entire VPC), allow traffic from the specific security group attached to the calling service. This creates identity-based network access: only instances that are members of the authorized security group can connect.

In practice, this means each microservice gets its own security group. The API gateway security group allows inbound HTTPS from the load balancer security group. The application security group allows inbound traffic only from the API gateway security group on port 8080. The database security group allows inbound traffic only from the application security group on port 5432. There is no path from the load balancer directly to the database, even though they are in the same VPC.
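The chain above can be modeled as data to make the property explicit: a connection is allowed only if a rule names the exact source group, destination group, and port. The group names are illustrative, not real SG IDs.

```python
# Each rule allows one port from one security group to another, by group
# reference rather than CIDR range.
RULES = [
    # (source_sg, dest_sg, port)
    ("lb-sg",  "api-sg", 443),
    ("api-sg", "app-sg", 8080),
    ("app-sg", "db-sg",  5432),
]

def allowed(src: str, dst: str, port: int) -> bool:
    """A connection is permitted only if an explicit rule exists for it."""
    return (src, dst, port) in RULES

# The load balancer has no path to the database, even in the same VPC:
assert allowed("app-sg", "db-sg", 5432)
assert not allowed("lb-sg", "db-sg", 5432)
```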

IAM-based service authorization

Network-level controls are necessary but not sufficient. Layer IAM authorization on top. Use IAM roles for EC2 instances and ECS tasks with policies scoped to exactly the resources they need. An application that reads from a specific S3 bucket should have a policy allowing s3:GetObject only on that bucket's ARN, not s3:* on *. Use IAM condition keys to further restrict access: aws:SourceVpc limits API calls to those originating from a specific VPC, and aws:SourceVpce limits them to a specific VPC endpoint.
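A minimal sketch of such a policy, combining a single-action, single-bucket grant with the aws:SourceVpce condition key. The bucket name and endpoint ID are placeholders, not real resources.

```python
import json

# Least-privilege policy: s3:GetObject on one bucket, and only for requests
# arriving through a specific VPC endpoint.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "ReadAppBucketViaEndpointOnly",
        "Effect": "Allow",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-app-bucket/*",
        "Condition": {"StringEquals": {"aws:SourceVpce": "vpce-0123456789abcdef0"}},
    }],
}

print(json.dumps(policy, indent=2))
```

Requests from any other network path, including the public internet, fail the condition even if the caller's credentials are otherwise valid.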

For cross-account access, use IAM roles with explicit trust policies rather than sharing long-lived access keys. The trust policy should specify the exact principal ARN, not a wildcard account ID. Combine this with resource policies on S3 buckets, KMS keys, and SQS queues that deny access from any principal not in your allowed list. Defense in depth means both the caller and the resource enforce authorization independently.
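A trust policy in that spirit might look like the following sketch. The account ID, role name, and external ID are placeholders; the sts:ExternalId condition is an optional extra guard against the confused-deputy problem.

```python
# Trust policy naming the exact partner role ARN as principal, never a bare
# account ID or wildcard.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::111122223333:role/partner-ingest-role"},
        "Action": "sts:AssumeRole",
        "Condition": {"StringEquals": {"sts:ExternalId": "example-external-id"}},
    }],
}
```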

The VPC Peering Trap

VPC peering creates a flat network between two VPCs. Once peered, any instance in VPC A can potentially reach any instance in VPC B, limited only by security groups and NACLs. This undermines micro-segmentation. For service-to-service access between VPCs, prefer PrivateLink, which exposes only specific endpoints. If you must peer, use NACLs as an additional layer to restrict which subnets can communicate across the peering connection.

Azure: Identity-Centric Zero Trust

Private Endpoints and Private Link Service

Azure Private Endpoints are the equivalent of AWS PrivateLink. They create a network interface in your VNet with a private IP address that maps to an Azure PaaS service (SQL Database, Storage, Key Vault, etc.) or a custom service behind a Standard Load Balancer. Once you create a private endpoint for Azure SQL Database, you can disable public network access entirely. The database is then reachable only from within VNets that have the private endpoint.

Every Azure PaaS service your workloads use should have a private endpoint. This is non-negotiable for zero trust. Azure Storage accounts, SQL databases, Cosmos DB instances, Key Vaults, Container Registries, and Event Hubs all support private endpoints. The cost is approximately $7.30 per month per endpoint plus data processing. Enable the "Deny public network access" flag on every service that supports it. If a service is accessible from the internet, it is not zero trust.

NSG micro-segmentation with Application Security Groups

Azure Network Security Groups (NSGs) work similarly to AWS security groups, but Azure adds Application Security Groups (ASGs) as an abstraction layer. Instead of referencing individual NICs or IP ranges, you assign VMs to ASGs and then write NSG rules using ASG names. This creates a clean mapping between logical application tiers and network rules.

Create ASGs for each application tier: web-asg, app-asg, db-asg. Then write NSG rules like: allow TCP 443 from web-asg to app-asg, allow TCP 5432 from app-asg to db-asg, deny all other traffic. When you add a new VM to the app tier, assign it to the app-asg and it automatically inherits the correct network rules. No IP address management, no rule updates.
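The mechanics can be sketched as data: membership maps VMs to tiers, rules reference ASG names, and evaluation is first-match by priority, as NSGs behave. The names and rule shapes here are simplified for illustration, not the real Azure API schema.

```python
ASG_MEMBERS = {
    "web-asg": {"web-vm-1", "web-vm-2"},
    "app-asg": {"app-vm-1"},
    "db-asg":  {"db-vm-1"},
}

NSG_RULES = [  # (priority, source_asg, dest_asg, port, action)
    (100,  "web-asg", "app-asg", 443,  "Allow"),
    (110,  "app-asg", "db-asg",  5432, "Allow"),
    (4096, "*",       "*",       "*",  "Deny"),
]

def evaluate(src_vm: str, dst_vm: str, port: int) -> str:
    """First matching rule wins, in ascending priority order, like an NSG."""
    src_asgs = {a for a, vms in ASG_MEMBERS.items() if src_vm in vms}
    dst_asgs = {a for a, vms in ASG_MEMBERS.items() if dst_vm in vms}
    for _, s, d, p, action in sorted(NSG_RULES):
        if (s == "*" or s in src_asgs) and (d == "*" or d in dst_asgs) and p in ("*", port):
            return action
    return "Deny"

# A new app-tier VM inherits the correct rules by membership alone:
ASG_MEMBERS["app-asg"].add("app-vm-2")
```

No IP addresses appear anywhere in the rules, which is the point: the rules describe the application topology, not the network layout.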

Entra ID Conditional Access

Azure's zero trust story is strongest around identity because of its deep integration with Entra ID (formerly Azure AD). Conditional Access policies evaluate signals -- user identity, device compliance, location, application, risk level -- and enforce access decisions in real time. You can require MFA for all access to Azure management APIs, block access from non-compliant devices, require a managed device for accessing sensitive applications, and restrict admin access to specific named locations.

For service-to-service communication, Azure Managed Identities eliminate the need for credentials entirely. A VM or App Service with a managed identity authenticates to other Azure services using Entra ID tokens. No passwords, no certificates, no rotation headaches. The identity is tied to the lifecycle of the resource -- when the resource is deleted, the identity is automatically cleaned up. This is what zero trust looks like in practice: strong identity, automatic credential management, continuous authorization.

GCP: BeyondCorp and VPC Service Controls

The BeyondCorp model

Google pioneered zero trust internally with BeyondCorp, published the research papers, and then productized it as BeyondCorp Enterprise (now part of Chrome Enterprise Premium). The core idea is that access to applications is determined by the identity of the user and the security posture of their device, not their network location. There is no VPN. Users access internal applications through a secure web gateway (Identity-Aware Proxy) that verifies identity and device trust on every request.

Identity-Aware Proxy (IAP) sits in front of your applications and enforces authentication and authorization at the application layer. When a user requests access to an internal app, IAP verifies their Google identity, checks access levels (device management status, OS patch level, disk encryption), and either allows or denies the request. The application never handles authentication -- it trusts the identity headers injected by IAP. This works for web applications, SSH, and TCP forwarding to any port.

The practical advantage of IAP over traditional VPN is granular application-level access. A VPN gives users access to an entire network segment. IAP gives users access to specific applications based on their identity and device posture. A compromised device can be blocked from sensitive applications while still allowing access to lower-risk ones.

Private Service Connect

GCP's answer to PrivateLink is Private Service Connect (PSC). It creates a private endpoint in your VPC that provides connectivity to Google APIs, managed services, or third-party services without internet exposure. PSC endpoints use consumer-allocated IP addresses from your VPC, which simplifies DNS and firewall configuration.

For Google APIs (Cloud Storage, BigQuery, Pub/Sub, etc.), create a PSC endpoint that provides a private IP address for the API. Configure DNS to resolve the API hostname to the PSC endpoint IP. All traffic to Google APIs then stays on Google's network. For service-to-service access, PSC lets you publish a service behind an Internal Load Balancer and expose it to consumers via a service attachment. Consumers create a PSC endpoint in their VPC and get a private IP that reaches the published service.

VPC Service Controls

VPC Service Controls (VPC-SC) are unique to GCP and represent one of the most powerful zero trust mechanisms available on any cloud. A VPC Service Control perimeter creates a security boundary around GCP resources that restricts data movement. Resources inside the perimeter can communicate with each other, but data cannot cross the perimeter boundary without an explicit ingress or egress policy.

This prevents data exfiltration even by privileged insiders. If an attacker compromises a service account with Storage Admin permissions, they can read data within the perimeter but cannot copy it to a bucket outside the perimeter. VPC-SC perimeters can include projects, VPC networks, and access levels. Ingress and egress rules define exactly which identities can bring data in or out, and under what conditions.

Start With VPC Service Controls

If you run sensitive workloads on GCP, VPC Service Controls should be your first zero trust investment. They provide a data exfiltration boundary that no amount of IAM policy refinement can match. Start with a dry-run perimeter that logs violations without blocking, review the logs for a week, fix any legitimate access that would be blocked, then enforce the perimeter.
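A dry-run perimeter, expressed in the shape of the Access Context Manager servicePerimeters resource as I understand that API -- treat the field names as a sketch to check against the official reference, and the policy ID and project number as placeholders. With useExplicitDryRunSpec set, the "spec" block is evaluated in dry-run mode: violations are logged but not blocked.

```python
# Dry-run VPC Service Controls perimeter around Storage and BigQuery data.
perimeter = {
    "name": "accessPolicies/123456789/servicePerimeters/sensitive_data",
    "title": "sensitive-data-perimeter",
    "perimeterType": "PERIMETER_TYPE_REGULAR",
    "useExplicitDryRunSpec": True,  # evaluate "spec" in dry-run (log-only) mode
    "spec": {
        "resources": ["projects/111111111111"],
        "restrictedServices": [
            "storage.googleapis.com",
            "bigquery.googleapis.com",
        ],
    },
}
```

When the week of log review is done, promoting the dry-run config to the enforced "status" block turns logging into blocking.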

Cross-Cloud Zero Trust Patterns

Multi-cloud service mesh

For organizations running workloads across multiple clouds, a service mesh provides a consistent zero trust layer. Istio, Linkerd, and Consul Connect all support mutual TLS (mTLS) between services, meaning every service-to-service connection is authenticated and encrypted. The mesh issues short-lived certificates to each service, handles automatic rotation, and enforces authorization policies based on service identity.

The practical challenge is running a mesh across cloud boundaries. The control plane needs connectivity between clusters, and latency between clouds (typically 10 to 50 ms) affects mesh performance. For most teams, the pragmatic approach is running separate meshes per cloud with explicit API gateways at the boundaries. The gateway handles authentication, rate limiting, and protocol translation between clouds.

Consistent firewall policies

Maintaining consistent network policies across AWS security groups, Azure NSGs, and GCP firewall rules is operationally challenging. Each uses different syntax, different concepts (stateful vs. stateless, priority-based vs. first-match), and different management APIs. Define your network policy intent in a cloud-agnostic format -- a YAML document listing which services can communicate with which other services on which ports -- and then translate that intent to cloud-specific rules using Terraform or a policy engine like OPA.
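A minimal sketch of that translation: one cloud-agnostic record per allowed flow, rendered into per-cloud rule shapes. The output field names are simplified stand-ins, not the exact Terraform or provider API schemas.

```python
# Cloud-agnostic intent: which service may reach which, on which port.
INTENT = [
    {"src": "api", "dst": "app", "port": 8080, "proto": "tcp"},
    {"src": "app", "dst": "db",  "port": 5432, "proto": "tcp"},
]

def to_aws(flow):
    # AWS: security-group-referenced ingress rule.
    return {"type": "ingress", "protocol": flow["proto"],
            "from_port": flow["port"], "to_port": flow["port"],
            "source_security_group": f"{flow['src']}-sg",
            "security_group": f"{flow['dst']}-sg"}

def to_azure(flow):
    # Azure: NSG rule between application security groups.
    return {"direction": "Inbound", "access": "Allow",
            "protocol": flow["proto"].capitalize(),
            "destination_port_range": str(flow["port"]),
            "source_asg": f"{flow['src']}-asg",
            "destination_asg": f"{flow['dst']}-asg"}

def to_gcp(flow):
    # GCP: tag-based VPC firewall rule.
    return {"direction": "INGRESS",
            "allowed": [{"IPProtocol": flow["proto"], "ports": [str(flow["port"])]}],
            "source_tags": [flow["src"]], "target_tags": [flow["dst"]]}

rules = {cloud: [fn(f) for f in INTENT]
         for cloud, fn in [("aws", to_aws), ("azure", to_azure), ("gcp", to_gcp)]}
```

The intent document becomes the single reviewed artifact; the per-cloud rules are generated output, so a flow that is removed from the intent disappears from all three clouds at once.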

Implementation Roadmap

Zero trust is not a weekend project. It is a multi-quarter initiative. Here is the order I recommend for teams starting from a traditional perimeter model.

  1. Phase 1 (Weeks 1 to 4): Deploy private endpoints for all PaaS services. Disable public access. This eliminates the largest attack surface with minimal application changes.
  2. Phase 2 (Weeks 4 to 8): Implement micro-segmentation. Replace broad CIDR-based rules with identity-based security group and NSG rules. Audit and remove any 0.0.0.0/0 ingress rules.
  3. Phase 3 (Weeks 8 to 12): Tighten IAM policies to least privilege. Replace long-lived credentials with managed identities and IAM roles. Implement credential rotation for any remaining static credentials.
  4. Phase 4 (Weeks 12 to 20): Deploy identity-aware access for user-facing applications. Replace VPN with IAP (GCP), Microsoft Entra application proxy (Azure), or AWS Verified Access (AWS).
  5. Phase 5 (Ongoing): Implement continuous monitoring. Set up alerts for security group changes, IAM policy modifications, and anomalous network traffic patterns. Run regular penetration tests against the zero trust controls.

Measure Your Progress

Track three metrics as you implement zero trust: the percentage of services reachable only via private endpoints (target: 100 percent), the percentage of security group rules using identity-based references vs. CIDR ranges (target: above 90 percent), and the number of long-lived credentials in use (target: zero). These metrics give you a concrete measure of your zero trust maturity.
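The three metrics reduce to simple counting once you have an inventory. The record shapes below are invented for illustration; in practice they would come from your cloud APIs or asset-inventory tooling.

```python
def zero_trust_metrics(services, sg_rules, credentials):
    """Compute the three zero trust maturity metrics from inventory records."""
    private_pct = 100 * sum(s["private_only"] for s in services) / len(services)
    identity_pct = 100 * sum(r["source_type"] == "sg" for r in sg_rules) / len(sg_rules)
    long_lived = sum(c["long_lived"] for c in credentials)
    return {"private_endpoint_pct": private_pct,   # target: 100
            "identity_rule_pct": identity_pct,     # target: > 90
            "long_lived_credentials": long_lived}  # target: 0

snapshot = zero_trust_metrics(
    services=[{"private_only": True}, {"private_only": True}, {"private_only": False}],
    sg_rules=[{"source_type": "sg"}, {"source_type": "cidr"}],
    credentials=[{"long_lived": True}],
)
```

Tracking the snapshot over time, rather than as a one-off audit, is what catches the quiet regressions described in the next section.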

The Hard Truth About Zero Trust

Zero trust increases operational complexity. More granular network rules mean more rules to manage. Private endpoints mean more infrastructure to deploy and monitor. Strict IAM policies mean more troubleshooting when a new service cannot access a resource. Some teams implement zero trust controls and then quietly add overly broad exceptions when things break, which defeats the purpose entirely.

The investment is worth it because the alternative -- a flat network with broad access -- means that a single compromised credential can reach everything. I have seen production databases exposed because a security group allowed 0.0.0.0/0 on port 5432 "temporarily" during a debugging session. I have seen S3 buckets with public read access because the bucket policy was copied from a blog post without understanding the conditions. Zero trust makes these mistakes structurally impossible, not just policy violations.

Start with the highest-risk workloads -- databases, credential stores, PII systems -- and expand outward. Perfect zero trust across your entire environment is the goal, but meaningful security improvements come from the first phase of private endpoints and micro-segmentation. Do not let the scope of the full vision prevent you from starting.

Written by CloudToolStack Team

Cloud architects with 15+ years of production experience across AWS, Azure, GCP, and OCI. We build free tools and write practical guides to help engineers navigate multi-cloud infrastructure.

Disclaimer: This article is for informational purposes. Cloud services and pricing change frequently; always verify with official provider documentation. AWS, Azure, GCP, and OCI are trademarks of their respective owners.