
Terraform for GCP

Getting started with Terraform on GCP including project setup, state management, and modules.

CloudToolStack Team · 26 min read · Published Feb 22, 2026

Prerequisites

  • GCP project with appropriate permissions
  • Terraform CLI installed (v1.0+)
  • Basic understanding of infrastructure as code

Why Terraform for GCP

Terraform is the most widely adopted Infrastructure as Code (IaC) tool for Google Cloud. While GCP offers its own Deployment Manager (now largely deprecated in favor of Terraform), Terraform provides multi-cloud support, a massive provider ecosystem, and a mature state management system. The Google Cloud Terraform provider is maintained jointly by Google and HashiCorp, ensuring fast support for new GCP features, typically within days of a service reaching GA.

Terraform uses a declarative configuration language (HCL) in which you describe the desired state of your infrastructure. The Terraform engine calculates the difference between desired and actual state, then applies only the necessary changes. Because this process is idempotent, running the same configuration repeatedly produces the same result, which is what makes Terraform safe for automated pipelines: you can run terraform apply again and again without creating duplicate resources.

The benefits of managing GCP infrastructure with Terraform include:

  • Version control: All infrastructure changes go through code review, providing an audit trail and preventing ad-hoc console modifications.
  • Reproducibility: Create identical environments (dev, staging, prod) from the same module with different variable values.
  • Dependency management: Terraform understands resource dependencies and creates/destroys resources in the correct order.
  • Drift detection: Compare actual infrastructure state against the declared configuration to detect manual changes.
  • Multi-cloud consistency: Use the same workflow and language for GCP, AWS, Azure, and hundreds of other providers.

Provider Versions Matter

The google and google-beta providers are released independently from Terraform itself. Always pin your provider version to avoid breaking changes. The google-beta provider includes features that are in preview and may have breaking changes before GA. Use google-beta only when you need a specific preview feature, and pin to a specific version.
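As a sketch, pinning both providers and opting a single resource into google-beta looks like this (the resource shown is illustrative):

```hcl
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0" # pin the major version to avoid surprise upgrades
    }
    google-beta = {
      source  = "hashicorp/google-beta"
      version = "~> 5.0"
    }
  }
}

# Only this resource opts into the beta provider; everything else stays on GA.
resource "google_compute_network" "preview_vpc" {
  provider                = google-beta
  name                    = "preview-vpc"
  auto_create_subnetworks = false
}
```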

Project Structure

A well-organized Terraform project separates concerns into logical modules and uses directory-based environments. The structure should make it easy to understand what infrastructure exists, make changes safely, and reuse modules across environments. Here is a recommended structure for GCP infrastructure:

Recommended directory layout
infrastructure/
├── modules/
│   ├── vpc/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── gke-cluster/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── cloud-sql/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── cloud-run/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   └── iam/
│       ├── main.tf
│       ├── variables.tf
│       └── outputs.tf
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── backend.tf
│   │   └── terraform.tfvars
│   ├── staging/
│   │   ├── main.tf
│   │   ├── backend.tf
│   │   └── terraform.tfvars
│   └── prod/
│       ├── main.tf
│       ├── backend.tf
│       └── terraform.tfvars
└── global/
    ├── iam.tf
    ├── org-policies.tf
    └── backend.tf

Module Design Principles

Each module should represent a single logical component of your infrastructure. Follow these design principles:

  • Single responsibility: A VPC module creates the VPC, subnets, and NAT. It does not create the GKE cluster that runs in the VPC; that belongs in a separate module.
  • Explicit inputs and outputs: Use variables.tf for all configurable values and outputs.tf for values other modules need (like VPC IDs, subnet IDs).
  • No hardcoded values: Project IDs, regions, and service-specific settings should always be variables. The module should work for any project or region.
  • Sensible defaults: Provide default values for variables where a best practice exists (e.g., private_ip_google_access = true).
modules/vpc/variables.tf
variable "project_id" {
  description = "The GCP project ID"
  type        = string
}

variable "network_name" {
  description = "Name of the VPC network"
  type        = string
}

variable "subnets" {
  description = "Map of subnet configurations"
  type = map(object({
    cidr             = string
    region           = string
    secondary_ranges = optional(map(string), {})
    flow_logs        = optional(bool, true)
  }))
}

variable "enable_nat" {
  description = "Enable Cloud NAT for each region with subnets"
  type        = bool
  default     = true
}

variable "nat_min_ports_per_vm" {
  description = "Minimum NAT ports per VM"
  type        = number
  default     = 2048
}
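The module's outputs.tf then exposes the identifiers other modules consume. A minimal sketch (output names are illustrative) against the resources defined in this module's main.tf:

```hcl
# modules/vpc/outputs.tf
output "network_id" {
  description = "ID of the VPC network"
  value       = google_compute_network.vpc.id
}

output "subnet_ids" {
  description = "Map of subnet name to subnet ID"
  value       = { for name, subnet in google_compute_subnetwork.subnets : name => subnet.id }
}
```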

Provider Configuration and State Management

Store Terraform state remotely in a GCS bucket with versioning enabled. This ensures state is not lost if a local machine fails and enables team collaboration with state locking via the built-in GCS backend. The state file is the single source of truth for what Terraform manages. Losing it means Terraform loses track of all your resources.

State Bucket Bootstrap

The state bucket is the one resource you must create before Terraform can manage anything else, an unavoidable chicken-and-egg problem: Terraform needs somewhere to store state before its first run. Create the state bucket manually or with a separate bootstrap script:

Bootstrap the Terraform state bucket
# Create the state bucket (do this once, manually)
gcloud storage buckets create gs://mycompany-terraform-state \
  --location=us-central1 \
  --uniform-bucket-level-access \
  --public-access-prevention

# Enable versioning (allows state recovery)
gcloud storage buckets update gs://mycompany-terraform-state --versioning

# Set lifecycle to keep 30 versions of state files
cat > /tmp/state-lifecycle.json << 'EOF'
{
  "lifecycle": {
    "rule": [
      {
        "action": { "type": "Delete" },
        "condition": { "numNewerVersions": 30, "isLive": false }
      }
    ]
  }
}
EOF
gcloud storage buckets update gs://mycompany-terraform-state \
  --lifecycle-file=/tmp/state-lifecycle.json

# Restrict access to only the CI/CD service account
gcloud storage buckets add-iam-policy-binding gs://mycompany-terraform-state \
  --member="serviceAccount:terraform@my-project.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"

# Remove default access (if any)
gcloud storage buckets remove-iam-policy-binding gs://mycompany-terraform-state \
  --member="projectEditor:my-project" \
  --role="roles/storage.legacyBucketOwner" 2>/dev/null || true
backend.tf - Remote state in GCS
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
    google-beta = {
      source  = "hashicorp/google-beta"
      version = "~> 5.0"
    }
  }

  backend "gcs" {
    bucket = "mycompany-terraform-state"
    prefix = "environments/prod"
  }
}

provider "google" {
  project = var.project_id
  region  = var.region
}

provider "google-beta" {
  project = var.project_id
  region  = var.region
}

Protect Your State Bucket

Your Terraform state file contains every secret and sensitive value in your infrastructure (database passwords, API keys, etc.) in plaintext. Enable versioning and object lock on the state bucket, restrict access to only your CI/CD pipeline service account, and enable Cloud Audit Logs on the bucket. Never store the state bucket configuration in the same Terraform root module it manages. Consider enabling CMEK encryption on the state bucket for additional protection.
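If you bootstrap the bucket with Terraform from a separate root module rather than gcloud, versioning and CMEK can be declared directly. A sketch; the KMS key variable is an assumption:

```hcl
resource "google_storage_bucket" "tf_state" {
  name                        = "mycompany-terraform-state"
  location                    = "us-central1"
  uniform_bucket_level_access = true
  public_access_prevention    = "enforced"
  force_destroy               = false

  versioning {
    enabled = true # allows recovering earlier state versions
  }

  encryption {
    # CMEK key, e.g. projects/.../locations/.../keyRings/.../cryptoKeys/tf-state
    default_kms_key_name = var.state_kms_key
  }
}
```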

Essential GCP Resource Patterns

Here are Terraform patterns for the most common GCP resources, incorporating production best practices. Each pattern includes security hardening, proper lifecycle management, and common configuration decisions.

VPC and Networking

modules/vpc/main.tf - Production VPC module
resource "google_compute_network" "vpc" {
  name                    = var.network_name
  auto_create_subnetworks = false
  routing_mode            = "GLOBAL"
  project                 = var.project_id
}

resource "google_compute_subnetwork" "subnets" {
  for_each = var.subnets

  name                     = each.key
  ip_cidr_range            = each.value.cidr
  region                   = each.value.region
  network                  = google_compute_network.vpc.id
  private_ip_google_access = true

  dynamic "secondary_ip_range" {
    for_each = lookup(each.value, "secondary_ranges", {})
    content {
      range_name    = secondary_ip_range.key
      ip_cidr_range = secondary_ip_range.value
    }
  }

  dynamic "log_config" {
    for_each = each.value.flow_logs ? [1] : []
    content {
      aggregation_interval = "INTERVAL_5_SEC"
      flow_sampling        = 0.5
      metadata             = "INCLUDE_ALL_METADATA"
    }
  }
}

resource "google_compute_router" "router" {
  for_each = var.enable_nat ? toset(distinct([for s in var.subnets : s.region])) : toset([])

  name    = "${var.network_name}-router-${each.key}"
  network = google_compute_network.vpc.id
  region  = each.key
}

resource "google_compute_router_nat" "nat" {
  for_each = google_compute_router.router

  name                               = "${each.value.name}-nat"
  router                             = each.value.name
  region                             = each.value.region
  nat_ip_allocate_option             = "AUTO_ONLY"
  source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"

  enable_dynamic_port_allocation = true
  min_ports_per_vm               = var.nat_min_ports_per_vm
  max_ports_per_vm               = 65536

  log_config {
    enable = true
    filter = "ERRORS_ONLY"
  }
}

Cloud SQL with High Availability

modules/cloud-sql/main.tf - Cloud SQL with HA
resource "google_sql_database_instance" "primary" {
  name                = var.instance_name
  database_version    = "POSTGRES_15"
  region              = var.region
  project             = var.project_id
  deletion_protection = true

  settings {
    tier              = var.tier
    availability_type = "REGIONAL"  # Enables HA with automatic failover
    disk_autoresize   = true
    disk_size         = var.disk_size_gb
    disk_type         = "PD_SSD"

    backup_configuration {
      enabled                        = true
      point_in_time_recovery_enabled = true
      start_time                     = "03:00"
      transaction_log_retention_days = 7

      backup_retention_settings {
        retained_backups = 30
        retention_unit   = "COUNT"
      }
    }

    ip_configuration {
      ipv4_enabled                                  = false  # No public IP
      private_network                               = var.vpc_id
      enable_private_path_for_google_cloud_services = true
    }

    maintenance_window {
      day          = 7  # Sunday
      hour         = 4  # 4 AM UTC
      update_track = "stable"
    }

    insights_config {
      query_insights_enabled  = true
      record_application_tags = true
      record_client_address   = true
    }

    database_flags {
      name  = "log_checkpoints"
      value = "on"
    }
    database_flags {
      name  = "log_connections"
      value = "on"
    }
    database_flags {
      name  = "log_disconnections"
      value = "on"
    }
  }

  lifecycle {
    prevent_destroy = true
  }
}

resource "google_sql_database" "database" {
  name     = var.database_name
  instance = google_sql_database_instance.primary.name
}

resource "random_password" "db_password" {
  length  = 32
  special = true
}

resource "google_secret_manager_secret" "db_password" {
  secret_id = "${var.instance_name}-db-password"

  replication {
    auto {}
  }
}

resource "google_secret_manager_secret_version" "db_password" {
  secret      = google_secret_manager_secret.db_password.id
  secret_data = random_password.db_password.result
}

resource "google_sql_user" "app_user" {
  name     = var.database_user
  instance = google_sql_database_instance.primary.name
  password = random_password.db_password.result
}
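An environment root module would then consume this module roughly like so (a sketch; the variable values and the VPC output name are illustrative):

```hcl
# environments/prod/main.tf
module "cloud_sql" {
  source = "../../modules/cloud-sql"

  project_id    = var.project_id
  region        = var.region
  instance_name = "prod-db"
  tier          = "db-custom-4-16384"
  disk_size_gb  = 100
  database_name = "app"
  database_user = "app"
  vpc_id        = module.vpc.network_id # assumes the VPC module exposes this output
}
```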

Cloud Run Service

modules/cloud-run/main.tf - Cloud Run service
resource "google_cloud_run_v2_service" "service" {
  name     = var.service_name
  location = var.region
  project  = var.project_id

  template {
    service_account = var.service_account_email

    scaling {
      min_instance_count = var.min_instances
      max_instance_count = var.max_instances
    }

    vpc_access {
      network_interfaces {
        network    = var.vpc_name
        subnetwork = var.subnet_name
      }
      egress = "PRIVATE_RANGES_ONLY"
    }

    containers {
      image = var.image

      ports {
        container_port = var.port
      }

      resources {
        limits = {
          cpu    = var.cpu
          memory = var.memory
        }
        cpu_idle          = true
        startup_cpu_boost = true
      }

      dynamic "env" {
        for_each = var.env_vars
        content {
          name  = env.key
          value = env.value
        }
      }

      dynamic "env" {
        for_each = var.secret_env_vars
        content {
          name = env.key
          value_source {
            secret_key_ref {
              secret  = env.value.secret_id
              version = env.value.version
            }
          }
        }
      }

      startup_probe {
        http_get {
          path = var.health_check_path
          port = var.port
        }
        initial_delay_seconds = 5
        period_seconds        = 10
        failure_threshold     = 3
      }
    }
  }

  traffic {
    type    = "TRAFFIC_TARGET_ALLOCATION_TYPE_LATEST"
    percent = 100
  }

  lifecycle {
    ignore_changes = [
      template[0].containers[0].image,  # Image updated by CI/CD
    ]
  }
}

resource "google_cloud_run_v2_service_iam_member" "invoker" {
  for_each = toset(var.invoker_members)

  project  = var.project_id
  location = var.region
  name     = google_cloud_run_v2_service.service.name
  role     = "roles/run.invoker"
  member   = each.value
}
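It helps to export the service URI so callers and DNS records can reference it. A sketch for modules/cloud-run/outputs.tf:

```hcl
output "service_uri" {
  description = "HTTPS URI of the deployed Cloud Run service"
  value       = google_cloud_run_v2_service.service.uri
}

output "service_name" {
  description = "Name of the Cloud Run service"
  value       = google_cloud_run_v2_service.service.name
}
```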

GKE Autopilot Cluster

modules/gke-cluster/main.tf - GKE Autopilot
resource "google_container_cluster" "autopilot" {
  name     = var.cluster_name
  location = var.region
  project  = var.project_id

  enable_autopilot = true

  network    = var.vpc_id
  subnetwork = var.subnet_id

  ip_allocation_policy {
    cluster_secondary_range_name  = var.pods_range_name
    services_secondary_range_name = var.services_range_name
  }

  private_cluster_config {
    enable_private_nodes    = true
    enable_private_endpoint = false
    master_ipv4_cidr_block  = var.master_cidr
  }

  master_authorized_networks_config {
    dynamic "cidr_blocks" {
      for_each = var.authorized_networks
      content {
        cidr_block   = cidr_blocks.value.cidr
        display_name = cidr_blocks.value.name
      }
    }
  }

  release_channel {
    channel = "REGULAR"
  }

  workload_identity_config {
    workload_pool = "${var.project_id}.svc.id.goog"
  }

  dns_config {
    cluster_dns       = "CLOUD_DNS"
    cluster_dns_scope = "VPC_SCOPE"
  }

  lifecycle {
    prevent_destroy = true
    ignore_changes = [
      node_pool,  # Managed by Autopilot
    ]
  }
}

API Enablement

A missing API enable is the most common cause of “Permission denied” errors on first deployment. Use the google_project_service resource to enable APIs declaratively. This ensures that the APIs your infrastructure depends on are always enabled and prevents manual console steps.

Enable required GCP APIs
locals {
  required_apis = [
    "compute.googleapis.com",
    "container.googleapis.com",
    "sqladmin.googleapis.com",
    "run.googleapis.com",
    "cloudfunctions.googleapis.com",
    "cloudbuild.googleapis.com",
    "secretmanager.googleapis.com",
    "dns.googleapis.com",
    "monitoring.googleapis.com",
    "logging.googleapis.com",
    "cloudresourcemanager.googleapis.com",
    "iam.googleapis.com",
    "artifactregistry.googleapis.com",
    "servicenetworking.googleapis.com",
  ]
}

resource "google_project_service" "apis" {
  for_each = toset(local.required_apis)

  project = var.project_id
  service = each.value

  disable_on_destroy         = false
  disable_dependent_services = false
}
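Resources that need a particular API should declare an explicit dependency on its enabler so first-time deployments order correctly. A sketch assuming the google_project_service.apis resource above:

```hcl
resource "google_cloud_run_v2_service" "api" {
  name     = "api"
  location = var.region
  project  = var.project_id

  template {
    containers {
      image = var.image
    }
  }

  # Ensure the Cloud Run API is enabled before the service is created
  depends_on = [google_project_service.apis["run.googleapis.com"]]
}
```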

Set disable_on_destroy to false

Always set disable_on_destroy = false on google_project_service resources. If set to true (the default), running terraform destroy will disable the API, which can delete all resources associated with that API. This is almost never the desired behavior and can cause catastrophic data loss.

IAM Management with Terraform

Managing IAM with Terraform requires care to avoid clobbering existing bindings. GCP offers three Terraform resources for IAM, each with different behavior:

| Resource | Behavior | When to Use | Risk Level |
| --- | --- | --- | --- |
| google_project_iam_policy | Replaces the entire project IAM policy | Almost never: only for full-control automation | Dangerous: can lock out all users |
| google_project_iam_binding | Controls all members for a specific role | When you want Terraform to fully own a role's membership | Medium: can remove manually-added members |
| google_project_iam_member | Adds a single member to a role (additive) | Default choice: safe, additive, does not remove existing bindings | Low: only adds, never removes |
IAM management best practices
# PREFERRED: Use iam_member for additive bindings
resource "google_project_iam_member" "cloud_run_sa" {
  project = var.project_id
  role    = "roles/run.invoker"
  member  = "serviceAccount:${google_service_account.api.email}"
}

# Create a dedicated service account per workload
resource "google_service_account" "api" {
  account_id   = "api-service"
  display_name = "API Service Account"
  project      = var.project_id
}

# Grant specific permissions to the service account
resource "google_project_iam_member" "api_permissions" {
  for_each = toset([
    "roles/cloudsql.client",
    "roles/secretmanager.secretAccessor",
    "roles/logging.logWriter",
    "roles/monitoring.metricWriter",
  ])

  project = var.project_id
  role    = each.value
  member  = "serviceAccount:${google_service_account.api.email}"
}

# Workload Identity binding for GKE
resource "google_service_account_iam_member" "workload_identity" {
  service_account_id = google_service_account.api.name
  role               = "roles/iam.workloadIdentityUser"
  member             = "serviceAccount:${var.project_id}.svc.id.goog[production/api-sa]"
}
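For running Terraform itself without downloading keys, the google provider can impersonate a service account on top of your short-lived user credentials (a sketch; the account name is illustrative):

```hcl
provider "google" {
  project = var.project_id
  region  = var.region

  # Exchanges your ADC credentials for short-lived tokens as this account;
  # requires roles/iam.serviceAccountTokenCreator on the service account.
  impersonate_service_account = "terraform@my-project.iam.gserviceaccount.com"
}
```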

CI/CD Integration

Automate Terraform runs using Cloud Build or GitHub Actions. The key principles are: always run terraform plan on pull requests for review, require approval before terraform apply, and never run apply locally for production environments. Automating Terraform prevents drift, ensures consistency, and provides a complete audit trail of who changed what and when.

Cloud Build Pipeline

cloudbuild.yaml - Terraform CI/CD pipeline
steps:
  - id: 'terraform-init'
    name: 'hashicorp/terraform:1.7'
    entrypoint: 'sh'
    args:
      - '-c'
      - 'cd environments/$_ENVIRONMENT && terraform init -no-color'

  - id: 'terraform-validate'
    name: 'hashicorp/terraform:1.7'
    entrypoint: 'sh'
    args:
      - '-c'
      - 'cd environments/$_ENVIRONMENT && terraform validate -no-color'

  - id: 'terraform-plan'
    name: 'hashicorp/terraform:1.7'
    entrypoint: 'sh'
    args:
      - '-c'
      - |
        cd environments/$_ENVIRONMENT
        terraform plan -no-color -out=tfplan
        terraform show -no-color tfplan > plan-output.txt

  - id: 'terraform-apply'
    name: 'hashicorp/terraform:1.7'
    entrypoint: 'sh'
    args:
      - '-c'
      - 'cd environments/$_ENVIRONMENT && terraform apply -no-color tfplan'

substitutions:
  _ENVIRONMENT: 'dev'

options:
  logging: CLOUD_LOGGING_ONLY

GitHub Actions Pipeline

.github/workflows/terraform.yml
name: Terraform
on:
  pull_request:
    paths: ['infrastructure/**']
  push:
    branches: [main]
    paths: ['infrastructure/**']

permissions:
  id-token: write  # Required for Workload Identity Federation
  contents: read
  pull-requests: write

jobs:
  plan:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        environment: [dev, staging, prod]
    steps:
      - uses: actions/checkout@v4

      - id: auth
        uses: google-github-actions/auth@v2
        with:
          workload_identity_provider: projects/123456/locations/global/workloadIdentityPools/github/providers/github
          service_account: terraform@my-project.iam.gserviceaccount.com

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.7.0

      - name: Terraform Init
        working-directory: infrastructure/environments/${{ matrix.environment }}
        run: terraform init

      - name: Terraform Plan
        id: plan
        working-directory: infrastructure/environments/${{ matrix.environment }}
        run: terraform plan -no-color -out=tfplan

      - name: Post Plan to PR
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          script: |
            const output = `#### Terraform Plan - ${{ matrix.environment }}
            \`\`\`
            ${{ steps.plan.outputs.stdout }}
            \`\`\``;
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: output,
            });

  apply:
    needs: plan
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: production  # Requires approval
    steps:
      - uses: actions/checkout@v4
      # ... auth and apply steps

Never Run Apply Without Plan Review

Always separate plan and apply into distinct steps with a review gate between them. For production environments, require at least one approval before apply runs. Use GitHub Environments or Cloud Build approval gates to enforce this. An unreviewed apply against production can destroy databases, delete storage buckets, or modify IAM policies in ways that are difficult to reverse.


State Management Advanced Patterns

As your infrastructure grows, state management becomes increasingly important. Large state files slow down every Terraform operation, and a single state file for all infrastructure creates a blast radius where a mistake in one area can affect everything.

State Splitting Strategy

Split your state into logical units based on change frequency and risk:

| State Scope | Change Frequency | Risk Level | Examples |
| --- | --- | --- | --- |
| Global / Organization | Rarely | Critical | Org policies, IAM, folders |
| Shared Infrastructure | Monthly | High | VPCs, DNS, Interconnects |
| Application Infrastructure | Weekly | Medium | GKE clusters, Cloud SQL, Cloud Run |
| Application Configuration | Daily | Low | Environment variables, feature flags |

Remote State Data Sources

When splitting state, modules that depend on each other need to share data. Use terraform_remote_state data sources or, better yet, use google_* data sources to look up resources directly from GCP. Data sources are more resilient because they do not depend on the structure of another module's state.

Using data sources instead of remote state
# PREFERRED: Look up VPC directly from GCP
data "google_compute_network" "vpc" {
  name    = "prod-vpc"
  project = var.host_project_id
}

data "google_compute_subnetwork" "subnet" {
  name    = "prod-us-central1"
  region  = "us-central1"
  project = var.host_project_id
}

# Use the data source outputs in your resources
resource "google_container_cluster" "autopilot" {
  network    = data.google_compute_network.vpc.id
  subnetwork = data.google_compute_subnetwork.subnet.id
  # ...
}

# ALTERNATIVE: Remote state (creates coupling between modules)
data "terraform_remote_state" "network" {
  backend = "gcs"
  config = {
    bucket = "mycompany-terraform-state"
    prefix = "shared/network"
  }
}

# Use remote state outputs
resource "google_container_cluster" "autopilot" {
  network    = data.terraform_remote_state.network.outputs.vpc_id
  subnetwork = data.terraform_remote_state.network.outputs.subnet_id
  # ...
}

Import and Migration

If you have existing GCP resources created via the console or gcloud, you need to bring them under Terraform management. Terraform 1.5+ supports import blocks, which are the preferred way to import resources because they can be code-reviewed and are declarative.

Import existing resources with import blocks (Terraform 1.5+)
# Import an existing VPC
import {
  to = google_compute_network.vpc
  id = "projects/my-project/global/networks/prod-vpc"
}

# Import an existing Cloud SQL instance
import {
  to = google_sql_database_instance.primary
  id = "projects/my-project/instances/prod-db"
}

# Import an existing Cloud Run service
import {
  to = google_cloud_run_v2_service.api
  id = "projects/my-project/locations/us-central1/services/api-service"
}

# After adding import blocks, run:
# terraform plan -generate-config-out=generated.tf
# This generates the HCL configuration for imported resources
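The inverse operation, dropping a resource from state without destroying it, is also declarative since Terraform 1.7 via removed blocks (a sketch; the resource address is illustrative):

```hcl
# Stop managing this network in Terraform, but leave it running in GCP
removed {
  from = google_compute_network.legacy

  lifecycle {
    destroy = false
  }
}
```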

Use Google's Bulk Export Tool

Google provides the gcloud resource-config bulk-export command that can generate Terraform HCL from existing GCP resources. This dramatically speeds up migration from console-managed infrastructure to Terraform. Run it against a project or organization to generate a starting point, then refactor into proper modules.

Common Pitfalls and Best Practices

After managing GCP infrastructure with Terraform across many organizations, these are the most impactful best practices and the most common pitfalls to avoid:

Resource Protection

  • Use prevent_destroy lifecycle rules on databases, storage buckets, and other stateful resources to prevent accidental destruction. A terraform destroy or a resource rename without a moved block will attempt to delete and recreate the resource.
  • Use moved blocks when refactoring resource addresses to avoid destroy/recreate cycles. This tells Terraform that a resource has been renamed or reorganized, not deleted.
  • Use ignore_changes for fields managed outside Terraform (like container image tags updated by CI/CD).
Resource protection patterns
# Prevent accidental deletion of stateful resources
resource "google_sql_database_instance" "primary" {
  # ...
  deletion_protection = true
  lifecycle {
    prevent_destroy = true
  }
}

resource "google_storage_bucket" "data" {
  # ...
  lifecycle {
    prevent_destroy = true
  }
}

# Use moved blocks when refactoring
moved {
  from = google_compute_network.main
  to   = module.network.google_compute_network.vpc
}

# Ignore CI/CD-managed fields
resource "google_cloud_run_v2_service" "api" {
  # ...
  lifecycle {
    ignore_changes = [
      template[0].containers[0].image,
    ]
  }
}

General Best Practices

  • Never hardcode project IDs or regions. Use variables and data sources so modules are reusable across environments.
  • Use google_project_service to enable APIs declaratively. A missing API enable is the most common cause of “Permission denied” errors on first deployment.
  • Pin module versions when using the Terraform Registry. Unpinned modules can introduce breaking changes during terraform init.
  • Run terraform fmt and terraform validate in CI before every plan. These catch syntax errors and formatting inconsistencies early.
  • Use tflint and checkov for static analysis. These tools catch security misconfigurations, deprecated patterns, and compliance violations before apply.
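As a sketch, format and lint checks could run as early steps in the Cloud Build pipeline shown earlier (the tflint image reference is an assumption; pin a specific version in practice):

```yaml
  - id: 'terraform-fmt'
    name: 'hashicorp/terraform:1.7'
    entrypoint: 'sh'
    args:
      - '-c'
      - 'terraform fmt -check -recursive -no-color'

  - id: 'tflint'
    name: 'ghcr.io/terraform-linters/tflint'
    args: ['--recursive']
```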

Google Cloud Foundation Toolkit

Google maintains the Cloud Foundation Toolkit (CFT), a collection of production-ready Terraform modules for GCP. These modules encapsulate Google's recommended practices and are extensively tested. Start with CFT modules for VPC, GKE, Cloud SQL, and project factory rather than writing everything from scratch. CFT modules handle edge cases and security hardening that are easy to miss in custom modules.

Terraform Checklist for GCP Projects

| Category | Check | Priority |
| --- | --- | --- |
| State | Remote state in GCS with versioning and restricted access | Critical |
| Providers | Provider versions pinned with ~> constraint | High |
| APIs | All required APIs enabled via google_project_service | High |
| CI/CD | Plan on PR, apply on merge with approval gate | High |
| Protection | prevent_destroy on all stateful resources | High |
| IAM | Using iam_member (not iam_policy or iam_binding) | High |
| Modules | Reusable modules with clear inputs/outputs | Medium |
| Linting | tflint and checkov running in CI | Medium |

Key Takeaways

  1. Use the google and google-beta Terraform providers for full GCP API coverage.
  2. Store Terraform state in a GCS backend with versioning and locking enabled.
  3. Organize code into reusable modules for VPC, GKE, IAM, and other common patterns.
  4. Use Workload Identity Federation for Terraform CI/CD to avoid service account keys.
  5. Implement plan-and-apply workflows with manual approval gates for production changes.
  6. Use terraform import and the GCP Terraform resource generation tool for brownfield adoption.

Frequently Asked Questions

How do I set up Terraform state for GCP?
Create a GCS bucket with versioning enabled. Configure the Terraform backend with bucket name, prefix, and project. Enable object versioning for state history. Use state locking to prevent concurrent modifications in team environments.
Should I use Terraform or Deployment Manager for GCP?
Use Terraform. Google Cloud Deployment Manager is in maintenance mode with no new features. Terraform has a much larger community, better tooling, and supports multi-cloud. Google actively contributes to the Terraform Google provider.
How do I authenticate Terraform with GCP?
For local development, use 'gcloud auth application-default login'. For CI/CD, use Workload Identity Federation (no keys needed). Avoid downloading service account keys. The google provider auto-discovers credentials from the environment.
What is the recommended Terraform project structure for GCP?
Use separate directories per environment (dev, staging, prod) with shared modules. Keep modules in a modules/ directory. Use terragrunt or workspaces for environment management. Separate state files per environment to limit blast radius.
How do I import existing GCP resources into Terraform?
Use 'terraform import' with the resource address and GCP resource ID. Google also provides gcloud resource-config bulk-export to generate Terraform HCL from existing resources. Always run 'terraform plan' after import to verify no drift.

Written by CloudToolStack Team

Cloud engineers and architects with hands-on experience across AWS, Azure, and GCP. We write guides based on real-world production patterns, not just documentation rewrites.

Disclaimer: This guide is for educational purposes. Cloud services change frequently; always refer to official documentation for the latest information. AWS, Azure, and GCP are trademarks of their respective owners.