
Centralized Logging Architecture

Aggregate logs across clouds with OpenTelemetry, the ELK Stack, Grafana Loki, and native services, with guidance on cost optimization and cross-cloud correlation.

CloudToolStack Team · 24 min read · Published Mar 14, 2026

Prerequisites

  • Basic understanding of logging and observability concepts
  • Familiarity with at least one cloud logging service

Why Centralized Logging Matters

In multi-cloud environments, logs are scattered across providers, services, and regions. AWS generates logs in CloudWatch Logs and CloudTrail, Azure uses Monitor Logs and Activity Log, GCP uses Cloud Logging, and OCI uses the Logging service. Without centralization, investigating incidents requires switching between four different consoles, query languages, and time formats. A centralized logging architecture aggregates all logs into a single platform for unified search, correlation, alerting, and compliance.

This guide covers the native logging services of each cloud provider, aggregation patterns for centralization, popular logging platforms (ELK, Datadog, Splunk, Grafana Loki), OpenTelemetry for vendor-neutral log collection, and cost optimization strategies for high-volume log environments.

Logging Cost Is Significant

Logging costs can easily exceed your compute costs if not managed carefully. CloudWatch Logs charges $0.50/GB for ingestion and $0.03/GB/month for storage. Azure Monitor charges $2.76/GB for ingestion. Cloud Logging charges $0.50/GB beyond the free 50 GB/month. A busy application generating 100 GB/day of logs costs $1,500-8,000/month in log ingestion alone. Always implement log level filtering, sampling, and retention policies.
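The arithmetic behind those figures can be sketched as a quick estimate (the per-GB rates are the list prices quoted above; actual bills vary by region, tier, and committed-use discounts):

```python
# Rough monthly ingestion cost for 100 GB/day of logs, using the
# per-GB list prices quoted above.
GB_PER_DAY = 100
GB_PER_MONTH = GB_PER_DAY * 30  # 3000 GB

aws_cloudwatch = GB_PER_MONTH * 0.50            # $0.50/GB ingestion
azure_monitor = GB_PER_MONTH * 2.76             # $2.76/GB ingestion
gcp_logging = max(GB_PER_MONTH - 50, 0) * 0.50  # first 50 GB/month free

print(f"AWS:   ${aws_cloudwatch:,.0f}")
print(f"Azure: ${azure_monitor:,.0f}")
print(f"GCP:   ${gcp_logging:,.0f}")
```

At this volume the spread between providers is wide, which is why filtering and sampling (covered below) matter before choosing a destination.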

Native Logging Services

| Feature | AWS CloudWatch Logs | Azure Monitor Logs | GCP Cloud Logging |
|---|---|---|---|
| Query language | CloudWatch Logs Insights | KQL (Kusto Query Language) | Logging Query Language |
| Retention | 1 day to indefinite | 30 days to 2 years | 30 days (default), custom buckets |
| Free tier | 5 GB ingest, 5 GB storage | 5 GB/month (31 days retention included) | 50 GB/month |
| Streaming export | Subscription filters to S3/Lambda/Kinesis | Diagnostic settings to Event Hubs/Storage | Log sinks to Pub/Sub/Storage/BigQuery |
| Alerting | Metric filters + CloudWatch Alarms | Alert rules (KQL-based) | Log-based metrics + alerting |

Cloud-Native Log Collection

AWS CloudWatch Logs

```bash
# Stream CloudWatch Logs to Kinesis for centralization
aws logs put-subscription-filter \
  --log-group-name "/aws/lambda/myapp" \
  --filter-name "to-kinesis" \
  --filter-pattern "" \
  --destination-arn "arn:aws:kinesis:us-east-1:123456789012:stream/log-stream"

# Export logs to S3 (batch)
aws logs create-export-task \
  --log-group-name "/aws/lambda/myapp" \
  --from 1709251200000 \
  --to 1709337600000 \
  --destination "log-export-bucket" \
  --destination-prefix "cloudwatch-logs/"

# Query logs with CloudWatch Logs Insights (asynchronous: returns a queryId)
aws logs start-query \
  --log-group-name "/aws/lambda/myapp" \
  --start-time 1709251200 \
  --end-time 1709337600 \
  --query-string 'fields @timestamp, @message | filter @message like /ERROR/ | sort @timestamp desc | limit 50'

# Fetch the results once the query status is "Complete"
aws logs get-query-results --query-id "QUERY_ID"
```

GCP Cloud Logging

```bash
# Create a log sink to BigQuery for analysis
gcloud logging sinks create bigquery-audit-sink \
  "bigquery.googleapis.com/projects/PROJECT/datasets/audit_logs" \
  --log-filter='logName:"cloudaudit.googleapis.com"' \
  --organization=ORG_ID \
  --include-children

# Create a log sink to Pub/Sub for streaming to an external SIEM
gcloud logging sinks create pubsub-all-logs \
  "pubsub.googleapis.com/projects/PROJECT/topics/all-logs" \
  --log-filter='severity >= WARNING' \
  --project=PROJECT

# Create a log sink to Cloud Storage for archival
gcloud logging sinks create storage-archive \
  "storage.googleapis.com/log-archive-bucket" \
  --log-filter='resource.type="gce_instance"' \
  --project=PROJECT
```

Azure Monitor Logs

```bash
# Create a Log Analytics workspace
az monitor log-analytics workspace create \
  --workspace-name central-logs \
  --resource-group monitoring-rg \
  --location eastus \
  --retention-time 90

# Enable diagnostic settings for a resource
az monitor diagnostic-settings create \
  --name "send-to-workspace" \
  --resource "/subscriptions/SUB_ID/resourceGroups/myapp-rg/providers/Microsoft.Web/sites/myapp" \
  --workspace "/subscriptions/SUB_ID/resourceGroups/monitoring-rg/providers/Microsoft.OperationalInsights/workspaces/central-logs" \
  --logs '[{"category": "AppServiceHTTPLogs", "enabled": true}, {"category": "AppServiceConsoleLogs", "enabled": true}]' \
  --metrics '[{"category": "AllMetrics", "enabled": true}]'

# Query with KQL
az monitor log-analytics query \
  --workspace central-logs-id \
  --analytics-query "AppServiceHTTPLogs | where ScStatus >= 500 | summarize count() by bin(TimeGenerated, 1h) | order by TimeGenerated desc"
```

Centralized Logging Architecture

There are three main approaches to centralizing logs across clouds: aggregating to a single cloud's native service, using a dedicated log management platform, or building a self-hosted solution.

Architecture Patterns

| Pattern | Implementation | Pros | Cons |
|---|---|---|---|
| Single cloud aggregation | Forward all logs to one cloud (e.g., S3 + Athena) | Simple, uses native tools | Egress costs, single-cloud dependency |
| SaaS platform | Datadog, Splunk Cloud, Elastic Cloud | Best UX, managed, multi-cloud native | Expensive at scale |
| Self-hosted | ELK Stack, Grafana Loki, ClickHouse | Cost-effective at scale, full control | Operational overhead |

OpenTelemetry for Unified Collection

OpenTelemetry (OTel) provides a vendor-neutral standard for collecting logs, metrics, and traces. The OpenTelemetry Collector can receive logs from any source (files, syslog, cloud APIs) and export them to any backend (Elasticsearch, Loki, Datadog, CloudWatch). Using OTel decouples your log collection from the backend, enabling easy migration between platforms.

```yaml
# OpenTelemetry Collector configuration for multi-cloud logs
receivers:
  # Receive logs from applications via OTLP
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

  # Collect AWS CloudWatch Logs
  awscloudwatch:
    region: us-east-1
    logs:
      poll_interval: 1m
      groups:
        named:
          /aws/lambda/myapp:
          /aws/ecs/myapp:

  # Collect from files (for VMs)
  filelog:
    include:
      - /var/log/app/*.log
    operators:
      - type: json_parser
        timestamp:
          parse_from: attributes.timestamp
          layout: '%Y-%m-%dT%H:%M:%SZ'

processors:
  # Add cloud provider metadata
  attributes:
    actions:
      - key: cloud.provider
        value: "multi-cloud"
        action: upsert

  # Filter out noisy logs
  filter:
    logs:
      exclude:
        match_type: regexp
        bodies:
          - "health check"
          - "readiness probe"

  # Batch for efficiency (place last, after filtering, so dropped
  # records are never batched or exported)
  batch:
    send_batch_size: 10000
    timeout: 10s

exporters:
  # Send to Elasticsearch
  elasticsearch:
    endpoints: ["https://elasticsearch.internal:9200"]
    logs_index: "cloud-logs"

  # Send to Grafana Loki
  loki:
    endpoint: "https://loki.internal:3100/loki/api/v1/push"
    labels:
      attributes:
        cloud.provider:
        service.name:

service:
  pipelines:
    logs:
      receivers: [otlp, awscloudwatch, filelog]
      processors: [attributes, filter, batch]
      exporters: [elasticsearch, loki]
```

Log Correlation and Investigation

Effective log analysis requires correlating events across clouds. Use consistent structured logging formats, trace IDs, and request IDs to connect related events. A request that starts in a GCP Cloud Run service, calls an AWS Lambda function, and writes to an Azure Cosmos DB should be traceable end-to-end through a single correlation ID.

```json
{
  "timestamp": "2026-03-14T10:30:00.000Z",
  "level": "INFO",
  "service": "order-api",
  "cloud": "aws",
  "region": "us-east-1",
  "trace_id": "abc123def456",
  "span_id": "span-789",
  "request_id": "req-abc-123",
  "message": "Order created successfully",
  "order_id": "ord-456",
  "customer_id": "cust-789",
  "duration_ms": 142,
  "http_status": 201
}
```

Cost Optimization Strategies

| Strategy | Savings | Implementation |
|---|---|---|
| Log level filtering | 50-80% volume reduction | Only ingest WARN+ in production; DEBUG in dev only |
| Sampling | 90%+ for high-volume services | Sample 10% of success logs; keep 100% of errors |
| Tiered retention | 60-80% storage savings | Hot: 7 days, Warm: 30 days, Cold: 1 year |
| Structured logging | Faster queries, less processing | JSON format with consistent field names |
| Log routing | Varies | Route debug logs to cheap storage, errors to SIEM |
| Compression | 5-10x storage reduction | GZIP/ZSTD for archived logs in object storage |
```bash
# AWS: Set CloudWatch log retention to reduce costs
aws logs put-retention-policy \
  --log-group-name "/aws/lambda/myapp" \
  --retention-in-days 14

# GCP: Exclude debug logs from ingestion by adding an exclusion
# to the _Default sink (excluded entries are never billed)
gcloud logging sinks update _Default \
  --add-exclusion=name=exclude-debug,filter='severity="DEBUG"' \
  --project=PROJECT

# GCP: Route logs to a cheaper bucket with longer retention
gcloud logging buckets create cold-logs \
  --location=global \
  --retention-days=365 \
  --project=PROJECT

gcloud logging sinks create route-to-cold \
  "logging.googleapis.com/projects/PROJECT/locations/global/buckets/cold-logs" \
  --log-filter='severity<=INFO' \
  --project=PROJECT
```
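The sampling row in the table above (keep every error, sample a fraction of routine logs) can be sketched as a hypothetical filter function, applied at the producer or in a collector processor so dropped entries are never billed:

```python
import random

# Severities that must never be dropped.
KEEP_ALWAYS = {"WARN", "WARNING", "ERROR", "FATAL", "CRITICAL"}

def should_keep(level: str, sample_rate: float = 0.10,
                rng: random.Random = random) -> bool:
    """Keep every warning-or-worse log; sample the rest at sample_rate."""
    if level.upper() in KEEP_ALWAYS:
        return True
    return rng.random() < sample_rate
```

With the defaults this keeps 100% of errors and roughly 10% of INFO/DEBUG traffic, matching the "90%+ for high-volume services" savings estimate. For trace-aware pipelines, prefer sampling by `trace_id` so all logs of a sampled request survive together.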

Start with Structured Logging

The single most impactful investment in logging is structured logging (JSON format with consistent field names). Structured logs enable efficient parsing, indexing, and querying across all platforms. Define a logging schema that includes timestamp, level, service name, cloud provider, region, trace ID, and request ID. Enforce this schema across all teams and clouds.
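One way to enforce such a schema with Python's standard logging module (a sketch; the field names follow the example entry earlier in this guide, and the `ctx` convention for per-request fields is an assumption, not a stdlib feature):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render every record as one JSON object with the shared schema."""

    def __init__(self, service: str, cloud: str, region: str):
        super().__init__()
        # Fields that are constant for this deployment.
        self.static = {"service": service, "cloud": cloud, "region": region}

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "message": record.getMessage(),
            **self.static,
            # Per-request context (trace_id, request_id) passed via `extra`.
            **getattr(record, "ctx", {}),
        }
        return json.dumps(entry)
```

Usage: `logger.info("Order created", extra={"ctx": {"trace_id": "abc123"}})` emits one JSON line that any of the platforms above can parse and index without custom grok rules.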


Key Takeaways

  1. Native logging services use different query languages: Insights, KQL, and LQL.
  2. OpenTelemetry Collector provides vendor-neutral log collection across all clouds.
  3. Structured JSON logging with consistent fields enables cross-cloud correlation.
  4. Log level filtering and sampling reduce costs by 50-90%.

Frequently Asked Questions

What is the best centralized logging platform for multi-cloud?
Small teams: aggregate to one cloud. Medium teams: SaaS platform (Datadog, Elastic). Large teams: self-host Loki or ELK. The best choice depends on team size, budget, and operational capacity.
How do I correlate logs across clouds?
Use consistent trace IDs and request IDs in all log entries and propagate them across service calls via HTTP headers (such as the W3C traceparent header). OpenTelemetry's context propagation handles this automatically for instrumented services.

Written by CloudToolStack Team

Cloud engineers and architects with hands-on experience across AWS, Azure, and GCP. We write guides based on real-world production patterns, not just documentation rewrites.

Disclaimer: This guide is for educational purposes. Cloud services change frequently; always refer to official documentation for the latest information. AWS, Azure, and GCP are trademarks of their respective owners.