
Centralized Logging Architecture

Aggregate logs across clouds with OpenTelemetry, the ELK Stack, Grafana Loki, and native services, with guidance on cost optimization and cross-cloud correlation.

CloudToolStack Team · 24 min read · Published Mar 14, 2026

Prerequisites

  • Basic understanding of logging and observability concepts
  • Familiarity with at least one cloud logging service

Why Centralized Logging Matters

In multi-cloud environments, logs are scattered across providers, services, and regions. AWS generates logs in CloudWatch Logs and CloudTrail, Azure uses Monitor Logs and Activity Log, GCP uses Cloud Logging, and OCI uses the Logging service. Without centralization, investigating incidents requires switching between four different consoles, query languages, and time formats. A centralized logging architecture aggregates all logs into a single platform for unified search, correlation, alerting, and compliance.

This guide covers the native logging services of each cloud provider, aggregation patterns for centralization, popular logging platforms (ELK, Datadog, Splunk, Grafana Loki), OpenTelemetry for vendor-neutral log collection, and cost optimization strategies for high-volume log environments.

Logging Cost Is Significant

Logging costs can easily exceed your compute costs if not managed carefully. CloudWatch Logs charges $0.50/GB for ingestion and $0.03/GB/month for storage. Azure Monitor charges $2.76/GB for ingestion. Cloud Logging charges $0.50/GB beyond the free 50 GB/month. A busy application generating 100 GB/day of logs costs $1,500-8,000/month in log ingestion alone. Always implement log level filtering, sampling, and retention policies.
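The arithmetic behind those figures can be sketched as a quick estimate (the per-GB rates are the list prices quoted above; actual bills vary by region, tier, and committed-use discounts):

```python
# Rough monthly ingestion cost for 100 GB/day of logs, using the
# per-GB list prices quoted above.
GB_PER_DAY = 100
GB_PER_MONTH = GB_PER_DAY * 30  # 3000 GB

aws_cloudwatch = GB_PER_MONTH * 0.50            # $0.50/GB ingestion
azure_monitor = GB_PER_MONTH * 2.76             # $2.76/GB ingestion
gcp_logging = max(GB_PER_MONTH - 50, 0) * 0.50  # first 50 GB/month free

print(f"AWS:   ${aws_cloudwatch:,.0f}")
print(f"Azure: ${azure_monitor:,.0f}")
print(f"GCP:   ${gcp_logging:,.0f}")
```

At this volume the spread between providers is wide, which is why filtering and sampling (covered below) matter before choosing a destination.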

Native Logging Services

| Feature | AWS CloudWatch Logs | Azure Monitor Logs | GCP Cloud Logging |
|---|---|---|---|
| Query language | CloudWatch Logs Insights | KQL (Kusto Query Language) | Logging Query Language |
| Retention | 1 day to indefinite | 30 days to 2 years | 30 days (default), custom buckets |
| Free tier | 5 GB ingest, 5 GB storage | 5 GB/month (31 days retention included) | 50 GB/month |
| Streaming export | Subscription filters to S3/Lambda/Kinesis | Diagnostic settings to Event Hubs/Storage | Log sinks to Pub/Sub/Storage/BigQuery |
| Alerting | Metric filters + CloudWatch Alarms | Alert rules (KQL-based) | Log-based metrics + alerting |

Cloud-Native Log Collection

AWS CloudWatch Logs

```bash
# Stream CloudWatch Logs to Kinesis for centralization
aws logs put-subscription-filter \
  --log-group-name "/aws/lambda/myapp" \
  --filter-name "to-kinesis" \
  --filter-pattern "" \
  --destination-arn "arn:aws:kinesis:us-east-1:123456789012:stream/log-stream"

# Export logs to S3 (batch)
aws logs create-export-task \
  --log-group-name "/aws/lambda/myapp" \
  --from 1709251200000 \
  --to 1709337600000 \
  --destination "log-export-bucket" \
  --destination-prefix "cloudwatch-logs/"

# Query logs with CloudWatch Logs Insights (asynchronous: returns a queryId)
aws logs start-query \
  --log-group-name "/aws/lambda/myapp" \
  --start-time 1709251200 \
  --end-time 1709337600 \
  --query-string 'fields @timestamp, @message | filter @message like /ERROR/ | sort @timestamp desc | limit 50'

# Fetch the results once the query status is "Complete"
aws logs get-query-results --query-id "QUERY_ID"
```

GCP Cloud Logging

```bash
# Create a log sink to BigQuery for analysis
gcloud logging sinks create bigquery-audit-sink \
  "bigquery.googleapis.com/projects/PROJECT/datasets/audit_logs" \
  --log-filter='logName:"cloudaudit.googleapis.com"' \
  --organization=ORG_ID \
  --include-children

# Create a log sink to Pub/Sub for streaming to an external SIEM
gcloud logging sinks create pubsub-all-logs \
  "pubsub.googleapis.com/projects/PROJECT/topics/all-logs" \
  --log-filter='severity >= WARNING' \
  --project=PROJECT

# Create a log sink to Cloud Storage for archival
gcloud logging sinks create storage-archive \
  "storage.googleapis.com/log-archive-bucket" \
  --log-filter='resource.type="gce_instance"' \
  --project=PROJECT
```

Azure Monitor Logs

```bash
# Create a Log Analytics workspace
az monitor log-analytics workspace create \
  --workspace-name central-logs \
  --resource-group monitoring-rg \
  --location eastus \
  --retention-time 90

# Enable diagnostic settings for a resource
az monitor diagnostic-settings create \
  --name "send-to-workspace" \
  --resource "/subscriptions/SUB_ID/resourceGroups/myapp-rg/providers/Microsoft.Web/sites/myapp" \
  --workspace "/subscriptions/SUB_ID/resourceGroups/monitoring-rg/providers/Microsoft.OperationalInsights/workspaces/central-logs" \
  --logs '[{"category": "AppServiceHTTPLogs", "enabled": true}, {"category": "AppServiceConsoleLogs", "enabled": true}]' \
  --metrics '[{"category": "AllMetrics", "enabled": true}]'

# Query with KQL
az monitor log-analytics query \
  --workspace central-logs-id \
  --analytics-query "AppServiceHTTPLogs | where ScStatus >= 500 | summarize count() by bin(TimeGenerated, 1h) | order by TimeGenerated desc"
```

Centralized Logging Architecture

There are three main approaches to centralizing logs across clouds: aggregating to a single cloud's native service, using a dedicated log management platform, or building a self-hosted solution.

Architecture Patterns

| Pattern | Implementation | Pros | Cons |
|---|---|---|---|
| Single cloud aggregation | Forward all logs to one cloud (e.g., S3 + Athena) | Simple, uses native tools | Egress costs, single-cloud dependency |
| SaaS platform | Datadog, Splunk Cloud, Elastic Cloud | Best UX, managed, multi-cloud native | Expensive at scale |
| Self-hosted | ELK Stack, Grafana Loki, ClickHouse | Cost-effective at scale, full control | Operational overhead |

OpenTelemetry for Unified Collection

OpenTelemetry (OTel) provides a vendor-neutral standard for collecting logs, metrics, and traces. The OpenTelemetry Collector can receive logs from any source (files, syslog, cloud APIs) and export them to any backend (Elasticsearch, Loki, Datadog, CloudWatch). Using OTel decouples your log collection from the backend, enabling easy migration between platforms.

```yaml
# OpenTelemetry Collector configuration for multi-cloud logs
receivers:
  # Receive logs from applications via OTLP
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

  # Collect AWS CloudWatch Logs
  awscloudwatch:
    region: us-east-1
    logs:
      poll_interval: 1m
      groups:
        named:
          /aws/lambda/myapp:
          /aws/ecs/myapp:

  # Collect from files (for VMs)
  filelog:
    include:
      - /var/log/app/*.log
    operators:
      - type: json_parser
        timestamp:
          parse_from: attributes.timestamp
          layout: '%Y-%m-%dT%H:%M:%SZ'

processors:
  # Add cloud provider metadata
  attributes:
    actions:
      - key: cloud.provider
        value: "multi-cloud"
        action: upsert

  # Filter out noisy logs
  filter:
    logs:
      exclude:
        match_type: regexp
        bodies:
          - "health check"
          - "readiness probe"

  # Batch for efficiency (place last, after filtering, so dropped
  # records are never batched or exported)
  batch:
    send_batch_size: 10000
    timeout: 10s

exporters:
  # Send to Elasticsearch
  elasticsearch:
    endpoints: ["https://elasticsearch.internal:9200"]
    logs_index: "cloud-logs"

  # Send to Grafana Loki
  loki:
    endpoint: "https://loki.internal:3100/loki/api/v1/push"
    labels:
      attributes:
        cloud.provider:
        service.name:

service:
  pipelines:
    logs:
      receivers: [otlp, awscloudwatch, filelog]
      processors: [attributes, filter, batch]
      exporters: [elasticsearch, loki]
```

Log Correlation and Investigation

Effective log analysis requires correlating events across clouds. Use consistent structured logging formats, trace IDs, and request IDs to connect related events. A request that starts in a GCP Cloud Run service, calls an AWS Lambda function, and writes to an Azure Cosmos DB should be traceable end-to-end through a single correlation ID.

```json
{
  "timestamp": "2026-03-14T10:30:00.000Z",
  "level": "INFO",
  "service": "order-api",
  "cloud": "aws",
  "region": "us-east-1",
  "trace_id": "abc123def456",
  "span_id": "span-789",
  "request_id": "req-abc-123",
  "message": "Order created successfully",
  "order_id": "ord-456",
  "customer_id": "cust-789",
  "duration_ms": 142,
  "http_status": 201
}
```

Cost Optimization Strategies

| Strategy | Savings | Implementation |
|---|---|---|
| Log level filtering | 50-80% volume reduction | Only ingest WARN+ in production; DEBUG in dev only |
| Sampling | 90%+ for high-volume services | Sample 10% of success logs; keep 100% of errors |
| Tiered retention | 60-80% storage savings | Hot: 7 days, Warm: 30 days, Cold: 1 year |
| Structured logging | Faster queries, less processing | JSON format with consistent field names |
| Log routing | Varies | Route debug logs to cheap storage, errors to SIEM |
| Compression | 5-10x storage reduction | GZIP/ZSTD for archived logs in object storage |
```bash
# AWS: Set CloudWatch log retention to reduce costs
aws logs put-retention-policy \
  --log-group-name "/aws/lambda/myapp" \
  --retention-in-days 14

# GCP: Exclude debug logs from ingestion by adding an exclusion
# to the _Default sink (excluded entries are never billed)
gcloud logging sinks update _Default \
  --add-exclusion=name=exclude-debug,filter='severity="DEBUG"' \
  --project=PROJECT

# GCP: Route logs to a cheaper bucket with longer retention
gcloud logging buckets create cold-logs \
  --location=global \
  --retention-days=365 \
  --project=PROJECT

gcloud logging sinks create route-to-cold \
  "logging.googleapis.com/projects/PROJECT/locations/global/buckets/cold-logs" \
  --log-filter='severity<=INFO' \
  --project=PROJECT
```
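The sampling row in the table above (keep every error, sample a fraction of routine logs) can be sketched as a hypothetical filter function, applied at the producer or in a collector processor so dropped entries are never billed:

```python
import random

# Severities that must never be dropped.
KEEP_ALWAYS = {"WARN", "WARNING", "ERROR", "FATAL", "CRITICAL"}

def should_keep(level: str, sample_rate: float = 0.10,
                rng: random.Random = random) -> bool:
    """Keep every warning-or-worse log; sample the rest at sample_rate."""
    if level.upper() in KEEP_ALWAYS:
        return True
    return rng.random() < sample_rate
```

With the defaults this keeps 100% of errors and roughly 10% of INFO/DEBUG traffic, matching the "90%+ for high-volume services" savings estimate. For trace-aware pipelines, prefer sampling by `trace_id` so all logs of a sampled request survive together.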

Start with Structured Logging

The single most impactful investment in logging is structured logging (JSON format with consistent field names). Structured logs enable efficient parsing, indexing, and querying across all platforms. Define a logging schema that includes timestamp, level, service name, cloud provider, region, trace ID, and request ID. Enforce this schema across all teams and clouds.
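One way to enforce such a schema with Python's standard logging module (a sketch; the field names follow the example entry earlier in this guide, and the `ctx` convention for per-request fields is an assumption, not a stdlib feature):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render every record as one JSON object with the shared schema."""

    def __init__(self, service: str, cloud: str, region: str):
        super().__init__()
        # Fields that are constant for this deployment.
        self.static = {"service": service, "cloud": cloud, "region": region}

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "message": record.getMessage(),
            **self.static,
            # Per-request context (trace_id, request_id) passed via `extra`.
            **getattr(record, "ctx", {}),
        }
        return json.dumps(entry)
```

Usage: `logger.info("Order created", extra={"ctx": {"trace_id": "abc123"}})` emits one JSON line that any of the platforms above can parse and index without custom grok rules.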


Key Takeaways

  1. Native logging services use different query languages: Insights, KQL, and LQL.
  2. OpenTelemetry Collector provides vendor-neutral log collection across all clouds.
  3. Structured JSON logging with consistent fields enables cross-cloud correlation.
  4. Log level filtering and sampling reduce costs by 50-90%.

Frequently Asked Questions

What is the best centralized logging platform for multi-cloud?
Small teams: aggregate to one cloud. Medium teams: SaaS platform (Datadog, Elastic). Large teams: self-host Loki or ELK. The best choice depends on team size, budget, and operational capacity.
How do I correlate logs across clouds?
Use consistent trace IDs and request IDs in all log entries and propagate them across service calls via HTTP headers (such as the W3C traceparent header). OpenTelemetry's context propagation handles this automatically for instrumented services.

Written by CloudToolStack Team

Cloud engineers and architects with hands-on experience across AWS, Azure, and GCP. We write guides based on real-world production patterns, not just documentation rewrites.

Disclaimer: This guide is for educational purposes. Cloud services change frequently; always refer to official documentation for the latest information. AWS, Azure, and GCP are trademarks of their respective owners.