Centralized Logging Architecture
Aggregate logs across clouds with OpenTelemetry, ELK, Loki, and native services, plus cost optimization and cross-cloud correlation.
Prerequisites
- Basic understanding of logging and observability concepts
- Familiarity with at least one cloud logging service
Why Centralized Logging Matters
In multi-cloud environments, logs are scattered across providers, services, and regions. AWS generates logs in CloudWatch Logs and CloudTrail, Azure uses Monitor Logs and Activity Log, GCP uses Cloud Logging, and OCI uses the Logging service. Without centralization, investigating incidents requires switching between four different consoles, query languages, and time formats. A centralized logging architecture aggregates all logs into a single platform for unified search, correlation, alerting, and compliance.
This guide covers the native logging services of each cloud provider, aggregation patterns for centralization, popular logging platforms (ELK, Datadog, Splunk, Grafana Loki), OpenTelemetry for vendor-neutral log collection, and cost optimization strategies for high-volume log environments.
Logging Cost Is Significant
Logging costs can easily exceed your compute costs if not managed carefully. CloudWatch Logs charges $0.50/GB for ingestion and $0.03/GB/month for storage. Azure Monitor charges $2.76/GB for ingestion. Cloud Logging charges $0.50/GB beyond the free 50 GB/month. A busy application generating 100 GB/day of logs costs $1,500-8,000/month in log ingestion alone. Always implement log level filtering, sampling, and retention policies.
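A quick back-of-envelope check of those figures, using the per-GB ingestion rates quoted above (rates vary by region and tier; treat this as a sketch, not a pricing calculator):

```shell
# Monthly ingestion cost estimate for a given daily log volume
daily_gb=100
aws_rate=0.50      # CloudWatch Logs ingestion, $/GB
azure_rate=2.76    # Azure Monitor Logs ingestion, $/GB

aws_monthly=$(awk -v gb="$daily_gb" -v r="$aws_rate" 'BEGIN { printf "%.0f", gb * 30 * r }')
azure_monthly=$(awk -v gb="$daily_gb" -v r="$azure_rate" 'BEGIN { printf "%.0f", gb * 30 * r }')

echo "CloudWatch:    \$${aws_monthly}/month"    # -> $1500/month
echo "Azure Monitor: \$${azure_monthly}/month"  # -> $8280/month
```

That spread is where the $1,500-8,000/month range above comes from, before storage, queries, or cross-cloud egress are counted.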
Native Logging Services
| Feature | AWS CloudWatch Logs | Azure Monitor Logs | GCP Cloud Logging |
|---|---|---|---|
| Query language | CloudWatch Logs Insights | KQL (Kusto Query Language) | Logging Query Language |
| Retention | 1 day to indefinite | 30 days to 2 years | 30 days (default), custom buckets |
| Free tier | 5 GB ingest, 5 GB storage | 5 GB/month (first 31 days free) | 50 GB/month |
| Streaming export | Subscription filters to S3/Lambda/Kinesis | Diagnostic settings to Event Hubs/Storage | Log sinks to Pub/Sub/Storage/BigQuery |
| Alerting | Metric filters + CloudWatch Alarms | Alert rules (KQL-based) | Log-based metrics + alerting |
Cloud-Native Log Collection
AWS CloudWatch Logs
# Stream CloudWatch Logs to Kinesis in real time (subscription filters target Kinesis/Firehose/Lambda, not S3 directly)
aws logs put-subscription-filter \
--log-group-name "/aws/lambda/myapp" \
--filter-name "to-kinesis" \
--filter-pattern "" \
--destination-arn "arn:aws:kinesis:us-east-1:123456789012:stream/log-stream"
# Export logs to S3 (batch)
aws logs create-export-task \
--log-group-name "/aws/lambda/myapp" \
--from 1709251200000 \
--to 1709337600000 \
--destination "log-export-bucket" \
--destination-prefix "cloudwatch-logs/"
# Query logs with CloudWatch Logs Insights
aws logs start-query \
--log-group-name "/aws/lambda/myapp" \
--start-time 1709251200 \
--end-time 1709337600 \
--query-string 'fields @timestamp, @message | filter @message like /ERROR/ | sort @timestamp desc | limit 50'
GCP Cloud Logging
# Create a log sink to BigQuery for analysis
gcloud logging sinks create bigquery-audit-sink \
"bigquery.googleapis.com/projects/PROJECT/datasets/audit_logs" \
--log-filter='logName:"cloudaudit.googleapis.com"' \
--organization=ORG_ID \
--include-children
# Create a log sink to Pub/Sub for streaming to external SIEM
gcloud logging sinks create pubsub-all-logs \
"pubsub.googleapis.com/projects/PROJECT/topics/all-logs" \
--log-filter='severity >= WARNING' \
--project=PROJECT
# Create a log sink to Cloud Storage for archival
gcloud logging sinks create storage-archive \
"storage.googleapis.com/log-archive-bucket" \
--log-filter='resource.type="gce_instance"' \
--project=PROJECT
Azure Monitor Logs
# Create a Log Analytics workspace
az monitor log-analytics workspace create \
--workspace-name central-logs \
--resource-group monitoring-rg \
--location eastus \
--retention-time 90
# Enable diagnostic settings for a resource
az monitor diagnostic-settings create \
--name "send-to-workspace" \
--resource "/subscriptions/SUB_ID/resourceGroups/myapp-rg/providers/Microsoft.Web/sites/myapp" \
--workspace "/subscriptions/SUB_ID/resourceGroups/monitoring-rg/providers/Microsoft.OperationalInsights/workspaces/central-logs" \
--logs '[{"category": "AppServiceHTTPLogs", "enabled": true}, {"category": "AppServiceConsoleLogs", "enabled": true}]' \
--metrics '[{"category": "AllMetrics", "enabled": true}]'
# Query with KQL
az monitor log-analytics query \
--workspace central-logs-id \
--analytics-query "AppServiceHTTPLogs | where ScStatus >= 500 | summarize count() by bin(TimeGenerated, 1h) | order by TimeGenerated desc"
Centralized Logging Architecture
There are three main approaches to centralizing logs across clouds: aggregating to a single cloud's native service, using a dedicated log management platform, or building a self-hosted solution.
Architecture Patterns
| Pattern | Implementation | Pros | Cons |
|---|---|---|---|
| Single cloud aggregation | Forward all logs to one cloud (e.g., S3 + Athena) | Simple, uses native tools | Egress costs, single-cloud dependency |
| SaaS platform | Datadog, Splunk Cloud, Elastic Cloud | Best UX, managed, multi-cloud native | Expensive at scale |
| Self-hosted | ELK Stack, Grafana Loki, ClickHouse | Cost-effective at scale, full control | Operational overhead |
OpenTelemetry for Unified Collection
OpenTelemetry (OTel) provides a vendor-neutral standard for collecting logs, metrics, and traces. The OpenTelemetry Collector can receive logs from any source (files, syslog, cloud APIs) and export them to any backend (Elasticsearch, Loki, Datadog, CloudWatch). Using OTel decouples your log collection from the backend, enabling easy migration between platforms.
# OpenTelemetry Collector configuration for multi-cloud logs
receivers:
  # Receive logs from applications via OTLP
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  # Collect AWS CloudWatch Logs
  awscloudwatch:
    region: us-east-1
    logs:
      poll_interval: 1m
      groups:
        named:
          /aws/lambda/myapp:
          /aws/ecs/myapp:
  # Collect from files (for VMs)
  filelog:
    include:
      - /var/log/app/*.log
    operators:
      - type: json_parser
        timestamp:
          parse_from: attributes.timestamp
          layout: '%Y-%m-%dT%H:%M:%SZ'
processors:
  # Add cloud provider metadata
  attributes:
    actions:
      - key: cloud.provider
        value: "multi-cloud"
        action: upsert
  # Batch for efficiency
  batch:
    send_batch_size: 10000
    timeout: 10s
  # Filter out noisy logs
  filter:
    logs:
      exclude:
        match_type: regexp
        bodies:
          - "health check"
          - "readiness probe"
exporters:
  # Send to Elasticsearch
  elasticsearch:
    endpoints: ["https://elasticsearch.internal:9200"]
    logs_index: "cloud-logs"
  # Send to Grafana Loki
  loki:
    endpoint: "https://loki.internal:3100/loki/api/v1/push"
    labels:
      attributes:
        cloud.provider:
        service.name:
service:
  pipelines:
    logs:
      receivers: [otlp, awscloudwatch, filelog]
      processors: [attributes, batch, filter]
      exporters: [elasticsearch, loki]
Log Correlation and Investigation
Effective log analysis requires correlating events across clouds. Use consistent structured logging formats, trace IDs, and request IDs to connect related events. A request that starts in a GCP Cloud Run service, calls an AWS Lambda function, and writes to an Azure Cosmos DB should be traceable end-to-end through a single correlation ID.
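As a minimal sketch of what that buys you (the merged file, sample records, and field values are invented for illustration): once every cloud's logs land in one newline-delimited JSON stream, reconstructing a request is a single trace_id filter.

```shell
# Hypothetical merged NDJSON stream aggregated from three clouds
cat > /tmp/merged-logs.ndjson <<'EOF'
{"timestamp":"2026-03-14T10:29:59Z","cloud":"gcp","service":"checkout-web","trace_id":"abc123def456","message":"request received"}
{"timestamp":"2026-03-14T10:30:00Z","cloud":"aws","service":"order-api","trace_id":"abc123def456","message":"Order created successfully"}
{"timestamp":"2026-03-14T10:30:01Z","cloud":"azure","service":"order-store","trace_id":"abc123def456","message":"document written"}
{"timestamp":"2026-03-14T10:31:00Z","cloud":"aws","service":"order-api","trace_id":"zzz999","message":"unrelated request"}
EOF

# Every hop of one request, regardless of which cloud emitted it
trace="abc123def456"
matches=$(grep -c "\"trace_id\":\"${trace}\"" /tmp/merged-logs.ndjson)
echo "events for ${trace}: ${matches}"   # -> 3
```

In a real platform the same filter becomes a query, e.g. `trace_id:"abc123def456"` in Kibana or a `|= "abc123def456"` line filter in Loki's LogQL.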
{
  "timestamp": "2026-03-14T10:30:00.000Z",
  "level": "INFO",
  "service": "order-api",
  "cloud": "aws",
  "region": "us-east-1",
  "trace_id": "abc123def456",
  "span_id": "span-789",
  "request_id": "req-abc-123",
  "message": "Order created successfully",
  "order_id": "ord-456",
  "customer_id": "cust-789",
  "duration_ms": 142,
  "http_status": 201
}
Cost Optimization Strategies
| Strategy | Savings | Implementation |
|---|---|---|
| Log level filtering | 50-80% volume reduction | Only ingest WARN+ in production; DEBUG in dev only |
| Sampling | 90%+ for high-volume services | Sample 10% of success logs; keep 100% of errors |
| Tiered retention | 60-80% storage savings | Hot: 7 days, Warm: 30 days, Cold: 1 year |
| Structured logging | Faster queries, less processing | JSON format with consistent field names |
| Log routing | Varies | Route debug logs to cheap storage, errors to SIEM |
| Compression | 5-10x storage reduction | GZIP/ZSTD for archived logs in object storage |
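The sampling row above can be sketched as a deterministic edge filter: keep every error, keep every tenth success. The log format and the 1-in-10 ratio are illustrative; real collectors (e.g. the OTel filter processor) do this declaratively.

```shell
# Generate a synthetic log file: 100 INFO lines, 5 ERROR lines
: > /tmp/raw.log
for i in $(seq 1 100); do echo "INFO request $i ok" >> /tmp/raw.log; done
for i in $(seq 1 5); do echo "ERROR request $i failed" >> /tmp/raw.log; done

# Keep 100% of errors, every 10th success line
awk '/^ERROR/ { print; next } /^INFO/ { if (++n % 10 == 0) print }' \
  /tmp/raw.log > /tmp/sampled.log

kept=$(grep -c '' /tmp/sampled.log)
total=$(grep -c '' /tmp/raw.log)
echo "kept ${kept} of ${total} lines"   # kept 15 of 105, ~86% volume reduction
```

Random sampling (hashing the request ID, or `rand()` in awk) works the same way; the key property is that error visibility stays at 100% while bulk success traffic is decimated before it is ever billed.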
# AWS: Set CloudWatch log retention to reduce costs
aws logs put-retention-policy \
--log-group-name "/aws/lambda/myapp" \
--retention-in-days 14
# GCP: Exclude debug logs from ingestion (save money)
# Add an exclusion filter to the _Default sink so DEBUG entries are never ingested
gcloud logging sinks update _Default \
--add-exclusion=name=exclude-debug,filter='severity=DEBUG' \
--project=PROJECT
# GCP: Route logs to cheaper bucket with different retention
gcloud logging buckets create cold-logs \
--location=global \
--retention-days=365 \
--project=PROJECT
gcloud logging sinks create route-to-cold \
"logging.googleapis.com/projects/PROJECT/locations/global/buckets/cold-logs" \
--log-filter='severity<=INFO' \
--project=PROJECT
Start with Structured Logging
The single most impactful investment in logging is structured logging (JSON format with consistent field names). Structured logs enable efficient parsing, indexing, and querying across all platforms. Define a logging schema that includes timestamp, level, service name, cloud provider, region, trace ID, and request ID. Enforce this schema across all teams and clouds.
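A minimal shell sketch of such a schema-enforcing logger (the function name, env variables, and fallback values are assumptions, and a production version would also escape quotes in the message):

```shell
# Emit one JSON log line with the mandatory schema fields
log_json() {
  level=$1; message=$2
  printf '{"timestamp":"%s","level":"%s","service":"%s","cloud":"%s","region":"%s","trace_id":"%s","message":"%s"}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$level" \
    "${SERVICE_NAME:-order-api}" "${CLOUD_PROVIDER:-aws}" \
    "${CLOUD_REGION:-us-east-1}" "${TRACE_ID:-unknown}" "$message"
}

# Per-call trace propagation via the environment
TRACE_ID=abc123 log_json INFO "Order created"
```

In practice this lives in a logging library, not a shell function, but the contract is the same: every line, from every team and every cloud, parses into the same fields.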
Key Takeaways
- Native logging services use different query languages: CloudWatch Logs Insights (AWS), KQL (Azure), and the Logging Query Language (GCP).
- OpenTelemetry Collector provides vendor-neutral log collection across all clouds.
- Structured JSON logging with consistent fields enables cross-cloud correlation.
- Log level filtering and sampling reduce costs by 50-90%.
Written by CloudToolStack Team
Cloud engineers and architects with hands-on experience across AWS, Azure, and GCP. We write guides based on real-world production patterns, not just documentation rewrites.
Disclaimer: This guide is for educational purposes. Cloud services change frequently; always refer to official documentation for the latest information. AWS, Azure, and GCP are trademarks of their respective owners.