Migrating to Cloud Data Warehouses: Redshift vs Synapse vs BigQuery vs ADW
Migration from on-prem data warehouses to Redshift, Synapse, BigQuery, and Oracle ADW with schema conversion, query compatibility, performance tuning, cost modeling, and a realistic 28-week timeline.
Why Data Warehouse Migrations Take Twice as Long as You Think
Every data warehouse migration starts with the same optimism: the schema is well-documented, the queries are straightforward SQL, and the vendor promises a migration tool that handles everything. Six months later, the team is manually rewriting stored procedures, discovering undocumented data dependencies, and explaining to leadership why the project is behind schedule.
I have been through four data warehouse migrations -- from Oracle to Redshift, from Teradata to BigQuery, from on-prem SQL Server to Synapse, and from Redshift to BigQuery. Every single one took longer than estimated, and the issues were remarkably consistent across all of them. The data and SQL are the easy part. The hard parts are the ETL pipelines that feed the warehouse, the BI tools that read from it, the tribal knowledge about how certain columns are actually used, and the political battle over who owns the migration timeline.
This guide covers the four major cloud data warehouses -- Redshift, Synapse, BigQuery, and Oracle Autonomous Data Warehouse -- with a practical migration approach based on real experience.
The Four Contenders in 2026
Amazon Redshift
Redshift is a columnar, MPP (massively parallel processing) data warehouse that has been AWS's flagship analytics service since 2012. Redshift Serverless (launched in 2022) added a consumption-based pricing option that eliminates cluster management. Redshift now supports semi-structured data (SUPER type), machine learning inference via Redshift ML, and data sharing across accounts without data movement.
Strengths: Mature ecosystem. Excellent integration with the AWS data stack (Glue, Lake Formation, Athena, QuickSight). Provisioned clusters offer predictable pricing for steady workloads. The AQUA (Advanced Query Accelerator) layer provides hardware-accelerated cache for frequently accessed data.
Weaknesses: Cluster resizing is still disruptive -- elastic resize takes 10 to 15 minutes and classic resize can take hours. Concurrency scaling helps with burst workloads but adds cost. Vacuum operations for maintaining sort order are a constant maintenance burden that BigQuery and Synapse do not have.
Pricing: A dc2.large node (160 GB SSD) costs $0.25 per hour ($180 per month). A production cluster with 4 ra3.xlplus nodes (128 TB managed storage) costs about $3.26 per hour ($2,350 per month). Redshift Serverless charges $0.375 per RPU-hour, where a typical ad-hoc query workload might use 32 to 128 RPUs.
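As a sanity check on those numbers, the arithmetic can be sketched in a few lines. The rates are the list prices quoted above, 730 billable hours per month is an assumption, and the function names are illustrative, not an AWS API:

```python
# Rough Redshift cost model using the list prices quoted above.
# Assumes 730 billable hours per month; verify against the AWS calculator.

HOURS_PER_MONTH = 730

def provisioned_monthly(node_hourly_rate: float, node_count: int) -> float:
    """Provisioned cluster: nodes are billed for every hour they run."""
    return node_hourly_rate * node_count * HOURS_PER_MONTH

def serverless_monthly(rpus: int, active_hours_per_day: float,
                       rpu_hour_rate: float = 0.375) -> float:
    """Serverless: billed only for RPU-hours while queries are running."""
    return rpus * active_hours_per_day * 30 * rpu_hour_rate

# A single-node dc2.large dev cluster vs. serverless at 32 RPUs,
# active 4 hours per day.
dev_cluster = provisioned_monthly(0.25, 1)  # $182.50/month
bursty = serverless_monthly(32, 4)          # $1,440/month
```

Note how quickly serverless overtakes a small provisioned cluster once the workload runs for several hours a day; consumption pricing only wins when the warehouse is mostly idle.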
Azure Synapse Analytics
Synapse is Microsoft's unified analytics platform that combines a data warehouse (dedicated SQL pools), a data lake (serverless SQL pools), Spark integration, and data integration pipelines. The dedicated SQL pool is the MPP data warehouse component, while the serverless SQL pool lets you query data in Azure Data Lake Storage directly without loading it.
Strengths: The unified platform story is compelling if you use the full Synapse ecosystem. Integration with Power BI is unmatched -- direct query mode, materialized views for BI acceleration, and Azure AD authentication pass-through. The serverless SQL pool is excellent for exploratory analysis without provisioning a warehouse.
Weaknesses: Dedicated SQL pool pricing is steep, and DWU (Data Warehouse Unit) pricing is opaque -- it bundles compute, memory, and I/O into a single unit, making it hard to right-size. Scaling a dedicated pool takes 5 to 10 minutes and drops all connections. The T-SQL dialect is close to but not identical to SQL Server's, which causes subtle migration issues for teams coming from on-prem SQL Server.
Pricing: A DW100c (the smallest dedicated pool) costs $1.20 per hour ($864 per month). A production DW500c costs $6.00 per hour ($4,320 per month). You can pause dedicated pools when not in use, which is a significant cost saver for development and staging environments.
Google BigQuery
BigQuery is a fully serverless, columnar data warehouse with automatic scaling and a unique pricing model. There are no clusters to manage, no nodes to provision, and no vacuum operations to run. BigQuery introduced editions (Standard, Enterprise, Enterprise Plus) in 2023, adding slot-based pricing alongside the original on-demand per-query pricing.
Strengths: Truly serverless. No infrastructure management. Per-query pricing is ideal for unpredictable or bursty workloads. BigQuery ML lets you train and serve ML models directly in SQL. The separation of storage and compute is complete -- you pay for storage based on data volume and for compute based on queries or reserved slots.
Weaknesses: Per-query pricing can produce bill shock. A poorly optimized query that scans a 10 TB table costs about $62 at $6.25 per TB scanned. Teams new to BigQuery need strict cost controls -- required query previews, per-user quotas, and monitoring. The SQL dialect (GoogleSQL) has unique syntax for arrays, structs, and date functions that requires query rewriting from other platforms. DML operations (UPDATE, DELETE) are significantly slower than on traditional databases.
Pricing: On-demand pricing charges $6.25 per TB scanned (first 1 TB per month free). Storage costs $0.02 per GB per month for active storage and $0.01 per GB for long-term storage (data not modified for 90 days). Enterprise edition slots start at $0.04 per slot-hour with autoscaling.
Oracle Autonomous Data Warehouse (ADW)
Oracle ADW is the managed cloud version of Oracle Database optimized for analytics workloads. It is fully autonomous -- it handles provisioning, patching, scaling, backups, and tuning automatically. ADW supports Oracle SQL, PL/SQL, and the full Oracle analytics stack.
Strengths: If you are migrating from an on-prem Oracle data warehouse, ADW offers the closest-to-zero-rewrite migration path. PL/SQL procedures, Oracle-specific SQL syntax, and Oracle-native tools all work. The autonomous scaling and tuning reduce DBA overhead significantly.
Weaknesses: Oracle licensing on OCI is cheaper than on other clouds, but it is still Oracle pricing. ADW starts at 1 OCPU ($1.3441 per hour) and scales from there. Egress costs on OCI are lower than AWS, but the ecosystem of BI and ETL tools that integrate natively with Oracle is smaller than what Redshift, Synapse, or BigQuery offer.
Schema Conversion: Where Migrations Actually Get Hard
Moving data between warehouses is straightforward -- it is essentially an unload-and-load operation. Converting schemas, data types, and SQL dialects is where the real effort lives.
Data Type Mapping
Every warehouse has different data type names and slightly different semantics for types that share names:
- Dates and timestamps: Redshift uses TIMESTAMP and TIMESTAMPTZ. BigQuery uses TIMESTAMP (always UTC) and DATETIME (no timezone). Synapse uses DATETIME2 with up to 7 fractional-second digits. Subtle timezone handling differences cause data discrepancies that are maddening to debug.
- Strings: Redshift VARCHAR has a 65,535-byte limit. BigQuery STRING has no practical limit. Synapse NVARCHAR(MAX) supports up to 2 GB. If your source warehouse has CLOB or TEXT columns, verify that the target can handle the maximum actual data length, not just the declared type.
- Numeric precision: Financial data requires exact decimal arithmetic. Verify that the target warehouse's DECIMAL/NUMERIC type supports the same precision and scale as the source. BigQuery's NUMERIC is 38 digits of precision with 9 decimal places. Redshift DECIMAL supports up to 38 digits with configurable scale.
- Semi-structured data: If your source uses JSON columns, Redshift uses SUPER type, BigQuery uses native JSON, and Synapse uses NVARCHAR with JSON functions. The query syntax for accessing nested fields differs significantly across platforms.
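In practice the mappings above end up encoded in a conversion script. A minimal sketch, with an illustrative lookup table (a real migration also needs precision, scale, and length checks on every column):

```python
# Minimal data type mapping for the Redshift-to-target conversions
# discussed above. Illustrative only -- a real tool must also carry
# precision/scale for numerics and max lengths for strings.

TYPE_MAP = {
    "TIMESTAMPTZ": {"bigquery": "TIMESTAMP",       # always UTC
                    "synapse":  "DATETIMEOFFSET"},
    "TIMESTAMP":   {"bigquery": "DATETIME",        # no timezone
                    "synapse":  "DATETIME2"},
    "VARCHAR":     {"bigquery": "STRING",
                    "synapse":  "NVARCHAR"},
    "SUPER":       {"bigquery": "JSON",
                    "synapse":  "NVARCHAR(MAX)"},  # query via JSON functions
}

def convert_type(redshift_type: str, target: str) -> str:
    """Look up the target type, failing loudly on anything unmapped."""
    try:
        return TYPE_MAP[redshift_type][target]
    except KeyError:
        raise ValueError(f"no mapping for {redshift_type} -> {target}")
```

Failing loudly on unmapped types is deliberate: a silent default (for example, mapping everything unknown to STRING) is how precision bugs sneak into the target schema.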
NULL handling will bite you
Every database handles NULLs slightly differently in edge cases. Empty strings vs NULLs, NULL sorting order, NULL behavior in GROUP BY, and NULL propagation in string concatenation all vary across platforms. Write explicit test cases for NULL handling in your most critical queries. I have seen migrations where a financial report was off by millions of dollars because the target warehouse treated empty strings differently from the source in aggregation functions.
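The probes themselves are cheap to write. Here sqlite3 merely stands in for a warehouse connection -- the point is to run identical probes on both source and target and diff the results, since this is exactly the behavior that varies (Oracle, notably, treats the empty string as NULL):

```python
import sqlite3

# Demonstrates why NULL vs. empty string matters in aggregates.
# sqlite3 stands in for a warehouse here; run the same probe on your
# source and target platforms and compare the outputs.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (amount TEXT)")
conn.executemany("INSERT INTO t VALUES (?)", [("100",), ("",), (None,)])

# COUNT(col) skips NULLs but counts empty strings; COUNT(*) counts all rows.
count_star, count_col = conn.execute(
    "SELECT COUNT(*), COUNT(amount) FROM t").fetchone()
print(count_star, count_col)  # 3 2
```

If the source counts the empty-string row and the target does not (or vice versa), every aggregate built on that column silently drifts.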
Query Compatibility
Most SELECT queries migrate with minor syntax changes. The problems come from platform-specific features:
- Window functions: All four warehouses support standard window functions, but the supported frame specifications and function names differ. QUALIFY (BigQuery) vs subquery-based filtering of window function results (Redshift, Synapse).
- String functions: CONCAT behavior varies. Redshift CONCAT takes exactly two arguments. BigQuery CONCAT takes any number. Synapse supports both CONCAT and the + operator (which, unlike CONCAT, propagates NULLs). SPLIT, REGEXP, and pattern matching syntax differ across all platforms.
- Date functions: DATEDIFF, DATE_TRUNC, and interval arithmetic have different syntax on every platform. This is the single largest category of query rewrites in every migration I have done.
- Stored procedures: Redshift supports PL/pgSQL stored procedures. Synapse uses T-SQL. BigQuery uses SQL-based procedural scripting (JavaScript is available for UDFs, not procedures). Oracle uses PL/SQL. Stored procedure migration is essentially a rewrite, not a conversion.
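The date-function rewrites can be seen in one expression. This sketch generates the per-dialect form of "days between two dates"; the SQL syntax per platform is real, while the translator itself is deliberately simplified:

```python
# Per-dialect rewrites for "days between two dates", the single largest
# rewrite category called out above. A real rewriter parses SQL; this
# sketch only templates one expression.

def days_between(start: str, end: str, dialect: str) -> str:
    templates = {
        "redshift": f"DATEDIFF(day, {start}, {end})",
        "synapse":  f"DATEDIFF(day, {start}, {end})",
        "bigquery": f"DATE_DIFF({end}, {start}, DAY)",  # note argument order
        "oracle":   f"({end} - {start})",  # DATE subtraction yields days
    }
    return templates[dialect]

print(days_between("order_date", "ship_date", "bigquery"))
# DATE_DIFF(ship_date, order_date, DAY)
```

The BigQuery argument order (later date first, granularity last) is the kind of detail that compiles cleanly and silently flips the sign of every result if a rewrite gets it backwards.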
Performance Tuning on the Target Platform
A query that runs in 5 seconds on the source warehouse might take 5 minutes on the target -- not because the target is slower, but because the optimization strategies are different.
Redshift
Distribution style (KEY, EVEN, ALL) and sort keys are critical. A table joined frequently on customer_id should use KEY distribution on customer_id so that rows with the same key are on the same node. Compound sort keys on columns used in WHERE and ORDER BY clauses reduce disk I/O dramatically. Run ANALYZE after loading data to update statistics, and VACUUM to reclaim space and restore sort order.
BigQuery
Partitioning and clustering replace sort keys and distribution styles. Partition tables by date (nearly every analytics table has a date column). Cluster by the columns most frequently used in WHERE clauses (up to 4 columns). BigQuery automatically manages data organization within clusters. The single most impactful optimization is reducing the amount of data scanned -- use columnar queries (select only the columns you need) and leverage partition pruning.
Synapse
Distribution (HASH, ROUND_ROBIN, REPLICATE) and columnstore indexes determine performance. Large fact tables should use HASH distribution on the most common join key. Small dimension tables should use REPLICATE to copy them to every compute node. Columnstore indexes are created automatically on all tables in dedicated SQL pools -- do not create traditional row-store indexes unless you have a specific reason.
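To make the three physical-design vocabularies concrete, here is the same hypothetical orders fact table expressed with each platform's knobs. Table and column names are illustrative; the DDL clauses follow each platform's documented syntax:

```python
# The same hypothetical fact table, physically designed per platform
# using the knobs described above.

REDSHIFT_DDL = """
CREATE TABLE orders (
    order_id    BIGINT,
    customer_id BIGINT,
    order_date  DATE,
    amount      DECIMAL(18, 2)
)
DISTSTYLE KEY DISTKEY (customer_id)  -- co-locate rows joined on customer_id
COMPOUND SORTKEY (order_date);       -- range-restricted scans on date
"""

BIGQUERY_DDL = """
CREATE TABLE dataset.orders (
    order_id    INT64,
    customer_id INT64,
    order_date  DATE,
    amount      NUMERIC
)
PARTITION BY order_date   -- partition pruning on date filters
CLUSTER BY customer_id;   -- organize blocks by the frequent join column
"""

SYNAPSE_DDL = """
CREATE TABLE orders (
    order_id    BIGINT,
    customer_id BIGINT,
    order_date  DATE,
    amount      DECIMAL(18, 2)
)
WITH (DISTRIBUTION = HASH(customer_id),  -- spread the fact table by join key
      CLUSTERED COLUMNSTORE INDEX);      -- default columnar storage
"""
```

The logical schema is identical in all three; only the physical layout clauses change, which is why this step cannot be skipped even when the schema conversion tool reports 100 percent success.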
Migration Timeline and Checklist
Based on migrating a mid-size data warehouse (5 to 20 TB, 200 to 500 tables, 50 to 100 ETL pipelines), here is a realistic timeline:
Phase 1: Assessment (Weeks 1 to 4)
- Inventory all tables, views, stored procedures, and materialized views in the source warehouse.
- Catalog all ETL pipelines, their schedules, dependencies, and owners.
- Identify all downstream consumers -- BI dashboards, reporting tools, data science notebooks, application queries.
- Run the cloud provider's schema conversion tool (AWS SCT, Azure Database Migration Service, BigQuery Migration Service) to get an initial assessment of compatibility.
- Estimate target warehouse sizing and cost.
Phase 2: Schema and Query Conversion (Weeks 5 to 12)
- Convert table DDL to the target platform's syntax. Handle data type mapping.
- Convert views and stored procedures. This is the most labor-intensive phase.
- Rewrite any queries that use platform-specific syntax.
- Set up the target warehouse with appropriate distribution, partitioning, and indexing.
- Create a validation framework that compares query results between source and target for a set of benchmark queries.
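A minimal sketch of that validation framework: run the same benchmark query against both systems and compare a row count plus an order-insensitive fingerprint of the result set. sqlite3 again stands in for the two warehouse connections, and connection handling is simplified:

```python
import hashlib
import sqlite3  # stands in for the source and target warehouse connections

def fingerprint(conn, query: str) -> tuple[int, str]:
    """Row count plus an order-insensitive hash of the result set."""
    rows = conn.execute(query).fetchall()
    digest = hashlib.sha256()
    for row in sorted(repr(r) for r in rows):  # sort: ORDER BY may differ
        digest.update(row.encode())
    return len(rows), digest.hexdigest()

def validate(source_conn, target_conn, query: str) -> bool:
    """True when source and target return equivalent result sets."""
    return fingerprint(source_conn, query) == fingerprint(target_conn, query)
```

Sorting the stringified rows before hashing matters: without an ORDER BY, the two warehouses will almost never return rows in the same order, and a naive row-by-row diff drowns in false mismatches.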
Phase 3: Data Migration (Weeks 10 to 16)
- Extract data from the source warehouse. For large tables, use parallel extraction (Redshift UNLOAD, BigQuery EXPORT, Synapse CTAS with external tables).
- Load data into the target warehouse. Use the target's bulk loading mechanism (Redshift COPY, BigQuery load jobs, Synapse COPY INTO).
- Validate row counts, checksums, and sample data between source and target.
- Set up incremental sync for tables that change during the migration period.
Migrate incrementally, not big-bang
Do not try to migrate everything at once. Start with one data domain (for example, marketing analytics) and migrate it end-to-end: the tables, the ETL pipelines that feed them, and the dashboards that read from them. Run both source and target in parallel for 2 to 4 weeks and compare results daily. Once you have confidence, cut over that domain and move to the next. This approach reduces risk and lets you learn migration patterns that speed up subsequent domains.
Phase 4: ETL Pipeline Migration (Weeks 12 to 20)
- Migrate or rewrite ETL pipelines to target the new warehouse. If you are changing ETL tools (for example, from Informatica to Glue or Data Factory), this is a significant effort on its own.
- Test pipeline execution: run each pipeline and verify that the target tables match the source after a full refresh.
- Set up monitoring and alerting for the new pipelines.
Phase 5: BI and Consumer Migration (Weeks 16 to 24)
- Reconfigure BI tools (Tableau, Power BI, Looker, QuickSight) to connect to the new warehouse.
- Validate that every dashboard, report, and scheduled export produces the same results from the new warehouse.
- Update application connection strings and queries.
- Retrain users on any new query syntax or tools.
Phase 6: Parallel Running and Cutover (Weeks 20 to 28)
- Run both warehouses in parallel. Compare key metrics daily.
- Gradually shift traffic to the new warehouse.
- Cut over completely once confidence is established.
- Keep the source warehouse available for 4 to 8 weeks after cutover as a rollback option.
- Decommission the source warehouse.
Total timeline: 24 to 28 weeks for a mid-size warehouse. Larger warehouses (50 TB or more, 1,000+ tables) typically take 9 to 12 months. If anyone tells you a major data warehouse migration can be done in 8 weeks, they have not done one.
Cost Modeling: The Comparison Nobody Wants to Do
Comparing warehouse costs across clouds is difficult because the pricing models are fundamentally different. Redshift charges for provisioned nodes. BigQuery charges per query (on-demand) or per slot (editions). Synapse charges for DWUs. Here is a rough comparison for a 10 TB warehouse running 1,000 queries per day:
- Redshift (4x ra3.xlplus, 3-year reserved): Approximately $1,100 per month compute + $240 per month managed storage = $1,340 per month.
- BigQuery (on-demand, 2 TB average scanned per day): Approximately $370 per month queries + $200 per month storage = $570 per month. But a single expensive query can spike this dramatically.
- BigQuery (Enterprise, 100 slots): Approximately $2,920 per month compute + $200 per month storage = $3,120 per month. More predictable but more expensive.
- Synapse (DW500c, 12 hours per day): Approximately $2,160 per month compute + $230 per month storage = $2,390 per month. Pausing during off-hours saves significantly.
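The scenarios above reduce to plain arithmetic, which makes it easy to substitute your own workload numbers. Rates are the list prices quoted in this article; always re-check current pricing pages before committing:

```python
# The comparison scenarios above as arithmetic, using the list prices
# quoted in this article. Substitute your own workload numbers.

HOURS_PER_MONTH = 730
STORAGE_PER_GB = 0.02  # active storage, $/GB-month

def bigquery_on_demand(tb_scanned_per_day: float, storage_tb: float) -> float:
    scanned = max(tb_scanned_per_day * 30 - 1, 0)  # first 1 TB/month free
    return scanned * 6.25 + storage_tb * 1024 * STORAGE_PER_GB

def bigquery_slots(slots: int, storage_tb: float) -> float:
    return slots * 0.04 * HOURS_PER_MONTH + storage_tb * 1024 * STORAGE_PER_GB

def synapse_dedicated(hourly_rate: float, hours_per_day: float) -> float:
    return hourly_rate * hours_per_day * 30  # paused the rest of the day

print(round(bigquery_on_demand(2, 10)))   # 574
print(round(bigquery_slots(100, 10)))     # 3125
print(round(synapse_dedicated(6.0, 12)))  # 2160
```

Note how sensitive the on-demand number is to scan volume: doubling the average daily scan roughly doubles the bill, which is why partition pruning and column selection are cost controls, not just performance tuning.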
BigQuery on-demand is the cheapest for light workloads. Redshift reserved is the cheapest for heavy, predictable workloads. Synapse and BigQuery's slot-based Enterprise edition are the most expensive in this scenario, though Synapse offers the best integration with the Microsoft ecosystem. Choose based on your workload patterns and existing cloud investments, not just the sticker price.
Do not forget the migration cost itself
Budget 2 to 4 full-time engineers for 6 months for a mid-size migration. At $150,000 per engineer annually, that is $150,000 to $300,000 in labor alone. Add cloud costs for running both warehouses in parallel during migration (4 to 8 weeks at double your current spend), data transfer costs for moving terabytes between clouds, and the productivity hit to BI teams who need to revalidate their reports. The total migration cost is often 2 to 3 times the annual savings from the new warehouse. Make sure the ROI justifies the investment over a 3 to 5 year horizon.
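That ROI check is a one-function calculation. Every input below is an assumption drawn from the figures above; substitute your own:

```python
# Back-of-envelope payback period for the migration itself.
# All inputs are assumptions taken from the figures above.

def migration_payback_years(engineers: int, months: int,
                            annual_salary: float,
                            parallel_run_monthly: float,
                            parallel_months: int,
                            annual_savings: float) -> float:
    labor = engineers * annual_salary * months / 12
    parallel = parallel_run_monthly * parallel_months
    return (labor + parallel) / annual_savings

# 3 engineers for 6 months at $150k/year, 2 months of double-running the
# old warehouse at $2,400/month, saving $100k/year on the new platform:
years = migration_payback_years(3, 6, 150_000, 2_400, 2, 100_000)
print(round(years, 1))  # 2.3
```

If the payback period lands beyond your planning horizon, the honest conclusion may be to stay put and optimize the existing warehouse instead.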
Choosing the Right Target
After going through multiple migrations, here is my decision framework:
- Choose Redshift if you are already on AWS, have predictable query workloads, and want the deepest integration with the AWS data ecosystem (Glue, Lake Formation, Athena, SageMaker).
- Choose BigQuery if you want zero infrastructure management, have variable workloads, or want the best price-performance for ad-hoc analytics. BigQuery is the best choice for organizations that cannot predict their query volume.
- Choose Synapse if you are a Microsoft shop with heavy Power BI usage, existing Azure AD infrastructure, and teams that know T-SQL. The unified analytics story (SQL, Spark, and pipelines in one service) is a genuine differentiator.
- Choose ADW if you are migrating from Oracle on-prem and want the path of least resistance. The SQL compatibility advantage is real and can save months of query rewriting.
The worst reason to choose a data warehouse is because it won a benchmark. Benchmarks test specific query patterns on specific data distributions. Your workload is not a benchmark. Run a proof-of-concept with your actual data and your actual queries on each platform you are considering. Two weeks of POC testing will tell you more than any number of vendor presentations.
Written by CloudToolStack Team
Cloud architects with 15+ years of production experience across AWS, Azure, GCP, and OCI. We build free tools and write practical guides to help engineers navigate multi-cloud infrastructure.
Disclaimer: This article is for informational purposes. Cloud services and pricing change frequently; always verify with official provider documentation. AWS, Azure, GCP, and OCI are trademarks of their respective owners.