OCIDatabasesintermediate

MySQL HeatWave on OCI

Provision MySQL HeatWave for combined OLTP and OLAP: HeatWave cluster, Lakehouse for Object Storage queries, and in-database ML.

CloudToolStack Editorial24 min readPublished Mar 14, 2026

Prerequisites

Basic MySQL knowledge (SQL queries, database concepts)
OCI account with MySQL Database Service permissions

Introduction to MySQL HeatWave on OCI

MySQL HeatWave is a fully managed database service on Oracle Cloud Infrastructure that combines MySQL with an integrated, high-performance in-memory query accelerator called HeatWave. Unlike traditional MySQL deployments where analytical queries compete with transactional workloads for the same resources, HeatWave offloads analytics to a massively parallel processing (MPP) engine that can accelerate queries by orders of magnitude without any application changes.

The service provides the full MySQL 8.0 or 9.0 experience with enterprise features including automatic backups, point-in-time recovery, high availability with automatic failover, read replicas, and data-at-rest encryption. What makes it unique is the HeatWave cluster that loads data from MySQL InnoDB tables into memory across multiple nodes and executes analytical queries using columnar storage and vectorized processing. Queries that take minutes on standard MySQL complete in seconds on HeatWave.

This guide covers provisioning a MySQL DB System, enabling and configuring the HeatWave cluster, loading data into HeatWave, optimizing analytical queries, using HeatWave Lakehouse to query data in Object Storage, leveraging HeatWave ML for in-database machine learning, configuring high availability, and managing costs.

HeatWave vs. Traditional Analytics

Traditional approaches to MySQL analytics require extracting data to a separate data warehouse (ETL pipelines to Redshift, BigQuery, or Snowflake). HeatWave eliminates this by running analytics directly against your MySQL data with zero ETL. Your application continues to use standard MySQL queries and connectors. HeatWave automatically determines which queries benefit from acceleration and routes them to the HeatWave cluster. Oracle benchmarks show HeatWave is up to 6,500x faster than MySQL alone and 7x faster than Amazon Redshift for standard TPC-H and TPC-DS benchmarks.

Provisioning a MySQL DB System

A MySQL DB System is a managed MySQL server instance running on dedicated OCI compute. You choose the shape (CPU and memory), storage size, MySQL version, and networking configuration. The service handles patching, backups, monitoring, and failover automatically.

bash

# Create a MySQL DB System
oci mysql db-system create \
  --compartment-id <compartment-ocid> \
  --display-name "analytics-db" \
  --description "MySQL HeatWave for analytics workloads" \
  --shape-name "MySQL.HeatWave.VM.Standard" \
  --availability-domain "AD-1" \
  --subnet-id <private-subnet-ocid> \
  --admin-username "admin" \
  --admin-password "YourStr0ng_P@ssword!" \
  --data-storage-size-in-gbs 1024 \
  --mysql-version "9.0.0" \
  --backup-policy '{
    "isEnabled": true,
    "retentionInDays": 7,
    "windowStartTime": "02:00"
  }' \
  --maintenance '{
    "windowStartTime": "sun 03:00"
  }' \
  --is-highly-available true \
  --configuration-id <configuration-ocid>

# Wait for the DB System to become active
oci mysql db-system get \
  --db-system-id <db-system-ocid> \
  --query 'data.{"Name": "display-name", "State": "lifecycle-state", "Shape": "shape-name", "Endpoint": endpoints[0]."ip-address", "Port": endpoints[0].port}' \
  --output table

# List available MySQL shapes
oci mysql shape list \
  --compartment-id <compartment-ocid> \
  --query 'data[?contains(name, `HeatWave`)].{"Shape": name, "CPUs": "cpu-core-count", "Memory (GB)": "memory-size-in-gbs"}' \
  --output table

# Connect to the MySQL instance
mysql -h <endpoint-ip> -u admin -p -P 3306

Network Configuration

MySQL DB Systems should be placed in a private subnet for security. Access them from compute instances, Container Instances, or Functions within the same VCN. For local development access, use an OCI Bastion session to create an SSH tunnel. Never expose MySQL directly to the internet. If you need external application access, use an OCI Load Balancer or API Gateway as a secure proxy layer.

Enabling the HeatWave Cluster

The HeatWave cluster is a set of memory-optimized nodes that work alongside your MySQL DB System. When enabled, you choose the number of HeatWave nodes based on the size of your data. Each node provides approximately 512 GB of in-memory storage. Data is automatically distributed across nodes using hash partitioning for parallel query execution.

bash

# Add a HeatWave cluster to an existing DB System
oci mysql heat-wave-cluster add \
  --db-system-id <db-system-ocid> \
  --shape-name "MySQL.HeatWave.VM.Standard" \
  --cluster-size 3 \
  --is-lakehouse-enabled true

# Check HeatWave cluster status
oci mysql heat-wave-cluster get \
  --db-system-id <db-system-ocid> \
  --query 'data.{"State": "lifecycle-state", "Nodes": "cluster-size", "Shape": "shape-name", "Lakehouse": "is-lakehouse-enabled"}' \
  --output table

# Use HeatWave Auto Shape Estimation to determine optimal cluster size
# Run from MySQL shell:
# CALL sys.heatwave_advisor(JSON_OBJECT("target_schema", JSON_ARRAY("analytics_db")));

# Scale the HeatWave cluster
oci mysql heat-wave-cluster update \
  --db-system-id <db-system-ocid> \
  --cluster-size 5

# Stop the HeatWave cluster to save costs (data is reloaded on restart)
oci mysql heat-wave-cluster stop \
  --db-system-id <db-system-ocid>

# Restart the HeatWave cluster
oci mysql heat-wave-cluster start \
  --db-system-id <db-system-ocid>

Loading Data into HeatWave

Data is loaded from MySQL InnoDB tables into the HeatWave cluster using the SECONDARY_ENGINE feature. You mark tables for HeatWave acceleration, and the system loads the data into columnar format across the HeatWave nodes. Changes to the InnoDB tables are automatically propagated to HeatWave in real time, so the in-memory data stays synchronized without any ETL pipeline.

sql

-- Enable HeatWave for specific tables
ALTER TABLE orders SECONDARY_ENGINE = RAPID;
ALTER TABLE order_items SECONDARY_ENGINE = RAPID;
ALTER TABLE products SECONDARY_ENGINE = RAPID;
ALTER TABLE customers SECONDARY_ENGINE = RAPID;

-- Load the tables into HeatWave memory
ALTER TABLE orders SECONDARY_LOAD;
ALTER TABLE order_items SECONDARY_LOAD;
ALTER TABLE products SECONDARY_LOAD;
ALTER TABLE customers SECONDARY_LOAD;

-- Check which tables are loaded into HeatWave
SELECT NAME, LOAD_STATUS, LOAD_PROGRESS,
       POOL_TYPE, DATA_MEMORY_BYTES / 1024 / 1024 AS data_mb
FROM performance_schema.rpd_tables
ORDER BY data_mb DESC;

-- Check HeatWave node status
SELECT * FROM performance_schema.rpd_nodes;

-- Use HeatWave Auto Parallel Load for large tables
CALL sys.heatwave_load(JSON_ARRAY("analytics_db"), NULL);

-- Verify a query is using HeatWave (check EXPLAIN)
EXPLAIN SELECT
    c.customer_name,
    COUNT(o.order_id) AS total_orders,
    SUM(oi.quantity * oi.unit_price) AS total_spend
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
JOIN order_items oi ON o.order_id = oi.order_id
WHERE o.order_date >= '2026-01-01'
GROUP BY c.customer_name
ORDER BY total_spend DESC
LIMIT 20;

-- The EXPLAIN output should show "Using secondary engine RAPID"
-- If it shows standard MySQL execution, the optimizer determined
-- HeatWave would not be faster for this specific query

HeatWave Query Offloading

The MySQL optimizer automatically decides whether to execute a query on InnoDB or HeatWave. Simple point lookups and small result set queries typically run on InnoDB (which is faster for these patterns), while complex aggregations, joins, and full table scans are offloaded to HeatWave. You can force HeatWave execution with SET SESSION use_secondary_engine=FORCED for testing, but in production, let the optimizer choose. Use EXPLAIN to verify which engine will handle your query.

HeatWave Lakehouse

HeatWave Lakehouse extends HeatWave's analytical capabilities to data stored in OCI Object Storage. Instead of loading data into MySQL tables first, Lakehouse can query CSV, Parquet, Avro, and JSON files directly in Object Storage and join them with MySQL tables. This is ideal for analyzing large datasets that are too large or too infrequently accessed to store in MySQL, or for querying data lake files alongside operational data.

sql

-- Create an external table pointing to Object Storage
CREATE TABLE sales_history_ext ENGINE = LAKEHOUSE
SECONDARY_ENGINE = RAPID
AS SELECT * FROM (
    SELECT
        CAST(col1 AS DATE) AS sale_date,
        CAST(col2 AS VARCHAR(50)) AS product_id,
        CAST(col3 AS VARCHAR(100)) AS product_name,
        CAST(col4 AS DECIMAL(10,2)) AS sale_amount,
        CAST(col5 AS INT) AS quantity,
        CAST(col6 AS VARCHAR(50)) AS region
    FROM TABLE(
        OCI_LAKEHOUSE_READER(
            'oci://data-lake@namespace/sales/',
            JSON_OBJECT(
                'format', 'parquet',
                'schema', JSON_ARRAY(
                    JSON_OBJECT('name', 'sale_date', 'type', 'DATE'),
                    JSON_OBJECT('name', 'product_id', 'type', 'VARCHAR(50)'),
                    JSON_OBJECT('name', 'product_name', 'type', 'VARCHAR(100)'),
                    JSON_OBJECT('name', 'sale_amount', 'type', 'DECIMAL(10,2)'),
                    JSON_OBJECT('name', 'quantity', 'type', 'INT'),
                    JSON_OBJECT('name', 'region', 'type', 'VARCHAR(50)')
                )
            )
        )
    )
);

-- Load the external table data into HeatWave
ALTER TABLE sales_history_ext SECONDARY_LOAD;

-- Query across MySQL tables and Lakehouse data
SELECT
    p.category,
    sh.region,
    DATE_FORMAT(sh.sale_date, '%Y-%m') AS month,
    COUNT(*) AS transactions,
    SUM(sh.sale_amount) AS revenue,
    AVG(sh.sale_amount) AS avg_order_value
FROM sales_history_ext sh
JOIN products p ON sh.product_id = p.product_id
WHERE sh.sale_date >= '2025-01-01'
GROUP BY p.category, sh.region, month
ORDER BY revenue DESC
LIMIT 50;

-- Auto-generate Lakehouse table definition from file metadata
CALL sys.heatwave_load(
    JSON_ARRAY("analytics_db"),
    JSON_OBJECT(
        "mode", "dryrun",
        "output", JSON_ARRAY("lakehouse_auto_schema")
    )
);

HeatWave ML (In-Database Machine Learning)

HeatWave ML brings machine learning directly into MySQL. You can train models, make predictions, and explain model behavior using standard SQL statements without moving data to an external ML platform or learning a new API. HeatWave ML supports classification, regression, time series forecasting, anomaly detection, and recommendation tasks. It automatically selects the best algorithm, tunes hyperparameters, and handles feature engineering.

sql

-- Train a classification model to predict customer churn
CALL sys.ML_TRAIN(
    'analytics_db.customer_features',  -- Training data table
    'churned',                          -- Target column
    JSON_OBJECT(
        'task', 'classification',
        'model_handle', 'churn_predictor_v1'
    )
);

-- Check model training status
CALL sys.ML_MODEL_STATUS('churn_predictor_v1');

-- View model metadata and performance metrics
CALL sys.ML_EXPLAIN(
    'analytics_db.customer_features',
    'churned',
    JSON_OBJECT('model_handle', 'churn_predictor_v1')
);

-- Make predictions on new data
CALL sys.ML_PREDICT_TABLE(
    'analytics_db.new_customers',       -- Input table
    'churn_predictor_v1',                -- Model handle
    'analytics_db.churn_predictions'     -- Output table
);

-- View predictions
SELECT
    customer_id,
    customer_name,
    ml_results->>'$.prediction' AS predicted_churn,
    ml_results->>'$.probability' AS churn_probability
FROM analytics_db.churn_predictions
WHERE ml_results->>'$.prediction' = '1'
ORDER BY ml_results->>'$.probability' DESC
LIMIT 20;

-- Train a regression model for revenue forecasting
CALL sys.ML_TRAIN(
    'analytics_db.monthly_revenue',
    'revenue',
    JSON_OBJECT(
        'task', 'regression',
        'model_handle', 'revenue_forecast_v1'
    )
);

-- Train a time series forecasting model
CALL sys.ML_TRAIN(
    'analytics_db.daily_sales',
    'sales_amount',
    JSON_OBJECT(
        'task', 'forecasting',
        'model_handle', 'sales_forecast',
        'datetime_index', 'sale_date',
        'forecast_horizon', 30
    )
);

-- Train an anomaly detection model
CALL sys.ML_TRAIN(
    'analytics_db.transaction_features',
    NULL,  -- No target for unsupervised learning
    JSON_OBJECT(
        'task', 'anomaly_detection',
        'model_handle', 'fraud_detector_v1'
    )
);

HeatWave AutoML

HeatWave ML uses AutoML to automatically select the best algorithm from a library including XGBoost, Random Forest, LightGBM, and linear models. It performs automatic feature preprocessing, handles missing values, encodes categorical variables, and tunes hyperparameters using Bayesian optimization. The entire training process is transparent: you can view the algorithm selection, feature importance, and performance metrics through the ML_EXPLAINstored procedure.

High Availability and Backup

MySQL HeatWave supports high availability (HA) with automatic failover. When HA is enabled, the service maintains a standby instance in a different availability domain or fault domain. If the primary instance fails, the service automatically promotes the standby to primary with minimal downtime (typically 30-60 seconds). The HeatWave cluster data is automatically reloaded after failover.

bash

# Enable high availability on an existing DB System
oci mysql db-system update \
  --db-system-id <db-system-ocid> \
  --is-highly-available true

# Create a read replica for read scaling
oci mysql replica create \
  --db-system-id <db-system-ocid> \
  --display-name "analytics-read-replica" \
  --description "Read replica for reporting queries"

# List read replicas
oci mysql replica list \
  --compartment-id <compartment-ocid> \
  --db-system-id <db-system-ocid> \
  --query 'data.items[].{"Name": "display-name", "State": "lifecycle-state", "IP": "ip-address"}' \
  --output table

# Create a manual backup
oci mysql backup create \
  --db-system-id <db-system-ocid> \
  --display-name "pre-migration-backup" \
  --backup-type FULL \
  --retention-in-days 30

# List backups
oci mysql backup list \
  --compartment-id <compartment-ocid> \
  --db-system-id <db-system-ocid> \
  --query 'data.items[].{"Name": "display-name", "Type": "backup-type", "State": "lifecycle-state", "Size (GB)": "data-storage-size-in-gbs", "Created": "time-created"}' \
  --output table

# Restore from a backup (creates a new DB System)
oci mysql db-system create \
  --compartment-id <compartment-ocid> \
  --display-name "analytics-db-restored" \
  --shape-name "MySQL.HeatWave.VM.Standard" \
  --subnet-id <subnet-ocid> \
  --source '{
    "sourceType": "BACKUP",
    "backupId": "<backup-ocid>"
  }'

Performance Tuning and Optimization

Optimizing MySQL HeatWave performance involves tuning both the MySQL server configuration and the HeatWave cluster settings. The HeatWave Advisor provides automated recommendations for table loading, cluster sizing, and query optimization.

MySQL Configuration Variables for HeatWave

Variable	Default	Description
`use_secondary_engine`	ON	Enable automatic query offloading to HeatWave
`rapid_change_propagation_status`	ON	Enable real-time change propagation to HeatWave
`rapid_change_propagation_blocking`	OFF	Block DML until changes propagate to HeatWave
`secondary_engine_cost_threshold`	0	Minimum InnoDB cost before offloading to HeatWave
`rapid_acceleration_max_table_count`	500	Maximum number of tables loaded into HeatWave

sql

-- Check current HeatWave performance
SELECT
    COUNT_STAR AS total_queries,
    SUM_ROWS_EXAMINED AS rows_examined,
    SUM_ROWS_SENT AS rows_returned,
    ROUND(AVG_TIMER_WAIT / 1000000000, 2) AS avg_time_ms
FROM performance_schema.events_statements_summary_by_digest
WHERE QUERY_SAMPLE_TEXT LIKE '%RAPID%'
ORDER BY COUNT_STAR DESC
LIMIT 10;

-- Run the HeatWave Advisor for optimization recommendations
CALL sys.heatwave_advisor(
    JSON_OBJECT(
        "target_schema", JSON_ARRAY("analytics_db"),
        "output", JSON_ARRAY(
            "recommendations",
            "table_analysis",
            "column_analysis"
        )
    )
);

-- Check HeatWave memory utilization
SELECT * FROM sys.heatwave_cluster_status;

-- Monitor change propagation lag
SELECT
    TABLE_NAME,
    LOAD_STATUS,
    ROWS_LOADED,
    ROWS_MODIFIED_SINCE_LOAD
FROM performance_schema.rpd_tables
WHERE ROWS_MODIFIED_SINCE_LOAD > 0
ORDER BY ROWS_MODIFIED_SINCE_LOAD DESC;

Cost Management

HeatWave cluster nodes are billed per hour regardless of query activity. If your analytics workload runs only during business hours, consider stopping the HeatWave cluster overnight and on weekends to reduce costs by up to 70%. Data is automatically reloaded when the cluster restarts (reload time depends on data size). For intermittent analytical workloads, evaluate whether the query performance improvement justifies the continuous cluster cost compared to running queries on MySQL InnoDB directly.

Summary and Architecture Recommendations

MySQL HeatWave on OCI eliminates the need for separate analytics infrastructure by combining OLTP and OLAP capabilities in a single database service. Use the MySQL DB System for your transactional workload, enable HeatWave for analytical queries, use Lakehouse to query data lake files, and leverage HeatWave ML for in-database machine learning. This unified architecture reduces complexity, eliminates ETL pipelines, and ensures analytics always run against fresh data.

For production deployments, enable high availability for automatic failover, configure automated backups with sufficient retention, place the DB System in a private subnet, use read replicas to separate reporting traffic from the primary, and monitor performance using OCI Monitoring metrics. Use the HeatWave Advisor regularly to optimize your cluster configuration as your data grows.

Getting Started with Oracle Cloud OCI Data Science Platform OCI VCN Networking Deep Dive

Key Takeaways

1HeatWave accelerates analytical queries by orders of magnitude using in-memory columnar processing.
2HeatWave Lakehouse queries CSV, Parquet, and JSON files in Object Storage with zero ETL.
3HeatWave ML provides in-database machine learning with AutoML for classification, regression, and forecasting.
4Changes to InnoDB tables propagate to HeatWave in real time, eliminating the need for ETL pipelines.

Frequently Asked Questions

How much faster is HeatWave than standard MySQL?

Oracle benchmarks show HeatWave is up to 6,500x faster than MySQL alone for analytical queries and 7x faster than Amazon Redshift for TPC-H and TPC-DS benchmarks. Actual speedup depends on query complexity, data volume, and cluster size. Simple point lookups run on InnoDB (which is faster for those patterns), while complex aggregations and joins are offloaded to HeatWave.

Do applications need to change to use HeatWave?

No. HeatWave works transparently with standard MySQL queries and connectors. The MySQL optimizer automatically determines which queries benefit from HeatWave acceleration and routes them accordingly. You mark tables for HeatWave with ALTER TABLE ... SECONDARY_ENGINE = RAPID, and the optimizer handles the rest.

Written by CloudToolStack Editorial

Written and reviewed by the CloudToolStack editorial team. Every guide is verified against current provider documentation and revised in place when providers change pricing, deprecate services, or release meaningfully better alternatives.

Disclaimer: This guide is for educational purposes. Cloud services change frequently; always refer to official documentation for the latest information. AWS, Azure, and GCP are trademarks of their respective owners.