Skip to content

Materialized Views User Guide

Version: v5.5 Last Updated: January 4, 2026


Table of Contents

  1. Introduction
  2. Creating Materialized Views
  3. Refresh Strategies
  4. Query Rewriting
  5. Storage Backends
  6. Monitoring and Statistics
  7. Configuration Tuning
  8. Best Practices
  9. Advanced Topics

Introduction

Materialized views store pre-computed query results for faster access. Unlike regular views, which execute the underlying query each time, materialized views cache results and update them according to a defined refresh strategy.

When to Use Materialized Views

Good candidates: - Frequently executed expensive queries - Queries with complex joins - Aggregations over large datasets - Queries with stable predicates - OLAP/analytics workloads

Poor candidates: - Highly volatile data requiring real-time accuracy - One-time queries - Simple primary key lookups - Queries where refresh cost exceeds query cost


Creating Materialized Views

Basic Creation

use heliosdb_compute::materialized_views::*;

let manager = MaterializedViewManager::new();

// Create a simple materialized view
let view = MaterializedView::new(
    "user_order_summary".to_string(),
    r#"
        SELECT
            u.id as user_id,
            u.name,
            COUNT(o.id) as order_count,
            SUM(o.total) as total_spent
        FROM users u
        LEFT JOIN orders o ON u.id = o.user_id
        GROUP BY u.id, u.name
    "#.to_string(),
    RefreshStrategy::Manual,
);

let view_id = manager.create_view(view).await?;

With Source Table Tracking

let mut view = MaterializedView::new(name, query, strategy);

// Track source tables for dependency management
view.add_source_table(1); // users table
view.add_source_table(2); // orders table

let view_id = manager.create_view(view).await?;

With Storage Backend

let view = MaterializedView::new(name, query, strategy)
    .with_storage_backend(StorageBackend::Columnar); // For analytics

// Options:
// - StorageBackend::Lsm (default) - General purpose
// - StorageBackend::Memory - Fast, volatile
// - StorageBackend::Columnar - Analytics optimized
// - StorageBackend::Hybrid - Hot/cold tiering

Refresh Strategies

Manual Refresh

Refresh on-demand when you explicitly call refresh:

let view = MaterializedView::new(
    "my_view".to_string(),
    "SELECT * FROM orders WHERE status = 'completed'".to_string(),
    RefreshStrategy::Manual,
);

manager.create_view(view).await?;

// Manually trigger refresh
manager.refresh_view(view_id).await?;

On-Commit Refresh

Refresh automatically after each transaction commit:

let view = MaterializedView::new(
    "realtime_view".to_string(),
    "SELECT user_id, COUNT(*) FROM events GROUP BY user_id".to_string(),
    RefreshStrategy::OnCommit,
);

Note: On-commit refresh adds latency to commits. Use for small views only.

Interval-Based Refresh

Refresh at regular time intervals:

let schedule = RefreshSchedule {
    schedule_type: ScheduleType::Interval { seconds: 300 }, // Every 5 minutes
    enabled: true,
    start_time: None,
    end_time: None,
};

let view = MaterializedView::new(
    "periodic_view".to_string(),
    query,
    RefreshStrategy::Scheduled(schedule),
);

Cron-Based Refresh

Refresh according to cron expression:

let schedule = RefreshSchedule {
    schedule_type: ScheduleType::Cron {
        expression: "0 */6 * * *".to_string(), // Every 6 hours
    },
    enabled: true,
    start_time: None,
    end_time: None,
};

let view = MaterializedView::new(
    "scheduled_view".to_string(),
    query,
    RefreshStrategy::Scheduled(schedule),
);

Common Cron Expressions:

Expression Schedule
"0 * * * *" Every hour at minute 0
"0 0 * * *" Daily at midnight
"0 0 * * 0" Weekly on Sunday at midnight
"*/15 * * * *" Every 15 minutes
"0 9-17 * * 1-5" Weekdays 9 AM to 5 PM hourly
"0 0 1 * *" Monthly on the 1st at midnight

Choosing a Refresh Strategy

Data Change Rate Recommended Strategy Notes
Milliseconds OnCommit Real-time requirements
Seconds Interval (short) High freshness, higher cost
Minutes Interval or Cron Balanced approach
Hours Cron Standard for daily patterns
Days Cron Low-frequency updates
On-demand Manual User-triggered

Query Rewriting

The query rewriter automatically substitutes materialized views to accelerate queries.

Enabling Query Rewriting

let rewriter = QueryRewriter::new();

// Register materialized views
rewriter.register_view(view).await?;

// Attempt to rewrite a query
let result = rewriter.rewrite_query(&query).await?;

if result.success {
    println!("Cost savings: {:.2}%",
             (result.cost_savings / result.original_cost) * 100.0);
}

Matching Strategies

  1. Exact Match: Query exactly matches view definition
  2. Join Elimination: View pre-computes expensive joins
  3. Aggregation Pushdown: View pre-computes aggregations
  4. Superset Match: View contains all query data plus more
  5. Filter Pushdown: View has beneficial filters
  6. Column Projection: Query needs subset of view columns
  7. Custom Rules: User-defined pattern matching

Example: Join Elimination

-- Original query requires expensive join
SELECT u.name, o.total
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.status = 'completed';

-- Materialized view pre-computes the join
CREATE MATERIALIZED VIEW mv_user_orders AS
SELECT u.name, o.total, o.status
FROM users u
JOIN orders o ON u.id = o.user_id;

-- Rewritten query uses view (no join!)
SELECT name, total FROM mv_user_orders WHERE status = 'completed';

Example: Aggregation Pushdown

-- Original query computes aggregation
SELECT user_id, COUNT(*), SUM(amount)
FROM orders
GROUP BY user_id;

-- Materialized view pre-computes aggregations
CREATE MATERIALIZED VIEW mv_user_summary AS
SELECT user_id, COUNT(*) as order_count, SUM(amount) as total_amount
FROM orders
GROUP BY user_id;

-- Rewritten query uses view (no aggregation!)
SELECT user_id, order_count, total_amount FROM mv_user_summary;

Storage Backends

LSM Tree (Default)

General-purpose storage with good write and read performance:

let view = view.with_storage_backend(StorageBackend::Lsm);

Memory

Fast but volatile; data lost on restart:

let view = view.with_storage_backend(StorageBackend::Memory);

Columnar

Optimized for analytical queries with column scans:

let view = view.with_storage_backend(StorageBackend::Columnar);

Hybrid

Automatic hot/cold tiering:

let view = view.with_storage_backend(StorageBackend::Hybrid);

Monitoring and Statistics

View Statistics

let view = manager.get_view(view_id).await?;

println!("Row count: {}", view.statistics.row_count);
println!("Size: {} bytes", view.statistics.size_bytes);
println!("Last refresh: {}", view.statistics.last_refresh_time);
println!("Avg refresh duration: {} ms", view.statistics.avg_refresh_duration_ms);
println!("Query hits: {}", view.statistics.query_hit_count);
println!("Failures: {}", view.statistics.failure_count);

Rewriter Statistics

let stats = rewriter.get_statistics().await;

println!("Total queries: {}", stats.total_queries);
println!("Successful rewrites: {}", stats.successful_rewrites);
println!("Success rate: {:.2}%", stats.success_rate);
println!("Total cost savings: {}", stats.total_cost_savings);

Health Checks

// Check if view is fresh
if view.is_fresh() {
    println!("View is up to date");
} else {
    println!("View needs refresh");
}

// Check view state
match view.state {
    ViewState::Active => println!("View is active"),
    ViewState::Stale => println!("View needs refresh"),
    ViewState::Refreshing => println!("View is refreshing"),
    ViewState::Error => println!("View has errors"),
}

Configuration Tuning

Incremental Refresh Config

let config = IncrementalConfig {
    max_batch_size: 10000,
    enable_compression: true,
    enable_coalescing: true,
    full_refresh_threshold_percent: 25.0, // Full refresh if >25% changed
};

Refresh Manager Config

let config = RefreshConfig {
    max_concurrent_refreshes: 4,
    max_retries: 3,
    retry_delay_seconds: 60,
    task_timeout_seconds: 3600,
    enable_throttling: true,
    min_refresh_interval_seconds: 10,
};

Rewriter Config

let config = RewriterConfig {
    enable_rewriting: true,
    min_cost_savings_percent: 10.0, // Only rewrite if >10% savings
    max_views_to_consider: 20,
    enable_partial_matching: true,
    enable_view_composition: false,
};

Best Practices

1. Start with Manual Refresh

Test view creation and queries before enabling automatic refresh:

// Start manual for testing
let view = MaterializedView::new(name, query, RefreshStrategy::Manual);

// After validation, update to scheduled
view.set_refresh_strategy(RefreshStrategy::Scheduled(schedule));

2. Monitor Statistics

Track refresh duration and query hit rates:

if view.statistics.avg_refresh_duration_ms > 60000 {
    println!("Average refresh > 1 minute, consider optimization");
}

if view.statistics.query_hit_count < 10 {
    println!("Low hit rate, consider dropping view");
}

3. Use Incremental Refresh

Enable incremental refresh for large views:

let engine = IncrementalRefreshEngine::new();
// Much faster than full refresh for small changes

4. Choose Appropriate Storage

Workload Backend
General OLTP LSM
Analytics Columnar
Low-latency cache Memory
Mixed workload Hybrid

5. Set Realistic Schedules

Match refresh frequency to data volatility:

// High volatility: frequent refresh
ScheduleType::Interval { seconds: 60 }

// Low volatility: daily refresh
ScheduleType::Cron { expression: "0 0 * * *".to_string() }

6. Create Indexes on Views

Improve query performance on materialized results:

CREATE INDEX idx_mv_user_id ON mv_user_summary(user_id);

7. Clean Up Unused Views

Drop views with low query hit rates:

for view in manager.list_views().await? {
    if view.statistics.query_hit_count < 5 {
        println!("Consider dropping: {}", view.name);
    }
}

Advanced Topics

Cascading Refresh

Views that depend on other views:

// View A: Base aggregation
let view_a = MaterializedView::new(
    "daily_sales".to_string(),
    "SELECT date, SUM(amount) FROM orders GROUP BY date".to_string(),
    RefreshStrategy::Scheduled(daily_schedule),
);

// View B: Depends on View A
let view_b = MaterializedView::new(
    "monthly_sales".to_string(),
    "SELECT MONTH(date), SUM(daily_total) FROM daily_sales GROUP BY MONTH(date)".to_string(),
    RefreshStrategy::Manual, // Refresh after view_a
);

Partition-Aware Refresh

Refresh only specific partitions:

// Refresh only today's partition
manager.refresh_partition(view_id, "2026-01-04").await?;

View Composition

Combine multiple views for complex queries (experimental):

let config = RewriterConfig {
    enable_view_composition: true,
    ..Default::default()
};