Materialized Views User Guide¶
Version: v5.5 Last Updated: January 4, 2026
Table of Contents¶
- Introduction
- Creating Materialized Views
- Refresh Strategies
- Query Rewriting
- Storage Backends
- Monitoring and Statistics
- Configuration Tuning
- Best Practices
- Advanced Topics
Introduction¶
Materialized views store pre-computed query results for faster access. Unlike regular views, which execute the underlying query each time, materialized views cache results and update them according to a defined refresh strategy.
When to Use Materialized Views¶
Good candidates: - Frequently executed expensive queries - Queries with complex joins - Aggregations over large datasets - Queries with stable predicates - OLAP/analytics workloads
Poor candidates: - Highly volatile data requiring real-time accuracy - One-time queries - Simple primary key lookups - Queries where refresh cost exceeds query cost
Creating Materialized Views¶
Basic Creation¶
use heliosdb_compute::materialized_views::*;
let manager = MaterializedViewManager::new();
// Create a simple materialized view
let view = MaterializedView::new(
"user_order_summary".to_string(),
r#"
SELECT
u.id as user_id,
u.name,
COUNT(o.id) as order_count,
SUM(o.total) as total_spent
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.name
"#.to_string(),
RefreshStrategy::Manual,
);
let view_id = manager.create_view(view).await?;
With Source Table Tracking¶
let mut view = MaterializedView::new(name, query, strategy);
// Track source tables for dependency management
view.add_source_table(1); // users table
view.add_source_table(2); // orders table
let view_id = manager.create_view(view).await?;
With Storage Backend¶
let view = MaterializedView::new(name, query, strategy)
.with_storage_backend(StorageBackend::Columnar); // For analytics
// Options:
// - StorageBackend::Lsm (default) - General purpose
// - StorageBackend::Memory - Fast, volatile
// - StorageBackend::Columnar - Analytics optimized
// - StorageBackend::Hybrid - Hot/cold tiering
Refresh Strategies¶
Manual Refresh¶
Refresh on-demand when you explicitly call refresh:
let view = MaterializedView::new(
"my_view".to_string(),
"SELECT * FROM orders WHERE status = 'completed'".to_string(),
RefreshStrategy::Manual,
);
manager.create_view(view).await?;
// Manually trigger refresh
manager.refresh_view(view_id).await?;
On-Commit Refresh¶
Refresh automatically after each transaction commit:
let view = MaterializedView::new(
"realtime_view".to_string(),
"SELECT user_id, COUNT(*) FROM events GROUP BY user_id".to_string(),
RefreshStrategy::OnCommit,
);
Note: On-commit refresh adds latency to commits. Use for small views only.
Interval-Based Refresh¶
Refresh at regular time intervals:
let schedule = RefreshSchedule {
schedule_type: ScheduleType::Interval { seconds: 300 }, // Every 5 minutes
enabled: true,
start_time: None,
end_time: None,
};
let view = MaterializedView::new(
"periodic_view".to_string(),
query,
RefreshStrategy::Scheduled(schedule),
);
Cron-Based Refresh¶
Refresh according to cron expression:
let schedule = RefreshSchedule {
schedule_type: ScheduleType::Cron {
expression: "0 */6 * * *".to_string(), // Every 6 hours
},
enabled: true,
start_time: None,
end_time: None,
};
let view = MaterializedView::new(
"scheduled_view".to_string(),
query,
RefreshStrategy::Scheduled(schedule),
);
Common Cron Expressions:
| Expression | Schedule |
|---|---|
"0 * * * *" |
Every hour at minute 0 |
"0 0 * * *" |
Daily at midnight |
"0 0 * * 0" |
Weekly on Sunday at midnight |
"*/15 * * * *" |
Every 15 minutes |
"0 9-17 * * 1-5" |
Weekdays 9 AM to 5 PM hourly |
"0 0 1 * *" |
Monthly on the 1st at midnight |
Choosing a Refresh Strategy¶
| Data Change Rate | Recommended Strategy | Notes |
|---|---|---|
| Milliseconds | OnCommit | Real-time requirements |
| Seconds | Interval (short) | High freshness, higher cost |
| Minutes | Interval or Cron | Balanced approach |
| Hours | Cron | Standard for daily patterns |
| Days | Cron | Low-frequency updates |
| On-demand | Manual | User-triggered |
Query Rewriting¶
The query rewriter automatically substitutes materialized views to accelerate queries.
Enabling Query Rewriting¶
let rewriter = QueryRewriter::new();
// Register materialized views
rewriter.register_view(view).await?;
// Attempt to rewrite a query
let result = rewriter.rewrite_query(&query).await?;
if result.success {
println!("Cost savings: {:.2}%",
(result.cost_savings / result.original_cost) * 100.0);
}
Matching Strategies¶
- Exact Match: Query exactly matches view definition
- Join Elimination: View pre-computes expensive joins
- Aggregation Pushdown: View pre-computes aggregations
- Superset Match: View contains all query data plus more
- Filter Pushdown: View has beneficial filters
- Column Projection: Query needs subset of view columns
- Custom Rules: User-defined pattern matching
Example: Join Elimination¶
-- Original query requires expensive join
SELECT u.name, o.total
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.status = 'completed';
-- Materialized view pre-computes the join
CREATE MATERIALIZED VIEW mv_user_orders AS
SELECT u.name, o.total, o.status
FROM users u
JOIN orders o ON u.id = o.user_id;
-- Rewritten query uses view (no join!)
SELECT name, total FROM mv_user_orders WHERE status = 'completed';
Example: Aggregation Pushdown¶
-- Original query computes aggregation
SELECT user_id, COUNT(*), SUM(amount)
FROM orders
GROUP BY user_id;
-- Materialized view pre-computes aggregations
CREATE MATERIALIZED VIEW mv_user_summary AS
SELECT user_id, COUNT(*) as order_count, SUM(amount) as total_amount
FROM orders
GROUP BY user_id;
-- Rewritten query uses view (no aggregation!)
SELECT user_id, order_count, total_amount FROM mv_user_summary;
Storage Backends¶
LSM Tree (Default)¶
General-purpose storage with good write and read performance:
Memory¶
Fast but volatile; data lost on restart:
Columnar¶
Optimized for analytical queries with column scans:
Hybrid¶
Automatic hot/cold tiering:
Monitoring and Statistics¶
View Statistics¶
let view = manager.get_view(view_id).await?;
println!("Row count: {}", view.statistics.row_count);
println!("Size: {} bytes", view.statistics.size_bytes);
println!("Last refresh: {}", view.statistics.last_refresh_time);
println!("Avg refresh duration: {} ms", view.statistics.avg_refresh_duration_ms);
println!("Query hits: {}", view.statistics.query_hit_count);
println!("Failures: {}", view.statistics.failure_count);
Rewriter Statistics¶
let stats = rewriter.get_statistics().await;
println!("Total queries: {}", stats.total_queries);
println!("Successful rewrites: {}", stats.successful_rewrites);
println!("Success rate: {:.2}%", stats.success_rate);
println!("Total cost savings: {}", stats.total_cost_savings);
Health Checks¶
// Check if view is fresh
if view.is_fresh() {
println!("View is up to date");
} else {
println!("View needs refresh");
}
// Check view state
match view.state {
ViewState::Active => println!("View is active"),
ViewState::Stale => println!("View needs refresh"),
ViewState::Refreshing => println!("View is refreshing"),
ViewState::Error => println!("View has errors"),
}
Configuration Tuning¶
Incremental Refresh Config¶
let config = IncrementalConfig {
max_batch_size: 10000,
enable_compression: true,
enable_coalescing: true,
full_refresh_threshold_percent: 25.0, // Full refresh if >25% changed
};
Refresh Manager Config¶
let config = RefreshConfig {
max_concurrent_refreshes: 4,
max_retries: 3,
retry_delay_seconds: 60,
task_timeout_seconds: 3600,
enable_throttling: true,
min_refresh_interval_seconds: 10,
};
Rewriter Config¶
let config = RewriterConfig {
enable_rewriting: true,
min_cost_savings_percent: 10.0, // Only rewrite if >10% savings
max_views_to_consider: 20,
enable_partial_matching: true,
enable_view_composition: false,
};
Best Practices¶
1. Start with Manual Refresh¶
Test view creation and queries before enabling automatic refresh:
// Start manual for testing
let view = MaterializedView::new(name, query, RefreshStrategy::Manual);
// After validation, update to scheduled
view.set_refresh_strategy(RefreshStrategy::Scheduled(schedule));
2. Monitor Statistics¶
Track refresh duration and query hit rates:
if view.statistics.avg_refresh_duration_ms > 60000 {
println!("Average refresh > 1 minute, consider optimization");
}
if view.statistics.query_hit_count < 10 {
println!("Low hit rate, consider dropping view");
}
3. Use Incremental Refresh¶
Enable incremental refresh for large views:
4. Choose Appropriate Storage¶
| Workload | Backend |
|---|---|
| General OLTP | LSM |
| Analytics | Columnar |
| Low-latency cache | Memory |
| Mixed workload | Hybrid |
5. Set Realistic Schedules¶
Match refresh frequency to data volatility:
// High volatility: frequent refresh
ScheduleType::Interval { seconds: 60 }
// Low volatility: daily refresh
ScheduleType::Cron { expression: "0 0 * * *".to_string() }
6. Create Indexes on Views¶
Improve query performance on materialized results:
7. Clean Up Unused Views¶
Drop views with low query hit rates:
for view in manager.list_views().await? {
if view.statistics.query_hit_count < 5 {
println!("Consider dropping: {}", view.name);
}
}
Advanced Topics¶
Cascading Refresh¶
Views that depend on other views:
// View A: Base aggregation
let view_a = MaterializedView::new(
"daily_sales".to_string(),
"SELECT date, SUM(amount) FROM orders GROUP BY date".to_string(),
RefreshStrategy::Scheduled(daily_schedule),
);
// View B: Depends on View A
let view_b = MaterializedView::new(
"monthly_sales".to_string(),
"SELECT MONTH(date), SUM(daily_total) FROM daily_sales GROUP BY MONTH(date)".to_string(),
RefreshStrategy::Manual, // Refresh after view_a
);
Partition-Aware Refresh¶
Refresh only specific partitions:
View Composition¶
Combine multiple views for complex queries (experimental):
Related Documentation¶
- README.md - Feature overview
- QUICK_START.md - Getting started guide
- TROUBLESHOOTING.md - Common issues
- Implementation Details - Technical reference