Storage Cost Attribution - Quick Start Guide¶
Get started with HeliosDB's storage cost attribution and optimization in 5 minutes.
What is Storage Cost Attribution?¶
Storage cost attribution is a system for tracking, analyzing, and optimizing storage costs at the table and column level. Intelligent tiering and compression typically enable a 20-30% cost reduction.
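To make the attribution math concrete, here is a minimal sketch of the two core formulas. These helpers are illustrative only, not part of the heliosdb_cost_optimizer_v2 API: monthly cost is stored gigabytes times the tier's $/GB/month rate, and compression savings are the monthly cost of the bytes compression eliminated.

```rust
// Illustrative helpers only; not part of the heliosdb_cost_optimizer_v2 API.
const GIB: f64 = 1_073_741_824.0; // bytes per GiB

/// Monthly cost = stored size in GB times the tier rate in $/GB/month.
fn monthly_cost(bytes: u64, cost_per_gb: f64) -> f64 {
    bytes as f64 / GIB * cost_per_gb
}

/// Compression savings = monthly cost of the bytes compression eliminated.
fn compression_savings(uncompressed: u64, compressed: u64, cost_per_gb: f64) -> f64 {
    monthly_cost(uncompressed - compressed, cost_per_gb)
}

fn main() {
    // A 10 GiB table on the hot tier at $0.10/GB/month
    println!("cost: ${:.2}/month", monthly_cost(10_737_418_240, 0.10));
    // At 3:1 compression, roughly two-thirds of those bytes never hit disk
    println!(
        "compression saves: ${:.2}/month",
        compression_savings(10_737_418_240, 3_579_139_413, 0.10)
    );
}
```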
Quick Start¶
1. Initialize Storage Attributor¶
use heliosdb_cost_optimizer_v2::{StorageAttributor, TierCostConfig};
// Create with default costs ($0.10/GB/month for hot tier)
let attributor = StorageAttributor::new(0.10);
// OR create with custom tier costs
let tier_costs = TierCostConfig {
    hot_cost_per_gb: 0.10,      // SSD
    warm_cost_per_gb: 0.05,     // HDD
    cold_cost_per_gb: 0.01,     // Object storage
    archive_cost_per_gb: 0.004, // Glacier-like
};
let attributor = StorageAttributor::with_tier_costs(tier_costs);
2. Track Table Metrics¶
use heliosdb_cost_optimizer_v2::{TableStorageMetrics, StorageTier, AccessFrequency, AccessPattern};
let metrics = TableStorageMetrics {
    table_name: "users".to_string(),
    total_bytes: 10_737_418_240, // 10 GB
    row_count: 1_000_000,
    index_bytes: 1_073_741_824,
    data_bytes: 9_663_676_416,
    compressed_bytes: 3_579_139_413,
    uncompressed_bytes: 10_737_418_240,
    compression_ratio: 3.0,
    avg_row_size: 10737.42,
    storage_tier: StorageTier::Hot,
    last_accessed: chrono::Utc::now(),
    created_at: chrono::Utc::now() - chrono::Duration::days(365),
    access_frequency: AccessFrequency {
        reads_per_day: 1000.0,
        writes_per_day: 100.0,
        last_7_days_reads: 7000,
        last_30_days_reads: 30000,
        access_pattern: AccessPattern::Hot,
    },
};
attributor.update_table_metrics(metrics).await;
3. Calculate Costs¶
// Total cost across all tables
let total_cost = attributor.calculate_total_cost().await;
println!("Total monthly cost: ${:.2}", total_cost);
// Cost by table
let cost_by_table = attributor.cost_by_table().await;
for (table, cost) in cost_by_table {
    println!("{}: ${:.2}/month", table, cost);
}
// Compression savings
let savings = attributor.compression_savings().await;
println!("Monthly savings from compression: ${:.2}", savings);
4. Get Optimization Recommendations¶
// Tiering recommendations
let tiering_recs = attributor.tiering_recommendations().await;
for rec in tiering_recs.iter().take(5) {
    println!("Move {} from {:?} to {:?}", rec.table, rec.from_tier, rec.to_tier);
    println!("  Annual savings: ${:.2}", rec.annual_savings_usd);
    println!("  Reason: {}", rec.reason);
}
// Compression recommendations
let compression_recs = attributor.compression_recommendations().await;
for rec in compression_recs.iter().take(5) {
    println!("Compress {}.{} with {:?}", rec.table, rec.column, rec.recommended_compression);
    println!("  Annual savings: ${:.2}", rec.expected_savings_usd_annual);
}
5. Track Trends and Forecast¶
use heliosdb_cost_optimizer_v2::{StorageTrendTracker, StorageSnapshot};
use std::collections::HashMap;
let tracker = StorageTrendTracker::new();
// Add daily snapshots
let snapshot = StorageSnapshot {
    timestamp: chrono::Utc::now(),
    total_bytes: 5_368_709_120_000, // 5,000 GiB
    total_cost: 500.0,
    table_count: 100,
    by_table: HashMap::new(),
    by_tier: HashMap::new(),
};
tracker.add_snapshot(snapshot).await;
// Forecast 30 days
let forecast = tracker.forecast_growth(30).await;
println!("Current: {} GB", forecast.current_bytes / 1_073_741_824);
println!("Predicted in 30 days: {} GB", forecast.predicted_bytes / 1_073_741_824);
println!("Growth rate: {:.2} GB/day", forecast.growth_rate_gb_per_day);
println!("Trend: {:?}", forecast.growth_trend);
// Analyze growth
let analysis = tracker.analyze_growth(None).await;
println!("Daily growth: {:.2} GB", analysis.average_daily_growth_gb);
println!("Weekly growth: {:.2} GB", analysis.average_weekly_growth_gb);
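Under the hood, a growth forecast like this can be as simple as linear extrapolation from daily snapshots. The sketch below is my own illustration of that idea, not StorageTrendTracker's actual algorithm:

```rust
// Illustration of linear extrapolation; not StorageTrendTracker's algorithm.
/// Given one total-bytes reading per day, estimate average growth per day
/// and project `days` ahead from the latest reading.
fn linear_forecast(daily_bytes: &[u64], days: u64) -> (f64, u64) {
    let last = *daily_bytes.last().expect("need at least two snapshots") as f64;
    let first = daily_bytes[0] as f64;
    let growth_per_day = (last - first) / (daily_bytes.len() as f64 - 1.0);
    (growth_per_day, (last + growth_per_day * days as f64) as u64)
}

fn main() {
    // 100 GiB, 101 GiB, 102 GiB on three consecutive days
    let snaps = [107_374_182_400, 108_447_924_224, 109_521_666_048];
    let (growth, predicted) = linear_forecast(&snaps, 30);
    println!("growth: {:.2} GiB/day", growth / 1_073_741_824.0);
    println!("predicted in 30 days: {} GiB", predicted / 1_073_741_824);
}
```

A production forecaster would also weight recent snapshots more heavily and detect acceleration, which is what the `growth_trend` field surfaces.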
6. Use Dashboard API¶
use heliosdb_cost_optimizer_v2::{DashboardState, storage_dashboard_routes};
use axum::Router;
use std::sync::Arc;
// Create shared state
let state = DashboardState {
    attributor: Arc::new(attributor),
    trend_tracker: Arc::new(tracker),
};
// Create router
let app = storage_dashboard_routes().with_state(state);
// Start server
let listener = tokio::net::TcpListener::bind("0.0.0.0:8080").await?;
axum::serve(listener, app).await?;
7. Access Dashboard¶
# Main dashboard
curl "http://localhost:8080/api/storage/dashboard?top_n=10&forecast_days=30"
# Cost breakdown
curl http://localhost:8080/api/storage/cost
# Table details
curl http://localhost:8080/api/storage/tables/users
# Recommendations
curl http://localhost:8080/api/storage/recommendations
# Forecast
curl "http://localhost:8080/api/storage/forecast?forecast_days=60"
# Efficiency metrics
curl http://localhost:8080/api/storage/efficiency
Common Use Cases¶
Use Case 1: Identify Cost Hotspots¶
// Get top 10 most expensive tables
let top_tables = attributor.top_tables_by_cost(10).await;
for table in top_tables {
    println!(
        "{}: ${:.2}/month ({:.1}% of total)",
        table.table,
        table.cost_usd_monthly,
        table.percent_of_total
    );
}
Output:
events_log: $500.00/month (40.0% of total)
user_sessions: $200.00/month (16.0% of total)
transactions: $150.00/month (12.0% of total)
Use Case 2: Optimize Old Data¶
// Find tables that should be moved to cheaper tiers
let recommendations = attributor.tiering_recommendations().await;
let big_savers = recommendations.iter()
    .filter(|r| r.annual_savings_usd > 1000.0)
    .collect::<Vec<_>>();
println!("Found {} high-value optimizations", big_savers.len());
Use Case 3: Forecast Capacity Needs¶
// Predict when you'll hit capacity
let capacity_bytes = 100_000_000_000_000; // 100 TB
let analysis = tracker.analyze_growth(Some(capacity_bytes)).await;
if let Some(days) = analysis.days_until_capacity {
    println!("Will reach capacity in {} days", days);
    println!("Consider adding storage or archiving old data");
}
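The capacity estimate reduces to simple arithmetic: remaining headroom divided by growth rate. A sketch of that calculation (an illustration, not the `analyze_growth` implementation):

```rust
// Arithmetic behind a days-until-capacity estimate; not the
// analyze_growth implementation itself.
fn days_until_capacity(current: u64, capacity: u64, growth_bytes_per_day: f64) -> Option<u64> {
    if growth_bytes_per_day <= 0.0 {
        return None; // flat or shrinking usage never hits capacity
    }
    let headroom = capacity.saturating_sub(current) as f64;
    Some((headroom / growth_bytes_per_day).ceil() as u64)
}

fn main() {
    // 90 TB used, 100 TB capacity, growing 100 GB/day
    let days = days_until_capacity(90_000_000_000_000, 100_000_000_000_000, 100_000_000_000.0);
    println!("days until capacity: {:?}", days);
}
```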
Use Case 4: Monitor Efficiency¶
// Track storage efficiency over time
let score = attributor.storage_efficiency_score().await;
match score {
    s if s >= 90.0 => println!("✅ Excellent efficiency"),
    s if s >= 70.0 => println!("⚠ Good efficiency, room for improvement"),
    s if s >= 50.0 => println!("⚠ Fair efficiency, optimization recommended"),
    _ => println!("❌ Poor efficiency, immediate action needed"),
}
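As one way to build intuition for what such a score measures, here is a purely hypothetical heuristic that blends compression progress with tiering coverage. This is not the library's `storage_efficiency_score()` formula, and every name below is an assumption:

```rust
// Hypothetical heuristic for illustration; the library's actual
// storage_efficiency_score() formula is not documented here.
/// Blend two 0-50 components: progress toward a 3:1 compression target,
/// and the fraction of cold bytes already on cheap tiers.
fn efficiency_score(compression_ratio: f64, cold_on_cheap: u64, cold_total: u64) -> f64 {
    let compression = (compression_ratio / 3.0).min(1.0) * 50.0;
    let tiering = if cold_total == 0 {
        50.0 // nothing cold to move: full marks
    } else {
        cold_on_cheap as f64 / cold_total as f64 * 50.0
    };
    compression + tiering
}

fn main() {
    // 3:1 compression, but only half the cold data tiered down
    println!("score: {:.0}", efficiency_score(3.0, 50, 100));
}
```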
Configuration¶
Tier Cost Configuration¶
// AWS-like pricing
let aws_costs = TierCostConfig {
    hot_cost_per_gb: 0.10,      // EBS SSD
    warm_cost_per_gb: 0.045,    // EBS HDD
    cold_cost_per_gb: 0.023,    // S3 Standard
    archive_cost_per_gb: 0.004, // S3 Glacier
};
// Azure-like pricing
let azure_costs = TierCostConfig {
    hot_cost_per_gb: 0.12,
    warm_cost_per_gb: 0.06,
    cold_cost_per_gb: 0.015,
    archive_cost_per_gb: 0.002,
};
// GCP-like pricing
let gcp_costs = TierCostConfig {
    hot_cost_per_gb: 0.085,
    warm_cost_per_gb: 0.040,
    cold_cost_per_gb: 0.020,
    archive_cost_per_gb: 0.004,
};
Snapshot Retention¶
// Keep 1 year of daily snapshots (default)
let tracker = StorageTrendTracker::new();
// Keep 2 years of daily snapshots
let tracker = StorageTrendTracker::with_retention(730);
// Keep 90 days of daily snapshots
let tracker = StorageTrendTracker::with_retention(90);
Best Practices¶
1. Regular Snapshot Collection¶
Collect daily snapshots for accurate forecasting:
// Assumes `attributor` and `trend_tracker` are shared via Arc so the
// background task can own clones of them.
let attributor = Arc::clone(&attributor);
let trend_tracker = Arc::clone(&trend_tracker);
tokio::spawn(async move {
    let mut interval = tokio::time::interval(std::time::Duration::from_secs(86_400)); // 24 hours
    loop {
        interval.tick().await;
        let snapshot = collect_snapshot(&attributor).await;
        trend_tracker.add_snapshot(snapshot).await;
    }
});
2. Automatic Tier Migration¶
Implement automatic migration based on recommendations:
let recommendations = attributor.tiering_recommendations().await;
for rec in recommendations {
    if rec.risk_level == RiskLevel::Low && rec.annual_savings_usd > 1000.0 {
        // Auto-migrate if low risk and significant savings
        migrate_table(&rec.table, rec.to_tier).await?;
        println!("Auto-migrated {} to {:?}", rec.table, rec.to_tier);
    }
}
3. Set Budget Alerts¶
let total_cost = attributor.calculate_total_cost().await;
let budget = 2000.0; // $2,000/month
if total_cost > budget * 0.9 {
    println!("⚠ Warning: At 90% of budget (${:.2} / ${:.2})", total_cost, budget);
}
if total_cost > budget {
    println!("❌ Alert: Over budget! (${:.2} / ${:.2})", total_cost, budget);
}
4. Monitor Growth Trends¶
let forecast = tracker.forecast_growth(30).await;
match forecast.growth_trend {
    GrowthTrend::Accelerating => {
        println!("⚠ Growth is accelerating - investigate data sources");
    },
    GrowthTrend::Linear => {
        println!("✅ Steady growth - predictable");
    },
    GrowthTrend::Stable => {
        println!("✅ Stable storage usage");
    },
    _ => {}
}
Troubleshooting¶
Issue: Forecasts are inaccurate¶
Solution: Ensure you have at least 30 days of historical snapshots:
let snapshots = tracker.get_snapshots().await;
if snapshots.len() < 30 {
    println!("Warning: Only {} snapshots. Need 30+ for accurate forecasting.", snapshots.len());
}
Issue: Recommendations seem wrong¶
Solution: Verify table metrics are up-to-date:
let metrics = attributor.get_table_metrics("table_name").await;
if let Some(m) = metrics {
    let days_old = (chrono::Utc::now() - m.last_accessed).num_days();
    if days_old > 1 {
        println!("Warning: Metrics are {} days old. Re-analyze table.", days_old);
    }
}
Issue: High memory usage¶
Solution: Reduce snapshot retention or limit table count:
// Reduce retention to 90 days
let tracker = StorageTrendTracker::with_retention(90);
// Only track top N tables
let top_tables = attributor.top_tables_by_cost(100).await; // Top 100 only
Performance Tips¶
- Batch Updates: Update metrics in batches to reduce lock contention
- Async Operations: Use async/await for all operations
- Index Separately: Track index costs separately from data costs
- Cache Results: Cache dashboard results for 5-10 minutes
- Parallel Forecasting: Run forecasts for multiple scenarios in parallel
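The "Cache Results" tip can be implemented with a tiny TTL wrapper. This sketch uses only the standard library; the type and method names are illustrative, not part of the heliosdb_cost_optimizer_v2 API:

```rust
use std::time::{Duration, Instant};

// Minimal TTL cache sketch for the "Cache Results" tip; type names are
// illustrative, not part of the library.
struct TtlCache<T> {
    entry: Option<(T, Instant)>,
    ttl: Duration,
}

impl<T: Clone> TtlCache<T> {
    fn new(ttl: Duration) -> Self {
        Self { entry: None, ttl }
    }

    /// Return the cached value while it is fresh; otherwise recompute it.
    fn get_or_refresh(&mut self, compute: impl FnOnce() -> T) -> T {
        match &self.entry {
            Some((value, at)) if at.elapsed() < self.ttl => value.clone(),
            _ => {
                let value = compute();
                self.entry = Some((value.clone(), Instant::now()));
                value
            }
        }
    }
}

fn main() {
    let mut cache = TtlCache::new(Duration::from_secs(300)); // 5-minute TTL
    let mut computations = 0;
    let a = cache.get_or_refresh(|| { computations += 1; "dashboard json".to_string() });
    let b = cache.get_or_refresh(|| { computations += 1; "dashboard json".to_string() });
    assert_eq!(computations, 1); // second call was served from the cache
    println!("{a} == {b}");
}
```

For the dashboard itself, the same idea applies per endpoint: recompute `calculate_total_cost` and friends at most once per TTL window rather than on every request.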
Next Steps¶
- Read the full documentation
- Explore API reference
- Check out integration examples
- Learn about Week 3: Network Cost Tracking
Support¶
For questions or issues, see:
- GitHub Issues: https://github.com/heliosdb/heliosdb
- Documentation: /home/claude/HeliosDB/docs/
- Examples: /home/claude/HeliosDB/heliosdb-cost-optimizer-v2/tests/