Storage Cost Attribution - Quick Start Guide¶
Get started with HeliosDB's storage cost attribution and optimization in 5 minutes.
What is Storage Cost Attribution?¶
Storage cost attribution is a system for tracking, analyzing, and optimizing storage costs at the table and column level. Intelligent tiering and compression typically enable a 20-30% cost reduction.
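To make the attribution math concrete, here is a minimal sketch of the two core formulas. These helpers are illustrative only, not part of the heliosdb_cost_optimizer_v2 API: monthly cost is stored gigabytes times the tier's $/GB/month rate, and compression savings are the monthly cost of the bytes compression eliminated.

```rust
// Illustrative helpers only; not part of the heliosdb_cost_optimizer_v2 API.
const GIB: f64 = 1_073_741_824.0; // bytes per GiB

/// Monthly cost = stored size in GB times the tier rate in $/GB/month.
fn monthly_cost(bytes: u64, cost_per_gb: f64) -> f64 {
    bytes as f64 / GIB * cost_per_gb
}

/// Compression savings = monthly cost of the bytes compression eliminated.
fn compression_savings(uncompressed: u64, compressed: u64, cost_per_gb: f64) -> f64 {
    monthly_cost(uncompressed - compressed, cost_per_gb)
}

fn main() {
    // A 10 GiB table on the hot tier at $0.10/GB/month
    println!("cost: ${:.2}/month", monthly_cost(10_737_418_240, 0.10));
    // At 3:1 compression, roughly two-thirds of those bytes never hit disk
    println!(
        "compression saves: ${:.2}/month",
        compression_savings(10_737_418_240, 3_579_139_413, 0.10)
    );
}
```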
Quick Start¶
1. Initialize Storage Attributor¶
use heliosdb_cost_optimizer_v2::{StorageAttributor, TierCostConfig};
// Create with default costs ($0.10/GB/month for hot tier)
let attributor = StorageAttributor::new(0.10);
// OR create with custom tier costs
let tier_costs = TierCostConfig {
    hot_cost_per_gb: 0.10,      // SSD
    warm_cost_per_gb: 0.05,     // HDD
    cold_cost_per_gb: 0.01,     // Object storage
    archive_cost_per_gb: 0.004, // Glacier-like
};
let attributor = StorageAttributor::with_tier_costs(tier_costs);
2. Track Table Metrics¶
use heliosdb_cost_optimizer_v2::{TableStorageMetrics, StorageTier, AccessFrequency, AccessPattern};
let metrics = TableStorageMetrics {
    table_name: "users".to_string(),
    total_bytes: 10_737_418_240, // 10 GB
    row_count: 1_000_000,
    index_bytes: 1_073_741_824,
    data_bytes: 9_663_676_416,
    compressed_bytes: 3_579_139_413,
    uncompressed_bytes: 10_737_418_240,
    compression_ratio: 3.0,
    avg_row_size: 10737.42,
    storage_tier: StorageTier::Hot,
    last_accessed: chrono::Utc::now(),
    created_at: chrono::Utc::now() - chrono::Duration::days(365),
    access_frequency: AccessFrequency {
        reads_per_day: 1000.0,
        writes_per_day: 100.0,
        last_7_days_reads: 7000,
        last_30_days_reads: 30000,
        access_pattern: AccessPattern::Hot,
    },
};
attributor.update_table_metrics(metrics).await;
3. Calculate Costs¶
// Total cost across all tables
let total_cost = attributor.calculate_total_cost().await;
println!("Total monthly cost: ${:.2}", total_cost);
// Cost by table
let cost_by_table = attributor.cost_by_table().await;
for (table, cost) in cost_by_table {
    println!("{}: ${:.2}/month", table, cost);
}
// Compression savings
let savings = attributor.compression_savings().await;
println!("Monthly savings from compression: ${:.2}", savings);
4. Get Optimization Recommendations¶
// Tiering recommendations
let tiering_recs = attributor.tiering_recommendations().await;
for rec in tiering_recs.iter().take(5) {
    println!("Move {} from {:?} to {:?}", rec.table, rec.from_tier, rec.to_tier);
    println!("  Annual savings: ${:.2}", rec.annual_savings_usd);
    println!("  Reason: {}", rec.reason);
}
// Compression recommendations
let compression_recs = attributor.compression_recommendations().await;
for rec in compression_recs.iter().take(5) {
    println!("Compress {}.{} with {:?}", rec.table, rec.column, rec.recommended_compression);
    println!("  Annual savings: ${:.2}", rec.expected_savings_usd_annual);
}
5. Track Trends and Forecast¶
use heliosdb_cost_optimizer_v2::{StorageTrendTracker, StorageSnapshot};
use std::collections::HashMap;
let tracker = StorageTrendTracker::new();
// Add daily snapshots
let snapshot = StorageSnapshot {
    timestamp: chrono::Utc::now(),
    total_bytes: 5_368_709_120_000, // 5,000 GiB
    total_cost: 500.0,
    table_count: 100,
    by_table: HashMap::new(),
    by_tier: HashMap::new(),
};
tracker.add_snapshot(snapshot).await;
// Forecast 30 days
let forecast = tracker.forecast_growth(30).await;
println!("Current: {} GB", forecast.current_bytes / 1_073_741_824);
println!("Predicted in 30 days: {} GB", forecast.predicted_bytes / 1_073_741_824);
println!("Growth rate: {:.2} GB/day", forecast.growth_rate_gb_per_day);
println!("Trend: {:?}", forecast.growth_trend);
// Analyze growth
let analysis = tracker.analyze_growth(None).await;
println!("Daily growth: {:.2} GB", analysis.average_daily_growth_gb);
println!("Weekly growth: {:.2} GB", analysis.average_weekly_growth_gb);
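Under the hood, a growth forecast like this can be as simple as linear extrapolation from daily snapshots. The sketch below is my own illustration of that idea, not StorageTrendTracker's actual algorithm:

```rust
// Illustration of linear extrapolation; not StorageTrendTracker's algorithm.
/// Given one total-bytes reading per day, estimate average growth per day
/// and project `days` ahead from the latest reading.
fn linear_forecast(daily_bytes: &[u64], days: u64) -> (f64, u64) {
    let last = *daily_bytes.last().expect("need at least two snapshots") as f64;
    let first = daily_bytes[0] as f64;
    let growth_per_day = (last - first) / (daily_bytes.len() as f64 - 1.0);
    (growth_per_day, (last + growth_per_day * days as f64) as u64)
}

fn main() {
    // 100 GiB, 101 GiB, 102 GiB on three consecutive days
    let snaps = [107_374_182_400, 108_447_924_224, 109_521_666_048];
    let (growth, predicted) = linear_forecast(&snaps, 30);
    println!("growth: {:.2} GiB/day", growth / 1_073_741_824.0);
    println!("predicted in 30 days: {} GiB", predicted / 1_073_741_824);
}
```

A production forecaster would also weight recent snapshots more heavily and detect acceleration, which is what the `growth_trend` field surfaces.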
6. Use Dashboard API¶
use heliosdb_cost_optimizer_v2::{DashboardState, storage_dashboard_routes};
use axum::Router;
use std::sync::Arc;
// Create shared state
let state = DashboardState {
    attributor: Arc::new(attributor),
    trend_tracker: Arc::new(tracker),
};
// Create router
let app = storage_dashboard_routes().with_state(state);
// Start server
let listener = tokio::net::TcpListener::bind("0.0.0.0:8080").await?;
axum::serve(listener, app).await?;
7. Access Dashboard¶
# Main dashboard
curl "http://localhost:8080/api/storage/dashboard?top_n=10&forecast_days=30"
# Cost breakdown
curl http://localhost:8080/api/storage/cost
# Table details
curl http://localhost:8080/api/storage/tables/users
# Recommendations
curl http://localhost:8080/api/storage/recommendations
# Forecast
curl "http://localhost:8080/api/storage/forecast?forecast_days=60"
# Efficiency metrics
curl http://localhost:8080/api/storage/efficiency
Common Use Cases¶
Use Case 1: Identify Cost Hotspots¶
// Get top 10 most expensive tables
let top_tables = attributor.top_tables_by_cost(10).await;
for table in top_tables {
    println!(
        "{}: ${:.2}/month ({:.1}% of total)",
        table.table,
        table.cost_usd_monthly,
        table.percent_of_total
    );
}
Output:
events_log: $500.00/month (40.0% of total)
user_sessions: $200.00/month (16.0% of total)
transactions: $150.00/month (12.0% of total)
Use Case 2: Optimize Old Data¶
// Find tables that should be moved to cheaper tiers
let recommendations = attributor.tiering_recommendations().await;
let big_savers = recommendations.iter()
    .filter(|r| r.annual_savings_usd > 1000.0)
    .collect::<Vec<_>>();
println!("Found {} high-value optimizations", big_savers.len());
Use Case 3: Forecast Capacity Needs¶
// Predict when you'll hit capacity
let capacity_bytes = 100_000_000_000_000; // 100 TB
let analysis = tracker.analyze_growth(Some(capacity_bytes)).await;
if let Some(days) = analysis.days_until_capacity {
    println!("Will reach capacity in {} days", days);
    println!("Consider adding storage or archiving old data");
}
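The capacity estimate reduces to simple arithmetic: remaining headroom divided by growth rate. A sketch of that calculation (an illustration, not the `analyze_growth` implementation):

```rust
// Arithmetic behind a days-until-capacity estimate; not the
// analyze_growth implementation itself.
fn days_until_capacity(current: u64, capacity: u64, growth_bytes_per_day: f64) -> Option<u64> {
    if growth_bytes_per_day <= 0.0 {
        return None; // flat or shrinking usage never hits capacity
    }
    let headroom = capacity.saturating_sub(current) as f64;
    Some((headroom / growth_bytes_per_day).ceil() as u64)
}

fn main() {
    // 90 TB used, 100 TB capacity, growing 100 GB/day
    let days = days_until_capacity(90_000_000_000_000, 100_000_000_000_000, 100_000_000_000.0);
    println!("days until capacity: {:?}", days);
}
```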
Use Case 4: Monitor Efficiency¶
// Track storage efficiency over time
let score = attributor.storage_efficiency_score().await;
match score {
    s if s >= 90.0 => println!("✅ Excellent efficiency"),
    s if s >= 70.0 => println!("⚠ Good efficiency, room for improvement"),
    s if s >= 50.0 => println!("⚠ Fair efficiency, optimization recommended"),
    _ => println!("❌ Poor efficiency, immediate action needed"),
}
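As one way to build intuition for what such a score measures, here is a purely hypothetical heuristic that blends compression progress with tiering coverage. This is not the library's `storage_efficiency_score()` formula, and every name below is an assumption:

```rust
// Hypothetical heuristic for illustration; the library's actual
// storage_efficiency_score() formula is not documented here.
/// Blend two 0-50 components: progress toward a 3:1 compression target,
/// and the fraction of cold bytes already on cheap tiers.
fn efficiency_score(compression_ratio: f64, cold_on_cheap: u64, cold_total: u64) -> f64 {
    let compression = (compression_ratio / 3.0).min(1.0) * 50.0;
    let tiering = if cold_total == 0 {
        50.0 // nothing cold to move: full marks
    } else {
        cold_on_cheap as f64 / cold_total as f64 * 50.0
    };
    compression + tiering
}

fn main() {
    // 3:1 compression, but only half the cold data tiered down
    println!("score: {:.0}", efficiency_score(3.0, 50, 100));
}
```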
Configuration¶
Tier Cost Configuration¶
// AWS-like pricing
let aws_costs = TierCostConfig {
    hot_cost_per_gb: 0.10,      // EBS SSD
    warm_cost_per_gb: 0.045,    // EBS HDD
    cold_cost_per_gb: 0.023,    // S3 Standard
    archive_cost_per_gb: 0.004, // S3 Glacier
};
// Azure-like pricing
let azure_costs = TierCostConfig {
    hot_cost_per_gb: 0.12,
    warm_cost_per_gb: 0.06,
    cold_cost_per_gb: 0.015,
    archive_cost_per_gb: 0.002,
};
// GCP-like pricing
let gcp_costs = TierCostConfig {
    hot_cost_per_gb: 0.085,
    warm_cost_per_gb: 0.040,
    cold_cost_per_gb: 0.020,
    archive_cost_per_gb: 0.004,
};
Snapshot Retention¶
// Keep 1 year of daily snapshots (default)
let tracker = StorageTrendTracker::new();
// Keep 2 years of daily snapshots
let tracker = StorageTrendTracker::with_retention(730);
// Keep 90 days of daily snapshots
let tracker = StorageTrendTracker::with_retention(90);
Best Practices¶
1. Regular Snapshot Collection¶
Collect daily snapshots for accurate forecasting:
// Assumes `attributor` and `trend_tracker` are shared via Arc so the
// background task can own clones of them.
let attributor = Arc::clone(&attributor);
let trend_tracker = Arc::clone(&trend_tracker);
tokio::spawn(async move {
    let mut interval = tokio::time::interval(std::time::Duration::from_secs(86_400)); // 24 hours
    loop {
        interval.tick().await;
        let snapshot = collect_snapshot(&attributor).await;
        trend_tracker.add_snapshot(snapshot).await;
    }
});
2. Automatic Tier Migration¶
Implement automatic migration based on recommendations:
let recommendations = attributor.tiering_recommendations().await;
for rec in recommendations {
    if rec.risk_level == RiskLevel::Low && rec.annual_savings_usd > 1000.0 {
        // Auto-migrate if low risk and significant savings
        migrate_table(&rec.table, rec.to_tier).await?;
        println!("Auto-migrated {} to {:?}", rec.table, rec.to_tier);
    }
}
3. Set Budget Alerts¶
let total_cost = attributor.calculate_total_cost().await;
let budget = 2000.0; // $2,000/month
if total_cost > budget * 0.9 {
    println!("⚠ Warning: At 90% of budget (${:.2} / ${:.2})", total_cost, budget);
}
if total_cost > budget {
    println!("❌ Alert: Over budget! (${:.2} / ${:.2})", total_cost, budget);
}
4. Monitor Growth Trends¶
let forecast = tracker.forecast_growth(30).await;
match forecast.growth_trend {
    GrowthTrend::Accelerating => {
        println!("⚠ Growth is accelerating - investigate data sources");
    },
    GrowthTrend::Linear => {
        println!("✅ Steady growth - predictable");
    },
    GrowthTrend::Stable => {
        println!("✅ Stable storage usage");
    },
    _ => {}
}
Troubleshooting¶
Issue: Forecasts are inaccurate¶
Solution: Ensure you have at least 30 days of historical snapshots:
let snapshots = tracker.get_snapshots().await;
if snapshots.len() < 30 {
    println!("Warning: Only {} snapshots. Need 30+ for accurate forecasting.", snapshots.len());
}
Issue: Recommendations seem wrong¶
Solution: Verify table metrics are up-to-date:
let metrics = attributor.get_table_metrics("table_name").await;
if let Some(m) = metrics {
    let days_old = (chrono::Utc::now() - m.last_accessed).num_days();
    if days_old > 1 {
        println!("Warning: Metrics are {} days old. Re-analyze table.", days_old);
    }
}
Issue: High memory usage¶
Solution: Reduce snapshot retention or limit table count:
// Reduce retention to 90 days
let tracker = StorageTrendTracker::with_retention(90);
// Only track top N tables
let top_tables = attributor.top_tables_by_cost(100).await; // Top 100 only
Performance Tips¶
- Batch Updates: Update metrics in batches to reduce lock contention
- Async Operations: Use async/await for all operations
- Index Separately: Track index costs separately from data costs
- Cache Results: Cache dashboard results for 5-10 minutes
- Parallel Forecasting: Run forecasts for multiple scenarios in parallel
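The "Cache Results" tip can be implemented with a tiny TTL wrapper. This sketch uses only the standard library; the type and method names are illustrative, not part of the heliosdb_cost_optimizer_v2 API:

```rust
use std::time::{Duration, Instant};

// Minimal TTL cache sketch for the "Cache Results" tip; type names are
// illustrative, not part of the library.
struct TtlCache<T> {
    entry: Option<(T, Instant)>,
    ttl: Duration,
}

impl<T: Clone> TtlCache<T> {
    fn new(ttl: Duration) -> Self {
        Self { entry: None, ttl }
    }

    /// Return the cached value while it is fresh; otherwise recompute it.
    fn get_or_refresh(&mut self, compute: impl FnOnce() -> T) -> T {
        match &self.entry {
            Some((value, at)) if at.elapsed() < self.ttl => value.clone(),
            _ => {
                let value = compute();
                self.entry = Some((value.clone(), Instant::now()));
                value
            }
        }
    }
}

fn main() {
    let mut cache = TtlCache::new(Duration::from_secs(300)); // 5-minute TTL
    let mut computations = 0;
    let a = cache.get_or_refresh(|| { computations += 1; "dashboard json".to_string() });
    let b = cache.get_or_refresh(|| { computations += 1; "dashboard json".to_string() });
    assert_eq!(computations, 1); // second call was served from the cache
    println!("{a} == {b}");
}
```

For the dashboard itself, the same idea applies per endpoint: recompute `calculate_total_cost` and friends at most once per TTL window rather than on every request.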
Next Steps¶
- Read the full documentation
- Explore API reference
- Check out integration examples
- Learn about Week 3: Network Cost Tracking
Support¶
For questions or issues, see:
- GitHub Issues: https://github.com/heliosdb/heliosdb
- Documentation: /home/claude/HeliosDB/docs/
- Examples: /home/claude/HeliosDB/heliosdb-cost-optimizer-v2/tests/