HeliosDB Configuration Guide¶
Version: 7.0 Last Updated: November 24, 2025 Status: Production Ready
Overview¶
HeliosDB provides a comprehensive configuration system supporting TOML files, environment variables, and CLI arguments. This guide covers all configuration options for HeliosDB's advanced features including GPU acceleration, multi-region deployment, autoscaling, and CDC.
Table of Contents¶
- Configuration Sources
- Configuration File Structure
- GPU Configuration
- Multi-Region Configuration
- Autoscaling Configuration
- CDC Configuration
- Storage Configuration
- Network Configuration
- Security Configuration
- Environment Variables
- Best Practices
- Examples
- Troubleshooting
Configuration Sources¶
HeliosDB loads configuration in the following priority order (highest to lowest):
- Environment Variables - Runtime overrides (prefixed with
HELIOSDB_) - TOML Configuration File - Primary configuration source
- Default Values - Built-in sensible defaults
Loading Configuration¶
use heliosdb_common::ConfigLoader;
// Load with default paths
let config = ConfigLoader::new().load()?;
// Load from specific file
let config = ConfigLoader::new()
.with_path("/etc/heliosdb/heliosdb.toml")
.load()?;
// Generate example configuration
let example = ConfigLoader::example_config();
Configuration File Structure¶
The configuration file uses TOML format with the following top-level sections:
[storage] # Storage engine configuration
[network] # Network and communication settings
[metadata] # Raft consensus configuration
[vector] # Vector index configuration
[security] # Security and encryption settings
[gpu] # GPU acceleration settings
[multi_region] # Multi-region deployment
[autoscale] # Autoscaling configuration
[cdc] # Change Data Capture
GPU Configuration¶
Enable GPU acceleration for 10-100x speedup on analytical queries.
Configuration Options¶
[gpu]
# Enable GPU acceleration
enabled = true
# GPU memory limit in bytes (8GB = 8589934592)
memory_limit = 8589934592
# GPU device ID (0-indexed)
device_id = 0
# Fallback to CPU if GPU unavailable
fallback_to_cpu = true
# Enable CUDA unified memory
enable_unified_memory = true
# GPU vendor: "nvidia", "amd", or "intel"
# vendor = "nvidia" # Auto-detect if not specified
# Minimum data size to trigger GPU execution (100MB)
min_data_size_bytes = 104857600
# Enable GPU for specific operations
enable_aggregations = true
enable_joins = true
enable_vector_ops = true
GPU Vendor Support¶
- NVIDIA: CUDA 11.0+ (Tesla T4, V100, A100)
- AMD: ROCm/HIP 5.0+ (MI100, MI250X)
- Intel: Level Zero API (experimental)
Environment Variables¶
export HELIOSDB_GPU_ENABLED=true
export HELIOSDB_GPU_DEVICE_ID=0
export HELIOSDB_GPU_MEMORY_LIMIT=8589934592
export HELIOSDB_GPU_FALLBACK_TO_CPU=true
Performance Tuning¶
Memory Limit: Set based on your GPU VRAM:
- 8GB GPUs: 8589934592 (8GB)
- 16GB GPUs: 17179869184 (16GB)
- 24GB GPUs: 25769803776 (24GB)
Min Data Size: Adjust based on overhead: - Small queries (<100MB): CPU is faster - Large queries (>100MB): GPU provides speedup - Optimal threshold: 100-500MB depending on GPU
Multi-Region Configuration¶
Deploy HeliosDB across multiple geographic regions with automatic replication and failover.
Configuration Options¶
[multi_region]
# Enable multi-region deployment
enabled = true
# Primary region identifier
primary_region = "us-west"
# Replication mode
replication_mode = "active_active" # Options: active_passive, active_active, multi_master
# Conflict resolution strategy
conflict_resolution = "last_write_wins" # Options: last_write_wins, first_write_wins, custom
# Enable compression for cross-region traffic
compression = true
# Enable encryption for cross-region traffic
encryption = true
# Maximum replication lag in milliseconds
max_lag_ms = 5000
# Enable automatic failover
auto_failover = true
# Region definitions
[[multi_region.regions]]
id = "us-west"
endpoint = "https://us-west.heliosdb.example.com"
datacenter = "California"
is_primary = true
[[multi_region.regions]]
id = "us-east"
endpoint = "https://us-east.heliosdb.example.com"
datacenter = "Virginia"
is_primary = false
[[multi_region.regions]]
id = "eu-central"
endpoint = "https://eu-central.heliosdb.example.com"
datacenter = "Frankfurt"
is_primary = false
Replication Modes¶
Active-Passive: - Single primary region accepts writes - Replicas serve read-only queries - Lowest complexity, eventual consistency
Active-Active: - Multiple regions accept writes - Automatic conflict resolution - Higher availability, lower write latency
Multi-Master: - All regions accept writes - Advanced conflict resolution required - Maximum availability and performance
Conflict Resolution Strategies¶
- Last Write Wins (LWW): Latest timestamp wins conflicts
- First Write Wins (FWW): First write wins conflicts
- Custom: Application-defined resolution logic
Environment Variables¶
Autoscaling Configuration¶
Automatically scale compute resources based on workload demand.
Configuration Options¶
[autoscale]
# Enable autoscaling
enabled = true
# Minimum compute units (0 = scale-to-zero)
min_cu = 10.0
# Maximum compute units
max_cu = 1000.0
# Target CPU utilization percentage
target_cpu_percent = 70.0
# Scale up when CPU exceeds threshold
scale_up_threshold = 80.0
# Scale down when CPU drops below threshold
scale_down_threshold = 30.0
# Cooldown period between scaling decisions (seconds)
cooldown_period_seconds = 60
# Scale to zero after idle period (seconds)
scale_to_zero_idle_seconds = 300
# Scale step percentage (50 = scale by 50% each time)
scale_step_percent = 50.0
# Damping factor to prevent oscillation (0.0-1.0)
damping_factor = 0.8
[autoscale.horizontal]
# Enable horizontal scaling (add/remove nodes)
enabled = true
# Minimum number of nodes
min_nodes = 1
# Maximum number of nodes
max_nodes = 10
# Maximum CUs per node
node_max_cu = 100.0
# CPU threshold for scaling out (adding nodes)
scale_out_cpu_threshold = 85.0
# CPU threshold for scaling in (removing nodes)
scale_in_cpu_threshold = 20.0
# Drain timeout when removing nodes (seconds)
drain_timeout_seconds = 120
Compute Units (CU)¶
1 CU = 1 vCPU + 4GB RAM
Sizing Guidelines: - Small workloads: 10-50 CU - Medium workloads: 50-200 CU - Large workloads: 200-1000 CU - Enterprise: 1000+ CU
Scaling Strategies¶
Vertical Scaling:
- Adjusts CU allocation per node
- Fast response (seconds)
- Limited by max_cu
Horizontal Scaling: - Adds/removes entire nodes - Slower response (minutes) - Better for sustained load
Environment Variables¶
export HELIOSDB_AUTOSCALE_ENABLED=true
export HELIOSDB_AUTOSCALE_MIN_CU=10
export HELIOSDB_AUTOSCALE_MAX_CU=1000
export HELIOSDB_AUTOSCALE_TARGET_CPU_PERCENT=70
CDC Configuration¶
Capture and stream database changes in real-time.
Configuration Options¶
[cdc]
# Enable CDC
enabled = true
# Output format
output_format = "avro" # Options: json, avro, protobuf
# Buffer size for CDC events
buffer_size = 10000
# Flush interval for batching events
flush_interval = "5s" # Humantime format: "5s", "1m", "100ms"
# Checkpoint interval (number of events)
checkpoint_interval = 1000
# Enable compression for CDC streams
compression = true
# Maximum lag before warning (seconds)
max_lag_seconds = 300
# Kafka sink configuration
[cdc.kafka]
bootstrap_servers = ["localhost:9092"]
topic = "heliosdb-cdc"
ssl = true
sasl = true
compression_type = "snappy" # Options: none, gzip, snappy, lz4, zstd
# Alternative: Kinesis sink
# [cdc.kinesis]
# region = "us-west-2"
# stream_name = "heliosdb-cdc-stream"
# enhanced_fan_out = true
Output Formats¶
JSON: - Human-readable - Larger size - Easy debugging
Avro: - Schema evolution - Compact binary format - Best performance
Protobuf: - Type-safe - Compact binary format - Cross-language support
Sinks¶
Kafka: - High throughput - Guaranteed ordering - Mature ecosystem
AWS Kinesis: - Native AWS integration - Serverless - Enhanced fan-out support
Environment Variables¶
Storage Configuration¶
Core storage engine settings.
[storage]
data_dir = "/var/lib/heliosdb"
memtable_size_mb = 64
sstable_size_mb = 128
compaction_strategy = "SizeTiered" # or "Leveled"
bloom_filter_bits_per_key = 10
gc_grace_seconds = 864000 # 10 days
Network Configuration¶
Network and communication settings.
Security Configuration¶
Encryption and key management.
[security]
tde_enabled = false
tde_algorithm = "Aes256Gcm" # Options: Aes256Gcm, Aes256Ctr, ChaCha20Poly1305
key_storage_path = "/var/lib/heliosdb/keys"
default_mek_id = "master_key_001"
key_cache_limit = 1000
hsm_enabled = false
# KMS Provider (AWS example)
# [security.kms_provider.Aws]
# region = "us-west-2"
# key_id = "arn:aws:kms:..."
Environment Variables¶
All configuration options can be overridden via environment variables using the format:
Examples¶
# GPU Configuration
export HELIOSDB_GPU_ENABLED=true
export HELIOSDB_GPU_DEVICE_ID=1
export HELIOSDB_GPU_MEMORY_LIMIT=17179869184 # 16GB
# Multi-Region Configuration
export HELIOSDB_MULTIREGION_ENABLED=true
export HELIOSDB_MULTIREGION_PRIMARY_REGION=eu-central
# Autoscaling Configuration
export HELIOSDB_AUTOSCALE_ENABLED=true
export HELIOSDB_AUTOSCALE_MIN_CU=20
export HELIOSDB_AUTOSCALE_MAX_CU=500
# CDC Configuration
export HELIOSDB_CDC_ENABLED=true
export HELIOSDB_CDC_BUFFER_SIZE=20000
# Storage Configuration
export HELIOSDB_STORAGE_DATA_DIR=/mnt/fast-ssd/heliosdb
export HELIOSDB_STORAGE_MEMTABLE_SIZE_MB=128
# Network Configuration
export HELIOSDB_NETWORK_LISTEN_ADDR=0.0.0.0:7654
export HELIOSDB_NETWORK_RDMA_ENABLED=true
Best Practices¶
Development Environment¶
[gpu]
enabled = false # Use CPU for development
[autoscale]
enabled = false # Fixed resources
[multi_region]
enabled = false # Single region
[cdc]
enabled = false # Not needed for dev
Production Environment¶
[gpu]
enabled = true
fallback_to_cpu = true # Ensure availability
[autoscale]
enabled = true
min_cu = 50 # Ensure minimum capacity
max_cu = 2000
[multi_region]
enabled = true
auto_failover = true
encryption = true
[cdc]
enabled = true
compression = true
Performance Optimization¶
- GPU: Enable for analytical workloads (OLAP)
- Autoscaling: Set aggressive thresholds for cost savings
- Multi-Region: Use active-passive for read-heavy workloads
- CDC: Use Avro format for best performance
Security Hardening¶
- Enable TDE for data-at-rest encryption
- Enable encryption for multi-region traffic
- Use KMS for key management
- Enable HSM for highest security
Examples¶
Minimal Configuration¶
GPU-Accelerated Analytics¶
[gpu]
enabled = true
memory_limit = 17179869184 # 16GB
enable_aggregations = true
enable_joins = true
fallback_to_cpu = true
Multi-Region HA Setup¶
[multi_region]
enabled = true
primary_region = "us-west"
replication_mode = "active_active"
conflict_resolution = "last_write_wins"
auto_failover = true
[[multi_region.regions]]
id = "us-west"
endpoint = "https://us-west.db.example.com"
is_primary = true
[[multi_region.regions]]
id = "eu-central"
endpoint = "https://eu.db.example.com"
is_primary = false
Serverless with Scale-to-Zero¶
[autoscale]
enabled = true
min_cu = 0 # Scale to zero
max_cu = 500
scale_to_zero_idle_seconds = 300 # 5 minutes
CDC to Kafka¶
[cdc]
enabled = true
output_format = "avro"
buffer_size = 10000
flush_interval = "5s"
[cdc.kafka]
bootstrap_servers = ["kafka1:9092", "kafka2:9092", "kafka3:9092"]
topic = "heliosdb-changes"
ssl = true
compression_type = "snappy"
Troubleshooting¶
Configuration Not Loading¶
Problem: Configuration file not found
Solution:
# Verify file exists
ls -la /etc/heliosdb/heliosdb.toml
# Check permissions
chmod 644 /etc/heliosdb/heliosdb.toml
# Use explicit path
export HELIOSDB_CONFIG_PATH=/path/to/config.toml
GPU Not Detected¶
Problem: GPU acceleration not working
Solution:
# Check GPU availability
nvidia-smi # NVIDIA
rocm-smi # AMD
# Verify device ID
export HELIOSDB_GPU_DEVICE_ID=0
# Enable CPU fallback
export HELIOSDB_GPU_FALLBACK_TO_CPU=true
Autoscaling Not Working¶
Problem: Cluster not scaling
Solution:
# Check thresholds
[autoscale]
scale_up_threshold = 80.0
scale_down_threshold = 30.0
cooldown_period_seconds = 60 # Reduce for faster response
# Verify min/max are different
min_cu = 10.0
max_cu = 100.0 # Must be > min_cu
CDC Lag Increasing¶
Problem: CDC falling behind
Solution:
# Increase buffer size
buffer_size = 50000
# Reduce flush interval
flush_interval = "1s"
# Enable compression
compression = true
Configuration Validation¶
HeliosDB automatically validates configuration on startup. Common validation errors:
max_cu must be greater than min_cugpu.memory_limit must be > 0 when GPU is enabledmulti_region.regions cannot be empty when enabledscale_up_threshold must be > scale_down_threshold
Check logs for validation errors:
See Also¶
Support¶
For configuration assistance: - GitHub Issues: https://github.com/heliosdb/heliosdb/issues - Documentation: https://docs.heliosdb.com - Community: https://community.heliosdb.com