HeliosDB Configuration Guide

Version: 7.0
Last Updated: November 24, 2025
Status: Production Ready

Overview

HeliosDB provides a configuration system that supports TOML files, environment variables, and CLI arguments. This guide covers the configuration options for HeliosDB's advanced features, including GPU acceleration, multi-region deployment, autoscaling, and Change Data Capture (CDC).

Configuration Sources

HeliosDB loads configuration in the following priority order (highest to lowest):

  1. Environment Variables - Runtime overrides (prefixed with HELIOSDB_)
  2. TOML Configuration File - Primary configuration source
  3. Default Values - Built-in sensible defaults
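
The precedence above amounts to a first-match lookup across three layers. The following is a minimal sketch of that idea, not HeliosDB's actual loader: `resolve_setting`, the two maps, and the dotted `gpu.device_id` file key are hypothetical stand-ins for the real environment and the parsed TOML file.

```rust
use std::collections::HashMap;

/// Resolve one setting using the precedence described above:
/// environment variable first, then the TOML file value, then the default.
/// `env` and `file` are plain maps standing in for the real sources (assumption).
fn resolve_setting(
    env: &HashMap<String, String>,
    file: &HashMap<String, String>,
    env_key: &str,
    file_key: &str,
    default: &str,
) -> String {
    env.get(env_key)
        // Fall back to the file layer only when the environment has no entry.
        .or_else(|| file.get(file_key))
        .cloned()
        // Fall back to the built-in default when neither layer has a value.
        .unwrap_or_else(|| default.to_string())
}
```

With `HELIOSDB_GPU_DEVICE_ID` present in the environment map, its value is returned even if the file map also sets `gpu.device_id`; with neither present, the supplied default is returned.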

Loading Configuration

use heliosdb_common::ConfigLoader;

// Load with default paths
let config = ConfigLoader::new().load()?;

// Load from specific file
let config = ConfigLoader::new()
    .with_path("/etc/heliosdb/heliosdb.toml")
    .load()?;

// Generate example configuration
let example = ConfigLoader::example_config();

Configuration File Structure

The configuration file uses TOML format with the following top-level sections:

[storage]      # Storage engine configuration
[network]      # Network and communication settings
[metadata]     # Raft consensus configuration
[vector]       # Vector index configuration
[security]     # Security and encryption settings
[gpu]          # GPU acceleration settings
[multi_region] # Multi-region deployment
[autoscale]    # Autoscaling configuration
[cdc]          # Change Data Capture

GPU Configuration

Enable GPU acceleration for 10-100x speedup on analytical queries.

Configuration Options

[gpu]
# Enable GPU acceleration
enabled = true

# GPU memory limit in bytes (8GB = 8589934592)
memory_limit = 8589934592

# GPU device ID (0-indexed)
device_id = 0

# Fallback to CPU if GPU unavailable
fallback_to_cpu = true

# Enable CUDA unified memory
enable_unified_memory = true

# GPU vendor: "nvidia", "amd", or "intel"
# vendor = "nvidia"  # Auto-detect if not specified

# Minimum data size to trigger GPU execution (100MB)
min_data_size_bytes = 104857600

# Enable GPU for specific operations
enable_aggregations = true
enable_joins = true
enable_vector_ops = true

GPU Vendor Support

  • NVIDIA: CUDA 11.0+ (Tesla T4, V100, A100)
  • AMD: ROCm/HIP 5.0+ (MI100, MI250X)
  • Intel: Level Zero API (experimental)

Environment Variables

export HELIOSDB_GPU_ENABLED=true
export HELIOSDB_GPU_DEVICE_ID=0
export HELIOSDB_GPU_MEMORY_LIMIT=8589934592
export HELIOSDB_GPU_FALLBACK_TO_CPU=true

Performance Tuning

Memory Limit: Set based on your GPU VRAM:

  • 8GB GPUs: 8589934592 (8GB)
  • 16GB GPUs: 17179869184 (16GB)
  • 24GB GPUs: 25769803776 (24GB)

Min Data Size: Adjust based on overhead:

  • Small queries (<100MB): CPU is faster
  • Large queries (>100MB): GPU provides speedup
  • Optimal threshold: 100-500MB depending on GPU
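
The byte values above are powers-of-two multiples (1 GB here means 1024^3 bytes), which is easy to get wrong by hand. The sketch below shows the arithmetic and a possible dispatch rule. `gib_to_bytes` and `should_use_gpu` are hypothetical helpers, and treating the threshold as inclusive is an assumption; the guide does not say how HeliosDB handles a query exactly at `min_data_size_bytes`.

```rust
const MIB: u64 = 1024 * 1024;
const GIB: u64 = 1024 * MIB;

/// Convert a size in GiB to the byte value expected by `memory_limit`.
fn gib_to_bytes(gib: u64) -> u64 {
    gib * GIB
}

/// Hypothetical dispatch rule: use the GPU only when one is available and
/// the input is at least `min_data_size_bytes` (inclusive bound is an assumption).
fn should_use_gpu(data_size_bytes: u64, min_data_size_bytes: u64, gpu_available: bool) -> bool {
    gpu_available && data_size_bytes >= min_data_size_bytes
}
```

For example, `gib_to_bytes(16)` yields 17179869184, the value listed for 16GB GPUs, and `100 * MIB` yields 104857600, the default `min_data_size_bytes`.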

Multi-Region Configuration

Deploy HeliosDB across multiple geographic regions with automatic replication and failover.

Configuration Options

[multi_region]
# Enable multi-region deployment
enabled = true

# Primary region identifier
primary_region = "us-west"

# Replication mode
replication_mode = "active_active"  # Options: active_passive, active_active, multi_master

# Conflict resolution strategy
conflict_resolution = "last_write_wins"  # Options: last_write_wins, first_write_wins, custom

# Enable compression for cross-region traffic
compression = true

# Enable encryption for cross-region traffic
encryption = true

# Maximum replication lag in milliseconds
max_lag_ms = 5000

# Enable automatic failover
auto_failover = true

# Region definitions
[[multi_region.regions]]
id = "us-west"
endpoint = "https://us-west.heliosdb.example.com"
datacenter = "California"
is_primary = true

[[multi_region.regions]]
id = "us-east"
endpoint = "https://us-east.heliosdb.example.com"
datacenter = "Virginia"
is_primary = false

[[multi_region.regions]]
id = "eu-central"
endpoint = "https://eu-central.heliosdb.example.com"
datacenter = "Frankfurt"
is_primary = false

Replication Modes

Active-Passive:

  • Single primary region accepts writes
  • Replicas serve read-only queries
  • Lowest complexity, eventual consistency

Active-Active:

  • Multiple regions accept writes
  • Automatic conflict resolution
  • Higher availability, lower write latency

Multi-Master:

  • All regions accept writes
  • Advanced conflict resolution required
  • Maximum availability and performance

Conflict Resolution Strategies

  • Last Write Wins (LWW): Latest timestamp wins conflicts
  • First Write Wins (FWW): First write wins conflicts
  • Custom: Application-defined resolution logic
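
To make the first two strategies concrete, here is a minimal sketch of timestamp-based resolution between two conflicting writes. The `Write` struct, the `resolve` function, and the tie-break on region id are all assumptions made for illustration; the guide does not describe how HeliosDB represents writes or breaks timestamp ties, and the Custom strategy is not covered.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Write {
    region: String,
    timestamp_ms: u64,
    value: String,
}

#[derive(Debug, Clone, Copy)]
enum ConflictResolution {
    LastWriteWins,
    FirstWriteWins,
}

/// Pick the surviving write. Writes are ordered by (timestamp, region id) so
/// that equal timestamps resolve the same way regardless of argument order
/// (the region tie-break is an assumption, not documented behavior).
fn resolve(strategy: ConflictResolution, a: Write, b: Write) -> Write {
    let a_first = (a.timestamp_ms, a.region.as_str()) <= (b.timestamp_ms, b.region.as_str());
    let (earlier, later) = if a_first { (a, b) } else { (b, a) };
    match strategy {
        ConflictResolution::LastWriteWins => later,
        ConflictResolution::FirstWriteWins => earlier,
    }
}
```

A deterministic tie-break matters in practice: if two regions resolved the same conflict differently, replicas would diverge.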

Environment Variables

export HELIOSDB_MULTIREGION_ENABLED=true
export HELIOSDB_MULTIREGION_PRIMARY_REGION=us-west

Autoscaling Configuration

Automatically scale compute resources based on workload demand.

Configuration Options

[autoscale]
# Enable autoscaling
enabled = true

# Minimum compute units (0 = scale-to-zero)
min_cu = 10.0

# Maximum compute units
max_cu = 1000.0

# Target CPU utilization percentage
target_cpu_percent = 70.0

# Scale up when CPU exceeds threshold
scale_up_threshold = 80.0

# Scale down when CPU drops below threshold
scale_down_threshold = 30.0

# Cooldown period between scaling decisions (seconds)
cooldown_period_seconds = 60

# Scale to zero after idle period (seconds)
scale_to_zero_idle_seconds = 300

# Scale step percentage (50 = scale by 50% each time)
scale_step_percent = 50.0

# Damping factor to prevent oscillation (0.0-1.0)
damping_factor = 0.8

[autoscale.horizontal]
# Enable horizontal scaling (add/remove nodes)
enabled = true

# Minimum number of nodes
min_nodes = 1

# Maximum number of nodes
max_nodes = 10

# Maximum CUs per node
node_max_cu = 100.0

# CPU threshold for scaling out (adding nodes)
scale_out_cpu_threshold = 85.0

# CPU threshold for scaling in (removing nodes)
scale_in_cpu_threshold = 20.0

# Drain timeout when removing nodes (seconds)
drain_timeout_seconds = 120

Compute Units (CU)

1 CU = 1 vCPU + 4GB RAM

Sizing Guidelines:

  • Small workloads: 10-50 CU
  • Medium workloads: 50-200 CU
  • Large workloads: 200-1000 CU
  • Enterprise: 1000+ CU
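
The CU definition makes capacity planning a matter of simple arithmetic. The sketch below converts a CU count into vCPUs and GB of RAM, and estimates how many nodes a CU budget requires under a per-node cap such as `node_max_cu`. Both helper names are hypothetical, and the node estimate assumes CUs can be packed onto nodes without waste.

```rust
/// Resources implied by a CU count, using 1 CU = 1 vCPU + 4GB RAM.
/// Returns (vcpus, ram_gb).
fn cu_resources(cu: f64) -> (f64, f64) {
    (cu, cu * 4.0)
}

/// Smallest node count that can host `total_cu` when each node holds at most
/// `node_max_cu` CUs. Assumes `node_max_cu` is greater than zero.
fn nodes_needed(total_cu: f64, node_max_cu: f64) -> u32 {
    (total_cu / node_max_cu).ceil() as u32
}
```

With the sample values `max_cu = 1000.0` and `node_max_cu = 100.0`, `nodes_needed` returns 10, which matches `max_nodes = 10` in the configuration above.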

Scaling Strategies

Vertical Scaling:

  • Adjusts CU allocation per node
  • Fast response (seconds)
  • Limited by max_cu

Horizontal Scaling:

  • Adds/removes entire nodes
  • Slower response (minutes)
  • Better for sustained load
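
The vertical scaling settings interact as follows: a threshold decides the direction, `scale_step_percent` decides the size of the change, `cooldown_period_seconds` suppresses back-to-back changes, and `min_cu`/`max_cu` bound the result. The sketch below is one plausible reading of those settings, not HeliosDB's actual controller. `AutoscaleConfig` and `next_cu` are hypothetical names, and `damping_factor` is deliberately left out because the guide does not define how it is applied.

```rust
/// Subset of the [autoscale] settings used by this sketch.
struct AutoscaleConfig {
    min_cu: f64,
    max_cu: f64,
    scale_up_threshold: f64,
    scale_down_threshold: f64,
    scale_step_percent: f64,
    cooldown_period_seconds: u64,
}

/// Compute the next CU allocation from the current one and the observed CPU.
/// Assumes `min_cu <= max_cu`, which the validation rules already require.
fn next_cu(cfg: &AutoscaleConfig, current_cu: f64, cpu_percent: f64, seconds_since_last_scale: u64) -> f64 {
    // Inside the cooldown window no scaling decision is made.
    if seconds_since_last_scale < cfg.cooldown_period_seconds {
        return current_cu;
    }
    let step = current_cu * cfg.scale_step_percent / 100.0;
    let target = if cpu_percent > cfg.scale_up_threshold {
        current_cu + step
    } else if cpu_percent < cfg.scale_down_threshold {
        current_cu - step
    } else {
        current_cu
    };
    // Never leave the configured [min_cu, max_cu] range.
    target.clamp(cfg.min_cu, cfg.max_cu)
}
```

With the sample configuration, 100 CU at 90% CPU grows to 150 CU, 100 CU at 20% CPU shrinks to 50 CU, and 800 CU at 95% CPU is capped at `max_cu` (1000) rather than growing to 1200.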

Environment Variables

export HELIOSDB_AUTOSCALE_ENABLED=true
export HELIOSDB_AUTOSCALE_MIN_CU=10
export HELIOSDB_AUTOSCALE_MAX_CU=1000
export HELIOSDB_AUTOSCALE_TARGET_CPU_PERCENT=70

CDC Configuration

Capture and stream database changes in real-time.

Configuration Options

[cdc]
# Enable CDC
enabled = true

# Output format
output_format = "avro"  # Options: json, avro, protobuf

# Buffer size for CDC events
buffer_size = 10000

# Flush interval for batching events
flush_interval = "5s"  # Humantime format: "5s", "1m", "100ms"

# Checkpoint interval (number of events)
checkpoint_interval = 1000

# Enable compression for CDC streams
compression = true

# Maximum lag before warning (seconds)
max_lag_seconds = 300

# Kafka sink configuration
[cdc.kafka]
bootstrap_servers = ["localhost:9092"]
topic = "heliosdb-cdc"
ssl = true
sasl = true
compression_type = "snappy"  # Options: none, gzip, snappy, lz4, zstd

# Alternative: Kinesis sink
# [cdc.kinesis]
# region = "us-west-2"
# stream_name = "heliosdb-cdc-stream"
# enhanced_fan_out = true
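
The `buffer_size` and `flush_interval` settings suggest a batching scheme in which events are sent either when the buffer fills or when the interval elapses, whichever comes first. The following sketch illustrates that scheme under stated assumptions: `CdcBatcher` is a hypothetical type, events are plain strings, and time is advanced explicitly through `tick` so the behavior is deterministic. It is not HeliosDB's internal implementation.

```rust
use std::time::Duration;

struct CdcBatcher {
    buffer: Vec<String>,
    buffer_size: usize,
    flush_interval: Duration,
    since_last_flush: Duration,
}

impl CdcBatcher {
    fn new(buffer_size: usize, flush_interval: Duration) -> Self {
        CdcBatcher {
            buffer: Vec::with_capacity(buffer_size),
            buffer_size,
            flush_interval,
            since_last_flush: Duration::ZERO,
        }
    }

    /// Queue an event; returns the batch to send if the buffer is now full.
    fn push(&mut self, event: String) -> Option<Vec<String>> {
        self.buffer.push(event);
        if self.buffer.len() >= self.buffer_size {
            Some(self.flush())
        } else {
            None
        }
    }

    /// Advance the clock; returns a batch if the interval has elapsed
    /// and there is at least one pending event.
    fn tick(&mut self, elapsed: Duration) -> Option<Vec<String>> {
        self.since_last_flush += elapsed;
        if self.since_last_flush >= self.flush_interval && !self.buffer.is_empty() {
            Some(self.flush())
        } else {
            None
        }
    }

    /// Hand over all pending events and restart the interval timer.
    fn flush(&mut self) -> Vec<String> {
        self.since_last_flush = Duration::ZERO;
        std::mem::take(&mut self.buffer)
    }
}
```

Under this model, a larger `buffer_size` favors throughput on busy streams, while a shorter `flush_interval` bounds how long an event can wait on quiet ones.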

Output Formats

JSON:

  • Human-readable
  • Larger size
  • Easy debugging

Avro:

  • Schema evolution
  • Compact binary format
  • Best performance

Protobuf:

  • Type-safe
  • Compact binary format
  • Cross-language support

Sinks

Kafka:

  • High throughput
  • Guaranteed ordering
  • Mature ecosystem

AWS Kinesis:

  • Native AWS integration
  • Serverless
  • Enhanced fan-out support

Environment Variables

export HELIOSDB_CDC_ENABLED=true
export HELIOSDB_CDC_BUFFER_SIZE=10000

Storage Configuration

Core storage engine settings.

[storage]
data_dir = "/var/lib/heliosdb"
memtable_size_mb = 64
sstable_size_mb = 128
compaction_strategy = "SizeTiered"  # or "Leveled"
bloom_filter_bits_per_key = 10
gc_grace_seconds = 864000  # 10 days

Network Configuration

Network and communication settings.

[network]
listen_addr = "0.0.0.0:6543"
rdma_enabled = false
max_message_size_mb = 64

Security Configuration

Encryption and key management.

[security]
tde_enabled = false
tde_algorithm = "Aes256Gcm"  # Options: Aes256Gcm, Aes256Ctr, ChaCha20Poly1305
key_storage_path = "/var/lib/heliosdb/keys"
default_mek_id = "master_key_001"
key_cache_limit = 1000
hsm_enabled = false

# KMS Provider (AWS example)
# [security.kms_provider.Aws]
# region = "us-west-2"
# key_id = "arn:aws:kms:..."

Environment Variables

All configuration options can be overridden via environment variables using the format:

HELIOSDB_<SECTION>_<KEY>=<VALUE>

Examples

# GPU Configuration
export HELIOSDB_GPU_ENABLED=true
export HELIOSDB_GPU_DEVICE_ID=1
export HELIOSDB_GPU_MEMORY_LIMIT=17179869184  # 16GB

# Multi-Region Configuration
export HELIOSDB_MULTIREGION_ENABLED=true
export HELIOSDB_MULTIREGION_PRIMARY_REGION=eu-central

# Autoscaling Configuration
export HELIOSDB_AUTOSCALE_ENABLED=true
export HELIOSDB_AUTOSCALE_MIN_CU=20
export HELIOSDB_AUTOSCALE_MAX_CU=500

# CDC Configuration
export HELIOSDB_CDC_ENABLED=true
export HELIOSDB_CDC_BUFFER_SIZE=20000

# Storage Configuration
export HELIOSDB_STORAGE_DATA_DIR=/mnt/fast-ssd/heliosdb
export HELIOSDB_STORAGE_MEMTABLE_SIZE_MB=128

# Network Configuration
export HELIOSDB_NETWORK_LISTEN_ADDR=0.0.0.0:7654
export HELIOSDB_NETWORK_RDMA_ENABLED=true
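
The `HELIOSDB_<SECTION>_<KEY>` format can be sketched as a small name builder plus a typed lookup. Both functions below are hypothetical and only illustrate the naming scheme. The section is passed as the token that appears in the variable name, because the examples in this guide write the `multi_region` section as `MULTIREGION`. The `lookup` parameter is an assumption that keeps the sketch testable; in a real program it could be `|name| std::env::var(name).ok()`.

```rust
use std::str::FromStr;

/// Build the variable name for a section token and key,
/// for example ("gpu", "device_id") -> "HELIOSDB_GPU_DEVICE_ID".
fn env_var_name(section: &str, key: &str) -> String {
    format!("HELIOSDB_{}_{}", section.to_uppercase(), key.to_uppercase())
}

/// Look up an override and parse it into the target type.
/// Returns None when the variable is unset or does not parse.
fn env_override<T: FromStr>(
    section: &str,
    key: &str,
    lookup: impl Fn(&str) -> Option<String>,
) -> Option<T> {
    lookup(&env_var_name(section, key))?.trim().parse().ok()
}
```

Returning `None` for unparsable values lets the caller fall through to the TOML file or the default, although a real loader might prefer to report the malformed value instead.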

Best Practices

Development Environment

[gpu]
enabled = false  # Use CPU for development

[autoscale]
enabled = false  # Fixed resources

[multi_region]
enabled = false  # Single region

[cdc]
enabled = false  # Not needed for dev

Production Environment

[gpu]
enabled = true
fallback_to_cpu = true  # Ensure availability

[autoscale]
enabled = true
min_cu = 50.0  # Ensure minimum capacity
max_cu = 2000.0

[multi_region]
enabled = true
auto_failover = true
encryption = true

[cdc]
enabled = true
compression = true

Performance Optimization

  1. GPU: Enable for analytical workloads (OLAP)
  2. Autoscaling: To reduce cost, raise scale_down_threshold and lower scale_to_zero_idle_seconds so idle capacity is released sooner
  3. Multi-Region: Use active-passive for read-heavy workloads
  4. CDC: Use Avro format for best performance

Security Hardening

  1. Enable TDE for data-at-rest encryption
  2. Enable encryption for multi-region traffic
  3. Use KMS for key management
  4. Enable HSM for highest security

Examples

Minimal Configuration

[storage]
data_dir = "/var/lib/heliosdb"

[network]
listen_addr = "0.0.0.0:6543"

GPU-Accelerated Analytics

[gpu]
enabled = true
memory_limit = 17179869184  # 16GB
enable_aggregations = true
enable_joins = true
fallback_to_cpu = true

Multi-Region HA Setup

[multi_region]
enabled = true
primary_region = "us-west"
replication_mode = "active_active"
conflict_resolution = "last_write_wins"
auto_failover = true

[[multi_region.regions]]
id = "us-west"
endpoint = "https://us-west.db.example.com"
is_primary = true

[[multi_region.regions]]
id = "eu-central"
endpoint = "https://eu.db.example.com"
is_primary = false

Serverless with Scale-to-Zero

[autoscale]
enabled = true
min_cu = 0.0  # Scale to zero
max_cu = 500.0
scale_to_zero_idle_seconds = 300  # 5 minutes
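
The scale-to-zero condition implied by these settings can be written as a single predicate. This is a sketch under the assumption that scale-to-zero applies only when `min_cu` is 0 and that idle time is measured in whole seconds; `should_scale_to_zero` is a hypothetical helper, not a HeliosDB API.

```rust
/// True when the deployment may drop to zero compute units:
/// scale-to-zero is allowed (min_cu == 0) and the idle period has elapsed.
fn should_scale_to_zero(min_cu: f64, idle_seconds: u64, scale_to_zero_idle_seconds: u64) -> bool {
    min_cu == 0.0 && idle_seconds >= scale_to_zero_idle_seconds
}
```

With the values above, a deployment that has been idle for 300 seconds or more qualifies, while the same deployment with `min_cu = 10.0` never does.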

CDC to Kafka

[cdc]
enabled = true
output_format = "avro"
buffer_size = 10000
flush_interval = "5s"

[cdc.kafka]
bootstrap_servers = ["kafka1:9092", "kafka2:9092", "kafka3:9092"]
topic = "heliosdb-changes"
ssl = true
compression_type = "snappy"

Troubleshooting

Configuration Not Loading

Problem: Configuration file not found

Solution:

# Verify file exists
ls -la /etc/heliosdb/heliosdb.toml

# Fix permissions (owner read/write, everyone else read-only)
chmod 644 /etc/heliosdb/heliosdb.toml

# Use explicit path
export HELIOSDB_CONFIG_PATH=/path/to/config.toml

GPU Not Detected

Problem: GPU acceleration not working

Solution:

# Check GPU availability
nvidia-smi  # NVIDIA
rocm-smi    # AMD

# Select the device explicitly (IDs are 0-indexed)
export HELIOSDB_GPU_DEVICE_ID=0

# Enable CPU fallback
export HELIOSDB_GPU_FALLBACK_TO_CPU=true

Autoscaling Not Working

Problem: Cluster not scaling

Solution:

# Check thresholds
[autoscale]
scale_up_threshold = 80.0
scale_down_threshold = 30.0
cooldown_period_seconds = 60  # Reduce for faster response

# Verify min/max are different
min_cu = 10.0
max_cu = 100.0  # Must be > min_cu

CDC Lag Increasing

Problem: CDC falling behind

Solution:

[cdc]
# Increase buffer size
buffer_size = 50000

# Reduce flush interval
flush_interval = "1s"

# Enable compression
compression = true

Configuration Validation

HeliosDB automatically validates configuration on startup. Common validation errors:

  • max_cu must be greater than min_cu
  • gpu.memory_limit must be > 0 when GPU is enabled
  • multi_region.regions cannot be empty when enabled
  • scale_up_threshold must be > scale_down_threshold

Check logs for validation errors:

tail -f /var/log/heliosdb/heliosdb.log | grep -i "config"
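
The four validation rules listed above are simple enough to check before deployment. The sketch below mirrors them one by one; `ValidationInput` and `validate` are hypothetical, the struct is a flattened view of only the fields those rules mention, and applying the autoscale rules even when autoscaling is disabled is an assumption. HeliosDB's own validator may check more than this.

```rust
/// Hypothetical flattened view of the settings the listed rules refer to.
struct ValidationInput {
    gpu_enabled: bool,
    gpu_memory_limit: u64,
    multi_region_enabled: bool,
    region_count: usize,
    min_cu: f64,
    max_cu: f64,
    scale_up_threshold: f64,
    scale_down_threshold: f64,
}

/// Return one message per violated rule; an empty vector means the input passed.
fn validate(cfg: &ValidationInput) -> Vec<&'static str> {
    let mut errors = Vec::new();
    if cfg.max_cu <= cfg.min_cu {
        errors.push("max_cu must be greater than min_cu");
    }
    if cfg.gpu_enabled && cfg.gpu_memory_limit == 0 {
        errors.push("gpu.memory_limit must be > 0 when GPU is enabled");
    }
    if cfg.multi_region_enabled && cfg.region_count == 0 {
        errors.push("multi_region.regions cannot be empty when enabled");
    }
    if cfg.scale_up_threshold <= cfg.scale_down_threshold {
        errors.push("scale_up_threshold must be > scale_down_threshold");
    }
    errors
}
```

Collecting every violation, rather than stopping at the first, lets an operator fix a configuration file in one pass.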

Support

For configuration assistance:

  • GitHub Issues: https://github.com/heliosdb/heliosdb/issues
  • Documentation: https://docs.heliosdb.com
  • Community: https://community.heliosdb.com