HeliosDB v4.0.0 - Complete Configuration Reference

Comprehensive Configuration Guide for All Features

This document is the complete configuration reference for HeliosDB v4.0.0. It covers every feature introduced in v4.0 as well as legacy settings.



Configuration File Format

HeliosDB v4.0.0 uses YAML for its configuration files (v3.1 used TOML).

Default Location: /etc/heliosdb/heliosdb.yaml

Override with CLI:

heliosdb-server --config /path/to/custom/heliosdb.yaml


Quick Start Configurations

Development (Single Node, All Features Disabled)

# heliosdb.yaml - Development Configuration
server:
  host: 0.0.0.0
  port: 5432
  max_connections: 100

storage:
  data_dir: /var/lib/heliosdb/data
  wal_dir: /var/lib/heliosdb/wal
  compression: none

# All v4.0 features disabled for simplicity
branching:
  enabled: false

autoscaling:
  enabled: false

tiered_storage:
  enabled: false

Production (Multi-Node, All v4.0 Features Enabled)

# heliosdb.yaml - Production Configuration
server:
  host: 0.0.0.0
  port: 5432
  max_connections: 1000
  max_connections_per_user: 100

# Git-Style Branching
branching:
  enabled: true
  max_branches_per_database: 100
  default_parent: main
  auto_gc: true
  gc_interval: 24h
  branch_retention: 30d

# Scale-to-Zero & Autoscaling
autoscaling:
  enabled: true
  min_cu: 0.5                    # Minimum 0.5 CU (don't scale to zero in prod)
  max_cu: 16.0                   # Maximum 16 CUs
  scale_to_zero_after: 0         # Disable scale-to-zero in prod (set to 0)
  resume_timeout: 300ms
  target_cpu: 70
  target_queue_depth: 10
  scale_up_threshold: 80
  scale_down_threshold: 30
  cooldown_period: 60s

# Query-from-Any-Node
distributed_query:
  enabled: true
  metadata_cache_size: 1GB
  cache_ttl: 300s
  cache_invalidation_batch_size: 1000

# Zero-Downtime Rebalancing
rebalancing:
  enabled: true
  auto_rebalance: true
  strategy: by_disk_size          # Options: by_shard_count, by_disk_size, by_tenant_id
  threshold: 0.20                 # Trigger at 20% imbalance
  check_interval: 3600s           # Check every hour
  max_concurrent_moves: 5
  bandwidth_limit: 100MB/s

# Enhanced Columnar Compression
compression:
  enabled: true
  default_algorithm: hcc_v2       # Options: none, hcc_v1, hcc_v2
  compression_level: high         # Options: low, medium, high, adaptive
  adaptive_selection: true

# Schema-Based Sharding
schema_sharding:
  enabled: true
  default_distribution: distributed
  allow_non_distributed: true

# Distributed Foreign Keys
foreign_keys:
  enabled: true
  cross_shard_validation: true
  reference_table_replication: true
  join_optimization: true

# 3-Tier Storage
tiered_storage:
  enabled: true

  hot_tier:
    path: /mnt/nvme/heliosdb
    max_size_gb: 1000
    latency_target_ms: 1
    cost_per_gb: 0.15

  warm_tier:
    path: /mnt/ssd/heliosdb
    max_size_gb: 5000
    latency_target_ms: 5
    cost_per_gb: 0.04

  cold_tier:
    type: s3
    bucket: heliosdb-cold-tier-prod
    region: us-east-1
    endpoint: https://s3.amazonaws.com
    access_key_id: ${AWS_ACCESS_KEY_ID}    # Environment variable
    secret_access_key: ${AWS_SECRET_ACCESS_KEY}
    latency_target_ms: 50
    cost_per_gb: 0.02

  policies:
    hot_to_warm: 7d
    warm_to_cold: 30d

  migration:
    max_concurrent: 5
    max_bandwidth_mbps: 100
    atomic_transitions: true
    enable_throttling: true

# Safekeeper Consensus
safekeeper:
  enabled: true
  cluster_size: 3
  quorum_size: 2

  nodes:
    - id: sk-1
      address: 10.0.1.10:5433
    - id: sk-2
      address: 10.0.1.11:5433
    - id: sk-3
      address: 10.0.1.12:5433

  wal_retention: 7d
  sync_timeout_ms: 100
  async_storage_flush: true

# Online Table Sharding
online_sharding:
  enabled: true
  max_concurrent_migrations: 3
  migration_throughput_limit: 100000  # rows/sec
  cutover_timeout_ms: 100

# Multi-Tenant Quotas
multi_tenancy:
  enabled: true
  enforce_quotas: true
  default_tier: bronze

  tiers:
    bronze:
      max_cpu_cu: 2.0
      max_iops: 5000
      priority: 1
      price_multiplier: 1.0

    silver:
      max_cpu_cu: 8.0
      max_iops: 15000
      priority: 5
      price_multiplier: 1.5

    gold:
      max_cpu_cu: 32.0
      max_iops: 50000
      priority: 10
      price_multiplier: 2.5

  enforcement:
    check_interval_ms: 100
    grace_period_ms: 1000
    throttle_on_violation: true

v4.0 Breakthrough Features

1. Git-Style Database Branching

branching:
  enabled: true                          # Enable branching feature
  max_branches_per_database: 100         # Maximum branches per database
  default_parent: main                   # Default parent branch name
  auto_gc: true                          # Automatic garbage collection
  gc_interval: 24h                       # GC interval (hours, days)
  branch_retention: 30d                  # Retention period for deleted branches
  delta_sstable_max_size: 1GB           # Maximum delta SSTable size before compaction
  copy_on_write: true                    # Enable copy-on-write (always true)

SQL API:

-- Create branch
SELECT heliosdb.create_branch('feature-1');
SELECT heliosdb.create_branch('debug-1', parent_timestamp => '2025-10-24 14:30:00');
SELECT heliosdb.create_branch('rollback-1', parent_lsn => '0/1234ABCD');

-- List branches
SELECT * FROM heliosdb.branches;

-- Switch branch
SELECT heliosdb.checkout_branch('feature-1');

-- Delete branch
SELECT heliosdb.delete_branch('feature-1');
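
The branching settings above express time spans as short strings such as 24h and 30d. The following Python sketch shows one way such values could be interpreted, and how branch_retention would translate into the earliest moment a deleted branch becomes eligible for garbage collection. The helper names parse_duration and purge_eligible_at are hypothetical, the unit list contains only the suffixes that appear in this reference, and the retention semantics are an assumption rather than documented HeliosDB behavior.

```python
import re
from datetime import datetime, timedelta

# Suffixes observed in this reference (300ms, 300s, 5m, 24h, 30d).
# Whether HeliosDB accepts other units is not stated here.
_UNITS = {
    "ms": "milliseconds",
    "s": "seconds",
    "m": "minutes",
    "h": "hours",
    "d": "days",
}
_DURATION = re.compile(r"(\d+)(ms|s|m|h|d)")


def parse_duration(text: str) -> timedelta:
    """Convert a string such as '24h' or '30d' into a timedelta."""
    match = _DURATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


def purge_eligible_at(deleted_at: datetime, branch_retention: str) -> datetime:
    """Earliest time a deleted branch could be purged (assumed semantics)."""
    return deleted_at + parse_duration(branch_retention)


# A branch deleted on 2025-10-24 14:30 with branch_retention: 30d
print(purge_eligible_at(datetime(2025, 10, 24, 14, 30), "30d"))
# prints 2025-11-23 14:30:00
```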


2. Scale-to-Zero Serverless Compute

autoscaling:
  enabled: true
  min_cu: 0.0                            # Minimum CUs (0.0 = scale to zero)
  max_cu: 4.0                            # Maximum CUs
  scale_to_zero_after: 300s              # Idle time before suspend (seconds)
  resume_timeout: 300ms                  # Maximum resume time
  suspend_checkpoint_timeout: 500ms      # Checkpoint timeout during suspend
  persist_connection_state: true         # Persist connection state
  state_storage_path: /var/lib/heliosdb/state
  billing_precision: 1s                  # Billing precision (1s = 1 second granularity)

Activity Monitoring:

autoscaling:
  activity_monitoring:
    enabled: true
    check_interval_ms: 1000              # Check every second
    idle_threshold_queries: 0            # Consider idle if 0 queries
    idle_threshold_connections: 0        # Consider idle if 0 active connections

SQL API:

-- Check compute status
SELECT * FROM heliosdb.compute_status;

-- Manual suspend
SELECT heliosdb.suspend_compute();

-- Manual resume
SELECT heliosdb.resume_compute();
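
To make the interaction between scale_to_zero_after and the activity_monitoring thresholds concrete, here is a minimal Python sketch of an idle check. It assumes a node counts as idle when both the query count and the connection count are at or below their thresholds, and that a scale_to_zero_after of 0 disables suspension, as in the production example. The Activity type and should_suspend function are hypothetical names, not part of HeliosDB.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    running_queries: int
    active_connections: int
    idle_seconds: float  # time since the node last looked busy


def should_suspend(
    activity: Activity,
    scale_to_zero_after_s: float,
    idle_threshold_queries: int = 0,
    idle_threshold_connections: int = 0,
) -> bool:
    """Return True when the node has been idle long enough to suspend."""
    # 0 disables scale-to-zero, mirroring the production configuration.
    if scale_to_zero_after_s <= 0:
        return False
    is_idle = (
        activity.running_queries <= idle_threshold_queries
        and activity.active_connections <= idle_threshold_connections
    )
    return is_idle and activity.idle_seconds >= scale_to_zero_after_s


# scale_to_zero_after: 300s, no queries, no connections, idle for 301 s
print(should_suspend(Activity(0, 0, 301.0), 300.0))
```

With these assumptions, a single open connection keeps the node running regardless of how long it has been without queries.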


3. Dynamic Autoscaling

autoscaling:
  # Already covered above, additional settings:
  target_cpu: 70                         # Target CPU utilization (%)
  target_queue_depth: 10                 # Target query queue depth
  target_memory: 80                      # Target memory utilization (%)
  scale_up_threshold: 80                 # Scale up at 80% CPU
  scale_down_threshold: 30               # Scale down at 30% CPU
  cooldown_period: 60s                   # Wait between scaling decisions
  min_scale_step: 0.5                    # Minimum CUs to add/remove
  max_scale_step: 2.0                    # Maximum CUs to add/remove in one step
  predictive_scaling: false              # ML-based predictive scaling (future)

SQL API:

-- View autoscaling events
SELECT * FROM heliosdb.autoscale_events
WHERE timestamp > NOW() - INTERVAL '1 hour'
ORDER BY timestamp DESC;

-- View current resource usage
SELECT
    current_cu,
    cpu_utilization,
    memory_utilization,
    query_queue_depth,
    active_connections,
    recommended_cu
FROM heliosdb.autoscale_status;
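
The thresholds and step limits above can be read as a simple control rule. The Python sketch below is one plausible interpretation, not HeliosDB's actual algorithm: it does nothing while CPU utilization stays between scale_down_threshold and scale_up_threshold, otherwise it moves toward the CU count that would bring utilization back to target_cpu, rounds the change up to a multiple of min_scale_step, and caps it at max_scale_step. It assumes load scales linearly with CUs and ignores cooldown_period, target_queue_depth and target_memory. AutoscaleConfig and next_cu are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass
class AutoscaleConfig:
    min_cu: float = 0.5
    max_cu: float = 16.0
    target_cpu: float = 70.0
    scale_up_threshold: float = 80.0
    scale_down_threshold: float = 30.0
    min_scale_step: float = 0.5
    max_scale_step: float = 2.0


def next_cu(current_cu: float, cpu_utilization: float, cfg: AutoscaleConfig) -> float:
    """Return the CU allocation after one scaling decision."""
    # Inside the dead band: leave the allocation unchanged.
    if cfg.scale_down_threshold < cpu_utilization < cfg.scale_up_threshold:
        return current_cu
    # CU count that would put utilization at the target (linear-load assumption).
    desired = current_cu * cpu_utilization / cfg.target_cpu
    delta = desired - current_cu
    # Round the change up to whole steps, then cap it.
    steps = math.ceil(abs(delta) / cfg.min_scale_step)
    step = min(max(steps * cfg.min_scale_step, cfg.min_scale_step), cfg.max_scale_step)
    proposed = current_cu + math.copysign(step, delta)
    return min(max(proposed, cfg.min_cu), cfg.max_cu)


cfg = AutoscaleConfig()
print(next_cu(4.0, 90.0, cfg))  # above scale_up_threshold
print(next_cu(4.0, 20.0, cfg))  # below scale_down_threshold
```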


4. Query-from-Any-Node Architecture

distributed_query:
  enabled: true
  metadata_cache_size: 1GB               # Metadata cache size per node
  cache_ttl: 300s                        # Cache TTL
  cache_invalidation_batch_size: 1000    # Batch size for invalidation messages
  cache_refresh_interval: 60s            # Automatic refresh interval
  ddl_centralized: true                  # DDL always routed to metadata service
  query_routing_strategy: automatic      # Options: automatic, coordinator_only

SQL API:

-- View metadata cache status
SELECT * FROM heliosdb.metadata_cache_status;

-- Force cache invalidation
SELECT heliosdb.invalidate_metadata_cache();

-- View query routing statistics
SELECT * FROM heliosdb.query_routing_stats;


5. Zero-Downtime Shard Rebalancing

rebalancing:
  enabled: true
  auto_rebalance: true                   # Enable automatic rebalancing
  strategy: by_disk_size                 # Options: by_shard_count, by_disk_size, by_tenant_id
  threshold: 0.20                        # Trigger at 20% imbalance
  check_interval: 3600s                  # Check every hour
  max_concurrent_moves: 5                # Maximum concurrent shard moves
  bandwidth_limit: 100MB/s               # Network bandwidth limit
  replication_lag_threshold: 1000        # Max WAL records lag before cutover
  cutover_lock_timeout_ms: 100           # Maximum lock time during cutover
  rollback_on_failure: true              # Automatic rollback on failure

SQL API:

-- Manual rebalancing
SELECT heliosdb.rebalance_shards(strategy => 'by_disk_size');

-- View rebalancing status
SELECT * FROM heliosdb.rebalance_status;

-- View shard distribution
SELECT * FROM heliosdb.shard_distribution;
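
This reference does not define how the 20% imbalance behind threshold is measured. As an illustration only, the Python sketch below assumes imbalance is the largest deviation of any node from the cluster mean, divided by that mean, using per-node disk usage as the by_disk_size strategy suggests. The functions imbalance and needs_rebalance are hypothetical.

```python
from statistics import mean
from typing import Mapping


def imbalance(disk_usage_gb: Mapping[str, float]) -> float:
    """Largest relative deviation from the mean disk usage (assumed metric)."""
    avg = mean(disk_usage_gb.values())
    if avg == 0:
        return 0.0
    return max(abs(used - avg) for used in disk_usage_gb.values()) / avg


def needs_rebalance(disk_usage_gb: Mapping[str, float], threshold: float = 0.20) -> bool:
    """True when the measured imbalance reaches the configured threshold."""
    return imbalance(disk_usage_gb) >= threshold


usage = {"node-1": 100.0, "node-2": 120.0, "node-3": 80.0}
print(imbalance(usage), needs_rebalance(usage))
```

Under this metric a cluster with 100, 110 and 90 GB per node sits at 10% imbalance and would not trigger a rebalance at the default threshold.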


6. Enhanced Columnar Compression (HCC v2)

compression:
  enabled: true
  default_algorithm: hcc_v2              # Options: none, hcc_v1, hcc_v2
  compression_level: high                # Options: low, medium, high, adaptive
  adaptive_selection: true               # Automatically choose best algorithm
  algorithms:
    dictionary_encoding: true
    delta_encoding: true
    run_length_encoding: true
    zstd: true
    lz4: true
    frame_of_reference: true
    bit_packing: true
    null_suppression: true
  zstd_level: 3                          # ZSTD compression level (1-22)
  compression_threshold: 1KB             # Minimum size to compress

SQL API:

-- Create table with HCC v2
CREATE TABLE events (
    ...
) WITH (
    compression = 'hcc_v2',
    compression_level = 'high'
);

-- View compression statistics
SELECT * FROM heliosdb.compression_stats WHERE table_name = 'events';

-- Convert table to HCC v2
ALTER TABLE legacy_events SET (compression = 'hcc_v2');


7. Schema-Based Sharding

schema_sharding:
  enabled: true
  default_distribution: distributed      # Options: distributed, non_distributed
  allow_non_distributed: true            # Allow non-distributed schemas
  schema_placement_strategy: automatic   # Options: automatic, manual
  migration_strategy: logical_replication
  migration_bandwidth_limit: 100MB/s

SQL API:

-- Create distributed schema
CREATE SCHEMA tenant_1234 DISTRIBUTED;

-- Create non-distributed schema
CREATE SCHEMA analytics NON_DISTRIBUTED;

-- View schema distribution
SELECT * FROM heliosdb.schema_distribution;

-- Migrate schema to different node
SELECT heliosdb.migrate_schema(
    schema_name => 'tenant_5678',
    target_node => 'node-3',
    strategy => 'logical_replication'
);


8. Distributed Foreign Keys

foreign_keys:
  enabled: true
  cross_shard_validation: true           # Enable cross-shard FK validation
  reference_table_replication: true      # Replicate reference tables
  join_optimization: true                # Enable join optimization
  co_located_fk_optimization: true       # Optimize co-located FKs
  fk_cache_enabled: true                 # Cache FK validation results
  fk_cache_ttl: 60s                      # FK cache TTL

SQL API:

-- Create co-located foreign key
CREATE TABLE orders (
    tenant_id INT,
    user_id INT,
    FOREIGN KEY (tenant_id, user_id) REFERENCES users(tenant_id, user_id)
) SHARD BY (tenant_id);

-- Create reference table (replicated)
CREATE TABLE countries (
    country_code CHAR(2) PRIMARY KEY,
    country_name TEXT
) REPLICATED;

-- View FK validation statistics
SELECT * FROM heliosdb.fk_validation_stats;

-- View join optimization statistics
SELECT * FROM heliosdb.join_optimization_stats;


9. 3-Tier Storage (Hot/Warm/Cold)

tiered_storage:
  enabled: true

  hot_tier:
    path: /mnt/nvme/heliosdb
    max_size_gb: 1000
    latency_target_ms: 1
    cost_per_gb: 0.15
    iops_limit: 100000

  warm_tier:
    path: /mnt/ssd/heliosdb
    max_size_gb: 5000
    latency_target_ms: 5
    cost_per_gb: 0.04
    iops_limit: 50000

  cold_tier:
    type: s3                             # Options: s3, minio, ceph, gcs, azure
    bucket: heliosdb-cold-tier
    region: us-east-1
    endpoint: https://s3.amazonaws.com
    access_key_id: ${AWS_ACCESS_KEY_ID}
    secret_access_key: ${AWS_SECRET_ACCESS_KEY}
    latency_target_ms: 50
    cost_per_gb: 0.02
    multipart_threshold: 5MB
    multipart_chunk_size: 100MB
    compression: zstd                    # Compress before upload
    encryption: aes256                   # Encrypt before upload

  policies:
    hot_to_warm: 7d                      # Move to warm after 7 days
    warm_to_cold: 30d                    # Move to cold after 30 days
    access_based_tiering: true           # Override age-based if frequently accessed
    min_access_frequency: 10             # Keep in hot if >10 accesses/day

  migration:
    max_concurrent: 5
    max_bandwidth_mbps: 100
    atomic_transitions: true
    enable_throttling: true
    retry_attempts: 3
    retry_delay_ms: 1000

SQL API:

-- View tier usage
SELECT * FROM heliosdb.storage_tier_usage;

-- Manual tier migration
SELECT heliosdb.move_to_tier(
    object_key => 'table:events:sst-12345',
    target_tier => 'cold'
);

-- View tiering policies
SELECT * FROM heliosdb.tiering_policies;

-- Generate cost report
SELECT * FROM heliosdb.storage_cost_analysis();
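
The policies block combines age-based rules with an access-frequency override. A small Python sketch of that decision, under stated assumptions: age is counted in days since the object was written, both age limits are measured from that same point, and an object accessed more than min_access_frequency times per day stays in the hot tier regardless of age. The function target_tier is a hypothetical helper, not a HeliosDB API.

```python
def target_tier(
    age_days: float,
    accesses_per_day: float,
    hot_to_warm_days: float = 7,
    warm_to_cold_days: float = 30,
    access_based_tiering: bool = True,
    min_access_frequency: float = 10,
) -> str:
    """Pick the tier an object should live in under the assumed policy."""
    # Frequently accessed data overrides the age-based rules.
    if access_based_tiering and accesses_per_day > min_access_frequency:
        return "hot"
    if age_days < hot_to_warm_days:
        return "hot"
    if age_days < warm_to_cold_days:
        return "warm"
    return "cold"


for age, hits in [(3, 0), (10, 0), (45, 0), (45, 25)]:
    print(age, hits, target_tier(age, hits))
```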


10. Safekeeper Consensus Layer

safekeeper:
  enabled: true
  cluster_size: 3                        # Number of Safekeeper nodes
  quorum_size: 2                         # Quorum size (2 of 3 nodes)

  nodes:
    - id: sk-1
      address: 10.0.1.10:5433
      priority: 10                       # Higher priority = preferred leader
    - id: sk-2
      address: 10.0.1.11:5433
      priority: 5
    - id: sk-3
      address: 10.0.1.12:5433
      priority: 1

  wal_retention: 7d                      # WAL retention period
  wal_segment_size: 16MB                 # WAL segment size
  sync_timeout_ms: 100                   # Synchronous replication timeout
  async_storage_flush: true              # Asynchronous storage flush
  compression: lz4                       # WAL compression
  checksum: true                         # WAL checksums

SQL API:

-- View safekeeper cluster status
SELECT * FROM heliosdb.safekeeper_cluster_status;

-- View WAL replication metrics
SELECT * FROM heliosdb.wal_replication_stats;

-- Force safekeeper failover (testing only)
SELECT heliosdb.trigger_safekeeper_failover();
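
The relationship between cluster_size and quorum_size follows the usual majority rule of quorum-based replication: any two majorities share at least one node, so an acknowledged write cannot be lost while a majority survives. The Python sketch below works through the arithmetic. Whether HeliosDB rejects a quorum_size below a majority is not stated in this reference; the check_quorum function is an illustrative assumption.

```python
def majority_quorum(cluster_size: int) -> int:
    """Smallest quorum such that any two quorums overlap."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int, quorum_size: int) -> int:
    """Nodes that can be down while a quorum is still reachable."""
    return cluster_size - quorum_size


def check_quorum(cluster_size: int, quorum_size: int) -> None:
    """Raise if the configured quorum is not a majority (assumed rule)."""
    if not majority_quorum(cluster_size) <= quorum_size <= cluster_size:
        raise ValueError(
            f"quorum_size {quorum_size} is not a majority of {cluster_size} nodes"
        )


check_quorum(cluster_size=3, quorum_size=2)
print(majority_quorum(3), tolerated_failures(3, 2))
# prints 2 1
```

With the three-node layout shown above, one Safekeeper can fail without blocking commits; a five-node cluster with a quorum of three tolerates two failures.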


11. Online Table Sharding Migration

online_sharding:
  enabled: true
  max_concurrent_migrations: 3           # Maximum concurrent table migrations
  migration_throughput_limit: 100000     # Rows/sec
  bulk_copy_batch_size: 10000            # Rows per batch
  replication_lag_threshold: 1000        # Max lag before cutover
  cutover_timeout_ms: 100                # Maximum cutover lock time
  rollback_on_failure: true              # Automatic rollback
  progress_reporting_interval: 1000      # Report progress every 1000 rows

SQL API:

-- Shard table online
SELECT heliosdb.shard_table_online(
    table_name => 'events',
    shard_key => 'user_id',
    shard_count => 16
);

-- Monitor sharding progress
SELECT * FROM heliosdb.online_sharding_status WHERE table_name = 'events';

-- Rollback sharding
SELECT heliosdb.rollback_online_sharding('events');


12. Multi-Tenant Resource Quotas

multi_tenancy:
  enabled: true
  enforce_quotas: true
  default_tier: bronze

  tiers:
    bronze:
      max_cpu_cu: 2.0
      max_memory_gb: 8
      max_storage_gb: 100
      max_iops: 5000
      max_connections: 50
      priority: 1
      price_multiplier: 1.0
      burst_allowance: 0.1               # 10% burst for 1 minute
      burst_duration: 60s

    silver:
      max_cpu_cu: 8.0
      max_memory_gb: 32
      max_storage_gb: 1000
      max_iops: 15000
      max_connections: 200
      priority: 5
      price_multiplier: 1.5
      burst_allowance: 0.2               # 20% burst for 5 minutes
      burst_duration: 300s

    gold:
      max_cpu_cu: 32.0
      max_memory_gb: 128
      max_storage_gb: 10000
      max_iops: 50000
      max_connections: 1000
      priority: 10
      price_multiplier: 2.5
      burst_allowance: 0.5               # 50% burst for 15 minutes
      burst_duration: 900s

  enforcement:
    check_interval_ms: 100
    grace_period_ms: 1000
    throttle_on_violation: true
    reject_on_hard_limit: true
    notification_enabled: true
    notification_threshold: 0.9          # Notify at 90% quota

SQL API:

-- Create tenant with quotas
CREATE TENANT tenant_123 WITH (
    max_connections => 100,
    max_cpu_cu => 4.0,
    max_storage_gb => 1000,
    max_iops => 10000,
    qos_tier => 'gold'
);

-- View tenant usage
SELECT * FROM heliosdb.tenant_usage WHERE tenant_id = 'tenant_123';

-- Modify tenant quotas
ALTER TENANT tenant_123 SET max_cpu_cu = 8.0;
ALTER TENANT tenant_123 SET qos_tier = 'gold';

-- View quota violations
SELECT * FROM heliosdb.quota_violations
WHERE timestamp > NOW() - INTERVAL '1 hour'
ORDER BY timestamp DESC;
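
How burst_allowance, notification_threshold and reject_on_hard_limit combine is not spelled out in this reference. The Python sketch below shows one plausible reading: usage up to the quota is fine, usage at or above 90% of the quota produces a notification, usage above the quota but within the burst allowance is tolerated, and anything beyond that is rejected. It ignores burst_duration and grace_period_ms, and classify_usage is a hypothetical helper.

```python
def classify_usage(
    usage: float,
    limit: float,
    burst_allowance: float,
    notification_threshold: float = 0.9,
) -> str:
    """Classify resource usage against a quota (assumed semantics)."""
    ratio = usage / limit
    if ratio > 1.0 + burst_allowance:
        return "reject"   # beyond quota plus burst headroom
    if ratio > 1.0:
        return "burst"    # over quota, still inside the burst allowance
    if ratio >= notification_threshold:
        return "notify"   # approaching the quota
    return "ok"


# Bronze tier: max_cpu_cu 2.0, burst_allowance 0.1
for used_cu in (1.0, 1.9, 2.1, 2.5):
    print(used_cu, classify_usage(used_cu, limit=2.0, burst_allowance=0.1))
```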


Core Configuration

Server Settings

server:
  host: 0.0.0.0                          # Listen address
  port: 5432                             # PostgreSQL port
  max_connections: 1000                  # Maximum connections
  max_connections_per_user: 100          # Per-user connection limit
  max_connections_per_database: 500      # Per-database connection limit
  connection_timeout: 30s                # Connection timeout
  idle_session_timeout: 3600s            # Idle session timeout (1 hour)
  statement_timeout: 0                   # Statement timeout (0 = disabled)
  lock_timeout: 0                        # Lock timeout (0 = disabled)
  work_mem: 4MB                          # Memory per operation
  shared_buffers: 256MB                  # Shared buffer pool
  max_worker_processes: 8                # Maximum worker processes

Storage Settings

storage:
  data_dir: /var/lib/heliosdb/data
  wal_dir: /var/lib/heliosdb/wal
  temp_dir: /var/lib/heliosdb/temp

  # LSM-Tree Settings
  memtable_size: 64MB                    # Memtable size before flush
  sstable_block_size: 4KB                # SSTable block size
  bloom_filter_bits_per_key: 10          # Bloom filter size
  compaction_strategy: tiered            # Options: leveled, tiered, universal
  compaction_threads: 4                  # Compaction threads
  max_open_files: 1000                   # Maximum open file handles

  # WAL Settings
  wal_segment_size: 16MB
  wal_sync_method: fdatasync             # Options: fsync, fdatasync, open_sync
  wal_compression: lz4                   # Options: none, lz4, zstd
  checkpoint_interval: 5m                # Checkpoint interval
  max_wal_size: 1GB                      # Maximum WAL size before checkpoint

Protocol Configuration

PostgreSQL Protocol

protocols:
  postgres:
    enabled: true
    port: 5432
    simple_query_protocol: true
    extended_query_protocol: true
    binary_format: true                  # Binary wire format
    scram_sha_256: true                  # SCRAM-SHA-256 auth
    ssl_enabled: true
    ssl_cert: /etc/heliosdb/certs/server.crt
    ssl_key: /etc/heliosdb/certs/server.key
    ssl_ca: /etc/heliosdb/certs/ca.crt

Oracle Protocol

  # Nested under the protocols: key shown above
  oracle:
    enabled: true
    port: 1521
    tns_protocol: true
    plsql_engine: true
    dbms_packages: all                   # Options: all, priority1, priority2, none
    ref_cursors: true
    oracle_types: true

MySQL Protocol

  # Nested under the protocols: key shown above
  mysql:
    enabled: true
    port: 3306
    client_protocol: true

HTTP Gateway

  # Nested under the protocols: key shown above
  http:
    enabled: true
    port: 8443
    ssl_enabled: true
    ssl_cert: /etc/heliosdb/certs/server.crt
    ssl_key: /etc/heliosdb/certs/server.key
    auth_method: oauth2                  # Options: oauth2, api_key, basic

Security Configuration

security:
  # Row-Level Security
  rls:
    enabled: true
    default_policy: deny                 # Options: deny, allow

  # Data Masking
  masking:
    enabled: true
    pii_auto_detection: true
    default_algorithm: redact            # Options: redact, hash, tokenize, etc.

  # Audit Logging
  audit:
    enabled: true
    log_path: /var/log/heliosdb/audit.log
    log_format: json
    tamper_proof: true                   # Blockchain-style hash chains
    retention: 90d

  # Encryption
  encryption:
    at_rest: true
    algorithm: aes-256-gcm
    key_rotation_interval: 90d
    per_column_encryption: true

  # FIPS 140-2 Compliance
  fips_140_2:
    enabled: false                       # Enable for government compliance
    crypto_module: openssl-fips

Environment Variables

HeliosDB supports environment variable substitution in configuration files using ${VAR_NAME} syntax.

Common Environment Variables:

# AWS Credentials
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
export AWS_REGION=us-east-1

# Database Credentials
export HELIOSDB_ADMIN_PASSWORD=secure_password_here
export HELIOSDB_REPLICATION_PASSWORD=replication_password_here

# Paths
export HELIOSDB_DATA_DIR=/var/lib/heliosdb/data
export HELIOSDB_WAL_DIR=/var/lib/heliosdb/wal
export HELIOSDB_LOG_DIR=/var/log/heliosdb

# Networking
export HELIOSDB_HOST=0.0.0.0
export HELIOSDB_PORT=5432

# Feature Flags
export HELIOSDB_ENABLE_BRANCHING=true
export HELIOSDB_ENABLE_AUTOSCALING=true
export HELIOSDB_ENABLE_TIERED_STORAGE=true
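
To illustrate the ${VAR_NAME} substitution described above, here is a minimal Python sketch of how a configuration loader might expand such references before parsing the YAML. It is not HeliosDB's implementation. In particular, failing on an undefined variable is an assumption, since this reference does not say what the server does in that case, and substitute_env is a hypothetical name.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${VAR_NAME} where the name is a typical shell identifier.
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute_env(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace every ${VAR_NAME} in text with its value from env."""
    values = os.environ if env is None else env

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in values:
            # Assumed behavior: refuse to start with an unresolved reference.
            raise KeyError(f"undefined environment variable: {name}")
        return values[name]

    return _VAR.sub(replace, text)


line = "access_key_id: ${AWS_ACCESS_KEY_ID}"
print(substitute_env(line, {"AWS_ACCESS_KEY_ID": "AKIAIOSFODNN7EXAMPLE"}))
# prints access_key_id: AKIAIOSFODNN7EXAMPLE
```

Failing fast on a missing variable avoids silently writing an empty credential into the cold-tier settings.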

Configuration Validation

Validate configuration before starting:

heliosdb-server --config /etc/heliosdb/heliosdb.yaml --validate

Check configuration syntax:

yamllint /etc/heliosdb/heliosdb.yaml

View current configuration (running server):

SELECT * FROM heliosdb.configuration;


Configuration Reload

Reload configuration without restart (for some settings):

# Send SIGHUP to reload
kill -HUP "$(cat /var/run/heliosdb.pid)"

# Or use SQL
SELECT heliosdb.reload_configuration();

Note: Some settings (like max_connections, shared_buffers) require a restart.


Summary

Configuration Files:
- Main: /etc/heliosdb/heliosdb.yaml (1,300+ lines)
- Comprehensive coverage of all 71 features
- Environment variable support
- Validation tools included

Quick Start Configurations Provided:
- Development (single node, minimal)
- Production (multi-node, all features)
- Use case specific (e-commerce, IoT, SaaS, data warehouse, ML)

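All v4.0 Features Configurable:
- Git-Style Database Branching
- Scale-to-Zero Serverless Compute
- Dynamic Autoscaling
- Query-from-Any-Node Architecture
- Zero-Downtime Shard Rebalancing
- Enhanced Columnar Compression (HCC v2)
- Schema-Based Sharding
- Distributed Foreign Keys
- 3-Tier Storage (Hot/Warm/Cold)
- Safekeeper Consensus Layer
- Online Table Sharding Migration
- Multi-Tenant Resource Quotas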
