HeliosDB SQL Command Extensions

This document describes the SQL command extensions available in HeliosDB for managing GPU acceleration, multi-region deployments, autoscaling, and Change Data Capture (CDC).

Overview

HeliosDB extends standard SQL with custom commands to control advanced features:

  • GPU Operations: Enable/disable GPU acceleration, monitor GPU status
  • Multi-Region Management: View regions, configure routing and consistency
  • Autoscaling: Control compute node scaling and resource allocation
  • Change Data Capture: Stream database changes to external systems

All commands are accessible through standard PostgreSQL protocol clients (psql, JDBC, etc.).

GPU Operations

SET gpu_enabled

Enable or disable GPU acceleration for query execution.

SET gpu_enabled = true;
SET gpu_enabled = false;

Parameters:

  • true: Enable GPU acceleration (queries use the GPU when beneficial)
  • false: Disable GPU acceleration (queries use the CPU only)

Example:

-- Enable GPU acceleration
SET gpu_enabled = true;

-- Run a query that will use GPU
SELECT region, SUM(sales), AVG(price)
FROM sales_data
GROUP BY region;

SET gpu_memory_limit

Set the maximum GPU memory allocation for queries.

SET gpu_memory_limit = '4GB';
SET gpu_memory_limit = '8GB';

Parameters:

  • Memory size as a string with a unit suffix (GB or MB)

Example:

-- Limit GPU memory to 4GB
SET gpu_memory_limit = '4GB';
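A client may want to validate or convert a memory limit string before issuing SET. The helper below is an illustrative sketch (not part of HeliosDB) that accepts the GB/MB suffix format shown above:

```python
# Illustrative helper: convert a gpu_memory_limit string such as
# '4GB' or '512MB' into a byte count before issuing SET.
def parse_memory_limit(limit: str) -> int:
    units = {"MB": 1024 ** 2, "GB": 1024 ** 3}
    for suffix, factor in units.items():
        if limit.upper().endswith(suffix):
            return int(limit[: -len(suffix)]) * factor
    raise ValueError(f"expected a size with MB or GB suffix, got {limit!r}")

print(parse_memory_limit("4GB"))  # 4294967296
```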

SHOW gpu_status

Display current GPU configuration and status.

SHOW gpu_status;

Returns:

   setting    |    value
--------------+-------------
 enabled      | true
 available    | true
 device_count | 2
 memory_limit | 4GB

Example:

SHOW gpu_status;

SHOW gpu_devices

List all available GPU devices with their specifications.

SHOW gpu_devices;

Returns:

 device_id |        name         | total_memory | free_memory | compute_capability
-----------+---------------------+--------------+-------------+-------------------
 0         | NVIDIA Tesla T4     | 15360 MB     | 14200 MB    | 7.5
 1         | NVIDIA Tesla V100   | 32768 MB     | 31000 MB    | 7.0

Example:

SHOW gpu_devices;

Multi-Region Management

SHOW REGIONS

Display all regions in the multi-region deployment.

SHOW REGIONS;

Returns:

 region_id  | datacenter |   role    | is_healthy | lag_ms
------------+------------+-----------+------------+--------
 us-east-1  | virginia   | primary   | true       | 0
 us-west-2  | oregon     | secondary | true       | 150
 eu-west-1  | ireland    | secondary | true       | 180

Example:

SHOW REGIONS;
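Applications often pick a preferred region from this output, for example the healthy region with the lowest replication lag. The sketch below shows assumed client-side logic (not a HeliosDB feature) over rows shaped like the SHOW REGIONS result:

```python
# Illustrative client-side sketch: choose the healthy region with the
# lowest replication lag, suitable for passing to SET preferred_region.
def pick_region(rows):
    healthy = [r for r in rows if r["is_healthy"]]
    return min(healthy, key=lambda r: r["lag_ms"])["region_id"]

regions = [
    {"region_id": "us-east-1", "is_healthy": True, "lag_ms": 0},
    {"region_id": "us-west-2", "is_healthy": True, "lag_ms": 150},
    {"region_id": "eu-west-1", "is_healthy": False, "lag_ms": 180},
]
print(pick_region(regions))  # us-east-1
```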

SHOW REGION STATUS

Display current multi-region configuration and status.

SHOW REGION STATUS;

Returns:

      setting       |    value
--------------------+--------------
 preferred_region   | us-west-2
 consistency_level  | eventual
 active_regions     | 3
 replication_mode   | async

Example:

SHOW REGION STATUS;

SET preferred_region

Set the preferred region for query routing.

SET preferred_region = 'us-west-2';
SET preferred_region = 'eu-west-1';

Parameters:

  • Region ID from the SHOW REGIONS output

Example:

-- Route queries to US West region
SET preferred_region = 'us-west-2';

-- Queries will now prefer the US West region
SELECT * FROM users WHERE country = 'USA';

SET consistency_level

Configure consistency level for multi-region operations.

SET consistency_level = 'eventual';
SET consistency_level = 'strong';
SET consistency_level = 'quorum';

Parameters:

  • eventual: Fastest; returns immediately (eventual consistency)
  • strong: Slowest; waits for all regions (strong consistency)
  • quorum: Balanced; waits for a majority of regions

Example:

-- Use strong consistency for financial transactions
SET consistency_level = 'strong';

BEGIN;
INSERT INTO transactions (account_id, amount) VALUES (12345, 100.00);
COMMIT;

-- Use eventual consistency for analytics
SET consistency_level = 'eventual';
SELECT COUNT(*) FROM page_views;
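The quorum rule described above follows the standard majority definition: with N active regions, a quorum write waits for acknowledgements from a strict majority. A minimal sketch:

```python
# Quorum size under the standard majority rule: with N active regions,
# a 'quorum' write waits for floor(N/2) + 1 acknowledgements.
def quorum_size(active_regions: int) -> int:
    return active_regions // 2 + 1

# With the 3 regions from SHOW REGIONS, quorum needs 2 acks;
# 'strong' waits for all 3, 'eventual' returns immediately.
print(quorum_size(3))  # 2
print(quorum_size(5))  # 3
```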

Autoscaling Management

SHOW AUTOSCALE STATUS

Display current autoscaling configuration and metrics.

SHOW AUTOSCALE STATUS;

Returns:

      setting       | value
--------------------+-------
 enabled            | true
 target_cu          | 100
 min_instances      | 1
 max_instances      | 10
 current_instances  | 3
 current_cu_usage   | 85

Example:

SHOW AUTOSCALE STATUS;

SET autoscale_enabled

Enable or disable autoscaling.

SET autoscale_enabled = true;
SET autoscale_enabled = false;

Parameters:

  • true: Enable autoscaling (the system scales based on load)
  • false: Disable autoscaling (maintain the current instance count)

Example:

-- Enable autoscaling
SET autoscale_enabled = true;

SET autoscale_target_cu

Set target compute unit (CU) utilization for autoscaling.

SET autoscale_target_cu = 100;
SET autoscale_target_cu = 200;

Parameters:

  • Target CU value (integer)

Example:

-- Scale to maintain 100 CU utilization
SET autoscale_target_cu = 100;

SET autoscale_min_instances

Set minimum number of compute instances.

SET autoscale_min_instances = 1;
SET autoscale_min_instances = 2;

Parameters:

  • Minimum instance count (must be ≤ max_instances)

Example:

-- Always keep at least 2 instances running
SET autoscale_min_instances = 2;

SET autoscale_max_instances

Set maximum number of compute instances.

SET autoscale_max_instances = 10;
SET autoscale_max_instances = 20;

Parameters:

  • Maximum instance count (must be ≥ min_instances)

Example:

-- Allow scaling up to 20 instances
SET autoscale_max_instances = 20;
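The exact scaling policy HeliosDB applies to these settings is not specified here. One common rule, shown below purely as a hypothetical sketch, sizes the fleet as the CU demand divided by a per-instance target, clamped to the configured bounds:

```python
import math

# Hypothetical scaling rule (the actual HeliosDB policy is not documented
# here): instances = ceil(demand / target CU per instance), clamped to
# the configured min/max instance bounds.
def desired_instances(cu_demand: float, target_cu: float,
                      min_instances: int, max_instances: int) -> int:
    raw = math.ceil(cu_demand / target_cu)
    return max(min_instances, min(max_instances, raw))

print(desired_instances(cu_demand=850, target_cu=100,
                        min_instances=2, max_instances=20))  # 9
```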

SHOW COMPUTE NODES

Display all compute nodes and their status.

SHOW COMPUTE NODES;

Returns:

 node_id | status  | cpu_usage | memory_usage | cu_allocation
---------+---------+-----------+--------------+---------------
 node-1  | active  | 45%       | 62%          | 25
 node-2  | active  | 78%       | 81%          | 35
 node-3  | active  | 52%       | 70%          | 25

Example:

SHOW COMPUTE NODES;

Change Data Capture (CDC)

CREATE CHANGE DATA CAPTURE

Create a CDC stream to capture changes from a table.

CREATE CHANGE DATA CAPTURE ON table_name
  { TO KAFKA 'broker_address' TOPIC 'topic_name'
  | TO KINESIS 'stream_name' REGION 'region_id' }
  [FORMAT JSON|AVRO]
  [AS 'stream_name'];

Parameters:

  • table_name: Table to capture changes from
  • Destination: KAFKA or KINESIS
      • KAFKA: requires a broker address and topic
      • KINESIS: requires a stream name and region
  • FORMAT: JSON (default) or AVRO
  • stream_name: Optional custom stream name

Examples:

-- Capture changes to Kafka with JSON format
CREATE CHANGE DATA CAPTURE ON users
  TO KAFKA 'localhost:9092' TOPIC 'users-changes'
  FORMAT JSON;

-- Capture changes to Kafka with custom stream name
CREATE CHANGE DATA CAPTURE ON orders
  TO KAFKA 'kafka.example.com:9092' TOPIC 'orders-cdc'
  FORMAT JSON
  AS 'orders_stream';

-- Capture changes to AWS Kinesis with Avro format
CREATE CHANGE DATA CAPTURE ON transactions
  TO KINESIS 'transactions-stream' REGION 'us-east-1'
  FORMAT AVRO;
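On the consuming side, JSON-format events can be decoded with any JSON library. The field names used below ("operation", "before", "after") are assumptions for illustration; the actual HeliosDB CDC payload schema is not specified in this document:

```python
import json

# Hypothetical consumer-side handler for a JSON-format CDC event.
# The payload field names are assumed for illustration only.
def handle_event(raw: str) -> str:
    event = json.loads(raw)
    op, table = event["operation"], event["table"]
    if op == "INSERT":
        return f"INSERT into {table}: {event['after']}"
    if op == "DELETE":
        return f"DELETE from {table}: {event['before']}"
    return f"{op} on {table}: {event['before']} -> {event['after']}"

sample = json.dumps({
    "table": "orders", "operation": "INSERT",
    "after": {"order_id": 42, "order_total": 99.5},
})
print(handle_event(sample))
```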

SHOW CDC STREAMS

List all active CDC streams.

SHOW CDC STREAMS;

Returns:

    stream_name    |  table_name  | format | status | events_processed |      created_at
-------------------+--------------+--------+--------+------------------+----------------------
 users_cdc_stream  | users        | json   | Active | 15234            | 2025-01-15T10:30:00Z
 orders_stream     | orders       | json   | Active | 8921             | 2025-01-15T11:00:00Z
 transactions_cdc  | transactions | avro   | Paused | 45123            | 2025-01-14T09:00:00Z

Example:

SHOW CDC STREAMS;

SHOW CDC STREAM STATUS

Display detailed status of a specific CDC stream.

SHOW CDC STREAM STATUS stream_name;

Returns:

      setting       |              value
--------------------+----------------------------------
 stream_name        | users_cdc_stream
 table_name         | users
 format             | json
 status             | Active
 destination        | Kafka: localhost:9092 -> users-changes
 events_processed   | 15234
 created_at         | 2025-01-15T10:30:00Z

Example:

SHOW CDC STREAM STATUS users_cdc_stream;

DROP CHANGE DATA CAPTURE

Remove a CDC stream.

DROP CHANGE DATA CAPTURE stream_name;

Example:

DROP CHANGE DATA CAPTURE users_cdc_stream;

ALTER CDC STREAM

Modify a CDC stream's configuration.

-- Pause a stream
ALTER CDC STREAM stream_name PAUSE;

-- Resume a stream
ALTER CDC STREAM stream_name RESUME;

-- Set event filter
ALTER CDC STREAM stream_name SET FILTER 'condition';

Examples:

-- Pause CDC stream during maintenance
ALTER CDC STREAM users_cdc_stream PAUSE;

-- Resume CDC stream
ALTER CDC STREAM users_cdc_stream RESUME;

-- Filter events (only capture UPDATEs and DELETEs)
ALTER CDC STREAM users_cdc_stream SET FILTER 'operation IN (UPDATE, DELETE)';

Usage Patterns

Pattern 1: GPU-Accelerated Analytics

-- Enable GPU for large aggregations
SET gpu_enabled = true;
SET gpu_memory_limit = '8GB';

-- Run analytics query (will use GPU)
SELECT
  product_category,
  COUNT(*) as total_orders,
  SUM(order_total) as total_revenue,
  AVG(order_total) as avg_order_value
FROM orders
WHERE order_date >= '2025-01-01'
GROUP BY product_category
ORDER BY total_revenue DESC;

-- Check GPU status
SHOW gpu_status;

Pattern 2: Multi-Region Deployment

-- Route to nearest region
SET preferred_region = 'eu-west-1';

-- Use eventual consistency for read-heavy workload
SET consistency_level = 'eventual';

SELECT * FROM products WHERE category = 'electronics';

-- Use strong consistency for critical writes
SET consistency_level = 'strong';

UPDATE inventory SET quantity = quantity - 1 WHERE product_id = 12345;

Pattern 3: Autoscaling Configuration

-- Configure autoscaling for variable workload
SET autoscale_enabled = true;
SET autoscale_min_instances = 2;
SET autoscale_max_instances = 20;
SET autoscale_target_cu = 100;

-- Monitor scaling
SHOW AUTOSCALE STATUS;
SHOW COMPUTE NODES;

Pattern 4: CDC Stream Management

-- Create CDC stream for real-time analytics
CREATE CHANGE DATA CAPTURE ON user_events
  TO KAFKA 'kafka.example.com:9092' TOPIC 'user-events-stream'
  FORMAT JSON
  AS 'user_events_cdc';

-- Monitor CDC stream
SHOW CDC STREAMS;
SHOW CDC STREAM STATUS user_events_cdc;

-- Pause during maintenance
ALTER CDC STREAM user_events_cdc PAUSE;

-- Resume after maintenance
ALTER CDC STREAM user_events_cdc RESUME;

-- Clean up
DROP CHANGE DATA CAPTURE user_events_cdc;

Client Integration

Python (psycopg2)

import psycopg2

conn = psycopg2.connect(
    host="localhost",
    port=5432,
    database="heliosdb",
    user="admin"
)

cursor = conn.cursor()

# Enable GPU
cursor.execute("SET gpu_enabled = true;")

# Check status
cursor.execute("SHOW gpu_status;")
for row in cursor.fetchall():
    print(f"{row[0]}: {row[1]}")

# Create CDC stream
cursor.execute("""
    CREATE CHANGE DATA CAPTURE ON users
    TO KAFKA 'localhost:9092' TOPIC 'users-changes'
    FORMAT JSON;
""")

conn.commit()
cursor.close()
conn.close()

Java (JDBC)

import java.sql.*;

public class HeliosDBExample {
    public static void main(String[] args) throws SQLException {
        Connection conn = DriverManager.getConnection(
            "jdbc:postgresql://localhost:5432/heliosdb",
            "admin", "password"
        );

        Statement stmt = conn.createStatement();

        // Enable GPU
        stmt.execute("SET gpu_enabled = true");

        // Show regions
        ResultSet rs = stmt.executeQuery("SHOW REGIONS");
        while (rs.next()) {
            System.out.println(
                rs.getString("region_id") + ": " +
                rs.getString("role")
            );
        }

        stmt.close();
        conn.close();
    }
}

Go (pgx)

package main

import (
    "context"
    "fmt"
    "github.com/jackc/pgx/v5"
)

func main() {
    conn, err := pgx.Connect(context.Background(),
        "postgres://admin@localhost:5432/heliosdb")
    if err != nil {
        panic(err)
    }
    defer conn.Close(context.Background())

    // Enable autoscaling
    _, err = conn.Exec(context.Background(),
        "SET autoscale_enabled = true")
    if err != nil {
        panic(err)
    }

    // Show status
    rows, err := conn.Query(context.Background(),
        "SHOW AUTOSCALE STATUS")
    if err != nil {
        panic(err)
    }
    defer rows.Close()

    for rows.Next() {
        var setting, value string
        err = rows.Scan(&setting, &value)
        if err != nil {
            panic(err)
        }
        fmt.Printf("%s: %s\n", setting, value)
    }
}

Error Handling

All commands return standard PostgreSQL error codes:

  • 42601: Syntax error in SQL command
  • 22023: Invalid parameter value
  • 42704: Resource not found (e.g., CDC stream doesn't exist)
  • 23505: Resource already exists (e.g., CDC stream name conflict)
  • XX000: Internal error

Example:

-- This will return error 42704 (not found)
SHOW CDC STREAM STATUS nonexistent_stream;
-- ERROR:  CDC stream 'nonexistent_stream' not found

-- This will return error 23505 (duplicate)
CREATE CHANGE DATA CAPTURE ON users TO KAFKA 'localhost:9092' TOPIC 'users';
CREATE CHANGE DATA CAPTURE ON users TO KAFKA 'localhost:9092' TOPIC 'users';
-- ERROR:  CDC stream 'users_cdc_stream' already exists
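Because these are standard SQLSTATE codes, clients can dispatch on them directly. With psycopg2, for example, the code is exposed on the exception's .pgcode attribute; the sketch below keeps the dispatch table standalone (the recovery messages are illustrative, not official guidance):

```python
# Sketch of client-side dispatch on the SQLSTATE codes listed above.
# With psycopg2, read the code from the exception's .pgcode attribute.
SQLSTATE_HANDLERS = {
    "42601": "syntax error -- check the command against this reference",
    "22023": "invalid parameter value -- check units and ranges",
    "42704": "resource not found -- verify the name with SHOW CDC STREAMS",
    "23505": "resource already exists -- pick a different stream name",
    "XX000": "internal error -- retry, then contact support",
}

def describe(pgcode: str) -> str:
    return SQLSTATE_HANDLERS.get(pgcode, f"unhandled SQLSTATE {pgcode}")

print(describe("42704"))
```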

Performance Considerations

GPU Commands

  • GPU operations have ~5ms initialization overhead
  • Best for queries processing >100MB of data
  • Check SHOW gpu_status to verify GPU availability

Multi-Region Commands

  • SET consistency_level affects write latency:
      • eventual: <5ms
      • quorum: 50-150ms
      • strong: 100-300ms
  • Use SHOW REGIONS to check replication lag

Autoscaling Commands

  • Configuration changes take 30-60 seconds to propagate
  • Scaling operations take 1-3 minutes per instance

CDC Commands

  • CDC has <1% performance impact on writes
  • Events are delivered with at-least-once semantics
  • Check SHOW CDC STREAMS for processing lag
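Because delivery is at-least-once, consumers may see the same event more than once and should process idempotently. A minimal deduplication sketch, assuming each event carries a unique "event_id" field (an assumption for illustration):

```python
# Idempotent-consumer sketch for at-least-once delivery: skip any event
# whose (assumed) unique event_id has already been processed.
class DedupingConsumer:
    def __init__(self):
        self.seen = set()
        self.processed = []

    def handle(self, event: dict) -> bool:
        if event["event_id"] in self.seen:
            return False  # duplicate delivery, skip
        self.seen.add(event["event_id"])
        self.processed.append(event)
        return True

c = DedupingConsumer()
print(c.handle({"event_id": 1, "op": "INSERT"}))  # True
print(c.handle({"event_id": 1, "op": "INSERT"}))  # False (duplicate)
```

In production the seen-ID set would be bounded (e.g. a TTL cache) or replaced by an idempotent write to the sink.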

See Also