Enhanced EXPLAIN PLAN User Guide¶

Version: 7.0 Last Updated: 2025-11-17 Status: Phase 1 Complete

Table of Contents¶

Introduction
Quick Start
EXPLAIN Basics
Configuration Tracking
Feature Detection
Optimizer Decisions
Why Not Analysis
Output Formats
Query Optimization Workflow
Best Practices
Troubleshooting
Advanced Topics

Introduction¶

HeliosDB's Enhanced EXPLAIN PLAN goes far beyond traditional query plan visualization. It provides:

Configuration Snapshots: Capture 45+ configuration parameters affecting query planning
Feature Detection: Automatically detect 8+ active optimization features (partition pruning, SIMD, etc.)
Optimizer Decision Tracking: See why the optimizer chose specific strategies with cost comparisons
"Why Not" Analysis: Understand why certain optimizations weren't applied
Query Rewrite Suggestions: Get actionable recommendations to improve query performance
Multiple Output Formats: Text, JSON, YAML, XML, HTML, GraphViz, Mermaid

Key Benefits¶

Deep Visibility: Understand exactly what your queries are doing
Performance Tuning: Identify bottlenecks and optimization opportunities
Debugging: Diagnose unexpected query plans
Education: Learn how the query optimizer works
Production Monitoring: Track query plan changes over time

Quick Start¶

Basic EXPLAIN¶

-- Simple EXPLAIN
EXPLAIN SELECT * FROM users WHERE customer_id = 123;

-- Output:
EXPLAIN PLAN (Enhanced)
Estimated Cost: 50.00
Estimated Rows: 1
Planning Time: 2.34ms

Active Features (2):
  ✓ Predicate Pushdown (IMPLICIT)
    Trigger: 1 predicates pushed to storage layer
    Benefit: Filters applied early, reduces data transfer

  ✓ Index Scan on table 1 (IMPLICIT)
    Trigger: Highly selective predicate on indexed column (0.1% selectivity)
    Benefit: 100x faster than sequential scan
    Savings: 99.0% (10000.00 -> 100.00)

Execution Plan:
-> IndexScan(table_id=1, predicates=1)

EXPLAIN ANALYZE with "Why Not"¶

EXPLAIN ANALYZE (WHY_NOT)
SELECT * FROM orders
WHERE customer_id = 123;

Different Output Formats¶

-- JSON format for tools
EXPLAIN (FORMAT JSON) SELECT * FROM users WHERE id = 1;

-- HTML for web viewing
EXPLAIN (FORMAT HTML) SELECT * FROM orders WHERE date > '2024-01-01';

-- GraphViz for diagrams
EXPLAIN (FORMAT GRAPHVIZ) SELECT * FROM complex_join_query;

-- Mermaid for Markdown
EXPLAIN (FORMAT MERMAID) SELECT * FROM sales GROUP BY region;

EXPLAIN Basics¶

Syntax¶

EXPLAIN [ ANALYZE ] [ ( option [, ...] ) ] statement

Options:
  FORMAT { TEXT | JSON | YAML | XML | HTML | GRAPHVIZ | MERMAID }
  WHY_NOT { TRUE | FALSE }
  VERBOSE { TRUE | FALSE }
  COSTS { TRUE | FALSE }
  BUFFERS { TRUE | FALSE }
  TIMING { TRUE | FALSE }

Understanding Output¶

Estimated Cost¶

The cost is an arbitrary unit representing execution time and resource usage:

Sequential Scan: High cost for large tables (1.0 per page)
Index Scan: Lower cost for selective queries (0.005 per tuple)
Hash Join: Depends on hash table size and probe complexity
Sort: Depends on data volume and available memory

Rule of Thumb: - Cost < 100: Very fast - Cost 100-1000: Fast - Cost 1000-10000: Moderate - Cost > 10000: Slow (consider optimization)

Estimated Rows¶

Number of rows the optimizer expects this operation to return.

Important: If actual rows differ significantly from estimated rows, your statistics may be stale!

-- Update statistics
ANALYZE users;

Configuration Tracking¶

Every EXPLAIN captures 45+ configuration parameters affecting the plan:

Memory Settings¶

work_mem: 256MB
  - Controls memory for sorts and hash tables
  - Increase if seeing "disk-based sort" warnings

shared_buffers: 1024MB
  - Shared memory cache for data pages
  - Typically 25% of system RAM

effective_cache_size: 4096MB
  - Estimate of OS cache available
  - Used for cost calculations

Join Strategy Toggles¶

enable_hashjoin: true
  - Hash joins for equality conditions
  - Fast for medium-sized tables

enable_mergejoin: true
  - Merge joins for sorted data
  - Efficient for pre-sorted inputs

enable_nestloop: true
  - Nested loop joins for small tables
  - Good when one side is very selective

Scan Strategy Toggles¶

enable_indexscan: true
  - Use indexes when beneficial

enable_seqscan: true
  - Sequential scans for large portions of table

enable_bitmapscan: true
  - Bitmap scans for multiple index conditions

Parallelism Settings¶

max_parallel_workers: 8
  - Maximum parallel workers across all queries

max_parallel_workers_per_gather: 4
  - Maximum parallel workers for single operation

min_parallel_table_scan_size_mb: 8
  - Minimum table size for parallelism

Cost Parameters¶

seq_page_cost: 1.0
  - Cost to read one page sequentially

random_page_cost: 4.0
  - Cost to read one page randomly
  - Higher = optimizer prefers sequential scans
  - Lower = optimizer prefers index scans

cpu_tuple_cost: 0.01
  - Cost to process one tuple

cpu_operator_cost: 0.0025
  - Cost to execute one operator/function

Advanced Features¶

enable_partition_pruning: true
  - Skip irrelevant partitions based on predicates

enable_simd: true
  - SIMD vectorization for aggregations

enable_jit: false
  - JIT compilation for complex queries (experimental)

Feature Detection¶

The optimizer automatically detects and reports active features:

Partition Pruning¶

SELECT * FROM sales
WHERE sale_date BETWEEN '2024-01-01' AND '2024-01-31';

-- Feature Detected:
✓ Partition Pruning (IMPLICIT)
  Trigger: Range predicates on partitioned table (1 predicates)
  Benefit: Skipped 11 of 12 partitions (91.7% I/O savings)
  Savings: 91.7% (12000.00 -> 1000.00)
  Confidence: 70%

When it Activates: - Range predicates on partition key - Equality predicates matching partition boundaries - IN clauses with partition key values

How to Maximize: - Partition on frequently filtered columns (date, region, category) - Use partition-aligned predicates - Keep partition count reasonable (< 1000)

Predicate Pushdown¶

SELECT * FROM orders
WHERE customer_id = 123 AND status = 'active';

-- Feature Detected:
✓ Predicate Pushdown (IMPLICIT)
  Trigger: 2 predicates pushed to storage layer
  Benefit: Filters applied early, reduces data transfer
  Confidence: 95%

When it Activates: - WHERE clause predicates on base tables - JOIN conditions pushed to table scans - Subquery filters pushed down

Performance Impact: - 10-100x reduction in data transfer - Lower memory usage - Faster query execution

SIMD Vectorization¶

SELECT region, SUM(sales), AVG(price)
FROM sales
GROUP BY region;

-- Feature Detected:
✓ SIMD Vectorization (IMPLICIT)
  Trigger: Numeric aggregation operations (SUM/AVG/COUNT)
  Benefit: 4x speedup on aggregation operations using AVX2/AVX512
  Savings: 75.0% (1000.00 -> 250.00)
  Confidence: 90%

When it Activates: - SUM, AVG, COUNT aggregations - Numeric data types (INT, BIGINT, FLOAT, DOUBLE) - SIMD-capable hardware (AVX2/AVX512)

Performance Impact: - 4-8x speedup on supported operations - Automatic - no query changes needed

Index Usage¶

SELECT * FROM users WHERE email = 'test@example.com';

-- Feature Detected:
✓ Index Scan on table users (IMPLICIT)
  Trigger: Highly selective predicate on indexed column (0.1% selectivity)
  Benefit: 100x faster than sequential scan
  Savings: 99.0% (10000.00 -> 100.00)
  Confidence: 95%

Types of Index Scans: - Index Scan: Read index + fetch table rows - Index-Only Scan: Read index only (covering index) - Bitmap Index Scan: Multiple index conditions combined

Parallelism¶

SELECT * FROM large_table WHERE category = 'electronics';

-- Feature Detected:
✓ Parallel Sequential Scan (IMPLICIT)
  Trigger: Table size (50GB) exceeds min_parallel_table_scan_size
  Benefit: 4x speedup with 4 parallel workers
  Savings: 75.0% (20000.00 -> 5000.00)
  Workers: 4
  Confidence: 85%

When it Activates: - Table size > min_parallel_table_scan_size_mb - Query cost > parallel_setup_cost - Sufficient parallel workers available

Optimizer Decisions¶

See exactly why the optimizer chose specific strategies:

Join Method Selection¶

SELECT * FROM orders o
JOIN customers c ON o.customer_id = c.id;

-- Optimizer Decision:
1. Join Strategy
   CHOSEN: Hash Join (cost: 250.00, rows: 10000)
   Reasoning: Medium-sized tables with equality condition.
              Hash join is optimal.

   REJECTED: Nested Loop Join (cost: 5000.00, 20.0x more expensive)
   Reason: One table too large for nested loop

   REJECTED: Merge Join (cost: 800.00, 3.2x more expensive)
   Reason: Neither input is sorted on join key

Decision Factors: - Table sizes - Join condition types (equality vs range) - Available memory (work_mem) - Pre-sorted data - Index availability

Index Selection¶

SELECT * FROM users
WHERE email = 'test@example.com' AND status = 'active';

-- Optimizer Decision:
1. Index Selection for users table
   CHOSEN: idx_users_email (cost: 50.00, rows: 1)
   Reasoning: Email is highly selective (0.1%).
              Index lookup is 100x faster than seq scan.

   REJECTED: idx_users_status (cost: 2000.00, 40.0x more expensive)
   Reason: Status has low selectivity (50%). Sequential scan cheaper.

   REJECTED: Sequential Scan (cost: 5000.00, 100.0x more expensive)
   Reason: Table is large (1M rows) and predicate is highly selective

Scan Method Selection¶

SELECT * FROM products WHERE category = 'electronics';

-- Optimizer Decision:
1. Scan Method for products
   CHOSEN: Sequential Scan (cost: 1000.00, rows: 50000)
   Reasoning: Low selectivity (50% of rows match).
              Sequential scan more efficient.

   REJECTED: Index Scan on idx_category (cost: 3000.00, 3.0x more expensive)
   Reason: High number of matching rows causes excessive random I/O

Selectivity Thresholds: - < 1%: Index scan strongly preferred - 1-5%: Index scan usually preferred - 5-20%: Depends on index quality and random_page_cost - > 20%: Sequential scan usually preferred

Why Not Analysis¶

Understand why certain optimizations weren't applied:

Index Not Used¶

EXPLAIN ANALYZE (WHY_NOT)
SELECT * FROM orders WHERE customer_id = 123;

-- Why Not Analysis:
INDEX USAGE ANALYSIS (2 indexes)
────────────────────────────────────────────────────────

1. Index: idx_orders_date
   Table: orders | Type: BTree | Severity: Low
   Reason: Column 'order_date' not in WHERE clause
    Suggestion: Not applicable to this query
   💰 Cost Impact: Would add cost (unnecessary index scan)

2. Index: idx_orders_customer_date (covering index)
   Table: orders | Type: BTree | Severity: High
   Reason: Statistics outdated (analyzed 45 days ago)
    Suggestion: Run ANALYZE orders; to update statistics
              This covering index could eliminate table lookups.
              Estimated cost reduction: 60.0% (50 -> 20)
   💰 Cost Impact: 2.5x slower without fresh stats (20.00 vs 50.00)

Statistics Warnings¶

STATISTICS WARNINGS (1 tables)
────────────────────────────────────────────────────────

1. Table: orders
   Last Updated: 2024-09-15 | Severity: High
   Freshness: 45 days old (35.0% of data modified since)
   Impact: Stale statistics cause suboptimal query plans
    Action: ANALYZE orders;

Statistics Freshness Guidelines: - < 7 days: Fresh (Severity: Low) - 7-14 days: Aging (Severity: Low) - 14-30 days: Stale (Severity: Medium) - > 30 days: Critical (Severity: High)

Impact of Stale Statistics: - Incorrect cardinality estimates - Wrong join order - Suboptimal index selection - Parallel execution disabled

Cardinality Mismatches¶

CARDINALITY MISMATCHES (1 detected)
────────────────────────────────────────────────────────

1. Operation: Hash Join
   Estimated: 1000 rows | Actual: 100000 rows | Severity: Critical
   Error Rate: 9900.0% (100.0x underestimate)
   Likely Cause: Outdated statistics on join columns or correlated predicates
    Fix: Run ANALYZE on joined tables
          Consider multi-column statistics for correlated columns

Common Causes: - Outdated statistics - Correlated columns (not modeled) - Skewed data distributions - Complex predicates - Missing histograms

Optimization Suggestions¶

OPTIMIZATION SUGGESTIONS (5 recommendations)
────────────────────────────────────────────────────────

1. Create covering index for GROUP BY
   Category: Indexing | Priority: Medium | Effort: Medium
   Description: A covering index on GROUP BY columns with INCLUDE
                for aggregate columns could enable index-only scan.
    Expected Benefit: 81.0% improvement (1050.00 -> 200.00)
   Implementation:
     • Identify aggregate and group-by columns
     • CREATE INDEX with INCLUDE clause for aggregate columns
     • Test with EXPLAIN ANALYZE

2. Rewrite OR conditions as UNION
   Category: QueryRewrite | Priority: High | Effort: Medium
   Description: OR conditions prevent index usage. Rewriting as UNION ALL
                allows each branch to use appropriate indexes.
    Expected Benefit: 90.0% improvement (5000.00 -> 500.00)
   Implementation:
     • Identify OR conditions in WHERE clause
     • Rewrite: SELECT * FROM t WHERE a=1 OR b=2
     • As: SELECT * FROM t WHERE a=1 UNION ALL SELECT * FROM t WHERE b=2
     • Add DISTINCT if needed to remove duplicates

Output Formats¶

TEXT (Default)¶

Human-readable, terminal-friendly output.

EXPLAIN SELECT * FROM users WHERE id = 1;

Best For: Quick debugging, terminal use

JSON¶

Machine-readable, structured data for tools.

EXPLAIN (FORMAT JSON) SELECT * FROM users WHERE id = 1;

{
  "plan": {
    "node_type": "IndexScan",
    "table_id": "users",
    "estimated_cost": 50.0,
    "estimated_rows": 1
  },
  "active_features": {
    "implicit": [
      {
        "name": "Index Scan",
        "category": "Indexing",
        "savings": {
          "percent_reduction": 99.0
        }
      }
    ]
  }
}

Best For: Tool integration, automation, logging

YAML¶

Human-friendly structured format.

EXPLAIN (FORMAT YAML) SELECT * FROM users WHERE id = 1;

plan:
  estimated_cost: 50.0
  estimated_rows: 1
  planning_time_ms: 2.34

active_features:
  total: 2
  implicit: 2
  explicit: 0

Best For: Configuration files, documentation

XML¶

Enterprise tool integration.

EXPLAIN (FORMAT XML) SELECT * FROM users WHERE id = 1;

<?xml version="1.0" encoding="UTF-8"?>
<explain-plan version="7.0">
  <summary>
    <estimated-cost>50.0</estimated-cost>
    <estimated-rows>1</estimated-rows>
  </summary>
  <active-features>
    <feature type="implicit" category="Indexing">
      <name>Index Scan</name>
      <confidence>0.95</confidence>
    </feature>
  </active-features>
</explain-plan>

Best For: Enterprise ETL tools, SOAP APIs, XML parsers

HTML¶

Web-based visualization with CSS styling.

EXPLAIN (FORMAT HTML) SELECT * FROM users WHERE id = 1;

Generates complete HTML page with: - Responsive design - Color-coded nodes - Interactive elements - Print-friendly layout

Best For: Dashboards, reports, presentations

GraphViz (DOT)¶

Generate diagrams with GraphViz.

EXPLAIN (FORMAT GRAPHVIZ) SELECT * FROM users u
JOIN orders o ON u.id = o.customer_id;

digraph ExplainPlan {
  rankdir=TB;
  node [shape=box, style=rounded];

  node_0 [label="Hash Join\nCost: 250.00", fillcolor="#ffccbc", style=filled];
  node_1 [label="TableScan\nTable: users", fillcolor="#bbdefb", style=filled];
  node_2 [label="IndexScan\nTable: orders", fillcolor="#bbdefb", style=filled];

  node_0 -> node_1 [label="left"];
  node_0 -> node_2 [label="right"];
}

Render with: dot -Tpng explain.dot -o plan.png

Best For: Architecture diagrams, documentation, presentations

Mermaid¶

Markdown-based diagrams.

EXPLAIN (FORMAT MERMAID) SELECT * FROM users u
JOIN orders o ON u.id = o.customer_id;

graph TB
  summary["EXPLAIN PLAN<br/>Cost: 250.00<br/>Rows: 10000"]
  summary:::summaryStyle

  node0["Hash Join"]
  node0:::joinStyle

  node1["TableScan<br/>Table: users"]
  node1:::scanStyle

  node2["IndexScan<br/>Table: orders"]
  node2:::scanStyle

  summary --> node0
  node0 -->|left| node1
  node0 -->|right| node2

  classDef summaryStyle fill:#e8f5e9,stroke:#4caf50
  classDef scanStyle fill:#bbdefb,stroke:#2196f3
  classDef joinStyle fill:#ffccbc,stroke:#ff5722

Best For: GitHub/GitLab documentation, Markdown files

Query Optimization Workflow¶

Step 1: Identify Slow Queries¶

-- Find slow queries in production
SELECT query, total_time, calls, mean_time
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;

Step 2: EXPLAIN the Query¶

EXPLAIN ANALYZE (WHY_NOT, FORMAT TEXT)
<your slow query>;

Step 3: Check Statistics¶

STATISTICS WARNINGS:
  Table: orders - Last analyzed 45 days ago
  Action: ANALYZE orders;

Fix: Run ANALYZE

ANALYZE orders;

Step 4: Review Active Features¶

Active Features (1):
  ✓ Sequential Scan (IMPLICIT)
    Should use index instead!

Fix: Check Why Not Analysis

Step 5: Apply Suggestions¶

OPTIMIZATION SUGGESTIONS:
  1. Create index on customer_id
     CREATE INDEX idx_orders_customer ON orders(customer_id);

Fix: Create recommended index

CREATE INDEX idx_orders_customer ON orders(customer_id);
ANALYZE orders;  -- Update statistics

Step 6: Verify Improvement¶

EXPLAIN ANALYZE
<your query>;

Expected: Lower cost, index scan used

Best Practices¶

1. Regular Statistics Updates¶

-- Automated statistics collection
CREATE EXTENSION pg_cron;

SELECT cron.schedule('nightly-analyze', '0 2 * * *', $$
  ANALYZE VERBOSE;
$$);

2. Monitor Query Plans¶

-- Track plan changes over time
CREATE TABLE query_plan_history (
  query_id TEXT,
  plan_hash TEXT,
  estimated_cost NUMERIC,
  captured_at TIMESTAMP DEFAULT NOW()
);

-- Log explain output
INSERT INTO query_plan_history (query_id, plan_hash, estimated_cost)
SELECT
  md5(query),
  md5(explain_plan),
  (explain_json->>'estimated_cost')::numeric
FROM pg_stat_statements;

3. Index Maintenance¶

-- Find unused indexes
SELECT schemaname, tablename, indexname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexname NOT LIKE 'pg_toast%'
ORDER BY pg_relation_size(indexrelid) DESC;

-- Drop if truly unused
DROP INDEX IF EXISTS unused_index_name;

4. Configuration Tuning¶

-- For OLTP (many small queries)
SET work_mem = '64MB';
SET random_page_cost = 1.1;  -- SSD storage
SET effective_cache_size = '8GB';

-- For OLAP (complex analytical queries)
SET work_mem = '512MB';
SET max_parallel_workers_per_gather = 4;
SET enable_partitionwise_join = ON;

5. Query Patterns to Avoid¶

Avoid:

-- Function on indexed column
SELECT * FROM users WHERE LOWER(email) = 'test@example.com';

-- Leading wildcard
SELECT * FROM products WHERE name LIKE '%phone%';

-- OR conditions preventing index use
SELECT * FROM orders WHERE status = 'pending' OR status = 'processing';

Instead:

-- Use functional index or normalize data
SELECT * FROM users WHERE email = 'test@example.com';

-- Use full-text search
SELECT * FROM products WHERE to_tsvector(name) @@ to_tsquery('phone');

-- Rewrite as UNION
SELECT * FROM orders WHERE status = 'pending'
UNION ALL
SELECT * FROM orders WHERE status = 'processing';

Troubleshooting¶

Problem: Index Not Being Used¶

Symptoms: - EXPLAIN shows Sequential Scan - Query is slow despite having index

Diagnosis:

EXPLAIN ANALYZE (WHY_NOT)
SELECT * FROM users WHERE email = 'test@example.com';

Common Causes:

Stale Statistics
```
Fix: ANALYZE users;
```

Low Selectivity

Index used only if < 20% of rows match
Fix: Add more selective predicates

Data Type Mismatch

-- Wrong: id is INTEGER
SELECT * FROM users WHERE id = '123';

-- Right:
SELECT * FROM users WHERE id = 123;

Function on Column

-- Wrong:
SELECT * FROM users WHERE LOWER(email) = 'test@example.com';

-- Right: Create functional index
CREATE INDEX idx_users_email_lower ON users(LOWER(email));

Problem: Wrong Join Order¶

Symptoms: - Hash join on large tables first - Cardinality mismatches

Diagnosis:

EXPLAIN ANALYZE
SELECT * FROM a
JOIN b ON a.id = b.a_id
JOIN c ON b.id = c.b_id;

Fix:

-- Update statistics
ANALYZE a, b, c;

-- Force different join order (last resort)
SET join_collapse_limit = 1;
SELECT * FROM a
JOIN (b JOIN c ON b.id = c.b_id) ON a.id = b.a_id;

Problem: Parallel Query Not Activating¶

Symptoms: - Expected parallel workers: 4 - Actual parallel workers: 0

Common Causes:

Table Too Small

-- Check table size
SELECT pg_size_pretty(pg_relation_size('table_name'));

-- Adjust threshold
SET min_parallel_table_scan_size = '1MB';

Insufficient Workers

SHOW max_parallel_workers;
SHOW max_parallel_workers_per_gather;

-- Increase if needed
SET max_parallel_workers = 8;

Query Too Cheap

-- Parallel setup has overhead
SHOW parallel_setup_cost;  -- Default: 1000

-- Lower for testing (not recommended in production)
SET parallel_setup_cost = 100;

Advanced Topics¶

Custom Statistics¶

-- Create extended statistics for correlated columns
CREATE STATISTICS orders_customer_date_stats (dependencies)
ON customer_id, order_date FROM orders;

ANALYZE orders;

Query Hints (Experimental)¶

-- Force index usage (PostgreSQL-style hint)
SELECT /*+ IndexScan(users idx_users_email) */
* FROM users WHERE email = 'test@example.com';

-- Force join order
SELECT /*+ Leading((a b) c) */
* FROM a, b, c WHERE ...;

Plan Stability¶

-- Freeze a good plan
CREATE EXTENSION pg_plan_freeze;

SELECT pg_plan_freeze('slow_query_md5_hash', 'good_plan_hash');

Cost Model Tuning¶

-- For SSD storage
SET random_page_cost = 1.1;
SET seq_page_cost = 1.0;

-- For HDD storage
SET random_page_cost = 4.0;
SET seq_page_cost = 1.0;

-- For in-memory data
SET random_page_cost = 0.1;
SET seq_page_cost = 0.1;

Appendix A: Configuration Reference¶

Complete Parameter List¶

Parameter	Default	Range	Impact
work_mem	256MB	64KB-2GB	Sort/Hash memory
shared_buffers	1024MB	128MB-40% RAM	Page cache
effective_cache_size	4096MB	1GB-75% RAM	Cost calculations
random_page_cost	4.0	0.1-10.0	Index vs scan choice
cpu_tuple_cost	0.01	0.001-0.1	Processing cost
max_parallel_workers	8	0-32	Parallel queries

Feature Flags¶

Flag	Default	Description
enable_simd	true	SIMD vectorization
enable_hashjoin	true	Hash joins
enable_mergejoin	true	Merge joins
enable_nestloop	true	Nested loop joins
enable_indexscan	true	Index scans
enable_seqscan	true	Sequential scans
enable_partition_pruning	true	Partition pruning

Appendix B: Error Codes¶

Code	Message	Solution
E001	Statistics too old	Run ANALYZE
E002	No suitable index	Create index
E003	Cardinality mismatch	Update statistics, check correlations
E004	Out of memory (work_mem)	Increase work_mem
E005	Query too complex	Simplify or break into CTEs

Appendix C: Glossary¶

Cardinality: Number of rows in a result set
Selectivity: Fraction of rows matching a predicate (0.0 to 1.0)
Cost: Arbitrary unit representing execution time
Predicate: Filter condition (WHERE clause)
Covering Index: Index containing all needed columns
Sequential Scan: Read entire table sequentially
Index Scan: Read via index structure
Hash Join: Join using hash table
Merge Join: Join pre-sorted inputs
Nested Loop: Iterate outer, probe inner for each row

Appendix D: Further Reading¶

Document Version: 1.0 HeliosDB Version: 7.0 Feature Status: Phase 1 Complete (100%) Last Updated: 2025-11-17