Skip to content

Enhanced EXPLAIN PLAN User Guide

Version: 7.0 Last Updated: 2025-11-17 Status: Phase 1 Complete

Table of Contents

  1. Introduction
  2. Quick Start
  3. EXPLAIN Basics
  4. Configuration Tracking
  5. Feature Detection
  6. Optimizer Decisions
  7. Why Not Analysis
  8. Output Formats
  9. Query Optimization Workflow
  10. Best Practices
  11. Troubleshooting
  12. Advanced Topics

Introduction

HeliosDB's Enhanced EXPLAIN PLAN goes far beyond traditional query plan visualization. It provides:

  • Configuration Snapshots: Capture 45+ configuration parameters affecting query planning
  • Feature Detection: Automatically detect 8+ active optimization features (partition pruning, SIMD, etc.)
  • Optimizer Decision Tracking: See why the optimizer chose specific strategies with cost comparisons
  • "Why Not" Analysis: Understand why certain optimizations weren't applied
  • Query Rewrite Suggestions: Get actionable recommendations to improve query performance
  • Multiple Output Formats: Text, JSON, YAML, XML, HTML, GraphViz, Mermaid

Key Benefits

  1. Deep Visibility: Understand exactly what your queries are doing
  2. Performance Tuning: Identify bottlenecks and optimization opportunities
  3. Debugging: Diagnose unexpected query plans
  4. Education: Learn how the query optimizer works
  5. Production Monitoring: Track query plan changes over time

Quick Start

Basic EXPLAIN

-- Simple EXPLAIN
EXPLAIN SELECT * FROM users WHERE customer_id = 123;

-- Output:
EXPLAIN PLAN (Enhanced)
Estimated Cost: 50.00
Estimated Rows: 1
Planning Time: 2.34ms

Active Features (2):
   Predicate Pushdown (IMPLICIT)
    Trigger: 1 predicates pushed to storage layer
    Benefit: Filters applied early, reduces data transfer

   Index Scan on table 1 (IMPLICIT)
    Trigger: Highly selective predicate on indexed column (0.1% selectivity)
    Benefit: 100x faster than sequential scan
    Savings: 99.0% (10000.00 -> 100.00)

Execution Plan:
-> IndexScan(table_id=1, predicates=1)

EXPLAIN ANALYZE with "Why Not"

EXPLAIN ANALYZE (WHY_NOT)
SELECT * FROM orders
WHERE customer_id = 123;

Different Output Formats

-- JSON format for tools
EXPLAIN (FORMAT JSON) SELECT * FROM users WHERE id = 1;

-- HTML for web viewing
EXPLAIN (FORMAT HTML) SELECT * FROM orders WHERE date > '2024-01-01';

-- GraphViz for diagrams
EXPLAIN (FORMAT GRAPHVIZ) SELECT * FROM complex_join_query;

-- Mermaid for Markdown
EXPLAIN (FORMAT MERMAID) SELECT * FROM sales GROUP BY region;

EXPLAIN Basics

Syntax

EXPLAIN [ ANALYZE ] [ ( option [, ...] ) ] statement

Options:
  FORMAT { TEXT | JSON | YAML | XML | HTML | GRAPHVIZ | MERMAID }
  WHY_NOT { TRUE | FALSE }
  VERBOSE { TRUE | FALSE }
  COSTS { TRUE | FALSE }
  BUFFERS { TRUE | FALSE }
  TIMING { TRUE | FALSE }

Understanding Output

Estimated Cost

The cost is an arbitrary unit representing execution time and resource usage:

  • Sequential Scan: High cost for large tables (1.0 per page)
  • Index Scan: Lower cost for selective queries (0.005 per tuple)
  • Hash Join: Depends on hash table size and probe complexity
  • Sort: Depends on data volume and available memory

Rule of Thumb: - Cost < 100: Very fast - Cost 100-1000: Fast - Cost 1000-10000: Moderate - Cost > 10000: Slow (consider optimization)

Estimated Rows

Number of rows the optimizer expects this operation to return.

Important: If actual rows differ significantly from estimated rows, your statistics may be stale!

-- Update statistics
ANALYZE users;

Configuration Tracking

Every EXPLAIN captures 45+ configuration parameters affecting the plan:

Memory Settings

work_mem: 256MB
  - Controls memory for sorts and hash tables
  - Increase if seeing "disk-based sort" warnings

shared_buffers: 1024MB
  - Shared memory cache for data pages
  - Typically 25% of system RAM

effective_cache_size: 4096MB
  - Estimate of OS cache available
  - Used for cost calculations

Join Strategy Toggles

enable_hashjoin: true
  - Hash joins for equality conditions
  - Fast for medium-sized tables

enable_mergejoin: true
  - Merge joins for sorted data
  - Efficient for pre-sorted inputs

enable_nestloop: true
  - Nested loop joins for small tables
  - Good when one side is very selective

Scan Strategy Toggles

enable_indexscan: true
  - Use indexes when beneficial

enable_seqscan: true
  - Sequential scans for large portions of table

enable_bitmapscan: true
  - Bitmap scans for multiple index conditions

Parallelism Settings

max_parallel_workers: 8
  - Maximum parallel workers across all queries

max_parallel_workers_per_gather: 4
  - Maximum parallel workers for single operation

min_parallel_table_scan_size_mb: 8
  - Minimum table size for parallelism

Cost Parameters

seq_page_cost: 1.0
  - Cost to read one page sequentially

random_page_cost: 4.0
  - Cost to read one page randomly
  - Higher = optimizer prefers sequential scans
  - Lower = optimizer prefers index scans

cpu_tuple_cost: 0.01
  - Cost to process one tuple

cpu_operator_cost: 0.0025
  - Cost to execute one operator/function

Advanced Features

enable_partition_pruning: true
  - Skip irrelevant partitions based on predicates

enable_simd: true
  - SIMD vectorization for aggregations

enable_jit: false
  - JIT compilation for complex queries (experimental)

Feature Detection

The optimizer automatically detects and reports active features:

Partition Pruning

SELECT * FROM sales
WHERE sale_date BETWEEN '2024-01-01' AND '2024-01-31';

-- Feature Detected:
 Partition Pruning (IMPLICIT)
  Trigger: Range predicates on partitioned table (1 predicates)
  Benefit: Skipped 11 of 12 partitions (91.7% I/O savings)
  Savings: 91.7% (12000.00 -> 1000.00)
  Confidence: 70%

When it Activates: - Range predicates on partition key - Equality predicates matching partition boundaries - IN clauses with partition key values

How to Maximize: - Partition on frequently filtered columns (date, region, category) - Use partition-aligned predicates - Keep partition count reasonable (< 1000)

Predicate Pushdown

SELECT * FROM orders
WHERE customer_id = 123 AND status = 'active';

-- Feature Detected:
 Predicate Pushdown (IMPLICIT)
  Trigger: 2 predicates pushed to storage layer
  Benefit: Filters applied early, reduces data transfer
  Confidence: 95%

When it Activates: - WHERE clause predicates on base tables - JOIN conditions pushed to table scans - Subquery filters pushed down

Performance Impact: - 10-100x reduction in data transfer - Lower memory usage - Faster query execution

SIMD Vectorization

SELECT region, SUM(sales), AVG(price)
FROM sales
GROUP BY region;

-- Feature Detected:
 SIMD Vectorization (IMPLICIT)
  Trigger: Numeric aggregation operations (SUM/AVG/COUNT)
  Benefit: 4x speedup on aggregation operations using AVX2/AVX512
  Savings: 75.0% (1000.00 -> 250.00)
  Confidence: 90%

When it Activates: - SUM, AVG, COUNT aggregations - Numeric data types (INT, BIGINT, FLOAT, DOUBLE) - SIMD-capable hardware (AVX2/AVX512)

Performance Impact: - 4-8x speedup on supported operations - Automatic - no query changes needed

Index Usage

SELECT * FROM users WHERE email = 'test@example.com';

-- Feature Detected:
 Index Scan on table users (IMPLICIT)
  Trigger: Highly selective predicate on indexed column (0.1% selectivity)
  Benefit: 100x faster than sequential scan
  Savings: 99.0% (10000.00 -> 100.00)
  Confidence: 95%

Types of Index Scans: - Index Scan: Read index + fetch table rows - Index-Only Scan: Read index only (covering index) - Bitmap Index Scan: Multiple index conditions combined

Parallelism

SELECT * FROM large_table WHERE category = 'electronics';

-- Feature Detected:
 Parallel Sequential Scan (IMPLICIT)
  Trigger: Table size (50GB) exceeds min_parallel_table_scan_size
  Benefit: 4x speedup with 4 parallel workers
  Savings: 75.0% (20000.00 -> 5000.00)
  Workers: 4
  Confidence: 85%

When it Activates: - Table size > min_parallel_table_scan_size_mb - Query cost > parallel_setup_cost - Sufficient parallel workers available


Optimizer Decisions

See exactly why the optimizer chose specific strategies:

Join Method Selection

SELECT * FROM orders o
JOIN customers c ON o.customer_id = c.id;

-- Optimizer Decision:
1. Join Strategy
   CHOSEN: Hash Join (cost: 250.00, rows: 10000)
   Reasoning: Medium-sized tables with equality condition.
              Hash join is optimal.

   REJECTED: Nested Loop Join (cost: 5000.00, 20.0x more expensive)
   Reason: One table too large for nested loop

   REJECTED: Merge Join (cost: 800.00, 3.2x more expensive)
   Reason: Neither input is sorted on join key

Decision Factors: - Table sizes - Join condition types (equality vs range) - Available memory (work_mem) - Pre-sorted data - Index availability

Index Selection

SELECT * FROM users
WHERE email = 'test@example.com' AND status = 'active';

-- Optimizer Decision:
1. Index Selection for users table
   CHOSEN: idx_users_email (cost: 50.00, rows: 1)
   Reasoning: Email is highly selective (0.1%).
              Index lookup is 100x faster than seq scan.

   REJECTED: idx_users_status (cost: 2000.00, 40.0x more expensive)
   Reason: Status has low selectivity (50%). Sequential scan cheaper.

   REJECTED: Sequential Scan (cost: 5000.00, 100.0x more expensive)
   Reason: Table is large (1M rows) and predicate is highly selective

Scan Method Selection

SELECT * FROM products WHERE category = 'electronics';

-- Optimizer Decision:
1. Scan Method for products
   CHOSEN: Sequential Scan (cost: 1000.00, rows: 50000)
   Reasoning: Low selectivity (50% of rows match).
              Sequential scan more efficient.

   REJECTED: Index Scan on idx_category (cost: 3000.00, 3.0x more expensive)
   Reason: High number of matching rows causes excessive random I/O

Selectivity Thresholds: - < 1%: Index scan strongly preferred - 1-5%: Index scan usually preferred - 5-20%: Depends on index quality and random_page_cost - > 20%: Sequential scan usually preferred


Why Not Analysis

Understand why certain optimizations weren't applied:

Index Not Used

EXPLAIN ANALYZE (WHY_NOT)
SELECT * FROM orders WHERE customer_id = 123;

-- Why Not Analysis:
INDEX USAGE ANALYSIS (2 indexes)
────────────────────────────────────────────────────────

1. Index: idx_orders_date
   Table: orders | Type: BTree | Severity: Low
   Reason: Column 'order_date' not in WHERE clause
    Suggestion: Not applicable to this query
   💰 Cost Impact: Would add cost (unnecessary index scan)

2. Index: idx_orders_customer_date (covering index)
   Table: orders | Type: BTree | Severity: High
   Reason: Statistics outdated (analyzed 45 days ago)
    Suggestion: Run ANALYZE orders; to update statistics
              This covering index could eliminate table lookups.
              Estimated cost reduction: 60.0% (50 -> 20)
   💰 Cost Impact: 2.5x slower without fresh stats (20.00 vs 50.00)

Statistics Warnings

STATISTICS WARNINGS (1 tables)
────────────────────────────────────────────────────────

1. Table: orders
   Last Updated: 2024-09-15 | Severity: High
   Freshness: 45 days old (35.0% of data modified since)
   Impact: Stale statistics cause suboptimal query plans
    Action: ANALYZE orders;

Statistics Freshness Guidelines: - < 7 days: Fresh (Severity: Low) - 7-14 days: Aging (Severity: Low) - 14-30 days: Stale (Severity: Medium) - > 30 days: Critical (Severity: High)

Impact of Stale Statistics: - Incorrect cardinality estimates - Wrong join order - Suboptimal index selection - Parallel execution disabled

Cardinality Mismatches

CARDINALITY MISMATCHES (1 detected)
────────────────────────────────────────────────────────

1. Operation: Hash Join
   Estimated: 1000 rows | Actual: 100000 rows | Severity: Critical
   Error Rate: 9900.0% (100.0x underestimate)
   Likely Cause: Outdated statistics on join columns or correlated predicates
    Fix: Run ANALYZE on joined tables
          Consider multi-column statistics for correlated columns

Common Causes: - Outdated statistics - Correlated columns (not modeled) - Skewed data distributions - Complex predicates - Missing histograms

Optimization Suggestions

OPTIMIZATION SUGGESTIONS (5 recommendations)
────────────────────────────────────────────────────────

1. Create covering index for GROUP BY
   Category: Indexing | Priority: Medium | Effort: Medium
   Description: A covering index on GROUP BY columns with INCLUDE
                for aggregate columns could enable index-only scan.
    Expected Benefit: 81.0% improvement (1050.00 -> 200.00)
   Implementation:
      Identify aggregate and group-by columns
      CREATE INDEX with INCLUDE clause for aggregate columns
      Test with EXPLAIN ANALYZE

2. Rewrite OR conditions as UNION
   Category: QueryRewrite | Priority: High | Effort: Medium
   Description: OR conditions prevent index usage. Rewriting as UNION ALL
                allows each branch to use appropriate indexes.
    Expected Benefit: 90.0% improvement (5000.00 -> 500.00)
   Implementation:
      Identify OR conditions in WHERE clause
      Rewrite: SELECT * FROM t WHERE a=1 OR b=2
      As: SELECT * FROM t WHERE a=1 UNION ALL SELECT * FROM t WHERE b=2
      Add DISTINCT if needed to remove duplicates

Output Formats

TEXT (Default)

Human-readable, terminal-friendly output.

EXPLAIN SELECT * FROM users WHERE id = 1;

Best For: Quick debugging, terminal use

JSON

Machine-readable, structured data for tools.

EXPLAIN (FORMAT JSON) SELECT * FROM users WHERE id = 1;
{
  "plan": {
    "node_type": "IndexScan",
    "table_id": "users",
    "estimated_cost": 50.0,
    "estimated_rows": 1
  },
  "active_features": {
    "implicit": [
      {
        "name": "Index Scan",
        "category": "Indexing",
        "savings": {
          "percent_reduction": 99.0
        }
      }
    ]
  }
}

Best For: Tool integration, automation, logging

YAML

Human-friendly structured format.

EXPLAIN (FORMAT YAML) SELECT * FROM users WHERE id = 1;
plan:
  estimated_cost: 50.0
  estimated_rows: 1
  planning_time_ms: 2.34

active_features:
  total: 2
  implicit: 2
  explicit: 0

Best For: Configuration files, documentation

XML

Enterprise tool integration.

EXPLAIN (FORMAT XML) SELECT * FROM users WHERE id = 1;
<?xml version="1.0" encoding="UTF-8"?>
<explain-plan version="7.0">
  <summary>
    <estimated-cost>50.0</estimated-cost>
    <estimated-rows>1</estimated-rows>
  </summary>
  <active-features>
    <feature type="implicit" category="Indexing">
      <name>Index Scan</name>
      <confidence>0.95</confidence>
    </feature>
  </active-features>
</explain-plan>

Best For: Enterprise ETL tools, SOAP APIs, XML parsers

HTML

Web-based visualization with CSS styling.

EXPLAIN (FORMAT HTML) SELECT * FROM users WHERE id = 1;

Generates complete HTML page with: - Responsive design - Color-coded nodes - Interactive elements - Print-friendly layout

Best For: Dashboards, reports, presentations

GraphViz (DOT)

Generate diagrams with GraphViz.

EXPLAIN (FORMAT GRAPHVIZ) SELECT * FROM users u
JOIN orders o ON u.id = o.customer_id;
digraph ExplainPlan {
  rankdir=TB;
  node [shape=box, style=rounded];

  node_0 [label="Hash Join\nCost: 250.00", fillcolor="#ffccbc", style=filled];
  node_1 [label="TableScan\nTable: users", fillcolor="#bbdefb", style=filled];
  node_2 [label="IndexScan\nTable: orders", fillcolor="#bbdefb", style=filled];

  node_0 -> node_1 [label="left"];
  node_0 -> node_2 [label="right"];
}

Render with: dot -Tpng explain.dot -o plan.png

Best For: Architecture diagrams, documentation, presentations

Mermaid

Markdown-based diagrams.

EXPLAIN (FORMAT MERMAID) SELECT * FROM users u
JOIN orders o ON u.id = o.customer_id;
graph TB
  summary["EXPLAIN PLAN<br/>Cost: 250.00<br/>Rows: 10000"]
  summary:::summaryStyle

  node0["Hash Join"]
  node0:::joinStyle

  node1["TableScan<br/>Table: users"]
  node1:::scanStyle

  node2["IndexScan<br/>Table: orders"]
  node2:::scanStyle

  summary --> node0
  node0 -->|left| node1
  node0 -->|right| node2

  classDef summaryStyle fill:#e8f5e9,stroke:#4caf50
  classDef scanStyle fill:#bbdefb,stroke:#2196f3
  classDef joinStyle fill:#ffccbc,stroke:#ff5722

Best For: GitHub/GitLab documentation, Markdown files


Query Optimization Workflow

Step 1: Identify Slow Queries

-- Find slow queries in production
SELECT query, total_time, calls, mean_time
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;

Step 2: EXPLAIN the Query

EXPLAIN ANALYZE (WHY_NOT, FORMAT TEXT)
<your slow query>;

Step 3: Check Statistics

STATISTICS WARNINGS:
  Table: orders - Last analyzed 45 days ago
  Action: ANALYZE orders;

Fix: Run ANALYZE

ANALYZE orders;

Step 4: Review Active Features

Active Features (1):
  ✓ Sequential Scan (IMPLICIT)
    Should use index instead!

Fix: Check Why Not Analysis

Step 5: Apply Suggestions

OPTIMIZATION SUGGESTIONS:
  1. Create index on customer_id
     CREATE INDEX idx_orders_customer ON orders(customer_id);

Fix: Create recommended index

CREATE INDEX idx_orders_customer ON orders(customer_id);
ANALYZE orders;  -- Update statistics

Step 6: Verify Improvement

EXPLAIN ANALYZE
<your query>;

Expected: Lower cost, index scan used


Best Practices

1. Regular Statistics Updates

-- Automated statistics collection
CREATE EXTENSION pg_cron;

SELECT cron.schedule('nightly-analyze', '0 2 * * *', $$
  ANALYZE VERBOSE;
$$);

2. Monitor Query Plans

-- Track plan changes over time
CREATE TABLE query_plan_history (
  query_id TEXT,
  plan_hash TEXT,
  estimated_cost NUMERIC,
  captured_at TIMESTAMP DEFAULT NOW()
);

-- Log explain output
INSERT INTO query_plan_history (query_id, plan_hash, estimated_cost)
SELECT
  md5(query),
  md5(explain_plan),
  (explain_json->>'estimated_cost')::numeric
FROM pg_stat_statements;

3. Index Maintenance

-- Find unused indexes
SELECT schemaname, tablename, indexname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexname NOT LIKE 'pg_toast%'
ORDER BY pg_relation_size(indexrelid) DESC;

-- Drop if truly unused
DROP INDEX IF EXISTS unused_index_name;

4. Configuration Tuning

-- For OLTP (many small queries)
SET work_mem = '64MB';
SET random_page_cost = 1.1;  -- SSD storage
SET effective_cache_size = '8GB';

-- For OLAP (complex analytical queries)
SET work_mem = '512MB';
SET max_parallel_workers_per_gather = 4;
SET enable_partitionwise_join = ON;

5. Query Patterns to Avoid

Avoid:

-- Function on indexed column
SELECT * FROM users WHERE LOWER(email) = 'test@example.com';

-- Leading wildcard
SELECT * FROM products WHERE name LIKE '%phone%';

-- OR conditions preventing index use
SELECT * FROM orders WHERE status = 'pending' OR status = 'processing';

Instead:

-- Use functional index or normalize data
SELECT * FROM users WHERE email = 'test@example.com';

-- Use full-text search
SELECT * FROM products WHERE to_tsvector(name) @@ to_tsquery('phone');

-- Rewrite as UNION
SELECT * FROM orders WHERE status = 'pending'
UNION ALL
SELECT * FROM orders WHERE status = 'processing';


Troubleshooting

Problem: Index Not Being Used

Symptoms: - EXPLAIN shows Sequential Scan - Query is slow despite having index

Diagnosis:

EXPLAIN ANALYZE (WHY_NOT)
SELECT * FROM users WHERE email = 'test@example.com';

Common Causes:

  1. Stale Statistics

    Fix: ANALYZE users;
    

  2. Low Selectivity

    Index used only if < 20% of rows match
    Fix: Add more selective predicates
    

  3. Data Type Mismatch

    -- Wrong: id is INTEGER
    SELECT * FROM users WHERE id = '123';
    
    -- Right:
    SELECT * FROM users WHERE id = 123;
    

  4. Function on Column

    -- Wrong:
    SELECT * FROM users WHERE LOWER(email) = 'test@example.com';
    
    -- Right: Create functional index
    CREATE INDEX idx_users_email_lower ON users(LOWER(email));
    

Problem: Wrong Join Order

Symptoms: - Hash join on large tables first - Cardinality mismatches

Diagnosis:

EXPLAIN ANALYZE
SELECT * FROM a
JOIN b ON a.id = b.a_id
JOIN c ON b.id = c.b_id;

Fix:

-- Update statistics
ANALYZE a, b, c;

-- Force different join order (last resort)
SET join_collapse_limit = 1;
SELECT * FROM a
JOIN (b JOIN c ON b.id = c.b_id) ON a.id = b.a_id;

Problem: Parallel Query Not Activating

Symptoms: - Expected parallel workers: 4 - Actual parallel workers: 0

Common Causes:

  1. Table Too Small

    -- Check table size
    SELECT pg_size_pretty(pg_relation_size('table_name'));
    
    -- Adjust threshold
    SET min_parallel_table_scan_size = '1MB';
    

  2. Insufficient Workers

    SHOW max_parallel_workers;
    SHOW max_parallel_workers_per_gather;
    
    -- Increase if needed
    SET max_parallel_workers = 8;
    

  3. Query Too Cheap

    -- Parallel setup has overhead
    SHOW parallel_setup_cost;  -- Default: 1000
    
    -- Lower for testing (not recommended in production)
    SET parallel_setup_cost = 100;
    


Advanced Topics

Custom Statistics

-- Create extended statistics for correlated columns
CREATE STATISTICS orders_customer_date_stats (dependencies)
ON customer_id, order_date FROM orders;

ANALYZE orders;

Query Hints (Experimental)

-- Force index usage (PostgreSQL-style hint)
SELECT /*+ IndexScan(users idx_users_email) */
* FROM users WHERE email = 'test@example.com';

-- Force join order
SELECT /*+ Leading((a b) c) */
* FROM a, b, c WHERE ...;

Plan Stability

-- Freeze a good plan
CREATE EXTENSION pg_plan_freeze;

SELECT pg_plan_freeze('slow_query_md5_hash', 'good_plan_hash');

Cost Model Tuning

-- For SSD storage
SET random_page_cost = 1.1;
SET seq_page_cost = 1.0;

-- For HDD storage
SET random_page_cost = 4.0;
SET seq_page_cost = 1.0;

-- For in-memory data
SET random_page_cost = 0.1;
SET seq_page_cost = 0.1;

Appendix A: Configuration Reference

Complete Parameter List

Parameter Default Range Impact
work_mem 256MB 64KB-2GB Sort/Hash memory
shared_buffers 1024MB 128MB-40% RAM Page cache
effective_cache_size 4096MB 1GB-75% RAM Cost calculations
random_page_cost 4.0 0.1-10.0 Index vs scan choice
cpu_tuple_cost 0.01 0.001-0.1 Processing cost
max_parallel_workers 8 0-32 Parallel queries

Feature Flags

Flag Default Description
enable_simd true SIMD vectorization
enable_hashjoin true Hash joins
enable_mergejoin true Merge joins
enable_nestloop true Nested loop joins
enable_indexscan true Index scans
enable_seqscan true Sequential scans
enable_partition_pruning true Partition pruning

Appendix B: Error Codes

Code Message Solution
E001 Statistics too old Run ANALYZE
E002 No suitable index Create index
E003 Cardinality mismatch Update statistics, check correlations
E004 Out of memory (work_mem) Increase work_mem
E005 Query too complex Simplify or break into CTEs

Appendix C: Glossary

  • Cardinality: Number of rows in a result set
  • Selectivity: Fraction of rows matching a predicate (0.0 to 1.0)
  • Cost: Arbitrary unit representing execution time
  • Predicate: Filter condition (WHERE clause)
  • Covering Index: Index containing all needed columns
  • Sequential Scan: Read entire table sequentially
  • Index Scan: Read via index structure
  • Hash Join: Join using hash table
  • Merge Join: Join pre-sorted inputs
  • Nested Loop: Iterate outer, probe inner for each row

Appendix D: Further Reading

  1. PostgreSQL Query Performance Tuning
  2. Use The Index, Luke!
  3. HeliosDB Architecture Documentation
  4. Query Optimizer Internals

Document Version: 1.0 HeliosDB Version: 7.0 Feature Status: Phase 1 Complete (100%) Last Updated: 2025-11-17