Skip to content

EXPLAIN PLAN Quick Start Guide

Feature: AI-Powered Query Explanation Version: HeliosDB v7.0 Status: Production Ready


What is EXPLAIN?

EXPLAIN shows you how HeliosDB will execute your query. Unlike traditional databases, HeliosDB's EXPLAIN uses AI to explain query plans in plain English, making optimization accessible to everyone.


Quick Examples

1. Basic EXPLAIN

EXPLAIN SELECT * FROM users WHERE age > 25;

Output:

→ Scan users [cost=200.00, rows=1000]
  Filter: age > 25

EXPLAIN (AI) SELECT * FROM users WHERE age > 25;

Output:

This query searches all users to find those older than 25.
It reads the entire table (2,550 rows) because no index exists on 'age'.

Performance: MODERATE (50ms)
Suggestion: CREATE INDEX idx_users_age ON users(age);
Expected improvement: 90% faster

3. Full Analysis

EXPLAIN (ANALYZE) SELECT * FROM users WHERE age > 25;

Output: Complete analysis including: - AI explanation - Why-Not analysis (why optimizations weren't applied) - Optimizer decisions - Suggested improvements - Configuration recommendations


EXPLAIN Modes

Mode When to Use Output
STANDARD Quick plan overview Basic tree structure
VERBOSE Detailed cost analysis Costs + decisions
AI Beginner-friendly Natural language + suggestions
ANALYZE Deep optimization Full analysis + Why-Not

Usage

-- Standard mode (default)
EXPLAIN SELECT ...;

-- Verbose mode
EXPLAIN (VERBOSE) SELECT ...;

-- AI mode (natural language)
EXPLAIN (AI) SELECT ...;

-- Full analysis
EXPLAIN (ANALYZE) SELECT ...;

Output Formats

Text Format (Default)

EXPLAIN SELECT * FROM users;

Human-readable text with structure.

JSON Format

EXPLAIN (FORMAT JSON) SELECT * FROM users;

Machine-readable for tools and scripts.

YAML Format

EXPLAIN (FORMAT YAML) SELECT * FROM users;

Configuration-friendly format.

Tree Format

EXPLAIN (FORMAT TREE) SELECT * FROM users;

Visual tree with ANSI colors.


Combining Options

-- AI mode with JSON output
EXPLAIN (AI, FORMAT JSON) SELECT * FROM users WHERE age > 25;

-- Full analysis with YAML output
EXPLAIN (ANALYZE, FORMAT YAML) SELECT * FROM users JOIN orders ON users.id = orders.user_id;

Understanding Output

Cost

cost=250.00
  • Estimated query execution cost
  • Lower is better
  • Relative metric (not time)

Interpretation: - < 100: Fast - 100-1000: Moderate - 1000-10000: Slow - > 10000: Very slow

Rows

rows=1000
  • Estimated number of rows
  • Helps understand data volume
  • Used for join ordering

Node Types

Node Meaning
Scan Read table data
Filter Apply WHERE conditions
Join Combine tables
Aggregate GROUP BY operations
Sort ORDER BY operations
Limit LIMIT/OFFSET

AI Explanations

Setting Up AI (Optional)

AI explanations work out-of-the-box with local models. For cloud LLMs:

Option 1: OpenAI

export HELIOSDB_LLM_PROVIDER=openai
export HELIOSDB_LLM_API_KEY=sk-...
export HELIOSDB_LLM_MODEL=gpt-4

Option 2: Anthropic Claude

export HELIOSDB_LLM_PROVIDER=anthropic
export HELIOSDB_LLM_API_KEY=sk-ant-...
export HELIOSDB_LLM_MODEL=claude-3-opus

Option 3: Local Ollama (Free, Private)

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Download model
ollama pull llama2

# Configure HeliosDB
export HELIOSDB_LLM_PROVIDER=ollama
export HELIOSDB_LLM_MODEL=llama2

AI Explanation Components

  1. Summary: High-level query description
  2. Walkthrough: Step-by-step execution
  3. Performance: Speed prediction
  4. Bottlenecks: What's slowing it down
  5. Suggestions: How to improve
  6. Warnings: Things to watch out for

Why-Not Analysis

Explains why optimizations weren't applied.

Example

EXPLAIN (ANALYZE) SELECT * FROM users WHERE email = 'test@example.com';

Why-Not Output:

WHY NOT ANALYSIS

Index 'idx_users_email' exists but wasn't used because:
- Query uses SELECT * (requires all columns)
- Index is not covering (doesn't include all columns)
- Estimated cost: Index scan (150.0) vs Seq scan (100.0)
- Seq scan is actually faster for this query

Suggestion: If this query runs frequently, create a covering index:
  CREATE INDEX idx_users_email_covering ON users(email)
    INCLUDE (id, name, created_at);

What Why-Not Checks

  1. Unused Indexes: Why indexes weren't used
  2. Stale Statistics: If data is outdated
  3. Configuration: If settings are suboptimal
  4. Cardinality: If row estimates are off

Common Scenarios

Scenario 1: Slow Query

EXPLAIN (AI) SELECT * FROM orders WHERE created_at > '2024-01-01';

AI Response:

Performance: SLOW (2.5s)

Bottleneck: Full table scan on 10 million rows

Suggestion: CREATE INDEX idx_orders_created_at ON orders(created_at);
Expected improvement: 95% faster (2.5s → 125ms)

Scenario 2: Complex Join

EXPLAIN (AI)
SELECT u.name, COUNT(o.id)
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE u.age > 25
GROUP BY u.name;

AI Response:

This query joins users with orders, filters by age, and counts orders per user.

Execution:
1. Scan users table, filter age > 25
2. Build hash table from filtered users
3. Scan orders, probe hash table
4. Group results and count

Performance: MODERATE (150ms)

Optimization: Create index on users(age) to speed up filtering
Expected improvement: 40% faster

Scenario 3: Missing Index

EXPLAIN (ANALYZE) SELECT * FROM products WHERE category = 'electronics';

Why-Not Output:

No index exists on 'category' column

Current approach: Sequential scan (cost: 1000.0)
With index: Index scan (cost: 50.0)
Potential speedup: 20x faster

Recommendation:
  CREATE INDEX idx_products_category ON products(category);


Best Practices

1. Use AI Mode for Learning

-- Start with AI to understand your queries
EXPLAIN (AI) SELECT ...;

2. Check Why-Not for Optimization

-- When query is slow, use ANALYZE
EXPLAIN (ANALYZE) SELECT ...;

3. Use JSON for Automation

-- Export to JSON for scripts
EXPLAIN (FORMAT JSON) SELECT ... > plan.json

4. Compare Before/After

-- Before optimization
EXPLAIN (AI) SELECT * FROM users WHERE age > 25;

-- Create index
CREATE INDEX idx_users_age ON users(age);

-- After optimization
EXPLAIN (AI) SELECT * FROM users WHERE age > 25;

Interpreting Suggestions

Index Suggestions

Suggestion: CREATE INDEX idx_users_age ON users(age);

When to create: - Query runs frequently (>100 times/day) - Table is large (>10,000 rows) - Filter is selective (<10% of rows)

When not to create: - One-time query - Small table - Filter matches most rows

Configuration Suggestions

Suggestion: Increase work_mem to 512MB

When to apply: - Memory is available - Joins/sorts are spilling to disk - Query is critical

When not to apply: - Limited memory - Many concurrent queries - Not a bottleneck


Troubleshooting

"LLM not available"

Solution: Install Ollama (free, local)

curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama2

"Cost seems wrong"

Solution: Run ANALYZE to update statistics

ANALYZE users;

"Suggestion doesn't help"

Solution: Use full analysis

EXPLAIN (ANALYZE, VERBOSE) SELECT ...;


Quick Reference

Syntax

EXPLAIN [ ( option [, ...] ) ] statement

Options:
  VERBOSE           Show detailed costs
  AI                Natural language explanation
  ANALYZE           Full analysis with Why-Not
  FORMAT { TEXT | JSON | YAML | TREE }

Examples

-- Basic
EXPLAIN SELECT * FROM users;

-- With options
EXPLAIN (VERBOSE) SELECT * FROM users;
EXPLAIN (AI) SELECT * FROM users;
EXPLAIN (ANALYZE) SELECT * FROM users;
EXPLAIN (AI, FORMAT JSON) SELECT * FROM users;

-- Complex query
EXPLAIN (ANALYZE)
SELECT u.name, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.age > 25
GROUP BY u.name
HAVING COUNT(o.id) > 5
ORDER BY order_count DESC
LIMIT 10;

Performance Tips

  1. Always EXPLAIN before running expensive queries
  2. Use ANALYZE mode to understand bottlenecks
  3. Follow AI suggestions (they're usually right)
  4. Create indexes on filtered/joined columns
  5. Run ANALYZE after large data changes
  6. Monitor query costs over time
  7. Use covering indexes for frequently accessed data

Learn More


Support


Last Updated: 2025-11-14 HeliosDB Version: 7.0 Feature Status: Production Ready