GraphRAG Quick Start¶

Overview¶

GraphRAG combines graph databases with Retrieval-Augmented Generation (RAG) so that you can query a knowledge graph in natural language, with an LLM reasoning over the graph's entities and relationships.

Key Concepts¶

What is GraphRAG?¶

Graph reasoning: Leverage relationships and connections in data
LLM integration: Use language models for semantic understanding
Hybrid querying: Combine structured graph queries with semantic search
Knowledge graphs: Build and query interconnected information

When to Use GraphRAG¶

Complex relationship discovery
Knowledge base systems
Semantic search with relationships
Entity resolution and linking
Graph-based recommendation systems

Quick Start¶

1. Enable GraphRAG¶

-- Create graph database
CREATE DATABASE knowledge_graph;

-- Enable GraphRAG features
SET graphrag_enabled = true;
SET graphrag_embedding_model = 'openai';

2. Create Graph Structures¶

-- Create nodes (entities)
CREATE TABLE entities (
  id SERIAL PRIMARY KEY,
  name VARCHAR(256),
  type VARCHAR(100),  -- Person, Place, Organization, etc.
  description TEXT,
  embedding vector(1536)  -- Vector embedding for semantic search
);

-- Create relationships (edges)
CREATE TABLE relationships (
  id SERIAL PRIMARY KEY,
  source_id INT REFERENCES entities(id),
  target_id INT REFERENCES entities(id),
  relationship_type VARCHAR(100),
  properties JSONB
);

3. Build the Knowledge Graph¶

-- Insert entities
INSERT INTO entities (name, type, description)
VALUES
  ('Alice', 'Person', 'Software Engineer'),
  ('Bob', 'Person', 'Product Manager'),
  ('Company X', 'Organization', 'Tech startup');

-- Insert relationships (ids 1-3 assume the entities above were the first rows inserted)
INSERT INTO relationships (source_id, target_id, relationship_type, properties)
VALUES
  (1, 3, 'works_at', '{"since": "2022-01-01"}'),
  (2, 3, 'manages', '{"teams": ["Engineering"]}'),
  (1, 2, 'reports_to', '{"direct_report": true}');

4. Query with Natural Language¶

-- Cypher-like graph queries
MATCH (p:Person)-[r:works_at]->(o:Organization)
RETURN p.name, r.relationship_type, o.name;

-- With a name filter on the person node
MATCH (p:Person {type: 'Person'})-[]->(org:Organization)
WHERE p.name LIKE '%Alice%'
RETURN p, org;
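
The first pattern above can also be read as a two-way join over the tables from step 2. The sketch below illustrates that reading with Python's built-in sqlite3 module standing in for the database. The SQLite schema is a simplified assumption (no vector column, JSON stored as text), and this is not a statement of how the GraphRAG engine executes MATCH internally.

```python
import sqlite3

# In-memory SQLite stands in for the real database (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entities (
  id INTEGER PRIMARY KEY,
  name TEXT,
  type TEXT,
  description TEXT
);
CREATE TABLE relationships (
  id INTEGER PRIMARY KEY,
  source_id INTEGER REFERENCES entities(id),
  target_id INTEGER REFERENCES entities(id),
  relationship_type TEXT,
  properties TEXT
);
INSERT INTO entities (id, name, type, description) VALUES
  (1, 'Alice', 'Person', 'Software Engineer'),
  (2, 'Bob', 'Person', 'Product Manager'),
  (3, 'Company X', 'Organization', 'Tech startup');
INSERT INTO relationships (source_id, target_id, relationship_type, properties) VALUES
  (1, 3, 'works_at', '{"since": "2022-01-01"}'),
  (2, 3, 'manages', '{"teams": ["Engineering"]}'),
  (1, 2, 'reports_to', '{"direct_report": true}');
""")

# Relational reading of: (p:Person)-[r:works_at]->(o:Organization)
works_at_rows = conn.execute("""
    SELECT p.name, r.relationship_type, o.name
    FROM relationships r
    JOIN entities p ON p.id = r.source_id
    JOIN entities o ON o.id = r.target_id
    WHERE p.type = 'Person'
      AND r.relationship_type = 'works_at'
      AND o.type = 'Organization'
""").fetchall()
```

With the sample data from step 3, only Alice's works_at edge satisfies all three conditions.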

5. Combine with Vector Search¶

-- Hybrid query: structure + semantics
SELECT
  e.name,
  e.type,
  e.description
FROM entities e
WHERE e.type = 'Person'
ORDER BY e.embedding <-> to_vector('embedding of "software engineer"')
LIMIT 5;
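
Conceptually, the hybrid query filters rows by type and then ranks the survivors by distance to the query embedding. Assuming the vector type here follows pgvector semantics, `<->` is Euclidean (L2) distance. A minimal pure-Python sketch of that ranking follows; the 3-dimensional vectors are made up for illustration and are not the output of any embedding model.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_entities(rows, query_embedding, entity_type, limit=5):
    """Filter by type, then order by distance to the query embedding."""
    candidates = [r for r in rows if r["type"] == entity_type]
    candidates.sort(key=lambda r: l2_distance(r["embedding"], query_embedding))
    return candidates[:limit]

# Toy data: hypothetical embeddings standing in for 1536-dimensional vectors.
rows = [
    {"name": "Alice", "type": "Person", "embedding": [0.9, 0.1, 0.0]},
    {"name": "Bob", "type": "Person", "embedding": [0.1, 0.9, 0.0]},
    {"name": "Company X", "type": "Organization", "embedding": [0.8, 0.2, 0.0]},
]
query = [1.0, 0.0, 0.0]  # stands in for the embedding of "software engineer"

top = nearest_entities(rows, query, "Person")
names = [r["name"] for r in top]
```

Company X is close to the query vector but is dropped by the type filter before ranking, which is the structural half of the hybrid query.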

Common Use Cases¶

1. Knowledge Discovery¶

-- Find all relationships for an entity, in either direction
MATCH (e:Entity {name: 'Company X'})-[r]-(related)
RETURN e, r, related;
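
An entity's relationships include edges that point at it as well as edges that leave it. In the sample data from step 3, Company X only ever appears as a target, so direction matters when exploring it. A small sketch over an in-memory edge list (rows copied from step 3; the helper name is hypothetical) that collects both directions:

```python
# Edges copied from step 3 as (source, relationship_type, target).
edges = [
    ("Alice", "works_at", "Company X"),
    ("Bob", "manages", "Company X"),
    ("Alice", "reports_to", "Bob"),
]

def relationships_of(name, edges):
    """Return the outgoing and incoming edges for one entity."""
    outgoing = [(rel, dst) for src, rel, dst in edges if src == name]
    incoming = [(rel, src) for src, rel, dst in edges if dst == name]
    return {"outgoing": outgoing, "incoming": incoming}

company = relationships_of("Company X", edges)
alice = relationships_of("Alice", edges)
```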

2. Semantic Search with Relationships¶

-- Find people and their roles
MATCH (p:Person)-[r:works_at]->(o:Organization)
WHERE o.name = 'Company X'
RETURN p.name, r.properties;

3. Recommendation System¶

-- Find similar people based on shared relationship targets
SELECT p2.name, COUNT(*) AS common_relationships
FROM entities p1
JOIN relationships r1 ON p1.id = r1.source_id
JOIN relationships r2 ON r1.target_id = r2.target_id
JOIN entities p2 ON r2.source_id = p2.id
WHERE p1.name = 'Alice'
  AND p2.id <> p1.id  -- exclude Alice herself
GROUP BY p2.id, p2.name
ORDER BY common_relationships DESC;
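
The query ranks other entities by how many relationship targets they share with Alice. The same counting logic can be sketched in Python over the step 3 sample edges, with the starting entity excluded from its own results. The function name is hypothetical and the edge list is reduced to (source, target) pairs.

```python
from collections import Counter

# (source, target) pairs copied from step 3.
edges = [
    ("Alice", "Company X"),
    ("Bob", "Company X"),
    ("Alice", "Bob"),
]

def shared_target_counts(name, edges):
    """Count, per other entity, the edges that point at one of `name`'s targets."""
    targets = {dst for src, dst in edges if src == name}
    counts = Counter(
        src for src, dst in edges
        if dst in targets and src != name  # skip the starting entity itself
    )
    return counts.most_common()

similar_to_alice = shared_target_counts("Alice", edges)
```

Without the `src != name` check, Alice would top her own recommendation list, since she trivially shares every target with herself.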

4. Entity Resolution¶

-- Find duplicate or similar entities
SELECT e1.name, e2.name
FROM entities e1
JOIN entities e2 ON e1.id < e2.id
WHERE e1.type = e2.type
AND e1.embedding <-> e2.embedding < 0.1;  -- High similarity
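
What a threshold such as 0.1 means depends on the distance metric. Assuming pgvector semantics, `<->` is L2 distance and `<=>` is cosine distance; for unit-length vectors the two are related by d_L2² = 2 · d_cos. The sketch below checks that relationship on two made-up vectors and converts an L2 threshold into the equivalent cosine-distance threshold. It only holds if the embeddings are normalized, which is an assumption about your embedding model.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def l2_threshold_as_cosine(l2_threshold):
    """Equivalent cosine-distance threshold for unit-length vectors."""
    return l2_threshold ** 2 / 2.0

# Two hypothetical, nearly identical embeddings.
a = normalize([1.0, 2.0, 2.0])
b = normalize([1.0, 2.0, 2.1])

d_l2 = l2_distance(a, b)
d_cos = cosine_distance(a, b)
is_duplicate_candidate = d_l2 < 0.1
```

Under that assumption, an L2 threshold of 0.1 corresponds to a cosine distance of about 0.005, which is a strict notion of "similar".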

Performance Tips¶

Index Creation: Speed up graph traversals

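CREATE INDEX idx_entities_type ON entities(type);
CREATE INDEX idx_relationships_type ON relationships(relationship_type);
CREATE INDEX idx_relationships_source ON relationships(source_id);  -- edge endpoints drive traversal joins
CREATE INDEX idx_relationships_target ON relationships(target_id);
-- Operator class must match the query operator: vector_l2_ops for <->, vector_cosine_ops for <=>
CREATE INDEX idx_embeddings ON entities USING ivfflat (embedding vector_l2_ops);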

Optimize Vector Search

-- Enable the vector extension (required before creating vector columns and vector indexes)
SET search_path TO public;
CREATE EXTENSION IF NOT EXISTS vector;

Batch Operations

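-- Insert relationships in batches: one multi-row statement instead of one statement per row
INSERT INTO relationships (source_id, target_id, relationship_type)
VALUES (1, 3, 'works_at'), (2, 3, 'manages'), (1, 2, 'reports_to');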
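
From application code, one way to batch is to split the rows into fixed-size chunks and send each chunk in its own transaction. The sketch below uses Python's built-in sqlite3 module as a stand-in database. The helper names and the batch size of 1000 are arbitrary assumptions, and with a PostgreSQL driver the placeholder syntax would differ.

```python
import sqlite3
from itertools import islice

def chunked(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

def insert_relationships(conn, rows, batch_size=1000):
    """Insert (source_id, target_id, relationship_type) rows, one transaction per batch."""
    inserted = 0
    for batch in chunked(rows, batch_size):
        with conn:  # commits on success, rolls back if the batch fails
            conn.executemany(
                "INSERT INTO relationships (source_id, target_id, relationship_type) "
                "VALUES (?, ?, ?)",
                batch,
            )
        inserted += len(batch)
    return inserted

# Stand-in table with only the columns this sketch needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE relationships ("
    "id INTEGER PRIMARY KEY, source_id INTEGER, target_id INTEGER, relationship_type TEXT)"
)
rows = [(i, i + 1, "linked_to") for i in range(1, 2501)]
total = insert_relationships(conn, rows)
```

Committing per batch keeps each transaction bounded, so a failure partway through a large load only rolls back the current batch.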

Graph Statistics

SELECT
  COUNT(*) as total_entities,
  COUNT(DISTINCT type) as entity_types
FROM entities;

Troubleshooting¶

Q: Queries returning too many results?¶

A: Filter on entity type and relationship type in the WHERE clause, and add a LIMIT.

Q: Vector search is slow?¶

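A: Create a vector index on the embedding column (see Performance Tips), make sure its operator class matches the distance operator used in the query, and check that the embedding dimension matches the column definition.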

Q: Memory usage high for large graphs?¶

A: Paginate with LIMIT and a stable ORDER BY, and select only the columns you need instead of returning whole result sets.
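
One common way to paginate is keyset pagination: order by the primary key and resume after the last id seen, so that each page is a bounded query. A minimal sketch follows, again with sqlite3 standing in for the database. The function name and page size are hypothetical, and it assumes positive integer ids such as those a SERIAL column produces.

```python
import sqlite3

def iter_entity_pages(conn, page_size=100):
    """Yield pages of (id, name) ordered by id, resuming after the last id seen."""
    last_id = 0  # assumes ids start above 0
    while True:
        page = conn.execute(
            "SELECT id, name FROM entities WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, page_size),
        ).fetchall()
        if not page:
            return
        yield page
        last_id = page[-1][0]

# Stand-in table with seven rows to page through.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO entities (name) VALUES (?)",
    [(f"entity-{i}",) for i in range(1, 8)],
)

pages = list(iter_entity_pages(conn, page_size=3))
page_sizes = [len(p) for p in pages]
```

Only one page of rows is held in memory at a time when the generator is consumed lazily rather than collected into a list as in this demo.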

Best Practices¶

Create indexes on frequently queried relationships
Match the vector dimension to your embedding model's output (1536 in this guide); smaller embeddings reduce storage and speed up search
Batch insert relationships to reduce per-statement and per-transaction overhead
Run VACUUM regularly to maintain performance
Use EXPLAIN to optimize queries

Next Steps¶

Review /docs/features/graphrag/USER_GUIDE.md for advanced features
Check the Neo4j migration guide in /docs/features/graphrag/NEO4J_MIGRATION_GUIDE.md
Explore the Cypher reference in /docs/features/graphrag/CYPHER_REFERENCE.md

Vector Search: /docs/features/multimodal-vector/
Full-Text Search: /docs/guides/user/FULL_TEXT_SEARCH_TUNING_GUIDE.md
Graph Query: /docs/features/packages/12-graph-readme.md

Document Version: 1.0
Last Updated: December 30, 2025
Audience: Data engineers, knowledge graph developers
Reading Time: 8 minutes