GraphRAG Quick Start

Overview

GraphRAG combines graph databases with Retrieval-Augmented Generation (RAG) so that you can query a knowledge graph using natural language, with graph traversal supplying the relationships and an LLM supplying the semantic reasoning.

Key Concepts

What is GraphRAG?

  • Graph reasoning: Leverage relationships and connections in data
  • LLM integration: Use language models for semantic understanding
  • Hybrid querying: Combine structured graph queries with semantic search
  • Knowledge graphs: Build and query interconnected information

When to Use GraphRAG

  • Complex relationship discovery
  • Knowledge base systems
  • Semantic search with relationships
  • Entity resolution and linking
  • Graph-based recommendation systems

Quick Start

1. Enable GraphRAG

-- Create graph database
CREATE DATABASE knowledge_graph;

-- Enable GraphRAG features
SET graphrag_enabled = true;
SET graphrag_embedding_model = 'openai';

2. Create Graph Structures

-- Create nodes (entities)
CREATE TABLE entities (
  id SERIAL PRIMARY KEY,
  name VARCHAR(256),
  type VARCHAR(100),  -- Person, Place, Organization, etc.
  description TEXT,
  embedding vector(1536)  -- Vector embedding for semantic search
);

-- Create relationships (edges)
CREATE TABLE relationships (
  id SERIAL PRIMARY KEY,
  source_id INT REFERENCES entities(id),
  target_id INT REFERENCES entities(id),
  relationship_type VARCHAR(100),
  properties JSONB
);

3. Build the Knowledge Graph

-- Insert entities
INSERT INTO entities (name, type, description)
VALUES
  ('Alice', 'Person', 'Software Engineer'),
  ('Bob', 'Person', 'Product Manager'),
  ('Company X', 'Organization', 'Tech startup');

-- Insert relationships
INSERT INTO relationships (source_id, target_id, relationship_type, properties)
VALUES
  (1, 3, 'works_at', '{"since": "2022-01-01"}'),
  (2, 3, 'manages', '{"teams": ["Engineering"]}'),
  (1, 2, 'reports_to', '{"direct_report": true}');
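
The relationship rows above hard-code the IDs 1 to 3, which only holds when the entities table is freshly created. The following is a minimal sketch of loading the same sample graph while looking up the generated IDs by name. It uses Python's built-in sqlite3 module as a stand-in engine, which is an assumption made for illustration: SERIAL and JSONB are replaced with SQLite equivalents and the embedding column is left out.

```python
import json
import sqlite3

def build_graph(conn):
    """Create the two tables and load the sample graph, resolving IDs by name."""
    conn.executescript("""
        CREATE TABLE entities (
            id INTEGER PRIMARY KEY,
            name TEXT,
            type TEXT,
            description TEXT
        );
        CREATE TABLE relationships (
            id INTEGER PRIMARY KEY,
            source_id INTEGER REFERENCES entities(id),
            target_id INTEGER REFERENCES entities(id),
            relationship_type TEXT,
            properties TEXT  -- JSON stored as text in this stand-in
        );
    """)
    conn.executemany(
        "INSERT INTO entities (name, type, description) VALUES (?, ?, ?)",
        [
            ("Alice", "Person", "Software Engineer"),
            ("Bob", "Person", "Product Manager"),
            ("Company X", "Organization", "Tech startup"),
        ],
    )
    # Look up the generated IDs instead of assuming they are 1, 2, 3.
    ids = {name: eid for eid, name in conn.execute("SELECT id, name FROM entities")}
    edges = [
        ("Alice", "Company X", "works_at", {"since": "2022-01-01"}),
        ("Bob", "Company X", "manages", {"teams": ["Engineering"]}),
        ("Alice", "Bob", "reports_to", {"direct_report": True}),
    ]
    conn.executemany(
        "INSERT INTO relationships (source_id, target_id, relationship_type, properties)"
        " VALUES (?, ?, ?, ?)",
        [(ids[s], ids[t], rel, json.dumps(props)) for s, t, rel, props in edges],
    )
    conn.commit()
    return ids

conn = sqlite3.connect(":memory:")
ids = build_graph(conn)
```

Resolving IDs by name keeps the load script correct when the tables already contain rows or when ID sequences have gaps.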

4. Query with Natural Language

-- Cypher-like graph queries
MATCH (p:Person)-[r:works_at]->(o:Organization)
RETURN p.name, r.relationship_type, o.name;

-- With a property filter (any relationship type)
MATCH (p:Person {type: 'Person'})-[]->(org:Organization)
WHERE p.name LIKE '%Alice%'
RETURN p, org;

-- Hybrid query: structure + semantics
SELECT
  e.name,
  e.type,
  e.description
FROM entities e
WHERE e.type = 'Person'
-- The to_vector(...) argument is a placeholder for the query embedding of "software engineer"
ORDER BY e.embedding <-> to_vector('embedding of "software engineer"')
LIMIT 5;
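
The hybrid query filters on a structured column and then ranks the survivors by vector distance. The sketch below shows that ranking step in plain Python. It assumes `<->` means Euclidean (L2) distance, as it does in pgvector, and it uses made-up 3-dimensional vectors in place of real 1536-dimension embeddings.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(query, rows, entity_type, limit=5):
    """rows: iterable of (name, type, embedding).

    Keep rows of the requested type, then return names ordered by distance
    to the query vector, closest first.
    """
    candidates = [(name, emb) for name, typ, emb in rows if typ == entity_type]
    candidates.sort(key=lambda item: l2_distance(item[1], query))
    return [name for name, _ in candidates[:limit]]

# Toy embeddings (hypothetical values, for illustration only).
rows = [
    ("Alice", "Person", [0.9, 0.1, 0.0]),
    ("Bob", "Person", [0.2, 0.8, 0.1]),
    ("Company X", "Organization", [0.9, 0.1, 0.1]),
]
query = [1.0, 0.0, 0.0]  # stands in for the embedding of "software engineer"
result = nearest(query, rows, "Person")
```

Here 'Company X' is dropped by the type filter even though its vector is close to the query, which is the point of combining structure with similarity.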

Common Use Cases

1. Knowledge Discovery

-- Find all relationships for an entity, in either direction
-- (in the sample data, 'Company X' appears only as a relationship target)
MATCH (e:Entity {name: 'Company X'})-[r]-(related)
RETURN e, r, related;
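
In the sample data, 'Company X' is only ever the target of a relationship, so a complete answer has to look at edges in both directions. The sketch below expresses that lookup as plain SQL over the `entities` and `relationships` tables, run through Python's sqlite3 module as a stand-in engine (an assumption for illustration; the helper name `neighbours` is hypothetical).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entities (id INTEGER PRIMARY KEY, name TEXT, type TEXT);
    CREATE TABLE relationships (
        id INTEGER PRIMARY KEY,
        source_id INTEGER,
        target_id INTEGER,
        relationship_type TEXT
    );
    INSERT INTO entities (id, name, type) VALUES
        (1, 'Alice', 'Person'),
        (2, 'Bob', 'Person'),
        (3, 'Company X', 'Organization');
    INSERT INTO relationships (source_id, target_id, relationship_type) VALUES
        (1, 3, 'works_at'),
        (2, 3, 'manages'),
        (1, 2, 'reports_to');
""")

def neighbours(conn, name):
    """Return sorted (direction, relationship_type, other_name) for every edge touching `name`."""
    sql = """
        SELECT 'out' AS direction, r.relationship_type, t.name
        FROM entities e
        JOIN relationships r ON r.source_id = e.id
        JOIN entities t ON t.id = r.target_id
        WHERE e.name = :name
        UNION ALL
        SELECT 'in', r.relationship_type, s.name
        FROM entities e
        JOIN relationships r ON r.target_id = e.id
        JOIN entities s ON s.id = r.source_id
        WHERE e.name = :name
    """
    return sorted(conn.execute(sql, {"name": name}).fetchall())

edges = neighbours(conn, "Company X")
```

Each half of the UNION ALL joins on a different endpoint column, which is why indexes on both `source_id` and `target_id` matter for this kind of lookup.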

2. Semantic Search with Relationships

-- Find people and their roles
MATCH (p:Person)-[r:works_at]->(o:Organization)
WHERE o.name = 'Company X'
RETURN p.name, r.properties;

3. Recommendation System

-- Find people who share relationship targets with Alice
-- (ranks by relationship overlap only; embeddings are not used here)
SELECT p2.name, COUNT(*) AS common_relationships
FROM entities p1
JOIN relationships r1 ON p1.id = r1.source_id
JOIN relationships r2 ON r1.target_id = r2.target_id
JOIN entities p2 ON r2.source_id = p2.id
WHERE p1.name = 'Alice'
  AND p2.id <> p1.id        -- exclude Alice herself
  AND p2.type = 'Person'
GROUP BY p2.id, p2.name
ORDER BY common_relationships DESC;

4. Entity Resolution

-- Find duplicate or similar entities
SELECT e1.name, e2.name
FROM entities e1
JOIN entities e2 ON e1.id < e2.id
WHERE e1.type = e2.type
  AND e1.embedding <-> e2.embedding < 0.1;  -- Small distance = high similarity; tune the threshold for your embeddings
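
What a threshold such as 0.1 means depends on the distance metric and on whether the embeddings are normalized. The sketch below runs the same pairwise comparison with two metrics on toy 2-dimensional vectors. The vectors and the helper name `candidate_duplicates` are hypothetical, and which metric `<->` uses in your deployment is an assumption to verify (in pgvector it is L2 distance).

```python
import math
from itertools import combinations

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 - cosine similarity; ignores vector length, compares direction only."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms

def candidate_duplicates(entities, distance, threshold):
    """entities: list of (name, type, embedding).

    Return same-type pairs whose embeddings are closer than `threshold`.
    Each unordered pair is checked once, like the e1.id < e2.id join condition.
    """
    pairs = []
    for (n1, t1, e1), (n2, t2, e2) in combinations(entities, 2):
        if t1 == t2 and distance(e1, e2) < threshold:
            pairs.append((n1, n2))
    return pairs

entities = [
    ("Company X", "Organization", [1.0, 0.0]),
    ("Company X Inc.", "Organization", [2.0, 0.0]),  # same direction, different length
    ("Alice", "Person", [0.0, 1.0]),
]
by_cosine = candidate_duplicates(entities, cosine_distance, 0.1)
by_l2 = candidate_duplicates(entities, l2_distance, 0.1)
```

With these vectors the cosine metric flags the two organization rows as likely duplicates while the L2 metric does not, so the threshold should be chosen against the metric actually in use.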

Performance Tips

  1. Index Creation: Speed up graph traversals

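    CREATE INDEX idx_entities_type ON entities(type);
    CREATE INDEX idx_relationships_type ON relationships(relationship_type);
    -- The operator class must match the query operator
    -- (in pgvector, vector_cosine_ops serves <=> and vector_l2_ops serves <->)
    CREATE INDEX idx_embeddings ON entities USING ivfflat (embedding vector_cosine_ops);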
    

  2. Optimize Vector Search

    -- Vector columns and vector indexes depend on the vector extension;
    -- make sure it is installed in the schema you query
    SET search_path TO public;
    CREATE EXTENSION IF NOT EXISTS vector;
    

  3. Batch Operations

    -- Insert relationships in batches
    INSERT INTO relationships (...) VALUES (...), (...), (...);
    

  4. Graph Statistics

    SELECT
      COUNT(*) as total_entities,
      COUNT(DISTINCT type) as entity_types
    FROM entities;
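
The batch operations tip above can be sketched from the client side as follows. This is a minimal illustration using Python's built-in sqlite3 module as a stand-in engine (an assumption); the `linked_to` relationship type, the batch size, and the helper name `insert_relationships` are hypothetical.

```python
import sqlite3

def insert_relationships(conn, rows, batch_size=1000):
    """Insert (source_id, target_id, relationship_type) rows in batches.

    Each batch runs in its own transaction, so a failure rolls back only
    the batch in progress rather than everything inserted so far.
    """
    inserted = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        with conn:  # commits on success, rolls back if an exception is raised
            conn.executemany(
                "INSERT INTO relationships (source_id, target_id, relationship_type)"
                " VALUES (?, ?, ?)",
                batch,
            )
        inserted += len(batch)
    return inserted

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE relationships ("
    "id INTEGER PRIMARY KEY, source_id INTEGER, target_id INTEGER, relationship_type TEXT)"
)
rows = [(i, i + 1, "linked_to") for i in range(1, 2501)]
count = insert_relationships(conn, rows, batch_size=1000)
```

Batching trades a little atomicity for far fewer commits; pick a batch size that keeps each transaction comfortably small for your workload.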
    

Troubleshooting

Q: Queries returning too many results?

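A: Add WHERE clauses that filter on relationship type and entity type, and cap the output with LIMIT.

Q: Vector search is slow?

A: Create a vector index on the embedding column (see Performance Tips) and confirm that the query vector has the same dimension as the column.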

Q: Memory usage high for large graphs?

A: Use pagination and limit result sets.
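
One way to paginate without holding a large result set in memory is keyset pagination, which resumes from the last ID seen instead of using a growing OFFSET. The sketch below uses Python's built-in sqlite3 module as a stand-in engine (an assumption); the helper name `iter_entities` and the page size are hypothetical.

```python
import sqlite3

def iter_entities(conn, page_size=2):
    """Yield lists of (id, name) rows, one page at a time, ordered by id."""
    last_id = 0
    while True:
        page = conn.execute(
            "SELECT id, name FROM entities WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, page_size),
        ).fetchall()
        if not page:
            return
        yield page
        last_id = page[-1][0]  # resume after the last row of this page

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO entities (name) VALUES (?)",
    [("Alice",), ("Bob",), ("Company X",)],
)
pages = list(iter_entities(conn, page_size=2))
```

Because each page is fetched with a range condition on the primary key, the cost per page stays roughly constant as the table grows.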

Best Practices

  • Create indexes on frequently queried relationships
  • Match the vector column dimension to your embedding model's output (1536 in this guide); smaller embeddings use less storage and search faster
  • Batch insert relationships to avoid per-statement transaction overhead
  • Run VACUUM regularly to maintain performance
  • Use EXPLAIN to optimize queries

Next Steps

  1. Review /docs/features/graphrag/USER_GUIDE.md for advanced features
  2. Check the Neo4j migration guide in /docs/features/graphrag/NEO4J_MIGRATION_GUIDE.md
  3. Explore the Cypher reference in /docs/features/graphrag/CYPHER_REFERENCE.md
  • Vector Search: /docs/features/multimodal-vector/
  • Full-Text Search: /docs/guides/user/FULL_TEXT_SEARCH_TUNING_GUIDE.md
  • Graph Query: /docs/features/packages/12-graph-readme.md

Document Version: 1.0 | Last Updated: December 30, 2025 | Audience: Data engineers, knowledge graph developers | Reading Time: 8 minutes