Storage Overview
GraphRAG SDK uses a three-layer storage architecture to handle different aspects of graph RAG:
- Graph Store - Stores nodes and edges (entities, relationships, communities)
- Vector Store - Stores embeddings for similarity search
- Key-Value Store - Stores metadata, chunks, and other document data
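To make the division of labor concrete, here is a minimal, dependency-free sketch of the three layers working together. The class names (`TinyGraphStore`, `TinyVectorStore`) and their methods are hypothetical and exist only for illustration; they are not the SDK's interfaces, and the real packages may be shaped quite differently.

```typescript
// Hypothetical, simplified stand-ins for the three layers; not the SDK's real interfaces.
type Edge = { from: string; to: string; label: string };

class TinyGraphStore {
  private edges: Edge[] = [];
  addEdge(from: string, to: string, label: string): void {
    this.edges.push({ from, to, label });
  }
  // Follow outgoing edges from one node.
  neighbors(id: string): string[] {
    return this.edges.filter((e) => e.from === id).map((e) => e.to);
  }
}

// Cosine similarity; returns 0 when either vector has zero length.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
  const normA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const normB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  return normA === 0 || normB === 0 ? 0 : dot / (normA * normB);
}

class TinyVectorStore {
  private vectors = new Map<string, number[]>();
  upsert(id: string, embedding: number[]): void {
    this.vectors.set(id, embedding);
  }
  // Rank stored vectors by cosine similarity to the query and return the top k ids.
  search(query: number[], k: number): string[] {
    return Array.from(this.vectors.entries())
      .map(([id, v]) => ({ id, score: cosine(query, v) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, k)
      .map((hit) => hit.id);
  }
}

// The KV layer maps ids to document data, so a Map is enough for this sketch.
const kv = new Map<string, { text: string }>();
const vectors = new TinyVectorStore();
const graph = new TinyGraphStore();

kv.set("chunk-1", { text: "Ada Lovelace worked with Charles Babbage." });
vectors.upsert("chunk-1", [1, 0]);
vectors.upsert("chunk-2", [0, 1]);
graph.addEdge("Ada Lovelace", "Charles Babbage", "WORKED_WITH");

// Vector search finds the chunk id, the KV store resolves it to text,
// and the graph store answers relationship questions.
for (const id of vectors.search([0.9, 0.1], 1)) {
  console.log(id, kv.get(id)?.text, graph.neighbors("Ada Lovelace"));
}
```

The point of the split is that each lookup has a different access pattern (nearest-neighbor ranking, edge traversal, exact-key fetch), which is why each layer can be backed by a different database.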
Docker Setup
Start Services
bash
# Neo4j (graph store) — with GDS + APOC plugins
docker run -d --name neo4j \
-p 7474:7474 -p 7687:7687 \
-e NEO4J_AUTH=neo4j/password \
-e NEO4J_PLUGINS='["graph-data-science", "apoc"]' \
-e NEO4J_ACCEPT_LICENSE_AGREEMENT=yes \
-v neo4j-data:/data \
neo4j:5-enterprise
# Qdrant (vector store)
docker run -d --name qdrant \
-p 6333:6333 -p 6334:6334 \
-v qdrant_storage:/qdrant/storage \
qdrant/qdrant:latest
# Redis (KV store)
docker run -d --name redis \
-p 6379:6379 -v redis-data:/data \
redis:7-alpine \
redis-server --appendonly yes --save 900 1 --save 300 10
# PostgreSQL + pgvector (vector store)
docker run -d --name postgres \
-p 5432:5432 \
-e POSTGRES_PASSWORD=password -e POSTGRES_DB=graphrag \
-v pgdata:/var/lib/postgresql/data --shm-size=256mb \
pgvector/pgvector:pg16
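# FalkorDB (graph store)
# Note: this publishes host port 6379, the same as the Redis container above.
# Run only one of the two at a time, or remap one host port (for example -p 6380:6379).
docker run -d --name falkordb \
-p 6379:6379 -v falkordb-data:/data \
falkordb/falkordb:latest
Verify Services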
bash
# Neo4j — browser at http://localhost:7474
docker exec -it neo4j cypher-shell -u neo4j -p password "CALL gds.version()"
# Qdrant — dashboard at http://localhost:6333/dashboard
curl http://localhost:6333/collections
# Redis
docker exec -it redis redis-cli ping
# PostgreSQL
docker exec -it postgres psql -U postgres -d graphrag -c "SELECT 1"
# FalkorDB
docker exec -it falkordb redis-cli ping
Stop & Clean Up
bash
# Stop (keeps data)
docker stop neo4j qdrant redis postgres falkordb
# Start again
docker start neo4j qdrant redis postgres falkordb
# Remove containers
docker rm -f neo4j qdrant redis postgres falkordb
# Remove data volumes
docker volume rm neo4j-data qdrant_storage redis-data pgdata falkordb-data
Default Credentials
| Service | Connection | Credentials |
|---|---|---|
| Neo4j | bolt://localhost:7687 | neo4j / password |
| Qdrant | http://localhost:6333 | none (local) |
| Redis | localhost:6379 | none (local) |
| PostgreSQL | localhost:5432 | postgres / password |
| FalkorDB | localhost:6379 | none (local) |
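Note that Redis and FalkorDB share `localhost:6379` in the table above, so both containers cannot publish that port on the same host at once. A small check along these lines can flag such collisions before services are started; the `Endpoint` shape and the `findPortConflicts` helper are illustrative assumptions, not part of the SDK.

```typescript
// Defaults copied from the table above; the Endpoint shape is an illustrative assumption.
type Endpoint = { service: string; host: string; port: number };

const defaults: Endpoint[] = [
  { service: "Neo4j", host: "localhost", port: 7687 },
  { service: "Qdrant", host: "localhost", port: 6333 },
  { service: "Redis", host: "localhost", port: 6379 },
  { service: "PostgreSQL", host: "localhost", port: 5432 },
  { service: "FalkorDB", host: "localhost", port: 6379 },
];

// Group services by host:port and keep only addresses claimed by more than one service.
function findPortConflicts(endpoints: Endpoint[]): Record<string, string[]> {
  const byAddress: Record<string, string[]> = {};
  for (const { service, host, port } of endpoints) {
    const key = `${host}:${port}`;
    byAddress[key] = [...(byAddress[key] ?? []), service];
  }
  const conflicts: Record<string, string[]> = {};
  for (const key of Object.keys(byAddress)) {
    const services = byAddress[key] ?? [];
    if (services.length > 1) conflicts[key] = services;
  }
  return conflicts;
}

const conflicts = findPortConflicts(defaults);
console.log(conflicts);
```

If both services are needed side by side, one option is to publish one of them on a different host port and pass that port to the corresponding storage factory.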
Storage Architecture
typescript
const graph = createGraph({
  model: openai("gpt-4o-mini"),
  embedding: openai.embedding("text-embedding-3-small"),
  storage: {
    graph: neo4jGraph({ url: "bolt://localhost:7687", ... }),
    vector: qdrantVector({ url: "http://localhost:6333", ... }),
    kv: redisKV({ host: "localhost", port: 6379 })
  }
});
Each layer can use a different backend, or you can use a unified storage solution.
Available Storage Packages
Graph Stores
| Package | Database | Best For | Status |
|---|---|---|---|
| @graphrag-sdk/in-memory-storage | In-memory (Cytoscape) | Development, testing | ✅ |
| @graphrag-sdk/neo4j | Neo4j + GDS | Production, community detection | ✅ |
| @graphrag-sdk/dozerdb | DozerDB (Neo4j-compatible) | Open-source production | ✅ |
| @graphrag-sdk/falkordb | FalkorDB (Redis-based) | Lightweight production | ✅ |
Vector Stores
| Package | Database | Best For | Status |
|---|---|---|---|
| @graphrag-sdk/in-memory-storage | In-memory cosine | Development, testing | ✅ |
| @graphrag-sdk/qdrant | Qdrant | High-performance vector search | ✅ |
| @graphrag-sdk/pgvector | PostgreSQL + pgvector | SQL-based workflows | ✅ |
Key-Value Stores
| Package | Database | Best For | Status |
|---|---|---|---|
| @graphrag-sdk/in-memory-storage | In-memory Map | Development, testing | ✅ |
| @graphrag-sdk/in-memory-storage | JSON files | Persistence without DB | ✅ |
| @graphrag-sdk/redis | Redis | Production KV storage | ✅ |
Quick Comparison
Development & Testing
Recommended: Use @graphrag-sdk/in-memory-storage for all three layers.
typescript
import { memoryGraph, memoryVector, memoryKV } from '@graphrag-sdk/in-memory-storage';
const graph = createGraph({
  // ... model config
  storage: {
    graph: memoryGraph,
    vector: memoryVector,
    kv: memoryKV,
  }
});
Production
Option 1: Specialized Stack
typescript
import { neo4jGraph } from '@graphrag-sdk/neo4j';
import { qdrantVector } from '@graphrag-sdk/qdrant';
import { redisKV } from '@graphrag-sdk/redis';
const graph = createGraph({
  // ... model config
  storage: {
    graph: neo4jGraph({ url: "bolt://localhost:7687", ... }),
    vector: qdrantVector({ url: "http://localhost:6333", ... }),
    kv: redisKV({ host: "localhost", port: 6379 })
  }
});
Option 2: PostgreSQL-Only Stack
typescript
import { pgVector } from '@graphrag-sdk/pgvector';
// memoryKV is used below, so it is imported alongside memoryGraph
import { memoryGraph, memoryKV } from '@graphrag-sdk/in-memory-storage';
// Note: Use pgvector for vectors, memory for graph until pg graph store is implemented
const graph = createGraph({
  // ... model config
  storage: {
    graph: memoryGraph, // or neo4j/falkordb
    vector: pgVector({ host: "localhost", database: "graphrag", ... }),
    kv: memoryKV, // or redis
  }
});
Option 3: Redis-Based Stack
typescript
import { falkorGraph } from '@graphrag-sdk/falkordb';
import { redisKV } from '@graphrag-sdk/redis';
import { memoryVector } from '@graphrag-sdk/in-memory-storage';
const graph = createGraph({
  // ... model config
  storage: {
    graph: falkorGraph({ host: "localhost", port: 6379 }),
    vector: memoryVector, // or qdrant/pgvector
    kv: redisKV({ host: "localhost", port: 6379 })
  }
});
Storage Requirements by Algorithm
Different algorithms have different storage requirements:
Similarity Graph
- Graph Store: Optional (stores similarity edges)
- Vector Store: Required (primary data structure)
- KV Store: Required (chunk metadata)
LightRAG
- Graph Store: Required (entities + relations)
- Vector Store: Required (dual vectors for entities and relations)
- KV Store: Required (chunk metadata)
Microsoft GraphRAG
- Graph Store: Required (entities, relations, communities)
- Vector Store: Required (entity and chunk vectors)
- KV Store: Required (chunks, community reports)
Fast GraphRAG
- Graph Store: Required (entities + relations for PageRank)
- Vector Store: Required (entity vectors)
- KV Store: Required (chunk metadata)
AWS GraphRAG
- Graph Store: Required (hierarchical fact graph)
- Vector Store: Required (chunk and statement vectors)
- KV Store: Required (chunks, statements, facts)
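The requirements above can be captured as a lookup table and checked before a graph is built. The sketch below is an assumption-laden illustration: the algorithm keys (`"similarity-graph"`, `"lightrag"`, and so on) and the `missingLayers` helper are hypothetical names, and the SDK may or may not perform a similar validation itself.

```typescript
type Layer = "graph" | "vector" | "kv";

// Hypothetical identifiers for the five algorithms listed above.
type Algorithm =
  | "similarity-graph"
  | "lightrag"
  | "microsoft-graphrag"
  | "fast-graphrag"
  | "aws-graphrag";

// Required layers per algorithm, transcribed from the lists above.
// The graph store is optional only for the similarity graph.
const requiredLayers: Record<Algorithm, Layer[]> = {
  "similarity-graph": ["vector", "kv"],
  "lightrag": ["graph", "vector", "kv"],
  "microsoft-graphrag": ["graph", "vector", "kv"],
  "fast-graphrag": ["graph", "vector", "kv"],
  "aws-graphrag": ["graph", "vector", "kv"],
};

// Return the required layers that the given storage config does not provide.
function missingLayers(
  algorithm: Algorithm,
  storage: Partial<Record<Layer, unknown>>,
): Layer[] {
  return requiredLayers[algorithm].filter((layer) => storage[layer] === undefined);
}

// A config with only vector and KV stores is enough for the similarity graph,
// but not for LightRAG, which also needs a graph store.
const vectorAndKvOnly = { vector: {}, kv: {} };
console.log(missingLayers("similarity-graph", vectorAndKvOnly));
console.log(missingLayers("lightrag", vectorAndKvOnly));
```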
Performance Considerations
Latency
| Storage | Vector Search | Graph Traversal | KV Lookup |
|---|---|---|---|
| Memory | < 1ms | < 1ms | < 1ms |
| Neo4j | N/A | 10-50ms | N/A |
| Qdrant | 5-20ms | N/A | N/A |
| pgvector | 10-30ms | N/A | N/A |
| FalkorDB | N/A | 5-15ms | N/A |
| Redis | N/A | N/A | 1-5ms |
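As a rough way to use these figures, the upper bounds can be summed into a per-query budget. The sketch below assumes calls run sequentially with no caching or batching, treats "< 1ms" as 1 ms, and uses a hypothetical `worstCaseMs` helper; real latencies depend on data size, hardware, and network, so treat the result as an order-of-magnitude estimate only.

```typescript
type Op = "vector" | "graph" | "kv";
type Backend = "Memory" | "Neo4j" | "Qdrant" | "pgvector" | "FalkorDB" | "Redis";

// Upper bounds in milliseconds, copied from the latency table; null means N/A.
const upperBoundMs: Record<Backend, Record<Op, number | null>> = {
  Memory: { vector: 1, graph: 1, kv: 1 },
  Neo4j: { vector: null, graph: 50, kv: null },
  Qdrant: { vector: 20, graph: null, kv: null },
  pgvector: { vector: 30, graph: null, kv: null },
  FalkorDB: { vector: null, graph: 15, kv: null },
  Redis: { vector: null, graph: null, kv: 5 },
};

// Sum the upper bounds for a query that issues the given number of calls per layer.
function worstCaseMs(stack: Record<Op, Backend>, calls: Record<Op, number>): number {
  const ops: Op[] = ["vector", "graph", "kv"];
  let total = 0;
  for (const op of ops) {
    const bound = upperBoundMs[stack[op]][op];
    if (bound === null) {
      throw new Error(`${stack[op]} does not serve ${op} operations`);
    }
    total += bound * calls[op];
  }
  return total;
}

// Specialized stack: 1 vector search, 2 graph traversals, 5 KV lookups.
const specialized = worstCaseMs(
  { vector: "Qdrant", graph: "Neo4j", kv: "Redis" },
  { vector: 1, graph: 2, kv: 5 },
);
console.log(specialized); // 20 + 2 * 50 + 5 * 5 = 145
```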
Scalability
| Storage | Max Nodes/Vectors | Horizontal Scaling |
|---|---|---|
| Memory | Limited by RAM | No |
| Neo4j | Billions | Yes (Enterprise) |
| Qdrant | Billions | Yes |
| pgvector | Millions | Yes (with sharding) |
| FalkorDB | Millions | Yes (via Redis cluster) |
| Redis | Billions (keys) | Yes (cluster mode) |
Cost
| Storage | Hosting Cost | License |
|---|---|---|
| Memory | Free | MIT |
| Neo4j | Medium-High | Community (GPL) / Enterprise |
| Qdrant | Low-Medium | Apache 2.0 |
| pgvector | Low | PostgreSQL License |
| FalkorDB | Low | SSPL / Commercial |
| Redis | Low-Medium | RSALv2 / Commercial |
Namespace Isolation
All storage backends support namespace isolation for multi-tenancy:
typescript
// Each namespace gets its own isolated storage
const graph1 = createGraph({
  namespace: "tenant-1",
  storage: { ... }
});
const graph2 = createGraph({
  namespace: "tenant-2",
  storage: { ... }
});
How namespaces are implemented:
- Memory: Separate in-memory instances
- Neo4j: Label-based isolation (namespace__tenant-1)
- DozerDB: Label-based isolation (namespace__tenant-1)
- Qdrant: Collection prefixes (namespace_tenant-1_index)
- pgvector: Table prefixes (namespace_tenant_1_index)
- FalkorDB: Graph name prefixes
- Redis: Key prefixes (namespace:tenant-1:key)
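The naming patterns above can be sketched as small string builders. This reads the examples literally (a fixed `namespace` prefix followed by the tenant name) and is only a guess at the scheme; the helper names are hypothetical, and the actual packages may escape or normalize names differently.

```typescript
// Patterns read literally from the examples above; helper names are hypothetical.
const namespaced = {
  // Neo4j and DozerDB: the namespace becomes part of a node label.
  graphLabel: (ns: string): string => `namespace__${ns}`,
  // Qdrant: the namespace is embedded in the collection name.
  qdrantCollection: (ns: string, name: string): string => `namespace_${ns}_${name}`,
  // pgvector: the example uses underscores, consistent with unquoted SQL
  // identifiers not allowing hyphens, so hyphens are replaced here.
  pgvectorTable: (ns: string, name: string): string =>
    `namespace_${ns.replace(/-/g, "_")}_${name}`,
  // Redis: colon-separated key prefix.
  redisKey: (ns: string, key: string): string => `namespace:${ns}:${key}`,
};

console.log(namespaced.graphLabel("tenant-1"));
console.log(namespaced.qdrantCollection("tenant-1", "index"));
console.log(namespaced.pgvectorTable("tenant-1", "index"));
console.log(namespaced.redisKey("tenant-1", "key"));
```

Whatever the exact scheme, the practical consequence is that two tenants sharing one database never read or write the same label, collection, table, or key.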
Migration & Persistence
Development to Production
typescript
// Step 1: Export from memory storage
const data = await graph.export('json');
// Step 2: Initialize production storage
const prodGraph = createGraph({
  storage: {
    graph: neo4jGraph({ ... }),
    vector: qdrantVector({ ... }),
    kv: redisKV({ ... })
  }
});
// Step 3: Import data
await prodGraph.import(data);
Backup Strategies
Each storage backend has its own backup approach:
- Memory: Use export() to save to JSON
- Neo4j: Use neo4j-admin database dump (neo4j-admin dump in Neo4j 4.x) or GDS backup
- Qdrant: Use collection snapshots
- pgvector: Use pg_dump
- FalkorDB: Use Redis RDB/AOF persistence
- Redis: Use RDB snapshots or AOF logs
Next Steps
- Memory Storage - In-memory and file-based storage
- Neo4j - Graph database with GDS support
- DozerDB - Open-source Neo4j-compatible graph database
- Qdrant - High-performance vector search
- PostgreSQL + pgvector - SQL-based vector storage
- FalkorDB - Redis-based graph database
- Redis - Key-value storage