Storage Overview

GraphRAG SDK uses a three-layer storage architecture to handle different aspects of graph RAG:

  • Graph Store - Stores nodes and edges (entities, relationships, communities)
  • Vector Store - Stores embeddings for similarity search
  • Key-Value Store - Stores metadata, chunks, and other document data
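To make the division of responsibilities concrete, the sketch below models the three layers as small TypeScript interfaces and writes one chunk through all of them. The interface names (`GraphStore`, `VectorStore`, `KVStore`), their methods, and the `indexChunk` helper are illustrative assumptions, not the SDK's actual types.

```typescript
// Hypothetical interfaces: names and methods are illustrative assumptions,
// not the SDK's actual types.
interface GraphStore {
  addEdge(source: string, relation: string, target: string): void;
  neighbors(node: string): string[];
}
interface VectorStore {
  upsert(id: string, embedding: number[]): void;
  size(): number;
}
interface KVStore {
  set(key: string, value: string): void;
  get(key: string): string | undefined;
}

// Toy in-memory implementations, just enough to show the data flow.
class ToyGraph implements GraphStore {
  private edges: Array<[string, string, string]> = [];
  addEdge(source: string, relation: string, target: string): void {
    this.edges.push([source, relation, target]);
  }
  neighbors(node: string): string[] {
    return this.edges.filter(([s]) => s === node).map(([, , t]) => t);
  }
}
class ToyVectors implements VectorStore {
  private vectors = new Map<string, number[]>();
  upsert(id: string, embedding: number[]): void {
    this.vectors.set(id, embedding);
  }
  size(): number {
    return this.vectors.size;
  }
}
class ToyKV implements KVStore {
  private data = new Map<string, string>();
  set(key: string, value: string): void {
    this.data.set(key, value);
  }
  get(key: string): string | undefined {
    return this.data.get(key);
  }
}

// Indexing one chunk touches all three layers.
function indexChunk(
  target: { graph: GraphStore; vector: VectorStore; kv: KVStore },
  chunk: { id: string; text: string; embedding: number[]; triples: Array<[string, string, string]> },
): void {
  target.kv.set(chunk.id, chunk.text); // raw chunk text
  target.vector.upsert(chunk.id, chunk.embedding); // embedding for similarity search
  chunk.triples.forEach(([s, r, t]) => target.graph.addEdge(s, r, t)); // relationships
}

const toyStores = { graph: new ToyGraph(), vector: new ToyVectors(), kv: new ToyKV() };
indexChunk(toyStores, {
  id: "chunk-1",
  text: "Ada Lovelace worked with Charles Babbage.",
  embedding: [0.1, 0.2, 0.3],
  triples: [["Ada Lovelace", "WORKED_WITH", "Charles Babbage"]],
});
console.log(toyStores.graph.neighbors("Ada Lovelace"));
```

Keeping the layers behind separate interfaces is what allows each one to be swapped for a different backend without touching the other two.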

Docker Setup

Start Services

bash
# Neo4j (graph store) — with GDS + APOC plugins
docker run -d --name neo4j \
  -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/password \
  -e NEO4J_PLUGINS='["graph-data-science", "apoc"]' \
  -e NEO4J_ACCEPT_LICENSE_AGREEMENT=yes \
  -v neo4j-data:/data \
  neo4j:5-enterprise

# Qdrant (vector store)
docker run -d --name qdrant \
  -p 6333:6333 -p 6334:6334 \
  -v qdrant_storage:/qdrant/storage \
  qdrant/qdrant:latest

# Redis (KV store)
docker run -d --name redis \
  -p 6379:6379 -v redis-data:/data \
  redis:7-alpine \
  redis-server --appendonly yes --save 900 1 --save 300 10

# PostgreSQL + pgvector (vector store)
docker run -d --name postgres \
  -p 5432:5432 \
  -e POSTGRES_PASSWORD=password -e POSTGRES_DB=graphrag \
  -v pgdata:/var/lib/postgresql/data --shm-size=256mb \
  pgvector/pgvector:pg16

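# FalkorDB (graph store)
# Note: this maps host port 6379, the same host port as the Redis container above.
# Run only one of the two on a given host, or remap one of them
# (for example -p 6380:6379) and adjust the connection settings to match.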
docker run -d --name falkordb \
  -p 6379:6379 -v falkordb-data:/data \
  falkordb/falkordb:latest

Verify Services

bash
# Neo4j — browser at http://localhost:7474
docker exec -it neo4j cypher-shell -u neo4j -p password "RETURN gds.version()"

# Qdrant — dashboard at http://localhost:6333/dashboard
curl http://localhost:6333/collections

# Redis
docker exec -it redis redis-cli ping

# PostgreSQL
docker exec -it postgres psql -U postgres -d graphrag -c "SELECT 1"

# FalkorDB
docker exec -it falkordb redis-cli ping

Stop & Clean Up

bash
# Stop (keeps data)
docker stop neo4j qdrant redis postgres falkordb

# Start again
docker start neo4j qdrant redis postgres falkordb

# Remove containers
docker rm -f neo4j qdrant redis postgres falkordb

# Remove data volumes
docker volume rm neo4j-data qdrant_storage redis-data pgdata falkordb-data

Default Credentials

| Service | Connection | Credentials |
| --- | --- | --- |
| Neo4j | bolt://localhost:7687 | neo4j / password |
| Qdrant | http://localhost:6333 | none (local) |
| Redis | localhost:6379 | none (local) |
| PostgreSQL | localhost:5432 | postgres / password |
| FalkorDB | localhost:6379 | none (local) |
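The table above lists Redis and FalkorDB on the same host port (6379). Docker refuses to start a second container that publishes an already-bound host port, so the two cannot run side by side with these defaults. A small check like the following can catch such collisions before starting services. The `hostPorts` map is transcribed from the `docker run` commands in this guide, and `findPortConflicts` is a hypothetical helper, not part of the SDK.

```typescript
// Host ports taken from the docker run commands in this guide.
const hostPorts: Record<string, number[]> = {
  neo4j: [7474, 7687],
  qdrant: [6333, 6334],
  redis: [6379],
  postgres: [5432],
  falkordb: [6379],
};

// Return every host port claimed by more than one service.
function findPortConflicts(ports: Record<string, number[]>): Record<number, string[]> {
  const owners: Record<number, string[]> = {};
  Object.keys(ports).forEach((service) => {
    (ports[service] ?? []).forEach((port) => {
      const list = owners[port] ?? [];
      list.push(service);
      owners[port] = list;
    });
  });

  const conflicts: Record<number, string[]> = {};
  Object.keys(owners).forEach((key) => {
    const port = Number(key);
    const services = owners[port] ?? [];
    if (services.length > 1) conflicts[port] = services;
  });
  return conflicts;
}

console.log(findPortConflicts(hostPorts)); // { '6379': [ 'redis', 'falkordb' ] }
```

If both services are needed on one host, remapping one of them (for example publishing FalkorDB on 6380) and updating the matching connection settings resolves the conflict.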

Storage Architecture

typescript
const graph = createGraph({
  model: openai("gpt-4o-mini"),
  embedding: openai.embedding("text-embedding-3-small"),
  storage: {
    graph: neo4jGraph({ url: "bolt://localhost:7687", ... }),
    vector: qdrantVector({ url: "http://localhost:6333", ... }),
    kv: redisKV({ host: "localhost", port: 6379 })
  }
});

Each layer can use a different backend, or a single package can serve all three layers (as @graphrag-sdk/in-memory-storage does for development).

Available Storage Packages

Graph Stores

| Package | Database | Best For | Status |
| --- | --- | --- | --- |
| @graphrag-sdk/in-memory-storage | In-memory (Cytoscape) | Development, testing | |
| @graphrag-sdk/neo4j | Neo4j + GDS | Production, community detection | |
| @graphrag-sdk/dozerdb | DozerDB (Neo4j-compatible) | Open-source production | |
| @graphrag-sdk/falkordb | FalkorDB (Redis-based) | Lightweight production | |

Vector Stores

| Package | Database | Best For | Status |
| --- | --- | --- | --- |
| @graphrag-sdk/in-memory-storage | In-memory cosine | Development, testing | |
| @graphrag-sdk/qdrant | Qdrant | High-performance vector search | |
| @graphrag-sdk/pgvector | PostgreSQL + pgvector | SQL-based workflows | |
Key-Value Stores

| Package | Database | Best For | Status |
| --- | --- | --- | --- |
| @graphrag-sdk/in-memory-storage | In-memory Map | Development, testing | |
| @graphrag-sdk/in-memory-storage | JSON files | Persistence without DB | |
| @graphrag-sdk/redis | Redis | Production KV storage | |

Quick Comparison

Development & Testing

Recommended: Use @graphrag-sdk/in-memory-storage for all three layers.

typescript
import { memoryGraph, memoryVector, memoryKV } from '@graphrag-sdk/in-memory-storage';

const graph = createGraph({
  // ... model config
  storage: {
    graph: memoryGraph,
    vector: memoryVector,
    kv: memoryKV,
  }
});

Production

Option 1: Specialized Stack

typescript
import { neo4jGraph } from '@graphrag-sdk/neo4j';
import { qdrantVector } from '@graphrag-sdk/qdrant';
import { redisKV } from '@graphrag-sdk/redis';

const graph = createGraph({
  // ... model config
  storage: {
    graph: neo4jGraph({ url: "bolt://localhost:7687", ... }),
    vector: qdrantVector({ url: "http://localhost:6333", ... }),
    kv: redisKV({ host: "localhost", port: 6379 })
  }
});

Option 2: PostgreSQL-Only Stack

typescript
import { pgVector } from '@graphrag-sdk/pgvector';
import { memoryGraph, memoryKV } from '@graphrag-sdk/in-memory-storage';
// Note: pgvector handles vectors only; the graph layer uses in-memory storage
// until a PostgreSQL graph store is implemented

const graph = createGraph({
  // ... model config
  storage: {
    graph: memoryGraph, // or neo4j/falkordb
    vector: pgVector({ host: "localhost", database: "graphrag", ... }),
    kv: memoryKV, // or redis
  }
});

Option 3: Redis-Based Stack

typescript
import { falkorGraph } from '@graphrag-sdk/falkordb';
import { redisKV } from '@graphrag-sdk/redis';
import { memoryVector } from '@graphrag-sdk/in-memory-storage';

const graph = createGraph({
  // ... model config
  storage: {
    graph: falkorGraph({ host: "localhost", port: 6379 }),
    vector: memoryVector, // or qdrant/pgvector
    kv: redisKV({ host: "localhost", port: 6379 })
  }
});

Storage Requirements by Algorithm

Different algorithms have different storage requirements:

Similarity Graph

  • Graph Store: Optional (stores similarity edges)
  • Vector Store: Required (primary data structure)
  • KV Store: Required (chunk metadata)

LightRAG

  • Graph Store: Required (entities + relations)
  • Vector Store: Required (dual vectors for entities and relations)
  • KV Store: Required (chunk metadata)

Microsoft GraphRAG

  • Graph Store: Required (entities, relations, communities)
  • Vector Store: Required (entity and chunk vectors)
  • KV Store: Required (chunks, community reports)

Fast GraphRAG

  • Graph Store: Required (entities + relations for PageRank)
  • Vector Store: Required (entity vectors)
  • KV Store: Required (chunk metadata)

AWS GraphRAG

  • Graph Store: Required (hierarchical fact graph)
  • Vector Store: Required (chunk and statement vectors)
  • KV Store: Required (chunks, statements, facts)
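The requirements above can be encoded as a lookup table and checked before an algorithm runs. The sketch below is one way to do that; the algorithm keys (such as `"similarity-graph"`) and the `missingLayers` helper are hypothetical names chosen for illustration, not identifiers from the SDK.

```typescript
type Layer = "graph" | "vector" | "kv";
type Need = "required" | "optional";

// Requirements transcribed from the lists above.
// The keys are hypothetical identifiers, not SDK names.
const REQUIREMENTS: Record<string, Record<Layer, Need>> = {
  "similarity-graph": { graph: "optional", vector: "required", kv: "required" },
  "lightrag": { graph: "required", vector: "required", kv: "required" },
  "microsoft-graphrag": { graph: "required", vector: "required", kv: "required" },
  "fast-graphrag": { graph: "required", vector: "required", kv: "required" },
  "aws-graphrag": { graph: "required", vector: "required", kv: "required" },
};

// Return the required layers that the given storage config does not provide.
function missingLayers(
  algorithm: string,
  configured: Partial<Record<Layer, unknown>>,
): Layer[] {
  const needs = REQUIREMENTS[algorithm];
  if (!needs) throw new Error(`Unknown algorithm: ${algorithm}`);
  return (Object.keys(needs) as Layer[]).filter(
    (layer) => needs[layer] === "required" && configured[layer] == null,
  );
}

// A vector + KV setup is enough for the similarity graph...
console.log(missingLayers("similarity-graph", { vector: {}, kv: {} })); // []
// ...but not for LightRAG, which also needs a graph store.
console.log(missingLayers("lightrag", { vector: {}, kv: {} })); // [ 'graph' ]
```

In practice only the similarity graph can run without a graph store; every other algorithm listed needs all three layers.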

Performance Considerations

Latency

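| Storage | Vector Search | Graph Traversal | KV Lookup |
| --- | --- | --- | --- |
| Memory | < 1 ms | < 1 ms | < 1 ms |
| Neo4j | N/A | 10-50 ms | N/A |
| Qdrant | 5-20 ms | N/A | N/A |
| pgvector | 10-30 ms | N/A | N/A |
| FalkorDB | N/A | 5-15 ms | N/A |
| Redis | N/A | N/A | 1-5 ms |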
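As a rough worked example, the figures in the table above can be combined into a storage latency budget for a single query. The sketch assumes a query performs exactly one vector search, one graph traversal, and one KV lookup on the Qdrant + Neo4j + Redis stack, and it ignores network overhead and LLM time; the helper names are hypothetical.

```typescript
type LatencyRange = { min: number; max: number }; // milliseconds

// Ranges copied from the latency table.
const qdrantSearch: LatencyRange = { min: 5, max: 20 };
const neo4jTraversal: LatencyRange = { min: 10, max: 50 };
const redisLookup: LatencyRange = { min: 1, max: 5 };

// Steps that run one after another add up.
function sequential(...steps: LatencyRange[]): LatencyRange {
  return steps.reduce(
    (acc, s) => ({ min: acc.min + s.min, max: acc.max + s.max }),
    { min: 0, max: 0 },
  );
}

// Steps that run concurrently are bounded by the slowest one.
function parallel(...steps: LatencyRange[]): LatencyRange {
  return {
    min: Math.max(...steps.map((s) => s.min)),
    max: Math.max(...steps.map((s) => s.max)),
  };
}

// All three steps in sequence.
const allSequential = sequential(qdrantSearch, neo4jTraversal, redisLookup);
console.log(allSequential); // { min: 16, max: 75 }

// Vector search and traversal issued concurrently, then the KV lookup.
const overlapped = sequential(parallel(qdrantSearch, neo4jTraversal), redisLookup);
console.log(overlapped); // { min: 11, max: 55 }
```

Under these assumptions the graph traversal dominates the budget, so overlapping it with the vector search saves more than tuning the KV layer would.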

Scalability

| Storage | Max Nodes/Vectors | Horizontal Scaling |
| --- | --- | --- |
| Memory | Limited by RAM | No |
| Neo4j | Billions | Yes (Enterprise) |
| Qdrant | Billions | Yes |
| pgvector | Millions | Yes (with sharding) |
| FalkorDB | Millions | Yes (via Redis cluster) |
| Redis | Billions (keys) | Yes (cluster mode) |

Cost

| Storage | Hosting Cost | License |
| --- | --- | --- |
| Memory | Free | MIT |
| Neo4j | Medium-High | Community (GPL) / Enterprise |
| Qdrant | Low-Medium | Apache 2.0 |
| pgvector | Low | PostgreSQL License |
| FalkorDB | Low | SSPL / Commercial |
| Redis | Low-Medium | RSALv2 / Commercial |

Namespace Isolation

All storage backends support namespace isolation for multi-tenancy:

typescript
// Each namespace gets its own isolated storage
const graph1 = createGraph({
  namespace: "tenant-1",
  storage: { ... }
});

const graph2 = createGraph({
  namespace: "tenant-2",
  storage: { ... }
});

How namespaces are implemented:

  • Memory: Separate in-memory instances
  • Neo4j: Label-based isolation (namespace__tenant-1)
  • DozerDB: Label-based isolation (namespace__tenant-1)
  • Qdrant: Collection prefixes (namespace_tenant-1_index)
  • pgvector: Table prefixes (namespace_tenant_1_index)
  • FalkorDB: Graph name prefixes
  • Redis: Key prefixes (namespace:tenant-1:key)
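The naming patterns listed above can be reproduced with a few small helpers. This is a sketch that mirrors the examples in the list, not the SDK's internal code: the function names are hypothetical, and the hyphen-to-underscore replacement for pgvector is inferred from the `namespace_tenant_1_index` example (presumably because hyphens are not valid in unquoted SQL identifiers).

```typescript
// Hypothetical helpers that mirror the naming patterns listed above.
const neo4jLabel = (ns: string): string => `namespace__${ns}`;

const qdrantCollection = (ns: string, indexName: string): string =>
  `namespace_${ns}_${indexName}`;

// Assumption: hyphens become underscores, matching the example in the list.
const pgvectorTable = (ns: string, indexName: string): string =>
  `namespace_${ns.replace(/-/g, "_")}_${indexName}`;

const redisKey = (ns: string, key: string): string => `namespace:${ns}:${key}`;

console.log(neo4jLabel("tenant-1")); // namespace__tenant-1
console.log(qdrantCollection("tenant-1", "index")); // namespace_tenant-1_index
console.log(pgvectorTable("tenant-1", "index")); // namespace_tenant_1_index
console.log(redisKey("tenant-1", "key")); // namespace:tenant-1:key

// Prefixing keeps tenants apart even when they share one physical store:
// the same logical key maps to two distinct entries.
const sharedKeyspace = new Map<string, string>();
sharedKeyspace.set(redisKey("tenant-1", "doc:42"), "tenant 1 data");
sharedKeyspace.set(redisKey("tenant-2", "doc:42"), "tenant 2 data");
console.log(sharedKeyspace.size); // 2
```

Note that a Neo4j label containing a hyphen, such as `namespace__tenant-1`, has to be wrapped in backticks when written by hand in Cypher.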

Migration & Persistence

Development to Production

typescript
// Step 1: Export from memory storage
const data = await graph.export('json');

// Step 2: Initialize production storage
const prodGraph = createGraph({
  storage: {
    graph: neo4jGraph({ ... }),
    vector: qdrantVector({ ... }),
    kv: redisKV({ ... })
  }
});

// Step 3: Import data
await prodGraph.import(data);
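After a migration like this, it is worth confirming that the target holds the same data as the source. The sketch below shows one way to do that with a toy store: export to JSON, import into a fresh instance, and compare counts. The `ToyStore` class, its `exportJson`/`importJson` methods, and the `Snapshot` shape are all hypothetical stand-ins; the SDK's real export format may differ.

```typescript
// Hypothetical snapshot shape; the SDK's real export format may differ.
interface Snapshot {
  nodes: string[];
  edges: Array<[string, string]>;
  chunks: Record<string, string>;
}

// Toy stand-in for a graph instance, used only to illustrate the round trip.
class ToyStore {
  private snapshot: Snapshot = { nodes: [], edges: [], chunks: {} };

  addNode(id: string): void {
    this.snapshot.nodes.push(id);
  }
  addEdge(source: string, target: string): void {
    this.snapshot.edges.push([source, target]);
  }
  addChunk(id: string, text: string): void {
    this.snapshot.chunks[id] = text;
  }
  exportJson(): string {
    return JSON.stringify(this.snapshot);
  }
  importJson(json: string): void {
    this.snapshot = JSON.parse(json) as Snapshot;
  }
  counts(): { nodes: number; edges: number; chunks: number } {
    return {
      nodes: this.snapshot.nodes.length,
      edges: this.snapshot.edges.length,
      chunks: Object.keys(this.snapshot.chunks).length,
    };
  }
}

// Populate a "development" store.
const devStore = new ToyStore();
devStore.addNode("Ada Lovelace");
devStore.addNode("Charles Babbage");
devStore.addEdge("Ada Lovelace", "Charles Babbage");
devStore.addChunk("chunk-1", "Ada Lovelace worked with Charles Babbage.");

// Migrate into a fresh "production" store and verify the counts match.
const prodStore = new ToyStore();
prodStore.importJson(devStore.exportJson());

const migrationOk =
  JSON.stringify(devStore.counts()) === JSON.stringify(prodStore.counts());
console.log(migrationOk); // true
```

Comparing counts is only a coarse check; for stricter validation, spot-check a few known entities and chunks in the target as well.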

Backup Strategies

Each storage backend has its own backup approach:

  • Memory: Use export() to save to JSON
  • Neo4j: Use neo4j-admin database dump (Neo4j 5; neo4j-admin dump in 4.x) or GDS backup
  • Qdrant: Use collection snapshots
  • pgvector: Use pg_dump
  • FalkorDB: Use Redis RDB/AOF persistence
  • Redis: Use RDB snapshots or AOF logs

Released under the Elastic License 2.0. Made with ❤️ by Narek.