
Graph Database Benchmark: Neo4j vs FalkorDB vs Memgraph

Ekrem Sarı
updated on Mar 27, 2026

We benchmarked Neo4j, FalkorDB, and Memgraph on a synthetic graph derived from 120,000 Amazon product reviews (381K nodes, 804K edges). We ran 12 query templates with 1,000 measurements each, tested ingestion at 6 batch sizes, sustained concurrent load for 60 seconds at up to 32 threads, and measured memory, cold start, mixed workload, and index impact.

FalkorDB delivered higher throughput than Neo4j and Memgraph at 8 threads.

Graph database benchmark results

Concurrent throughput


QPS (queries per second) measures how many read queries the database answers per second under sustained multi-threaded load. Each run lasts 60 seconds. Higher is better.
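The measurement loop can be sketched as follows. This is a minimal illustration of sustained multi-threaded QPS measurement, not the benchmark's actual harness; `run_query` is a placeholder for a driver call against the database.

```python
import threading
import time

def measure_qps(run_query, threads: int, duration_s: float) -> float:
    """Run `run_query` from `threads` workers for `duration_s` seconds
    and return completed queries per second."""
    counts = [0] * threads  # one slot per worker, no shared-counter contention
    stop = time.monotonic() + duration_s

    def worker(i: int) -> None:
        while time.monotonic() < stop:
            run_query()
            counts[i] += 1

    pool = [threading.Thread(target=worker, args=(i,)) for i in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return sum(counts) / duration_s

# Example with a dummy in-process "query" standing in for a database call:
qps = measure_qps(lambda: sum(range(100)), threads=4, duration_s=0.2)
```

In the real benchmark the duration is 60 seconds and each worker draws sessions from a connection pool.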

Query latency (p50)

p50 is the median latency: half of all queries finish faster than this value. Lower is better.

  • Point lookup: Fetch a single node by ID. FalkorDB’s Redis hash tables do O(1) in-memory lookups, making it roughly 3x faster than the other two.
  • Traversal: Walk from a node to its neighbors (1-hop) or neighbors-of-neighbors (2-hop). FalkorDB completes the 2-hop traversal roughly 2.9x faster.
  • Aggregation: Count reviews per brand, compute average star ratings.
  • Filter + scan: Filter reviews by star rating across the full dataset.

Ingestion throughput

Ingestion throughput measures how many reviews per second the database can write. Each point on the chart is a different batch size: how many reviews are grouped into a single query. Higher is better.

At batch size 1, Memgraph leads (1,427/s). As batch size increases, FalkorDB scales steeply and crosses Memgraph around batch 500. Neo4j plateaus at ~10,600/s regardless of batch size. At batch 5,000, FalkorDB reaches 22,784/s, 77x its batch-1 performance.
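Batched ingestion typically groups rows into a single parameterized query so the database unrolls the list server-side. A minimal sketch of the batching side, with an illustrative UNWIND-style query (the property names are assumptions, not the benchmark's exact schema):

```python
from itertools import islice

def batches(items, size):
    """Yield successive lists of up to `size` items."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk

# One parameterized query per batch; $rows is bound to a list of dicts.
# (Labels and property names here are illustrative.)
INSERT_BATCH = """
UNWIND $rows AS row
CREATE (:Review {id: row.id, stars: row.stars, text: row.text})
"""

reviews = [{"id": i, "stars": 5, "text": "ok"} for i in range(12)]
batch_sizes = [len(b) for b in batches(reviews, 5)]  # [5, 5, 2]
```

Larger batches amortize per-query parsing and round-trip overhead, which is why throughput climbs with batch size until some other bottleneck takes over.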

You can read more about our graph database benchmark methodology.

Key findings

FalkorDB hits 6,693 QPS at 8 threads, 6.7x Neo4j

Redis’s in-memory data structures and event loop let FalkorDB combine low-latency queries with high parallelism. Past 8 threads, throughput plateaus because Redis’s single-threaded core becomes the ceiling. Neo4j peaks at 16 threads (1,010 QPS) and drops at 32 (927 QPS), which points to thread contention.

FalkorDB cold starts in 1.1ms, 82x faster than Neo4j

Neo4j takes 90ms to accept its first query after a restart. The first warming query runs at 274ms, then it takes about 3 queries to settle down to 34ms. FalkorDB is ready in 1.1ms, first query at 0.4ms. In a microservice or serverless setup where pods scale up and down, that gap matters.

Indexes: 1,700x difference on Neo4j, ~1x on FalkorDB

Without indexes, Neo4j’s deep_feature_products query took 293ms. With indexes, 0.17ms. That is a 1,712x difference. Memgraph showed similar sensitivity (160-898x depending on the query). FalkorDB’s results stayed roughly the same with or without indexes because Redis hash tables already function as implicit indexes.
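For reference, a Neo4j-style index creation statement has the following shape (the index name, label, and property here are illustrative, not the benchmark's exact schema):

```cypher
CREATE INDEX product_id_idx IF NOT EXISTS
FOR (p:Product) ON (p.product_id);
```

Without such an index, an equality match on `product_id` forces a full label scan, which is where the three-orders-of-magnitude gap comes from.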

Memory: 415MB vs 2,668MB for the same graph

  • Memgraph: 415MB
  • FalkorDB: 496MB
  • Neo4j: 2,668MB (JMX heap used)

Neo4j’s JVM pre-allocates 4GB at startup, so its process-level memory (VmRSS) is always ~5.2GB regardless of actual data usage. The JMX heap metric is the meaningful one. The peak of 2.7GB is the number to use for capacity planning.

Neo4j won the heaviest aggregation

FalkorDB had the lowest latency on 11 of 12 queries. The exception was agg_feature_sentiment (group by sentiment with filtering), where Neo4j’s query optimizer produced a better execution plan: 131ms vs FalkorDB’s 152ms.

Mixed workload (80% read, 20% write)

8 threads, 60 seconds, zero errors across all three databases:

  • FalkorDB: 50,223 ops (837 QPS)
  • Neo4j: 44,256 ops (738 QPS)
  • Memgraph: 28,040 ops (467 QPS)

Write operations did not noticeably degrade read performance on any of them.

Architectures in this benchmark

Each database ships its own management UI. These screenshots show the same dataset (16,127 nodes, 24,318 edges) loaded into all three, running the same COMPARED_WITH traversal query.

FalkorDB

FalkorDB is a graph module built on Redis’s in-memory key-value store. Queries are openCypher, but underneath the data lives in Redis hash tables; that is why point lookups land at 0.044-0.048ms, while the other two databases measured 2-3x higher on the same queries. The tradeoff is Redis’s single-threaded core: concurrent throughput stops scaling past 8 threads.

FalkorDB Browser. Left panel shows graph metadata (6 node labels, 8 edge types, 9 property keys). The query panel runs openCypher directly. Memory usage reported as 4 MB for the graph index alone (data lives in Redis memory).

Neo4j

Neo4j runs on the JVM. JIT compilation means repeated queries get faster over time (warmup: 274ms -> 34ms). GC pauses affect tail latency but get caught by IQR outlier removal. The query optimizer handles complex aggregation plans well, and that’s where the agg_feature_sentiment win comes from. The cost is 4GB heap pre-allocation and GC overhead.

Neo4j Browser. Left sidebar shows node labels (Brand, Category, Feature, Product, Review, Reviewer), relationship types, and property keys. The bottom panel renders a COMPARED_WITH traversal as an interactive graph. Same 16,127 nodes and 24,318 relationships.

Memgraph

Memgraph is written in C++. No JVM overhead. 415MB for the full dataset, lowest of the three. Fastest at individual inserts (1,427/s) thanks to minimal per-query overhead. But it falls behind on concurrent throughput (684 QPS peak). Bolt-compatible, so it works with the Neo4j driver.

Memgraph Lab graph schema. 16,127 nodes, 24,318 relationships across 6 node types and 8 edge types.

Graph database benchmark methodology

Environment

  • RunPod 8 vCPU (AMD EPYC x86_64), 32GB RAM, Ubuntu 24.04 LTS
  • Native install, no Docker. All three databases on the same machine, localhost connections.
  • Python 3.12.3. Persistent sessions for single-threaded tests, per-call sessions from a connection pool for multi-threaded tests.

Data

  • 120,000 synthetic reviews generated from Zipf (brands, features) and Poisson (entities, relationships) distributions, fixed seed=42.
  • 6 node types: Review, Product, Reviewer, Brand, Feature, Category
  • 8 edge types: ABOUT, WRITTEN_BY, IN_CATEGORY, MADE_BY, HAS_POSITIVE, HAS_NEGATIVE, MENTIONS, COMPARED_WITH
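The generation approach can be sketched with the standard library alone. This is an illustrative reconstruction under stated assumptions (brand count, Poisson mean, and the exact weighting scheme are not specified in the article); only the fixed seed and the Zipf/Poisson choice come from the source.

```python
import math
import random

random.seed(42)  # the benchmark fixes seed=42 for reproducibility

NUM_BRANDS = 50  # assumption for illustration
# Zipf-like weights: brand k gets weight 1/k, so a few brands dominate.
weights = [1 / k for k in range(1, NUM_BRANDS + 1)]

def sample_brand() -> int:
    """Pick a brand index, heavily skewed toward low indices."""
    return random.choices(range(NUM_BRANDS), weights=weights, k=1)[0]

def reviews_per_product(mean: float = 4.0) -> int:
    """Poisson sample via Knuth's multiplication method (stdlib-only)."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= random.random()
        if p <= limit:
            return k
        k += 1

brands = [sample_brand() for _ in range(10_000)]
# Under Zipf weights, brand 0 is drawn far more often than brand 49.
```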

Queries

12 Cypher templates across 5 categories: point lookup (3), traversal (1-hop: 2, 2-hop: 2), aggregation (3), filter (1), and full scan (1). Each parameterized query runs with 10 different parameter values, 100 times each, for 1,000 measurements per query per database.

Parameters are sampled from the full ID space using Zipf-weighted selection, so both popular and rare items are tested.

Two examples:

Point lookup: Fetch a single node by indexed ID:
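A representative form of this template (the label and property name are assumptions; the article does not show its exact query text):

```cypher
MATCH (r:Review {review_id: $id})
RETURN r
```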

2-hop traversal: Walk from a brand through its products to their reviews:
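A representative form, again with assumed property names and edge directions:

```cypher
MATCH (b:Brand {name: $brand})<-[:MADE_BY]-(p:Product)<-[:ABOUT]-(r:Review)
RETURN p, r
```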

Measurement

  • Timing: time.perf_counter_ns(), 500 warmup queries, 100 runs per query minimum
  • Statistics: 10,000 bootstrap samples, 95% CI, IQR outlier removal (3.0x factor). Both raw and filtered data are reported.
  • Memory: Neo4j via JMX heap used (VmRSS is meaningless because JVM pre-allocates), FalkorDB via Redis used_memory_rss, Memgraph via /proc/{pid}/status VmRSS.
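The statistics pipeline above can be sketched in a few lines of stdlib Python. This is an illustrative reconstruction, not the benchmark's actual code: it applies IQR outlier removal with a 3.0x factor, then a percentile bootstrap CI on the mean.

```python
import random
import statistics

def iqr_filter(xs, factor=3.0):
    """Drop samples outside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    q1, _, q3 = statistics.quantiles(xs, n=4)
    iqr = q3 - q1
    lo, hi = q1 - factor * iqr, q3 + factor * iqr
    return [x for x in xs if lo <= x <= hi]

def bootstrap_ci(xs, n_resamples=10_000, alpha=0.05, seed=0):
    """95% percentile bootstrap CI of the mean."""
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(xs, k=len(xs)))
        for _ in range(n_resamples)
    )
    lo = means[int(alpha / 2 * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

latencies = [0.9, 1.0, 1.1] * 7 + [50.0]   # one GC-pause-style outlier
filtered = iqr_filter(latencies)           # the 50.0 sample is dropped
lo, hi = bootstrap_ci(filtered, n_resamples=1000)
```

Reporting both raw and filtered data, as the benchmark does, keeps the occasional GC pause visible without letting it dominate the summary statistics.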

Fairness

  • Same connection pool size, warmup count, Cypher queries, data, and machine across all three databases.
  • Concurrent test: 60-second sustained load at 1, 2, 4, 8, 16, and 32 threads with a fixed pool_size=32. Query mix: 40% 1-hop traversal, 30% 2-hop traversal, 20% aggregation, 10% 3-hop traversal.

Databases tested

Limitations

Single machine, single node per database. No distributed or cluster benchmarking. Neo4j Enterprise clustering and Memgraph replication are outside the scope.

Synthetic data with distributions derived from real Amazon reviews. May not match specific production workload patterns.

Not measured: disk persistence/recovery, full-text search, graph algorithms (PageRank, community detection), and write-heavy workloads (>50% writes).

Different drivers: Neo4j and Memgraph used the Neo4j Python driver, FalkorDB used its own. Overhead difference was <0.5ms in single-threaded tests.

Conclusion

FalkorDB won 11 of 12 queries, hit 6,693 QPS, and cold-started in 1.1ms. For read-heavy graph workloads, it is the fastest option in this benchmark. Memgraph is the most memory-efficient option (415MB vs 2.7GB). Neo4j offers the broadest ecosystem: RBAC, clustering, monitoring, and a query optimizer that handles complex aggregation plans better than either alternative.

Architecture determines the ceiling. Distributed clusters, 1M+ node graphs, and write-heavy workloads are the tests that could reshuffle these rankings.

Ekrem Sarı
AI Researcher
Ekrem is an AI Researcher at AIMultiple, focusing on intelligent automation, GPUs, AI Agents, and RAG frameworks.