What's Actually Happening

Neo4j queries that should finish in under a second are taking 30 seconds or more, and some fail outright with timeouts or out-of-memory errors.

The Error You'll See

```cypher
neo4j> MATCH (u:User)-[:FRIEND]->(f:User) WHERE u.name = 'John' RETURN f;

// Query takes 30+ seconds
// Expected: < 1 second
```

Timeout error:

```bash
org.neo4j.graphdb.QueryExecutionException: Query timed out
```

Memory error:

```bash
java.lang.OutOfMemoryError: Java heap space
```

Cartesian product:

```cypher
// Query plan shows cartesian product warning
```

Why This Happens

  1. Missing indexes - Queries scan all nodes
  2. Cartesian products - Unconstrained patterns
  3. Large result sets - Returning too much data
  4. Memory configuration - Heap too small
  5. Query complexity - Inefficient patterns
  6. Page cache - Insufficient cache size

Step 1: Analyze Query Plan

```cypher
// Use PROFILE to see execution details:
PROFILE MATCH (u:User {name: 'John'})-[:FRIEND]->(f:User) RETURN f;

// Use EXPLAIN for plan without execution:
EXPLAIN MATCH (u:User {name: 'John'})-[:FRIEND]->(f:User) RETURN f;

// Check for:
// 1. AllNodesScan - Full scan (bad, need index)
// 2. NodeByLabelScan - Scan by label (better)
// 3. NodeIndexSeek - Index lookup (good)
// 4. CartesianProduct - Unconstrained join (optimize)

// View in Neo4j Browser:
// Run query and click "Plan" tab

// Check for warnings:
PROFILE MATCH (a:User), (b:User) RETURN a, b;
// Shows: "This query builds a cartesian product"
```
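The operator checklist above can also be applied mechanically to a plan that has been saved as text. The following is a minimal sketch under stated assumptions: the function name `flag_plan_operators` and the sample plan lines are hypothetical, and the helper only looks for operator names as substrings rather than parsing the plan tree, since the exact plan layout differs between Neo4j versions.

```bash
# Hypothetical helper: reads plan text on stdin and prints one warning for
# each operator that usually signals a slow plan. Substring match only.
flag_plan_operators() {
  plan_text="$(cat)"
  case "$plan_text" in
    *AllNodesScan*) echo "AllNodesScan: full scan, likely needs an index" ;;
  esac
  case "$plan_text" in
    *NodeByLabelScan*) echo "NodeByLabelScan: label scan, an index on the filtered property may help" ;;
  esac
  case "$plan_text" in
    *CartesianProduct*) echo "CartesianProduct: unconstrained join, constrain or connect the patterns" ;;
  esac
}

# Example with hand-written stand-in lines (not real Neo4j output):
printf '%s\n' 'ProduceResults' 'CartesianProduct' 'NodeByLabelScan' | flag_plan_operators
```

A plan that contains only NodeIndexSeek operators produces no output, which is the state to aim for after Step 2.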

Step 2: Create Missing Indexes

```cypher
// List existing indexes:
SHOW INDEXES;

// Create index for label property:
CREATE INDEX user_name_index FOR (u:User) ON (u.name);

// Create composite index:
CREATE INDEX user_name_email FOR (u:User) ON (u.name, u.email);

// Create full-text index:
CREATE FULLTEXT INDEX user_fulltext FOR (u:User) ON EACH [u.name, u.email];

// Create constraint (creates unique index):
CREATE CONSTRAINT user_id_unique FOR (u:User) REQUIRE u.id IS UNIQUE;

// Create existence constraint (Enterprise Edition only):
CREATE CONSTRAINT user_name_exists FOR (u:User) REQUIRE u.name IS NOT NULL;

// Check index state; an index is only used once it is ONLINE:
SHOW INDEXES WHERE state = 'POPULATING';
SHOW INDEXES WHERE state = 'ONLINE';

// Wait for eventually consistent full-text indexes to catch up:
CALL db.index.fulltext.awaitEventuallyConsistentIndexRefresh();

// Drop index:
DROP INDEX user_name_index;
```

Step 3: Optimize Query Patterns

```cypher
// Use parameters instead of literals so the cached query plan can be reused:
// BAD:
MATCH (u:User {name: 'John'}) RETURN u;

// GOOD:
MATCH (u:User {name: $name}) RETURN u;

// Use WITH to limit intermediate results:
MATCH (u:User {name: 'John'})
WITH u LIMIT 100
MATCH (u)-[:FRIEND]->(f:User)
RETURN f;

// Use DISTINCT to drop duplicate rows:
MATCH (u:User)-[:FRIEND]->(f:User)
WHERE u.name = 'John'
RETURN DISTINCT f.name;

// Use pattern comprehension for projection:
MATCH (u:User {name: 'John'})
RETURN u.name, [(u)-[:FRIEND]->(f) | f.name] AS friends;

// Avoid cartesian products:
// BAD: every User is paired with every other User
MATCH (a:User), (b:User) RETURN a, b;

// GOOD: both sides are constrained, so the product stays small
MATCH (a:User {name: 'John'})
MATCH (b:User {name: 'Jane'})
RETURN a, b;

// Use index hints:
MATCH (u:User)
USING INDEX u:User(name)
WHERE u.name = 'John'
RETURN u;
```
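To see why the unconstrained pattern is so costly: two disconnected patterns yield one row for every pairing of their matches, so the row count is the product of the two sides. A minimal sketch of that arithmetic follows; the function name `cartesian_rows` and the node counts are illustrative assumptions, not measurements.

```bash
# Rows produced when two patterns have no connecting relationship or
# predicate: each match on the left is paired with each match on the right.
cartesian_rows() {
  echo $(( $1 * $2 ))
}

cartesian_rows 10000 10000   # all users paired with all users: 100000000
cartesian_rows 1 1           # both sides pinned to one node by name: 1
```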

Step 4: Check Memory Configuration

```bash
# Check Neo4j configuration:
grep -E "heap|memory|cache" /etc/neo4j/neo4j.conf

# Settings in neo4j.conf (Neo4j 4.x names; 5.x uses the server.memory.* prefix):
# Heap memory (for query execution):
#   dbms.memory.heap.initial_size=4G
#   dbms.memory.heap.max_size=4G
# Page cache (for data cache):
#   dbms.memory.pagecache.size=4G
# Transaction memory:
#   dbms.tx_state.memory_allocation=OFF_HEAP

# Check current settings:
cypher-shell -u neo4j -p password "CALL dbms.listConfig('memory')"

# Monitor memory usage:
cypher-shell -u neo4j -p password "CALL dbms.queryJmx('org.neo4j:*')"

# Set via environment (official Docker image): config keys become NEO4J_
# variables, with "." written as "_" and "_" written as "__".
# On 5.x images use the server_memory names instead.
docker run \
  -e NEO4J_dbms_memory_heap_max__size=4G \
  -e NEO4J_dbms_memory_pagecache_size=4G \
  neo4j

# Total memory recommendation:
# Heap + PageCache + OS overhead < 80% of RAM
```
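The 80% rule of thumb above is easy to check before editing the configuration. The sketch below is only an illustration: `memory_budget_ok` is a hypothetical helper, all sizes are whole gigabytes, and the 2G figure for OS overhead is an assumed example value.

```bash
# Hypothetical helper. Usage: memory_budget_ok HEAP PAGECACHE OS_OVERHEAD TOTAL_RAM
# Exits 0 when heap + page cache + OS overhead is below 80% of total RAM.
memory_budget_ok() {
  budget_used=$(( $1 + $2 + $3 ))
  # Compare used*100 against ram*80 so the check stays in integer arithmetic.
  [ $(( budget_used * 100 )) -lt $(( $4 * 80 )) ]
}

if memory_budget_ok 4 4 2 16; then
  echo "4G heap + 4G page cache + 2G overhead fits on a 16G host"
fi
if ! memory_budget_ok 4 4 2 12; then
  echo "the same settings do not fit on a 12G host"
fi
```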

Step 5: Optimize Transaction Scope

```cypher
// Keep transactions small:

// BAD: Large transaction
MATCH (u:User) SET u.processed = true;

// GOOD: Batch processing
CALL apoc.periodic.iterate(
  'MATCH (u:User) RETURN u',
  'SET u.processed = true',
  {batchSize: 1000}
);

// Use apoc.periodic.commit for large updates; it reruns the statement until
// it returns 0. A missing property is null, so test for that explicitly:
CALL apoc.periodic.commit(
  'MATCH (u:User) WHERE u.processed IS NULL OR u.processed = false
   WITH u LIMIT 1000
   SET u.processed = true
   RETURN count(u)'
);

// Avoid long-running transactions.
// Transaction timeout in neo4j.conf (db.transaction.timeout in 5.x):
// dbms.transaction.timeout=30s

// List running transactions (4.4 and later; earlier 4.x: CALL dbms.listTransactions()):
SHOW TRANSACTIONS;
```
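Batching trades one large transaction for many small ones, and the number of batches is simply the row count divided by the batch size, rounded up. A small sketch of that calculation; `batch_count` is a hypothetical helper and the node counts are made-up examples.

```bash
# Number of batches needed to cover ROWS rows at BATCH_SIZE rows per batch
# (integer ceiling division).
batch_count() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

batch_count 1000000 1000   # one million User nodes in batches of 1000: 1000
batch_count 2500 1000      # the last batch is partial: 3
```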

Step 6: Monitor Query Statistics

```cypher
// List running queries (Neo4j 4.x; removed in 5.x in favor of SHOW TRANSACTIONS):
CALL dbms.listQueries();

// Kill slow query (4.x; in 5.x use TERMINATE TRANSACTIONS with the transaction id):
CALL dbms.killQuery('query-id');

// Query log settings:
CALL dbms.listConfig('dbms.logs.query');

// Enable query logging in neo4j.conf (4.x names; 5.x uses the db.logs.query.* prefix):
// dbms.logs.query.enabled=INFO
// dbms.logs.query.threshold=1s

// Find running queries that have taken longer than 1 second:
CALL dbms.listQueries() YIELD queryId, query, elapsedTimeMillis
WHERE elapsedTimeMillis > 1000
RETURN query, elapsedTimeMillis
ORDER BY elapsedTimeMillis DESC;

// Page hits and faults:
PROFILE MATCH (u:User {name: 'John'}) RETURN u;
// Look at "pageCacheHits" and "pageCacheMisses"
```
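The two PROFILE counters are easier to compare across runs as a single hit ratio, hits / (hits + misses). The helper below is a hedged sketch: `page_cache_hit_ratio` is a hypothetical name and the counter values are invented for illustration.

```bash
# Hypothetical helper. Usage: page_cache_hit_ratio HITS MISSES
# Prints the hit ratio as a percentage with one decimal place.
page_cache_hit_ratio() {
  if [ $(( $1 + $2 )) -eq 0 ]; then
    echo "n/a"   # no page cache activity recorded
    return 0
  fi
  awk -v h="$1" -v m="$2" 'BEGIN { printf "%.1f\n", 100 * h / (h + m) }'
}

page_cache_hit_ratio 9500 500   # mostly served from cache: 95.0
page_cache_hit_ratio 120 880    # mostly read from disk: 12.0
```

A low ratio on a repeated query suggests the page cache is too small for the working set, which points back to the memory settings in Step 4.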

Step 7: Check Data Statistics

```cypher
// Check node count:
MATCH (n) RETURN count(n);

// Check relationship count:
MATCH ()-[r]->() RETURN count(r);

// Check specific label count:
MATCH (n:User) RETURN count(n);

// Check relationship types:
CALL db.relationshipTypes();

// Check property keys:
CALL db.propertyKeys();

// Database statistics:
CALL db.stats.retrieve('GRAPH COUNTS');

// Check label distribution:
MATCH (n)
WITH labels(n) AS label, count(*) AS count
RETURN label, count
ORDER BY count DESC;

// Check relationship distribution:
MATCH ()-[r]->()
WITH type(r) AS relType, count(*) AS count
RETURN relType, count
ORDER BY count DESC;
```

Step 8: Use APOC for Optimization

```cypher
// APOC procedures for optimization:

// Batch processing:
CALL apoc.periodic.iterate(
  'MATCH (u:User) RETURN u',
  'SET u.lastSeen = datetime()',
  {batchSize: 1000, parallel: true}
);

// Conditional execution: cap the result when a user has more than 100 friends.
// size() over a pattern is 4.x syntax; on 5.x use COUNT { (u)-[:FRIEND]->() }.
MATCH (u:User {name: 'John'})
CALL apoc.when(
  size((u)-[:FRIEND]->()) > 100,
  'MATCH (u)-[:FRIEND]->(f) RETURN f LIMIT 100',
  'MATCH (u)-[:FRIEND]->(f) RETURN f',
  {u: u}
) YIELD value
RETURN value.f;

// General form of apoc.when:
// CALL apoc.when(condition, 'query if true', 'query if false', {params});

// Use apoc.path for traversal (the depth limit is maxLevel):
MATCH (u:User {name: 'John'})
CALL apoc.path.subgraphNodes(u, {
  relationshipFilter: 'FRIEND>',
  maxLevel: 2
}) YIELD node
RETURN node;
```

Step 9: Check Database Health

```bash
# Check database consistency (5.x syntax; run against a stopped database):
neo4j-admin database check database-name

# Store statistics (4.x syntax; 5.x uses "neo4j-admin database info"):
neo4j-admin store-info --all /var/lib/neo4j/data/databases/neo4j

# Check transaction logs:
ls -la /var/lib/neo4j/data/transactions/

# Check database size:
du -sh /var/lib/neo4j/data/databases/neo4j

# Rebuild an index: neo4j-admin has no dedicated subcommand for this;
# drop and recreate the index in Cypher (DROP INDEX ..., then CREATE INDEX ...).

# Collect a diagnostics report (corruption is detected by the consistency check above):
neo4j-admin server report

# Back up (Enterprise Edition):
neo4j-admin database backup neo4j --to-path=/backup

# Monitor via JMX:
curl -u neo4j:password http://localhost:7474/dbms/jmx
```

Step 10: Neo4j Query Verification Script

```bash
# Create verification script:
cat << 'EOF' > /usr/local/bin/check-neo4j-query.sh
#!/bin/bash

echo "=== Neo4j Status ==="
systemctl status neo4j 2>/dev/null || docker ps | grep neo4j

echo ""
echo "=== Memory Configuration ==="
grep -E "heap|pagecache" /etc/neo4j/neo4j.conf 2>/dev/null || echo "Config not accessible"

echo ""
echo "=== Database Statistics ==="
cypher-shell -u neo4j -p password "MATCH (n) RETURN count(n) AS nodes" 2>/dev/null || echo "Cannot connect"

echo ""
echo "=== Indexes ==="
cypher-shell -u neo4j -p password "SHOW INDEXES" 2>/dev/null | head -20

echo ""
echo "=== Constraints ==="
cypher-shell -u neo4j -p password "SHOW CONSTRAINTS" 2>/dev/null | head -20

echo ""
echo "=== Current Queries ==="
cypher-shell -u neo4j -p password "CALL dbms.listQueries() YIELD query, elapsedTimeMillis RETURN query, elapsedTimeMillis ORDER BY elapsedTimeMillis DESC LIMIT 10" 2>/dev/null

echo ""
echo "=== Page Cache Stats ==="
cypher-shell -u neo4j -p password "CALL dbms.queryJmx('org.neo4j:*PageCache*') YIELD attributes RETURN attributes" 2>/dev/null | head -20

echo ""
echo "=== Database Size ==="
du -sh /var/lib/neo4j/data/databases/neo4j 2>/dev/null || echo "Cannot determine size"

echo ""
echo "=== Recommendations ==="
echo "1. Create indexes for frequently queried properties"
echo "2. Use PROFILE to analyze query plans"
echo "3. Avoid cartesian products with proper WHERE clauses"
echo "4. Increase heap and page cache if needed"
echo "5. Use parameters instead of literals"
echo "6. Batch large operations with APOC"
echo "7. Enable query logging for slow queries"
EOF

chmod +x /usr/local/bin/check-neo4j-query.sh

# Usage:
/usr/local/bin/check-neo4j-query.sh
```

Neo4j Query Performance Checklist

| Check | Expected |
|---|---|
| Indexes | Created for queried properties |
| Query plan | Uses index seeks |
| Memory | Heap and page cache adequate |
| No cartesian products | Constrained patterns |
| Batch processing | Large updates batched |
| Parameters | Used instead of literals |
| Query logging | Enabled for monitoring |

Verify the Fix

```bash
# After fixing Neo4j query performance

# 1. Check indexes online
cypher-shell -u neo4j -p password "SHOW INDEXES WHERE state = 'ONLINE';"
# All indexes online

# 2. Run slow query
cypher-shell -u neo4j -p password "PROFILE MATCH (u:User {name: 'John'})-[:FRIEND]->(f) RETURN f;"
# Uses NodeIndexSeek

# 3. Check execution time
# Query completes in < 1s

# 4. Monitor page cache
cypher-shell -u neo4j -p password "CALL dbms.queryJmx('org.neo4j:*PageCache*');"
# High hit ratio

# 5. Check memory usage
# No OOM errors

# 6. Verify query logs
tail -f /var/log/neo4j/query.log
# Slow queries logged
```
