What's Actually Happening
Your PostgreSQL table has a JSONB column with a GIN index, but queries against JSON fields are running slow because the index isn't being used. Instead of fast index scans, PostgreSQL performs full table scans, causing significant performance degradation.
The Error You'll See
Query execution shows sequential scan:
```sql
EXPLAIN ANALYZE
SELECT * FROM products WHERE data->>'category' = 'electronics';

                        QUERY PLAN
----------------------------------------------------------
 Seq Scan on products  (cost=0.00..50000.00 rows=1000)
   Filter: ((data ->> 'category'::text) = 'electronics'::text)
   Rows Removed by Filter: 999000
 Planning Time: 0.5 ms
 Execution Time: 850.00 ms    -- Too slow!
```
Index exists but isn't used:
```sql
\d products

      Table "public.products"
 Column |  Type
--------+---------
 id     | integer
 data   | jsonb

Indexes:
    "products_data_idx" gin (data)    -- Index exists!
```
PostgreSQL has no index hints, and discouraging sequential scans does not help either:
```sql
SET enable_seqscan = off;

EXPLAIN
SELECT * FROM products WHERE data->>'category' = 'electronics';

-- Still shows Seq Scan or very expensive bitmap scan
```
Why This Happens
1. Wrong operator: the `->>` operator doesn't use a GIN index by default
2. GIN index type mismatch: default GIN doesn't support `->>` extraction
3. Missing jsonb_path_ops: using default GIN instead of optimized path ops
4. Query returns too many rows: PostgreSQL chooses a seq scan for large result sets
5. Statistics outdated: the planner has wrong cardinality estimates
6. Partial index mismatch: the query doesn't match the partial index condition
7. Expression not indexed: the query uses a different expression than the one indexed
8. JSONB structure mismatch: the query doesn't match the indexed containment pattern
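
To experiment with these causes safely, it helps to have a disposable copy of the problem. The following is a minimal sketch meant for a throwaway database: the row count, the generated `category`, `price`, and `status` values, and the 1% share of `electronics` rows are all illustrative assumptions, not data from a real system.

```sql
-- Scratch setup with hypothetical data: run only in a throwaway database.
CREATE TABLE products (
    id   integer,
    data jsonb
);

-- About 1% of rows get category "electronics"; the rest are spread
-- over 20 made-up categories.
INSERT INTO products (id, data)
SELECT
    g,
    jsonb_build_object(
        'category', CASE WHEN g % 100 = 0 THEN 'electronics'
                         ELSE 'category_' || (g % 20) END,
        'price',    (g % 1000) + 1,
        'status',   CASE WHEN g % 10 = 0 THEN 'inactive' ELSE 'active' END
    )
FROM generate_series(1, 100000) AS g;

CREATE INDEX products_data_idx ON products USING gin (data);
ANALYZE products;

-- The default GIN index cannot serve a ->> comparison, so expect a Seq Scan:
EXPLAIN SELECT * FROM products WHERE data->>'category' = 'electronics';
```

Keeping the matching share small matters here: if a quarter of the table matched, a sequential scan would be the right plan even with a suitable index, which would blur cause 1 with cause 4.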
Step 1: Analyze Current Index and Query
```sql
-- Check existing indexes:
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'products';

-- Sample result:
-- products_data_idx: gin (data)

-- Check index usage statistics
-- (pg_stat_user_indexes names its columns relname and indexrelname):
SELECT
    schemaname,
    relname,
    indexrelname,
    idx_scan,       -- Number of index scans
    idx_tup_read,   -- Index entries returned by scans
    idx_tup_fetch   -- Live table rows fetched by simple index scans
FROM pg_stat_user_indexes
WHERE relname = 'products';

-- If idx_scan is 0, the index has not been used since statistics were last reset

-- Check query execution plan:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT * FROM products WHERE data->>'category' = 'electronics';

-- Check table statistics:
SELECT relname, relpages, reltuples, relallvisible
FROM pg_class
WHERE relname = 'products';

-- Check JSONB column statistics:
SELECT attname, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'products' AND attname = 'data';
```
Step 2: Understand GIN Index Operator Compatibility
```sql
-- The default GIN operator class (jsonb_ops) supports these operators:
-- @>, ?, ?|, ?&, plus the jsonpath operators @? and @@ (PostgreSQL 12+)
-- Note: <@ is not supported by GIN indexes on jsonb

-- Test containment operator (uses index):
EXPLAIN ANALYZE
SELECT * FROM products WHERE data @> '{"category": "electronics"}'::jsonb;

-- Expected: Bitmap Index Scan on products_data_idx

-- Test existence operator (can use index):
EXPLAIN ANALYZE
SELECT * FROM products WHERE data ? 'category';

-- Expected: Bitmap Index Scan on products_data_idx
-- (if nearly every row has the key, the planner may still prefer a Seq Scan)

-- But extraction operator DOES NOT use default GIN:
EXPLAIN ANALYZE
SELECT * FROM products WHERE data->>'category' = 'electronics';

-- Result: Seq Scan (index not used!)

-- Check which operators each jsonb GIN operator class supports:
SELECT opc.opcname, amop.amopopr::regoperator AS operator
FROM pg_opclass opc
JOIN pg_am am ON am.oid = opc.opcmethod
JOIN pg_amop amop ON amop.amopfamily = opc.opcfamily
WHERE am.amname = 'gin'
  AND opc.opcname IN ('jsonb_ops', 'jsonb_path_ops')
ORDER BY opc.opcname, amop.amopstrategy;

-- Default GIN: jsonb_ops (supports @>, ?, ?|, ?&, @?, @@)
-- Optimized: jsonb_path_ops (supports @>, @?, @@ only; usually faster and smaller)
```
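
Besides containment and key existence, GIN indexes on `jsonb` can also serve the jsonpath operators `@?` and `@@`, which are available from PostgreSQL 12. The sketch below assumes the `products` table and a GIN index on `data`. Whether the planner actually picks the index still depends on selectivity, so treat the plans as something to check rather than a guarantee.

```sql
-- jsonpath match (@@): true when the path predicate evaluates to true
EXPLAIN
SELECT * FROM products
WHERE data @@ '$.category == "electronics"';

-- jsonpath exists (@?): true when the path returns at least one item
EXPLAIN
SELECT * FROM products
WHERE data @? '$.category ? (@ == "electronics")';

-- Only equality conditions inside the path can be answered from the index;
-- a range condition such as '$.price > 100' still needs another strategy.
```

These forms are useful when a query is already written in jsonpath and rewriting it as an `@>` containment test would be awkward.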
Step 3: Create Expression Index for `->>` Operator
```sql
-- Create expression index for specific JSON key:
CREATE INDEX products_category_idx ON products ((data->>'category'));

-- Or, to support LIKE 'prefix%' searches in a non-C locale, use text_pattern_ops
-- (under a different name, since products_category_idx already exists):
CREATE INDEX products_category_pattern_idx ON products ((data->>'category') text_pattern_ops);

-- Test query now:
EXPLAIN ANALYZE
SELECT * FROM products WHERE data->>'category' = 'electronics';

-- Expected: Index Scan (or Bitmap Index Scan) using products_category_idx

-- Create multiple expression indexes:
CREATE INDEX products_price_idx ON products ((data->>'price'));
CREATE INDEX products_brand_idx ON products ((data->>'brand'));
CREATE INDEX products_status_idx ON products ((data->>'status'));

-- For numeric JSON values, cast properly
-- (casting jsonb directly to numeric needs PostgreSQL 11+;
--  on older versions use ((data->>'price')::numeric) in both index and query):
CREATE INDEX products_price_numeric_idx ON products (((data->'price')::numeric));

-- Query with cast:
SELECT * FROM products WHERE (data->'price')::numeric > 100;
-- Uses products_price_numeric_idx

-- Verify indexes created:
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'products';
```
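
An expression index is only considered when the query contains the same expression that was indexed. The sketch below assumes `products_category_idx` on `(data->>'category')` exists. The index name `products_category_lower_idx` is hypothetical and only illustrates the fix.

```sql
-- Same expression as the index definition: products_category_idx is usable
EXPLAIN SELECT * FROM products WHERE data->>'category' = 'electronics';

-- Different expression (-> returns jsonb, not text): the index is not usable
EXPLAIN SELECT * FROM products WHERE data->'category' = '"electronics"';

-- Wrapping the expression in a function also breaks the match
EXPLAIN SELECT * FROM products WHERE lower(data->>'category') = 'electronics';

-- If case-insensitive lookups are needed, index that exact expression instead
CREATE INDEX products_category_lower_idx
    ON products ((lower(data->>'category')));
```

When a plan unexpectedly shows a Seq Scan, comparing the `Filter:` line in the `EXPLAIN` output with the `indexdef` column of `pg_indexes` is a quick way to spot this kind of mismatch.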
Step 4: Use jsonb_path_ops for Containment Queries
```sql
-- Drop default GIN index if only using containment:
DROP INDEX IF EXISTS products_data_idx;

-- Create optimized jsonb_path_ops index:
CREATE INDEX products_data_path_idx ON products USING gin (data jsonb_path_ops);

-- This is usually smaller and faster for @> queries
-- But it does not support the key-existence operators ?, ?| and ?&

-- Test containment query:
EXPLAIN ANALYZE
SELECT * FROM products WHERE data @> '{"category": "electronics"}';

-- Expected: Bitmap Index Scan on products_data_path_idx
-- Often faster than default GIN for containment

-- Compare index sizes:
SELECT indexname, pg_relation_size(indexname::regclass) AS index_size_bytes
FROM pg_indexes
WHERE tablename = 'products';

-- jsonb_path_ops is typically 2-3x smaller (the exact ratio depends on the data)

-- For mixed query patterns, keep both:
CREATE INDEX products_data_gin_idx ON products USING gin (data);
CREATE INDEX IF NOT EXISTS products_data_path_idx ON products USING gin (data jsonb_path_ops);
-- PostgreSQL will choose the appropriate index
```
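
The trade-off of `jsonb_path_ops` is easy to miss until a query silently falls back to a sequential scan. The sketch below assumes only the `jsonb_path_ops` index exists on `products.data`. The `discount` key is a hypothetical example field.

```sql
-- Supported by jsonb_path_ops: containment of a key/value pair
EXPLAIN SELECT * FROM products WHERE data @> '{"status": "active"}';

-- Not supported by jsonb_path_ops: key existence.
-- With only the jsonb_path_ops index present, this query cannot use it.
EXPLAIN SELECT * FROM products WHERE data ? 'discount';

-- Containment cannot express "key exists with any value", so keep a
-- default (jsonb_ops) GIN index if ?, ?| or ?& queries matter.
```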
Step 5: Rewrite Query to Use Index-Compatible Operators
```sql
-- Original slow query:
SELECT * FROM products WHERE data->>'category' = 'electronics';
-- Uses Seq Scan

-- Rewrite using containment (uses GIN):
SELECT * FROM products WHERE data @> '{"category": "electronics"}';
-- Uses Bitmap Index Scan

-- For multiple conditions:
SELECT * FROM products WHERE data @> '{"category": "electronics", "status": "active"}';
-- One index scan covers both conditions

-- For range queries, combine:
SELECT * FROM products
WHERE data @> '{"category": "electronics"}'
  AND (data->'price')::numeric < 500;
-- Can use GIN for category and the numeric expression index for price
-- (the planner may combine both or pick the more selective one)

-- Use ? operator for key existence:
SELECT * FROM products WHERE data ? 'category';
-- Uses GIN index (default jsonb_ops class only)

-- Use ?| for multiple keys:
SELECT * FROM products WHERE data ?| ARRAY['category', 'price', 'brand'];
-- Uses GIN index (default jsonb_ops class only)

-- Use ?& for all keys required:
SELECT * FROM products WHERE data ?& ARRAY['category', 'price'];
-- Uses GIN index (default jsonb_ops class only)
```
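
Rewriting `->>` comparisons as containment is not always a drop-in change, because the two operators compare different things: `->>` compares the extracted text, while `@>` compares JSON values including their types. The statements below use literal values only, so they can be run in any session to check the behavior before rewriting a query.

```sql
-- ->> compares text, so the JSON number 100 matches the string '100'
SELECT ('{"price": 100}'::jsonb ->> 'price') = '100';          -- true

-- @> compares typed JSON values: the number 100 is not the string "100"
SELECT '{"price": 100}'::jsonb @> '{"price": "100"}'::jsonb;   -- false
SELECT '{"price": 100}'::jsonb @> '{"price": 100}'::jsonb;     -- true

-- Containment on arrays matches elements, not the whole array
SELECT '{"tags": ["sale", "new"]}'::jsonb @> '{"tags": ["sale"]}'::jsonb;  -- true
SELECT ('{"tags": ["sale", "new"]}'::jsonb ->> 'tags') = '["sale"]';       -- false
```

If the stored documents mix numbers and numeric strings for the same key, the containment rewrite can return fewer rows than the original query, so it is worth comparing row counts for both forms.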
Step 6: Create Composite Indexes for Complex Queries
```sql
-- For queries filtering on JSON + regular columns
-- (assumes products also has a created_at column):
CREATE INDEX products_category_created_idx ON products ((data->>'category'), created_at DESC);

-- Query:
SELECT * FROM products
WHERE data->>'category' = 'electronics'
ORDER BY created_at DESC
LIMIT 10;
-- Uses composite index

-- For nested JSON paths:
CREATE INDEX products_specs_memory_idx ON products ((data->'specs'->>'memory'));

-- Query:
SELECT * FROM products WHERE data->'specs'->>'memory' = '16GB';
-- Uses index

-- For arrays in JSONB:
CREATE INDEX products_tags_idx ON products USING gin ((data->'tags') jsonb_path_ops);

-- Query:
SELECT * FROM products WHERE data->'tags' @> '["featured", "sale"]';
-- Uses index

-- Partial index for specific conditions:
CREATE INDEX products_active_category_idx ON products ((data->>'category'))
WHERE data->>'status' = 'active';

-- Query:
SELECT * FROM products
WHERE data->>'status' = 'active'
  AND data->>'category' = 'electronics';
-- Uses partial index
```
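
A partial index is only a candidate when the planner can prove that the query's `WHERE` clause implies the index predicate. The sketch below assumes the partial index `products_active_category_idx` defined above. The `'archived'` status value is a hypothetical example.

```sql
-- Predicate repeated in the query: the partial index is a candidate
EXPLAIN
SELECT * FROM products
WHERE data->>'status' = 'active'
  AND data->>'category' = 'electronics';

-- No status condition: the partial index cannot be used, because it
-- does not contain rows whose status is something other than 'active'
EXPLAIN
SELECT * FROM products
WHERE data->>'category' = 'electronics';

-- Different status value: also outside the partial index
EXPLAIN
SELECT * FROM products
WHERE data->>'status' = 'archived'
  AND data->>'category' = 'electronics';
```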
Step 7: Update Statistics and Analyze Query Plans
```sql
-- Update statistics:
ANALYZE products;

-- For large tables, analyze only the JSONB column:
ANALYZE products (data);

-- Check if statistics updated:
SELECT relname, last_analyze, last_autoanalyze, autoanalyze_count
FROM pg_stat_user_tables
WHERE relname = 'products';

-- Increase statistics target for JSONB column:
ALTER TABLE products ALTER COLUMN data SET STATISTICS 1000;

-- Reanalyze after changing target:
ANALYZE products;

-- Check updated statistics:
SELECT n_distinct, most_common_vals::text, correlation
FROM pg_stats
WHERE tablename = 'products' AND attname = 'data';

-- Nudge the planner toward index access (for example on SSD storage):
SET random_page_cost = 1.1;  -- Lower cost for index access (default is 4.0)
SET cpu_tuple_cost = 0.01;   -- This is already the default value

-- Or temporarily disable seq scan for testing:
SET enable_seqscan = off;
SET enable_bitmapscan = on;

-- Run query again:
EXPLAIN ANALYZE
SELECT * FROM products WHERE data @> '{"category": "electronics"}';
-- Should use index

-- Reset to defaults:
RESET ALL;
```
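
Statistics on the whole `data` column say little about individual keys, but `ANALYZE` also gathers statistics for the expressions of expression indexes. The sketch below assumes an expression index named `products_category_idx` on `(data->>'category')` and PostgreSQL 11 or later for the `ALTER INDEX ... SET STATISTICS` form.

```sql
-- Statistics for an expression index are listed under the index name
SELECT tablename, attname, n_distinct, most_common_vals::text
FROM pg_stats
WHERE tablename = 'products_category_idx';

-- Raise the statistics target for the indexed expression
-- (column 1 of the index), then re-analyze the table to refresh it
ALTER INDEX products_category_idx ALTER COLUMN 1 SET STATISTICS 1000;
ANALYZE products;
```

If the first query returns no rows, the table has probably not been analyzed since the index was created.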
Step 8: Optimize JSONB Structure for Indexing
```sql
-- Original flat structure:
-- data: {"category": "electronics", "price": 100, "brand": "Sony"}

-- Index works for: WHERE data @> '{"category": "electronics"}'

-- But for nested queries, structure matters:
-- data: {"specs": {"memory": "16GB", "cpu": "i7"}, "category": "electronics"}

-- Create index for nested path:
CREATE INDEX products_specs_idx ON products USING gin ((data->'specs') jsonb_path_ops);

-- Query nested:
SELECT * FROM products WHERE data->'specs' @> '{"memory": "16GB"}';

-- For deeply nested, use expression index:
CREATE INDEX products_specs_cpu_idx ON products ((data#>'{specs,cpu}'::text[]));

-- Query (the right-hand side is a jsonb string, hence the inner double quotes):
SELECT * FROM products WHERE data#>'{specs,cpu}' = '"i7"';
-- Uses index

-- Flatten frequently queried fields:
-- Instead of nested, keep top-level:
-- data: {"category": "electronics", "memory": "16GB", "cpu": "i7"}

-- Then simple containment works:
SELECT * FROM products WHERE data @> '{"category": "electronics", "memory": "16GB"}';
```
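
Flattening is not the only option: containment is evaluated recursively, so a GIN index on the whole `data` column can also answer nested lookups, as long as the pattern repeats the nesting. The first two statements use literals only and can be run anywhere. The last one assumes the `products` table with a GIN index on `data`.

```sql
-- Containment follows the nesting of the pattern
SELECT '{"specs": {"memory": "16GB", "cpu": "i7"}, "category": "electronics"}'::jsonb
       @> '{"specs": {"memory": "16GB"}}'::jsonb;   -- true

-- The level matters: "memory" is not a top-level key in this document
SELECT '{"specs": {"memory": "16GB", "cpu": "i7"}}'::jsonb
       @> '{"memory": "16GB"}'::jsonb;              -- false

-- A whole-column GIN index can serve the nested pattern directly
EXPLAIN
SELECT * FROM products
WHERE data @> '{"specs": {"memory": "16GB"}}';
```

A separate index on `(data->'specs')` is then mainly worthwhile when it is much smaller than the whole-column index.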
Step 9: Handle Large Result Sets
```sql
-- Check query selectivity:
SELECT
    count(*) AS total_rows,
    count(*) FILTER (WHERE data @> '{"category": "electronics"}') AS matching_rows,
    round(100.0 * count(*) FILTER (WHERE data @> '{"category": "electronics"}')
          / NULLIF(count(*), 0), 2) AS percentage
FROM products;

-- If matching > 10-15% of table, seq scan may be faster

-- For large result sets, use LIMIT with index:
SELECT * FROM products
WHERE data @> '{"category": "electronics"}'
ORDER BY id
LIMIT 100;

-- Use cursor for pagination:
BEGIN;
DECLARE cur CURSOR FOR
    SELECT * FROM products WHERE data @> '{"category": "electronics"}';
FETCH 100 FROM cur;
CLOSE cur;
COMMIT;

-- For batch processing, use keyset pagination on id
-- (:last_processed_id is a client-side parameter):
SELECT * FROM products
WHERE data @> '{"category": "electronics"}'
  AND id > :last_processed_id
ORDER BY id
LIMIT 1000;

-- Add index on id for efficient pagination
-- (not needed if id is already the primary key):
CREATE INDEX products_id_idx ON products (id);
```
Step 10: Monitor and Benchmark JSONB Index Performance
```sql
-- Create benchmark query:
EXPLAIN (ANALYZE, BUFFERS, TIMING)
SELECT * FROM products WHERE data @> '{"category": "electronics"}';

-- Check buffer usage:
-- shared read: pages read from disk (or the OS cache)
-- shared hit: pages found in shared buffers
-- High hit ratio = good cache usage

-- Compare query performance:
\timing on

-- Without index (seq scan); do this on a test system only:
DROP INDEX products_data_path_idx;
SELECT * FROM products WHERE data @> '{"category": "electronics"}';
-- Time: 850ms

-- With index:
CREATE INDEX products_data_path_idx ON products USING gin (data jsonb_path_ops);
SELECT * FROM products WHERE data @> '{"category": "electronics"}';
-- Time: 5ms

-- Monitor index usage over time:
SELECT
    indexrelname,
    idx_scan,
    idx_tup_read,
    idx_tup_fetch,
    pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE relname = 'products';

-- Set up monitoring view:
CREATE VIEW jsonb_index_stats AS
SELECT
    schemaname,
    relname AS tablename,
    indexrelname AS indexname,
    idx_scan AS scans,
    idx_tup_read AS tuples_read,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
    last_idx_scan  -- PostgreSQL 16+; remove this column on older versions
FROM pg_stat_user_indexes
WHERE indexrelname LIKE '%jsonb%' OR indexrelname LIKE '%data%';

-- Check regularly:
SELECT * FROM jsonb_index_stats;
```
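
Monitoring is also the easiest way to find indexes that were created during troubleshooting but never ended up being used. The query below is a small sketch against `pg_stat_user_indexes`. It relies on that view's `indexrelname` and `indexrelid` columns and assumes the statistics have been accumulating long enough to be representative.

```sql
-- Indexes on products with no recorded scans since the last stats reset,
-- largest first
SELECT
    indexrelname AS index_name,
    idx_scan,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname = 'products'
  AND idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
```

An index listed here is a candidate for removal, but check first that it does not back a constraint and that replicas or rarely run reports do not depend on it.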
PostgreSQL JSONB Index Checklist
| Check | Query | Expected |
|---|---|---|
| Index exists | pg_indexes | GIN or expression index |
| Operator match | EXPLAIN | Index Scan or Bitmap Index Scan |
| Statistics | pg_stats | Updated, accurate |
| Selectivity | count filter | < 15% for index |
| Index size | pg_relation_size | Reasonable size |
| Buffer hit ratio | EXPLAIN BUFFERS | High cache usage |
Verify the Fix
```sql
-- After creating appropriate indexes:

-- 1. Check index scan is used
EXPLAIN ANALYZE
SELECT * FROM products WHERE data @> '{"category": "electronics"}';
-- Result: Bitmap Index Scan, Execution Time < 10ms

-- 2. Verify expression index works
EXPLAIN ANALYZE
SELECT * FROM products WHERE data->>'category' = 'electronics';
-- Result: Index Scan using products_category_idx

-- 3. Test complex query
EXPLAIN ANALYZE
SELECT * FROM products
WHERE data @> '{"category": "electronics"}'
  AND (data->'price')::numeric < 500;
-- Result: Uses the GIN index, the price expression index, or both

-- 4. Check performance improvement (assumes a created_at column)
SELECT * FROM products
WHERE data @> '{"category": "electronics"}'
ORDER BY created_at DESC
LIMIT 100;
-- Fast response time

-- 5. Monitor index usage
SELECT * FROM pg_stat_user_indexes WHERE relname = 'products';
-- idx_scan increasing over time

-- 6. Compare before/after
-- Before: 850ms, Seq Scan
-- After: 5ms, Index Scan

-- 7. Verify index size
SELECT pg_size_pretty(pg_relation_size('products_data_path_idx'));
-- Smaller than default GIN
```
Related Issues
- [Fix PostgreSQL Query Timeout](/articles/fix-database-query-timeout)
- [Fix PostgreSQL Index Corrupted](/articles/fix-database-index-corrupted)
- [Fix PostgreSQL Slow Query Performance](/articles/fix-database-slow-query-performance)