
The PostgreSQL Query Planner: How It Works and When It Gets It Wrong

What the planner does, the statistics that drive its decisions, and the diagnostic workflow for the moments when it picks the spectacularly wrong plan.

PostgreSQL's query planner is smarter than you. Most of the time. The rest of the time, it's confidently wrong, executing a plan that's 100x slower than the optimal one because it made a bad estimate 3 nodes up in the tree.

Understanding the planner doesn't mean you'll always outsmart it. It means you'll know when it's broken and why, which is enough to fix most problems.

How Cost Estimation Works

The planner assigns a "cost" to every possible execution plan and picks the cheapest one. Cost is a unitless number — it represents rough I/O and CPU work, calibrated to seq_page_cost (the baseline cost of reading a page sequentially, set to 1.0 by default).
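
You can see this arithmetic in any plain EXPLAIN. For a bare sequential scan, the total cost is roughly pages × seq_page_cost plus rows × cpu_tuple_cost (default 0.01). A sketch with hypothetical table sizes:

-- Hypothetical: orders occupies 10,000 pages and holds 1,000,000 rows
EXPLAIN SELECT * FROM orders;
--  Seq Scan on orders  (cost=0.00..20000.00 rows=1000000 width=97)
--  10000 pages × 1.0 (seq_page_cost) + 1000000 rows × 0.01 (cpu_tuple_cost) = 20000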

random_page_cost is the cost of a random page read — default 4.0, reflecting how much slower random reads are than sequential ones on spinning disks. On SSDs that ratio is wrong: random and sequential reads cost nearly the same. Set it lower on SSD systems:

# postgresql.conf — for SSD storage
random_page_cost = 1.1
effective_cache_size = 24GB  # estimate of filesystem cache available

Getting these right is important. If random_page_cost is too high relative to seq_page_cost, the planner will prefer sequential scans over index scans even when an index scan would be faster.
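
You can test the effect of a different value per session before committing it to postgresql.conf. A sketch, assuming orders has an index on a created_at column:

-- Does a lower random_page_cost flip a seq scan into an index scan?
SET random_page_cost = 1.1;
EXPLAIN SELECT * FROM orders WHERE created_at > now() - interval '7 days';
RESET random_page_cost;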

Statistics: The Planner's Assumptions

The planner relies on statistics collected by ANALYZE (or autovacuum, which runs ANALYZE automatically). These are stored in pg_statistic (raw) and pg_stats (readable).

-- What the planner knows about a column
SELECT
  tablename,
  attname AS column_name,
  n_distinct,
  correlation,
  most_common_vals,
  most_common_freqs,
  histogram_bounds
FROM pg_stats
WHERE tablename = 'orders' AND attname = 'status';

Key stats:

  • n_distinct: number of distinct values. Negative values are a fraction of the row count (e.g., -0.3 = 30% distinct; -1 means every value is unique)
  • correlation: physical order correlation. 1.0 = data is ordered, -1.0 = reverse ordered. High correlation means index scans are efficient
  • most_common_vals/most_common_freqs: for highly skewed columns (like status = 'active' 95% of the time)
  • histogram_bounds: for range queries
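
With skew like the status example above, you can roughly reproduce the planner's estimate yourself. A sketch, assuming 'active' appears in the MCV list (the double cast works around pg_stats exposing most_common_vals as anyarray):

-- Planner's row estimate for WHERE status = 'active':
-- table row count × the value's MCV frequency
SELECT c.reltuples
       * s.most_common_freqs[
           array_position(s.most_common_vals::text::text[], 'active')
         ] AS estimated_rows
FROM pg_class c
JOIN pg_stats s ON s.tablename = c.relname
WHERE c.relname = 'orders' AND s.attname = 'status';

For a value that isn't in the MCV list, the planner instead spreads the remaining rows evenly across the remaining distinct values.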

When the planner gets cardinality estimation wrong, it's usually because statistics are stale, the default statistics target (100) isn't detailed enough, or data is highly correlated across columns.

Statistics Target

By default, PostgreSQL collects statistics at a target of 100 per column: up to 100 most-common values and 100 histogram buckets (default_statistics_target = 100). You can increase this for columns that frequently appear in WHERE clauses or join conditions:

-- Increase statistics for a specific column
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;
ANALYZE orders;

-- Or globally
-- default_statistics_target = 500  (in postgresql.conf, affects all future ANALYZE)

More statistics means better cardinality estimates, which means better query plans. The tradeoff is that ANALYZE takes longer, and very large MCV lists can slow planning itself slightly. I typically crank this up for columns on large tables that appear in complex WHERE clauses.
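
To see whether a column already carries an override, you can check the catalog; a sketch (attstattarget is -1, or NULL on newer versions, when the column falls back to the default):

-- Per-column statistics targets for one table
SELECT attname, attstattarget
FROM pg_attribute
WHERE attrelid = 'orders'::regclass
  AND attnum > 0
  AND NOT attisdropped;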

Join Strategies

PostgreSQL has three join strategies:

Nested Loop: For each row in the outer relation, scan the inner relation. Great when the outer result is small and there's an index on the inner join key. Terrible when both sides are large.

Hash Join: Build a hash table from the smaller side, probe it with the larger side. Great for large tables without useful indexes. Memory-heavy: the build side's hash table should fit in work_mem, and if it doesn't, PostgreSQL splits the join into batches that spill to disk.

Merge Join: Sort both sides, then merge. Great when both sides are already sorted, or an index can supply rows in sorted order. CPU-efficient, but it needs sorted input.

-- Which join strategies are enabled
SHOW enable_nestloop;
SHOW enable_hashjoin;
SHOW enable_mergejoin;
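
Each strategy is easy to spot in EXPLAIN output by its node shape. A hash join over the users/orders tables from the next section might look like this (costs and row counts are made up for illustration):

Hash Join  (cost=270.00..1680.00 rows=50000 width=40)
  Hash Cond: (o.user_id = u.id)
  ->  Seq Scan on orders o  (cost=0.00..790.00 rows=50000 width=12)
  ->  Hash  (cost=145.00..145.00 rows=10000 width=36)
        ->  Seq Scan on users u  (cost=0.00..145.00 rows=10000 width=36)

The side under the Hash node is the build side — the one whose hash table needs to fit in work_mem.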

When the Planner Picks Wrong

Classic case: the planner chooses a nested loop when a hash join would be much faster, because it underestimated the inner side's cardinality.

EXPLAIN (ANALYZE, BUFFERS)
SELECT u.name, COUNT(o.id)
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.signup_date > '2025-01-01'
GROUP BY u.id;

If you see:

Nested Loop (actual rows=500000, loops=1)
  -> Seq Scan on users (actual rows=50000)
  -> Index Scan on orders ... (actual rows=10 loops=50000)

The nested loop is doing 50,000 separate index scans on orders. A hash join would have done this in one pass over each table. The planner chose the nested loop because it estimated the users scan would return far fewer than the 50,000 rows it actually produced.

Force a different strategy temporarily to test:

SET enable_nestloop = off;
-- Re-run your query and compare plans

Don't leave this set globally. Use it to understand what the planner would do with a different strategy, then figure out why it prefers the wrong one (usually a stats issue).
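
One way to keep the override contained is SET LOCAL inside a transaction, so it can't leak into other work on the same connection:

BEGIN;
SET LOCAL enable_nestloop = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT u.name, COUNT(o.id)
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.signup_date > '2025-01-01'
GROUP BY u.id;
ROLLBACK;  -- the setting dies with the transaction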

Multi-Column Statistics

The planner assumes column values are independent. But real data often isn't — city and state are correlated, product_type and price_range might be correlated.

-- Create extended statistics for correlated columns
CREATE STATISTICS stat_city_state ON city, state FROM addresses;
ANALYZE addresses;

-- Check what extended statistics were created
SELECT * FROM pg_statistic_ext;
-- pg_stats_ext shows the computed stats in readable form
SELECT statistics_name, dependencies FROM pg_stats_ext;

This tells the planner "these columns are correlated, here's how much." Without it, the planner assumes independence and multiplies the individual selectivities: if 2% of rows have city = 'Austin' and 3% have state = 'TX', then WHERE city = 'Austin' AND state = 'TX' gets estimated at 0.02 × 0.03 = 0.06% of rows, when the true answer is close to the full 2%, since nearly every Austin row is in Texas. That's an underestimate of more than 30x.

When to Override the Planner

Honestly, most of the time when you think the planner is wrong, you should instead:

  1. Make sure statistics are current (ANALYZE the table)
  2. Check if statistics_target should be increased
  3. Check if random_page_cost is appropriate for your storage
  4. Consider adding extended statistics for correlated columns

The nuclear option — hints via the pg_hint_plan extension — lets you force specific join orders, scan types, and join strategies. I use it rarely, only when I've exhausted other options and have a specific query that the planner consistently gets wrong.

-- With pg_hint_plan installed. Hints must use the aliases (u, o)
-- because the query aliases its tables.
/*+ HashJoin(o u) SeqScan(u) */
SELECT u.name, COUNT(o.id)
FROM users u JOIN orders o ON o.user_id = u.id
WHERE u.signup_date > '2025-01-01'
GROUP BY u.id;

Practical Debugging

-- Compare estimated vs actual rows at each node
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT ...;

-- Look for large discrepancies:
-- estimated rows: 150, actual rows: 87432 → stats problem
-- estimated cost: 123 but 45000ms of actual time (cost is unitless, so
-- this is a smell, not a ratio) → something outside the planner's model
-- (lock waits? bloat?)
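
When estimates are far off, the first thing worth checking is how fresh the statistics are; pg_stat_user_tables records the last manual and automatic ANALYZE times:

-- How stale are the stats on this table?
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'orders';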

The combination of MonPG's query monitoring and periodic ANALYZE on high-traffic tables is usually enough to keep the planner working well. When it's not, you have the tools above to diagnose exactly what went wrong.