10 min read

PostgreSQL Parallel Query: Why More Workers Do Not Always Mean Faster SQL

I turned on parallel query globally and made the database slower. The lesson: parallelism is a trade, not free speed, and it only pays on big work with cores to spare.

Parallel query feels like it should be simple. One worker is slow, so use four. Four are slow, so use eight. I believed that for exactly one week, which is how long it took the monthly reporting load to fall over after I bumped the worker count across the board.

Real systems are less polite than the intuition. Workers need coordination, memory, CPU, and enough useful work to be worth launching. Get any of that wrong and you've added overhead, not speed.

Where parallelism actually pays

The wins are real when the work per row is heavy, the row count is large, and there are idle cores to use. A big sequential scan, a large hash join, a heavy aggregation over millions of rows on an otherwise quiet box — that's the sweet spot, and there parallel query is genuinely transformative.

I confirm it the same way every time: read the plan for a Gather node and compare workers planned against workers launched.

EXPLAIN (ANALYZE, BUFFERS)
SELECT category_id, count(*)
FROM events
WHERE created_at >= now() - interval '30 days'
GROUP BY category_id;
-- Watch "Workers Planned" vs "Workers Launched" under the Gather

Where it quietly loses

My mistake was a global setting. On a busy box, every medium query started recruiting workers, stealing CPU from each other and from the OLTP traffic that actually mattered. Small queries paid coordination overhead for nothing, and concurrent reports fought over the same cores and the same disk.

Six workers reading one disk don't get six times the bandwidth. They contend, and you can end up slower than a single clean sequential reader.

Tune per workload, not globally

The fix that stuck was setting parallelism per role. The OLTP application stays conservative or single-threaded; the analytics role gets generous limits because it runs when the box is quiet. One global number can't serve both, and trying to make it do so is how I broke things in the first place.

ALTER ROLE app_user  SET max_parallel_workers_per_gather = 0;
ALTER ROLE analytics SET max_parallel_workers_per_gather = 6;

What I keep coming back to

None of this is clever. Name the failure mode, capture the before number, change one thing, and compare the same signal afterward. The boring version of database work is the version that actually holds up at 2 AM.