Indexes9 min read

PostgreSQL Index-Only Scans: Why Heap Fetches Ruin the Fast Plan

An index-only scan that still hammered the heap taught me the plan name lies. The real story is the visibility map, vacuum health, and whether the index actually covers the query.

The first time I made a query index-only, it felt like cheating. A report that used to crawl over a big table came back instantly, the plan said Index Only Scan, and I went to lunch feeling clever.

Two weeks later the same query was slow again. Same plan, same Index Only Scan, but EXPLAIN now showed millions of heap fetches. A teammate asked the obvious question: if it's index-only, why is it reading the table? I didn't have a good answer that day. I do now.

The visibility map is the part nobody mentions

An index-only scan can only skip the heap if PostgreSQL already knows the rows are visible to everyone. It learns that from the visibility map, which VACUUM maintains by marking heap pages all-visible once there are no relevant dead tuples on them.

So when autovacuum falls behind, or a table churns hard, or an old transaction keeps dead rows alive, those all-visible bits don't get set. The plan stays "index-only" in name, but PostgreSQL quietly visits the heap to check visibility. That's where my heap fetches came from.

EXPLAIN (ANALYZE, BUFFERS)
SELECT id, created_at, status
FROM orders
WHERE tenant_id = 42
ORDER BY created_at DESC
LIMIT 100;
-- Read "Heap Fetches:" — a small number is fine, a large one means trouble

Cover the whole query, not just the filter

The other half of my problem was a lazy index. An index-only scan needs every column the query touches — filters, ordering, and the selected columns — present in the index. Mine had the filter but not the columns I was returning, so it had to hit the heap anyway.

INCLUDE columns are the fix. They ride along in the index leaf without being part of the search key, which is exactly what you want for values you only need to return.

CREATE INDEX CONCURRENTLY orders_tenant_created_cover_idx
ON orders (tenant_id, created_at DESC)
INCLUDE (id, status);

Keep vacuum healthy on the hot tables

Once I understood the dependency, the operational fix was obvious: the tables I most want index-only scans on are the ones that most need aggressive autovacuum. I tuned the scale factor down on those tables so all-visible bits get set while they still matter.

After that, the same plan that had been doing millions of heap fetches did almost none, and the report went back to feeling like cheating.

What I keep coming back to

None of this is clever. Name the failure mode, capture the before number, change one thing, and compare the same signal afterward. The boring version of database work is the version that actually holds up at 2 AM.