A point lookup that should have been instant was reading thousands of buffer pages. The table was modest — a few hundred megabytes. The index on it was over a gigabyte. The index had bloated to three times the size of the data it indexed, and every lookup was wading through mostly empty pages.
Index bloat is sneakier than table bloat because people forget indexes are separate structures with their own pages, their own dead space, and their own maintenance needs. A healthy table can sit underneath a deeply bloated index.
How a B-tree bloats
A B-tree index packs entries into pages. When you insert into a full page, PostgreSQL splits it into two half-full pages. Under heavy insert and update churn, especially with random keys like UUIDs, you get a lot of splits and a lot of half-empty pages that never refill.
Updates and deletes add dead index entries. PostgreSQL cleans some of them up lazily, but it does not automatically merge half-empty pages back together. So over time the index spreads out: more pages, more of them sparse, and every scan touches more pages than the data justifies.
Measure before you act
Do not guess at bloat. Compare the index size to the table and, better, use the bloat-estimation queries that ship in the community (or the pgstattuple extension) to get an actual dead-space percentage. A large index is not necessarily bloated; a large index that is mostly empty pages is.
CREATE EXTENSION IF NOT EXISTS pgstattuple;
-- Real fill factor / dead space for a specific index
SELECT * FROM pgstatindex('orders_user_id_idx');
-- Quick size comparison
SELECT pg_size_pretty(pg_relation_size('orders')) AS table_size,
pg_size_pretty(pg_relation_size('orders_user_id_idx')) AS index_size;
Deduplication helps on low-cardinality indexes
Since PostgreSQL 13, B-tree indexes use deduplication: when many rows share the same indexed value, the index stores that key once with a compact list of row pointers instead of repeating the key. For indexes on low-cardinality columns (a status, a type, a foreign key with few distinct parents) this dramatically reduces size.
Indexes built on older versions do not get deduplication until they are rebuilt. That alone is a reason a REINDEX can shrink an old index far more than you expect.
REINDEX, online
The direct fix for a bloated index is to rebuild it, which packs entries tightly and reclaims the empty pages. Use REINDEX ... CONCURRENTLY so you do not lock out writes while it runs.
When I rebuilt that three-gigabyte index, it came back at under 400 megabytes, and the lookup that had been reading thousands of pages dropped back to a handful.
-- Rebuild without blocking writes
REINDEX INDEX CONCURRENTLY orders_user_id_idx;
-- Or every index on the table
REINDEX TABLE CONCURRENTLY orders;
Slow the bloat down
Rebuilding is treatment, not prevention. To bloat more slowly: keep autovacuum healthy so dead index entries are cleaned promptly, design updates so HOT updates apply (don't change indexed columns when you can avoid it, which lets PostgreSQL skip touching the index entirely), and consider a lower fillfactor on indexes that take heavy in-place churn so there is room before a split.
- Indexes are separate structures with their own dead space — watch their size.
- Random keys and heavy churn cause page splits and sparse pages.
- PG13+ deduplication shrinks low-cardinality indexes; old indexes need a rebuild to get it.
- REINDEX CONCURRENTLY reclaims the space without blocking writes.
- Healthy autovacuum and HOT updates slow future bloat.
The practical standard
The best PostgreSQL performance work is boring in the right way. Name the failure mode, capture the before plan or metric, make one change, and compare the exact same signal afterward. Anything else is just a more confident guess.