The first time I ran a REINDEX in production, I learned that older Postgres locked the table for the entire operation. The on-call engineer learned this slightly later, when they got the page. We ran it on a 200GB table on a Tuesday afternoon. The lesson cost us a 22-minute outage.
New Postgres has REINDEX CONCURRENTLY, which is much friendlier. But the underlying question — when do you actually need to reindex? — is the same. The answer is: less often than you might think.
How indexes get bloated
Postgres B-tree indexes do not auto-shrink the way the table itself does after a vacuum. When rows are deleted or updated, the corresponding index entries are removed lazily. Over time, especially on heavily-updated columns, the index can carry empty space that never gets reclaimed.
VACUUM helps. pg_repack and REINDEX help more. The question is whether the index is bloated enough to matter.
Detecting bloat
There is no single perfect bloat query. The one I trust is the pgstattuple extension:
CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT *
FROM pgstatindex('orders_customer_id_idx');
The output gives you leaf_pages, internal_pages, avg_leaf_density, and friends. The number to watch is avg_leaf_density. A healthy B-tree index has this above 80%. Below 50% and the index is half empty. At that point a rebuild is worth considering.
For a quick at-a-glance view of all indexes:
SELECT
schemaname,
tablename,
indexrelname,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
idx_scan,
idx_tup_read,
idx_tup_fetch
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC
LIMIT 20;
Look for indexes that are large in size but have few idx_scan reads. Those are either unused (drop them) or bloated (reindex them).
Rebuild safely with REINDEX CONCURRENTLY
For Postgres 12 and later, REINDEX CONCURRENTLY rebuilds the index without taking an exclusive lock. Reads and writes continue. The operation is slower than a regular reindex but does not interrupt the application.
REINDEX INDEX CONCURRENTLY orders_customer_id_idx;
This builds a new index alongside the old one, then swaps. If something fails partway through (most often a uniqueness violation that the rebuild encounters), you end up with an INVALID index that needs to be dropped manually:
SELECT indexrelid::regclass
FROM pg_index
WHERE NOT indisvalid;
DROP INDEX CONCURRENTLY name_of_invalid_index;
For entire tables (REINDEX TABLE CONCURRENTLY), the operation runs index-by-index, so a failure on one index does not undo work on the others.
When NOT to REINDEX
The most common mistake I see is reindexing on a schedule. "Every Sunday at 3 a.m. we reindex everything" is a bad ritual. Most indexes do not need rebuilding ever. Reindex generates WAL, reads from disk, writes to disk, and provides no benefit on a healthy index.
Reindex is the right call when:
- A specific index has measurably bloat (low leaf density, large size relative to table size).
- An index has become invalid (e.g., from a failed
CREATE INDEX CONCURRENTLY). - You are dealing with a known issue, like a corruption notice from
pg_amcheck. - The index is on a column with sustained UPDATE traffic and bloat has been confirmed.
For most tables, the right ritual is no ritual. Let vacuum do its job, and reindex when monitoring tells you to.
What about pg_repack?
pg_repack is an extension that rebuilds tables and indexes without locks, in a way that also reclaims space in the table itself. For tables suffering from bloat (not just indexes), pg_repack is often the better choice. It is more complex to operate (requires a maintenance schema, cannot run during certain DDL), but it solves a problem REINDEX does not.
For index-only bloat, REINDEX CONCURRENTLY is sufficient and ships with Postgres.
A pragmatic playbook
Here is the sequence I use when bloat is the suspected problem:
- Confirm there actually is bloat. Run
pgstattupleor check index size vs table size andidx_scancount. - If only one or two indexes are bloated,
REINDEX INDEX CONCURRENTLYthem. - If the table itself is bloated, reach for
pg_repack. - After the rebuild, monitor for the same problem returning. If it does, the bloat is being generated by ongoing UPDATE traffic and the long-term answer might be a different schema (HOT updates, fillfactor tuning, partitioning).
Reindex is a tool, not a cure. The cure is usually somewhere else.