Indexes8 min read

REINDEX in Postgres: When the Index Got Out of Hand

REINDEX is a repair tool, not an indexing strategy. Use it when bloat, corruption, or access pattern churn makes rebuilding cheaper than carrying the old index forward.

The most expensive index I have seen was not a bad index. It had been useful for years. Then the table changed shape: frequent updates, deletes, new write paths, and a workload that left the index full of space the application could no longer use efficiently.

Queries were slower. The index was huge. Vacuum was working, but the physical structure was not getting smaller. That is when REINDEX enters the conversation.

REINDEX rebuilds an index from the table data. It is not a cure for every slow query. It is a way to replace an index copy that has become bloated, invalid, or suspect.

The framework: prove the index is the problem

I do not reindex because an index is old. I reindex when there is evidence:

  • The index is much larger than expected for the live row count.
  • The table has heavy churn and the index no longer fits in cache.
  • An index is invalid after a failed concurrent build.
  • There is suspected corruption and recovery needs a rebuild.
  • A known access pattern creates B-tree bloat.

If the real issue is a missing predicate, wrong column order, or bad query plan, REINDEX just gives you a fresh version of the wrong index.

Start with size and usage

SELECT
  schemaname,
  relname AS table_name,
  indexrelname AS index_name,
  idx_scan,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC
LIMIT 25;

This shows large indexes and whether they are being used. A huge index with low scans may be a drop candidate, not a reindex candidate.

Use pgstattuple when you need stronger evidence

The pgstattuple extension can inspect bloat more directly, though it has cost. I use it on targeted objects, not the whole database during an incident.

CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT *
FROM pgstatindex('public.orders_created_at_idx');

If the index has high deleted pages or poor density, a rebuild may be justified.

Understand the lock tradeoff

REINDEX INDEX public.orders_created_at_idx;

A normal reindex is simpler and can be faster, but it takes locks that block writes to the table while the rebuild happens. That is often unacceptable for a hot production table.

REINDEX INDEX CONCURRENTLY public.orders_created_at_idx;

CONCURRENTLY rebuilds without blocking concurrent inserts, updates, and deletes in the same way. It usually takes longer and cannot run inside a transaction block. It also needs enough disk headroom for the new index while the old one still exists.

Do not forget invalid indexes

Failed concurrent index operations can leave invalid indexes behind. They may still consume disk and confuse future maintenance.

SELECT
  n.nspname,
  c.relname,
  i.indisvalid,
  i.indisready
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE NOT i.indisvalid OR NOT i.indisready;

For invalid indexes, decide whether to reindex, drop, or rebuild with a better definition.

Schedule around IO and WAL

Rebuilding an index is not free. It reads table data, writes a new index, generates WAL, and can compete with normal workload IO. On replicas, it can create replay pressure.

Before a large reindex, I check:

  • Available disk headroom.
  • Replica lag tolerance.
  • WAL archive capacity.
  • Maintenance window or traffic trough.
  • Whether the index is still the right index.

When not to REINDEX

  • Do not reindex to fix a missing index.
  • Do not reindex to fix a bad query shape.
  • Do not reindex every week as ritual maintenance.
  • Do not reindex large hot tables without disk and WAL planning.

The pragmatic default

Use REINDEX CONCURRENTLY for hot production indexes when writes must continue. Use regular REINDEX in controlled windows where blocking is acceptable. Always ask whether the rebuilt index is still the index you want.

The best reindex is boring: evidence, headroom, command, verification, and a note explaining why the index got there in the first place.