Troubleshooting

Autovacuum Performance Checklist

How to identify autovacuum-related performance problems: dead tuples, stale statistics, old transactions, table churn, bloat, and per-table tuning.

Autovacuum is performance infrastructure

When autovacuum falls behind, ordinary reads scan through more dead rows, indexes grow, statistics drift, and the planner starts making worse choices. Treat autovacuum lag as a query performance issue, not only a storage issue.

Find tables with cleanup pressure

SELECT
  relname,
  n_live_tup,
  n_dead_tup,
  last_vacuum,
  last_autovacuum,
  last_analyze,
  last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 30;

High dead tuples on a high-traffic table are a priority. A high dead tuple count on a cold archive table may be less urgent.

Check for old transactions

SELECT
  pid,
  state,
  now() - xact_start AS xact_age,
  left(query, 180) AS query_sample
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start
LIMIT 20;

Old transactions can prevent vacuum from removing row versions. Closing the old transaction may do more than changing autovacuum settings.

Use per-table settings for hot tables

ALTER TABLE events SET (
  autovacuum_vacuum_scale_factor = 0.02,
  autovacuum_analyze_scale_factor = 0.01,
  autovacuum_vacuum_cost_limit = 2000
);

Do this for tables with heavy churn and visible cleanup lag. Do not start by changing the whole cluster unless the problem is cluster-wide.

Checklist

  • Is query latency rising with dead tuple count?
  • Are statistics stale on the affected table?
  • Are long transactions blocking cleanup?
  • Are update-heavy tables using default scale factors that are too loose?
  • Does a manual ANALYZE improve the bad plan?
  • Does vacuum finish before the next wave of writes creates new debt?