There are two operational events in Postgres that result in single-user emergency mode: catastrophic disk full, and transaction ID wraparound. The first is hard to prevent entirely. The second is almost always preventable, almost always caused by ignored autovacuum settings, and almost always introduced because someone tuned vacuum to be "less aggressive" without understanding what they turned off.
I have helped recover three different databases from imminent wraparound. Two of them were running in production with autovacuum disabled because "vacuum was too slow." One was on a managed service where the customer had increased the autovacuum freeze threshold to a value that was effectively never going to fire.
FREEZE is not optional. Here is what it does and how to keep it boring.
What FREEZE actually does
Postgres uses 32-bit transaction IDs. That gives you about 2 billion transactions before the counter wraps. To handle wraparound, Postgres tracks which rows belong to "old enough" transactions and marks them as "frozen," meaning the transaction ID does not matter for visibility anymore.
FREEZE is the operation that performs this marking. Regular VACUUM does it for tuples older than vacuum_freeze_min_age. The aggressive variant — VACUUM FREEZE — forces it for everything in the table.
Without FREEZE happening regularly, transaction IDs accumulate, and eventually datfrozenxid (the database's oldest unfrozen transaction) gets dangerously close to the wraparound horizon. At 200 million transactions to wraparound, autovacuum starts emergency runs. At 1 million, Postgres refuses new transactions and goes to single-user mode. You do not want to be there at 4 a.m.
The settings that keep you safe
Three settings matter:
autovacuum_freeze_max_age— when datfrozenxid gets this far behind the current transaction ID, autovacuum forces an aggressive freeze run. Default is 200,000,000.vacuum_freeze_table_age— when a table's relfrozenxid is this far behind, the next vacuum on it does an aggressive freeze. Default is 150,000,000.vacuum_freeze_min_age— minimum age a row must reach before vacuum will freeze it. Default is 50,000,000.
For most workloads, these defaults work. Where they fail is on tables that are large, frequently-written, and where autovacuum is being interrupted or disabled for cost reasons.
The single most important rule: do not change these to higher values without understanding what you are postponing. Bumping autovacuum_freeze_max_age from 200M to 1B looks like "less work," until the day Postgres has to scan a 5TB table to freeze billions of tuples in one go because nothing freezed them gradually.
Monitoring, in two queries
This is the query I run every time I look at a new database:
SELECT
datname,
age(datfrozenxid) AS xid_age,
2147483647 - age(datfrozenxid) AS xids_until_wraparound
FROM pg_database
ORDER BY xid_age DESC;
If xids_until_wraparound is below 500 million, you are not in danger but you should know why. Below 200 million and autovacuum should be fighting hard already. Below 100 million and someone is paged.
The per-table version, to find the worst offenders:
SELECT
schemaname,
relname,
age(relfrozenxid) AS table_age,
pg_size_pretty(pg_relation_size(format('%I.%I', schemaname, relname)::regclass)) AS size
FROM pg_stat_user_tables
ORDER BY age(relfrozenxid) DESC
LIMIT 20;
The biggest, oldest tables are usually the ones at risk.
When autovacuum keeps falling behind
If the freeze age keeps climbing despite autovacuum running, the typical causes:
autovacuum_max_workersis too low. Default is 3. On a database with hundreds of tables that all need freezing, three workers cannot keep up.autovacuum_vacuum_cost_delayis too high. Default in recent Postgres is 2ms; older versions defaulted to 20ms which is hopelessly slow. Drop it to 2 or even 0 if your I/O can handle it.- Long-running transactions are blocking vacuum. Vacuum cannot freeze rows that might be visible to an open transaction. Find them with
pg_stat_activity WHERE state = 'idle in transaction'. - Replication slots are holding xmin back. A replica that has fallen behind keeps an xmin horizon open, preventing freezes. Check
pg_replication_slotsand either fix the replica or drop the slot.
In all of these, the answer is to address the root cause, not to disable or postpone vacuum.
The manual FREEZE
If a specific table has fallen far behind, run a manual aggressive vacuum:
VACUUM (FREEZE, VERBOSE) my_table;
This is read-heavy and write-heavy. On a multi-TB table it can take hours. The good news is it does not block reads or most writes (it does take a ShareUpdateExclusiveLock which conflicts with DDL but not with normal traffic).
The VERBOSE output tells you exactly what it did, which helps both for confidence and for postmortems.
What boring looks like
A healthy production database has:
- All databases at well under 100M xid age.
- Autovacuum freeze runs happening continuously, never emergency runs.
- No long-running idle transactions.
- No replication slots stuck behind.
If any of those is wrong, fix it now, before it becomes a 4 a.m. story.