Deadlock Prevention in Postgres: Design It Out, Do Not Just Retry
Deadlocks are the symptom of a design that lets two transactions each wait on a lock the other already holds. The fix is rarely retry logic. Here is what to look for instead.
Notes for the problems that show up after launch: bad plans, awkward migrations, index debt, vacuum pressure, replica lag, and the small decisions that make PostgreSQL easier to operate.
Some ALTER TABLE operations are instant. Others lock the table for the duration. The Postgres docs hide this in subsections. Here is the cheat sheet.
Transaction ID wraparound is rare, dramatic, and entirely avoidable. Here is what FREEZE actually does, and the autovacuum settings that keep you out of single-user mode.
The popular bloat estimation queries are off by 10-30% under typical workloads. Here is what pgstattuple actually measures, when the approximate variant is enough, and the cleanup decisions that follow.
Memory tuning is mostly about budgets. shared_buffers, OS cache, per-query work_mem, maintenance work, and connection count all spend the same RAM.
Disk-full on a Postgres server is rarely just one thing. WAL, temp files, logs, and delayed cleanup arrive together. The recovery is mostly about what you prepared before.
Every connection storm I have seen looks like a database problem and is actually an application or pool problem. Here is how to tell the difference and stop blaming Postgres.
PITR lets you restore the database to any moment within your retention window. The feature is well documented; the operational reality is messier.
Most teams have backups. Most teams have never restored from one. The first time you test, you find out the backup was missing something.
Logical and physical backups solve different problems. Most teams need both, but only have one. Here is the actual decision framework.
WAL accumulates fast. The retention policy is a tradeoff between PITR window and storage cost. Here is how I think about it.
Pooling is not just about raising throughput. It is about controlling backend count, transaction shape, prepared statements, and failure behavior under load.