A deadlock is two transactions waiting for each other's locks. Postgres detects the cycle, picks a victim, rolls it back, and the survivor proceeds. The application sees a `deadlock_detected` error (SQLSTATE 40P01), retries, succeeds. Everyone moves on.
If this happens occasionally — once a day under load — it is a tolerable cost. If it happens constantly, the design has a bug. Retry logic is treating the symptom, not the cause.
Here is how to find the cause.
The simplest cause: inconsistent lock order
This is the textbook deadlock and it is also the most common production cause. Two transactions touch the same set of rows but in different order:
```sql
-- Transaction A
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;

-- Transaction B (concurrent, different request)
BEGIN;
UPDATE accounts SET balance = balance - 50 WHERE id = 2;
UPDATE accounts SET balance = balance + 50 WHERE id = 1;
COMMIT;
```
A holds the lock on row 1, wants row 2. B holds row 2, wants row 1. Deadlock.
The fix is design discipline: always lock rows in a consistent order. Lowest id first, alphabetical, whatever — pick a rule and stick to it. The application can sort the IDs before issuing UPDATEs:
```sql
-- Both transactions express the transfer as one statement;
-- the CASE computes the per-row delta
UPDATE accounts
SET balance = balance + CASE id
    WHEN 1 THEN -100
    WHEN 2 THEN  100
END
WHERE id IN (1, 2);
```
A single multi-row UPDATE with `WHERE id IN (...)` and a CASE for the per-row delta is the strongest option here: it acquires all the row locks in one statement, so there is no application-level ordering to get wrong. One caveat: within a statement, Postgres locks rows in the order the executor visits them, which follows the query plan rather than a guaranteed sort. In practice identical statements get identical plans, and the problem disappears.
The less obvious cause: foreign key locks
When you INSERT a row with a foreign key, Postgres takes a `FOR KEY SHARE` lock on the referenced row to ensure its key cannot be deleted or changed before commit. This lock is invisible in the SQL but real:
```sql
-- Transaction A: inserts an order for customer 1
BEGIN;
INSERT INTO orders (customer_id, total_cents) VALUES (1, 1000);
-- A now holds a FOR KEY SHARE lock on customers.id = 1

-- Transaction B: same
BEGIN;
INSERT INTO orders (customer_id, total_cents) VALUES (1, 2000);
-- B also holds a FOR KEY SHARE lock on customers.id = 1 (compatible)

-- Transaction A: updates a key column (email has a unique index here)
UPDATE customers SET email = '[email protected]' WHERE id = 1;
-- Needs FOR UPDATE on customers.id = 1, blocked by B's FOR KEY SHARE

-- Transaction B: same row, same kind of update
UPDATE customers SET email = '[email protected]' WHERE id = 1;
-- Needs FOR UPDATE on customers.id = 1, blocked by A's FOR KEY SHARE
```
Deadlock, mediated through the foreign key. Both transactions held compatible `FOR KEY SHARE` locks; both then wanted `FOR UPDATE` on the same row; neither could get it. One caveat: this requires the UPDATE to touch a column covered by a unique index. Since Postgres 9.3, an UPDATE of only non-key columns (`display_name`, say) takes the weaker `FOR NO KEY UPDATE` lock, which is compatible with `FOR KEY SHARE` and does not deadlock here.
Fix: avoid touching the parent table from a transaction that also inserts into the child table, when such transactions run concurrently. If the application updates customers and inserts orders in the same transaction, separate them. Or take `SELECT ... FOR NO KEY UPDATE` on the customer row first: acquiring the strong lock up front serializes the transactions on the parent row before either takes its FK share lock, so the upgrade race never starts.
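A sketch of that lock-first ordering, reusing the tables from the example above (the `SELECT 1` form is one idiom; selecting any column works):

```sql
BEGIN;
-- Take the strong parent-row lock before the FK share lock exists.
-- FOR NO KEY UPDATE conflicts with itself, so concurrent transactions
-- queue here instead of deadlocking later.
SELECT 1 FROM customers WHERE id = 1 FOR NO KEY UPDATE;
INSERT INTO orders (customer_id, total_cents) VALUES (1, 1000);
UPDATE customers SET email = '[email protected]' WHERE id = 1;
COMMIT;
```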
Long transactions amplify deadlocks
A transaction that holds locks for milliseconds rarely deadlocks. A transaction that holds locks for minutes regularly does. Long transactions cover more potential conflict windows.
The fix is the same as for any long-transaction problem: short transactions, no inter-query work, no external calls inside BEGIN. The previous post on long-running transactions has the playbook.
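While the transactions are being shortened, two settings can bound the damage; the values below are illustrative, not recommendations:

```sql
-- Per-session caps; these can also be set per role or in postgresql.conf
SET lock_timeout = '2s';                          -- abort a statement stuck waiting on a lock
SET idle_in_transaction_session_timeout = '30s';  -- kill sessions idling inside BEGIN
```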
Advisory locks for serializable workflows
When a section of application logic must serialize across all callers — "only one process can run this at a time" — advisory locks are the right tool, not row locks.
```sql
SELECT pg_advisory_xact_lock(12345);
-- No other transaction can hold lock 12345 until this transaction commits
```
The xact_lock variant is released automatically at COMMIT/ROLLBACK. The non-xact variant must be explicitly released.
Useful for: "only one cron job can run at a time," "only one user can be in this state at a time," "don't let two requests hit this idempotency-protected endpoint simultaneously."
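A sketch of the pattern, assuming the job name is mapped to a lock key with `hashtext()` (any stable name-to-integer mapping would do):

```sql
-- Queue behind the current holder:
SELECT pg_advisory_xact_lock(hashtext('nightly-billing-run'));

-- Or skip this run entirely if another session already holds the lock:
SELECT pg_try_advisory_xact_lock(hashtext('nightly-billing-run'));
-- returns true if acquired, false if someone else holds it
```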
The key benefit: the lock key is independent of any specific row, so a whole workflow can serialize on one well-known key instead of contending on data rows. Advisory locks still participate in deadlock detection, though. A transaction holding an advisory lock while waiting on a row lock can deadlock with one doing the reverse, so take the advisory lock first and keep that ordering consistent.
When retry logic is the right answer
Retry on deadlock is appropriate when:
- The deadlock rate is low (less than 0.1% of transactions).
- The transaction is short and the retry has a high chance of succeeding.
- The application can detect the deadlock error and retry within a request budget.
It is not appropriate when:
- The deadlock rate is high enough that retries become a load problem.
- The transaction is long enough that retrying wastes user time.
- The retry just causes the same deadlock again.
In those cases, fix the design, do not band-aid the symptom.
How I find the offending design
Postgres logs every deadlock with the involved queries:
```
ERROR:  deadlock detected
DETAIL:  Process 12345 waits for ShareLock on transaction 6789...
         Process 4321 waits for ShareLock on transaction 9876...
HINT:  See server log for query details.
```
In postgresql.conf, ensure `log_lock_waits = on` and `deadlock_timeout` is set to something useful (the default of 1 second is fine). The server log then tells you which queries from which sessions were involved.
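Both settings can also be checked and changed from a superuser session, no restart needed; a minimal sketch:

```sql
SHOW deadlock_timeout;                 -- '1s' by default
ALTER SYSTEM SET log_lock_waits = on;
SELECT pg_reload_conf();               -- applies on reload
```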
From there, group the involved query pairs and look for the inconsistent lock order. Usually one or two patterns account for 80% of the deadlocks. Fix those, the rest fall out.
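For live investigation before a deadlock fires, `pg_blocking_pids()` (Postgres 9.6+) shows who is currently waiting on whom, which is usually the same contention pattern that produces the deadlocks:

```sql
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       state,
       query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```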
A pragmatic checklist
- Multi-row updates: use `WHERE id IN (...)` with sorted IDs, not separate UPDATE statements.
- Operations that touch parent and child tables: separate them or use `SELECT FOR NO KEY UPDATE`.
- Cross-process serialization: use advisory locks, not row locks.
- Long transactions: shorten them.
- Retry logic: keep it as a fallback, not as the primary defense.
Deadlocks are signals that the application is letting two code paths cross. The signal is useful. The fix is in the design.