The pattern looks careful:
with db.transaction():
customer = db.execute("SELECT * FROM customers WHERE id = $1", uid).first()
payment_response = stripe.charge(customer.payment_method, amount)
db.execute("INSERT INTO charges (customer_id, stripe_id) VALUES ($1, $2)", uid, payment_response.id)
"If the Stripe call fails, we roll back the database changes." The intent is consistency — atomic operation across two systems.
The reality is one of the worst patterns I encounter in production code. Here is why.
What actually happens
The HTTP call to Stripe takes 500ms-5 seconds. During that time:
- The Postgres transaction is open.
- A connection is held from the connection pool.
- The transaction's xmin is held, preventing vacuum from cleaning up rows newer than this transaction.
- Any rows the transaction has touched are locked.
For a single request, this is annoying. For thousands of concurrent requests during a peak, it is catastrophic.
The cascading problems
1. Connection pool exhaustion. If 100 instances of this code run concurrently, 100 connections are held for 5 seconds each. With a pool of 100, the database is full. Other queries block.
2. Vacuum starvation. Long-running transactions hold xmin, preventing vacuum. Bloat accumulates. Performance degrades over time.
3. Lock contention. If the transaction has UPDATEd or SELECT FOR UPDATEd any row, those locks are held until the HTTP call returns. Other transactions waiting for those rows queue up.
4. Stripe outage = your outage. If Stripe takes 30 seconds to time out, your transaction is open for 30 seconds. Multiplied across concurrent requests, the pool fills, the database degrades.
The "transactional safety" you thought you bought turns out to make every external dependency a database problem.
The replacement pattern
The correct shape: do not hold the transaction across the external call. Instead:
# 1. Read what you need
customer = db.execute("SELECT * FROM customers WHERE id = $1", uid).first()
# 2. Make the external call OUTSIDE any transaction
payment_response = stripe.charge(customer.payment_method, amount)
# 3. Open a SHORT transaction to record the result
with db.transaction():
db.execute("INSERT INTO charges (customer_id, stripe_id) VALUES ($1, $2)", uid, payment_response.id)
The transaction is now milliseconds long instead of seconds.
The problem this introduces: what if the application crashes between step 2 (Stripe charged) and step 3 (database recorded)? The customer is charged but our database does not know.
Idempotency keys solve the recovery problem
The pattern that handles partial failure:
# Generate an idempotency key BEFORE the Stripe call
idempotency_key = uuid.uuid4()
db.execute("INSERT INTO charge_attempts (id, customer_id, status) VALUES ($1, $2, 'pending')",
idempotency_key, uid)
# Stripe call WITH the idempotency key
payment_response = stripe.charge(customer.payment_method, amount, idempotency_key=str(idempotency_key))
# Record the result
db.execute("UPDATE charge_attempts SET stripe_id = $1, status = 'succeeded' WHERE id = $2",
payment_response.id, idempotency_key)
If the application crashes after step 2 but before step 3, a recovery job can:
- Find
charge_attemptsrows inpendingstate. - Query Stripe with the same idempotency key to see if the charge succeeded.
- Update the database accordingly.
Stripe's idempotency mechanism guarantees that even retries do not double-charge. The pattern is self-healing.
The outbox pattern for at-least-once delivery
For stronger guarantees, the outbox pattern:
with db.transaction():
db.execute("INSERT INTO charges (customer_id, status) VALUES ($1, 'pending')", uid)
db.execute("INSERT INTO outbox (event_type, payload) VALUES ('charge.requested', $1)", payload)
A worker process polls the outbox table, dispatches events to external systems (Stripe, etc.), and updates the outbox status.
The transaction is short. External calls happen outside the transaction. Failures are retryable from the outbox state.
This is more architecture than a typical web request can absorb, but for systems where consistency between database and external state is critical, it is the right pattern.
When transactions around HTTP are acceptable
The rare cases where it is OK:
- The HTTP call is to a service in your own datacenter on a low-latency network. The call typically returns in <50ms. The transaction window is short.
- The call is read-only and idempotent. No state change to revert.
- You have a hard timeout below 1 second on the HTTP call. The transaction cannot be held longer than that.
Even in these cases, I prefer the read-then-call-then-write pattern for clarity. "Transactions don't hold HTTP calls" is a simpler rule than "transactions don't hold HTTP calls except sometimes."
What to look for in code review
The red flags:
with transaction(): ... requests.post(...) ...- `BEGIN; SELECT ...; UPDATE ...; -- and the application calls something external between
- ORM patterns where a service object both opens a transaction and calls an external API
For each one, ask: "What if this HTTP call takes 30 seconds?" If the answer is "the database is degraded for the duration," it is the wrong pattern.
The mental model
A transaction is a database resource. It costs a connection, a snapshot, lock state, vacuum holdback. It should be held for as short a time as possible.
The transaction is for a discrete unit of database work. Whatever happens outside the database — HTTP calls, file I/O, user input — does not belong inside it.
This is a simple rule that, once internalized, makes a lot of subtle production issues go away.