PostgreSQL Topic Archive

Replication and WAL PostgreSQL Articles

Replica lag, WAL growth, failover readiness, hot standby behavior, and replication slots.

Unlogged Tables: The Honest Case for Skipping WAL

July 30, 2026 · 11 min read

One word took our nightly staging load from 40 minutes to 9: UNLOGGED. Two months later a failover emptied the table and nobody on call could explain why. The semantics are absolute — if someone tells you.

MySQL GTIDs vs PostgreSQL Replication Slots: Tracking Position

July 30, 2026 · 9 min read

GTIDs make MySQL failover repointing almost trivial; replication slots make PostgreSQL WAL retention automatic but risky. A practitioner's comparison of the two position-tracking models.

MySQL Orchestrator vs PostgreSQL Patroni: Two Failover Cultures

July 30, 2026 · 10 min read

MySQL grew failover tooling around topology repair; PostgreSQL grew it around consensus and leases. Comparing orchestrator and group replication with Patroni and repmgr, honestly.

Full Page Writes, Checkpoints, and WAL Compression in PostgreSQL

July 30, 2026 · 13 min read

full_page_writes protects you from torn pages but can dominate WAL volume. How checkpoints set the FPI rate, what wal_compression with lz4 or zstd buys you, and how to measure WAL with pg_stat_wal.

Stale Reads on Read Replicas: MySQL vs PostgreSQL

July 30, 2026 · 9 min read

Both MySQL and PostgreSQL replicas serve stale data by default. What differs is the toolbox: GTID waits and group replication consistency levels versus LSN tokens and remote_apply.

PostgreSQL Replication Slots: When Retained WAL Fills the Disk

July 30, 2026 · 12 min read

A decommissioned standby left its replication slot behind, and over one quiet weekend the slot pinned 214 GB of WAL until the primary ran out of disk and PANIC'd. Here is the mechanism, the monitoring queries, and the circuit breaker that caps the damage.

MySQL Binlog vs PostgreSQL WAL: How Replication Really Differs

July 30, 2026 · 10 min read

MySQL replicates logical change events from the binlog; PostgreSQL streams physical WAL. That one design difference changes lag measurement, consistency guarantees, and what your replicas can be.

CDC from MySQL Binlog vs PostgreSQL Logical Decoding

July 30, 2026 · 10 min read

Debezium tailing a MySQL binlog and a PostgreSQL logical decoding pipeline look similar from Kafka. Underneath, snapshots, schema changes, and retention fail in opposite ways.

Replication Slots: The PostgreSQL Disk-Full Incident That Writes Itself

July 30, 2026 · 12 min read

The 3 a.m. page was pg_wal at 100 percent on a healthy primary. The cause was a Debezium slot from a proof of concept eight weeks earlier, quietly pinning 380 GB of WAL nobody would ever read.

PostgreSQL Logical Replication Conflicts: Stalls, Causes, and Fixes

July 22, 2026 · 7 min read

A single unique violation on a logical replication subscriber can stall the stream and fill the publisher's disk with retained WAL. Here is how conflicts happen and how to recover.

PostgreSQL hot_standby_feedback: The Setting That Stops Replica Query Cancellations and Bloats Your Primary

June 6, 2026 · 9 min read

Long analytics queries on our replica kept getting cancelled mid-run. Turning on hot_standby_feedback stopped the cancellations instantly — and then the primary started bloating. That trade is the whole story.

PostgreSQL Read Replica Lag: When Scaling Reads Makes Data Stale

May 8, 2026 · 9 min read

We scaled reads to a replica and started getting bug reports about data that 'disappeared' right after saving. The cause was replication lag, and the fix was being honest about which reads can tolerate it.

PostgreSQL WAL and Checkpoints: Tuning Write Spikes Without Guesswork

May 8, 2026 · 10 min read

Write-heavy PostgreSQL systems usually fail through WAL pressure, checkpoint I/O, replication lag, or storage stalls. The fix starts with measuring the write path, not raising random knobs.

Logical Replication in Postgres: When It Is the Right Tool and When It Isn't

February 27, 2026 · 7 min read

Logical replication is more flexible than physical replication and more fragile. Use it when you need partial replication, cross-version replication, or selective sync. Don't use it for HA.

Physical Replication Slots: The Lifesaver That Quietly Fills Your Disk

February 26, 2026 · 5 min read

Physical replication slots make sure replicas can catch up after a disconnect. They also make sure your primary's disk fills if a replica is gone and forgotten.

Read Replica Staleness in Postgres: The Bug You Will Eventually Meet

February 25, 2026 · 6 min read

Read replicas are eventually consistent. The application's view of "after I wrote, my read should see it" is often wrong by milliseconds, sometimes by minutes.

Postgres Failover Readiness: The Drill That Tells You If You Are Lying to Yourself

February 24, 2026 · 6 min read

Failover is mostly fine when you do not need it and broken when you do. Here is how to know which you have.

When Replication Slots Eat the Disk: A Diagnostic Walkthrough

February 23, 2026 · 5 min read

If your Postgres disk is growing and you cannot identify the culprit, replication slots are usually the answer. Here is the diagnostic sequence.

PostgreSQL Replication Monitoring: Lag, Slots, and Failover

February 21, 2026 · 12 min read

Replication monitoring is not one lag number. You need to know stale-read risk, slot retention, replay delay, WAL growth, and whether failover would help or hurt.

AWS Aurora Postgres Replica Lag: Different from Vanilla, Different to Diagnose

February 15, 2026 · 6 min read

Aurora's replica lag has different mechanics than vanilla streaming replication. The dashboard metric "replica lag" can be misleading. Here is what it actually measures.

WAL Monitoring in Postgres: What to Watch Before Disk Becomes the Story

January 31, 2026 · 9 min read

WAL problems usually surface as disk problems, and by then it is too late. Monitor generation rate, checkpoints, archiving, replication lag, and slot retention before pg_wal owns the incident.
