What we learned chasing replication lag in Postgres 16

By Anders L. · April 15, 2026 · 7 min read

One of our larger customers had a Postgres 16.2 cluster with three async replicas. For two weeks they were chasing a phantom 8–12 second lag on one specific replica that always cleared itself within five minutes. Their on-call told us it “probably wasn’t important” but they wanted to understand it. So did we.

The temptation: blame disk

The instinct is always disk: “the replica is on slower NVMe, the WAL apply can’t keep up”. We checked. iostat showed <15% utilization on the replica during lag windows. pg_stat_wal_receiver showed flush_lag < 50 ms. The replica was receiving WAL just fine; it just wasn’t replaying it.

The actual cause

It was a large analytical query on the replica. A nightly job ran a 12-minute aggregation against a 1.2 TB orders table. Postgres’s hot_standby_feedback=on kept the replication slot open while that query ran, but the apply WAL replay was actually pausing on a DROP INDEX CONCURRENTLY issued earlier on the primary. The replay couldn’t take the AccessExclusiveLock because the long query was holding AccessShareLock on the same table.

So the timeline was:

  • 00:00 — nightly aggregation starts on replica, takes AccessShareLock on orders.
  • 00:04 — primary issues a DROP INDEX CONCURRENTLY on orders.
  • 00:04 — WAL arrives at replica with the lock-acquire record. Replay blocks.
  • 00:12 — aggregation finishes, lock released, replay catches up in < 5 s.

How we found it

Three queries pointed straight at the answer:

-- 1) is replay actually waiting on a lock?
SELECT * FROM pg_stat_activity
WHERE backend_type = 'startup' AND wait_event IS NOT NULL;

-- 2) what's holding the conflicting lock?
SELECT pid, mode, locktype, relation::regclass, granted
FROM pg_locks WHERE NOT granted;

-- 3) on the primary, what was the WAL doing at that timestamp?
SELECT pg_walfile_name(pg_current_wal_lsn());

The fix was unglamorous. We split the analytical query into ten smaller transactions of about 70 seconds each (each one within max_standby_streaming_delay), and we now schedule schema changes for a window that doesn’t overlap nightly reporting. Lag has been < 200 ms ever since.

The general lesson

Replication lag in modern Postgres is almost never about hardware. The replay process is single-threaded, and any AccessExclusive lock the primary takes (CREATE INDEX, DROP INDEX, VACUUM FULL, CLUSTER, REINDEX) will pause replay if a replica session is holding even a low-grade lock on the same object.

If you’re seeing “mystery lag” that clears itself in 5–15 minutes, before you start tuning checkpoint_timeout or buying more NVMe, just look at pg_stat_activity on the replica during the lag window. Nine times out of ten the answer is one row away.


Got a Postgres mystery you can’t crack? Send it our way.