Postgres High I/O: Diagnosing Where Disk Goes Wrong

When your database is I/O-bound, the question is not whether the disk is fast enough. It is which queries are doing the I/O and whether they should be.

A Postgres instance saturated on I/O does not look like a database problem on the surface. The application is slow. CPU is fine. Memory is fine. The dashboard says "disk reads/sec" is high, but that is a number, not a diagnosis.

The question to answer: which queries are causing the I/O, and is that I/O necessary?

Here is the sequence I run when I get this kind of problem.

Step 1: Confirm I/O is the bottleneck

SELECT wait_event_type, wait_event, count(*)
FROM pg_stat_activity
WHERE state = 'active'
GROUP BY 1, 2
ORDER BY count(*) DESC;

If many queries are waiting on IO events (DataFileRead, DataFileWrite, WALSync), I/O is the constraint. If they are waiting on Lock, LWLock, or other things, the issue is elsewhere.
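
A quick variant that just counts how many of the active backends are sitting in IO waits (the FILTER clause requires Postgres 9.4 or newer):

SELECT
  count(*) FILTER (WHERE wait_event_type = 'IO') AS waiting_on_io,
  count(*) AS active_backends
FROM pg_stat_activity
WHERE state = 'active';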

Step 2: Check buffer cache hit ratio

SELECT
  sum(heap_blks_read) AS reads,
  sum(heap_blks_hit) AS hits,
  round(100.0 * sum(heap_blks_hit) / nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0), 2) AS hit_ratio_pct
FROM pg_statio_user_tables;

For a typical OLTP workload, a healthy buffer cache hit ratio is above 95%. Below 90%, a meaningful fraction of reads is coming from disk.

If the hit ratio is low, two possibilities:

  • The data set is much larger than shared_buffers. Increase memory or reduce the working set.
  • A specific query is doing a sequential scan over a huge table, polluting the buffer cache. Find it and fix it (the per-table breakdown below helps locate it).
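
To locate that offender, a per-table breakdown of the same statistics (largest disk readers first) points at where the reads are landing:

SELECT
  relname,
  heap_blks_read,
  heap_blks_hit,
  round(100.0 * heap_blks_hit / nullif(heap_blks_hit + heap_blks_read, 0), 2) AS hit_ratio_pct
FROM pg_statio_user_tables
ORDER BY heap_blks_read DESC
LIMIT 20;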

Step 3: Find the I/O-heavy queries

SELECT
  round(total_exec_time::numeric, 0) AS total_ms,
  calls,
  round(mean_exec_time::numeric, 2) AS mean_ms,
  shared_blks_read,
  shared_blks_hit,
  query
FROM pg_stat_statements
ORDER BY shared_blks_read DESC
LIMIT 20;

The top queries by shared_blks_read are the ones causing the most physical reads. Each block is 8KB.
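
A variant that converts the block counts into bytes (assuming the default 8 KB block size) and adds a per-query cache hit ratio makes the numbers easier to compare:

SELECT
  calls,
  pg_size_pretty(shared_blks_read * 8192) AS read_volume,
  round(100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0), 1) AS hit_pct,
  query
FROM pg_stat_statements
ORDER BY shared_blks_read DESC
LIMIT 20;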

For each of the top queries, the diagnostic question:

  • Are they doing sequential scans where indexes would be cheaper? (Check with EXPLAIN.)
  • Are they reading bloated tables/indexes that should be vacuumed?
  • Are they touching cold data that fell out of cache?

Step 4: EXPLAIN the worst offender

EXPLAIN (ANALYZE, BUFFERS)
<the worst query from step 3>;

Look for:

  • Seq Scan over large tables: missing index, almost always.
  • Buffers: shared read=N — N is the number of pages this query read from outside shared buffers (the OS page cache or the disk). If it is large, the query is doing a lot of I/O.
  • Buffers: shared hit=N — pages served from cache. The ratio of hit to read tells you whether this query is being served from cache.

The output usually points directly at the problem. A query reading 100,000 pages to return 10 rows wants an index.
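
As an illustration (the table, numbers, and plan below are made up), the shape to watch for looks like this:

Seq Scan on orders  (cost=0.00..118000.00 rows=10 width=96) (actual time=3.1..2150.8 rows=10 loops=1)
  Filter: (customer_id = 42)
  Rows Removed by Filter: 4999990
  Buffers: shared hit=1200 read=98800
Planning Time: 0.2 ms
Execution Time: 2151.3 ms

Roughly 100,000 buffers touched, almost all read from outside the cache, to return 10 rows: a textbook case for an index on customer_id.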

Step 5: Check vacuum status

Queries against bloated tables read more pages than they should; many of those pages hold dead tuples that vacuum would have cleaned up.

SELECT
  schemaname, relname,
  n_live_tup, n_dead_tup,
  round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 1) AS dead_pct,
  last_autovacuum
FROM pg_stat_user_tables
WHERE n_live_tup > 1000
ORDER BY n_dead_tup DESC
LIMIT 20;

Any table with dead_pct > 30% is a vacuum candidate. Queries on those tables burn I/O scanning pages that are mostly dead tuples. A VACUUM (or pg_repack for severe cases) reduces the I/O footprint.
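
A minimal sketch of the manual fix, with an illustrative table name:

-- Reclaim dead tuples and refresh planner statistics for one bloated table
VACUUM (VERBOSE, ANALYZE) orders;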

Step 6: Common patterns

The specific patterns that account for most I/O incidents:

1. A new query path with no matching index.

A new feature shipped a query that filters on a column with no index. The query does a full sequential scan of a large table, and that scan dominates the I/O.

Fix: add the index. CREATE INDEX CONCURRENTLY to avoid locks.
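
For example (table and column names are illustrative):

-- Builds the index without blocking writes; cannot run inside a transaction block
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders (customer_id);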

2. A query that should use an index does not because of a function or expression.

WHERE LOWER(email) = ? cannot use a plain index on email. The planner cannot match the expression to the indexed column, so the query falls back to a sequential scan.

Fix: expression index (LOWER(email)), or generated column with regular index.
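
A sketch of the expression-index fix (table name illustrative):

-- The planner can now match WHERE LOWER(email) = ... against this index
CREATE INDEX CONCURRENTLY idx_users_email_lower ON users (LOWER(email));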

3. Bloated index.

The index has grown to many times its compact size, so scans drag in pages that are mostly dead entries.

Fix: REINDEX INDEX CONCURRENTLY.
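
To find the largest (and possibly unused) indexes and rebuild one online (index name illustrative; REINDEX ... CONCURRENTLY requires Postgres 12 or newer):

-- Largest indexes first, with how often each has been scanned
SELECT
  indexrelname,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
  idx_scan
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC
LIMIT 20;

-- Rebuild the bloated index without blocking writes
REINDEX INDEX CONCURRENTLY idx_orders_customer_id;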

4. A heavy report running during peak.

A single analytical query that scans an entire table runs during peak hours. The buffer cache fills with that table; everything else needs to read from disk.

Fix: schedule the report off-peak, or run it on a read replica.

5. Hot rows being read with cold rows.

A query like SELECT * FROM orders WHERE customer_id = 1 ORDER BY created_at DESC has to fetch customer 1's rows from the heap. If those rows are scattered across the table (inserted over time, interleaved with other customers), the query reads many pages to return a few rows.

Fix: CLUSTER the table on the relevant index. Or partition by customer_id. Or accept the I/O.
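
The CLUSTER variant, as a sketch (table and index names are illustrative):

-- Physically rewrites the table in index order so each customer's rows sit on adjacent pages.
-- Takes an ACCESS EXCLUSIVE lock for the duration, so run it in a maintenance window.
CLUSTER orders USING idx_orders_customer_created;

Note that CLUSTER does not maintain the ordering for rows inserted afterwards, so it may need to be repeated.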

Step 7: Check for sequential scans more broadly

SELECT
  schemaname, relname,
  seq_scan,
  seq_tup_read,
  idx_scan,
  pg_size_pretty(pg_relation_size(format('%I.%I', schemaname, relname)::regclass)) AS table_size
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY seq_tup_read DESC
LIMIT 20;

Tables with a huge seq_tup_read and little or no idx_scan activity are being hit by repeated sequential scans. For large tables, this is the smoking gun for a missing index.

Step 8: Verify the disk itself

If the queries are well-indexed and vacuum is happy and the buffer cache hit ratio is good, the bottleneck is the disk's actual capacity.

Check iostat or the cloud provider's I/O metrics. Symptoms of disk saturation:

  • I/O latency above 10-20ms (modern SSD should be sub-millisecond).
  • Queue depth consistently high.
  • Throughput at the disk's stated maximum.

For cloud-managed instances, the fix is usually a higher IOPS tier or a larger instance. For self-managed hardware, it may mean faster NVMe drives or spreading the I/O across more devices.

A pragmatic checklist

When I diagnose high I/O on someone else's Postgres:

  1. Confirm I/O wait, not lock wait or CPU.
  2. Check buffer cache hit ratio.
  3. Find top queries by shared_blks_read in pg_stat_statements.
  4. EXPLAIN the worst few. Look for sequential scans on big tables.
  5. Check vacuum/bloat status.
  6. Add missing indexes, vacuum bloated tables, schedule heavy reports off-peak.
  7. Only after all of the above, consider scaling up the instance.

In 9 out of 10 cases, the answer is one of: missing index, bloated table, or a heavy query that should not run during peak. The disk is rarely the real bottleneck; the workload that generates the I/O is.