pg_stat_statements Workflow
How to use pg_stat_statements as a production tuning queue: total time, mean time, variance, calls, I/O, temp blocks, resets, and before/after snapshots.
Why pg_stat_statements is the starting point
Slow query logs show individual events. pg_stat_statements shows workload impact. It groups queries by normalized shape and tells you which query patterns consume the most time, run most often, read the most blocks, or spill to temp files.
Rank by total time first
SELECT
queryid,
calls,
round(total_exec_time::numeric, 2) AS total_ms,
round(mean_exec_time::numeric, 2) AS mean_ms,
round(stddev_exec_time::numeric, 2) AS stddev_ms,
rows,
left(query, 180) AS query_sample
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
Total time answers the first tuning question: where does the database spend the most cumulative execution time?
Use variance to find parameter-sensitive queries
SELECT
queryid,
calls,
round(mean_exec_time::numeric, 2) AS mean_ms,
round(stddev_exec_time::numeric, 2) AS stddev_ms,
round((stddev_exec_time / NULLIF(mean_exec_time, 0))::numeric, 2) AS variation,
left(query, 180) AS query_sample
FROM pg_stat_statements
WHERE calls > 100
ORDER BY variation DESC NULLS LAST
LIMIT 20;
High variance usually means the same query shape behaves differently across tenants, time ranges, statuses, or parameter values. Test more than one parameter set before changing the index design.
Separate I/O and temp-file pain
SELECT
queryid,
calls,
shared_blks_read,
shared_blks_hit,
temp_blks_written,
wal_bytes,
left(query, 180) AS query_sample
FROM pg_stat_statements
ORDER BY temp_blks_written DESC, shared_blks_read DESC
LIMIT 20;
Queries with high temp writes may need better indexes, smaller inputs, query rewrites, or targeted memory. Queries with high shared reads may be scanning too much data or suffering from cold cache and storage pressure.
Before and after workflow
- Snapshot the top queries and query IDs.
- Pick one query class and run
EXPLAIN (ANALYZE, BUFFERS). - Make one change: index, statistics, rewrite, memory, or batching.
- Let a representative workload run.
- Compare total time, mean time, variance, and block counters.
Reset stats deliberately, not casually. If you reset during an incident, save the current snapshot first.