pg_stat_statements is the foundation of Postgres performance debugging. It tracks aggregate stats for every query the cluster has run since startup or the last reset. Total time, calls, mean time, rows returned, buffer reads.
The inconvenient property: it is cumulative. The stats you have are everything since the last reset. If you reset it during an investigation, you lose the historical baseline. If you never reset, you have years of data including queries from old workloads that no longer exist.
Getting the retention right is mostly about understanding when to reset and how to preserve history before doing so.
The basics
-- Enable in shared_preload_libraries (requires restart)
shared_preload_libraries = 'pg_stat_statements'
-- After restart, create the extension
CREATE EXTENSION pg_stat_statements;
-- Tunable parameters
pg_stat_statements.max = 10000 -- max distinct queries tracked
pg_stat_statements.track = all -- top-level + nested
pg_stat_statements.track_utility = off -- don't track DDL
pg_stat_statements.save = on -- preserve across restarts
pg_stat_statements.max is the parameter that matters most. The default is 5000. For a busy database with many distinct queries (lots of dynamic SQL or many tables), that is too low: the long tail gets evicted faster than you can analyze it.
My default: 10000-20000.
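pg_stat_statements.max can only change at server start, so raising it is a config change plus a restart. A minimal sketch (ALTER SYSTEM assumes you can edit settings directly; on a managed service this is a parameter-group change instead):
ALTER SYSTEM SET pg_stat_statements.max = 10000;
-- Takes effect only after the next restart;
-- SHOW keeps reporting the running value until then
SHOW pg_stat_statements.max;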
The save = on detail
With pg_stat_statements.save = on, the stats are written to disk at a clean shutdown and reloaded on the next start. Without it, every restart wipes everything (and even with it, a crash loses the stats, since the file is only written at shutdown).
For production, you want save = on. Otherwise every minor restart (cloud maintenance, version upgrade) destroys your debugging baseline.
When to reset
SELECT pg_stat_statements_reset(); clears all stats and starts fresh.
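On Postgres 12 and newer, the reset function also takes arguments, so you can clear a single entry without losing the rest of the baseline. A sketch, with a made-up queryid:
-- userid 0 and dbid 0 mean "any"; only the given queryid is cleared
SELECT pg_stat_statements_reset(0, 0, 123456789);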
Reasons to reset:
- After a major application change that invalidates the historical pattern. Old queries are no longer relevant; the table is full of dead entries.
- After tuning autovacuum or major settings to see the impact cleanly.
- Quarterly housekeeping if pg_stat_statements.max is being hit and eviction is pushing out entries you still need.
Reasons NOT to reset:
- During active debugging. You will lose the data you need.
- Just because the table is large. Size alone is not a problem; what matters is whether the data is useful.
Snapshot before reset
If you reset, save the current state first:
CREATE TABLE pg_stat_statements_snapshot_2025_01_15 AS
SELECT now() AS captured_at, *
FROM pg_stat_statements;
This preserves the pre-reset state for later comparison. If you discover you need the old data, you have it.
For permanent history, schedule snapshots:
-- pg_cron job
SELECT cron.schedule('snapshot-pg-stat-statements', '0 0 * * *',
$$INSERT INTO pg_stat_statements_history
SELECT now(), * FROM pg_stat_statements$$);
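The job assumes a pg_stat_statements_history table already exists. A minimal sketch of one way to create it, mirroring the view's columns plus a capture timestamp:
-- Same shape as the view plus captured_at; WITH NO DATA leaves it empty
CREATE TABLE pg_stat_statements_history AS
SELECT now() AS captured_at, *
FROM pg_stat_statements
WITH NO DATA;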
Daily snapshots into a history table. You can compute deltas:
WITH yesterday AS (
SELECT * FROM pg_stat_statements_history
WHERE captured_at = (
SELECT max(captured_at) FROM pg_stat_statements_history
WHERE captured_at < now() - interval '1 day'
)
),
today AS (
SELECT * FROM pg_stat_statements_history
WHERE captured_at = (
SELECT max(captured_at) FROM pg_stat_statements_history
)
)
SELECT t.queryid, t.calls - y.calls AS new_calls, t.mean_exec_time, t.query
FROM today t
JOIN yesterday y USING (queryid)
ORDER BY (t.calls - y.calls) DESC
LIMIT 20;
Delta-based queries reveal what changed yesterday vs today. Much more useful than absolute stats for change detection.
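The history table grows by one row per tracked statement per day, so give it a retention policy of its own. A sketch that keeps 90 days, again via pg_cron (the job name and window are arbitrary):
SELECT cron.schedule('prune-pg-stat-statements-history', '30 0 * * *',
$$DELETE FROM pg_stat_statements_history
WHERE captured_at < now() - interval '90 days'$$);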
What the columns mean
SELECT
queryid,
calls,
total_exec_time, -- total time spent in this query
mean_exec_time, -- average per call
stddev_exec_time, -- variance
rows, -- total rows returned
shared_blks_read, -- pages read from disk
shared_blks_hit, -- pages from cache
query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
The diagnostic interpretations:
- High total_exec_time with high calls: a fast query that runs constantly. The cost is cumulative.
- High total_exec_time with low calls: one slow query. EXPLAIN it.
- High mean_exec_time: each call is slow.
- High stddev_exec_time: variable performance. Often a sign of plan instability or parameter-dependent plans.
- High shared_blks_read: I/O-heavy. Possible missing index or bloat (see the hit-ratio sketch below).
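A sketch that turns the two buffer columns into a per-query cache hit ratio, which makes the I/O-heavy entries easy to spot (the 100-call floor is arbitrary, just to skip noise):
SELECT queryid,
       calls,
       shared_blks_read,
       round(100.0 * shared_blks_hit /
             nullif(shared_blks_hit + shared_blks_read, 0), 1) AS hit_pct,
       left(query, 60) AS query
FROM pg_stat_statements
WHERE calls > 100
ORDER BY shared_blks_read DESC
LIMIT 20;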
Common pitfalls
Forgetting to install pg_stat_statements from the start. Adding it later requires a restart, and the history only begins from the moment you enabled it, not from the start of the cluster's life.
Always enable on day 1 of any production database.
The track = top default misses nested queries. A query inside a function or trigger does not show up unless track = all. For applications with stored procedures, set track = all.
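A quick way to confirm what the running cluster actually has set (the extension's parameters show up in pg_settings like any other GUC):
SELECT name, setting
FROM pg_settings
WHERE name LIKE 'pg_stat_statements.%';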
Hitting pg_stat_statements.max and not noticing. When the limit is hit, existing entries are evicted to make room. Eviction is not random: the least-used entries go first, so infrequent queries disappear from the stats before the hot ones.
SELECT count(*) FROM pg_stat_statements;
-- Compare to pg_stat_statements.max setting
If the count is at the limit, raise it.
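On Postgres 14 and newer, pg_stat_statements_info makes the churn visible directly: dealloc counts entries evicted since the last reset, and stats_reset tells you how old the baseline is.
-- dealloc > 0 means the hash table has been overflowing
SELECT dealloc, stats_reset FROM pg_stat_statements_info;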
Truncated query text. track_activity_query_size (default 1024 bytes) limits how much of the currently running statement pg_stat_activity captures, so long queries get cut off and become harder to match against their pg_stat_statements entries (on modern versions the extension itself stores full query texts in an external file).
Raise it to 4096 or 8192 for more readable query text; the change takes effect only at server start.
On managed services
RDS, Cloud SQL, Azure Flex all support pg_stat_statements. Enabling typically requires a parameter group change and a restart.
The save behavior on managed services: pg_stat_statements.save = on is the default but the underlying file is on the managed service's storage. After major version upgrades, the data may be lost — verify after the upgrade.
For the snapshot approach, use Cloud SQL's pg_cron support, RDS's lambda-triggered approach, or just an external cron.
What I commit to
For any production database:
- pg_stat_statements installed from day 1.
- pg_stat_statements.max = 10000 or higher.
- track = all, save = on, track_activity_query_size = 4096.
- Daily snapshot to a history table for delta analysis.
- Reset only after major application changes, with a snapshot first.
This turns pg_stat_statements from a point-in-time snapshot into a longitudinal record. The investigation tools that can answer "what changed last week" or "what queries appeared after the deploy" depend on having that history.