The first time someone asks "why is COUNT(*) so slow," they have usually just run it on a 200M-row table and waited 90 seconds. The bug report follows: "the database is slow."
It is not slow. It is doing the only thing the query asked for. The answer is rarely to make COUNT(*) faster — it is to ask whether you actually need an exact count.
Here is the framework I use.
Why COUNT(*) cannot use an index alone
Postgres uses MVCC. A row can exist in several versions at once, each visible to some transactions and not to others. Indexes do not store visibility information, so the database cannot answer COUNT(*) from an index alone: it has to consult the heap to determine which row versions are currently visible.
The result: COUNT(*) on an unfiltered table is roughly an O(n) operation. On 500M rows, that is a lot of work no matter how fast your disk is.
The visibility map and index-only scans help: pages marked all-visible can be counted from the index without heap fetches. On a large, heavily updated table, though, that is rarely enough to make a 500M-row COUNT(*) fast. The query is doing what it has to do.
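You can watch this at work, assuming an orders table with any usable index, by checking how many heap fetches the plan still needs:

```sql
-- Refresh the visibility map, then look at the plan.
VACUUM orders;

EXPLAIN (ANALYZE)
SELECT count(*) FROM orders;
-- If the planner chooses an Index Only Scan, the node reports
-- "Heap Fetches: N". A high N means the visibility map could not
-- vouch for many pages, so the heap had to be consulted anyway.
```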
When exact COUNT is necessary
Genuine cases:
- Pagination that needs a total page count.
- Compliance reports.
- Billing calculations.
- Anywhere the user reads the number and trusts it to the digit.
For these, you have to pay the cost. Mitigations:
- Run the COUNT on a read replica.
- Cache the result for a few seconds at the application layer.
- Run it asynchronously and report "loading..." while it works.
When approximate COUNT is fine
Most product UI cases. "This filter has about 1.2 million matching rows" is enough information for a search results page. The user does not care about 1,234,567 vs 1,234,568.
For unfiltered or simply-filtered counts, pg_class.reltuples is the cheap approximation:
SELECT reltuples::BIGINT AS estimate
FROM pg_class
WHERE oid = 'orders'::regclass;
The estimate is refreshed by VACUUM and ANALYZE, including autovacuum's runs. Accuracy is generally within a few percent on a healthy, regularly analyzed table. Cost is microseconds.
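If the table has grown or shrunk since the last ANALYZE, a common refinement (a sketch, not something Postgres does for you) is to scale the stats-derived rows-per-page density by the table's current size:

```sql
-- reltuples / relpages gives rows per page at last ANALYZE;
-- multiply by the current page count for a fresher estimate.
SELECT (reltuples / NULLIF(relpages, 0)
        * (pg_relation_size(oid) / current_setting('block_size')::int))::bigint
       AS estimate
FROM pg_class
WHERE oid = 'orders'::regclass;
```

NULLIF guards against a never-analyzed table, where relpages is 0 and the estimate is meaningless anyway.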
For filtered counts, the EXPLAIN-based estimate works:
EXPLAIN SELECT * FROM orders WHERE customer_id = 123;
-- Look at the rows= field in the output
You can extract this programmatically with EXPLAIN (FORMAT JSON) and parse the result. Some teams build a function that returns the estimated count from EXPLAIN; it is a 20-line plpgsql function that solves a real problem.
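A sketch of that helper, here as a hypothetical count_estimate function (the name and shape are illustrative, not built in):

```sql
CREATE OR REPLACE FUNCTION count_estimate(query TEXT)
RETURNS BIGINT AS $$
DECLARE
  plan JSON;
BEGIN
  -- Ask the planner for its row estimate without running the query.
  -- Note: the query string is interpolated as-is; only pass trusted SQL.
  EXECUTE 'EXPLAIN (FORMAT JSON) ' || query INTO plan;
  RETURN (plan -> 0 -> 'Plan' ->> 'Plan Rows')::NUMERIC::BIGINT;
END;
$$ LANGUAGE plpgsql;

-- Usage:
-- SELECT count_estimate('SELECT * FROM orders WHERE customer_id = 123');
```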
When pagination needs better than this
"Page 1 of 1234" pagination requires an exact count. "Page 1 of many" can use a relative count: "showing 25 of approximately 1.2M."
Keyset pagination removes the count problem entirely:
-- Instead of OFFSET, use the last row's id
SELECT * FROM orders
WHERE id > $last_seen_id
ORDER BY id
LIMIT 25;
The UI shows "next page" without claiming to know the total. For most product views, this is the right tradeoff: faster, no count needed, scales to arbitrary table sizes.
The applications that genuinely need page-of-N pagination are rare. Search results, admin dashboards, sometimes audit views. For everything else, keyset is better.
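One wrinkle worth noting: when the sort key is not unique, keyset pagination needs a tiebreaker column. A common shape, assuming an index on (created_at, id):

```sql
-- Row-value comparison: strictly after the last row seen,
-- even when several rows share the same created_at.
SELECT * FROM orders
WHERE (created_at, id) > ($last_created_at, $last_seen_id)
ORDER BY created_at, id
LIMIT 25;
```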
Maintaining a counter table
For counts that change frequently and must be cheap to read, a counter table:
CREATE TABLE table_counts (
  table_name TEXT PRIMARY KEY,
  row_count  BIGINT NOT NULL
);
-- Trigger to update on INSERT/DELETE
CREATE OR REPLACE FUNCTION increment_orders_count()
RETURNS TRIGGER AS $$
BEGIN
  IF TG_OP = 'INSERT' THEN
    UPDATE table_counts SET row_count = row_count + 1 WHERE table_name = 'orders';
  ELSIF TG_OP = 'DELETE' THEN
    UPDATE table_counts SET row_count = row_count - 1 WHERE table_name = 'orders';
  END IF;
  RETURN NULL;  -- AFTER trigger: the return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_count_trigger
AFTER INSERT OR DELETE ON orders
FOR EACH ROW EXECUTE FUNCTION increment_orders_count();
The trade: every insert and delete now contends on a single row in table_counts. Under high concurrency, this becomes a serialization bottleneck.
For very high-throughput tables, the better shape is a sharded counter: several rows per counted table, summed at read time:
CREATE TABLE table_counts_sharded (
  table_name TEXT,
  shard      SMALLINT,
  delta      BIGINT NOT NULL,
  PRIMARY KEY (table_name, shard)
);
-- Insert random shard, periodically aggregate
This is more code, but it spreads the writes across shards and removes the single-row hotspot.
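A sketch of the write and read paths under that shape (the shard count of 8 is an arbitrary choice):

```sql
-- Write path (e.g. from the trigger body): pick a random shard so
-- concurrent writers rarely contend on the same row.
INSERT INTO table_counts_sharded (table_name, shard, delta)
VALUES ('orders', floor(random() * 8)::smallint, +1)
ON CONFLICT (table_name, shard)
DO UPDATE SET delta = table_counts_sharded.delta + EXCLUDED.delta;

-- Read path: the count is the sum of all shard deltas.
SELECT sum(delta) AS row_count
FROM table_counts_sharded
WHERE table_name = 'orders';
```

A periodic job can fold the shards back into one row if you want reads to touch less data.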
What I do today
For any new feature that mentions a count:
- Ask: does the user really need an exact number, or an approximate one?
- If approximate: pg_class.reltuples or the EXPLAIN-based estimate.
- If exact and rarely viewed: COUNT(*) on a read replica with a cache.
- If exact and frequently viewed: counter table, possibly sharded.
- If pagination: keyset, not OFFSET-based.
The rule that matters: do not let "the user wants a count" automatically mean "COUNT(*) on every page load." That is the path to slow pages, then to a slow database, then to an angry team trying to optimize the unoptimizable.