Indexes8 min read

Partial Indexes in Postgres: The Right Index Is Often a Smaller One

Partial indexes work when the product mostly asks about a predictable slice of a table. They fail when the predicate is vague, parameterized, or not repeated by the query shape.

The first partial index I usually recommend is not clever. It is boring: index the rows the product actually shows.

One SaaS team had an events table with 280 million rows. The dashboard almost always asked for open events from the last 30 days. The existing index covered every tenant, every status, every archived row, and six years of history. The index was technically correct. It was also too large to stay hot, expensive to maintain, and not shaped around the real workload.

A partial index fixed the dashboard because the query did not need a better global index. It needed a smaller index over the hot slice of the table.

The framework: index the invariant slice

I use a partial index when three things are true:

  • The query repeatedly targets a stable subset of rows.
  • The subset is much smaller than the whole table.
  • The application can express the same predicate in the query every time.

That last point is where many partial indexes fail. PostgreSQL can use a partial index only when the planner can prove the query condition implies the index predicate. If the predicate is hidden behind application logic, parameterization, or a slightly different condition, the index may sit unused.

The soft-delete pattern

Soft deletes are the cleanest starting point. Most application paths want live rows:

CREATE INDEX CONCURRENTLY idx_tasks_live_by_account
ON tasks (account_id, due_at DESC)
WHERE deleted_at IS NULL;

The matching query has to carry the same rule:

SELECT *
FROM tasks
WHERE account_id = $1
  AND deleted_at IS NULL
ORDER BY due_at DESC
LIMIT 50;

This index avoids carrying every deleted task forever. It also reduces write cost for rows outside the predicate. When a row is already deleted, updates that do not move it back into the live set do not need to maintain this index.

The hot-status pattern

Partial indexes also work well for queues, moderation workflows, payment states, and operational dashboards where one status is disproportionately important:

CREATE INDEX CONCURRENTLY idx_jobs_ready_priority
ON jobs (priority DESC, created_at)
WHERE status = 'ready';

This is useful only if ready is selective. If 70% of the table is ready, the partial index may not save much. If 1% is ready and workers poll that slice all day, it can be the difference between a tiny hot index and a large cold one.

Measure selectivity before building it

I start with the row mix, not the index syntax:

SELECT
  count(*) AS total_rows,
  count(*) FILTER (WHERE deleted_at IS NULL) AS live_rows,
  round(100.0 * count(*) FILTER (WHERE deleted_at IS NULL) / count(*), 2) AS live_pct
FROM tasks;

Then I check whether the query shape is consistent in pg_stat_statements. A partial index that helps one hand-written SQL path but misses the ORM-generated version is a maintenance liability.

Prove the planner can use it

After building the index, use the actual query shape:

EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM tasks
WHERE account_id = 42
  AND deleted_at IS NULL
ORDER BY due_at DESC
LIMIT 50;

You want to see the partial index in the plan and fewer buffers touched. If the plan ignores the index, do not assume PostgreSQL is wrong. Check whether the query predicate exactly implies the index predicate.

The parameterized-query trap

This index looks tempting:

CREATE INDEX CONCURRENTLY idx_orders_unpaid
ON orders (account_id, due_at)
WHERE status = 'unpaid';

But if the application always emits:

WHERE status = $2

the planner may not be able to prove at plan time that $2 means 'unpaid'. Depending on plan caching and query construction, the partial index can be missed. For hot product paths, I prefer making the predicate literal in the query or using a dedicated repository method that owns that query shape.

Partial indexes are not partitioning

Do not create one partial index per tenant, per month, or per status value unless you have a very specific reason. PostgreSQL documentation calls out that the system does not understand a set of partial indexes as a partitioning scheme. The planner may have to test candidates, and operations teams inherit a pile of narrow indexes nobody wants to maintain.

If your real need is data lifecycle, pruning, or tenant isolation, consider partitioning. If your need is a small hot working set inside a large table, use a partial index.

When I avoid partial indexes

  • The predicate changes often because product rules are still moving.
  • The subset is not meaningfully smaller than the table.
  • The query builder cannot reliably emit the same predicate.
  • The team cannot monitor whether the index is used after deployment.

A partial index is a contract between the schema and the query shape. If the query stops keeping its side of the contract, the index becomes silent clutter.

The rollout I trust

  1. Measure predicate selectivity.
  2. Confirm the exact query shape from production.
  3. Build with CREATE INDEX CONCURRENTLY.
  4. Compare plans and buffer usage before and after.
  5. Watch pg_stat_user_indexes.idx_scan for real usage.
  6. Drop the broader duplicate index only after normal traffic proves the replacement.

The win is not that partial indexes are advanced. The win is that the database stops maintaining index entries the product almost never asks for.