Three days into a launch we could not explain why login was slow. Auth had a clean btree on email. The query plan kept seqscanning a million-row users table anyway. The reason, once we found it, was funny in the way that bug fixes are funny only after you ship them: the application was sending LOWER(email) and the index was on plain email. The planner does not care that you intended them to be the same. Different expression, different index, no plan win.
I bring it up because I have seen the same thing four times since, in three different companies. Expression indexes are simple. People also misuse them in extremely consistent ways.
Where they actually pay for themselves
Use an expression index when the application reliably wraps a column in the same function before filtering. Case-insensitive lookups (LOWER(email)), date truncation for monthly reports (date_trunc('month', created_at)), JSON path access on a hot field ((payload->>'tenant_id')), domain extraction (split_part(email, '@', 2)).
If the workload is hitting these expressions thousands of times per minute, an expression index turns a sequential scan into an index scan, and your latency drops from "why is this so slow" to "oh, it is fine." If the workload only runs the expression once a week from an analyst's notebook, save yourself the maintenance cost and let it seqscan.
-- Practical: case-insensitive email lookup
CREATE INDEX users_email_lower_idx ON users (LOWER(email));
-- The query MUST use the same expression
SELECT * FROM users WHERE LOWER(email) = LOWER($1);
The trap: the planner is literal
This is the entire bug class:
-- Index defined this way:
CREATE INDEX users_email_lower_idx ON users (LOWER(email));
-- Will NOT use the index:
SELECT * FROM users WHERE email = LOWER('[email protected]');
-- WILL use the index:
SELECT * FROM users WHERE LOWER(email) = LOWER('[email protected]');
The planner matches expressions structurally. email is not the same expression as LOWER(email). There is no "close enough." If your ORM, your data layer, or one of fifty different call sites forgets the wrapper, that call seqscans and you do not notice until traffic doubles.
The most painful version of this is when the index works for 90% of callers and a single legacy code path uses a slightly different expression — lower(email) (lowercase function name in Postgres is fine, both work) but with a trim in front, like LOWER(TRIM(email)). That call seqscans forever and the only signal is one slow line in pg_stat_statements.
Always verify with EXPLAIN
Do not assume the index helped. Confirm it.
EXPLAIN ANALYZE
SELECT * FROM users WHERE LOWER(email) = '[email protected]';
Look for Index Scan using users_email_lower_idx. If you see Seq Scan, the planner did not pick your index. Common reasons:
- The expression in the query does not match the index. Check character-by-character.
- Statistics are stale. Run
ANALYZE usersand try again. - The table is small. Postgres is not wrong to seqscan a thousand-row table.
- Your index includes a function that depends on a non-IMMUTABLE input (more on that below).
IMMUTABLE matters more than people think
Postgres only allows expression indexes on IMMUTABLE functions, because the index physical layout depends on the function returning the same value forever for the same input. Most built-ins are immutable. A few that look immutable are not:
to_char(timestamp, 'YYYY-MM')isSTABLE, not immutable, because it depends on the session timezone.now()and friends — obviously volatile.- Custom functions default to
VOLATILEunless you mark them otherwise.
If you try to index a non-immutable expression, Postgres rejects the CREATE INDEX. The fix is usually one of:
- Use the immutable variant (
extract(year FROM created_at)is immutable;to_char(...)is not). - Wrap your custom function in an immutable wrapper if you can prove the result is stable.
- Pre-compute the value in a generated column and index that instead.
Generated columns are often the cleaner answer when the expression is non-trivial:
ALTER TABLE users
ADD COLUMN email_lower TEXT GENERATED ALWAYS AS (LOWER(email)) STORED;
CREATE INDEX users_email_lower_idx ON users (email_lower);
Now callers can query email_lower directly. No expression matching required, no risk of one call site forgetting the wrapper.
Cost reality check
Expression indexes are not free. Every write recomputes the expression and updates the index. For a hot table with frequent writes, an additional index can measurably increase write latency. I have seen a single ill-considered expression index turn a 3ms insert into a 9ms insert because the function it computed had to JOIN-shaped lookups on every write.
Before creating one in production:
- Run it in a clone, measure write throughput before and after.
- Make sure the expression in the application matches exactly. Grep the codebase, do not just trust the ORM docs.
- Watch
pg_stat_user_indexesfor a few days after deploy to confirm the index is actually being used.
If an expression index is created and never used, drop it. An unused index is pure write tax.
What I do now
When a slow query lands on my desk and the column has any kind of transformation in the WHERE clause, expression index is the second thing I check (the first is whether the column has any index at all). I look at the actual SQL the application is sending, not what someone says it sends. Then I write the index to match that expression precisely, and I add a one-line comment in the migration linking the index to the query path it serves.
The three-day login bug shipped a one-line fix: LOWER(email) in the index definition. The team has since had this conversation only twice more. Good enough.