6 min read

Generated Columns in Postgres: Less Application Glue, Better Queries

Generated columns let you keep derived data in the schema instead of in scattered application code. Here is when they pay off and the corner where they bite.

Generated columns are one of the cleanest features Postgres has shipped recently, and one of the most underused. They let you compute a column value from other columns in the same row, automatically, every write. Read it back like any normal column. Index it like any normal column. Forget the application code you would have written to keep it in sync.

I use them more every year. Here is the playbook.

What they look like

CREATE TABLE products (
  id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  price_cents INTEGER NOT NULL,
  tax_rate NUMERIC(4,3) NOT NULL,
  total_cents INTEGER GENERATED ALWAYS AS
    (price_cents + (price_cents * tax_rate)::INTEGER) STORED
);

Writes only set price_cents and tax_rate. total_cents is computed by Postgres on every insert and update. Reads return it like any other column. Indexes can target it.

The STORED keyword is required (Postgres does not yet support virtual generated columns, only stored). The expression must be IMMUTABLE — it can only depend on other columns of the same row, not on now(), the session, or any external state.

Where they earn their keep

Four cases I see often:

1. Eliminating denormalization-by-application-code. If the application has been computing LOWER(email) and storing it in a separate column, replace that with a generated column and delete the application code that maintained it. One less place for bugs.

2. Creating an indexable form of an expression. Sometimes you want to index payload->>'tenant_id', but the application does the cast (payload->>'tenant_id')::int before comparing. The expression index needs to match. A generated column gives you a simple integer column to index:

ALTER TABLE events
  ADD COLUMN tenant_id INTEGER
  GENERATED ALWAYS AS ((payload->>'tenant_id')::INTEGER) STORED;

CREATE INDEX events_tenant_id_idx ON events (tenant_id);

Now WHERE tenant_id = 123 uses the index, no expression matching gymnastics required.

3. Keeping a search-friendly representation alongside the canonical one. A tsvector for full-text search:

ALTER TABLE articles
  ADD COLUMN search_vector tsvector
  GENERATED ALWAYS AS (
    setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
    setweight(to_tsvector('english', coalesce(body, '')), 'B')
  ) STORED;

CREATE INDEX articles_search_idx ON articles USING GIN (search_vector);

Full-text search index on every article, automatically maintained, no triggers.

4. Computed display values. Total prices, full names, formatted IDs. Anything where the derivation is deterministic and the same logic was being repeated across application code.

The IMMUTABLE constraint matters

Generated columns can only call IMMUTABLE functions. This rules out:

  • now() (volatile by definition).
  • to_char(timestamp, 'YYYY-MM') (stable, not immutable, depends on timezone).
  • Custom functions unless explicitly marked IMMUTABLE.

If you try, Postgres rejects the ALTER TABLE with a clear error. The fix is one of:

  • Use a different function (extract(year FROM ...) is immutable).
  • Compute the value in the application and store it as a normal column.
  • Use a trigger if the logic genuinely depends on something non-immutable.

The IMMUTABLE constraint is not arbitrary. The generated value has to be deterministic for Postgres's storage and replication guarantees to hold.

Performance characteristics

Generated columns add a small write cost (the expression has to evaluate on every insert/update of the source columns) and zero read cost (the value is just there, like any column).

For most expressions this is fine. For very expensive expressions — say, a JSON pretty-printer over a large blob — the write cost becomes meaningful. Measure if you are unsure.

Updates that do not change any source columns of a generated column do not re-evaluate it. Postgres tracks dependencies; only the affected generated columns recompute.

Migrating an existing table

Adding a generated column on an existing table rewrites the table, which is expensive on large tables:

-- Locks the table for the duration. Slow on big tables.
ALTER TABLE events
  ADD COLUMN tenant_id INTEGER
  GENERATED ALWAYS AS ((payload->>'tenant_id')::INTEGER) STORED;

For large tables, a better pattern is to add a non-generated column, backfill it, then swap. But this is more involved. For tables under a few million rows, the simple ALTER is fine.

What I would not do with generated columns

  • Use them for values that depend on external state. The IMMUTABLE rule will stop you, but I have seen people work around it with custom functions marked IMMUTABLE that lie. Do not.
  • Use them for very expensive computations on hot-write tables. The write cost can dominate.
  • Use them where a view or materialized view would be more appropriate. Generated columns are for per-row derivations. Aggregates and cross-row work belong in views.

A small example, end to end

From application logic spread across three call sites:

# Old code: every time we write a user
user.email_lower = user.email.lower()
user.save()

To schema-level enforcement:

ALTER TABLE users
  ADD COLUMN email_lower TEXT
  GENERATED ALWAYS AS (LOWER(email)) STORED;

CREATE UNIQUE INDEX users_email_lower_idx ON users (email_lower);

Application code drops the manual sync, the database guarantees consistency, and the index supports case-insensitive lookups directly. Less code, fewer bugs, same performance characteristics. This is generated columns at their best.