JSONB is one of the better things Postgres has shipped in the last decade. Schemaless data, real query support, real indexing. It also has the highest "I added an index but nothing got faster" rate of any feature in Postgres, because JSONB indexing has more nuance than the basics of B-tree.
Here is what I have learned, mostly the hard way, about which indexes actually pay for themselves on JSONB columns.
The two main flavors of JSONB index
GIN index over the whole column. Indexes every key-value pair. Big, expensive to maintain, but answers any ?, ?|, ?&, @> query.
CREATE INDEX events_payload_gin
ON events USING GIN (payload);
Expression index on a specific path. A B-tree index on a single extracted field. Cheap, fast, but only helps queries that hit that exact path.
CREATE INDEX events_user_id_idx
ON events ((payload->>'user_id'));
The choice between them comes down to how queries actually look. If your application filters JSONB ten different ways across many keys, GIN. If 95% of queries filter on user_id, an expression index on user_id is much cheaper.
The query operators that match each index
This is the part that bites people:
payload @> '{"user_id": 123}'— containment. Uses GIN.payload ? 'user_id'— key exists. Uses GIN.payload->>'user_id' = '123'— extract and compare. Uses expression index, NOT GIN.(payload->>'user_id')::int = 123— extract and cast. Uses expression index ONLY if the index also has the cast.
That last one is the most common foot-gun. If your application does (payload->>'user_id')::int = 123 and your index is on (payload->>'user_id') without the cast, the planner cannot use the index. You either match the cast in the index:
CREATE INDEX events_user_id_int_idx
ON events (((payload->>'user_id')::int));
Or query without the cast:
SELECT * FROM events WHERE payload->>'user_id' = '123';
Note the quotes around '123' — the result of ->> is text, so the comparison must be text.
When a GIN index is the right call
GIN indexes are big and slow to update. The index size is often a sizable fraction of the table size. Inserts and updates rebuild posting lists, which means write throughput drops.
The right time for GIN:
- Read-heavy table where the application filters JSONB by many different paths.
- The schema of the JSONB column genuinely varies across rows.
- The query patterns include containment (
@>) or key existence (?), which only GIN supports.
There is also jsonb_path_ops operator class, which only supports @> and is much smaller and faster than the default:
CREATE INDEX events_payload_path_ops_gin
ON events USING GIN (payload jsonb_path_ops);
If your queries only use @>, this variant is significantly cheaper. If you need other operators, use the default.
When an expression index is the right call
For most production JSONB workloads I have seen, the queries do not vary as much as the schema suggests. There is usually one or two paths queried hot and a long tail of analytical queries that can seqscan.
For those cases, expression indexes on the hot paths beat GIN handily:
- 10x smaller index.
- Negligible maintenance overhead.
- Same speed for the queries that match.
The downside: the index only helps queries that hit that exact expression. If a new query path emerges, you add another expression index.
My default is to start with expression indexes on the obvious hot paths, then add a jsonb_path_ops GIN if and when the query mix grows.
What does NOT use a JSONB index
A few patterns that look like they should but do not:
payload @@ '$.user_id == 123'— JSON path queries. These do not use GIN by default; you need ajsonb_opsGIN index, which is the default GIN, not the path-ops variant.payload->'user_id'(single arrow, returns JSONB) compared to a JSONB literal. Works only if the index expression also returns JSONB. The arrow vs double-arrow distinction matters.WHERE jsonb_array_length(payload->'tags') > 5. Functions on JSONB columns generally do not use indexes unless you create an expression index on the exact function call.
The rule: check what the query actually does with EXPLAIN before assuming any of this.
A worked example
Real scenario: events table with a JSONB payload, queries like payload->>'tenant_id' = '...' running thousands of times per minute, and an occasional analyst running payload @> '{"event_type": "signup"}'.
What I would do:
-- Hot path: tenant_id filter
CREATE INDEX events_tenant_id_idx
ON events ((payload->>'tenant_id'));
-- Cold path: containment queries — small jsonb_path_ops GIN
CREATE INDEX events_payload_path_ops
ON events USING GIN (payload jsonb_path_ops);
The expression index handles 99% of traffic at minimal cost. The path_ops GIN handles the analytical queries without the maintenance overhead of a full GIN. Total index size is a fraction of what a single full GIN would have been.
What I do before creating any JSONB index
Look at pg_stat_statements. Find the actual queries hitting the JSONB column. Group them by which path or operator they use. Then create the smallest set of indexes that covers the hot ones.
Doing this before reaching for GIN almost always lands on a smaller, cheaper, faster set of indexes than the obvious answer.