GIN and GiST are the two non-B-tree index types most teams encounter. They cover the cases B-tree cannot — full-text search, JSONB containment, range overlaps, geometric queries. The names are jargon, the documentation is dense, and the choice between them is rarely as obvious as people would like.
Here is the rule of thumb I use.
What each one is
GIN (Generalized Inverted iNdex) is an inverted index. For each indexed value (a word, a JSONB key-value pair, an array element), it stores a list of row IDs that contain that value. Querying "which rows contain X" is a fast lookup followed by a list scan.
GiST (Generalized Search Tree) is a tree-shaped index that uses a configurable comparison function for each operator class. It works for any data type that has a sensible "these two values overlap" or "this value contains that one" definition.
Both can support full-text search, JSONB queries, range queries. The right one depends on the workload.
The general rule
GIN is faster to read and slower to write. It is the right choice when:
- Reads dominate writes (typically 5:1 or more).
- You query existence/containment (
@>,?,@@) rather than range overlap. - The number of distinct values per row is moderate (a few dozen tags, a few hundred words). Very high per-row cardinality makes GIN's posting lists huge.
GiST is faster to write and slower to read. It is the right choice when:
- Writes are frequent and read latency is tolerable.
- You need range overlap (
&&for tsranges, geometry, etc.). - The data has natural geometric structure (PostGIS, the
range_opsoperator class). - You need exclusion constraints (only GiST supports
EXCLUDE USING gist).
Full-text search: the typical answer is GIN
The canonical full-text setup:
CREATE INDEX articles_search_gin
ON articles USING GIN (search_vector);
GIN scans the inverted index, finds matching articles, returns them. Read latency is typically a few milliseconds even on a multi-million-row table.
The write cost: every insert and update of an indexed tsvector column rebuilds part of the posting list. For an article-publishing system that writes a few thousand articles per day, this is fine. For a system that updates millions of rows per day, GIN's write cost becomes meaningful.
For full-text on a write-heavy table (logs, events), GiST is the alternative:
CREATE INDEX events_search_gist
ON events USING GIST (search_vector);
Reads are slower (GiST has to traverse the tree), but writes do not pay for posting list maintenance.
JSONB: GIN is almost always right
CREATE INDEX events_payload_gin
ON events USING GIN (payload);
GIN supports the full set of JSONB operators (@>, ?, ?|, ?&). GiST does not have the same coverage.
For JSONB with @> containment as the only query pattern, the smaller jsonb_path_ops operator class is much cheaper:
CREATE INDEX events_payload_path_ops
ON events USING GIN (payload jsonb_path_ops);
This variant indexes only containment-style lookups, is significantly smaller and faster, and is the right default for many JSONB use cases.
Range queries: GiST shines
For range types and range-overlap queries, GiST is the right answer:
CREATE INDEX bookings_during_gist
ON bookings USING GIST (during);
-- Query: any bookings overlapping a specific window?
SELECT * FROM bookings
WHERE during && tstzrange(now(), now() + interval '1 hour');
GiST natively understands range overlap. GIN does not.
The most useful pattern combining GiST with constraints:
CREATE EXTENSION btree_gist;
CREATE TABLE bookings (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
resource_id BIGINT NOT NULL,
during TSTZRANGE NOT NULL,
EXCLUDE USING gist (resource_id WITH =, during WITH &&)
);
No two bookings of the same resource can have overlapping ranges. The exclusion constraint is GiST-only.
Geometric data: GiST or SP-GiST
For PostGIS spatial data, GiST is the standard. SP-GiST ("space-partitioned GiST") is a related variant that handles non-balanced trees better — useful for some geometric distributions but a more advanced topic.
For most geographic queries, the canonical pattern is:
CREATE INDEX places_location_gist
ON places USING GIST (location);
Where location is a geometry or geography column.
The cost story
A few numbers I have measured on real systems:
- A GIN index on a
tsvectorover 10 million articles: roughly 1.5 GB. Read latency for a typical full-text search: 5-15 ms. - A GiST index on the same data: roughly 800 MB. Read latency: 20-50 ms. Write throughput: 2-3x higher than GIN.
- A GIN index on a JSONB column with 50 keys per row: roughly 30% of the table size. Read latency for
@>queries: 1-10 ms. - A
jsonb_path_opsGIN on the same: roughly 10% of the table size. Read latency similar.
Your numbers will vary; these give a sense of magnitude.
How I decide in practice
- What operator does the query use? GIN supports more JSONB operators; GiST is needed for range overlaps.
- What is the read/write ratio? Write-heavy → GiST is a contender.
- Do you need exclusion constraints? GiST is required.
- Is the data geometric? GiST or SP-GiST.
- Is there a smaller GIN variant (
jsonb_path_ops)? Often a great middle ground.
For most JSONB and full-text use cases, the answer is some flavor of GIN. For ranges, geometry, and exclusion, GiST. The default decision usually comes from query shape, not from raw performance numbers.