pgvector Production Setup Checklist | MonPG Docs

Before you install pgvector

pgvector is useful when vectors are part of the application data model and PostgreSQL already owns the document lifecycle. It is not a shortcut around retrieval design. Decide how you will represent tenants, permissions, document state, embedding model, and chunk version before you create the first index.

Install the extension

CREATE EXTENSION IF NOT EXISTS vector;

The extension name is vector, even though most teams call the project pgvector. Managed-provider support varies by PostgreSQL version and provider policy, so verify the extension list for the target cluster before rollout.

Use a serving schema, not just an embedding column

The vector row should carry the metadata needed to decide whether a result is allowed and fresh.

CREATE TABLE document_chunks (
  id bigserial PRIMARY KEY,
  tenant_id bigint NOT NULL,
  document_id bigint NOT NULL,
  acl_group_id bigint NOT NULL,
  language text NOT NULL,
  is_deleted boolean NOT NULL DEFAULT false,
  embedding_model text NOT NULL,
  embedding_version integer NOT NULL DEFAULT 1,
  body text NOT NULL,
  embedding vector(1536) NOT NULL,
  embedded_at timestamptz NOT NULL DEFAULT now()
);

If tenant, ACL, language, or delete state affects retrieval correctness, keep it close to the vector row. Hiding those rules in application code makes recall problems harder to reproduce.

Create the first HNSW index

CREATE INDEX CONCURRENTLY document_chunks_embedding_hnsw
ON document_chunks
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

Start with conservative settings and measure. Higher graph connectivity and construction effort can improve recall, but they also increase build time, memory, and index size.

Keep an exact-search baseline

Approximate search needs a reference. Use exact search on sampled tenants and queries to measure whether the production index returns useful rows.

SELECT id
FROM document_chunks
WHERE tenant_id = $1
  AND acl_group_id = $2
  AND is_deleted = false
ORDER BY embedding <=> $3
LIMIT 20;

Track recall@k, result count, and p99 latency together. A faster query that misses the right chunk is not a win.

Maintenance checklist

Watch index size and table size after bulk imports.
Watch autovacuum lag after deletes and re-embedding jobs.
Record HNSW settings with every benchmark result.
Run representative tests for small and large tenants.
Keep rollback SQL for new indexes and embedding versions.