PostgreSQL Topic Archive

pgvector and RAG PostgreSQL Articles

HNSW, IVFFlat, recall, embedding search, and production RAG performance notes.

Filtered Vector Search: The Real Vector Database Problem

May 4, 2026 · 6 min read

Vector search looks easy until tenant filters, permissions, freshness, and deleted content arrive. The hard part is not nearest neighbors; it is filtered recall under production rules.

Hybrid Search in Production: BM25 Plus Vector Is Not a Magic Button

May 4, 2026 · 6 min read

Hybrid search helps when exact terms and semantic meaning both matter. It fails when teams blend rankings without query intent, calibration, or evaluation.

Multi-Tenant Vector Search: Namespaces, Metadata Filters, or Partitions?

May 4, 2026 · 6 min read

Multi-tenant vector search is a correctness and isolation problem. Namespaces, filters, and partitions each fail differently under tenant skew and ACL rules.

Chunking Is a Database Design Problem, Not a Prompting Trick

May 4, 2026 · 6 min read

Chunk size and overlap decide retrieval quality, citation accuracy, freshness, permissions, and cost. Treat chunks as serving data with ownership.

Pinecone vs Qdrant vs Weaviate vs pgvector: A Production Decision Framework

May 4, 2026 · 7 min read

The best vector database depends on ownership boundaries, filter semantics, recall targets, migration paths, cost, and operational maturity, not benchmark screenshots.

pgvector HNSW Tuning: Why Default Settings Quietly Kill Recall

May 4, 2026 · 6 min read

HNSW defaults can look fast while missing useful results. Production tuning needs recall@k, p99, index build time, memory, and filtered result count together.

RAG Quality Metrics: Stop Measuring Only Latency

May 4, 2026 · 6 min read

A fast RAG system can still be wrong. Production teams need recall@k, MRR, answerability, citation coverage, freshness, no-hit rate, and drift signals.

Vector Deletes, Freshness, and Permissions: The Hidden RAG Incident

May 4, 2026 · 5 min read

A RAG system is unsafe if deleted documents, permission changes, and stale embeddings can still appear in answers. Freshness is part of correctness.

Re-Embedding Without Breaking Production Search

May 4, 2026 · 6 min read

Embedding model upgrades are migrations. Versioned embeddings, dual indexes, shadow queries, backfills, and rollback decide whether search quality survives.

Vector Database Cost Model: Why Your RAG Bill Explodes

May 4, 2026 · 5 min read

Vector cost is not just storage. Dimensions, top_k, filters, rerankers, embedding refreshes, payload size, index rebuilds, and tenant skew all compound.

Rerankers in RAG: The Expensive Fix That Only Works After Retrieval

May 4, 2026 · 5 min read

Rerankers can improve answer quality, but they cannot recover evidence that retrieval never found. Use them after recall, cost, and latency are understood.

Vector Search Observability: The Dashboard I Want Before Scaling RAG

May 4, 2026 · 5 min read

Vector search observability needs recall, result count, filter selectivity, freshness lag, tenant skew, index health, reranker cost, and answer grounding, not just latency.

Vector Databases: How I Decide Between pgvector and a Dedicated Vector Store

April 26, 2026 · 15 min read

The hard vector database decision is not pgvector versus Pinecone on a checklist. It is whether your filters, recall target, update rate, and incident budget still fit inside Postgres.
