pgvector recall tuning

Your pgvector recall is silently 67%. Fix it in 5 minutes.

HNSW index parameters are immutable after creation, and defaults deliver 67% recall on 2026 RAG workloads. MonPG samples your real embeddings, runs 48 configurations on our benchmark infra, and hands you zero-downtime migration SQL.

Review Question

What HNSW parameters actually give you 95% recall on this exact table, and how do you roll that out without dropping queries?

At A Glance

Input
Read-only Postgres connection to your pgvector table
Decision
Keep defaults, tune, or rebuild on a different model
Output
Trade-off matrix, winner config, migration SQL, rollback plan

What you get

  • Recall@10 measured against brute-force ground truth on your real data
  • p50 / p95 / p99 latency per HNSW configuration
  • Trade-off matrix across recall, latency, index size, and build time
  • Zero-downtime CREATE INDEX CONCURRENTLY migration SQL with rollback plan

Guardrails

  • Reads one sample from your DB; all benchmarking runs on our worker.
  • Embeddings are never persisted on MonPG's side; sample is held in memory only.
  • Self-hosted runner available for air-gapped or strict compliance environments.

Signals covered

The page is designed to answer these production questions and search intents without losing the operational context.

  • pgvector hnsw tuning
  • pgvector recall optimization
  • pgvector ef_construction ef_search
  • pgvector production rag tuning
  • pgvector m parameter benchmark

Why This Matters

Your RAG works in the demo. In production it returns the wrong document one query in three.

Most teams ship pgvector with default HNSW parameters. The application looks healthy, queries are fast, and metrics stay green. The only signal something is wrong is that users say the chatbot or search feels off. By the time anyone traces it back to the index, trust has eroded.

Defaults are tuned for 2023, not 2026

pgvector's HNSW defaults (m=16, ef_construction=64, ef_search=40) were set with 384-dim sentence-transformers embeddings in mind. 1536-dim OpenAI embeddings at 500K+ rows need denser graphs and wider query-time exploration.
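You can check what an existing index was built with before assuming it runs on defaults. A sketch (table and index names are placeholders; a missing WITH clause in the definition means default build parameters):

```sql
-- Show build parameters for existing pgvector HNSW indexes.
-- No WITH (...) clause in indexdef means defaults: m = 16, ef_construction = 64.
SELECT indexname, indexdef
FROM pg_indexes
WHERE indexdef LIKE '%USING hnsw%';

-- ef_search is a query-time setting, not stored in the index:
SHOW hnsw.ef_search;  -- pgvector default: 40
```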

HNSW parameters are immutable

Once you run CREATE INDEX, m and ef_construction cannot be changed with ALTER INDEX; only ef_search can be adjusted at query time. Wrong defaults mean a full rebuild, not a tweak.
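Concretely, the build-time parameters go into the CREATE INDEX statement itself, while ef_search is a session setting. A minimal sketch, with table, column, and values as placeholders:

```sql
-- m and ef_construction are baked into the index at build time.
CREATE INDEX documents_embedding_hnsw
  ON documents
  USING hnsw (embedding vector_cosine_ops)
  WITH (m = 32, ef_construction = 128);

-- ef_search is the only knob you can turn afterwards, per session or transaction:
SET hnsw.ef_search = 80;
```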

Recall is not surfaced by any existing PG monitoring tool

pganalyze, Datadog, pgwatch all show latency and throughput. None measure recall against ground truth. The silent-killer bug is invisible.

Sample Benchmark Output

  • m=16, ec=64, es=40 (default): recall 0.67 · p95 8 ms · 520 MB
  • m=16, ec=64, es=80: recall 0.79 · p95 12 ms · 520 MB
  • m=32, ec=128, es=40: recall 0.82 · p95 9 ms · 780 MB
  • m=32, ec=128, es=60: recall 0.91 · p95 14 ms · 780 MB
  • m=32, ec=128, es=80 (winner): recall 0.94 · p95 18 ms · 780 MB
  • m=48, ec=256, es=120: recall 0.97 · p95 32 ms · 1150 MB

500K OpenAI text-embedding-3-small (1536 dim) on a customer-support workload. Winner highlighted. Your numbers will differ; that is the point.

Workflow

How It Works

01

Connect your Postgres

Read-only credentials or agent token. Self-hosted, Supabase, Neon, RDS, Aurora, AlloyDB, Azure — all supported.

02

Pick the table and vector column

Auto-detected from your pgvector inventory. Pick distance operator (cosine, L2, inner product).

03

Sample real embeddings

Your DB sees a single SELECT ... LIMIT N. Sample (1K-100K vectors) is held in memory on our worker only.
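The sampling query MonPG issues is of roughly this shape (the exact statement may differ; table, column, and sample sizes here are placeholders):

```sql
-- Pull a bounded random sample of id + embedding; nothing else leaves the DB.
-- TABLESAMPLE is cheap on large tables but samples whole pages:
SELECT id, embedding
FROM documents
TABLESAMPLE SYSTEM (1)
LIMIT 10000;

-- For smaller tables, a uniform sample is affordable:
-- SELECT id, embedding FROM documents ORDER BY random() LIMIT 10000;
```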

04

Compute ground truth

Brute-force exact k-NN on the sample gives us the reference set for recall@10.
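Exact k-NN in pgvector is just a query that bypasses the approximate index: disable index scans and the planner falls back to a sequential scan with exact distances. Recall@10 is then |approx top-10 ∩ exact top-10| / 10. A sketch (table and column names are placeholders; the query vector is bound as a parameter):

```sql
-- Exact (brute-force) nearest neighbours for one query vector:
-- with index scans off, ORDER BY <=> computes true distances over all rows.
BEGIN;
SET LOCAL enable_indexscan = off;
SELECT id
FROM documents
ORDER BY embedding <=> $1   -- query vector, bound as a parameter
LIMIT 10;
COMMIT;
```

The same statement with index scans enabled gives the approximate top-10; comparing the two ID sets per query vector yields recall@10.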

05

Grid search 48 configurations

m × ef_construction × ef_search in parallel. Build, query, measure recall, latency, size, build time per config.

06

Get the winner + migration SQL

Trade-off matrix, winner config, and zero-downtime CREATE INDEX CONCURRENTLY script ready to copy.
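The migration script follows the standard swap pattern: build the tuned index next to the old one, verify, then drop the old one. A hedged sketch of its shape (index and table names, parameter values, and the memory setting are placeholders; your generated script will differ):

```sql
-- Build the tuned index alongside the old one without blocking writes.
SET maintenance_work_mem = '2GB';  -- HNSW builds are memory-hungry
CREATE INDEX CONCURRENTLY documents_embedding_hnsw_v2
  ON documents
  USING hnsw (embedding vector_cosine_ops)
  WITH (m = 32, ef_construction = 128);

-- After verifying queries use the new index, retire the old one:
DROP INDEX CONCURRENTLY documents_embedding_hnsw_v1;

-- Rollback at any point before the drop: the old index was never touched.
-- DROP INDEX CONCURRENTLY documents_embedding_hnsw_v2;
```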

Integrations

Works With Your pgvector Deployment

Self-hosted Postgres

Any Postgres 12+ with pgvector 0.5+

Supabase

Read-only role across all projects

Neon

Branching-friendly, works with ephemeral DBs

AWS RDS / Aurora

pgvector 0.8+ on PostgreSQL 15/16/17

GCP AlloyDB / Cloud SQL

Manual params, no native tuner

Azure Database for PostgreSQL

Flexible Server tier

Stop shipping RAG with default HNSW parameters.

Measure your recall once this quarter. If it is below 0.85, you have the silent-killer bug and the fix is a zero-downtime index swap. MonPG gives you the measurement and the migration SQL in 10 minutes.

Frequently Asked

Does MonPG read all of my embeddings?
No. MonPG reads a sample (you choose: 1K to 100K vectors) with a single SELECT ... LIMIT N. The sample stays in the benchmark worker memory and is discarded when the job completes. Nothing is persisted or logged.
How often should I re-benchmark?
Once per quarter, after major data growth, or after changing your embedding model. Scale tier supports scheduled re-benchmarks with recall drop alerts.
Can I use this against pgvectorscale / StreamingDiskANN?
Not in v1. We start with HNSW because it covers around 90 percent of current pgvector deployments. pgvectorscale tuning is on the v2 roadmap.
What about IVFFlat indexes?
HNSW-first in v1. IVFFlat support is v2. If you need IVFFlat tuning now, the methodology in our blog posts is portable; message us and we can prioritize.

Related PostgreSQL Hubs

Related Tools

Useful here, still point-in-time.

This page should help with a real decision right now. It should not pretend to replace the historical evidence that production teams need after a deploy, during an incident, or across a longer rollout window.

Helpful On This Page

  • A candidate index shape based on the query pattern you entered
  • A rollout risk band from table size, writes, replicas, and timing
  • Starter checks to adapt before touching production

Still Needs Live History

  • Whether PostgreSQL actually uses the new index after deploy
  • How write latency, lag, and storage changed during the rollout
  • Whether the index kept earning its cost a week later
See Live History in MonPG