Indexes10 min read

PostgreSQL Collation and the glibc Upgrade That Silently Corrupts Your Indexes

An OS upgrade changed glibc's collation, your text indexes were now sorted by rules that no longer match, and PostgreSQL had no idea. Queries return wrong results and unique constraints let duplicates in.

This is the scariest PostgreSQL incident I know, because nothing crashes, nothing logs an error, and the database keeps serving traffic while quietly returning wrong answers. We found it by accident: a unique index on email addresses had two rows with the same email.

How does a unique index allow a duplicate? It doesn't, on the day it was built. But the OS had been upgraded, glibc had changed how it sorts text, and the index was now ordered by rules that no longer matched the rules PostgreSQL used to look things up. The index was internally inconsistent and PostgreSQL trusted it completely.

If you run PostgreSQL on a libc-provided collation, which is the default almost everywhere, you need to understand this failure mode before it finds you.

Why text sort order is not the database's job

PostgreSQL does not define how text sorts. For the default and most named collations it delegates to the operating system's C library, glibc on Linux. The collation decides whether 'Z' comes before 'a', how accented characters order, how punctuation is handled, all of it.

A B-tree index on a text column stores entries in collation order. That ordering is baked into the physical structure when the index is built. Lookups, range scans, and uniqueness checks all assume the on-disk order matches the collation's current rules.

What an OS upgrade changes

glibc has changed its collation rules between versions, most famously around glibc 2.28, where the sort order of many locales changed substantially. When you upgrade the OS, or migrate to a new base image, or restore onto a different distro, the library underneath PostgreSQL can start sorting text differently.

Nothing in PostgreSQL is rebuilt by this. Your existing indexes are still physically ordered by the old rules, while every new comparison uses the new rules. The index and the collation now disagree, and that is corruption, even though every file is intact.

How the corruption shows up

The symptoms are subtle and data-dependent. A query with a range predicate on text misses rows that are physically in the index but on the "wrong" side of the new ordering. A unique index fails to find an existing key during an insert and lets a duplicate through. An ORDER BY returns results that are not actually ordered.

Because it only affects comparisons that fall in the region where old and new rules differ, you can have a corrupted index that returns correct results for 99% of queries, which is exactly why it goes undetected for months.

Detecting it with amcheck

The amcheck extension can verify that a B-tree index is internally consistent with the current collation. Running it after any OS or glibc change is the single most valuable habit here. If an index is ordered by stale rules, amcheck will report it.

Run it across your text indexes, especially unique ones on user-facing identity columns like email, username, and slug, where a silent duplicate is a real correctness bug.

CREATE EXTENSION IF NOT EXISTS amcheck;

-- Verify a specific index against current collation rules
SELECT bt_index_check('users_email_key', true);

-- Sweep all btree indexes on text-bearing tables
SELECT c.relname,
       bt_index_check(c.oid, true)
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE c.relam = (SELECT oid FROM pg_am WHERE amname = 'btree');

The fix is REINDEX, and you must do it deliberately

Once the collation has changed, every affected text index must be rebuilt so its physical order matches the new rules. REINDEX does this. Use REINDEX ... CONCURRENTLY to avoid locking out writes on a live system.

The critical sequencing point: do the reindex after the glibc change and before you trust uniqueness again. If you upgraded the OS days ago and only now noticed, you may already have duplicate data that slipped past a broken unique index, so check for and resolve duplicates before recreating the constraint.

-- Rebuild without blocking writes
REINDEX INDEX CONCURRENTLY users_email_key;

-- Or rebuild everything text-related on a table
REINDEX TABLE CONCURRENTLY users;

-- Before re-trusting a unique index, look for damage already done
SELECT email, count(*)
FROM users
GROUP BY email
HAVING count(*) > 1;

Prevent the next one with ICU collations

The durable fix is to stop depending on the host's glibc for ordering. PostgreSQL supports ICU collations, where the collation rules are versioned and travel with a known ICU library rather than whatever the OS ships. PostgreSQL records the collation version and will warn you when it detects a mismatch, which is exactly the early signal glibc never gives you.

On PostgreSQL 15 and later you can create an ICU-based database or column collation. It does not make collations immutable forever, but it makes version changes visible and controlled instead of silent.

-- An ICU collation records a version PostgreSQL can check
CREATE COLLATION app_default (provider = icu, locale = 'en-US');

ALTER TABLE users
  ALTER COLUMN email TYPE text COLLATE app_default;

-- PostgreSQL tracks collation versions; mismatches become visible
SELECT collname, collversion
FROM pg_collation
WHERE collprovider = 'i';

The operational rule that prevents the 2 AM version

Treat any change to the C library as a database event, not just an OS event. That includes major-version OS upgrades, switching base images, and restoring a dump onto a different distribution. Each one demands the same response.

  1. Before the change, note the glibc/ICU version and the at-risk text indexes.
  2. After the change, run amcheck across text and unique indexes.
  3. REINDEX CONCURRENTLY anything flagged, and re-verify.
  4. Check for duplicate data that a broken unique index may have admitted.
  5. Plan a migration to versioned ICU collations so the next change warns you.

The practical standard

The best PostgreSQL performance work is boring in the right way. Name the failure mode, capture the before plan or metric, make one change, and compare the exact same signal afterward. Anything else is just a more confident guess.