Azure Flex is the modern Postgres offering on Azure (the older Single Server is being phased out). The defaults are reasonable for a generic workload, which means they are slightly wrong for almost every specific workload. The Server parameters blade is where most of the value lives.
Here is the list I run through on every new Azure Flex instance.
SPL: shared_preload_libraries
One of the few parameters here that requires a server restart. Get it right at instance creation time if possible.
For most workloads I add:
shared_preload_libraries = pg_stat_statements,auto_explain,pg_buffercache
For specific use cases, also add:
- pg_cron for scheduled jobs.
- pgaudit for compliance.
- pg_wait_sampling for wait-event analysis (only if available; varies by Azure region).
Azure has a quirk: even after adding to SPL, you need to add the extension name to azure.extensions to allow CREATE EXTENSION. Two parameters, both required. Easy to miss the second one.
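A quick way to confirm both halves are in place after the restart (using pg_stat_statements from the list above as the example):

```sql
-- confirm the library was actually preloaded (requires the restart)
SHOW shared_preload_libraries;

-- confirm the extension is on the Azure allowlist, then create it
SHOW azure.extensions;
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
```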
azure.extensions
The list of extensions allowed for CREATE EXTENSION. Default includes the common ones. To enable additional:
azure.extensions = pg_stat_statements,auto_explain,pg_buffercache,hypopg,pgstattuple,pg_cron,pgaudit
This is unique to Azure Flex; RDS and Cloud SQL handle it differently. If CREATE EXTENSION foo fails on Flex, this is usually why.
Memory parameters
work_mem — default is dynamically sized but typically too low for analytical queries.
work_mem = 16MB # for OLTP
work_mem = 64MB # for mixed workloads
For heavy individual queries, override at session level rather than globally. SET LOCAL work_mem = '256MB' for that query only.
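SET LOCAL only takes effect inside an explicit transaction and reverts automatically at commit or rollback; a minimal sketch (the query itself is a placeholder):

```sql
BEGIN;
SET LOCAL work_mem = '256MB';      -- reverts at COMMIT/ROLLBACK
SELECT count(*) FROM big_table;    -- placeholder for the heavy query
COMMIT;
```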
shared_buffers — Azure sizes this based on instance memory automatically. Usually fine. If you need to override, target 25-35% of memory.
effective_cache_size — also auto-sized. Confirm it is set to roughly 75% of memory; if not, set explicitly.
maintenance_work_mem — used by VACUUM, REINDEX, CREATE INDEX. Default is conservative. Bump to 512MB or 1GB for faster maintenance:
maintenance_work_mem = 512MB
VACUUM finishes faster, REINDEX is faster, CREATE INDEX uses less I/O.
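For a one-off index build, you can raise it per session instead of globally (index and column names here are illustrative):

```sql
SET maintenance_work_mem = '1GB';
CREATE INDEX CONCURRENTLY idx_events_created_at ON events (created_at);
RESET maintenance_work_mem;
```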
Connections
max_connections — Azure scales with instance size. For a Standard_D4s_v3 (16GB), the default is around 1700.
This is too high for almost any application. Each connection costs memory and lock-table space. The right shape:
- Run PgBouncer in front of Flex (Azure provides a built-in PgBouncer-based pooler; enable it from the Connection pooling blade).
- Set max_connections to a more realistic number (200-400 for most workloads).
On Azure Flex, the built-in pooler is convenient. It handles transaction-mode pooling automatically. Configure pool sizes via the Connection pooling blade.
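To pick a realistic max_connections, look at what the application actually holds open:

```sql
-- how many client connections exist right now, and what they are doing
SELECT state, count(*)
FROM pg_stat_activity
WHERE backend_type = 'client backend'
GROUP BY state
ORDER BY count(*) DESC;
```

A large pile of idle connections is the usual sign that pooling belongs in front of the database rather than in a bigger max_connections.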
Vacuum
More aggressive than defaults:
autovacuum_vacuum_scale_factor = 0.05
autovacuum_analyze_scale_factor = 0.02
autovacuum_naptime = 15s
autovacuum_max_workers = 6
autovacuum_vacuum_cost_limit = 2000
This runs autovacuum more often with more parallelism. Cost: slightly more I/O during vacuum. Benefit: less bloat, fewer planner surprises, lower freeze-related incident risk.
For specific tables that need different treatment (very large or very high-churn), use table-level overrides:
ALTER TABLE events SET (
autovacuum_vacuum_scale_factor = 0.02,
autovacuum_analyze_scale_factor = 0.01
);
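To find the tables that deserve overrides, dead-tuple counts are a reasonable proxy:

```sql
SELECT relname,
       n_dead_tup,
       n_live_tup,
       last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```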
Checkpoints and WAL
max_wal_size = 8GB
checkpoint_timeout = 15min
bgwriter_lru_maxpages = 500
Reduces checkpoint frequency. Smooths I/O. Eliminates most checkpoint-induced latency spikes.
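To verify that checkpoints are mostly timed rather than forced by WAL pressure (these counters live in pg_stat_bgwriter through PG16; PG17 moved them to pg_stat_checkpointer):

```sql
-- checkpoints_req should be a small fraction of checkpoints_timed;
-- if not, max_wal_size is still too small
SELECT checkpoints_timed, checkpoints_req
FROM pg_stat_bgwriter;
```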
Logging
log_min_duration_statement = 1000 # 1 second
log_lock_waits = on
log_temp_files = 1024 # in kB; logs temp files over 1MB
log_autovacuum_min_duration = 1000 # 1 second
log_checkpoints = on
These give you the diagnostic foundation. Azure Flex sends Postgres logs to Log Analytics; useful settings turn into useful Log Analytics queries.
For log_min_duration_statement, 1 second is a starting point. Adjust based on workload — for OLTP-heavy, drop to 500ms; for analytics, raise to 5000ms.
Statement timeouts
Default is 0 (no timeout). Set:
statement_timeout = 60s
idle_in_transaction_session_timeout = 5min
lock_timeout = 30s
These protect against runaway queries and stuck transactions. Adjust per workload — analytical queries may need longer statement_timeout, set per-session in those cases.
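Per-session or per-role overrides keep the global defaults tight (the reporting role name is illustrative):

```sql
-- loosen the timeout for one analytics session
SET statement_timeout = '15min';

-- or bake it into a dedicated role
ALTER ROLE reporting SET statement_timeout = '15min';
```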
What is unique about Azure Flex
A few Flex-specific quirks:
Built-in connection pooling. Available since 2023. Enable in the Connection pooling blade. Transaction-mode by default. For most applications, this removes the need for separate PgBouncer.
Automatic minor-version upgrades. Flex applies minor patches in a maintenance window you configure. Set this to a low-traffic time. Major-version upgrades are still manual.
Storage tiering. Flex offers several storage tiers (P10/P20/P30/P40 by IOPS). Cheaper tiers cap IOPS lower. For write-heavy workloads, calculate IOPS need and pick accordingly — too low a tier saturates.
Read replicas. Async only on Flex, in the same or different region. The lag is typically sub-second on same-region. For HA, use the high-availability mode with a sync standby; for read scaling, use a separate read replica.
Backup retention. Default 7 days, configurable up to 35 days. For longer retention, you have to do logical backups yourself (pg_dump to blob storage).
Verification
After applying changes via Server parameters:
SELECT name, setting, source
FROM pg_settings
WHERE name IN ('work_mem', 'max_wal_size', 'autovacuum_vacuum_scale_factor');
The source column shows where the value came from (configuration file, default, session). If your Server parameters change is not showing up, check whether it requires a restart — those are flagged in the Azure portal.
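pg_settings also flags parameters that are waiting on a restart:

```sql
SELECT name, setting
FROM pg_settings
WHERE pending_restart;
```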
A typical tuning session
For a new Azure Flex deployment, my workflow:
- Day 1: SPL + extensions configured, instance restarted once. Apply the dynamic parameter changes. Defer the work_mem/connections/vacuum tuning until I see actual workload patterns.
- Week 1: monitor pg_stat_statements and the Azure Insights blade. Identify hot queries and resource utilization.
- Week 2: tune work_mem, autovacuum settings, and connection limits based on observed workload.
- Ongoing: revisit quarterly or after significant workload changes.
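For the week-1 review, a pg_stat_statements query to start from (column names are the PG13+ ones; older versions use mean_time/total_time):

```sql
SELECT query,
       calls,
       round(mean_exec_time::numeric, 1)  AS mean_ms,
       round(total_exec_time::numeric, 1) AS total_ms
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```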
This incremental approach beats trying to set everything perfectly upfront. The defaults are not awful; they are just not optimal for any specific workload.