Introduction
PostgreSQL’s performance is heavily dependent on proper indexing. A single missing index can turn a millisecond query into a multi-second table scan, while unused indexes waste disk space, slow down writes, and bloat your buffer cache. For self-hosted PostgreSQL deployments, having the right tools to analyze, test, and optimize your index strategy is essential.
In this guide, we compare three powerful open-source tools for PostgreSQL index optimization: HypoPG for hypothetical index testing, pg_qualstats for predicate-based index recommendations, and pg_stat_user_indexes (built into PostgreSQL) for tracking actual index usage. Each tool addresses a different phase of the index lifecycle — from discovery to validation to cleanup.
Comparison Table
| Feature | HypoPG | pg_qualstats | pg_stat_user_indexes |
|---|---|---|---|
| Purpose | Test hypothetical indexes | Find missing indexes | Monitor existing indexes |
| Type | PostgreSQL extension | PostgreSQL extension | Built-in system view |
| Stars | 1,654+ | 331+ | N/A (core PostgreSQL) |
| Last Updated | May 2026 | May 2026 | Active (PostgreSQL 17) |
| Performance Impact | Near-zero (virtual) | ~2-5% overhead | None (built-in stats) |
| Installation | CREATE EXTENSION | CREATE EXTENSION | Always available |
| Best For | Validating index candidates | Discovering WHERE clause gaps | Finding unused indexes |
| Docker Support | Yes (official PGXN) | Yes (pgxn tools) | Built into PG images |
| License | PostgreSQL | PostgreSQL | PostgreSQL |
HypoPG: Testing Indexes Without Creating Them
HypoPG is a PostgreSQL extension that lets you create hypothetical indexes: indexes that exist only in the memory of your current session and are visible only to the query planner. The database never writes them to disk, yet you can run EXPLAIN to see whether the planner would use them. This is invaluable for testing index candidates on production-sized schemas without the risk of a long CREATE INDEX operation.
Installation
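The package installation step depends on your platform and PostgreSQL version. Once the package is in place, a check like the following can confirm that the server sees the extension files. This is a minimal sketch that relies only on the standard pg_available_extensions catalog view.

```sql
-- Shows the HypoPG version the server could install, if any.
-- An empty result means the extension files are not present on this server.
SELECT name, default_version, installed_version
FROM pg_available_extensions
WHERE name = 'hypopg';
```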
Docker Compose Setup
After connecting, enable the extension:
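A minimal sketch of that step, assuming the HypoPG files are already installed on the server and that you connect as a role allowed to create extensions:

```sql
-- Enable HypoPG in the current database.
CREATE EXTENSION IF NOT EXISTS hypopg;

-- Confirm the extension is registered and see which version is active.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'hypopg';
```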
Usage Example
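A typical session might look like the sketch below. The orders table and its customer_id column are hypothetical names chosen for illustration; hypopg_create_index and hypopg_reset are HypoPG functions.

```sql
-- Baseline: the plan without any additional index.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Register a hypothetical index. Nothing is written to disk;
-- the call returns the OID and generated name of the virtual index.
SELECT * FROM hypopg_create_index('CREATE INDEX ON orders (customer_id)');

-- Plain EXPLAIN (no ANALYZE): if the planner would choose the index,
-- the plan references the hypothetical index name returned above.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Remove all hypothetical indexes from this session.
SELECT hypopg_reset();
```

Comparing the estimated cost of the two plans indicates how much the planner expects the index to help.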
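HypoPG works with plain EXPLAIN only. EXPLAIN (ANALYZE, BUFFERS) actually executes the query, and a hypothetical index cannot be executed, so the planner ignores it in that mode. Compare the estimated cost of the plan with and without the hypothetical index instead. Since hypothetical indexes never touch disk, you can safely test dozens of candidates in seconds.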
pg_qualstats: Finding What the WHERE Clause Misses
pg_qualstats takes a different approach: instead of testing what you think you need, it observes your actual queries and reports which WHERE clause predicates could benefit from indexes. It samples query predicates and their selectivity (how many rows each filter eliminates), then recommends missing indexes.
Installation
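Besides installing the package, pg_qualstats normally has to be preloaded at server start. The sketch below assumes no other libraries are preloaded yet; adjust the value if that assumption does not hold on your server.

```sql
-- Caution: this replaces the whole setting. If other libraries are already
-- preloaded (for example pg_stat_statements), include them in the value.
ALTER SYSTEM SET shared_preload_libraries = 'pg_qualstats';

-- The change takes effect only after a full server restart.
-- After restarting, verify the setting:
SHOW shared_preload_libraries;
```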
Enable in the target database:
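A minimal sketch of that step, assuming the library has been preloaded and the server restarted:

```sql
-- Enable pg_qualstats in the database whose workload you want to observe.
CREATE EXTENSION IF NOT EXISTS pg_qualstats;

-- Confirm the extension is registered.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'pg_qualstats';
```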
Key Queries
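One way to surface the most wasteful predicates is sketched here. It reads the pg_qualstats_pretty view that ships with the extension; filter_ratio is a derived column computed in this query (an assumption about how such a ratio would be built), not a column the extension provides.

```sql
-- Predicates ranked by how many tuples they discarded.
-- Note the extension's own spelling of the "occurences" column.
SELECT left_schema,
       left_table,
       left_column,
       operator,
       occurences,
       execution_count,
       nbfiltered,
       -- Share of evaluated tuples that the predicate threw away.
       round(nbfiltered::numeric / NULLIF(execution_count, 0), 2) AS filter_ratio
FROM pg_qualstats_pretty
ORDER BY nbfiltered DESC
LIMIT 20;
```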
The filter ratio (tuples discarded by a predicate divided by tuples it evaluated) is the critical number: a value of 0.95 means 95% of scanned rows are discarded by the filter. This is a strong signal that an index would help.
pg_qualstats is designed to run continuously on production databases with minimal overhead (~2-5% CPU). Unlike HypoPG’s manual hypothesis approach, pg_qualstats builds evidence from real workloads.
pg_stat_user_indexes: Cleaning Up Unused Indexes
PostgreSQL’s built-in pg_stat_user_indexes view tracks how often each index is used. It’s the most direct way to identify indexes that are wasting space and slowing down writes.
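A query along these lines lists candidate indexes. It is a sketch that uses only core catalog views and skips primary key and unique indexes, since those enforce integrity regardless of scan counts.

```sql
-- Indexes never scanned since the last statistics reset, largest first.
SELECT s.schemaname,
       s.relname       AS table_name,
       s.indexrelname  AS index_name,
       s.idx_scan,
       pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size
FROM pg_stat_user_indexes AS s
JOIN pg_index AS i ON i.indexrelid = s.indexrelid
WHERE s.idx_scan = 0
  AND NOT i.indisprimary   -- keep primary keys
  AND NOT i.indisunique    -- keep unique indexes
ORDER BY pg_relation_size(s.indexrelid) DESC;
```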
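Important: idx_scan counts scans only since the last statistics reset. The counters survive a clean restart, but they are cleared by pg_stat_reset() and lost after a crash or an immediate shutdown. Use pg_stat_reset() to start fresh tracking, or combine with pg_stat_statements for longer-term analysis.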
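Before trusting a zero in idx_scan, it helps to know how long the counters have been accumulating. A minimal check using the core pg_stat_database view:

```sql
-- When were this database's statistics last reset?
-- A NULL stats_reset means no reset has been recorded.
SELECT datname, stats_reset
FROM pg_stat_database
WHERE datname = current_database();
```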
Automated Index Cleanup Script
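A cautious way to automate cleanup is to generate the DROP statements for review instead of executing them directly. The sketch below makes that design choice as an assumption; it also excludes any index that backs a constraint.

```sql
-- Produces one DROP statement per unused, non-constraint index.
-- Review the output, then run the statements you agree with.
SELECT format('DROP INDEX CONCURRENTLY IF EXISTS %I.%I;',
              s.schemaname, s.indexrelname)             AS drop_statement,
       pg_size_pretty(pg_relation_size(s.indexrelid))   AS index_size
FROM pg_stat_user_indexes AS s
JOIN pg_index AS i ON i.indexrelid = s.indexrelid
WHERE s.idx_scan = 0
  AND NOT i.indisprimary
  AND NOT i.indisunique
  -- Skip indexes that implement a constraint (primary key, unique, exclusion).
  AND NOT EXISTS (
        SELECT 1
        FROM pg_constraint AS c
        WHERE c.conindid = s.indexrelid
      )
ORDER BY pg_relation_size(s.indexrelid) DESC;
```

DROP INDEX CONCURRENTLY cannot run inside a transaction block, so execute each generated statement on its own.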
Why Self-Host Your Index Optimization?
Self-hosting your index optimization workflow gives you complete control over your performance data. Unlike cloud-managed PostgreSQL services that may limit access to system catalogs or charge extra for query analytics, a self-hosted setup with HypoPG, pg_qualstats, and pg_stat_user_indexes gives you unfiltered access to every statistic. You can run these tools 24/7 without per-query pricing, store unlimited historical data, and customize the analysis pipeline to match your exact schema patterns.
Data sovereignty is another advantage: all query statistics and predicate analysis data stays on your infrastructure. For organizations handling sensitive user data or operating in regulated industries (healthcare, finance, government), keeping query telemetry in-house eliminates compliance concerns that arise with third-party monitoring services.
The open-source nature also means you’re not locked into a single vendor’s tooling. If your needs grow beyond what HypoPG and pg_qualstats offer, you can integrate with pg_stat_statements, pgBadger, or the broader Postgres ecosystem — all running on your own hardware. For a comprehensive PostgreSQL backup strategy, check our PostgreSQL backup tools guide. For monitoring your database health, see our PostgreSQL monitoring comparison. If you’re managing connection pools, our PgBouncer vs ProxySQL vs Odyssey comparison provides essential guidance for scaling your connection layer.
Finally, running these tools locally means instant feedback. When a developer asks “would an index on (customer_id, created_at) help this query?”, you can test it with HypoPG in seconds — no ticket needed, no cloud console login, no waiting for managed service dashboards to refresh. This tight feedback loop is invaluable for development teams that ship frequently.
Workflow: Putting All Three Tools Together
Here’s a practical workflow for index optimization:
- Monitor: Run pg_qualstats for a week on your production workload to identify predicate patterns that lack index coverage.
- Hypothesize: For each missing-index candidate from qualstats, use HypoPG to test whether the PostgreSQL planner would actually use it.
- Create: Deploy the validated index candidates with CREATE INDEX CONCURRENTLY (non-blocking).
- Verify: After a few days, check pg_stat_user_indexes to confirm the new indexes are being used.
- Clean: Periodically review pg_stat_user_indexes and drop indexes with idx_scan = 0 (excluding primary keys and unique constraints).
This loop — discover, test, deploy, verify, clean — forms a complete index lifecycle management strategy.
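The create and verify steps can be sketched as follows. The index name idx_orders_customer_created and the orders table are hypothetical; substitute a candidate that your own analysis has validated.

```sql
-- Create: builds the index without blocking writes.
-- Must be run outside a transaction block.
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_orders_customer_created
    ON orders (customer_id, created_at);

-- A failed concurrent build leaves an invalid index behind,
-- so confirm the index is valid before relying on it.
SELECT indexrelid::regclass AS index_name, indisvalid
FROM pg_index
WHERE indexrelid = 'idx_orders_customer_created'::regclass;

-- Verify: after a few days of real traffic, idx_scan should be above zero.
SELECT indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE indexrelname = 'idx_orders_customer_created';
```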
FAQ
How much performance overhead do these tools add?
HypoPG has zero overhead on actual queries: hypothetical indexes exist only in memory and don’t affect real query execution. pg_qualstats adds approximately 2-5% CPU overhead because it inspects the WHERE clause predicates of the queries it samples. pg_stat_user_indexes has no measurable overhead since PostgreSQL already tracks index usage statistics by default.
Can I use HypoPG and pg_qualstats together on the same database?
Yes — they complement each other perfectly. Install both extensions on your staging or development database. Use pg_qualstats to discover WHERE clause gaps, then use HypoPG to validate whether proposed indexes would actually be used by the planner. On production, you may want to only run pg_qualstats (for its low overhead) while using HypoPG on a replica or staging copy.
How do I know if dropping an unused index is safe?
Always check two things before dropping: (1) Is the index backing a constraint? Primary key and unique constraint indexes are never truly “unused”, because they enforce data integrity. Filter them out by checking pg_constraint for contype IN ('p', 'u'). (2) Have the statistics been reset recently? After pg_stat_reset(), or after a crash or immediate shutdown, the counters may not reflect actual usage. Wait at least 1-2 weeks after a reset before trusting zero-scan statistics.
What’s the difference between pg_stat_user_indexes and pg_statio_user_indexes?
pg_stat_user_indexes tracks logical index usage (how many times the index was scanned for queries). pg_statio_user_indexes tracks block-level I/O (idx_blks_read counts blocks read from outside shared buffers, idx_blks_hit counts blocks found in the buffer cache). Use both together: idx_scan = 0 with a nonzero idx_blks_read most likely means the index pages are being read for maintenance by writes rather than for queries, which is exactly the cost of an unused index. Conversely, high idx_scan with high idx_blks_read suggests the index is heavily used but often not cached, and may benefit from more buffer memory or a faster storage tier.
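The two views share the indexrelid column, so they can be read side by side. A minimal sketch using only core views:

```sql
-- Logical usage (idx_scan) next to block-level I/O for each index.
SELECT s.schemaname,
       s.relname       AS table_name,
       s.indexrelname  AS index_name,
       s.idx_scan,
       io.idx_blks_read,   -- blocks read from outside shared buffers
       io.idx_blks_hit     -- blocks found in shared buffers
FROM pg_stat_user_indexes AS s
JOIN pg_statio_user_indexes AS io USING (indexrelid)
ORDER BY io.idx_blks_read DESC
LIMIT 20;
```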
Should I run these tools on production or staging?
pg_qualstats is designed for production with minimal overhead, so it’s safe to run there. HypoPG is best on staging or a production replica — while it has zero overhead on real queries, you want to test hypothetical indexes on a dataset that matches production scale for accurate planner estimates. pg_stat_user_indexes is always available on every PostgreSQL instance.