Runbook: High Database Connections
Overview
This runbook describes how to handle situations where database connection pool is exhausted or nearing limits.
Symptoms
- Application errors: “too many connections”
- Slow API responses
- Connection pool metrics showing high usage
- Database showing max_connections limit reached
Impact
- API requests fail
- Background jobs unable to process
- Users experience errors and timeouts
Detection
- Alert:
DatabaseConnectionsHigh - Threshold: > 80% of max_connections
- Dashboard: Database Health
Response
Immediate Actions
-
Check current connections
SELECT count(*) FROM pg_stat_activity; SELECT usename, application_name, count(*) FROM pg_stat_activity GROUP BY usename, application_name ORDER BY count DESC; -
Identify idle connections
SELECT pid, usename, application_name, state, state_change FROM pg_stat_activity WHERE state = 'idle' AND state_change < NOW() - INTERVAL '10 minutes'; -
Kill long-idle connections (if safe)
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE state = 'idle' AND state_change < NOW() - INTERVAL '30 minutes';
Root Cause Analysis
-
Check for connection leaks
- Review recent deployments
- Check for missing
defer db.Close() - Look for transactions not being committed/rolled back
-
Review pool configuration
- Current settings in environment
- Calculate optimal pool size
- Check for misconfigured services
-
Analyze traffic patterns
- Sudden spike in requests
- New feature causing more queries
- Background job issues
Long-term Fixes
-
Optimize connection pool settings
db.SetMaxOpenConns(25) db.SetMaxIdleConns(10) db.SetConnMaxLifetime(5 * time.Minute) -
Implement connection pooler
- Consider PgBouncer for connection multiplexing
- Configure pool modes appropriately
-
Code improvements
- Use prepared statements
- Batch queries where possible
- Implement query result caching
Prevention
- Monitor connection pool metrics
- Load test with realistic concurrency
- Regular code reviews for database usage
- Implement circuit breakers