Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Runbook: High Database Connections

Overview

This runbook describes how to handle situations where database connection pool is exhausted or nearing limits.

Symptoms

  • Application errors: “too many connections”
  • Slow API responses
  • Connection pool metrics showing high usage
  • Database showing max_connections limit reached

Impact

  • API requests fail
  • Background jobs unable to process
  • Users experience errors and timeouts

Detection

  • Alert: DatabaseConnectionsHigh
  • Threshold: > 80% of max_connections
  • Dashboard: Database Health

Response

Immediate Actions

  1. Check current connections

    SELECT count(*) FROM pg_stat_activity;
    SELECT usename, application_name, count(*)
    FROM pg_stat_activity
    GROUP BY usename, application_name
    ORDER BY count DESC;
    
  2. Identify idle connections

    SELECT pid, usename, application_name, state, state_change
    FROM pg_stat_activity
    WHERE state = 'idle'
    AND state_change < NOW() - INTERVAL '10 minutes';
    
  3. Kill long-idle connections (if safe)

    SELECT pg_terminate_backend(pid)
    FROM pg_stat_activity
    WHERE state = 'idle'
    AND state_change < NOW() - INTERVAL '30 minutes';
    

Root Cause Analysis

  1. Check for connection leaks

    • Review recent deployments
    • Check for missing defer db.Close()
    • Look for transactions not being committed/rolled back
  2. Review pool configuration

    • Current settings in environment
    • Calculate optimal pool size
    • Check for misconfigured services
  3. Analyze traffic patterns

    • Sudden spike in requests
    • New feature causing more queries
    • Background job issues

Long-term Fixes

  1. Optimize connection pool settings

    db.SetMaxOpenConns(25)
    db.SetMaxIdleConns(10)
    db.SetConnMaxLifetime(5 * time.Minute)
    
  2. Implement connection pooler

    • Consider PgBouncer for connection multiplexing
    • Configure pool modes appropriately
  3. Code improvements

    • Use prepared statements
    • Batch queries where possible
    • Implement query result caching

Prevention

  • Monitor connection pool metrics
  • Load test with realistic concurrency
  • Regular code reviews for database usage
  • Implement circuit breakers