Building PostgreSQL Connection Pool Alerts Through /proc Monitoring Instead of Database Queries

· Server Scout

$ ss -tn state established | grep :5432 | wc -l
127
$ cat /proc/sys/kernel/pid_max && pgrep postgres | wc -l
32768
128

Your PostgreSQL connection pool just hit 80% capacity, but you won't know until applications start throwing "connection refused" errors. The standard approach is to query pg_stat_activity every few seconds, but that adds overhead to an already stressed system. There's a better way: monitor connection pools through the filesystem, without touching the database at all.

Why pg_stat_activity Polling Creates More Problems Than It Solves

Polling pg_stat_activity requires an active database connection, consuming one of the very resources you're trying to monitor. During connection pool exhaustion, your monitoring queries compete with application traffic for limited connections.

Worse, database queries add latency exactly when response time matters most. A monitoring system that takes 200ms to detect pool saturation gives you less time to react than one that detects it near-instantly by counting processes.

The solution lies in monitoring connection pools from outside the database entirely.

Process-Based Connection Monitoring: The /proc Approach

Linux exposes connection information through /proc and socket state commands. These require zero database connections and provide real-time visibility into pool utilisation.

Counting PostgreSQL Connections via Process Lists

Each PostgreSQL connection spawns a process. Count them directly:

# Count PostgreSQL backend processes (also matches background workers
# such as the checkpointer, which set similar process titles)
pgrep -fc "postgres: "

# More specific: only backends currently executing a query
ps aux | grep -E "postgres: .*(SELECT|INSERT|UPDATE|DELETE)" | grep -v grep | wc -l

# Count idle connections
ps aux | grep "postgres: .*idle" | grep -v grep | wc -l

Socket State Analysis with ss and netstat

Socket states reveal connection pool health before process counts hit limits:

# Active PostgreSQL connections
ss -tn state established | grep :5432 | wc -l

# Connections waiting to close (potential pool backup)
ss -tn state time-wait | grep :5432 | wc -l

# Full connection state breakdown (-a includes non-established states)
ss -tan | grep :5432 | awk '{print $1}' | sort | uniq -c
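
The breakdown pipeline above is easier to verify if you factor it into a function that reads ss output on stdin; you can then exercise it against a captured snapshot instead of a live socket table. The sample lines below are an illustrative snapshot, not real output from any particular host:

```shell
#!/bin/bash
# Tally TCP states for port 5432 from `ss -tan`-style output read on stdin.
state_breakdown() {
    grep ':5432' | awk '{print $1}' | sort | uniq -c
}

# Exercise it on a captured (illustrative) snapshot:
state_breakdown <<'EOF'
ESTAB      0 0 10.0.0.5:41234  10.0.0.9:5432
ESTAB      0 0 10.0.0.5:41236  10.0.0.9:5432
TIME-WAIT  0 0 10.0.0.5:41230  10.0.0.9:5432
EOF
```

In production you would pipe live output instead: `ss -tan | state_breakdown`.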

Filesystem-Based Early Warning Implementation

Build alerts that trigger before pool exhaustion occurs, not after applications start failing.

Setting Up Connection Count Thresholds

Create a monitoring script that checks connection counts against configured limits:

#!/bin/bash
# Read the configured limit (Debian/Ubuntu config layout); the awk strips
# inline comments and whitespace from the value
MAX_CONNECTIONS=$(awk -F= '/^max_connections/ {gsub(/[^0-9]/, "", $2); print $2}' /etc/postgresql/*/main/postgresql.conf)
CURRENT_CONNECTIONS=$(pgrep -fc "postgres: ")
UTILISATION=$(( CURRENT_CONNECTIONS * 100 / MAX_CONNECTIONS ))

if [ $UTILISATION -gt 80 ]; then
    echo "PostgreSQL connection pool at ${UTILISATION}% capacity"
    # Trigger alert
fi
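
A threshold check like the one above will flap if utilisation hovers around 80%, firing and clearing on every run. One common remedy is hysteresis: raise the alert at 80% but only clear it once utilisation drops below 70%. A minimal sketch, with the state file path and thresholds as illustrative choices:

```shell
#!/bin/bash
# Alert state machine with hysteresis: raise at >80%, clear only below 70%.
# STATE_FILE path is illustrative; point it somewhere persistent in practice.
STATE_FILE=${STATE_FILE:-/tmp/pg_pool_alert_state}

check_with_hysteresis() {
    local utilisation=$1
    local state
    state=$(cat "$STATE_FILE" 2>/dev/null)
    state=${state:-ok}

    if [ "$state" = ok ] && [ "$utilisation" -gt 80 ]; then
        state=alerting
        echo "ALERT: pool at ${utilisation}%"
    elif [ "$state" = alerting ] && [ "$utilisation" -lt 70 ]; then
        state=ok
        echo "RESOLVED: pool at ${utilisation}%"
    fi
    echo "$state" > "$STATE_FILE"
}
```

Between 70% and 80% the alert simply holds its last state, so a pool oscillating around the threshold produces one alert and one resolution rather than a stream of both.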

Detecting Pool Saturation Patterns

Connection pools show predictable patterns before exhaustion. Monitor TIME_WAIT states as an early indicator:

# Detect pool cycling (connections opening/closing rapidly)
TIME_WAIT_COUNT=$(ss -tn state time-wait | grep :5432 | wc -l)
ESTABLISHED_COUNT=$(ss -tn state established | grep :5432 | wc -l)

# High TIME_WAIT relative to established suggests pool thrashing
RATIO=$(( TIME_WAIT_COUNT * 100 / (ESTABLISHED_COUNT + 1) ))

if [ $RATIO -gt 50 ]; then
    echo "High connection churn detected: ${RATIO}% TIME_WAIT ratio"
fi

Building Alerts That Fire Before Pool Exhaustion

Effective connection pool monitoring combines absolute thresholds with rate-based detection.

Rate-Based vs Absolute Threshold Monitoring

Absolute thresholds catch steady-state problems. Rate monitoring catches rapid connection growth:

# Store previous count
PREVIOUS_COUNT=$(cat /tmp/pg_connection_count 2>/dev/null || echo 0)
CURRENT_COUNT=$(pgrep -fc "postgres: ")
echo $CURRENT_COUNT > /tmp/pg_connection_count

# Growth since the previous check (run at a fixed interval, e.g. every minute)
GROWTH_RATE=$(( CURRENT_COUNT - PREVIOUS_COUNT ))

if [ $GROWTH_RATE -gt 20 ]; then
    echo "Rapid connection growth: +${GROWTH_RATE} connections since last check"
fi
fi

Combine both approaches for comprehensive coverage. Why Your First Alert Shouldn't Be Your Last: Designing Failure-Proof Notification Chains explains how to build escalating alerts that ensure critical issues get attention even when primary notification methods fail.
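
One way to combine the two checks is a single function that takes the current count, the previous count, and the configured limit, and reports the worst condition found. The thresholds (80% and +20 per interval) are the illustrative values used earlier, not recommendations:

```shell
#!/bin/bash
# Combined absolute-threshold and rate-of-growth check in one pass.
pool_status() {
    local current=$1 previous=$2 max=$3
    local utilisation=$(( current * 100 / max ))
    local growth=$(( current - previous ))

    if [ "$utilisation" -gt 80 ]; then
        echo "critical: ${utilisation}% of ${max} connections in use"
    elif [ "$growth" -gt 20 ]; then
        echo "warning: +${growth} connections since last check"
    else
        echo "ok"
    fi
}

# Live usage: feed it the counts gathered by the snippets above, e.g.
#   pool_status "$CURRENT_COUNT" "$PREVIOUS_COUNT" "$MAX_CONNECTIONS"
```

Checking utilisation first means a pool that is both full and growing reports the more urgent condition.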

Real-World Configuration Examples

For multi-tenant environments, monitor per-database connection counts. Isolating Resource Usage by Customer in Multi-Tenant Hosting covers techniques for attributing resource usage to specific customers, including database connections.

A complete monitoring implementation might look like:

#!/bin/bash
# PostgreSQL connection pool monitor

check_connections() {
    local db_name=$1
    local max_conn=$2
    local warn_threshold=80
    
    # Count connections for specific database
    local current=$(ps aux | grep "postgres: .*${db_name}" | grep -v grep | wc -l)
    local utilisation=$(( current * 100 / max_conn ))
    
    if [ $utilisation -gt $warn_threshold ]; then
        echo "Database ${db_name}: ${utilisation}% pool utilisation (${current}/${max_conn})"
        return 1
    fi
    return 0
}

# Monitor each database separately
check_connections "webapp_prod" 100
check_connections "analytics" 50
check_connections "reporting" 25
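
The per-database grep in check_connections can be made both stricter and testable by reading ps output on stdin and anchoring on the backend process title format, which PostgreSQL sets as "postgres: <user> <database> <client> <state>". The sample ps lines below are illustrative:

```shell
#!/bin/bash
# Count backends for one database from `ps` output on stdin.
# Anchoring "postgres: <user> <db> " avoids accidental matches on the
# database name appearing elsewhere in the line.
count_db_backends() {
    grep -c "postgres: [^ ]* $1 "
}

# Live usage:
#   ps aux | count_db_backends webapp_prod
```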

This approach scales better than polling database views because it never consumes a database connection. Server Scout's plugin system makes it simple to deploy custom monitoring scripts like this across multiple servers, providing connection pool visibility without the overhead of traditional database monitoring tools.

The PostgreSQL documentation details connection configuration parameters, but monitoring them through /proc gives you insight into actual runtime behaviour rather than configured limits.

Filesystem-based monitoring catches connection pool problems before they impact applications, without adding load to stressed database systems.

FAQ

How accurate is process counting compared to pg_stat_activity queries?

Process counting via /proc tracks pg_stat_activity closely, because each row in that view corresponds to a backend process, idle or not. The practical advantage is that process counting needs no database connection, so it keeps working even when the pool is fully exhausted and a monitoring query couldn't connect at all.

Will this monitoring approach work with connection poolers like PgBouncer?

Yes, but you'll need to monitor both the application-to-pooler connections and the pooler-to-PostgreSQL connections separately. Use ss -tn | grep :6432 for PgBouncer client connections and ss -tn | grep :5432 for the actual PostgreSQL backend connections.
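
A small helper makes the two-tier count easy to script and to test against captured ss output. The word-boundary match avoids counting port 54321 as 5432; 6432 is PgBouncer's default listen port, and the sample lines are illustrative:

```shell
#!/bin/bash
# Count connections touching a given port from ss output read on stdin.
count_port() {
    grep -c ":$1\b"
}

# Live usage: run once per tier.
#   CLIENT_CONNS=$(ss -tn state established | count_port 6432)
#   BACKEND_CONNS=$(ss -tn state established | count_port 5432)
```

Comparing the two numbers shows how aggressively PgBouncer is multiplexing: many client connections over few backend connections is the pooler doing its job.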

How often should I run these checks without impacting system performance?

These filesystem-based checks are extremely lightweight - you can safely run them every 10-30 seconds. The commands read from /proc and don't create network traffic or database load, making them suitable for high-frequency monitoring even on busy systems.
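
Note that cron can't fire more often than once a minute, so for a 10-30 second cadence a systemd timer is the usual fit. A sketch, with unit names and script path purely illustrative:

```ini
# /etc/systemd/system/pg-pool-check.service  (names and paths illustrative)
[Unit]
Description=PostgreSQL connection pool check

[Service]
Type=oneshot
ExecStart=/usr/local/bin/pg_pool_check.sh

# /etc/systemd/system/pg-pool-check.timer
[Unit]
Description=Run the pool check every 15 seconds

[Timer]
OnBootSec=30
OnUnitActiveSec=15

[Install]
WantedBy=timers.target
```

Enable it with `systemctl enable --now pg-pool-check.timer`.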

Ready to Try Server Scout?

Start monitoring your servers and infrastructure in under 60 seconds. Free for 3 months.

Start Free Trial