Your application starts crawling. Response times triple. Users complain. The memcached stats look normal, connection counts seem reasonable, but something is clearly wrong with your caching layer.
Connection pool exhaustion in memcached often manifests as gradual performance degradation rather than obvious failures. Applications may successfully establish initial connections but fail to properly close them, leading to socket leaks that accumulate over time. Standard monitoring tools focus on cache hit rates and memory usage, but miss the underlying network-level problems that cause real performance issues.
This step-by-step guide shows you how to identify memcached connection leaks using /proc filesystem analysis, without requiring administrative access to memcached itself or relying on stats commands that might be restricted in your environment.
Understanding Memcached Socket Pool Behavior
Memcached operates as a network daemon, typically listening on port 11211. Each client connection creates a TCP socket that should be properly closed when the application finishes its caching operations. Connection leaks occur when applications establish connections but fail to close them cleanly, leaving sockets in various TCP states that consume system resources.
Unlike database connections that might show obvious query backlogs, memcached connection leaks often appear as slowly accumulating socket file descriptors and TCP connections in unusual states. The /proc filesystem provides direct visibility into these socket states without requiring memcached cooperation.
Step 1: Baseline Connection Analysis with /proc/net/tcp
Start by establishing a baseline of current memcached connections using the TCP connection table. The /proc/net/tcp file shows all active TCP connections in hexadecimal format, where memcached's default port 11211 appears as 2BCB.
First, identify your memcached server's IP address and convert the port to hex format. Use grep to filter connections to port 11211 and count the various TCP states:
grep ':2BCB' /proc/net/tcp | awk '{print $4}' | sort | uniq -c
This baseline shows you the distribution of connection states. In a healthy system, you should see mostly ESTABLISHED (01) connections during active periods, with occasional TIME_WAIT (06) connections during normal cleanup. Large numbers of CLOSE_WAIT (08) or FIN_WAIT1 (04) states often indicate connection leaks.
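The raw hex codes are easier to read once decoded. A minimal POSIX-shell sketch that wraps the command above and translates each state code into its name (decode_states is a helper introduced here, not a standard tool):

```shell
#!/bin/sh
# decode_states reads "count state" pairs (the output of the uniq -c
# pipeline) and replaces the hex state codes with readable names.
decode_states() {
    while read -r count state; do
        case "$state" in
            01) name=ESTABLISHED ;;
            04) name=FIN_WAIT1 ;;
            06) name=TIME_WAIT ;;
            08) name=CLOSE_WAIT ;;
            *)  name="STATE_$state" ;;
        esac
        printf '%s %s\n' "$count" "$name"
    done
}

# Port 11211 in hex is 2BCB; field 4 of /proc/net/tcp is the state.
grep ':2BCB' /proc/net/tcp | awk '{print $4}' | sort | uniq -c | decode_states
```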
Parsing TCP Socket States for Memcached Ports
The /proc/net/tcp format uses hexadecimal values for both addresses and states. Key states to monitor include:
- 01 (ESTABLISHED): Active connections
- 06 (TIME_WAIT): Connections in normal cleanup
- 08 (CLOSE_WAIT): Connections waiting for application to close
- 04 (FIN_WAIT1): Connections in closing sequence
Pay close attention to CLOSE_WAIT connections: this state means the application has received a FIN packet from the peer but has not yet closed its own side of the connection, which is the classic signature of a leaked socket.
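Building on the table above, a small sketch that pulls out just the CLOSE_WAIT offenders, keeping the socket inode (field 10 of /proc/net/tcp) that Step 2 will need for correlation (list_close_wait is an illustrative helper name):

```shell
#!/bin/sh
# list_close_wait prints remote address and socket inode for every
# port-11211 connection stuck in CLOSE_WAIT (state 08). It reads a
# file argument if given (handy for testing), /proc/net/tcp otherwise.
list_close_wait() {
    awk '$4 == "08" && ($2 ~ /:2BCB$/ || $3 ~ /:2BCB$/) {
             print "remote=" $3, "inode=" $10 }' "${1:-/proc/net/tcp}"
}
list_close_wait
```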
Step 2: Application Process Socket Tracking via /proc/pid/fd
Once you've identified suspicious connection patterns, trace them back to specific application processes. Each process maintains file descriptors in /proc/pid/fd, where network sockets appear as socket:[inode] entries.
First, identify processes that might be connecting to memcached by examining their network connections. For each suspicious process, examine its file descriptors:
sudo ls -la /proc/[pid]/fd | grep socket
Cross-reference these socket inodes with the inode numbers shown in /proc/net/tcp. This correlation reveals which processes hold specific memcached connections and helps identify the source of leaked sockets.
Correlating File Descriptors to Network Connections
The inode numbers in /proc/pid/fd correspond to the inode column in /proc/net/tcp. By matching these values, you can definitively identify which process owns each memcached connection. This becomes particularly valuable when multiple applications share the same memcached instance.
Build a simple mapping by extracting inodes from both sources and looking for matches. Processes holding large numbers of memcached socket file descriptors often indicate connection pooling problems or leaked connections.
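One way to sketch that mapping in POSIX shell (memcached_inodes is a helper name introduced here; reading other users' fd links generally requires root):

```shell
#!/bin/sh
# memcached_inodes extracts the socket inode (field 10) of every
# connection touching port 11211 (hex 2BCB).
memcached_inodes() {
    awk '($2 ~ /:2BCB$/ || $3 ~ /:2BCB$/) {print $10}' "${1:-/proc/net/tcp}"
}

inodes=$(memcached_inodes)
# Walk every process's fd directory and report fds whose link target
# matches one of the memcached socket inodes.
for pid in /proc/[0-9]*; do
    for fd in "$pid"/fd/*; do
        target=$(readlink "$fd" 2>/dev/null) || continue
        for inode in $inodes; do
            if [ "$target" = "socket:[$inode]" ]; then
                printf 'pid=%s fd=%s inode=%s\n' \
                    "${pid#/proc/}" "${fd##*/}" "$inode"
            fi
        done
    done
done
```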
Step 3: Monitoring Socket State Changes Across Service Restarts
The most revealing test for connection leaks involves monitoring socket states before, during, and after application restarts. Properly designed applications should cleanly close all memcached connections during shutdown, leaving no orphaned sockets.
Before restarting your application, capture a snapshot of all memcached connections. Stop the application and wait 30 seconds, then check the connection table again. Any remaining ESTABLISHED or CLOSE_WAIT connections to memcached indicate leaked sockets that weren't properly cleaned up.
This technique works even when you can't directly access memcached statistics, as it relies purely on kernel-level socket tracking.
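The before/after comparison can be sketched as follows; snapshot and leaked are helper names introduced here, and the systemctl service name in the usage comment is a placeholder for your own stop command:

```shell
#!/bin/sh
# Snapshot the connection table before and after stopping the
# application; lines present in both snapshots are candidate leaks.
snapshot() {
    # (local, remote, state) for every connection touching port 11211
    grep ':2BCB' "${1:-/proc/net/tcp}" | awk '{print $2, $3, $4}' | sort
}
leaked() {
    # $1 = before-file, $2 = after-file; prints lines common to both
    comm -12 "$1" "$2"
}
# Usage, wrapped around your own stop command (service name assumed):
#   snapshot > /tmp/before.txt
#   systemctl stop myapp && sleep 30
#   snapshot > /tmp/after.txt
#   leaked /tmp/before.txt /tmp/after.txt
```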
Detecting Persistent Connections After Application Stop
Healthy applications should leave zero memcached connections after proper shutdown. Persistent connections indicate several potential problems:
- Connection pools not being drained during shutdown
- Background threads maintaining connections after main application stop
- Leaked connections from exception handling failures
- Improperly configured connection timeouts
Document these persistent connections by recording their TCP states and process owners before investigating the application code.
Step 4: Identifying Connection Leak Patterns
Connection leaks typically follow predictable patterns that become visible through /proc analysis. Monitor the rate of new connection establishment versus connection cleanup over time to identify gradual accumulation.
Some applications create connection leaks under specific conditions - during error handling, peak load periods, or when certain code paths fail to properly clean up resources. By correlating /proc/net/tcp patterns with application logs and system load, you can identify the triggering conditions.
TIME_WAIT vs ESTABLISHED State Analysis
Normal memcached usage generates TIME_WAIT connections as part of proper TCP cleanup. However, an accumulation of ESTABLISHED connections without corresponding cleanup, or excessive CLOSE_WAIT states, indicates application-level problems rather than normal network behavior.
Track the ratio between these states over time. A healthy pattern shows periodic creation and cleanup of connections, while leak patterns show steady accumulation of one or more states without corresponding cleanup.
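A minimal sampling helper for tracking those counts over time (sample_states is an illustrative name; adjust the interval in the usage comment to taste):

```shell
#!/bin/sh
# sample_states prints one "est=<n> close_wait=<n>" line for port
# 11211 (hex 2BCB), reading a file argument if given (handy for
# testing) or /proc/net/tcp otherwise.
sample_states() {
    awk '($2 ~ /:2BCB$/ || $3 ~ /:2BCB$/) {n[$4]++}
         END {printf "est=%d close_wait=%d\n", n["01"]+0, n["08"]+0}' \
        "${1:-/proc/net/tcp}"
}

sample_states
# Log one sample per minute to watch the trend, e.g.:
#   while true; do echo "$(date '+%H:%M:%S') $(sample_states)"; sleep 60; done
```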
Step 5: Automated Monitoring Script for Continuous Detection
Once you understand the leak patterns, build continuous monitoring to detect future occurrences. A simple bash script can monitor /proc/net/tcp for memcached connections and alert when connection counts exceed baseline thresholds or when problematic states accumulate.
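A minimal sketch of such a script, suitable for cron or a monitoring agent; MAX_TOTAL and MAX_CLOSE_WAIT are illustrative thresholds to replace with the baseline you measured in Step 1:

```shell
#!/bin/sh
# Alert when port-11211 connection counts exceed the configured
# thresholds. The limits below are illustrative values.
MAX_TOTAL=200
MAX_CLOSE_WAIT=10

check_thresholds() {
    # $1 = total connections, $2 = CLOSE_WAIT count
    if [ "$1" -gt "$MAX_TOTAL" ] || [ "$2" -gt "$MAX_CLOSE_WAIT" ]; then
        echo "ALERT: memcached connections total=$1 close_wait=$2"
        # hook real alerting here: logger, mail, webhook ...
    fi
}

total=$(awk '($2 ~ /:2BCB$/ || $3 ~ /:2BCB$/) {n++} END {print n+0}' /proc/net/tcp)
cw=$(awk '$4 == "08" && ($2 ~ /:2BCB$/ || $3 ~ /:2BCB$/) {n++} END {print n+0}' /proc/net/tcp)
check_thresholds "$total" "$cw"
```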
Implementing this monitoring at the system level provides early warning before application performance degrades. Server Scout's service monitoring features can track these patterns alongside traditional metrics, providing comprehensive visibility into both application and infrastructure health.
For teams managing multiple memcached instances across different servers, unified infrastructure dashboards help correlate connection patterns with broader system performance trends. This system-level approach often reveals problems that application-level monitoring misses.
Connection leak detection through /proc analysis provides definitive evidence of socket-level problems without requiring administrative access to memcached itself. This technique proves particularly valuable in environments where cache access is restricted or when network-level connection analysis reveals problems that application metrics don't show.
By following these steps systematically, you can identify memcached connection leaks at the socket level, trace them back to specific application processes, and implement monitoring that catches future occurrences before they impact performance. The /proc filesystem provides all the data needed to diagnose these problems definitively, regardless of memcached configuration or access restrictions.
FAQ
How do I know if the connection leaks are from my application or the memcached server itself?
Check /proc/pid/fd for your application processes and correlate the socket inodes with /proc/net/tcp entries. If your application holds socket file descriptors for connections in CLOSE_WAIT state, the leak is on the application side. Memcached server issues typically show as ESTABLISHED connections without corresponding client-side file descriptors.
What's a normal number of memcached connections to expect?
This depends entirely on your application architecture. A single-threaded application might maintain 1-2 persistent connections, while a multi-threaded web application could legitimately hold dozens. The key is consistency - connection counts should remain stable or follow predictable patterns, not steadily increase over time.
Can I use this technique to monitor other connection pools like Redis or database connections?
Absolutely. The same /proc/net/tcp analysis works for any TCP-based service. Just substitute the appropriate port number in hexadecimal format. Redis default port 6379 appears as 18EB, MySQL's 3306 appears as 0CEA, and PostgreSQL's 5432 appears as 1538.
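printf can do the conversion for you, so there's no need to memorize the hex values:

```shell
# Convert a decimal port to the uppercase hex form used by the
# kernel's connection tables:
printf '%04X\n' 11211   # memcached  -> 2BCB
printf '%04X\n' 6379    # Redis      -> 18EB
printf '%04X\n' 3306    # MySQL      -> 0CEA
printf '%04X\n' 5432    # PostgreSQL -> 1538
```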