
SSH vs SNMP for Switch Monitoring: Building Complete Infrastructure Visibility

· Server Scout

Your database server shows normal CPU and memory usage, but users complain about slow response times. The culprit? A saturated uplink on the switch handling your database traffic. Without network infrastructure monitoring, you're troubleshooting server problems that actually originate in the network layer.

Many sysadmins focus on server metrics while treating network switches as black boxes. This creates blind spots where network bottlenecks show up as unexplained server performance issues. Building complete infrastructure visibility requires monitoring both servers and the network fabric connecting them.

Setting Up SNMP-Based Switch Monitoring

SNMP remains the standard protocol for network device monitoring. Most enterprise switches ship with SNMP enabled, making it the path of least resistance for basic monitoring.

Configuring SNMP on Common Switch Models

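Cisco switches need an SNMP community configured. Monitoring only needs the read-only (RO) community; add a read-write (RW) community only if you need SNMP writes: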

switch(config)# snmp-server community readonly RO
switch(config)# snmp-server community readwrite RW

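HP/HPE (ProCurve/ArubaOS-Switch) switches use similar syntax. Note that restricted means read-only, while unrestricted also permits writes:

switch(config)# snmp-server community "readonly" operator restricted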

For production environments, use SNMP v3 with authentication (Cisco IOS syntax shown; replace password123 with a strong passphrase):

switch(config)# snmp-server group monitoring-group v3 auth
switch(config)# snmp-server user monitor-user monitoring-group v3 auth sha password123
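
Once the v3 user exists, the poller has to present the same credentials. The following is a minimal sketch using net-snmp's snmpget, assuming the monitor-user account above at the auth (authNoPriv) level with SHA. The function name snmp_v3_get and the SNMP_AUTH_PASS environment variable are illustrative choices, not part of any tool:

```shell
#!/bin/bash
# Hypothetical helper: fetch one OID over SNMPv3 (authNoPriv, SHA).
# SNMP_AUTH_PASS is an assumed environment variable holding the auth
# passphrase, so the secret stays out of the script itself.
snmp_v3_get() {
    local host=$1 oid=$2
    snmpget -v3 -l authNoPriv -u monitor-user \
        -a SHA -A "${SNMP_AUTH_PASS:?set SNMP_AUTH_PASS first}" \
        -Oqv "$host" "$oid"
}
```

For example, snmp_v3_get 192.168.1.10 1.3.6.1.2.1.2.2.1.10.24 would print only the counter value, because -Oqv suppresses the OID name and type. Passing -A on the command line exposes the passphrase in the process list; net-snmp can also read it from snmp.conf (the defAuthPassphrase directive), which may be preferable on shared hosts.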

Essential SNMP OIDs for Infrastructure Monitoring

Start with these critical OIDs for interface statistics:

  • 1.3.6.1.2.1.2.2.1.10 - Interface incoming octets (ifInOctets)
  • 1.3.6.1.2.1.2.2.1.16 - Interface outgoing octets (ifOutOctets)
  • 1.3.6.1.2.1.2.2.1.14 - Interface input errors (ifInErrors)
  • 1.3.6.1.2.1.2.2.1.20 - Interface output errors (ifOutErrors)

Query interface statistics with snmpwalk:

snmpwalk -v2c -c readonly 192.168.1.10 1.3.6.1.2.1.2.2.1.10
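
The octet OIDs return cumulative counters, not rates, so throughput comes from the difference between two polls. The sketch below shows that arithmetic under two assumptions: the counter is 32-bit (as ifInOctets is) and it wrapped at most once between samples. The function names are illustrative:

```shell
#!/bin/bash
# Convert two readings of a 32-bit octet counter into bits per second.
# Arguments: previous reading, current reading, seconds between readings.
octets_to_bps() {
    local prev=$1 curr=$2 interval=$3
    local delta=$(( curr - prev ))
    # A smaller current value means the counter wrapped past 2^32 - 1.
    if (( delta < 0 )); then
        delta=$(( delta + 4294967296 ))
    fi
    echo $(( delta * 8 / interval ))
}

# Utilisation as a whole-number percentage of the link speed (bits per second).
utilisation_pct() {
    local bps=$1 link_bps=$2
    echo $(( bps * 100 / link_bps ))
}
```

With readings of 1000 and 76000 taken 60 seconds apart, octets_to_bps reports 10000 bits per second. Integer division is used for simplicity, so results are rounded down.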

For CPU monitoring, most vendors require proprietary OIDs. Cisco exposes the 1-minute CPU utilisation average as cpmCPUTotal1minRev at 1.3.6.1.4.1.9.9.109.1.1.1.1.7.

SSH-Based Switch Monitoring Implementation

SNMP has limitations. Many switches expose richer diagnostics through CLI commands that aren't available via SNMP. SSH-based monitoring provides deeper visibility but requires careful implementation.

Automated SSH Command Execution

Create a monitoring script that executes switch commands remotely:

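#!/bin/bash
SSH_KEY="/opt/monitoring/.ssh/switch_key"
SWITCH_IP="192.168.1.10"
# BatchMode fails instead of prompting; ConnectTimeout bounds a hung connection
SSH_OPTS=(-i "$SSH_KEY" -o BatchMode=yes -o ConnectTimeout=10)

# Cisco switch interface statistics
ssh "${SSH_OPTS[@]}" admin@"$SWITCH_IP" "show interfaces summary" | \
  awk '/Ethernet/ {print $1, $2, $3}'

# HP switch power consumption (point SWITCH_IP at an HP device; command support varies by model)
ssh "${SSH_OPTS[@]}" admin@"$SWITCH_IP" "show system power-consumption" | \
  grep "Total Power"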

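Reuse a single SSH connection per switch (connection multiplexing) to avoid overwhelming switch management interfaces:

# Open a background master connection on a control socket
ssh -M -S /tmp/switch-%h-%p-%r -f -N admin@192.168.1.10
# Later commands reuse the master instead of opening a new session
ssh -S /tmp/switch-%h-%p-%r admin@192.168.1.10 "show version"
# Close the master connection when monitoring stops
ssh -S /tmp/switch-%h-%p-%r -O exit admin@192.168.1.10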
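
A monitoring job usually cannot assume the master connection is still alive. One way to handle that, sketched here with hypothetical names (switch_cmd, CONTROL_DIR), is to check the control socket before each command and reopen the master if needed. It assumes key-based login as admin is already set up:

```shell
#!/bin/bash
# Hypothetical wrapper around SSH connection sharing.
# CONTROL_DIR is an assumed location; a private directory avoids
# leaving control sockets in a world-writable /tmp.
CONTROL_DIR="${CONTROL_DIR:-$HOME/.ssh/ctl}"

switch_cmd() {
    local host=$1; shift
    local socket="$CONTROL_DIR/switch-%h-%p-%r"
    # -O check exits non-zero when no master is listening on the socket.
    if ! ssh -S "$socket" -O check "admin@$host" 2>/dev/null; then
        mkdir -p "$CONTROL_DIR" && chmod 700 "$CONTROL_DIR"
        ssh -M -S "$socket" -f -N -o BatchMode=yes "admin@$host" || return 1
    fi
    ssh -S "$socket" "admin@$host" "$@"
}
```

Calling switch_cmd 192.168.1.10 "show version" opens the master on first use and reuses it afterwards, so repeated polls cost one login per switch rather than one per command.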

Parsing Switch Command Output

Switch CLI output requires parsing for metric extraction. Here's a practical example for Cisco interface statistics:

parse_cisco_interfaces() {
    ssh -S /tmp/switch-%h-%p-%r admin@192.168.1.10 "show interfaces" | \
    awk '
    # Interface header lines start in column 1; counter lines are indented
    /^[A-Z]/ { iface=$1 }
    # On counter lines the value is the first field, e.g. "1500 packets input, ..."
    /packets input/ { rx_packets=$1 }
    /packets output/ { tx_packets=$1 }
    /input errors/ { rx_errors=$1 }
    /output errors/ { tx_errors=$1; print iface, rx_packets, tx_packets, rx_errors, tx_errors }
    '
}
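
Parsers like this are easier to trust when they can be run against saved output instead of a live switch. The sketch below is a stdin-reading variant, with each counter taken from the first field of its line. The sample text is abbreviated and made up to resemble the general shape of IOS output, so real devices may differ:

```shell
#!/bin/bash
# Variant of the interface parser that reads from stdin, so it can be
# exercised against captured output. Counter values sit in field 1.
parse_interfaces_stdin() {
    awk '
    /^[A-Za-z]/      { iface = $1 }
    /packets input/  { rx_packets = $1 }
    /packets output/ { tx_packets = $1 }
    /input errors/   { rx_errors = $1 }
    /output errors/  { tx_errors = $1
                       print iface, rx_packets, tx_packets, rx_errors, tx_errors }
    '
}

# Abbreviated, made-up sample in the general shape of "show interfaces" output.
sample_output() {
    cat <<'EOF'
GigabitEthernet0/1 is up, line protocol is up
     1500 packets input, 96000 bytes, 0 no buffer
     2 input errors, 2 CRC, 0 frame, 0 overrun, 0 ignored
     1200 packets output, 76800 bytes, 0 underruns
     0 output errors, 0 collisions, 1 interface resets
EOF
}
```

Running sample_output | parse_interfaces_stdin prints one line per interface: name, input packets, output packets, input errors, output errors.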

Creating Unified Infrastructure Dashboards

The goal is correlating network performance with server metrics. A database slowdown might correlate with increased switch port utilisation, revealing network bottlenecks that server monitoring alone would miss.

Correlating Network and Server Metrics

Store both switch and server metrics with consistent timestamps for correlation analysis:

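# Collect the switch port's inbound byte counter (ifInOctets, ifIndex 24); -Oqv prints the value only
SWITCH_IN_OCTETS=$(snmpget -v2c -c readonly -Oqv 192.168.1.10 1.3.6.1.2.1.2.2.1.10.24)

# Collect the server's received-bytes counter for eth0
SERVER_RX=$(awk '$1 == "eth0:" {print $2}' /proc/net/dev)

# Log with timestamp for correlation; both values are cumulative counters, so compare deltas between samples
echo "$(date +%s) switch_in_octets=$SWITCH_IN_OCTETS server_rx=$SERVER_RX" >> /var/log/infra-metrics.log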
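
Because both logged values are cumulative counters, correlation works on the change between consecutive lines rather than the raw numbers. A minimal sketch of that step, assuming each log line has the form epoch name=integer name=integer and that lines are in time order (log_rates is a hypothetical name):

```shell
#!/bin/bash
# Print per-second rates from lines shaped like:
#   <epoch> <name>=<counter> <name>=<counter>
# Assumes both values are plain integers and lines are in time order.
log_rates() {
    awk '
    {
        split($2, a, "="); split($3, b, "=")
        if (NR > 1 && $1 > prev_ts) {
            dt = $1 - prev_ts
            printf "%s %s_per_s=%d %s_per_s=%d\n", $1, a[1], (a[2] - prev_a) / dt, b[1], (b[2] - prev_b) / dt
        }
        prev_ts = $1; prev_a = a[2]; prev_b = b[2]
    }' "$@"
}
```

Running log_rates /var/log/infra-metrics.log puts the switch and server byte rates side by side for each interval. A negative rate indicates a counter wrap or reset, which this sketch does not correct.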

This approach allows you to identify patterns where server performance issues coincide with network infrastructure problems. Building a unified infrastructure dashboard becomes essential for operational visibility across your entire stack.

Monitoring Strategy Comparison

When to Use SNMP vs SSH Approaches

SNMP excels for standardised metrics collection across multiple vendor platforms. The protocol overhead is minimal, and polling scales well across hundreds of devices. However, SNMP often lacks the granular diagnostics available through vendor CLI interfaces.

SSH-based monitoring provides richer data but requires careful connection management. For multi-tenant hosting environments, SSH allows customer-specific network statistics that SNMP might not expose.

Use SNMP for:

  • High-frequency polling (30-second intervals)
  • Standardised metrics across vendor platforms
  • Large-scale deployments (100+ switches)

Use SSH for:

  • Vendor-specific diagnostics
  • Complex configuration validation
  • Detailed troubleshooting scenarios

Modern infrastructure requires monitoring that extends beyond individual servers. The net-snmp documentation provides comprehensive OID references for implementing robust SNMP-based monitoring. Whether you choose SNMP polling or SSH automation, the key is building correlation between network infrastructure performance and server metrics.

Server Scout's lightweight approach makes it practical to monitor both servers and network infrastructure without the resource overhead that typically limits monitoring scope. A 3MB bash agent leaves plenty of headroom for additional infrastructure monitoring scripts.

FAQ

Should I use SNMP v2c or v3 for production switch monitoring?

Use SNMP v3 for production environments. While v2c is simpler to configure, it sends the community string in clear text. v3 adds per-user authentication at the auth security level and encryption at the priv level, both of which matter for network security. The slight configuration overhead is worth the security benefits in production deployments.

How often should I poll network switches for performance metrics?

Poll critical interface statistics every 30-60 seconds for capacity planning, but use 5-minute intervals for less critical metrics like CPU and temperature. More frequent polling can overwhelm switch management interfaces and create unnecessary network overhead.
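
The polling interval also interacts with counter width. A 32-bit octet counter such as ifInOctets wraps after 2^32 bytes, and if that can happen more than once between polls the computed rate is wrong. The arithmetic is simple enough to sketch; the function name is illustrative:

```shell
#!/bin/bash
# Seconds until a 32-bit octet counter wraps at a sustained line rate.
# Argument: link speed in bits per second.
counter32_wrap_seconds() {
    local link_bps=$1
    echo $(( 4294967296 * 8 / link_bps ))
}
```

At 1 Gbit/s this gives 34 seconds, shorter than a 60-second poll, which is why the 64-bit ifHCInOctets and ifHCOutOctets counters (1.3.6.1.2.1.31.1.1.1.6 and 1.3.6.1.2.1.31.1.1.1.10) are the safer choice on gigabit and faster ports.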

Can SSH-based switch monitoring scale to hundreds of devices?

SSH monitoring scales with proper connection pooling and parallel execution, but SNMP is more efficient for large deployments. Consider SSH for detailed diagnostics on key infrastructure switches while using SNMP for broader monitoring coverage across your entire network.
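
Parallel execution needs a cap so that a large fleet does not spawn hundreds of simultaneous SSH sessions. One possible shape, sketched with hypothetical names and assuming bash 4.3 or newer for wait -n, runs a collector function per host with a bounded number of background jobs:

```shell
#!/bin/bash
# Run a collector function against many hosts, at most max_jobs at a time.
# Requires bash 4.3 or newer for "wait -n".
run_parallel() {
    local max_jobs=$1 collector=$2
    shift 2
    local host
    for host in "$@"; do
        # Block until a slot frees up.
        while (( $(jobs -rp | wc -l) >= max_jobs )); do
            wait -n
        done
        "$collector" "$host" &
    done
    wait   # let the remaining jobs finish
}
```

A call such as run_parallel 10 collect_switch sw1 sw2 sw3 would invoke a user-defined collect_switch function once per host. Output from different hosts can interleave, so each collector should write self-describing lines or its own file.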
