CPU Temperature Monitoring

Server Scout can monitor CPU temperature so that you can catch cooling problems before they lead to thermal throttling or hardware damage. This guide covers how to enable and configure CPU temperature monitoring.

How Server Scout Reads CPU Temperature

The Server Scout agent automatically detects CPU temperature by reading from the Linux thermal subsystem, specifically from /sys/class/thermal/thermal_zone0/temp. This file contains the CPU temperature in millidegrees Celsius, which the agent converts to standard degrees Celsius for display.

You can manually check your CPU temperature using:

cat /sys/class/thermal/thermal_zone0/temp

The value returned (e.g., 45000) represents 45°C. On systems with multiple thermal zones, Server Scout will attempt to read from the primary CPU thermal zone.
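The conversion described above, and a way to see which thermal zones a system exposes, can be sketched as follows. This is a minimal illustration, not the agent's actual implementation: the function names `millideg_to_celsius` and `list_thermal_zones` are hypothetical, and the sketch assumes each zone directory contains the standard sysfs `temp` and `type` files.

```bash
#!/bin/sh
# Convert a millidegree reading (e.g. 45000) to degrees Celsius, one decimal place.
millideg_to_celsius() {
    LC_ALL=C awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 1000 }'
}

# Print "<zone> <type> <temperature in C>" for every readable thermal zone.
# The base directory is a parameter (hypothetical convenience for testing);
# it defaults to the standard sysfs location.
list_thermal_zones() {
    base="${1:-/sys/class/thermal}"
    for zone in "$base"/thermal_zone*; do
        # Skip zones without a readable sensor (also covers an unmatched glob).
        [ -r "$zone/temp" ] || continue
        raw=$(cat "$zone/temp" 2>/dev/null) || continue
        ztype="unknown"
        [ -r "$zone/type" ] && ztype=$(cat "$zone/type")
        printf '%s %s %s\n' "${zone##*/}" "$ztype" "$(millideg_to_celsius "$raw")"
    done
}
```

Running `list_thermal_zones` with no arguments lists every zone on the machine. The `type` column helps identify which zone belongs to the CPU, since `thermal_zone0` is not guaranteed to be the CPU sensor on every system.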

Enabling CPU Temperature Monitoring

CPU temperature monitoring is an optional metric in Server Scout that must be explicitly enabled. To activate this feature:

  1. Edit the agent configuration file (typically located at /opt/scout-agent/agent.env.yml or similar)
  2. Add or modify the optional metrics section:

```yaml
optional_metrics:
  cpu_temp: true
```

  3. Restart the Server Scout agent:

```bash
sudo systemctl restart scout-agent
```

  4. Verify the metric is being collected by checking the agent logs or waiting for the next reporting cycle
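The verification step can be approximated from the command line. The sketch below is an assumption-laden illustration: `cpu_temp_enabled` is a hypothetical helper that only checks whether the configuration file contains a `cpu_temp: true` line, and the commented commands assume the agent runs as the `scout-agent` systemd unit and logs to the journal. The exact log messages the agent emits are not documented here, so the sketch does not search for any specific text.

```bash
#!/bin/sh
# Hypothetical helper: succeed if the config file enables the cpu_temp metric.
# The default path is the typical location named in this guide; adjust as needed.
cpu_temp_enabled() {
    config="${1:-/opt/scout-agent/agent.env.yml}"
    grep -Eq '^[[:space:]]*cpu_temp:[[:space:]]*true[[:space:]]*$' "$config"
}

# After restarting, confirm the service is running and inspect recent log lines
# (assumes the agent logs to the systemd journal):
#   systemctl is-active scout-agent
#   journalctl -u scout-agent -n 50 --no-pager
```

For example, `cpu_temp_enabled && echo "cpu_temp is enabled"` prints a confirmation only when the setting is present in the default configuration file.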

Once enabled, temperature readings will appear on your server's detail page in the Server Scout dashboard, displayed in degrees Celsius with regular updates based on your configured reporting interval.

Virtual Machine Considerations

Important: Virtual machines typically cannot access physical thermal sensors and will report null values for CPU temperature. This is expected behaviour as VMs run on virtualised hardware without direct access to the physical CPU's thermal monitoring capabilities.

If you're monitoring VMs, consider focusing on the host system's temperature monitoring instead, as this provides more meaningful thermal data for the actual physical hardware.
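To check whether a missing reading is explained by virtualisation, something like the following sketch may help. Both function names are hypothetical, and printing `null` for a missing sensor is an assumption that mirrors the behaviour described above rather than the agent's actual code. `systemd-detect-virt` is a standard systemd utility, but it may not be installed on every system, so the sketch falls back to `unknown`.

```bash
#!/bin/sh
# Print the temperature in whole degrees Celsius, or "null" when no sensor is readable.
read_cpu_temp() {
    sensor="${1:-/sys/class/thermal/thermal_zone0/temp}"
    if raw=$(cat "$sensor" 2>/dev/null) && [ -n "$raw" ]; then
        echo $(( raw / 1000 ))
    else
        echo "null"
    fi
}

# Print the detected virtualisation type, "none" on bare metal,
# or "unknown" when systemd-detect-virt is not available.
virt_type() {
    if command -v systemd-detect-virt >/dev/null 2>&1; then
        # The tool exits non-zero on bare metal, so ignore its exit status.
        systemd-detect-virt || true
    else
        echo "unknown"
    fi
}
```

If `read_cpu_temp` prints `null` and `virt_type` reports a hypervisor such as `kvm`, the missing reading is expected and no agent misconfiguration is implied.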

Temperature Display and Monitoring

Server Scout displays CPU temperature prominently on the server detail page alongside other vital metrics like CPU usage, memory consumption, and disk space. The reading is refreshed each time the agent reports, so how current it is depends on your configured reporting interval.

Temperature readings include:

  • Current temperature in degrees Celsius
  • Historical temperature graphs
  • Temperature trend indicators
  • Alert status based on your configured thresholds

Recommended Alert Thresholds

For optimal server health and longevity, Server Scout recommends configuring temperature alerts with these thresholds:

  • Warning Level: 75°C
  • Critical Level: 85°C

These thresholds provide adequate headroom before thermal throttling typically begins (usually around 90-95°C on most processors) whilst giving you time to investigate and resolve cooling issues.

To configure these alerts in Server Scout:

  1. Navigate to your server's alert settings
  2. Enable CPU temperature monitoring
  3. Set warning threshold to 75°C
  4. Set critical threshold to 85°C
  5. Configure notification preferences for each threshold level
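The way a reading maps onto these two thresholds can be illustrated with a short sketch. `classify_temp` is a hypothetical function, not part of Server Scout, and it assumes an alert level is reached when the temperature is greater than or equal to the threshold; the product may treat the boundary differently.

```bash
#!/bin/sh
# Classify a whole-degree Celsius reading as ok, warning, or critical.
# Thresholds default to the recommended 75 and 85 and can be overridden.
classify_temp() {
    temp_c="$1"
    warn="${2:-75}"
    crit="${3:-85}"
    if [ "$temp_c" -ge "$crit" ]; then
        echo "critical"
    elif [ "$temp_c" -ge "$warn" ]; then
        echo "warning"
    else
        echo "ok"
    fi
}
```

For example, `classify_temp 80` prints `warning`, while `classify_temp 80 70 78` prints `critical` because the custom critical threshold of 78 has been exceeded.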

Common Causes of High CPU Temperature

When Server Scout alerts you to elevated CPU temperatures, consider these common causes:

Sustained High Load: Continuous CPU-intensive processes can generate excessive heat. Monitor your CPU usage alongside temperature metrics to identify correlation between workload and thermal issues.
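For a quick manual check of whether load and temperature move together, a sketch like the one below samples both at once. The function name `sample_load_and_temp` is hypothetical; it assumes the standard Linux `/proc/loadavg` file and the thermal zone path used earlier in this guide, and it is independent of how Server Scout collects these metrics.

```bash
#!/bin/sh
# Print "<1-minute load average> <temperature in whole degrees C>" on one line.
# Both file paths are parameters so the function can be pointed elsewhere.
sample_load_and_temp() {
    loadavg_file="${1:-/proc/loadavg}"
    temp_file="${2:-/sys/class/thermal/thermal_zone0/temp}"
    # The first field of /proc/loadavg is the 1-minute load average.
    read -r load1 _rest < "$loadavg_file"
    raw=$(cat "$temp_file")
    printf '%s %s\n' "$load1" "$(( raw / 1000 ))"
}
```

Calling it in a loop, for instance `while true; do sample_load_and_temp; sleep 60; done`, produces a simple two-column log in which rising load followed by rising temperature points to workload rather than a cooling fault.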

Poor Cooling Systems: Failed or inadequate cooling can quickly lead to overheating. Check that:

  • All fans are operational and spinning at appropriate speeds
  • Heat sinks are properly seated and making good thermal contact
  • Thermal paste hasn't dried out or degraded

Dusty Hardware: Accumulated dust acts as thermal insulation and blocks airflow. Regular cleaning schedules should include:

  • Clearing dust from heat sinks and fan blades
  • Ensuring air intake and exhaust vents remain unobstructed
  • Checking that server room air filtration is adequate

Environmental Factors: Ambient temperature increases, poor server room cooling, or inadequate airflow can contribute to thermal issues even with properly functioning hardware.

By monitoring CPU temperature with Server Scout, you can proactively identify thermal issues before they impact performance or cause hardware damage, ensuring your servers maintain optimal operating conditions.

Frequently Asked Questions

How do I enable CPU temperature monitoring in Server Scout?

Edit the agent configuration file (typically at /opt/scout-agent/agent.env.yml), add 'cpu_temp: true' under the optional_metrics section, then restart the Server Scout agent using 'sudo systemctl restart scout-agent'. Temperature readings will appear on your server's detail page after the next reporting cycle.

How does Server Scout read CPU temperature on Linux servers?

Server Scout automatically reads CPU temperature from the Linux thermal subsystem at /sys/class/thermal/thermal_zone0/temp. This file contains the temperature in millidegrees Celsius, which the agent converts to standard degrees Celsius for display. On multi-zone systems, it attempts to read from the primary CPU thermal zone.

Why is my virtual machine showing null CPU temperature values?

Virtual machines typically cannot access physical thermal sensors and will report null temperature values. This is expected behaviour as VMs run on virtualised hardware without direct access to the physical CPU's thermal monitoring capabilities. Monitor the host system's temperature instead for meaningful thermal data.

What are the recommended CPU temperature alert thresholds?

Server Scout recommends setting warning alerts at 75°C and critical alerts at 85°C. These thresholds provide adequate headroom before thermal throttling typically begins (around 90-95°C) while giving you time to investigate and resolve cooling issues before hardware damage occurs.

What causes high CPU temperatures in servers?

Common causes include sustained high CPU load, failed or inadequate cooling systems, accumulated dust blocking airflow, degraded thermal paste, and poor environmental conditions. Comparing CPU usage and temperature in Server Scout helps you tell workload-driven spikes apart from cooling faults.

How do I manually check CPU temperature on Linux?

Use the command 'cat /sys/class/thermal/thermal_zone0/temp' to manually check CPU temperature. The returned value (e.g., 45000) represents temperature in millidegrees Celsius, so divide by 1000 to get the actual temperature (45°C in this example).

What temperature information does Server Scout display?

Server Scout displays current temperature in degrees Celsius, historical temperature graphs, temperature trend indicators, and alert status based on configured thresholds. All temperature data appears on the server detail page alongside other vital metrics like CPU usage and memory consumption.
