Understanding Smart Alerts

Server Scout's smart alert system is designed to provide meaningful notifications whilst avoiding the constant barrage of false alarms that plague many monitoring solutions. By using sustained thresholds, cooldown periods, and a two-tier escalation pattern, it sends alerts only when they truly matter.

How Sustained Thresholds Work

Unlike basic monitoring tools that trigger alerts on momentary spikes, Server Scout uses sustained thresholds to ensure alerts represent genuine issues rather than temporary fluctuations.

When a metric exceeds your configured threshold, Server Scout doesn't immediately fire an alert. Instead, it monitors whether the condition persists for a specified duration. This approach prevents alerts for brief CPU spikes during normal operations or temporary memory usage increases that quickly resolve themselves.

For example, if your CPU threshold is set to 80% with a 5-minute duration requirement, Server Scout will only trigger an alert if CPU usage remains above 80% for the full 5-minute period.
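
The sustained-threshold behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Server Scout's actual implementation: the class name `SustainedThreshold`, its fields, and the `observe` method are all hypothetical, and timestamps are plain seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SustainedThreshold:
    """Hypothetical model: alert only after a metric stays above
    `threshold` for at least `duration_s` seconds without a break."""
    threshold: float                        # e.g. 80.0 for 80% CPU
    duration_s: float                       # e.g. 300 for 5 minutes
    breach_started: Optional[float] = None  # when the current breach began

    def observe(self, value: float, now: float) -> bool:
        """Record one sample; return True if the alert condition is met."""
        if value <= self.threshold:
            # A sample at or below the threshold resets the timer.
            self.breach_started = None
            return False
        if self.breach_started is None:
            self.breach_started = now
        return now - self.breach_started >= self.duration_s


cpu = SustainedThreshold(threshold=80.0, duration_s=300)
print(cpu.observe(95, now=0))    # False: breach has only just started
print(cpu.observe(70, now=120))  # False: brief spike ended, timer resets
print(cpu.observe(85, now=180))  # False: a new breach starts here
print(cpu.observe(85, now=480))  # True: above 80% for a full 300 seconds
```

The key design point is the reset: a single sample back under the threshold discards the elapsed time, so only an unbroken breach can trigger the alert.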

Configurable Duration Requirements

Duration requirements are customisable for each alert type, allowing you to fine-tune sensitivity based on your specific needs:

  • Short durations (1-2 minutes) work well for critical services where immediate action is required
  • Medium durations (3-5 minutes) suit most general monitoring scenarios
  • Longer durations (10+ minutes) are ideal for metrics that naturally fluctuate

To adjust duration settings:

  1. Navigate to your server's alert configuration
  2. Select the metric you wish to modify
  3. Set both the threshold value and duration requirement
  4. Save your changes

Alert Cooldown Periods

Alert cooldown periods prevent notification flooding by ensuring you won't receive repeated alerts for the same ongoing issue. Once an alert fires, Server Scout enters a cooldown period during which additional notifications for that specific alert are suppressed.

The default cooldown period is 30 minutes, but this can be adjusted based on your response requirements. During cooldown:

  • The alert condition continues to be monitored
  • No additional notifications are sent for the same alert type
  • If the condition resolves and then reoccurs after cooldown expires, a new alert will be triggered

This mechanism ensures your notification channels remain useful rather than becoming overwhelmed with duplicate alerts.
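
As a rough illustration of how a cooldown suppresses repeats, the sketch below tracks the last notification time per alert. The `CooldownGate` class and the `"web1:cpu"` style keys are hypothetical names chosen for the example; only the 30-minute default comes from the text above. How Server Scout treats a condition that is still active when the cooldown ends is not specified here, so the sketch simply allows the next notification through.

```python
from typing import Dict


class CooldownGate:
    """Hypothetical helper: allow at most one notification per alert
    key within each cooldown window."""

    def __init__(self, cooldown_s: float = 30 * 60) -> None:
        self.cooldown_s = cooldown_s
        self._last_sent: Dict[str, float] = {}  # alert key -> last send time

    def should_notify(self, alert_key: str, now: float) -> bool:
        last = self._last_sent.get(alert_key)
        if last is not None and now - last < self.cooldown_s:
            return False  # still cooling down: suppress the duplicate
        self._last_sent[alert_key] = now
        return True


gate = CooldownGate()
print(gate.should_notify("web1:cpu", now=0))      # True: first alert is sent
print(gate.should_notify("web1:cpu", now=600))    # False: inside the cooldown
print(gate.should_notify("web1:disk", now=600))   # True: a different alert
print(gate.should_notify("web1:cpu", now=1800))   # True: cooldown has expired
```

Keying the cooldown per alert means a suppressed CPU alert never hides an unrelated disk alert on the same server.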

Alert Escalation System

Server Scout implements a two-tier escalation system: Warning and Critical levels.

Warning Alerts

Warning alerts serve as early indicators that a metric is approaching problematic levels. These typically have:

  • Lower threshold values
  • Shorter duration requirements
  • Less aggressive notification methods

Critical Alerts

Critical alerts indicate immediate attention is required. They feature:

  • Higher threshold values
  • Potentially different notification channels
  • More urgent delivery methods

This escalation approach allows you to address issues proactively at the warning stage, potentially preventing critical situations entirely.
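
One way to picture the two tiers is as a pair of thresholds on the same metric. The sketch below is an assumption-laden illustration: the `Level` enum, the `classify` function, and the example values of 80 and 95 are hypothetical, and it ignores duration requirements to keep the comparison logic visible.

```python
from enum import Enum


class Level(Enum):
    OK = "ok"
    WARNING = "warning"
    CRITICAL = "critical"


def classify(value: float, warning_at: float, critical_at: float) -> Level:
    """Map a metric value to a level (hypothetical two-threshold model)."""
    if warning_at >= critical_at:
        raise ValueError("warning threshold must be below critical threshold")
    # Check the higher threshold first so critical takes precedence.
    if value > critical_at:
        return Level.CRITICAL
    if value > warning_at:
        return Level.WARNING
    return Level.OK


print(classify(72.0, warning_at=80.0, critical_at=95.0))  # Level.OK
print(classify(85.0, warning_at=80.0, critical_at=95.0))  # Level.WARNING
print(classify(97.5, warning_at=80.0, critical_at=95.0))  # Level.CRITICAL
```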

Server Offline Detection

Server offline detection operates differently from metric-based alerts, relying on missed check-ins rather than threshold violations.

Server Scout expects regular heartbeats from your monitored servers. When a server fails to check in within the expected timeframe:

  1. Server Scout marks the server as potentially offline
  2. After a grace period of three consecutive missed check-ins, an offline alert is triggered
  3. The alert continues until the server resumes normal check-ins

This system accounts for temporary network hiccups whilst quickly identifying genuine connectivity or server issues.

Check-in Frequency

The standard check-in interval is 60 seconds. A server is considered offline if it misses three consecutive check-ins, providing a 3-minute detection window whilst avoiding false positives from brief network interruptions.
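
The arithmetic behind the 3-minute window can be sketched as follows. The function names and parameters are hypothetical; only the 60-second interval and the three-missed-check-ins rule come from the text, and timestamps are plain seconds.

```python
def missed_checkins(last_seen: float, now: float, interval_s: float = 60.0) -> int:
    """Count how many expected check-ins have come due since the last one."""
    return max(0, int((now - last_seen) // interval_s))


def is_offline(last_seen: float, now: float,
               interval_s: float = 60.0, max_missed: int = 3) -> bool:
    """True once `max_missed` consecutive check-ins have been missed."""
    return missed_checkins(last_seen, now, interval_s) >= max_missed


# Last heartbeat arrived at t=0; check-ins were expected at 60, 120 and 180.
print(is_offline(last_seen=0, now=90))    # False: one check-in missed
print(is_offline(last_seen=0, now=179))   # False: two missed, third not yet due
print(is_offline(last_seen=0, now=180))   # True: three missed (3 x 60 s)
```

Because any successful check-in moves `last_seen` forward, a single dropped heartbeat followed by a normal one never reaches the offline state.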

Viewing Alert History

The notification log provides comprehensive alert history, accessible through your Server Scout dashboard:

Dashboard → Notifications → Alert History

The alert history includes:

  • Timestamp of each alert
  • Alert type and affected metric
  • Threshold values that triggered the alert
  • Duration the condition persisted
  • Resolution time when the condition cleared

This historical data proves invaluable for identifying patterns, planning capacity upgrades, and demonstrating system reliability to stakeholders.

Best Practices

Configure thresholds based on your server's normal operating patterns rather than theoretical maximums. Monitor your alert history regularly to fine-tune duration and threshold settings, ensuring your smart alert system continues to provide maximum value with minimal noise.
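
As one possible way to ground a threshold in normal operating patterns, the sketch below takes a high percentile of past readings and adds some headroom. This percentile-plus-headroom rule is an assumption made for illustration, not a Server Scout feature or recommendation; the function name, the 95th percentile, and the 5-point headroom are all hypothetical choices. It uses `statistics.quantiles` from the Python standard library (Python 3.8 or later).

```python
import statistics
from typing import Sequence


def suggest_threshold(samples: Sequence[float], percentile: int = 95,
                      headroom: float = 5.0, ceiling: float = 100.0) -> float:
    """Suggest a threshold slightly above what the server normally reaches.

    Hypothetical rule: the given percentile of past samples plus a fixed
    headroom, capped at `ceiling` (100 for percentage metrics).
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    if not 1 <= percentile <= 99:
        raise ValueError("percentile must be between 1 and 99")
    # n=100 yields 99 cut points; index p-1 is the p-th percentile.
    # The inclusive method never extrapolates beyond the observed range.
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return min(ceiling, cuts[percentile - 1] + headroom)


# CPU readings (percent) from an ordinary period, including one brief spike.
readings = [35, 38, 41, 40, 37, 62, 39, 36, 42, 40]
print(suggest_threshold(readings))
```

Reviewing the suggestion against the alert history remains essential, since a percentile computed from an unusually quiet or unusually busy period will skew the result.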

Frequently Asked Questions

How do I set up smart alerts in Server Scout?

Navigate to your server's alert configuration, select the metric you want to monitor, set both the threshold value and duration requirement, then save your changes. You can customise duration requirements for each alert type based on your specific monitoring needs.

What are sustained thresholds and how do they work?

Sustained thresholds require a metric to exceed your configured limit for a specified duration before triggering an alert. For example, if the CPU threshold is 80% with a 5-minute duration, the alert only fires if CPU usage stays above 80% for the full 5 minutes, preventing false alarms from temporary spikes.

Why am I not getting duplicate alerts for the same issue?

Server Scout uses alert cooldown periods to prevent notification flooding. Once an alert fires, it enters a cooldown (30 minutes by default) during which additional notifications for the same alert type are suppressed. The condition continues to be monitored, but no duplicate notifications are sent during cooldown.

How does Server Scout detect when a server goes offline?

Server offline detection relies on missed check-ins rather than metric thresholds. Servers send heartbeats every 60 seconds, and if three consecutive check-ins are missed (a 3-minute window), an offline alert is triggered. This accounts for temporary network issues whilst quickly identifying genuine connectivity problems.

What's the difference between warning and critical alerts?

Warning alerts are early indicators with lower thresholds and shorter durations, serving as proactive notifications. Critical alerts indicate immediate attention is required, featuring higher threshold values and more urgent delivery methods. This two-tier system helps address issues before they become critical.

What duration should I set for different types of alerts?

Use short durations (1-2 minutes) for critical services needing immediate action, medium durations (3-5 minutes) for general monitoring scenarios, and longer durations (10+ minutes) for naturally fluctuating metrics. Configure based on your server's normal operating patterns rather than theoretical maximums.

Where can I view my alert history in Server Scout?

Access alert history through Dashboard → Notifications → Alert History. The notification log shows timestamps, alert types, threshold values that triggered alerts, duration the condition persisted, and resolution times. This data helps identify patterns and fine-tune your alert settings.

How do I stop getting too many false alert notifications?

Increase duration requirements for metrics that naturally fluctuate, adjust thresholds based on your server's normal operating patterns, and use the alert history to identify and fine-tune problematic settings. Server Scout's sustained thresholds and cooldown periods are specifically designed to reduce false alarms.
