Agent Data Spooling and Offline Recovery

Overview

The Server Scout agent includes a data spooling system that minimises data loss during network outages, server maintenance, or API unavailability. When the agent cannot reach the Server Scout API, it stores metric data locally and replays it once connectivity is restored.

This offline recovery mechanism keeps your monitoring history as complete as possible across planned maintenance windows and unexpected network disruptions, within the storage and replay limits described below.

How Data Spooling Works

When the Server Scout agent cannot reach the API, it switches to offline mode. Instead of discarding metric data, the agent writes each metric payload to disk in the spool directory located at /opt/scout-agent/spool/.

Each file in the spool directory represents a single metric collection cycle, containing all the performance data that would normally be transmitted to the API. The files are timestamped and stored in a format that preserves the original metric structure.

Spool Directory Management

The spool directory has several important characteristics:

  • Location: /opt/scout-agent/spool/
  • Maximum files: 720 files
  • File format: Compressed JSON payloads with timestamp prefixes

To check your current spool status, you can examine the directory:

ls -la /opt/scout-agent/spool/
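To look inside a single spooled file, a small helper along these lines may be useful. It is a sketch that assumes the payloads are gzip-compressed (suggested by the .json.gz extension used elsewhere in this article); peek_spool_file is a hypothetical name, not an agent command.

```shell
# Print the first bytes of a spooled payload without modifying the file.
# Assumption: the file is gzip-compressed JSON.
peek_spool_file() {
    # $1: path to a spooled file; $2: number of bytes to show (default 300)
    gzip -dc "$1" | head -c "${2:-300}"
    echo
}
```

Call it with the path of any file listed by the ls command above. The helper only reads the file, so it is safe to use while the agent is running.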

When the maximum of 720 files is reached, the agent begins overwriting the oldest files to prevent disk space issues. This provides approximately 12 hours of offline storage at the default collection interval, which works out to one file per minute.
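The 12-hour figure follows from simple arithmetic: file limit × collection interval ÷ 3600. The sketch below assumes a 60-second interval, which is inferred from the 720-file and 12-hour figures rather than stated explicitly; spool_capacity_hours is a hypothetical helper.

```shell
# Hours of data the spool can hold: max_files * interval_seconds / 3600.
# Uses integer arithmetic, so the result is rounded down.
spool_capacity_hours() {
    # $1: maximum number of spool files; $2: collection interval in seconds
    echo $(( $1 * $2 / 3600 ))
}

spool_capacity_hours 720 60   # prints 12
```

If your agent collects at a different interval, substitute it as the second argument to estimate how much history the spool can hold.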

Replay Process

Once connectivity is restored, the agent begins replaying spooled data using a First-In-First-Out (FIFO) approach. This ensures that metric data is transmitted in chronological order, maintaining the integrity of your monitoring timeline.

During replay, the agent adds a special HTTP header to distinguish replayed data from live metrics:

X-Replay: true

This header allows the Server Scout platform to properly handle historical data and prevents any conflicts with real-time monitoring.
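The replay order can be previewed with a dry run like the one below. It is only a sketch: it assumes the timestamp prefixes are fixed-width, so that sorting by name matches chronological order, and it transmits nothing, because the upload endpoint and request format are not covered in this article. replay_dry_run is a hypothetical name.

```shell
# Dry run: list spooled files oldest-first and show what a replay would send.
# Assumption: file names start with a fixed-width, sortable timestamp.
replay_dry_run() (
    # $1: spool directory
    cd "$1" || exit 1
    for f in *; do
        [ -f "$f" ] || continue
        printf '%s\n' "$f"
    done | LC_ALL=C sort | while IFS= read -r f; do
        printf 'would send %s with header "X-Replay: true"\n' "$f"
    done
)
```

Running replay_dry_run /opt/scout-agent/spool/ prints one line per spooled file, with the oldest file first.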

Retry Logic and Backoff

The agent retries with exponential backoff to avoid overwhelming the API during recovery:

  1. Initial retry: 5 seconds
  2. Subsequent retries: Double the previous interval
  3. Maximum interval: 120 seconds

This approach ensures that temporary network hiccups are handled quickly, whilst persistent issues don't result in aggressive retry attempts that could impact system performance.
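To see what these rules produce in practice, the following sketch prints the resulting delay sequence. It illustrates the schedule described above and is not the agent's actual implementation; backoff_schedule is a hypothetical helper.

```shell
# Print the retry delays in seconds: start at 5, double each time, cap at 120.
backoff_schedule() (
    # $1: number of retries to print
    delay=5
    max=120
    i=0
    while [ "$i" -lt "$1" ]; do
        echo "$delay"
        delay=$(( delay * 2 ))
        if [ "$delay" -gt "$max" ]; then
            delay=$max
        fi
        i=$(( i + 1 ))
    done
)

backoff_schedule 7   # prints 5, 10, 20, 40, 80, 120, 120 (one per line)
```

The cap is reached on the sixth attempt, after which the agent would retry every 120 seconds until the API responds.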

Time-Based Data Management

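To keep replayed data relevant, the agent applies a 2-hour replay time limit. Any spooled data older than 2 hours is automatically discarded when connectivity is restored. Because this window is shorter than the roughly 12 hours the spool can hold, only the most recent 2 hours of data are replayed after a longer outage.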

This policy ensures that:

  • Only relevant historical data is transmitted
  • Long outages don't result in excessive replay traffic
  • System resources are used efficiently during recovery
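To estimate which files would fall outside the replay window, you can list files by age. This sketch uses each file's modification time as a stand-in for the payload timestamp, which is an assumption; the agent may instead rely on the timestamp in the file name. The helper only lists files and never deletes them. list_stale_spool_files is a hypothetical name, and -mmin is an extension supported by GNU and BSD find.

```shell
# List spooled files whose modification time is older than the replay limit.
list_stale_spool_files() {
    # $1: spool directory; $2: age limit in minutes (default 120)
    find "$1" -type f -mmin "+${2:-120}"
}
```

For example, list_stale_spool_files /opt/scout-agent/spool/ shows the files that are more than 2 hours old at the moment you run it.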

Monitoring Spool Health

You can monitor the spooling system's health through several approaches:

Check spool file count:

find /opt/scout-agent/spool/ -name "*.json.gz" | wc -l

View oldest spooled data:

ls -lt /opt/scout-agent/spool/ | tail -n 1

Monitor agent logs for replay activity:

tail -f /var/log/scout-agent/agent.log | grep -i replay
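The file count is easier to interpret relative to the 720-file limit. The sketch below reports both; spool_usage is a hypothetical helper, and the percentage is rounded down.

```shell
# Report how full the spool is relative to the file limit.
spool_usage() (
    # $1: spool directory; $2: file limit (default 720)
    limit=${2:-720}
    count=$(find "$1" -type f | wc -l | tr -d ' ')
    echo "$count/$limit files ($(( count * 100 / limit ))%)"
)
```

A value that stays close to 100% suggests the agent has been offline long enough for the oldest files to start being overwritten.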

Best Practices

To optimise the effectiveness of the spooling system:

  1. Ensure adequate disk space: Monitor the /opt/scout-agent/ partition to prevent storage issues
  2. Regular connectivity checks: Verify that firewall rules and network configurations don't interfere with API communication
  3. Log monitoring: Regularly review agent logs to identify patterns in connectivity issues
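For the disk space check, a helper like this one reports the free space on the filesystem that holds a given path. It is a minimal sketch using the portable output format of df; free_kb is a hypothetical name, and any alert threshold is yours to choose, since this article does not state how large spool files can get.

```shell
# Print the available space, in kilobytes, on the filesystem containing a path.
free_kb() {
    # $1: any path on the filesystem to check
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}
```

For example, free_kb /opt/scout-agent/ prints a single number that you can compare against a threshold in a cron job or monitoring check.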

Troubleshooting

If you notice issues with data spooling or replay:

  • Verify that the spool directory has appropriate permissions for the scout-agent user
  • Check available disk space in the /opt/scout-agent/ directory
  • Review network connectivity and firewall configurations
  • Examine agent logs for specific error messages during replay attempts
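The first check in the list can be scripted as follows. This sketch tests the directory against the user who runs it, so run it as the scout-agent user for a meaningful result; check_spool_dir is a hypothetical helper.

```shell
# Check that the spool directory exists and is writable by the current user.
check_spool_dir() {
    # $1: spool directory; returns 0 on success, 1 otherwise
    if [ ! -d "$1" ]; then
        echo "missing: $1"
        return 1
    fi
    if [ ! -w "$1" ] || [ ! -x "$1" ]; then
        echo "not writable: $1"
        return 1
    fi
    echo "ok: $1"
}
```

If you only need a quick answer from another account, 'sudo -u scout-agent test -w /opt/scout-agent/spool/ && echo ok' performs the same write check as the agent user.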

The spooling system operates transparently and requires no manual intervention under normal circumstances, so monitoring data continues to be captured during network disruptions.

Frequently Asked Questions

How does Server Scout agent data spooling work?

When the Server Scout agent cannot reach the API, it automatically switches to offline mode and stores metric data locally in the /opt/scout-agent/spool/ directory. Each file represents a single metric collection cycle, stored as a timestamped, compressed JSON payload. Once connectivity is restored, the agent replays this data in chronological order using a First-In-First-Out approach.

Where is the Server Scout agent spool directory located?

The spool directory is located at /opt/scout-agent/spool/. You can check its contents using 'ls -la /opt/scout-agent/spool/' to view stored metric files. The directory can hold a maximum of 720 files, providing approximately 12 hours of offline storage at the default collection interval.

How long does Server Scout keep spooled data before discarding it?

The Server Scout agent automatically discards spooled data that is older than 2 hours when connectivity is restored. This 2-hour replay time limit ensures only relevant historical data is transmitted and prevents excessive replay traffic during recovery from long outages.

What happens when the spool directory reaches maximum capacity?

When the maximum of 720 files is reached, the agent begins overwriting the oldest files to prevent disk space issues. This automatic rotation ensures the spool directory doesn't consume unlimited disk space while maintaining recent monitoring data during extended outages.

How do I check whether the Server Scout agent is replaying spooled data?

You can monitor replay activity by checking agent logs with 'tail -f /var/log/scout-agent/agent.log | grep -i replay'. During replay, the agent adds an 'X-Replay: true' HTTP header to distinguish replayed data from live metrics, allowing the platform to handle historical data appropriately.

How do I troubleshoot Server Scout agent spool directory permission issues?

If you're experiencing spooling issues, first verify that the spool directory has appropriate permissions for the scout-agent user. Also check available disk space in the /opt/scout-agent/ directory and review agent logs for specific error messages during replay attempts.

What retry logic does the Server Scout agent use for API connectivity?

The agent uses exponential backoff, starting with a 5-second initial retry and doubling the interval for each subsequent attempt, up to a maximum of 120 seconds. This handles temporary network issues quickly while avoiding aggressive retry attempts during persistent outages that could impact system performance.
