What is the OOM Killer?
The Out of Memory (OOM) killer is a Linux kernel mechanism that springs into action when your system runs critically low on available memory. When this happens, the kernel must make tough decisions about which processes to terminate to free up memory and prevent a complete system crash. These events, known as OOM kills, are serious indicators that your server is under severe memory pressure.
How Server Scout Monitors OOM Events
Server Scout's monitoring agent continuously tracks OOM killer activity by reading the oom_kill counter from /proc/vmstat (the kernel exposes this field on Linux 4.13 and later). This kernel-provided file keeps a running tally of how many processes the kernel has terminated due to memory exhaustion since the last system boot.
# View current OOM kill count manually
grep oom_kill /proc/vmstat
The agent samples this counter regularly and calculates the oomkillsdelta metric, which represents the number of new OOM kills since the last check. This delta calculation is crucial because it allows Server Scout to detect fresh OOM events rather than just reporting the cumulative count.
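The delta calculation can be sketched in a few lines of shell. This is an illustration only, not Server Scout's agent code: the function names read_oom_kills and oom_kills_delta, the SAMPLE_INTERVAL variable, and the alert message are all hypothetical. It also assumes that a counter value lower than the previous sample means the server rebooted.

```shell
#!/bin/sh
# Illustrative sketch only; not Server Scout's actual agent code.

# Print the oom_kill counter from a vmstat-style file (default: /proc/vmstat).
# Prints 0 when the field is absent.
read_oom_kills() {
    awk '$1 == "oom_kill" { count = $2 } END { print count + 0 }' "${1:-/proc/vmstat}"
}

# Print the number of new OOM kills between a previous and a current sample.
# A current value below the previous one means the counter restarted from
# zero (a reboot), so every kill in the current sample counts as new.
oom_kills_delta() {
    if [ "$2" -lt "$1" ]; then
        echo "$2"
    else
        echo $(( $2 - $1 ))
    fi
}

# Take two samples a short interval apart and report the difference.
# SAMPLE_INTERVAL is a hypothetical setting, in seconds.
if [ -r /proc/vmstat ]; then
    before=$(read_oom_kills)
    sleep "${SAMPLE_INTERVAL:-1}"
    after=$(read_oom_kills)
    delta=$(oom_kills_delta "$before" "$after")
    if [ "$delta" -gt 0 ]; then
        echo "ALERT: $delta new OOM kill(s)"
    else
        echo "no new OOM kills"
    fi
fi
```

Comparing two samples instead of reporting the raw counter is what keeps an old OOM kill from being reported again on every check.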
Default Alert Configuration
Server Scout's default alert condition for OOM killer detection is intentionally sensitive:
Alert fires when: oomkillsdelta > 0
This means any increase in the OOM kill counter triggers an immediate alert. The aggressive threshold is deliberate: even a single OOM kill means the kernel ran out of memory and had to terminate a process, which requires prompt attention.
The alert system monitors the delta value because:
- It identifies new incidents as they occur
- It avoids repeated alerts for historical OOM events
- It provides timely notifications when memory pressure escalates
Viewing OOM Events in Server Scout
When OOM kills occur, you can review the events through Server Scout's alert history:
- Navigate to your server's dashboard
- Click on the "Alerts" section
- Look for OOM killer alerts, which will show the timestamp and delta value
- Check the alert details to see how many processes were killed during each incident
The alert history reveals patterns: whether OOM kills are isolated incidents or recurring problems that point to systemic memory issues.
Understanding the Impact
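When the OOM killer activates, it selects a victim using a per-process score (visible in /proc/<pid>/oom_score). On current kernels the main factors are:
- Memory consumption: resident memory, swap usage, and page tables; the largest consumer scores highest
- The process's oom_score_adj setting, from -1000 (never kill) to 1000 (kill first), which administrators and service managers use to protect important processes
- Exemptions: kernel threads and init (PID 1) are never selected
Very old kernels also factored in how long a process had been running; current kernels no longer do.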
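To see how the kernel currently rates running processes, you can read the score files under /proc. The following is a minimal sketch: oom_score, oom_score_adj, and comm are standard procfs files, while the function name top_oom_candidates and its optional second argument (an alternative proc root, included only to make the function easy to try against a test directory) are hypothetical.

```shell
#!/bin/sh
# Print the processes with the highest OOM score, one per line, as
# SCORE, ADJ, PID, COMMAND separated by tabs.
# $1: number of rows to print (default 5)
# $2: proc root to scan (default /proc)
top_oom_candidates() {
    limit=${1:-5}
    root=${2:-/proc}
    for dir in "$root"/[0-9]*; do
        # Processes can exit while we scan, so skip anything unreadable.
        [ -r "$dir/oom_score" ] || continue
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        adj=$(cat "$dir/oom_score_adj" 2>/dev/null) || adj="?"
        name=$(cat "$dir/comm" 2>/dev/null) || name="?"
        printf '%s\t%s\t%s\t%s\n' "$score" "$adj" "${dir##*/}" "$name"
    done | sort -k1,1nr | head -n "$limit"
}

if [ -r /proc/self/oom_score ]; then
    printf 'SCORE\tADJ\tPID\tCOMMAND\n'
    top_oom_candidates 5
fi
```

The process at the top of this list is the one most likely to be chosen if the server runs out of memory right now.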
The kernel terminates victims with SIGKILL, so they stop immediately with no chance to shut down gracefully, potentially causing:
- Data loss in applications that haven't saved recent work
- Service interruptions
- Crash recovery on restart, or in the worst case corruption, if database processes are terminated mid-transaction
Preventing OOM Kills
1. Right-Size Your Memory
Analyse your server's memory usage patterns:
# Monitor memory usage over time
free -h
# Check which processes consume the most memory
ps aux --sort=-%mem | head -n 11
Consider upgrading RAM if your applications legitimately require more memory than available.
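For a single number to track over time, the kernel's own estimate of available memory is often more useful than the "free" column. The sketch below reads MemTotal and MemAvailable from /proc/meminfo (MemAvailable is present on Linux 3.14 and later). The function name mem_available_pct and the 10% warning threshold are hypothetical choices for illustration, not Server Scout defaults.

```shell
#!/bin/sh
# Print MemAvailable as a whole-number percentage of MemTotal.
# $1: meminfo-style file to read (default /proc/meminfo)
# Returns non-zero if MemTotal cannot be found.
mem_available_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END {
             if (total > 0) { printf "%d\n", avail * 100 / total } else { exit 1 }
         }' "${1:-/proc/meminfo}"
}

# Hypothetical threshold: warn when less than 10% of memory is available.
WARN_BELOW_PCT=10

if pct=$(mem_available_pct 2>/dev/null); then
    if [ "$pct" -lt "$WARN_BELOW_PCT" ]; then
        echo "warning: only ${pct}% of memory is available"
    else
        echo "ok: ${pct}% of memory is available"
    fi
fi
```

A value that trends steadily downward over days usually points to a leak; a value that is low but stable suggests the server is simply undersized for its workload.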
2. Configure Memory Limits
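Use systemd or cgroups to set memory limits for services, so that a runaway service is killed inside its own limit instead of exhausting memory for the whole server: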
# Example: Limit a service to 2GB RAM
sudo systemctl edit your-service
Add the following configuration:
[Service]
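# MemoryMax= is the cgroup v2 setting; on cgroup v1 hosts use the older, deprecated MemoryLimit=2G instead
MemoryMax=2G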
3. Add Swap Space
While not ideal for performance, swap can delay or prevent OOM kills by giving the kernel somewhere to move rarely used pages:
# Create a 2GB swap file
sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
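sudo swapon /swapfile
# Confirm the swap file is active
swapon --show
# To keep it after a reboot, add this line to /etc/fstab:
#   /swapfile none swap sw 0 0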
4. Optimise Applications
- Review application memory usage and fix memory leaks
- Tune database buffer pools and caches appropriately
- Consider more memory-efficient alternatives for resource-heavy applications
Best Practices
- Never ignore OOM kill alerts – they indicate serious system stress
- Investigate immediately – check system logs (journalctl -k | grep -i "killed process") to identify which processes were terminated
- Monitor trends – recurring OOM kills suggest the need for infrastructure changes
- Test thoroughly after implementing fixes to ensure stability
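When investigating, it helps to reduce the kernel log to just the victims. The sketch below assumes the common "Killed process <pid> (<name>)" message wording; the exact text differs between kernel versions, so adjust the pattern if nothing matches. The function name list_oom_victims is hypothetical.

```shell
#!/bin/sh
# Read kernel log lines on stdin and print "PID NAME" for each OOM kill.
# Assumes the "Killed process <pid> (<name>)" message wording.
list_oom_victims() {
    sed -n 's/.*[Kk]illed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}

# Scan kernel messages from the current boot, if journalctl is available.
if command -v journalctl >/dev/null 2>&1; then
    journalctl -k --no-pager 2>/dev/null | list_oom_victims
fi
```

If the same process name keeps appearing, start with that application's memory settings; if the victims vary, the server as a whole is likely short of memory.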
By staying vigilant about OOM killer activity through Server Scout's monitoring, you can maintain system stability and prevent unexpected service disruptions.