The previous administrator left three weeks ago. They took their passwords with them. The handover document is a single page listing IP addresses — no services, no dependencies, no deployment notes. Your new employer needs you to "figure out what's running" across 23 servers scattered between Dublin and Frankfurt.
This is server archaeology. You're not just auditing infrastructure — you're reconstructing the digital DNA of a business that forgot to document itself. Here's how to approach it systematically, without breaking anything important in the process.
The Network Reconnaissance Phase
Start with the network map. Your IP address list is the archaeological equivalent of knowing where to dig, but not what you'll find. Before touching any individual server, understand the topology.
Run `nmap -sS -O` against each subnet range (both the SYN scan and OS detection require root privileges). This reveals not just which hosts respond, but which operating systems they appear to be running and which ports they're listening on. A server showing SSH on port 22, HTTP on 80, and MySQL on 3306 tells a very different story from one running only SSH and a high-numbered port that might be a custom application.
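As a concrete starting point, the sweep can be captured in nmap's grepable format and condensed into a per-host port summary. This sketch parses a simplified, made-up sample of `-oG` output so the extraction logic is visible without a live scan; the IP addresses and services are illustrative.

```shell
#!/bin/sh
# Sketch: condense an nmap grepable report into "host: open ports" lines.
# The scan itself would be run with root privileges, e.g.:
#   nmap -sS -O -oG scan.gnmap 192.168.1.0/24
# A simplified captured sample is parsed here so the logic runs without
# a live network.

cat > scan.gnmap <<'EOF'
Host: 192.168.1.10 () Ports: 22/open/tcp//ssh///, 80/open/tcp//http///, 3306/open/tcp//mysql///
Host: 192.168.1.11 () Ports: 22/open/tcp//ssh///, 8445/open/tcp//unknown///
EOF

summary=$(awk '/Ports:/ {
    ip = $2                          # second field is the IP address
    sub(/.*Ports: /, "")             # keep only the port list
    n = split($0, p, /, */)
    ports = ""
    for (i = 1; i <= n; i++) {
        split(p[i], f, "/")          # entries look like port/state/proto/...
        if (f[2] == "open") ports = ports (ports ? "," : "") f[1]
    }
    print ip ": " ports
}' scan.gnmap)
printf '%s\n' "$summary"
```

Keeping the summary in one-line-per-host form makes it easy to diff against later scans as you learn the environment.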
Pay attention to the patterns. If servers 192.168.1.10 through 192.168.1.15 all show similar port configurations, they're likely part of a cluster or redundant setup. Document these groupings — they'll guide your monitoring priorities later.
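Those groupings can be derived mechanically: hosts exposing an identical set of open ports are candidates for the same cluster. A minimal sketch, using a hypothetical port summary as input:

```shell
#!/bin/sh
# Sketch: group hosts that expose an identical set of open ports.
# Input format is the "ip: port,port" summary from a scan; the sample
# below is invented so the grouping logic runs stand-alone.

cat > port-summary.txt <<'EOF'
192.168.1.10: 22,80,3306
192.168.1.11: 22,80,3306
192.168.1.12: 22,80,3306
192.168.1.20: 22,8445
EOF

# Collect hosts keyed by their exact port list, then print each group.
groups=$(awk -F': ' '{ group[$2] = group[$2] (group[$2] ? " " : "") $1 }
    END { for (g in group) print g " -> " group[g] }' port-summary.txt | sort)
printf '%s\n' "$groups"
```

Hosts sharing a fingerprint are only *candidates* for a cluster; confirm with configuration and traffic before treating them as redundant.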
Service Discovery and Dependency Mapping
Once you have network access to a server, your first stop is `netstat -tulpn` (or `ss -tulpn` on newer systems). This single command reveals the listening sockets and the processes behind them. A server might look quiet from the outside but be running internal services that other systems depend on.
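To turn that output into something you can file, reduce it to a port-to-process inventory. The sketch below parses a trimmed, made-up sample of `ss -tulpn` output; on a live server you would pipe the real command into the same `awk` program.

```shell
#!/bin/sh
# Sketch: reduce `ss -tulpn` output to a "port process" inventory.
# Live usage would be:  ss -tulpn | awk '...'
# A trimmed, invented sample is parsed here for reproducibility.

cat > ss.txt <<'EOF'
Netid State  Recv-Q Send-Q Local Address:Port Peer Address:Port Process
tcp   LISTEN 0      128    0.0.0.0:22        0.0.0.0:*         users:(("sshd",pid=812,fd=3))
tcp   LISTEN 0      80     127.0.0.1:3306    0.0.0.0:*         users:(("mysqld",pid=1201,fd=21))
tcp   LISTEN 0      511    0.0.0.0:8445      0.0.0.0:*         users:(("processor",pid=990,fd=7))
EOF

inventory=$(awk 'NR > 1 {
    n = split($5, a, ":"); port = a[n]       # local address:port -> port
    match($NF, /"[^"]+"/)                    # first quoted program name
    proc = substr($NF, RSTART + 1, RLENGTH - 2)
    print port, proc
}' ss.txt | sort -n)
printf '%s\n' "$inventory"
```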
Look for unexpected processes. That `/opt/legacy-billing/bin/processor` listening on port 8445? It's probably business-critical even though it's not in any documentation. Note the process IDs and use `lsof -p [PID]` to see what files and network connections each process has open. This reveals dependencies that won't show up in standard configuration files.
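The same idea works against `lsof` output. In this sketch, made-up sample lines stand in for a live process, and only the fields the script actually uses are kept:

```shell
#!/bin/sh
# Sketch: summarise what one process touches, from `lsof` output.
# Live usage would be:  lsof -nP -p "$pid"
# An invented, trimmed sample stands in here.

cat > lsof.sample <<'EOF'
COMMAND   PID USER  FD   TYPE DEVICE SIZE/OFF NODE NAME
processor 990 app  txt   REG  259,2   118224  131 /opt/legacy-billing/bin/processor
processor 990 app    4u  REG  259,2  5242880  201 /var/lib/billing/queue.db
processor 990 app    7u IPv4  31337      0t0  TCP 10.0.4.17:3306 (ESTABLISHED)
EOF

# Network connections reveal dependencies; regular files reveal state.
conns=$(awk '$5 == "IPv4" || $5 == "IPv6" {print $(NF-1), $NF}' lsof.sample)
files=$(awk 'NR > 1 && $5 == "REG" {print $NF}' lsof.sample)
printf 'connections:\n%s\nfiles:\n%s\n' "$conns" "$files"
```

Here the established connection to a database on another host is exactly the kind of undocumented dependency the audit is hunting for.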
Check `systemctl list-units --type=service --state=running` to understand what the system thinks should be running. Cross-reference this with your `netstat` output. Services that should be running but aren't listening on expected ports often indicate configuration problems waiting to cause outages.
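The cross-reference itself is a set difference: services reported as running minus processes actually listening. A minimal sketch with `comm`, assuming the two name lists have already been collected and normalised (service names rarely match process names exactly, so expect to clean them up first):

```shell
#!/bin/sh
# Sketch: flag services systemd reports as running that have no listening
# socket. The two lists would come from systemctl and ss respectively;
# invented, pre-normalised samples stand in here. Inputs must be sorted
# for comm(1).

cat > running-services.txt <<'EOF'
mysqld
nginx
sshd
EOF

cat > listening-procs.txt <<'EOF'
mysqld
sshd
EOF

# comm -23: lines only in the first file, i.e. running but not listening.
silent=$(comm -23 running-services.txt listening-procs.txt)
printf 'running but not listening:\n%s\n' "$silent"   # prints: nginx
```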
Building Your Baseline Database
Create a simple spreadsheet or database tracking each server's essential characteristics: hostname, IP, operating system version, primary services, unusual processes, disk usage patterns, and memory allocation. This becomes your infrastructure Bible.
For each service, document its apparent purpose, configuration file locations, log file paths, and any obvious dependencies. A web server connecting to a database on another specific IP address creates a dependency chain you'll need to monitor.
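If you prefer a flat file over a spreadsheet, the baseline can start as a plain CSV built up with one helper call per server. The hostnames, addresses, and notes below are illustrative, not from a real audit:

```shell
#!/bin/sh
# Sketch: a flat CSV baseline, one row per server. Field choices follow
# the checklist above; all values are invented examples.

baseline=servers.csv
echo 'hostname,ip,os,primary_services,unusual_processes,notes' > "$baseline"

add_server() {
    # Keep fields comma-free (services are ;-separated) so the CSV stays simple.
    printf '%s,%s,%s,%s,%s,%s\n' "$1" "$2" "$3" "$4" "$5" "$6" >> "$baseline"
}

add_server web01 192.168.1.10 'Ubuntu 22.04' 'nginx;mysql-client' '' 'talks to db01'
add_server db01  192.168.1.20 'Debian 12'    'mysqld'             'processor:8445' 'legacy billing?'

cat "$baseline"
```

A file like this is trivially diffable and greppable, which matters more in week one than a polished wiki.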
Don't aim for perfection in the first pass. Your goal is building a working map of critical services, not comprehensive documentation. That comes later, as you understand how the pieces fit together.
Establishing Monitoring Coverage from Scratch
With your service map in hand, you can finally deploy monitoring intelligently. Rather than applying generic templates, prioritise based on what you've discovered. That database server handling connections from five web servers needs different alert thresholds than the standalone backup system that only runs nightly scripts.
Start with your most critical servers — typically database servers, load balancers, and anything handling customer-facing traffic. A lightweight agent installation means you can deploy monitoring without worrying about resource impact during your investigation phase.
Configure initial alerts conservatively. You're still learning the system's normal behaviour patterns. Better to receive fewer alerts initially and tune them based on observed baselines than to trigger false alarms during your first week.
Grouping Servers by Function
Use the dependency patterns you discovered to organise servers into logical groups. Web servers that all connect to the same database belong in one group. Backup systems that only communicate with storage belong in another. This grouping strategy pays dividends when you need to understand the blast radius of potential failures.
Common Gotchas in Inherited Infrastructure
Every inherited server environment has surprises. Here are the patterns that catch people off-guard:
Shadow IT services running on "spare" servers. That development box might be handling production log aggregation because someone needed a quick solution six months ago. Check cron jobs with `crontab -l` for all users, not just root.
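Enumerating who could own a crontab is the first half of that check. This sketch pulls accounts with a real login shell from a made-up passwd file and prints the audit commands rather than running them; on a live host you would read `/etc/passwd` and execute each `crontab` call:

```shell
#!/bin/sh
# Sketch: list every account that could own a crontab. On a live host,
# read /etc/passwd and run `crontab -u "$user" -l` for each candidate.
# An invented sample file is used here so the loop runs anywhere.

cat > passwd.sample <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
deploy:x:1001:1001::/home/deploy:/bin/bash
billing:x:1002:1002::/opt/legacy-billing:/bin/sh
EOF

# Accounts with a real login shell are the likely crontab owners.
candidates=$(awk -F: '$7 !~ /(nologin|false)$/ {print $1}' passwd.sample)

for user in $candidates; do
    echo "would run: crontab -u $user -l"
done
```

Don't skip system accounts entirely on a real audit; some distributions allow crontabs for accounts without login shells, so a second pass over `/var/spool/cron` is worthwhile.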
Hardcoded IP addresses in application configurations. When you find database connection strings in application configs, trace where those IP addresses actually point. The "database server" might have been moved, with traffic forwarded through `iptables` rules that aren't documented anywhere.
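A recursive grep for IPv4-looking strings surfaces most of these. The sample config tree here is fabricated; point the same pattern at `/etc` or `/opt` on a real server:

```shell
#!/bin/sh
# Sketch: hunt for hardcoded IPv4 addresses in application configs.
# The sample tree below is invented; on a live host, grep /etc or /opt.

mkdir -p app/config
cat > app/config/database.ini <<'EOF'
[db]
host = 10.0.4.17
port = 3306
EOF

# -r recurse, -n line numbers, -o print only the match, -E the IPv4-ish pattern.
hits=$(grep -rnoE '([0-9]{1,3}\.){3}[0-9]{1,3}' app/)
printf '%s\n' "$hits"

# Each hit is a lead, not an answer: confirm where the address really
# terminates, including undocumented forwarding:  iptables -t nat -L -n
```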
Legacy SSL certificates installed in non-standard locations. Check `/opt`, `/usr/local`, and home directories for certificate files. That e-commerce site might depend on certificates installed in the application directory rather than system-standard locations.
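A `find` sweep over those trees catches most stray certificates. The sample directory and filenames below are invented for illustration:

```shell
#!/bin/sh
# Sketch: sweep non-standard trees for certificate files. The sample
# directory mimics an app shipping its own certs; on a live server you
# would scan /opt, /usr/local and /home instead.

mkdir -p shop/ssl
: > shop/ssl/site.crt
: > shop/ssl/site.key

certs=$(find shop -type f \( -name '*.crt' -o -name '*.pem' -o -name '*.key' \))
printf '%s\n' "$certs"

# For each certificate found, check its expiry before it surprises you:
#   openssl x509 -enddate -noout -in "$cert"
```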
Security and Access Control Audits
Review SSH `authorized_keys` files across all servers. Previous administrators often leave access keys that no longer correspond to current staff. Similarly, check for shared service accounts that might have overly broad permissions.
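One quick pass is to extract the comment field (conventionally `user@host`) from every key, so names that match no current staff member stand out. The sample key lines are fabricated and truncated:

```shell
#!/bin/sh
# Sketch: pull the comment field from each entry in an authorized_keys
# file so stale owners stand out. On a live host, loop over
# /home/*/.ssh/authorized_keys and /root/.ssh/authorized_keys.
# Sample keys below are fabricated and truncated.

cat > authorized_keys.sample <<'EOF'
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAkey1 alice@laptop
ssh-rsa AAAAB3NzaC1yc2EAAAkey2 old-admin@workstation
EOF

# The comment is the last whitespace-separated field of each key line.
owners=$(awk 'NF >= 3 {print $NF}' authorized_keys.sample)
printf '%s\n' "$owners"
```

Comments are optional and unverified, so treat a missing or plausible-looking comment as "unknown", not "safe"; the key fingerprint is the real identity.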
Look for services running as root that shouldn't be. This often indicates quick fixes that became permanent solutions. Document these for future security hardening, but don't change them during your initial audit phase.
Creating Documentation Templates for Future Handoffs
As you build your understanding of the infrastructure, create the documentation templates that should have existed. For each server, document:
- Primary business function
- Key services and their startup procedures
- Dependencies on other servers
- Backup and recovery procedures
- Emergency contact information for application owners
This isn't just for your successor — it's for the 3AM version of yourself who needs to troubleshoot a service outage. For comprehensive guidance on building sustainable documentation practices, see Essential Monitoring Handoff Framework: Step-by-Step Documentation That Survives Team Changes.
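The checklist above can be stamped out as a runbook skeleton per server, so every machine gets the same headings. The output filename is an assumption; adapt it to wherever your documentation lives:

```shell
#!/bin/sh
# Sketch: generate a per-server runbook skeleton with the headings from
# the checklist above. "runbook-$host.md" is an illustrative path.

make_runbook() {
    host=$1
    cat > "runbook-$host.md" <<EOF
# $host

## Primary business function
TODO

## Key services and their startup procedures
TODO

## Dependencies on other servers
TODO

## Backup and recovery procedures
TODO

## Emergency contact information for application owners
TODO
EOF
}

make_runbook db01
head -n 4 "runbook-db01.md"
```

A skeleton full of TODOs is honest documentation: it records what you don't know yet, which is exactly what the next administrator needs to see.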
From Discovery to Sustainable Operations
Server archaeology is temporary work. Your goal is transforming mysterious inherited infrastructure into monitored, documented systems that your team can maintain confidently.
Once you understand what's running and how the pieces connect, you can implement proper monitoring coverage with appropriate alert thresholds. The [pricing page](/pricing.html) shows how comprehensive monitoring fits into most infrastructure budgets, especially compared to the cost of unplanned outages in undocumented systems.
The servers might have been inherited, but the monitoring culture you build around them can prevent the next person from facing the same detective work you've just completed.
FAQ
How long should the initial server audit phase take?
Plan for 2-3 days per server group if you're working methodically. Critical production systems need more attention than backup or development servers. Don't rush — mistakes during the audit phase create ongoing operational problems.
Should I change configurations during the audit process?
Avoid configuration changes until you understand dependencies. Document what should be changed, but implement fixes only after you have comprehensive monitoring in place to detect any unintended consequences.
What if I find servers that seem to serve no purpose?
Leave them running initially and monitor their network connections over several weeks. Some systems only activate during specific business cycles (month-end processing, quarterly reports) and shutting them down prematurely can cause business disruption.