Monitoring Tools Ranking for VPS
Ranking methodology
Scores (1–100) are based on:
- Daily usefulness: how often it solves real problems quickly
- Clarity under pressure: readability and ease of interaction
- Coverage: CPU, memory/swap, disk I/O, network, process-level drill-down
- Operational safety: low overhead and low risk of accidental disruption
- Portability: availability across common VPS distributions
Ranked list (WordPress VPS daily operations)
| Rank | Tool | Score | Best for | Notes |
|---|---|---|---|---|
| 1 | htop | 98 | Fast process triage | Best “first screen” for CPU/RAM and process actions |
| 2 | btop | 94 | Correlating CPU/RAM/I/O/Network visually | Excellent when the bottleneck is unclear |
| 3 | top | 92 | Emergency and minimal systems | Always available; essential fallback |
| 4 | atop | 90 | Historical evidence and after-the-fact debugging | Strong when you missed the spike; stores samples |
| 5 | glances | 88 | One-screen summary dashboard | Useful for broad overview; may add dependencies |
| 6 | iotop | 87 | Disk I/O by process | Critical for DB flushes, backups, slow storage |
| 7 | iftop | 85 | Network bandwidth by host/flow | Strong for bot spikes and unexpected egress |
| 8 | nethogs | 83 | Network bandwidth by process | Useful when egress is high and you need the process |
| 9 | iostat (sysstat) | 82 | Disk utilization/latency snapshots | Confirms I/O bottlenecks quickly |
| 10 | pidstat (sysstat) | 80 | Per-process CPU/I/O over time | Good for short sampling runs during incidents |
| 11 | vmstat | 78 | Memory pressure and run queue snapshots | Lightweight, reliable signal tool |
| 12 | mpstat (sysstat) | 76 | Per-core CPU breakdown | Confirms single-core saturation or imbalance |
| 13 | ss | 75 | Socket and listener inspection | Replaces most netstat use cases |
| 14 | ps | 74 | Scriptable process inspection | Strong paired with sorting and grep |
| 15 | free | 72 | Quick memory/swap snapshot | Good as a fast check, not a full diagnostic |
| 16 | uptime | 70 | Instant load average check | Useful for a quick triage signal |
| 17 | watch | 68 | Repeating a command live | Best as a wrapper for other tools |
glances is valuable, but its installation method and dependencies vary (distribution package vs. Python install). For daily VPS operations, prioritize tools that install cleanly and behave consistently in your environment.
Recommended top 5 to install on a daily WordPress VPS
These are chosen to cover the four primary bottleneck domains (CPU, memory/swap, disk I/O, network) with minimal overlap:
| Tool | Why it makes the top 5 | Primary questions it answers |
|---|---|---|
| htop | Fastest interactive “what’s burning right now” | Which processes are consuming CPU/RAM right now? |
| btop | Best cross-panel correlation | Is this CPU, memory, disk, or network causing the slowdown? |
| iotop | Disk I/O visibility by process | Is MariaDB, backup, or logging saturating disk? |
| iftop | Network visibility by host/flow | Is traffic spiking? Who is talking to my server? |
| atop | Historical capture for missed incidents | What happened 10 minutes ago when the site slowed down? |
If you must reduce to fewer than 5 tools, keep top and add only one interactive tool (htop or btop) plus one domain-specific tool (iotop or iftop) depending on your most common incidents.
Installation (Ubuntu/Debian daily stack)
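Safe, repo-based install (btop is packaged from Ubuntu 22.04 and Debian 12 onward; on older releases, drop it from the command):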
sudo apt update
sudo apt install -y htop btop iotop iftop atop sysstat
This installs:
- Primary interactive monitors: htop, btop
- Disk/network tools: iotop, iftop
- Historical monitor: atop
- sysstat utilities: iostat, pidstat, mpstat (and others)
Optional additions (only if you need them):
- nethogs (network by process)
- glances (dashboard view)
sudo apt install -y nethogs glances
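Before relying on this stack during an incident, it can help to confirm which tools are actually on the PATH. The following is a minimal sketch, assuming a POSIX shell; the function name `missing_tools` is hypothetical and not part of any package.

```shell
# Print each named tool that is NOT found on PATH, one per line.
# missing_tools is a hypothetical helper name used for illustration.
missing_tools() {
    for tool in "$@"; do
        # command -v exits non-zero when the command cannot be found
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the daily stack; no output means everything is installed.
missing_tools htop btop iotop iftop atop iostat pidstat mpstat
```

Any name it prints can be passed back to the apt install command above (the sysstat utilities all come from the sysstat package).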
Daily workflows (fast playbooks)
Workflow 1: Site is slow right now
- Identify culprit processes:
htop
- If it’s not obvious, correlate across domains:
btop
- If load is high but CPU is not maxed, check disk I/O:
sudo iotop -oP
- If traffic is suspicious, check network flows:
sudo iftop -nP
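The "load is high but CPU is not maxed" step can be sketched as a small decision helper. This is an illustration only: `next_tool` is a hypothetical name, and the 80% busy threshold and the "load above core count" rule are assumptions to tune for your own server.

```shell
# Suggest the next tool from the 1-minute load average, the core count,
# and the CPU busy percentage. next_tool is a hypothetical helper and
# the thresholds are illustrative assumptions, not fixed rules.
next_tool() {
    load=$1 cores=$2 cpu_busy=$3
    awk -v l="$load" -v c="$cores" -v b="$cpu_busy" 'BEGIN {
        # Load above core count while CPU is not saturated points at I/O wait.
        if (l > c && b < 80) print "iotop"
        else                 print "htop"
    }'
}

# Example with fixed values: load 6.2 on 4 cores, CPU 35% busy.
next_tool 6.2 4 35
```

On a live system the first two arguments could come from /proc/loadavg and nproc, and the busy percentage from the non-idle CPU share reported by top or mpstat.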
Workflow 2: Spike already happened
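Use historical evidence. Running atop with -r replays today's recorded samples, whereas plain atop shows only live data:
sudo atop -r
Then:
- Step through the CPU/memory/disk/network samples inside atop (t moves forward one sample, T moves back)
- Correlate the time of the incident with service logs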
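atop keeps its history in daily log files written by its background service, and `atop -r FILE` replays a specific one. The sketch below builds the file path for a given day, assuming the Debian/Ubuntu default directory /var/log/atop and the atop_YYYYMMDD naming; `atop_log_for` is a hypothetical helper name.

```shell
# Build the path of the atop daily log for a date given as YYYY-MM-DD.
# Assumes the default /var/log/atop directory; adjust if your package differs.
atop_log_for() {
    printf '/var/log/atop/atop_%s\n' "$(printf '%s' "$1" | sed 's/-//g')"
}

# Print the replay command for an incident on 2024-05-01 (example date).
echo "sudo atop -r $(atop_log_for 2024-05-01)"
```

How many days of logs are retained depends on the atop service configuration, so older incidents may no longer have a file to replay.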
Tool selection by symptom
| Symptom | Start with | Then verify with |
|---|---|---|
| High CPU / load | htop | top, pidstat, mpstat |
| Swap growing / OOM risk | btop / htop | vmstat, free |
| High load but CPU moderate | btop | iostat, iotop |
| Slow database / heavy writes | iotop | iostat, service logs |
| Suspected bot burst | iftop | ss, web access logs |
| Incident already passed | atop | journalctl, application logs |
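For the swap row, a scripted check can complement free and vmstat when you want a single number to log or alert on. A minimal sketch, assuming Linux's /proc/meminfo format with its SwapTotal and SwapFree fields; `swap_used_kb` is a hypothetical helper name.

```shell
# Print swap in use (kB) from /proc/meminfo-formatted text on stdin.
# swap_used_kb is a hypothetical helper name used for illustration.
swap_used_kb() {
    awk '/^SwapTotal:/ {total = $2}
         /^SwapFree:/  {free = $2}
         END           {print total - free}'
}

# Example with sample input; on a live system use: swap_used_kb < /proc/meminfo
printf 'SwapTotal: 2097148 kB\nSwapFree: 2000000 kB\n' | swap_used_kb
```

Running it a few times a minute apart shows whether swap use is growing or merely non-zero, which is the distinction that matters for OOM risk.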
Operational safety notes
Interactive monitors allow killing processes, but terminating the wrong PID can cause downtime or data loss. Prefer read-only diagnosis first, then use controlled service management (systemctl) where appropriate.
Avoid killing mariadbd/mysqld during normal operations. Prefer controlled restarts during maintenance windows after confirming backups and understanding impact.
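One way to make the "diagnose first" habit harder to skip is a small guard that refuses database process names outright. This is a sketch only: `kill_guard` is a hypothetical helper, and the list of protected names is an assumption to extend for your own stack.

```shell
# Refuse to proceed for database daemons; suggest read-only inspection otherwise.
# kill_guard is a hypothetical helper; the protected names are assumptions.
kill_guard() {
    case "$1" in
        mariadbd|mysqld)
            echo "refuse: manage the database through systemctl instead"
            return 1 ;;
        *)
            echo "inspect first: ps -o pid,ppid,etime,args -C $1" ;;
    esac
}
```

For example, `kill_guard mariadbd` returns a non-zero status, while `kill_guard php-fpm` prints a read-only ps command to run before taking any action.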
Quick reference cheat sheet
| Goal | Command |
|---|---|
| Process triage | htop |
| Visual multi-domain triage | btop |
| Minimal fallback monitor | top |
| Disk I/O by process | sudo iotop -oP |
| Network flows | sudo iftop -nP |
| Network by process | sudo nethogs |
| Historical view | sudo atop -r |
| Disk stats snapshot | iostat -xz 1 3 |
| Per-process sampling | pidstat -dur 1 5 |
| Per-core CPU sampling | mpstat -P ALL 1 3 |
| Memory/run queue snapshot | vmstat 1 5 |
| Sockets/listeners | ss -tulpen |