Performance Foundation

Performance troubleshooting is easiest when you follow the same order every time: capture a snapshot, identify the bottleneck class (CPU, memory, disk I/O, network, or a specific process), and validate the fix with the same measurements.

Quick Summary
  • Collect a snapshot: load, CPU, memory, disk space, I/O wait, top processes.
  • Decide the bottleneck class before you change anything.
  • Prefer safe actions first: restoring backups into a staging directory, restarting services during maintenance windows, rate-limiting cron jobs.
  • Document what you changed and confirm the symptom improves.

What "slow" means (in practice)

On a WordPress VPS (or any Linux server), "slow" usually appears as one (or more) of these:

  • Requests queue up (PHP-FPM workers saturated, MySQL slow, disk I/O wait).
  • The server becomes unresponsive (memory pressure, swap thrash, runaway process).
  • Storage fills up (writes block, services crash, MySQL stops).
  • Latency spikes intermittently (cron bursts, backups, log rotation, noisy neighbors).

Your goal is to map the symptom to a measurable signal.

The 10-minute triage checklist

Run this checklist before you restart services or kill processes.

triage-10min-checklist.sh
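# No -e here: a single failing command (for example df on a stale mount)
# should not abort the rest of the snapshot.
set -u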

echo "==== time ===="
date -Is

echo "==== uptime / load ===="
uptime

echo "==== memory / swap ===="
free -h

echo "==== disk space / inodes ===="
df -hT
df -ih

echo "==== top cpu processes ===="
ps -eo pid,ppid,user,stat,pcpu,pmem,etime,cmd --sort=-pcpu | head -n 20

echo "==== top memory processes ===="
ps -eo pid,ppid,user,stat,pcpu,pmem,etime,cmd --sort=-pmem | head -n 20

echo "==== cpu / i/o wait quick view ===="
# The first vmstat row is an average since boot; read the later rows.
vmstat 1 5

What to look for:

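  • High load with low CPU utilization often implies I/O wait or blocked processes. On Linux the load average counts processes in uninterruptible sleep as well as runnable ones, so check the b and wa columns of vmstat.
  • Low free memory is not automatically bad, because Linux uses spare RAM for the page cache. Sustained swap-in/swap-out (nonzero si and so in vmstat) is.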
  • A full filesystem (or inode exhaustion) can break services in surprising ways.
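
To make the swap and I/O wait signals easier to compare between runs, the vmstat output can be reduced to a few averages. The following is a minimal sketch rather than part of the checklist itself: summarize_vmstat is a hypothetical helper name, and it assumes the usual procps vmstat layout with a header row that contains the si, so, and wa columns. It skips the first sample, which vmstat reports as an average since boot.

```shell
# summarize_vmstat: read `vmstat` output on stdin and print the average
# swap-in (si), swap-out (so), and I/O wait (wa) values.
# Column positions are taken from the header row instead of being hard-coded.
summarize_vmstat() {
  awk '
    # Skip lines until the header row that names si, so, and wa is found.
    !found {
      split("", col)
      for (i = 1; i <= NF; i++) col[$i] = i
      if (("si" in col) && ("so" in col) && ("wa" in col)) found = 1
      next
    }
    # The first data row is the since-boot average; ignore it.
    ++row == 1 { next }
    {
      si += $(col["si"]); so += $(col["so"]); wa += $(col["wa"]); n++
    }
    END {
      if (n == 0) { print "no samples"; exit 1 }
      printf "samples=%d avg_si=%.1f avg_so=%.1f avg_wa=%.1f\n", n, si / n, so / n, wa / n
    }
  '
}
```

A typical call is `vmstat 1 5 | summarize_vmstat`. No threshold is built in on purpose: what counts as "high" I/O wait depends on the workload, so compare the numbers against a snapshot taken when the server was healthy.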

Install a minimal toolbelt

If the box is missing common tools, install them first. Keep the toolbelt small and predictable.

install-performance-tools-apt.sh
sudo apt update
sudo apt install -y htop iotop sysstat psmisc lsof ncdu
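
Before installing anything, it can help to see which tools are actually missing. This is a small sketch, assuming a POSIX shell; missing_tools is a hypothetical helper name. The command list reflects what the packages above provide: iostat and pidstat ship with sysstat, and killall ships with psmisc.

```shell
# missing_tools NAME...: print each named command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the commands provided by the toolbelt packages.
missing_tools htop iotop iostat pidstat killall lsof ncdu
```

Empty output means every listed command is available and the install step can be skipped.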

A repeatable workflow

Use this sequence as a default. It prevents random guessing.

  1. Capture a snapshot (commands above).

  2. Pick one bottleneck class:

     • CPU bound: one or more processes consume most CPU.
     • Memory bound: OOM kills, swap thrash, high major page faults.
     • Disk space bound: filesystem near 100% or inodes exhausted.
     • Disk I/O bound: high I/O wait, slow queries, blocked writes.
     • Network bound: packet loss, saturation, DNS issues.

  3. Validate with a focused tool:

     • CPU/process: top, htop, ps, pidstat.
     • Memory: free -h, vmstat, dmesg (OOM), MySQL buffers.
     • Disk space: df -hT, ncdu.
     • Disk I/O: iostat -x, iotop, pidstat -d.
     • Services: systemctl status, journalctl -u.

  4. Make the smallest safe change.

  5. Re-measure using the same commands.
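
Re-measuring is easier when each snapshot is saved to a file, so the before and after states can be compared directly. The sketch below is one possible way to do that, not a prescribed tool: run_section and capture_snapshot are hypothetical helper names, and the directory in the usage comment is only an example path. Commands that are missing or fail are noted in the file instead of stopping the capture.

```shell
# run_section TITLE COMMAND [ARGS...]: print a header, then the command output.
# A missing or failing command is noted instead of aborting the snapshot.
run_section() {
  title=$1
  shift
  echo "==== $title ===="
  if command -v "$1" >/dev/null 2>&1; then
    "$@" || echo "(command failed: $*)"
  else
    echo "(skipped: $1 not found)"
  fi
}

# capture_snapshot DIR LABEL: write DIR/LABEL-<UTC timestamp>.txt, print its path.
capture_snapshot() {
  snap_dir=$1
  snap_label=$2
  mkdir -p "$snap_dir" || return 1
  snap_file="$snap_dir/$snap_label-$(date -u +%Y%m%dT%H%M%SZ).txt"
  {
    run_section "time" date -u
    run_section "uptime / load" uptime
    run_section "memory" free -h
    run_section "disk space" df -hT
    run_section "inodes" df -ih
    # head keeps the section title, the ps header, and the top 19 processes.
    run_section "top cpu processes" \
      ps -eo pid,user,stat,pcpu,pmem,etime,cmd --sort=-pcpu | head -n 21
    run_section "top memory processes" \
      ps -eo pid,user,stat,pcpu,pmem,etime,cmd --sort=-pmem | head -n 21
    run_section "vmstat" vmstat 1 5
  } >"$snap_file" 2>&1
  printf '%s\n' "$snap_file"
}

# Example usage (the directory is an assumption, pick any writable path):
#   before=$(capture_snapshot /var/tmp/perf-snapshots before)
#   ... make the smallest safe change ...
#   after=$(capture_snapshot /var/tmp/perf-snapshots after)
#   diff "$before" "$after"
```

Keeping the command list identical for both captures is the point: a difference in the files then reflects a change on the server, not a change in how it was measured.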

Safety defaults

These defaults keep you out of trouble.

  • Always extract backups into a staging directory first.
  • Prefer SIGTERM over SIGKILL when stopping processes.
  • Restart services during a maintenance window when possible.
  • Never change kernel/sysctl values "to see if it helps" without a rollback plan.

Warning: Commands like kill -9, aggressive cache clearing, and changing sysctl can cause data loss or downtime. Use them only after you have identified the actual bottleneck.
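
The "SIGTERM before SIGKILL" default can be sketched as a small helper. This is an illustration under assumptions, not a required tool: stop_gracefully is a hypothetical name, and the 10-second default grace period is an arbitrary example value that should be adjusted to how long the process needs to shut down cleanly.

```shell
# stop_gracefully PID [GRACE_SECONDS]
# Send SIGTERM, wait up to GRACE_SECONDS for the process to exit,
# and send SIGKILL only if it is still running after that.
stop_gracefully() {
  pid=$1
  grace=${2:-10}

  # If SIGTERM cannot be delivered, the process is already gone
  # (or belongs to another user); there is nothing more to do here.
  kill -TERM "$pid" 2>/dev/null || return 0

  i=0
  while [ "$i" -lt "$grace" ]; do
    # kill -0 sends no signal; it only tests whether the PID still exists.
    kill -0 "$pid" 2>/dev/null || return 0
    sleep 1
    i=$((i + 1))
  done

  if kill -0 "$pid" 2>/dev/null; then
    echo "PID $pid still running after ${grace}s; sending SIGKILL" >&2
    kill -KILL "$pid" 2>/dev/null
  fi
}
```

For example, `stop_gracefully 12345 30` gives process 12345 up to 30 seconds to exit after SIGTERM. For anything managed by systemd, `systemctl stop` is usually the better choice, since the unit defines its own stop behavior and timeout.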

Where to go next

  • Process-level investigation: see [Process control](./process-control).
  • Disk I/O diagnosis: see [Disk I/O troubleshooting](./disk-io-troubleshooting).
  • Disk usage and cleanup: see [Disk usage insights](./disk-usage-insights).
  • Historical evidence: see [Historical performance stats](./historical-performance-stats).
  • Service health: see [Service monitoring](./service-monitoring) and [Monitoring health](./monitoring-health).