“If you can’t see the path, you can’t fix the problem.” That rings true for network ops — and mtr is one of the shortest routes from mystery to answer. I remember running mtr during a production incident and spotting a single hop that doubled latency; we fixed routing the same day.
What is mtr and why use it?
mtr (originally “My Traceroute”) is a command-line diagnostic tool that combines the features of traceroute and ping to show path and latency over time. Think of it as a live traceroute that updates as it runs. Unlike a single traceroute snapshot, mtr continually samples each hop, giving both per-hop packet loss and latency trends. That dual view is why engineers prefer mtr when troubleshooting intermittent or persistent path problems.
Quick definition
mtr is a real-time network diagnostic tool that sends ICMP/UDP/TCP probes along the path to a destination, updating per-hop latency and packet-loss statistics continuously so operators can identify where packets are delayed or dropped.
Basic question: How does mtr differ from traceroute and ping?
Short answer: mtr merges traceroute’s path discovery with ping’s repeated sampling. Traceroute shows the path once and exits. Ping samples a single endpoint repeatedly but doesn’t show intermediate hops. mtr repeatedly probes each hop, giving both a live path map and time-series metrics for each hop.
How to install and run mtr (quick start)
On Debian/Ubuntu: sudo apt install mtr. On Fedora/CentOS: sudo dnf install mtr or yum install mtr. macOS users can use Homebrew: brew install mtr. Windows users can use the WinMTR GUI (WinMTR on GitHub) or the WSL Linux build.
Run a basic test: mtr --report example.com. The --report flag runs a finite probe sequence and prints a summary; omitting it gives an interactive, continuously updating view.
Example: interactive run
Run mtr 8.8.8.8 (or a hostname). You’ll see a table that updates: hop number, hostname/IP, packet loss, and last/avg/best/worst latencies plus their standard deviation (StDev). Leave it running for 30–60 seconds to gather representative data.
How to interpret mtr output: the practical lens
Reading mtr is both art and science. Here’s my step-by-step mental checklist when I open an mtr output during an incident:
- Look at per-hop packet loss column. A non-zero loss that appears at one hop and persists on subsequent hops likely indicates a real problem at or beyond that hop.
- Check whether loss appears only at one intermediate hop but not at the final destination — some routers deprioritize ICMP, which can show loss at one hop but not affect overall traffic. Correlate with actual service symptoms.
- Compare latency columns (last, avg, worst). Spikes in worst or StDev suggest instability even when the average looks fine.
- Note path changes (different ASes or IPs over repeated runs). Path shifts can explain intermittent outages.
Example signature: 0% loss until hop 6, then 40% loss at hop 7 and 40% onward. That’s a pretty clear sign hop 7 drops packets and impacts path traffic. But if hop 7 shows 50% loss and hop 8 shows 0% loss, that may indicate the hop deprioritizes ICMP and is not affecting forwarded traffic — investigate but don’t assume end-user impact without further tests.
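The loss-propagation check described above can be scripted. Below is a minimal sketch that parses a saved mtr report and labels each hop; the report text is a made-up sample standing in for real output of mtr -r -c 100 -n <host>, so the hop addresses and loss figures are purely illustrative:

```shell
#!/bin/sh
# Label each hop in an mtr report: loss that also shows at the final hop
# is likely real; isolated loss often just means ICMP deprioritization.
# The report text below is a hypothetical sample, not real measurements.
report='HOST: probe1           Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 192.0.2.1              0.0%   100    0.4   0.5   0.3   2.1   0.2
  2.|-- 198.51.100.9          50.0%   100    8.2   8.4   7.9  12.0   0.6
  3.|-- 203.0.113.77           0.0%   100    9.1   9.3   8.8  14.2   0.7'

verdicts=$(printf '%s\n' "$report" | awk '
  /\|--/ { n++; hop[n] = $2; loss[n] = $3 + 0 }
  END {
    for (i = 1; i <= n; i++) {
      if (loss[i] > 0 && loss[n] > 0) v = "persists to destination"
      else if (loss[i] > 0)           v = "isolated (ICMP deprioritization?)"
      else                            v = "clean"
      printf "%s loss=%s%% -> %s\n", hop[i], loss[i], v
    }
  }')
printf '%s\n' "$verdicts"
```

In practice you would pipe a live run into the same awk program, e.g. mtr -r -c 100 -n example.com, instead of using the embedded sample.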
Advanced usage and useful flags
mtr has many options. A few I use constantly:
- -r / --report: run a fixed number of probes and exit; good for scripting.
- -c <count>: number of probes to send in report mode (with -r).
- -n: numeric output (skip DNS reverse lookups), which speeds up results and avoids misleading hostnames.
- -T (TCP) or -u (UDP): choose the probe protocol; TCP is useful when ICMP is filtered but TCP port 80/443 is allowed.
- -i <seconds>: set the probe interval; increasing it reduces load and avoids rate limits.
Command example I use in automation: mtr -r -c 100 -n --report-wide 8.8.8.8 (100 probes, numeric output, wide report format for scripts).
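For monitoring, that report command can feed a small threshold check. The sketch below defines a helper that reads a report on stdin and fails when final-hop loss crosses a threshold; the target and threshold are illustrative, and the live mtr invocation is left commented out because it needs network access (and often root):

```shell
#!/bin/sh
# Fail when the destination (final hop) of an mtr report shows loss
# above a threshold. TARGET and THRESHOLD are illustrative values.
TARGET="${1:-8.8.8.8}"
THRESHOLD=5   # percent loss considered actionable

check_final_hop() {
  awk -v limit="$THRESHOLD" '
    /\|--/ { loss = $3 + 0 }   # keep overwriting; the last hop wins
    END {
      printf "final-hop loss: %s%%\n", loss
      exit (loss > limit ? 1 : 0)
    }'
}

# Live use (commented out here):
#   mtr -r -c 100 -n --report-wide "$TARGET" | check_final_hop
```

Wiring this into cron or an alerting pipeline is then a one-liner: a non-zero exit from check_final_hop signals actionable loss.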
Real scenarios: three troubleshooting vignettes
1) Intermittent slow page loads for users in a single city. I ran mtr from a nearby server and saw rising StDev (jitter) and occasional 20–30% packet loss on an ISP aggregation hop during peak hours. We scheduled a carrier conversation and the ISP confirmed a saturated link.
2) A service reports errors but mtr shows loss only at an intermediate hop while the final host shows 0% loss. I then re-ran mtr with TCP probes to port 443 and saw the loss propagate to the destination. That confirmed the intermediate router’s ICMP policy had masked the real forwarding problem. Moral: when in doubt, test with the protocol the service actually uses.
3) After a maintenance window, some clients lost route diversity. mtr runs showed a new AS in the path with higher latency. We adjusted BGP announcements and traffic shifted back. mtr helped verify the success of the change.
Common misconceptions about mtr (myth-busting)
Q: “If mtr shows packet loss at a hop, that hop is broken.” Not always. Many devices deprioritize ICMP/TTL-expired packets in their control plane; you’ll see apparent loss at that hop but not necessarily at the destination. Rule: always confirm by checking subsequent hops and testing with the actual protocol.
Q: “mtr can fully replace ping and traceroute.” Not exactly. mtr is extremely helpful, but targeted ping tests, tcpdump captures, BGP checks, and server-side logs are complementary. mtr is a fast path to hypothesis formation, not the entire investigation.
Q: “Higher packet loss numbers in mtr mean user impact for sure.” Often but not always. Small periodic losses (1–2%) can be normal on busy links; the pattern matters more than a single number.
Practical tips for reliable results
- Run tests from multiple vantage points (end-user device, an upstream server, and from the service side) to avoid misattributing problems to a single path view.
- Use the same probe type as your traffic (TCP for web, UDP for VoIP) to see realistic behaviour.
- Increase sample counts for intermittent problems (e.g., -c 200) so you capture patterns rather than a single snapshot.
- When scripting, use numeric output (-n) and parse fields rather than scraping printed hostnames.
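As an example of the parse-fields tip, here is a sketch that converts a report into CSV for ingestion by monitoring tools; the report text is a made-up sample standing in for numeric (-n) report output, and the field positions assume the standard report layout:

```shell
#!/bin/sh
# Convert an mtr report into CSV (hop,host,loss_pct,avg_ms).
# The report text below is a hypothetical sample, not real measurements.
report='HOST: probe1           Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 192.0.2.1              0.0%   100    0.4   0.5   0.3   2.1   0.2
  2.|-- 198.51.100.9           1.0%   100    8.2   8.4   7.9  12.0   0.6'

csv=$(printf '%s\n' "$report" | awk '
  BEGIN { print "hop,host,loss_pct,avg_ms" }
  /\|--/ {
    sub(/\.\|--$/, "", $1)   # "2.|--" -> "2"
    sub(/%$/, "", $3)        # "1.0%"  -> "1.0"
    printf "%s,%s,%s,%s\n", $1, $2, $3, $6
  }')
printf '%s\n' "$csv"
```

Parsing fields this way is robust to hostname length changes, which is exactly why numeric output belongs in scripts.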
How I integrate mtr into incident playbooks
In my ops playbook, the first three diagnostic steps are: (1) confirm symptoms with a quick page load or API call, (2) run an mtr from a nearby host with TCP probes to the service port, (3) cross-check with a packet capture or server logs if mtr shows concerning loss. This approach keeps work focused: mtr narrows down suspects fast, then targeted tools confirm the root cause.
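Those three steps can be sketched as a small shell helper. Everything below is illustrative: svc.example.com and port 443 are placeholders, and DRY_RUN=1 just prints the commands (curl, mtr with TCP probes, tcpdump) instead of running them, since a real run needs network access and usually root:

```shell
#!/bin/sh
# Incident-triage sketch mirroring the three playbook steps.
# SERVICE_HOST and SERVICE_PORT are placeholders; DRY_RUN=1 previews only.
SERVICE_HOST="${SERVICE_HOST:-svc.example.com}"
SERVICE_PORT="${SERVICE_PORT:-443}"

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

triage() {
  # 1) confirm symptoms with a quick request
  run curl -fsS -o /dev/null "https://$SERVICE_HOST/"
  # 2) mtr with TCP probes to the service port (usually needs root)
  run mtr -r -c 100 -n -T -P "$SERVICE_PORT" "$SERVICE_HOST"
  # 3) if mtr shows concerning loss, capture packets to confirm
  run tcpdump -c 200 -w /tmp/triage.pcap host "$SERVICE_HOST"
}

# Preview the plan without touching the network:
DRY_RUN=1 triage
```

The dry-run wrapper is a deliberate design choice: it lets responders review and share the exact command plan before anything touches production paths.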
When mtr won’t help — and what to do instead
mtr won’t reveal broken application logic, server-side rate-limit configurations, or encrypted-path packet transformations. If mtr shows a clean path but service errors persist, collect application logs, check server CPU/memory, and inspect firewall or load balancer rules. For BGP-level anomalies, use route collectors (like Hurricane Electric’s BGP toolkit at bgp.he.net) or RIPE NCC tools such as RIPEstat for deeper AS-path analysis.
Security and permissions notes
mtr may require elevated privileges to send raw packets on some systems. Use sudo responsibly and avoid running tools with unchecked scripts. When sharing mtr output, scrub internal hostnames and IPs if they contain sensitive topology information.
References and further reading
For background and implementation details, see the mtr Wikipedia entry, the official mtr man page, and DigitalOcean’s hands-on mtr tutorial.
So here’s the short takeaway: if you’re troubleshooting network delay or intermittent loss, run mtr early and from multiple vantage points, test with the protocol your app uses, and treat mtr as a hypothesis generator — then confirm with protocol-specific tests.
Frequently Asked Questions
Q: What is mtr and when should I use it? mtr is a live traceroute tool that samples latency and packet loss per hop. Use it when you need both path information and time-based statistics to diagnose intermittent or persistent network issues.
Q: Does loss shown at an intermediate hop mean that hop is broken? Not necessarily. Some routers deprioritize ICMP/TTL-expired packets, showing loss at a hop while the destination is unaffected. Confirm by checking subsequent hops and testing with the actual application protocol (e.g., TCP to port 443).
Q: Which probe type should I use? Start with ICMP for general checks, but if you suspect ICMP filtering or want realistic results for web traffic, run TCP probes to the service port (e.g., mtr -T -P 443 example.com). Choose the probe that best matches your application’s traffic.