tools/tcprtt: add tcprtt to trace the RTT of TCP
This program traces TCP RTT (round-trip time) to analyze network quality,
which helps distinguish whether network latency trouble comes from the
user process or from the physical network.

Currently, source address/port and destination address/port are supported
as TCP filters.

Suggested-by: Edward Wu <[email protected]>
Suggested-by: Martin KaFai Lau <[email protected]>
Signed-off-by: zhenwei pi <[email protected]>
pizhenwei authored and yonghong-song committed Sep 9, 2020
1 parent b8e661f commit e42ac41
Showing 3 changed files with 328 additions and 0 deletions.
76 changes: 76 additions & 0 deletions man/man8/tcprtt.8
@@ -0,0 +1,76 @@
.TH tcprtt 8 "2020-08-23" "USER COMMANDS"
.SH NAME
tcprtt \- Trace TCP RTT of established connections. Uses Linux eBPF/bcc.
.SH SYNOPSIS
.B tcprtt [\-h] [\-T] [\-D] [\-m] [\-i INTERVAL] [\-d DURATION] [\-p SPORT] [\-P DPORT] [\-a SADDR] [\-A DADDR]
.SH DESCRIPTION
This tool traces the RTT (round-trip time) of established TCP connections to
analyze network quality. This can be useful in general troubleshooting to
distinguish whether latency comes from the user process or the physical network.

Since this uses BPF, only the root user can use this tool.
.SH REQUIREMENTS
CONFIG_BPF and bcc.
.SH OPTIONS
.TP
\-h
Print usage message.
.TP
\-T
Include a time column on output (HH:MM:SS).
.TP
\-D
Show debug information of bpf text.
.TP
\-m
Output histogram in milliseconds.
.TP
\-i INTERVAL
Print output every interval seconds.
.TP
\-d DURATION
Total duration of trace in seconds.
.TP
\-p SPORT
Filter for source port.
.TP
\-P DPORT
Filter for destination port.
.TP
\-a SADDR
Filter for source address.
.TP
\-A DADDR
Filter for destination address.
.SH EXAMPLES
.TP
Trace TCP RTT and print 1 second summaries, 10 times:
#
.B tcprtt \-i 1 \-d 10
.TP
Summarize in milliseconds, and include timestamps:
#
.B tcprtt \-m \-T
.TP
Only trace TCP RTT for destination address 192.168.1.100 and destination port 80:
#
.B tcprtt \-i 1 \-d 10 \-A 192.168.1.100 \-P 80
.SH OVERHEAD
This traces the kernel tcp_rcv_established function and collects TCP RTT. The
rate of this depends on your server application. If it is a web or proxy server
accepting many tens of thousands of connections per second, the overhead of
this tool may be measurable, so test it before use in production.
.SH SOURCE
This is from bcc.
.IP
https://github.com/iovisor/bcc
.PP
Also look in the bcc distribution for the companion tcprtt_example.txt file
containing example usage, output, and commentary for this tool.
.SH OS
Linux
.SH STABILITY
Unstable - in development.
.SH AUTHOR
zhenwei pi
.SH SEE ALSO
tcptracer(8), tcpconnect(8), funccount(8), tcpdump(8)
169 changes: 169 additions & 0 deletions tools/tcprtt.py
@@ -0,0 +1,169 @@
#!/usr/bin/python
# @lint-avoid-python-3-compatibility-imports
#
# tcprtt Summarize TCP RTT as a histogram. For Linux, uses BCC, eBPF.
#
# USAGE: tcprtt [-h] [-T] [-D] [-m] [-i INTERVAL] [-d DURATION]
# [-p SPORT] [-P DPORT] [-a SADDR] [-A DADDR]
#
# Copyright (c) 2020 zhenwei pi
# Licensed under the Apache License, Version 2.0 (the "License")
#
# 23-AUG-2020 zhenwei pi Created this.

from __future__ import print_function
from bcc import BPF
from time import sleep, strftime
import socket, struct
import argparse

# arguments
examples = """examples:
./tcprtt # summarize TCP RTT
./tcprtt -i 1 -d 10 # print 1 second summaries, 10 times
./tcprtt -m -T # summarize in millisecond, and timestamps
./tcprtt -p # filter for source port
./tcprtt -P # filter for destination port
./tcprtt -a # filter for source address
./tcprtt -A # filter for destination address
./tcprtt -D # show debug bpf text
"""
parser = argparse.ArgumentParser(
    description="Summarize TCP RTT as a histogram",
    formatter_class=argparse.RawDescriptionHelpFormatter,
    epilog=examples)
parser.add_argument("-i", "--interval",
    help="summary interval, seconds")
parser.add_argument("-d", "--duration", type=int, default=99999,
    help="total duration of trace, seconds")
parser.add_argument("-T", "--timestamp", action="store_true",
    help="include timestamp on output")
parser.add_argument("-m", "--milliseconds", action="store_true",
    help="millisecond histogram")
parser.add_argument("-p", "--sport",
    help="source port")
parser.add_argument("-P", "--dport",
    help="destination port")
parser.add_argument("-a", "--saddr",
    help="source address")
parser.add_argument("-A", "--daddr",
    help="destination address")
parser.add_argument("-D", "--debug", action="store_true",
    help="print BPF program before starting (for debugging purposes)")
parser.add_argument("--ebpf", action="store_true",
    help=argparse.SUPPRESS)
args = parser.parse_args()
if not args.interval:
    args.interval = args.duration

# define BPF program
bpf_text = """
#ifndef KBUILD_MODNAME
#define KBUILD_MODNAME "bcc"
#endif
#include <uapi/linux/ptrace.h>
#include <linux/tcp.h>
#include <net/sock.h>
#include <net/inet_sock.h>
#include <bcc/proto.h>
BPF_HISTOGRAM(hist_srtt);

int trace_tcp_rcv(struct pt_regs *ctx, struct sock *sk, struct sk_buff *skb)
{
    struct tcp_sock *ts = tcp_sk(sk);
    u32 srtt = ts->srtt_us >> 3;
    const struct inet_sock *inet = inet_sk(sk);

    SPORTFILTER
    DPORTFILTER
    SADDRFILTER
    DADDRFILTER
    FACTOR

    hist_srtt.increment(bpf_log2l(srtt));

    return 0;
}
"""

# bpf_text is a str, so placeholders and replacements are str as well
# (bytes literals here raise TypeError on Python 3).

# filter for source port
if args.sport:
    bpf_text = bpf_text.replace('SPORTFILTER',
        """u16 sport = 0;
    bpf_probe_read_kernel(&sport, sizeof(sport), (void *)&inet->inet_sport);
    if (ntohs(sport) != %d)
        return 0;""" % int(args.sport))
else:
    bpf_text = bpf_text.replace('SPORTFILTER', '')

# filter for dest port
if args.dport:
    bpf_text = bpf_text.replace('DPORTFILTER',
        """u16 dport = 0;
    bpf_probe_read_kernel(&dport, sizeof(dport), (void *)&inet->inet_dport);
    if (ntohs(dport) != %d)
        return 0;""" % int(args.dport))
else:
    bpf_text = bpf_text.replace('DPORTFILTER', '')

# filter for source address
if args.saddr:
    bpf_text = bpf_text.replace('SADDRFILTER',
        """u32 saddr = 0;
    bpf_probe_read_kernel(&saddr, sizeof(saddr), (void *)&inet->inet_saddr);
    if (saddr != %d)
        return 0;""" % struct.unpack("=I", socket.inet_aton(args.saddr))[0])
else:
    bpf_text = bpf_text.replace('SADDRFILTER', '')

# filter for dest address
if args.daddr:
    bpf_text = bpf_text.replace('DADDRFILTER',
        """u32 daddr = 0;
    bpf_probe_read_kernel(&daddr, sizeof(daddr), (void *)&inet->inet_daddr);
    if (daddr != %d)
        return 0;""" % struct.unpack("=I", socket.inet_aton(args.daddr))[0])
else:
    bpf_text = bpf_text.replace('DADDRFILTER', '')

# show msecs or usecs[default]
if args.milliseconds:
    bpf_text = bpf_text.replace('FACTOR', 'srtt /= 1000;')
    label = "msecs"
else:
    bpf_text = bpf_text.replace('FACTOR', '')
    label = "usecs"

# debug/dump ebpf enable or not
if args.debug or args.ebpf:
    print(bpf_text)
    if args.ebpf:
        exit()

# load BPF program
b = BPF(text=bpf_text)
b.attach_kprobe(event="tcp_rcv_established", fn_name="trace_tcp_rcv")

print("Tracing TCP RTT... Hit Ctrl-C to end.")

# output
exiting = 0 if args.interval else 1
dist = b.get_table("hist_srtt")
seconds = 0
while (1):
    try:
        sleep(int(args.interval))
        seconds = seconds + int(args.interval)
    except KeyboardInterrupt:
        exiting = 1

    print()
    if args.timestamp:
        print("%-8s\n" % strftime("%H:%M:%S"), end="")

    dist.print_log2_hist(label, "srtt")
    dist.clear()

    if exiting or seconds >= args.duration:
        exit()
83 changes: 83 additions & 0 deletions tools/tcprtt_example.txt
@@ -0,0 +1,83 @@
Demonstrations of tcprtt, the Linux eBPF/bcc version.


This program traces TCP RTT (round-trip time) to analyze network quality,
which helps distinguish whether network latency trouble comes from the
user process or from the physical network.

For example, wrk shows the HTTP request latency distribution:
# wrk -d 30 -c 10 --latency http://192.168.122.100/index.html
Running 30s test @ http://192.168.122.100/index.html
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    86.75ms  153.76ms   1.54s    90.85%
    Req/Sec   160.91     76.07   424.00     67.06%
  Latency Distribution
     50%   14.55ms
     75%  119.21ms
     90%  230.22ms
     99%  726.90ms
  9523 requests in 30.02s, 69.62MB read
  Socket errors: connect 0, read 0, write 0, timeout 1

During wrk testing, run tcprtt:
# ./tcprtt -i 1 -d 10 -m
Tracing TCP RTT... Hit Ctrl-C to end.
     msecs               : count     distribution
         0 -> 1          : 4        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 1055     |****************************************|
         8 -> 15         : 26       |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 18       |                                        |
       128 -> 255        : 14       |                                        |
       256 -> 511        : 14       |                                        |
       512 -> 1023       : 12       |                                        |

The wrk output shows that the latency of the web service is unstable, and
tcprtt also shows unstable TCP RTT. In this situation, the first step is to
check whether the network quality is good.
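
To relate the histogram rows to raw kernel values: the kernel keeps srtt_us as
eight times the smoothed RTT in microseconds, which is why tcprtt.py shifts it
right by 3, and -m divides the result by 1000. The sketch below reproduces that
arithmetic and the power-of-two row ranges seen in the output above. The
function names are hypothetical, and the row boundaries are inferred from the
printed histogram rather than taken from bcc's implementation.

```python
def srtt_to_value(srtt_us_raw, milliseconds=False):
    """Convert a raw srtt_us value to usecs (default) or msecs."""
    srtt = srtt_us_raw >> 3      # undo the kernel's x8 scaling
    if milliseconds:
        srtt //= 1000            # same integer division as 'srtt /= 1000;'
    return srtt


def hist_row(value):
    """Return the (low, high) range of the histogram row holding value.

    Assumption: rows follow the printed pattern 0 -> 1, 2 -> 3, 4 -> 7, ...
    """
    if value <= 1:
        return (0, 1)
    k = value.bit_length() - 1   # floor(log2(value))
    return (1 << k, (1 << (k + 1)) - 1)


raw = 5800 * 8                   # a 5.8 ms smoothed RTT as stored by the kernel
ms = srtt_to_value(raw, milliseconds=True)
print("%d ms falls in row %d -> %d" % ((ms,) + hist_row(ms)))
```

With these assumptions, the 1055 samples in the "4 -> 7" row correspond to
smoothed RTTs of at least 4 ms and below 8 ms.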


Filters can be applied by address and/or port. For example, to collect only
traffic with source address 192.168.122.200, destination address
192.168.122.100, and destination port 80:
# ./tcprtt -i 1 -d 10 -m -a 192.168.122.200 -A 192.168.122.100 -P 80
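
The address filters compare a 32-bit value inside the BPF program, so the
dotted-quad string given on the command line has to be converted first. Here is
a minimal sketch of that conversion, mirroring the
struct.unpack("=I", socket.inet_aton(...)) expression used in tcprtt.py. The
helper names addr_to_u32 and u32_to_addr are hypothetical and not part of the
tool.

```python
import socket
import struct


def addr_to_u32(addr):
    """Return the IPv4 address as the native-endian integer that a program
    gets when it reads the 4 network-order address bytes as a u32."""
    # inet_aton() yields 4 bytes in network byte order; "=I" reinterprets
    # those same bytes as a native unsigned int without reordering them.
    return struct.unpack("=I", socket.inet_aton(addr))[0]


def u32_to_addr(value):
    """Inverse of addr_to_u32(), handy for checking the round trip."""
    return socket.inet_ntoa(struct.pack("=I", value))


daddr = addr_to_u32("192.168.122.100")
print("filter value: %d (back to %s)" % (daddr, u32_to_addr(daddr)))
```

The printed integer differs between little-endian and big-endian hosts, but the
in-memory bytes are the same on both, which is what the comparison relies on.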


Full USAGE:

# ./tcprtt -h
usage: tcprtt [-h] [-i INTERVAL] [-d DURATION] [-T] [-m] [-p SPORT]
              [-P DPORT] [-a SADDR] [-A DADDR] [-D]

Summarize TCP RTT as a histogram

optional arguments:
  -h, --help            show this help message and exit
  -i INTERVAL, --interval INTERVAL
                        summary interval, seconds
  -d DURATION, --duration DURATION
                        total duration of trace, seconds
  -T, --timestamp       include timestamp on output
  -m, --milliseconds    millisecond histogram
  -p SPORT, --sport SPORT
                        source port
  -P DPORT, --dport DPORT
                        destination port
  -a SADDR, --saddr SADDR
                        source address
  -A DADDR, --daddr DADDR
                        destination address
  -D, --debug           print BPF program before starting (for debugging
                        purposes)

examples:
    ./tcprtt            # summarize TCP RTT
    ./tcprtt -i 1 -d 10 # print 1 second summaries, 10 times
    ./tcprtt -m -T      # summarize in millisecond, and timestamps
    ./tcprtt -p         # filter for source port
    ./tcprtt -P         # filter for destination port
    ./tcprtt -a         # filter for source address
    ./tcprtt -A         # filter for destination address
    ./tcprtt -D         # show debug bpf text
