Merge pull request iovisor#353 from brendangregg/master
3 tools: fsslower, execsnoop, runqlat
Showing 12 changed files with 1,520 additions and 2 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,88 @@ | ||
.TH execsnoop 8 "2016-02-07" "USER COMMANDS" | ||
.SH NAME | ||
execsnoop \- Trace new processes via exec() syscalls. Uses Linux eBPF/bcc. | ||
.SH SYNOPSIS | ||
.B execsnoop [\-h] [\-t] [\-X] [\-n NAME] | ||
.SH DESCRIPTION | ||
execsnoop traces new processes, showing the filename executed, argument | ||
list, and return value (0 for success). | ||
|
||
It works by tracing the execve() system call (the commonly used exec() variant). | ||
This catches new processes that follow the fork->exec sequence, as well as | ||
processes that re-exec() themselves. Some applications fork() but do not | ||
exec(), eg, for worker processes, which won't be included in the execsnoop | ||
output. | ||
|
||
This works by tracing the kernel sys_execve() function using dynamic tracing, | ||
and will need updating to match any changes to this function. | ||
|
||
Since this uses BPF, only the root user can use this tool. | ||
.SH REQUIREMENTS | ||
CONFIG_BPF and bcc. | ||
.SH OPTIONS | ||
.TP | ||
\-h | ||
Print usage message. | ||
.TP | ||
\-t | ||
Include a timestamp column. | ||
.TP | ||
\-X | ||
Exclude failed exec()s. | ||
.TP | ||
\-n NAME | ||
Only print command lines matching this name (regex), matched anywhere. | ||
.SH EXAMPLES | ||
.TP | ||
Trace all exec() syscalls: | ||
# | ||
.B execsnoop | ||
.TP | ||
Trace all exec() syscalls, and include timestamps: | ||
# | ||
.B execsnoop \-t | ||
.TP | ||
Only trace successful exec()s: | ||
# | ||
.B execsnoop \-X | ||
.TP | ||
Only trace exec()s where the filename or arguments contain "mount": | ||
# | ||
.B execsnoop \-n mount | ||
.SH FIELDS | ||
.TP | ||
TIME(s) | ||
Time of exec() return, in seconds. | ||
.TP | ||
PCOMM | ||
Parent process/command name. | ||
.TP | ||
PID | ||
Process ID | ||
.TP | ||
RET | ||
Return value of exec(). 0 == success. | ||
.TP | ||
ARGS | ||
Filename for the exec(), followed by up to 19 arguments. An ellipsis "..." is | ||
shown if the argument list is known to be truncated. | ||
.SH OVERHEAD | ||
This traces the kernel execve function and prints output for each event. As the | ||
rate of this is generally expected to be low (< 1000/s), the overhead is also | ||
expected to be negligible. If you have an application that is calling a high | ||
rate of exec()s, then test and understand overhead before use. | ||
.SH SOURCE | ||
This is from bcc. | ||
.IP | ||
https://github.com/iovisor/bcc | ||
.PP | ||
Also look in the bcc distribution for a companion _examples.txt file containing | ||
example usage, output, and commentary for this tool. | ||
.SH OS | ||
Linux | ||
.SH STABILITY | ||
Unstable - in development. | ||
.SH AUTHOR | ||
Brendan Gregg | ||
.SH SEE ALSO | ||
opensnoop(1) |
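The filtering and ARGS formatting described in the execsnoop page above can be sketched in plain Python. This is a minimal illustration, not the tool's implementation: the function names `format_args` and `should_print` are hypothetical, and the only facts taken from the man page are the 19-argument limit, the "..." truncation marker, the meaning of RET (0 for success), and the `-X` and `-n NAME` semantics.

```python
import re

# From the FIELDS section: the filename is followed by up to 19 arguments.
MAX_ARGS = 19


def format_args(filename, argv, truncated=False):
    """Build an ARGS-style string (hypothetical helper, not from bcc).

    argv holds the arguments after the filename. "..." is appended when
    the list is known to be truncated or exceeds MAX_ARGS.
    """
    parts = [filename] + list(argv[:MAX_ARGS])
    if truncated or len(argv) > MAX_ARGS:
        parts.append("...")
    return " ".join(parts)


def should_print(cmdline, retval, name=None, exclude_failed=False):
    """Apply the documented -X and -n NAME filters to one exec() event."""
    # -X: drop failed exec()s (a non-zero return value means failure).
    if exclude_failed and retval != 0:
        return False
    # -n NAME: regex matched anywhere in the command line.
    if name is not None and not re.search(name, cmdline):
        return False
    return True
```

For example, `should_print("/bin/mount /dev/sda1 /mnt", 0, name="mount")` keeps the event, while the same call with `"/bin/ls"` drops it, mirroring the `execsnoop -n mount` example.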
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,111 @@ | ||
.TH fileslower 8 "2016-02-07" "USER COMMANDS" | ||
.SH NAME | ||
fileslower \- Trace slow synchronous file reads and writes. | ||
.SH SYNOPSIS | ||
.B fileslower [\-h] [\-p PID] [min_ms] | ||
.SH DESCRIPTION | ||
This script uses kernel dynamic tracing of synchronous reads and writes | ||
at the VFS interface, to identify slow file reads and writes for any file | ||
system. | ||
|
||
This version traces __vfs_read() and __vfs_write() and only shows | ||
synchronous I/O (the path to new_sync_read() and new_sync_write()), and | ||
I/O with filenames. This approach provides a view of just two file | ||
system request types: file reads and writes. There are typically many others: | ||
asynchronous I/O, directory operations, file handle operations, file open()s, | ||
fflush(), etc. | ||
|
||
WARNING: See the OVERHEAD section. | ||
|
||
By default, a minimum millisecond threshold of 10 is used. | ||
|
||
Since this works by tracing various kernel __vfs_*() functions using dynamic | ||
tracing, it will need updating to match any changes to these functions. A | ||
future version should switch to using FS tracepoints instead. | ||
|
||
Since this uses BPF, only the root user can use this tool. | ||
.SH REQUIREMENTS | ||
CONFIG_BPF and bcc. | ||
.SH OPTIONS | ||
.TP | ||
\-p PID | ||
Trace this PID only. | ||
.TP | ||
min_ms | ||
Minimum I/O latency (duration) to trace, in milliseconds. Default is 10 ms. | ||
.SH EXAMPLES | ||
.TP | ||
Trace synchronous file reads and writes slower than 10 ms: | ||
# | ||
.B fileslower | ||
.TP | ||
Trace slower than 1 ms: | ||
# | ||
.B fileslower 1 | ||
.TP | ||
Trace slower than 1 ms, for PID 181 only: | ||
# | ||
.B fileslower \-p 181 1 | ||
.SH FIELDS | ||
.TP | ||
TIME(s) | ||
Time of I/O completion since the first I/O seen, in seconds. | ||
.TP | ||
COMM | ||
Process name. | ||
.TP | ||
PID | ||
Process ID. | ||
.TP | ||
D | ||
Direction of I/O. R == read, W == write. | ||
.TP | ||
BYTES | ||
Size of I/O, in bytes. | ||
.TP | ||
LAT(ms) | ||
Latency (duration) of I/O, measured from when the application issued it to VFS | ||
to when it completed. This time is inclusive of block device I/O, file system | ||
CPU cycles, file system locks, run queue latency, etc. It's a more accurate | ||
measure of the latency suffered by applications performing file system I/O | ||
than measuring this down at the block device interface. | ||
.TP | ||
FILENAME | ||
A cached kernel file name (comes from dentry->d_iname). | ||
.SH OVERHEAD | ||
Depending on the frequency of application reads and writes, overhead can become | ||
severe, in the worst case slowing applications by 2x. In the best case, the | ||
overhead is negligible. Hopefully for real world workloads the overhead is | ||
often at the lower end of the spectrum -- test before use. The reason for | ||
high overhead is that this traces VFS reads and writes, which includes FS | ||
cache reads and writes, and can exceed one million events per second if the | ||
application is I/O heavy. While the instrumentation is extremely lightweight, | ||
and uses in-kernel eBPF maps for efficient timing and filtering, that small | ||
per-event cost is paid on every event: at one million events per second, it | ||
is incurred a million times each second. You can get an idea of the possible | ||
cost by just counting the instrumented events using the bcc funccount tool, eg: | ||
.PP | ||
# ./funccount.py -i 1 -r '^__vfs_(read|write)$' | ||
.PP | ||
This also costs overhead, but is somewhat less than fileslower. | ||
.PP | ||
If the overhead is prohibitive for your workload, I'd recommend moving | ||
down-stack a little from VFS into the file system functions (ext4, xfs, etc). | ||
Look for updates to bcc for specific file system tools that do this. The | ||
advantage of a per-file system approach is that we can trace post-cache, | ||
greatly reducing events and overhead. The disadvantage is needing custom | ||
tracing approaches for each different file system (whereas VFS is generic). | ||
.SH SOURCE | ||
This is from bcc. | ||
.IP | ||
https://github.com/iovisor/bcc | ||
.PP | ||
Also look in the bcc distribution for a companion _examples.txt file containing | ||
example usage, output, and commentary for this tool. | ||
.SH OS | ||
Linux | ||
.SH STABILITY | ||
Unstable - in development. | ||
.SH AUTHOR | ||
Brendan Gregg | ||
.SH SEE ALSO | ||
biosnoop(8), funccount(8) |
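The min_ms threshold and the OVERHEAD reasoning in the fileslower page above come down to simple arithmetic, sketched here in Python. The function names are hypothetical, the strict "slower than" comparison is an assumption based on the wording of the EXAMPLES section, and the per-event cost passed to `overhead_fraction` is a made-up figure for illustration, not a measurement of the tool.

```python
NS_PER_MS = 1_000_000
NS_PER_SEC = 1_000_000_000


def latency_ms(start_ns, end_ns):
    """Latency between two nanosecond timestamps, in milliseconds."""
    return (end_ns - start_ns) / NS_PER_MS


def is_slow(start_ns, end_ns, min_ms=10):
    """True if the I/O was slower than min_ms (default 10 ms, as documented).

    Assumption: "slower than" is treated as a strict comparison.
    """
    return latency_ms(start_ns, end_ns) > min_ms


def overhead_fraction(events_per_sec, cost_ns_per_event):
    """Fraction of one CPU-second spent on instrumentation each second.

    cost_ns_per_event is a hypothetical input; measure it for a real workload.
    """
    return events_per_sec * cost_ns_per_event / NS_PER_SEC
```

With an assumed cost of 100 ns per event, `overhead_fraction(1_000_000, 100)` gives roughly 0.1, that is, about 10% of one CPU, which is why the page suggests counting events with funccount before tracing.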
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -88,7 +88,7 @@ Nanosecond range | |
usecs | ||
Microsecond range | ||
.TP | ||
-mecs | ||
+msecs | ||
Millisecond range | ||
.TP | ||
count | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,105 @@ | ||
.TH runqlat 8 "2016-02-07" "USER COMMANDS" | ||
.SH NAME | ||
runqlat \- Run queue (scheduler) latency as a histogram. | ||
.SH SYNOPSIS | ||
.B runqlat [\-h] [\-T] [\-m] [\-P] [\-p PID] [interval] [count] | ||
.SH DESCRIPTION | ||
This measures the time a task spends waiting on a run queue (or equivalent | ||
scheduler data structure) for a turn on-CPU, and shows this time as a | ||
histogram. This time should be small, but a task may need to wait its turn due | ||
to CPU load. The higher the CPU load, the longer a task will generally need to | ||
wait its turn. | ||
|
||
This tool measures two types of run queue latency: | ||
|
||
1. The time from a task being enqueued on a run queue to its context switch | ||
and execution. This traces enqueue_task_*() -> finish_task_switch(), | ||
and instruments the run queue latency after a voluntary context switch. | ||
|
||
2. The time from when a task was involuntarily context switched and still | ||
in the runnable state, to when it next executed. This is instrumented | ||
from finish_task_switch() alone. | ||
|
||
This tool uses in-kernel eBPF maps for storing timestamps and the histogram, | ||
for efficiency. Despite this, the overhead of this tool may become significant | ||
for some workloads: see the OVERHEAD section. | ||
|
||
This works by tracing various kernel scheduler functions using dynamic tracing, | ||
and will need updating to match any changes to these functions. | ||
|
||
Since this uses BPF, only the root user can use this tool. | ||
.SH REQUIREMENTS | ||
CONFIG_BPF and bcc. | ||
.SH OPTIONS | ||
.TP | ||
\-h | ||
Print usage message. | ||
.TP | ||
\-T | ||
Include timestamps on output. | ||
.TP | ||
\-m | ||
Output histogram in milliseconds. | ||
.TP | ||
\-P | ||
Print a histogram for each PID. | ||
.TP | ||
\-p PID | ||
Only show this PID (filtered in kernel for efficiency). | ||
.TP | ||
interval | ||
Output interval, in seconds. | ||
.TP | ||
count | ||
Number of outputs. | ||
.SH EXAMPLES | ||
.TP | ||
Summarize run queue latency as a histogram: | ||
# | ||
.B runqlat | ||
.TP | ||
Print 1 second summaries, 10 times: | ||
# | ||
.B runqlat 1 10 | ||
.TP | ||
Print 1 second summaries, using milliseconds as units for the histogram, and include timestamps on output: | ||
# | ||
.B runqlat \-mT 1 | ||
.TP | ||
Trace PID 185 only, 1 second summaries: | ||
# | ||
.B runqlat \-p 185 1 | ||
.SH FIELDS | ||
.TP | ||
usecs | ||
Microsecond range | ||
.TP | ||
msecs | ||
Millisecond range | ||
.TP | ||
count | ||
How many times a task event fell into this range | ||
.TP | ||
distribution | ||
An ASCII bar chart to visualize the distribution (count column) | ||
.SH OVERHEAD | ||
This traces scheduler functions, which can become very frequent. While eBPF | ||
has very low overhead, and this tool uses in-kernel maps for efficiency, the | ||
frequency of scheduler events for some workloads may be high enough that the | ||
overhead of this tool becomes significant. Measure in a lab environment | ||
to quantify the overhead before use. | ||
.SH SOURCE | ||
This is from bcc. | ||
.IP | ||
https://github.com/iovisor/bcc | ||
.PP | ||
Also look in the bcc distribution for a companion _examples.txt file containing | ||
example usage, output, and commentary for this tool. | ||
.SH OS | ||
Linux | ||
.SH STABILITY | ||
Unstable - in development. | ||
.SH AUTHOR | ||
Brendan Gregg | ||
.SH SEE ALSO | ||
pidstat(1) |
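The histogram fields in the runqlat page above (a range column, count, and distribution) can be illustrated with a small user-space sketch. This is not how the tool stores data (the page says it uses in-kernel eBPF maps); the power-of-two bucketing, the function names, and the bar width are assumptions chosen for illustration.

```python
from collections import Counter


def log2_bucket(value):
    """Return the (low, high) power-of-two range containing a non-negative int."""
    if value < 2:
        return (0, 1)
    low = 1 << (value.bit_length() - 1)
    return (low, 2 * low - 1)


def histogram(latencies_us):
    """Count how many latencies (in microseconds) fall into each range."""
    return Counter(log2_bucket(v) for v in latencies_us)


def render(hist, width=40):
    """Format rows as: range, count, and an ASCII bar scaled to the peak count."""
    if not hist:
        return []
    peak = max(hist.values())
    lines = []
    for (low, high), count in sorted(hist.items()):
        bar = "*" * (count * width // peak)
        lines.append(f"{low:>8} -> {high:<8} : {count:<8} |{bar:<{width}}|")
    return lines
```

Calling `print("\n".join(render(histogram([1, 3, 5, 6, 7, 900]))))` prints four rows, with the 4 -> 7 range holding the longest bar because three samples fall into it.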