argdist, trace: Native tracepoint support (iovisor#724)
* Remove tracepoint.py

The `Tracepoint` class which implements the necessary
support for the tracepoint kprobe-based hack is no
longer needed and can be removed.

* argdist: Native tracepoint support

This commit migrates argdist to use the native bcc/BPF
tracepoint support instead of the hackish kprobe-
based approach. The resulting programs are cleaner
and likely more efficient.

As a result of this change, there is a slight API
change in how argdist is used with tracepoints. To
access fields from the tracepoint structure, the user
is expected to use `args->field` directly. This
leverages most of the built-in bcc support for
generating the tracepoint probe function.
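
As a rough illustration of what relying on the built-in support looks like, the sketch below chooses the probe header the same way the patched `generate_text` in this commit does: a `TRACEPOINT_PROBE(category, event)` header for `t` probes, and the `pt_regs`-based signature for everything else. The helper name `probe_header` is hypothetical and is not part of bcc or argdist; it only builds a string and does not load a BPF program.

```python
def probe_header(probe_type, category=None, event=None):
    """Return the C function header for a generated probe (illustrative only)."""
    if probe_type == "t":
        # Native tracepoints: the TRACEPOINT_PROBE macro declares the probe
        # function and exposes the tracepoint fields through `args`.
        return "TRACEPOINT_PROBE(%s, %s)" % (category, event)
    # Other probe types keep the pt_regs-based signature. PROBENAME and
    # SIGNATURE are placeholders that the tool substitutes later.
    return "int PROBENAME(struct pt_regs *ctx SIGNATURE)"


print(probe_header("t", "block", "block_rq_complete"))
print(probe_header("p"))
```

Because the macro supplies `args`, the tool no longer needs to emit its own structure definition for the tracepoint.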

* trace: Native tracepoint support

This commit migrates trace to use the native bcc/BPF
tracepoint support instead of the hackish kprobe-
based approach. The resulting programs are cleaner
and likely more efficient.
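
Part of the simplification is visible in the diff: `_attach_k` does nothing for tracepoint probes. This appears to work because functions declared with `TRACEPOINT_PROBE` are named `tracepoint__<category>__<event>` and bcc attaches them when the program is loaded. The sketch below shows that naming convention only; `tracepoint_from_fn_name` is a hypothetical helper, and splitting on the first double underscore is a choice made for this example rather than a statement about bcc's internals.

```python
PREFIX = "tracepoint__"


def tracepoint_from_fn_name(fn_name):
    """Map 'tracepoint__cat__event' to 'cat:event'; return None otherwise."""
    if not fn_name.startswith(PREFIX):
        return None
    # Split only on the first double underscore so that an event name
    # containing "__" is kept intact.
    category, sep, event = fn_name[len(PREFIX):].partition("__")
    if not sep or not category or not event:
        return None
    return "%s:%s" % (category, event)


print(tracepoint_from_fn_name("tracepoint__irq__irq_handler_entry"))
```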

As with argdist, users are now expected to use the
`args` structure pointer to access the tracepoint's
arguments.

For example:

```
trace 't:irq:irq_handler_entry (args->irq != 27) "irq %d", args->irq'
```
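
Existing one-liners that use the old `tp.field` form need to be updated. A minimal sketch of a mechanical rewrite follows; `migrate_tp_expr` is a hypothetical helper, not something shipped with bcc, and it assumes `tp.` never appears inside a format string.

```python
import re

# Matches the old-style `tp.<field>` accessor. The word boundary keeps
# identifiers that merely end in "tp" (such as `http.x`) untouched.
_OLD_TP_ACCESS = re.compile(r"\btp\.([A-Za-z_]\w*)")


def migrate_tp_expr(expr):
    """Rewrite `tp.field` to `args->field` in a probe expression."""
    return _OLD_TP_ACCESS.sub(r"args->\1", expr)


print(migrate_tp_expr('t:irq:irq_handler_entry (tp.irq != 27) "irq %d", tp.irq'))
```

Expressions that used bare member names (for example `nr_sector` instead of `tp.nr_sector`) cannot be rewritten this way and have to be updated by hand.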
goldshtn authored and 4ast committed Oct 4, 2016
1 parent d2f4762 commit 376ae5c
Showing 8 changed files with 50 additions and 210 deletions.
7 changes: 3 additions & 4 deletions man/man8/argdist.8
@@ -92,10 +92,9 @@ The expression(s) to capture.
These are the values that are assigned to the histogram or raw event collection.
You may use the parameters directly, or valid C expressions that involve the
parameters, such as "size % 10".
-Tracepoints may access a special structure called "tp" that is formatted according
-to the tracepoint format (which you can obtain using tplist). For example, the
-block:block_rq_complete tracepoint can access tp.nr_sector. You may also use the
-members of the "tp" struct directly, e.g. "nr_sector" instead of "tp.nr_sector".
+Tracepoints may access a special structure called "args" that is formatted
+according to the tracepoint format (which you can obtain using tplist).
+For example, the block:block_rq_complete tracepoint can access args->nr_sector.
USDT probes may access the arguments defined by the tracing program in the
special arg1, arg2, ... variables. To obtain their types, use the tplist tool.
Return probes can use the argument values received by the
10 changes: 4 additions & 6 deletions man/man8/trace.8
@@ -93,11 +93,9 @@ format specifier replacements may be any C expressions, and may refer to the
same special keywords as in the predicate (arg1, arg2, etc.).

In tracepoints, both the predicate and the arguments may refer to the tracepoint
-format structure, which is stored in the special "tp" variable. For example, the
-block:block_rq_complete tracepoint can print or filter by tp.nr_sector. To
-discover the format of your tracepoint, use the tplist tool. Note that you can
-also use the members of the "tp" struct directly, e.g "nr_sector" instead of
-"tp.nr_sector".
+format structure, which is stored in the special "args" variable. For example, the
+block:block_rq_complete tracepoint can print or filter by args->nr_sector. To
+discover the format of your tracepoint, use the tplist tool.

In USDT probes, the arg1, ..., argN variables refer to the probe's arguments.
To determine which arguments your probe has, use the tplist tool.
@@ -126,7 +124,7 @@ Trace returns from the readline function in bash and print the return value as a
.TP
Trace the block:block_rq_complete tracepoint and print the number of sectors completed:
#
-.B trace 't:block:block_rq_complete """%d sectors"", nr_sector'
+.B trace 't:block:block_rq_complete """%d sectors"", args->nr_sector'
.TP
Trace the pthread_create USDT probe from the pthread library and print the address of the thread's start function:
#
1 change: 0 additions & 1 deletion src/python/bcc/__init__.py
@@ -27,7 +27,6 @@

from .libbcc import lib, _CB_TYPE, bcc_symbol
from .table import Table
-from .tracepoint import Tracepoint
from .perf import Perf
from .usyms import ProcessSymbols

143 changes: 0 additions & 143 deletions src/python/bcc/tracepoint.py

This file was deleted.

27 changes: 11 additions & 16 deletions tools/argdist.py
@@ -12,7 +12,7 @@
# Licensed under the Apache License, Version 2.0 (the "License")
# Copyright (C) 2016 Sasha Goldshtein.

-from bcc import BPF, Tracepoint, Perf, USDT
+from bcc import BPF, USDT
from time import sleep, strftime
import argparse
import re
@@ -195,9 +195,6 @@ def __init__(self, tool, type, specifier):
self.library = "" # kernel
self.tp_category = parts[1]
self.tp_event = self.function
-self.tp = Tracepoint.enable_tracepoint(
-self.tp_category, self.tp_event)
-self.function = "perf_trace_" + self.function
elif self.probe_type == "u":
self.library = parts[1]
self.probe_func_name = "%s_probe%d" % \
@@ -329,8 +326,10 @@ def generate_text(self):
program = ""
probe_text = """
DATA_DECL
-int PROBENAME(struct pt_regs *ctx SIGNATURE)
+""" + (
+"TRACEPOINT_PROBE(%s, %s)" % (self.tp_category, self.tp_event) \
+if self.probe_type == "t" \
+else "int PROBENAME(struct pt_regs *ctx SIGNATURE)") + """
{
PID_FILTER
PREFIX
@@ -352,10 +351,7 @@ def generate_text(self):
# value we collected when entering the function:
self._replace_entry_exprs()

-if self.probe_type == "t":
-program += self.tp.generate_struct()
-prefix += self.tp.generate_get_struct()
-elif self.probe_type == "p" and len(self.signature) > 0:
+if self.probe_type == "p" and len(self.signature) > 0:
# Only entry uprobes/kprobes can have user-specified
# signatures. Other probes force it to ().
signature = ", " + self.signature
@@ -396,7 +392,9 @@ def _attach_u(self):
pid=self.pid or -1)

def _attach_k(self):
-if self.probe_type == "r" or self.probe_type == "t":
+if self.probe_type == "t":
+pass # Nothing to do for tracepoints
+elif self.probe_type == "r":
self.bpf.attach_kretprobe(event=self.function,
fn_name=self.probe_func_name)
else:
@@ -537,10 +535,10 @@ class Tool(object):
Count fork() calls in libc across all processes
Can also use funccount.py, which is easier and more flexible
-argdist -H 't:block:block_rq_complete():u32:tp.nr_sector'
+argdist -H 't:block:block_rq_complete():u32:args->nr_sector'
Print histogram of number of sectors in completing block I/O requests
-argdist -C 't:irq:irq_handler_entry():int:tp.irq'
+argdist -C 't:irq:irq_handler_entry():int:args->irq'
Aggregate interrupts by interrupt request (IRQ)
argdist -C 'u:pthread:pthread_start():u64:arg2' -p 1337
@@ -613,8 +611,6 @@ def _generate_program(self):
bpf_source += "#include <%s>\n" % include
bpf_source += BPF.generate_auto_includes(
map(lambda p: p.raw_spec, self.probes))
-bpf_source += Tracepoint.generate_decl()
-bpf_source += Tracepoint.generate_entry_probe()
for probe in self.probes:
bpf_source += probe.generate_text()
if self.args.verbose:
@@ -627,7 +623,6 @@ def _generate_program(self):
self.bpf = BPF(text=bpf_source, usdt_contexts=usdt_contexts)

def _attach(self):
-Tracepoint.attach(self.bpf)
for probe in self.probes:
probe.attach(self.bpf)
if self.args.verbose:
28 changes: 11 additions & 17 deletions tools/argdist_example.txt
@@ -264,24 +264,18 @@ certain kinds of allocations or visually group them together.

argdist also has basic support for kernel tracepoints. It is sometimes more
convenient to use tracepoints because they are documented and don't vary a lot
-between kernel versions like function signatures tend to. For example, let's
-trace the net:net_dev_start_xmit tracepoint and print the interface name that
-is transmitting:
+between kernel versions. For example, let's trace the net:net_dev_start_xmit
+tracepoint and print out the protocol field from the tracepoint structure:

-# argdist -c -C 't:net:net_dev_start_xmit(void *a, void *b, struct net_device *c):char*:c->name' -n 2
-[05:01:10]
-t:net:net_dev_start_xmit(void *a, void *b, struct net_device *c):char*:c->name
+# argdist -C 't:net:net_dev_start_xmit():u16:args->protocol'
+[13:01:49]
+t:net:net_dev_start_xmit():u16:args->protocol
COUNT EVENT
-4 c->name = eth0
-[05:01:11]
-t:net:net_dev_start_xmit(void *a, void *b, struct net_device *c):char*:c->name
-COUNT EVENT
-6 c->name = lo
-92 c->name = eth0
+8 args->protocol = 2048
+^C

-Note that to determine the necessary function signature you need to look at the
-TP_PROTO declaration in the kernel headers. For example, the net_dev_start_xmit
-tracepoint is defined in the include/trace/events/net.h header file.
+Note that to discover the format of the net:net_dev_start_xmit tracepoint, you
+use the tplist tool (tplist -v net:net_dev_start_xmit).
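
The value 2048 reported in the new example output is 0x0800 in hexadecimal, which is the EtherType for IPv4. The sketch below decodes a few well-known values, assuming the tracepoint's `protocol` field holds a host-order EtherType (worth confirming against the tplist output). The table is deliberately incomplete and `ethertype_name` is a hypothetical helper, not part of argdist.

```python
# A few well-known EtherType values; not an exhaustive list.
ETHERTYPES = {
    0x0800: "IPv4",
    0x0806: "ARP",
    0x86DD: "IPv6",
}


def ethertype_name(value):
    """Return a readable name for an EtherType, or its hex form if unknown."""
    return ETHERTYPES.get(value, "unknown (0x%04x)" % value)


print(ethertype_name(2048))
```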

Here's a final example that finds how many write() system calls are performed
by each process on the system:
@@ -388,10 +382,10 @@ argdist -C 'p:c:fork()#fork calls'
Count fork() calls in libc across all processes
Can also use funccount.py, which is easier and more flexible

-argdist -H 't:block:block_rq_complete():u32:tp.nr_sector'
+argdist -H 't:block:block_rq_complete():u32:args->nr_sector'
Print histogram of number of sectors in completing block I/O requests

-argdist -C 't:irq:irq_handler_entry():int:tp.irq'
+argdist -C 't:irq:irq_handler_entry():int:args->irq'
Aggregate interrupts by interrupt request (IRQ)

argdist -C 'u:pthread:pthread_start():u64:arg2' -p 1337