-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stack Traces: TNG #450
Stack Traces: TNG #450
Conversation
Instead of manually walking the (kernel) stack inside the eBPF code, create a `BPF_STACK_TRACE` table and store the stack IDs in the alloc info struct. This greatly simplifies the leak detection code: instead of storing the full stack trace as a string to de-duplicate the allocation point for the different allocations, we can store the `stack_id`. Since we're creating stack IDs with `BPF_F_REUSE_STACKID`, the kernel will take care of deduplication for us. Additionally, the `StackDecoder` class has been specialized into a `UStackDecoder` and `KStackDecoder` (for userland and kernel stacks, respectively). This lets us pass the decode class directly to `stack_traces.walk()` to automate symbol resolution. A new class, `Allocation` has been created to encapsulate what previously was a 2-element tuple of `(count, size)`. This
The manual walking for kernel stacks in the eBPF code has been replaced with a `BPF_STACK_TRACE` table; the stack ID is now stored as an integer in the main key. Thanks to the `StackTrace.walk` iterator, the printing code can also be significantly simplified for both folded and full stack modes.
lgtm. should the man page say that kversion<=4.5 can be found in tools/old ? probably not. I guess that's implicit assumption for all tools in old and we don't need to partition it further per kernel version. |
The offset is now returned after a `+` symbol, instead of directly attached to the symbol name (which made reading the output very confusing) Before: tick_do_update_jiffies64a0 After: tick_do_update_jiffies64+0xa0
@4ast: Thank you! I've updated @brendangregg: For Also, 4.6-rc1 is out! Perfect timing to start trying this stuff out. http:https://lkml.iu.edu/hypermail/linux/kernel/1603.3/01187.html 🎉 |
Instead of manually walking the stack and printing each frame to the trace log, we can use a BPF_STACK_TRACE table to store the stack traces, and print to the trace log their IDs every time they are traced. The output of the tool has been slightly modified: we no longer prefix each line of the stack trace with the timestamp and the other headers, since the whole stack trace is fetched in one go from the table and the information would be highly redundant. The man page and the examples have been updated to reflect the new output.
The changes for this script are minimal: the inline C probe has been _significantly_ simplified, and should now perform better since the stack walk happens in native code inside the kernel thanks to the BPF_STACK_TRACE table. The output of the tool should be virtually identical, so the man page and included examples have essentially no changes.
... wow, so BPF_MAP_TYPE_STACK_TRACE is in 4.6-rc1. I must have missed the final integration email. |
Oh, I can simply run
:) So, yeah,
makes sense. |
This PR is a followup on #448 and updates several tools to use the new stack trace APIs in 4.6+.
As always, details on the changes are on each individual commit message. The man pages have been updated to reflect the changes, and the old versions using the manual stack walker have been saved to
tools/old
.TODO
memleak.py
offcputime.py
stackcount.py
stacksnoop.py