-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
broken ksnoop tool #4746
Comments
I've reproduced this issue: same symptom, no output, even when attaching to the same function with |
In my investigation KSNOOP_IP_FIX macro is causing the issue. Kprobes are preferably attached at ftrace locations. Since, this location might depend on the stack configurations(CFA/CFI etc) so we might not be lucky enough everytime to have exact address match in ksnoop_func_map. |
I agree; I think in general, we've got two options for fixing ksnoop:
|
In this case, a quick hack could be to force 16-byte alignment in |
Ideally, we'd handle this by doing |
seems like we should probably also use bpf_get_func_ip() where available, only falling back to KSNOOP_IP_FIX if that doesn't exist (it didn't when ksnoop originally landed). I also wonder if we can use an CO-RE tricks to infer MCOUNT_INSN_SIZE from a data structure that uses it? |
something like if (bpf_core_enum_value_exists(enum bpf_func_id, BPF_FUNC_get_func_ip)) (need to figure out the exact macro for KSNOOP_IP_FIX(), but the above should solve the issue for newer kernels (>5.15 ish) at least.. |
can you try attached patch and see if it helps? (I think it will help for kernels 5.13 and newer). it uses bpf_get_func_ip() where available.. |
Yeah, @alan-maguire, that patch works. Thanks! |
Great, thanks for the quick response! So I can file a PR using that fix for now, and we can think about the best solution for the pre-5.13 case. I suggest we leave this issue open regardless so we can track the pre-5.13 aspect too, since it's not really fixed until it's fixed for that case as well. What do you think? |
Yeah, just tested in a 5.4.0-137-generic kernel in Ubuntu 18.04 (I had a VM handy), and the same issue occurs; to fully fix this problem, we definitely need to do something hacky to account for the |
bpf_get_func_ip(ctx) will get the function address; use it where available using the BPF core enum value check for the function. This avoids needing to get the caller IP via KSNOOP_IP_FIX() for more up-to-date kernels. This fixes issue iovisor#4746 for kernels newer than 5.13; an additional followup will be needed to address this for older kernels. Kindly tested by https://github.com/matthew-olson-intel on a recent kernel. Reported-by: dubeyabhishek (https://github.com/dubeyabhishek) Signed-off-by: Alan Maguire <[email protected]>
okay sad to say i couldn't come up with anything better than the attached patch. the idea is this - we do a one-off serach for MCOUNT_INSN_SIZE by continually subtracting a byte from the PT_REGS_IP_CORE() instruction pointer value and checking if our map lookup succeeds. When/if it does, we cache the mcount_insn_size value, so it's a one-off cost. If you get a chance to give it a spin on your 5.4 kernel that would be great. thanks! |
Can't we just |
bpf_get_func_ip(ctx) will get the function address; use it where available using the BPF core enum value check for the function. This avoids needing to get the caller IP via KSNOOP_IP_FIX() for more up-to-date kernels. This fixes issue #4746 for kernels newer than 5.13; an additional followup will be needed to address this for older kernels. Kindly tested by https://github.com/matthew-olson-intel on a recent kernel. Reported-by: dubeyabhishek (https://github.com/dubeyabhishek) Signed-off-by: Alan Maguire <[email protected]>
would certainly be easier! however it's somewhat config-dependent on x86 at least; if CONFIG_FUNCTION_TRACER is defined, it's 5, otherwise 0. there may be other cases for other arches too, haven't looked yet. |
I tested that new patch on 5.4, and while it does find the proper IP and correctly identifies |
i haven't been able to reproduce this yet; if you get a chance would you mind attaching the 5.4 kernel config where you see this failure? thanks! |
Yep, it's just the standard Ubuntu 5.4.0-137-generic config. |
Could we employ CO-RE kconfig here ? |
we probably could, though it might get a bit messy depending on how things look on other architectures. first order of business is to get a fix that works for the 5.4 case though; the above one doesn't and i'm not sure why yet, so i'll try and reproduce that failure now.. |
Apologies for the delay, but I got set up with Ubuntu 18.04: Linux bpfubuntu 5.4.0-1100-oracle #109~18.04.1-Ubuntu SMP Fri Apr 21 12:01:52 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux When I run "ksnoop vma_merge" I see it working: ubuntu@bpfubuntu:~/src/bcc/libbpf-tools$ sudo ksnoop vma_merge I don't see anything significant in the config diff between mine and yours |
I just tried again this morning (in the same VM), and it's working. In previous attempts, I waited 2 or so minutes (left to get a cup of coffee, etc.), so figured it wasn't working because other VMs had given me samples immediately. However, this morning I left it running for a few more minutes and got a couple of samples in the 5.4 kernel. I'm sorry about that; I should have left it running for a longer period of time to be really sure it wasn't working. Perhaps in recompense, I can try running it in a BTF-enabled 4.18 kernel to be really sure. But it looks like everything's all fixed now! |
no apology necessary! I saw nothing and assumed it was broken for me too. Then triggering some activity led to seeing output. However here's the thing - I didn't have the proposed fix in place, I had built master branch bcc to get a baseline. Without the proposed patch (just using the master branch of bcc) do you see ksnoop output? |
Yes -- it seems that Seems everything's fixed now, then. Case closed! |
bpf_get_func_ip(ctx) will get the function address; use it where available using the BPF core enum value check for the function. This avoids needing to get the caller IP via KSNOOP_IP_FIX() for more up-to-date kernels. This fixes issue iovisor#4746 for kernels newer than 5.13; an additional followup will be needed to address this for older kernels. Kindly tested by https://github.com/matthew-olson-intel on a recent kernel. Reported-by: dubeyabhishek (https://github.com/dubeyabhishek) Signed-off-by: Alan Maguire <[email protected]>
Not getting any output for ksnoop tool. Tried multiple functions in argument, no error/failed message is appearing.
Example run below:
./ksnoop trace "vma_merge(mm)" -d
Debug: Parsing trace 'vma_merge(mm)'
Debug: got func 'vma_merge', args 'mm'
Debug: vma_merge = <ip c00000000046df10, mod vmlinux>
Debug: getting BTF for vmlinux
libbpf: loading kernel BTF '/sys/kernel/btf/vmlinux': 0
Debug: arg #1: <name 'vmi', type id '9646'>
Debug: arg #2: <name 'mm', type id '415'>
Debug: arg #3: <name 'prev', type id '496'>
Debug: arg #4: <name 'addr', type id '1'>
Debug: arg #5: <name 'end', type id '1'>
Debug: return value: type id '496'>
Debug: 'mm' arg, predcate '(null)'
Debug: setting base arg for val mm to 1
Debug: custom trace with 1 args
libbpf: loading object 'ksnoop_bpf' from buffer
libbpf: elf: section(2) .text, size 5704, link 0, flags 6, type=1
libbpf: sec '.text': found program 'ksnoop' at insn offset 0 (0 bytes), code size 713 insns (5704 bytes)
libbpf: elf: section(3) .rel.text, size 176, link 15, flags 40, type=9
libbpf: elf: section(4) kprobe/foo, size 32, link 0, flags 6, type=1
libbpf: sec 'kprobe/foo': found program 'kprobe_entry' at insn offset 0 (0 bytes), code size 4 insns (32 bytes)
libbpf: elf: section(5) .relkprobe/foo, size 16, link 15, flags 40, type=9
libbpf: elf: section(6) kretprobe/foo, size 32, link 0, flags 6, type=1
libbpf: sec 'kretprobe/foo': found program 'kprobe_return' at insn offset 0 (0 bytes), code size 4 insns (32 bytes)
libbpf: elf: section(7) .relkretprobe/foo, size 16, link 15, flags 40, type=9
libbpf: elf: section(8) license, size 13, link 0, flags 3, type=1
libbpf: license of ksnoop_bpf is Dual BSD/GPL
libbpf: elf: section(9) .maps, size 88, link 0, flags 3, type=1
libbpf: elf: section(10) .BTF, size 8893, link 0, flags 0, type=1
libbpf: elf: section(12) .BTF.ext, size 4572, link 0, flags 0, type=1
libbpf: elf: section(15) .symtab, size 1920, link 1, flags 0, type=2
libbpf: looking for externs among 80 symbols...
libbpf: collected 0 externs total
libbpf: map 'ksnoop_func_stack': at sec_idx 9, offset 0.
libbpf: map 'ksnoop_func_stack': found type = 1.
libbpf: map 'ksnoop_func_stack': found key [8], sz = 8.
libbpf: map 'ksnoop_func_stack': found value [11], sz = 144.
libbpf: map 'ksnoop_func_stack': found max_entries = 2048.
libbpf: map 'ksnoop_func_map': at sec_idx 9, offset 32.
libbpf: map 'ksnoop_func_map': found type = 5.
libbpf: map 'ksnoop_func_map': found key [8], sz = 8.
libbpf: map 'ksnoop_func_map': found value [22], sz = 16296.
libbpf: map 'ksnoop_func_map': found max_entries = 64.
libbpf: map 'ksnoop_perf_map': at sec_idx 9, offset 64.
libbpf: map 'ksnoop_perf_map': found type = 4.
libbpf: map 'ksnoop_perf_map': found key_size = 4.
libbpf: map 'ksnoop_perf_map': found value_size = 4.
libbpf: sec '.rel.text': collecting relocation for section(2) '.text'
libbpf: sec '.rel.text': relo #0: insn #6 against 'ksnoop_func_stack'
libbpf: prog 'ksnoop': found map 0 (ksnoop_func_stack, sec 9, off 0) for insn #6
libbpf: sec '.rel.text': relo #1: insn #35 against 'ksnoop_func_stack'
libbpf: prog 'ksnoop': found map 0 (ksnoop_func_stack, sec 9, off 0) for insn #35
libbpf: sec '.rel.text': relo #2: insn #40 against 'ksnoop_func_stack'
libbpf: prog 'ksnoop': found map 0 (ksnoop_func_stack, sec 9, off 0) for insn #40
libbpf: sec '.rel.text': relo #3: insn #85 against 'ksnoop_func_map'
libbpf: prog 'ksnoop': found map 1 (ksnoop_func_map, sec 9, off 32) for insn #85
libbpf: sec '.rel.text': relo #4: insn #146 against 'ksnoop_func_map'
libbpf: prog 'ksnoop': found map 1 (ksnoop_func_map, sec 9, off 32) for insn #146
libbpf: sec '.rel.text': relo #5: insn #490 against 'ksnoop_func_stack'
libbpf: prog 'ksnoop': found map 0 (ksnoop_func_stack, sec 9, off 0) for insn #490
libbpf: sec '.rel.text': relo #6: insn #540 against 'ksnoop_func_map'
libbpf: prog 'ksnoop': found map 1 (ksnoop_func_map, sec 9, off 32) for insn #540
libbpf: sec '.rel.text': relo #7: insn #568 against 'ksnoop_perf_map'
libbpf: prog 'ksnoop': found map 2 (ksnoop_perf_map, sec 9, off 64) for insn #568
libbpf: sec '.rel.text': relo #8: insn #584 against 'ksnoop_perf_map'
libbpf: prog 'ksnoop': found map 2 (ksnoop_perf_map, sec 9, off 64) for insn #584
libbpf: sec '.rel.text': relo #9: insn #635 against 'ksnoop_func_map'
libbpf: prog 'ksnoop': found map 1 (ksnoop_func_map, sec 9, off 32) for insn #635
libbpf: sec '.rel.text': relo #10: insn #665 against 'ksnoop_perf_map'
libbpf: prog 'ksnoop': found map 2 (ksnoop_perf_map, sec 9, off 64) for insn #665
libbpf: sec '.relkprobe/foo': collecting relocation for section(4) 'kprobe/foo'
libbpf: sec '.relkprobe/foo': relo #0: insn #1 against '.text'
libbpf: sec '.relkretprobe/foo': collecting relocation for section(6) 'kretprobe/foo'
libbpf: sec '.relkretprobe/foo': relo #0: insn #1 against '.text'
libbpf: loading kernel BTF '/sys/kernel/btf/vmlinux': 0
libbpf: map 'ksnoop_func_stack': created successfully, fd=4
libbpf: map 'ksnoop_func_map': created successfully, fd=5
libbpf: map 'ksnoop_perf_map': setting size to 8
libbpf: map 'ksnoop_perf_map': created successfully, fd=6
libbpf: sec '.text': found 7 CO-RE relocations
libbpf: CO-RE relocating [70] struct pt_regs: found target candidate [111] struct pt_regs in [vmlinux]
libbpf: prog 'ksnoop': relo #0: <byte_off> [70] struct pt_regs.nip (0:0:1:1 @ offset 256)
libbpf: prog 'ksnoop': relo #0: matching candidate #0 <byte_off> [111] struct pt_regs.nip (0:0:1:1 @ offset 256)
libbpf: prog 'ksnoop': relo #0: patched insn #50 (ALU/ALU64) imm 256 -> 256
libbpf: prog 'ksnoop': relo #1: <byte_off> [70] struct pt_regs.gpr[3] (0:0:1:0:3 @ offset 24)
libbpf: prog 'ksnoop': relo #1: matching candidate #0 <byte_off> [111] struct pt_regs.gpr[3] (0:0:1:0:3 @ offset 24)
libbpf: prog 'ksnoop': relo #1: patched insn #222 (ALU/ALU64) imm 24 -> 24
libbpf: prog 'ksnoop': relo #2: <byte_off> [70] struct pt_regs.gpr[4] (0:0:1:0:4 @ offset 32)
libbpf: prog 'ksnoop': relo #2: matching candidate #0 <byte_off> [111] struct pt_regs.gpr[4] (0:0:1:0:4 @ offset 32)
libbpf: prog 'ksnoop': relo #2: patched insn #226 (ALU/ALU64) imm 32 -> 32
libbpf: prog 'ksnoop': relo #3: <byte_off> [70] struct pt_regs.gpr[5] (0:0:1:0:5 @ offset 40)
libbpf: prog 'ksnoop': relo #3: matching candidate #0 <byte_off> [111] struct pt_regs.gpr[5] (0:0:1:0:5 @ offset 40)
libbpf: prog 'ksnoop': relo #3: patched insn #230 (ALU/ALU64) imm 40 -> 40
libbpf: prog 'ksnoop': relo #4: <byte_off> [70] struct pt_regs.gpr[6] (0:0:1:0:6 @ offset 48)
libbpf: prog 'ksnoop': relo #4: matching candidate #0 <byte_off> [111] struct pt_regs.gpr[6] (0:0:1:0:6 @ offset 48)
libbpf: prog 'ksnoop': relo #4: patched insn #234 (ALU/ALU64) imm 48 -> 48
libbpf: prog 'ksnoop': relo #5: <byte_off> [70] struct pt_regs.gpr[7] (0:0:1:0:7 @ offset 56)
libbpf: prog 'ksnoop': relo #5: matching candidate #0 <byte_off> [111] struct pt_regs.gpr[7] (0:0:1:0:7 @ offset 56)
libbpf: prog 'ksnoop': relo #5: patched insn #238 (ALU/ALU64) imm 56 -> 56
libbpf: prog 'ksnoop': relo #6: <byte_off> [70] struct pt_regs.gpr[3] (0:0:1:0:3 @ offset 24)
libbpf: prog 'ksnoop': relo #6: matching candidate #0 <byte_off> [111] struct pt_regs.gpr[3] (0:0:1:0:3 @ offset 24)
libbpf: prog 'ksnoop': relo #6: patched insn #294 (ALU/ALU64) imm 24 -> 24
libbpf: prog 'kprobe_entry': added 713 insns from sub-prog 'ksnoop'
libbpf: prog 'kprobe_entry': insn #1 relocated, imm 2 points to subprog 'ksnoop' (now at 4 offset)
libbpf: prog 'kprobe_return': added 713 insns from sub-prog 'ksnoop'
libbpf: prog 'kprobe_return': insn #1 relocated, imm 2 points to subprog 'ksnoop' (now at 4 offset)
Debug: Attached kprobe for 'vma_merge'
Debug: Attached kretprobe for 'vma_merge'
TIME CPU PID FUNCTION/ARGS
The text was updated successfully, but these errors were encountered: