
examples/ext4dist.bpf.c seems to have too few entries allowed in the latency map #371

Closed
siebenmann opened this issue Apr 3, 2024 · 3 comments


@siebenmann (Contributor)

examples/ext4dist.bpf.c appears to set the maximum capacity of the ext4_latency_seconds hash map to MAX_LATENCY_SLOT + 1 entries. However, the code stores up to that many bucket entries for each separate file operation (so MAX_LATENCY_SLOT + 1 entries for reads, again for writes, and so on), which doesn't fit within max_entries. I believe this explains some oddities I saw in actual use that went away when I multiplied the limit by the number of operations. Since I'm inexperienced with eBPF programming, I may be missing something here.

If this is a real bug, this lack of accounting for multiple operation types likely exists in other examples as well. In some cases, such as biolatency.bpf.c, it may normally be camouflaged: few people have anywhere near 255 disks, so for most people (with modest numbers of disks) the space reserved for unused disks compensates for the missing per-operation factor.


bobrik commented Apr 3, 2024

You are right. Would you like to make a PR to fix this?

cc @alebsys

@siebenmann (Author)

Unfortunately, I don't feel I know what the right fix for this would be. Define a new enum element at the end, maybe called F_MAX, and multiply by that in the max_entries definition?

(Fixing things like biolatency.bpf.c looks even trickier since the possible req_ops values are non-contiguous, with REQ_OP_LAST set to 36 but only 14 REQ_OP_* values currently defined.)


bobrik commented Apr 3, 2024

> Define a new enum element at the end, called maybe F_MAX, and multiply by that in the max_entries definition?

That's what I would do.

For biolatency you can add a new define in the eBPF code itself with the number of ops we actually use.
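Something along these lines — a sketch only, assuming the op names and the MAX_LATENCY_SLOT macro from the issue rather than the actual file contents:

```c
/* Hypothetical fix sketch; field and map names are illustrative. */
enum file_op {
    F_READ,
    F_WRITE,
    F_OPEN,
    F_FSYNC,
    F_MAX, /* sentinel: number of tracked operations */
};

struct latency_key_t {
    u32 op;   /* enum file_op */
    u64 slot; /* log2 latency bucket */
};

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    /* one full set of buckets per operation */
    __uint(max_entries, F_MAX * (MAX_LATENCY_SLOT + 1));
    __type(key, struct latency_key_t);
    __type(value, u64);
} ext4_latency_seconds SEC(".maps");
```

The F_MAX sentinel keeps the map size in sync automatically if operations are later added to the enum.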

siebenmann added a commit to siebenmann/ebpf_exporter that referenced this issue Apr 4, 2024
ext4dist and xfsdist accumulate the latency histograms for multiple
operations, so the size of the latency hash maps must be the number of
latency slots multiplied by the number of operations. We define a new
F_MAX enum element and use it to size the BPF hash map.

This fixes issue cloudflare#371 for ext4dist and xfsdist.
siebenmann added a commit to siebenmann/ebpf_exporter that referenced this issue Apr 4, 2024
This was previously sizing the BPF hash map used for the latency
histogram as the number of latency slots (plus one) times the maximum
number of disks. However, we accumulate latency histograms for
multiple operations, not a single one, so we need to also multiply the
hash map size by the number of expected operations in order to be
correct. Problems were typically hidden previously by the maximum
number of disks being large and the number of different types of
operations seen in practice typically being modest (five is typical on
our machines).

Much as the maximum number of disks is conservative, the maximum
number of different request operations is also set conservatively, to
the total of 14 REQ_OP_* things defined in include/linux/blk_types.h
as of 6.9-rc2.

This should correct issue cloudflare#371 for biolatency.
@bobrik bobrik closed this as completed Apr 8, 2024