
examples/ext4dist.bpf.c seems to have too few entries allowed in the latency map #371

Closed
siebenmann opened this issue Apr 3, 2024 · 3 comments


@siebenmann (Contributor)

examples/ext4dist.bpf.c appears to set the maximum capacity of the ext4_latency_seconds hash map to MAX_LATENCY_SLOT + 1 entries. However, the code stores up to that many bucket entries for each separate file operation (so MAX_LATENCY_SLOT + 1 entries for reads, again for writes, and so on), which doesn't fit within max_entries. I believe this explains some oddities I saw in actual use that went away when I multiplied the limit by the number of operations. Since I'm inexperienced with eBPF programming, I may be missing something here.

If this is a real bug, this lack of accounting for multiple operation types likely exists in other examples as well. In some cases, such as biolatency.bpf.c, it may normally be camouflaged: few people have anywhere near 255 disks, so for most people (with modest numbers of disks) the space reserved for unused disks compensates for the missing per-operation factor.


bobrik commented Apr 3, 2024

You are right. Would you like to make a PR to fix this?

cc @alebsys

@siebenmann (Author)

Unfortunately, I don't feel I know what the right fix for this would be. Define a new enum element at the end, maybe called F_MAX, and multiply by that in the max_entries definition?

(Fixing things like biolatency.bpf.c looks even trickier since the possible req_ops values are non-contiguous, with REQ_OP_LAST set to 36 but only 14 REQ_OP_* values currently defined.)


bobrik commented Apr 3, 2024

> Define a new enum element at the end, called maybe F_MAX, and multiply by that in the max_entries definition?

That's what I would do.

For biolatency you can add a new define in the eBPF code itself with the number of ops we actually use.
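Something along these lines — a sketch only, assuming the op names and the MAX_LATENCY_SLOT macro from the issue rather than the actual file contents:

```c
/* Hypothetical fix sketch; field and map names are illustrative. */
enum file_op {
    F_READ,
    F_WRITE,
    F_OPEN,
    F_FSYNC,
    F_MAX, /* sentinel: number of tracked operations */
};

struct latency_key_t {
    u32 op;   /* enum file_op */
    u64 slot; /* log2 latency bucket */
};

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    /* one full set of buckets per operation */
    __uint(max_entries, F_MAX * (MAX_LATENCY_SLOT + 1));
    __type(key, struct latency_key_t);
    __type(value, u64);
} ext4_latency_seconds SEC(".maps");
```

The F_MAX sentinel keeps the map size in sync automatically if operations are later added to the enum.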

siebenmann added a commit to siebenmann/ebpf_exporter that referenced this issue Apr 4, 2024
ext4dist and xfsdist accumulate the latency histograms for multiple
operations, so the size of the latency hash maps must be the number of
latency slots multiplied by the number of operations. We define a new
F_MAX enum element and use it to size the BPF hash map.

This fixes issue cloudflare#371 for ext4dist and xfsdist.
siebenmann added a commit to siebenmann/ebpf_exporter that referenced this issue Apr 4, 2024
This was previously sizing the BPF hash map used for the latency
histogram as the number of latency slots (plus one) times the maximum
number of disks. However, we accumulate latency histograms for
multiple operations, not a single one, so we need to also multiply the
hash map size by the number of expected operations in order to be
correct. Problems were typically hidden previously by the maximum
number of disks being large and the number of different types of
operations seen in practice typically being modest (five is typical on
our machines).

Much as the maximum number of disks is conservative, the maximum
number of different request operations is also set conservatively, to
the total of 14 REQ_OP_* things defined in include/linux/blk_types.h
as of 6.9-rc2.

This should correct issue cloudflare#371 for biolatency.
@bobrik bobrik closed this as completed Apr 8, 2024