add average function latency #3061
Conversation
[buildbot, test this please]
Can you please run a stress test and estimate the overhead of this addition? If the overhead is at all measurable, I'd rather see this functionality moved to a "-v" for verbose option. If it's negligible, then perhaps we can keep it as the default. FWIW, a lot of tools are designed to give basic stats to both minimize overhead and visual clutter.
The total value is really useful for real-world problems. With it, we can determine that a 2~5% performance drop is caused by a very subtle difference.
I understand its value, but not the cost. In this and other tools I deliberately leave out extra stuff in the default output so that end users can run the tool with minimum overhead in production, and then choose to have extended output with more cost via command line options. funccount(8) in particular can be aimed at very frequent functions, executed millions of times per second. But instead of a "-v" for verbose mode, you're adding it to the default output, so we don't have a choice and pay the cost always. That might be ok if the cost was negligible. But you're going to have to do performance analysis of the performance tool to show what it is.
BTW, personally I'd try a few things, like dd /dev/zero to /dev/null with a block size of 1, without the tool running, with the old version running (instrumenting, say, vfs_read()), and with the new version running. dd prints throughput metrics, aiding comparisons. |
I tested the tool with dd from /dev/zero to /dev/null and bs=1 as suggested by @brendangregg |
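The measurement side of that A/B test can be sketched as follows. This is only an illustration: the helper name is made up, and attaching the old/new funclatency to vfs_read is done manually in another shell while the same dd workload is rerun.

```python
# Sketch: run dd /dev/zero -> /dev/null with bs=1 and parse GNU dd's
# throughput report, so baseline vs. instrumented runs can be compared.
import re
import subprocess

def dd_bytes_per_sec(count=100000):
    """Copy `count` single-byte blocks and return bytes/sec (or None)."""
    r = subprocess.run(
        ["dd", "if=/dev/zero", "of=/dev/null", "bs=1", "count=%d" % count],
        capture_output=True, text=True)
    # GNU dd reports on stderr, e.g.:
    # "100000 bytes (100 kB, 98 KiB) copied, 0.0857 s, 1.2 MB/s"
    m = re.search(r"copied, ([0-9.eE+-]+)", r.stderr)
    try:
        return count / float(m.group(1))
    except (AttributeError, ValueError, ZeroDivisionError):
        return None
```

Compare the baseline figure against runs taken while the old and the new version of the tool are attached; the throughput delta is the tool's overhead.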
Hi, I did a few more tests on this. As the new feature introduces 'lock_xadd' in the code, the performance of the tool might be impacted.
In this test the overhead of the new version versus the old version is negligible. Nevertheless, I don't think we would run multiple threads of funclatency in practice?
Test 2
In this test the overhead of the new version vs. the old version is negligible as well.
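To see where the 'lock_xadd' concern comes from, here is a hypothetical sketch (names and layout are assumptions, not the PR's actual diff) of the kind of BPF C the feature needs: a running total and count shared by all CPUs must be updated atomically, and those atomic adds are the potential contention point when the traced function runs on many threads at once.

```python
# bcc-style Python tool with embedded BPF C text (illustrative only).
bpf_text = """
typedef struct { u64 total; u64 count; } stats_t;
BPF_ARRAY(avg, stats_t, 1);

int trace_func_return(struct pt_regs *ctx) {
    u64 delta = 0;   // in the real tool: now - stored entry timestamp
    int zero = 0;
    stats_t *s = avg.lookup(&zero);
    if (s) {
        lock_xadd(&s->total, delta);  // atomic add shared across CPUs
        lock_xadd(&s->count, 1);      // same cacheline: serializes updates
    }
    return 0;
}
"""
```

The average is then computed in user space as total/count when the tool prints its output, so the per-event cost is just the two atomic adds.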
Great testing, thanks. So it looks like the overhead is negligible (I was expecting more given how much overhead this tool costs in the first place, which you can see by comparing the Test 2 (a) and (b) columns). It is a touch annoying to change the default output of tools (as it breaks documentation; adding options doesn't typically break documentation) and to clutter it a little, but the benefits probably outweigh that, so I'm happy to merge.
Can you please update the _example.txt file examples? (see what I mean about breaking documentation!) Perhaps the quickest way (rather than redoing all the examples) is to take each output and then add a fake summary line based on your best guess from the distribution data. Not ideal, but maybe good enough for the example. E.g.:
So we need to add the summary line to each example. It'll take a few minutes to do each example. Some awk can help.
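A Python equivalent of the awk Brendan suggests could look like this: estimate an average from a bcc log2 histogram ("lo -> hi : count" rows) by weighting each bucket midpoint by its count, then emit a fake summary line for the examples file. The summary format shown is an assumption about what the new tool prints.

```python
# Sketch: derive a plausible summary line from existing histogram output.
import re

def fake_summary(hist_text):
    """Weight each bucket midpoint by its count to guess avg/total/count."""
    total = count = 0.0
    for lo, hi, n in re.findall(r"(\d+)\s+->\s+(\d+)\s+:\s+(\d+)", hist_text):
        total += (int(lo) + int(hi)) / 2.0 * int(n)
        count += int(n)
    if count == 0:
        return "avg = 0 nsecs, total: 0 nsecs, count: 0"
    return "avg = %d nsecs, total: %d nsecs, count: %d" % (
        total / count, total, count)
```

Being derived from bucket midpoints, the result is only a best guess from the distribution data, which is exactly what the comment above asks for.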
[buildbot, test this please]
* add average function latency * update example.txt Change-Id: I04e423e2b888e5770e639d856572681dc94e8862
* add average function latency * update example.txt Change-Id: I960e77ac3189600cebaa86f8ef4b94f992c76ee8
* add average function latency * update example.txt
The tool funclatency does not provide the average latency of the traced function.
This adds the average latency, with output like this: