Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tools/runqslower.py: Option for --pid or --tid #2635

Merged

Conversation

bodgergely
Copy link
Contributor

Closes: #2607

Option for --pid or --tid tracing.
Enable group leader level tracing.

  • --pid :
    will trace every task that has as group leader.
    Meaning it will trace the parent and its direct children.
  • --tid :
    will trace the given thread

Update example file.

@yonghong-song
Copy link
Collaborator

[buildbot, test this please]

@yonghong-song
Copy link
Collaborator

[buildbot, ok to test]

Copy link
Collaborator

@yonghong-song yonghong-song left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch. Just added some comments. Overall the patch is in the right direction.

tools/runqslower.py Outdated Show resolved Hide resolved
tools/runqslower.py Show resolved Hide resolved
tools/runqslower_example.txt Show resolved Hide resolved
tools/runqslower.py Outdated Show resolved Hide resolved
@bodgergely bodgergely force-pushed the bodgergely/runqslower-pid-tip-2607 branch from 3a98afb to 82c37af Compare December 6, 2019 12:35
@bodgergely
Copy link
Contributor Author

bodgergely commented Dec 6, 2019

For now I just made changes based on all your comments except the

could you change all tgid => pid and pid => tid in both python and C code? This will make code cleaner, less confusion and easy to follow.

Copy link
Collaborator

@yonghong-song yonghong-song left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not mind you use tgid/pid in the bpf code to represent pid/tid in user space as long as the implementation is correct. Now the output we emit something like

TIME     COMM             PID           LAT(us)
10:29:01 kworker/11:2     177355            265
10:29:02 ksoftirqd/24     133              1068
10:29:02 dnsmond          561899            119

The PID here is actually tid. Do you mind to change to TID in python code and examples?

tools/runqslower.py Outdated Show resolved Hide resolved
tools/runqslower.py Outdated Show resolved Hide resolved
@bodgergely
Copy link
Contributor Author

bodgergely commented Dec 20, 2019

@yonghong-song Sorry for not resolving your comments yet... Will do so, today or over the weekend.

@bodgergely bodgergely force-pushed the bodgergely/runqslower-pid-tip-2607 branch from 82c37af to dc60797 Compare December 29, 2019 14:57
Closes: iovisor#2607

Option for --pid or --tid tracing.
Enable group leader level tracing.
* --pid <pid>:
    will trace every task that has <pid> as group leader.
    Meaning it will trace the parent and its direct children.
* --tid <tid>:
    will trace the given thread
* Reporting TID instead of PID in the report column.
* Update the example file to reflect the TID/PID change
@bodgergely bodgergely force-pushed the bodgergely/runqslower-pid-tip-2607 branch from dc60797 to 2a3818b Compare December 29, 2019 15:00
@bodgergely
Copy link
Contributor Author

bodgergely commented Dec 29, 2019

@yonghong-song Thanks for the valuable feedback! I made the changes. Can you take a look
and see if the changes look good?
Unfortunately I already squashed the commits so it will be a bit more difficult for you to see the updates
I did...sorry about that.

@yonghong-song
Copy link
Collaborator

[buildbot, test this please]

@yonghong-song
Copy link
Collaborator

@bodgergely no problem. will take a look soon.

tools/runqslower.py Outdated Show resolved Hide resolved
tools/runqslower.py Outdated Show resolved Hide resolved
tools/runqslower.py Outdated Show resolved Hide resolved
@@ -240,7 +260,7 @@ def print_event(cpu, data, size):
b.attach_kprobe(event="finish_task_switch", fn_name="trace_run")

print("Tracing run queue latency higher than %d us" % min_us)
print("%-8s %-16s %-6s %14s" % ("TIME", "COMM", "PID", "LAT(us)"))
print("%-8s %-16s %-6s %14s" % ("TIME", "COMM", "TID", "LAT(us)"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not sound right. You cannot use "TID" all the time. If user specify '-p ', you want to show "PID".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might get threads showing up in case of '--pid' flag too. So using PID is also not correct in that case either.
Anyways my 2 cents are on just to simply stay with showing PID as a column value even for threads. Htop, top does the same. Showing PID/TID is bringing quite a lot possible confusion here.
Or what do you vote for?
If you prefer I can make it show PID if '--pid' and TID if '--tid'.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think about this again. For kernel run queue, actually TID indeed makes sense. So, yes, we can just do TID here. If user filtered through pid, showing TID should be okay too, redundant PID information may not be needed. If user did not do any filterring, TID is the right choice too.

In summary, we can just use "TID" as you changed the above. Please a comment though. Please address other comments and resubmit. Thanks!

Copy link
Contributor Author

@bodgergely bodgergely Jan 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, just to make it clear: :)

  • In case --pid cmd line arg -> show PID in the report column
  • In case --tid cmd line arg specified -> show TID in the report column

Is the above what you think would make the most sense?

tools/runqslower_example.txt Outdated Show resolved Hide resolved
tools/runqslower_example.txt Outdated Show resolved Hide resolved
Note: Showing TID (thread id) in the report column. The smallest
execution unit becomes a TID when using the --pid flag as
in that case the tool reports not only the parent pid but
its children threads as well.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do make sure when "-p" is specified, the above contents are presented.

Copy link
Contributor Author

@bodgergely bodgergely Jan 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean to emit (the below caveat for example) when running the tool with --pid the text:

Note: when using the --pid flag the tool reports both the parent pid and any of its children threads as well.

@bodgergely
Copy link
Contributor Author

Updating the man file too. If we agree on the reporting column (TID and PID) then I will make more updates to it.

Copy link
Collaborator

@yonghong-song yonghong-song left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, looks good. Showing TID even when the filtering based on pid is fine since fundamentally run queue is operating on threads. Thanks! Just one minor comment.

@@ -207,6 +218,10 @@
}
"""


if args.pid and args.tid:
parser.error("specify only one of -p and -t")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With exclusive group in the above, the above checking should be redundant, python should have warned earlier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the same! I added this line because tools/offcputime.py had that too at line 102.
https://github.com/iovisor/bcc/blob/master/tools/offcputime.py#L102
Maybe that should be removed then too.
I will do so in mine.

It is already exclusive.
@bodgergely
Copy link
Contributor Author

Let me know if good to go. Also probably I should squash the commits into a single unit before you merge.

@yonghong-song
Copy link
Collaborator

Also probably I should squash the commits into a single unit before you merge.
No worry. I can squash all commits into one before the merge.

@yonghong-song yonghong-song merged commit db82644 into iovisor:master Jan 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Current runqslower.py seems filtering in bpf program based on task id instead of user visible pid.
2 participants