Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include sampling profiler in julia Base #2597

Closed
ViralBShah opened this issue Mar 18, 2013 · 49 comments
Closed

Include sampling profiler in julia Base #2597

ViralBShah opened this issue Mar 18, 2013 · 49 comments
Assignees
Milestone

Comments

@ViralBShah
Copy link
Member

Include @timholy 's sampling profiler in julia Base. This was the general consensus in #1865 and I am opening a new issue to track this.

@ghost ghost assigned timholy Mar 18, 2013
@timholy
Copy link
Sponsor Member

timholy commented Mar 18, 2013

I haven't forgotten this. There's at least one thing holding me back: I think to be really useful, sprofile_tree needs graphic visualization. I haven't implemented that yet, but it's on my TODO list.

Until we have graphics in base, I wonder if it's better left in a package.

@ViralBShah
Copy link
Member Author

Ok. I personally would be happy with a text based profiler, and can certainly use it from the package for now. Changing this to the 1.0 milestone.

@JeffBezanson
Copy link
Sponsor Member

Agree; I am happy picking through obscure text output. And a simple rough summary of where time is spent can be quite valuable, even without call tree information.

@ViralBShah
Copy link
Member Author

The sprofile_flat example shown in the SProfile README is good enough for a lot of use cases. I would feel a lot better if this was available in base. The graphical version can come later when we have visualization.

@timholy
Copy link
Sponsor Member

timholy commented Mar 18, 2013

OK, I can push what's available to core in time for 0.2. Sometime later I'll experiment with graphical visualization in the package.

As far as interpreting the results of text-based output from sprofile_tree, the best trick is to store the results to a file and use a text editor's cursor to help follow the indentation. (I had to switch to single-character indentation just because the call tree is often so deeply nested, but that's made it quite hard to parse visually.)

@timholy
Copy link
Sponsor Member

timholy commented Mar 19, 2013

There's a new sprofile branch. Feel free to make changes, and merge into master when you deem it ready. (Or give feedback and I can do the cleanups.)

@timholy
Copy link
Sponsor Member

timholy commented Mar 19, 2013

I should have added: sprofile.c needs a timer implementation for Windows. All it needs to do is the same thing that the Darwin & Linux #if defined functions do.

@ViralBShah
Copy link
Member Author

I was able to compile on mac. Can we call this profile instead of sprofile? The original profile would then need a new name.

@ViralBShah
Copy link
Member Author

Can we use the libuv timers here?

@vtjnash
Copy link
Sponsor Member

vtjnash commented Mar 20, 2013

@ViralBShah we need fully async timers, which libuv does not provide

I would be very annoyed to have this merged until windows support can be realized (since it tends to cause a bug report headache -- and windows already has enough open issues). I don't think that is feasible for 0.2.

I can probably help some there. However, brief google searching doesn't reveal the existence of any asynchronous timers, so we probably need a helper thread. But more brief google searching hasn't been very fruitful in finding a version of RtlCaptureStackBackTrace which accepts a thread PHANDLE or CONTEXT argument (although gdb is able to pretend to do it, and fail miserably at succeeding)

@ViralBShah
Copy link
Member Author

That is fair enough. It would be great to have this working on Windows, and we can certainly wait for one more release for that.

@timholy
Copy link
Sponsor Member

timholy commented Mar 20, 2013

Didn't realize that timers on Windows would be so problematic, given that it's a dozen lines on Unix. Sorry about that. I'll do some investigating myself.

If necessary this can sit and stew until 0.3.

@timholy
Copy link
Sponsor Member

timholy commented Mar 20, 2013

Wow, it's shockingly hard to find a "signaling timer" on Windows that has better than several-millisecond accuracy.

I can't quite be sure, but I wonder if this might do the trick:
https://msdn.microsoft.com/en-us/library/windows/desktop/ms686289(v=vs.85).aspx

But see:
https://stackoverflow.com/questions/12199872/resolution-of-waitforsingleobject-with-timesetevent-setwaitabletimer
The last reply about migrating between cores is interesting, but I wonder whether that would, say, reduce BLAS to using just one core?

@timholy
Copy link
Sponsor Member

timholy commented Mar 20, 2013

Can we call this profile instead of sprofile? The original profile would then need a new name.

The original profile has already been renamed iprofile for "instrumenting profiler." So this wouldn't be a problem. But I'm a little hesitant because there are in fact quite a few possible types of profilers:
https://en.wikipedia.org/wiki/Profiling_(computer_programming)

If, for example, we ever develop a call-graph or hypervisor profiler and want to include that in base (that would basically have to be in base), then having the sampling profiler that's simply called "profile" might be annoying.

@timholy
Copy link
Sponsor Member

timholy commented Mar 20, 2013

A couple more thoughts (then I'm going to be away for a couple days with uncertain access to WiFi):

  • If the timer occasionally has much longer gaps, that's not a catastrophe: it just means that you don't collect as many samples as you intended. It only needs to be decently accurate, and preferably able to run at relatively short intervals. (My measurements on Linux indicated that taking a backtrace consumes ~30 microseconds, so running once per millisecond yields a profiling penalty of just a few percent, which seems about the right balance.)
  • If it would help and if you think the function I identified will do the trick, I'm happy to put together a draft Windows implementation. But don't expect too much benefit from that, since it would be untested, and testing/fixing bugs would have to be done on a Windows machine.
  • Alternatively, in sysimg.jl we could wrap the include("sprofile.jl") in @unix_only. Don't know how you'd feel about that, Jameson; it would prevent "bug reports" but might be begging for "feature requests."

timholy added a commit that referenced this issue Mar 20, 2013
Until #2597 gets resolved, at least these changes are needed for the
0.2 branch of the package to work.
@vtjnash
Copy link
Sponsor Member

vtjnash commented Mar 20, 2013

Those timers are possibly correct, but need to be used from a separate thread, as they are not asynchronous in the linux sense. Window's will only call the callback function when the application has gone into a sleep state. It wouldn't mess with openblas in that case to set processor affinity for that thread. But the timers aren't necessarily the critical issue -- I'm not sure how to record a backtrace from the another thread (other than doing code injection).

Announcing 0.2 has a new feature is bound to result in bug reports when people try to copy paste the example code into their repl and it doesn't work. I think they have a hard time accepting that linux might just be better.

@vtjnash
Copy link
Sponsor Member

vtjnash commented Mar 20, 2013

edit: I think it can be accomplished with StackWalk64 being called from a timer thread

@timholy
Copy link
Sponsor Member

timholy commented Mar 21, 2013

Announcing 0.2 has a new feature is bound to result in bug reports when
people try to copy paste the example code into their repl and it doesn't
work.

You're absolutely right, of course. Let's just wait until it can be addressed
properly. The package exists for people who want to use it.

@ViralBShah
Copy link
Member Author

@vtjnash It seems that 0.2 should be released in the next month or so. So is 0.3 the right milestone?

@vtjnash
Copy link
Sponsor Member

vtjnash commented Apr 11, 2013

@ViralBShah it still depends on when the windows support can be added to it. also, someone should pick a real date for 0.2 and update all the future milestones appropriately.

On another tangent, anyone know how to suggest features to github? I really want to be able to assign multiple people to issues.

@pao
Copy link
Member

pao commented Apr 11, 2013

@vtjnash Contact support for that--https://github.com/contact

@vtjnash
Copy link
Sponsor Member

vtjnash commented Jun 2, 2013

The windows backtrace code now uses StackWalk64, so we are half way to having this. Note that StackWalk64 & friends are not thread-safe, so we just need to be careful that we don't try to profile the backtrace.

@timholy
Copy link
Sponsor Member

timholy commented Jun 2, 2013

Exciting! Thanks, Jameson!

@ViralBShah
Copy link
Member Author

Is it possible to include this in Base for 0.2, given the recent progress?

@ViralBShah
Copy link
Member Author

@timholy Trying to gently bring the profiler into Base, I am wondering if as a start, we can at least move the C code for sampling into Base? In the early phases, it may not be compiled on Windows, but it would make installing the profiler package much easier.

@timholy
Copy link
Sponsor Member

timholy commented Jul 24, 2013

I have no objections, but I will defer to @vtjnash on this. Certainly the status quo seems quite satisfactory on Linux, although I guess compilers are less common on other platforms.

@vtjnash
Copy link
Sponsor Member

vtjnash commented Jul 24, 2013

It's probably easier for me if it is in base -- fewer places to update code for adding windows support. I just asked that it not be part of the advertised features in base until we can get windows support ready.

@timholy
Copy link
Sponsor Member

timholy commented Jul 24, 2013

We can leave the Julia code in the package, that way the advertised state won't be an issue.

@ViralBShah
Copy link
Member Author

Thanks for adding it. Mentally, it does feel like it is easier to make it work on Windows now that it is in Base. Any chance this can happen in the 0.2 release timeframe?

For the record, relevant commit is 159203b

@StefanKarpinski
Copy link
Sponsor Member

I strongly feel that the sampling profiler should go in base and always be available. It should also be renamed to just @profile and maybe have a little API work done on it, but having it in base seems essential to me.

@timholy
Copy link
Sponsor Member

timholy commented Jul 24, 2013

I was originally reluctant to name it @profile because there are so many possible types of profilers. However, at this point to me this seems like the One True Profiler, and I have difficulty seeing the need for anything else. (Well, a call-graph analyzer would be nice, essentially @which on steroids.)

So I agree that it should probably be renamed. As for merging the Julia code into base, I'm sensitive to @vtjnash's concern that there's still work to be done on the Windows side (high-resolution timers are clearly a major pain on Windows). And since he's such a genius helping make Julia better in so many ways, his word is like, gospel, to me :-).

@StefanKarpinski
Copy link
Sponsor Member

Well, @vtjnash's only objection was announcing this as a feature that's supported on Windows. In fact, he feels it would be easier for him to get the profiler working on Windows if it's part of Base. So I think we should include it and just be very clear that it doesn't yet work on Windows.

@ViralBShah
Copy link
Member Author

That is a useful point. We can merge the sampling profiler into base, but not document it or advertise it until it is ready. It is basically a new feature under development in master, and can be announced when complete.

Let's just call it @profile.

@quinnj
Copy link
Member

quinnj commented Jul 24, 2013

I'd be happy to help test/work on the 64-bit Windows side if needed. This has been at the top of my wish list as of late :)

@timholy
Copy link
Sponsor Member

timholy commented Jul 24, 2013

Maybe I'm misinterpreting, but I think he's wanting it not to be an advertised feature of any base until the Windows support is ready. They've already worked hard to close a lot of gaps, and this just moves the finish line. So even though I want it in base too, I'm quite sympathetic. It's pretty easy to insert "using Profile" into your .juliarc.jl. (I'm not advocating this as a long-term solution, of course.)

One thing I don't see a reason to wait on, however: @StefanKarpinski, I'd be happy to start working on API changes in preparation for a merge to base. Care to file issues at Profile.jl, so I know what you have in mind? Then it will be ready whenever we decide to pull the trigger.

@karbarcca, certainly any help getting it implemented on Windows is the cleanest solution.

@StefanKarpinski
Copy link
Sponsor Member

Let's just move it into Base and make it clear that it doesn't yet work on Windows. Not having it in Base doesn't make it work on Windows any more. Let's do API changes after that.

vtjnash added a commit that referenced this issue Jul 25, 2013
@vtjnash
Copy link
Sponsor Member

vtjnash commented Jul 25, 2013

I'll leave it to @timholy to fix the implementation in the Profile package:

  • removing references to libprofile
  • fixing the ccall types status = ccall((:sprofile_init, libprofile), Cint, (Uint, Uint64), nsamples, delay)

@ViralBShah
Copy link
Member Author

Wow this landed much faster than expected. We can now move the sampling profiler to base!

@ViralBShah
Copy link
Member Author

I am marking this as milestone 0.2, since it may actually be possible. We can move it further out if necessary.

@vtjnash
Copy link
Sponsor Member

vtjnash commented Jul 25, 2013

I'm not sure it should be 0.2, since we really should be tagging that immediately, if Pkg2 is ready.

@ViralBShah
Copy link
Member Author

This perhaps needs to be discussed elsewhere, but we should ideally go through a couple of release candidates, so that we do not have to scamper around and release updates soon after.

@timholy
Copy link
Sponsor Member

timholy commented Jul 25, 2013

Many thanks, @vtjnash, for writing the Windows timers.

The move of the Julia code is done. Given the question about whether this is arriving in time for 0.2, I pushed it to the new "profile" branch. (I'll delete the sprofile branch shortly.)

Quite a lot of API cleanup, too. See 39cd8dc. @StefanKarpinski, please do let me know of any other issues.

@ViralBShah
Copy link
Member Author

The thing is that until this is in master, it is unlikely to get tested out widely, especially on Windows. Perhaps that only happens after the 0.2 release.

@timholy
Copy link
Sponsor Member

timholy commented Jul 25, 2013

I'm not really in the loop about the plans for the 0.2 release; the Boston/IRC crowd probably knows more. So somebody else had better make a decision about this, sooner better than later.

At least outside of base, it is pretty well-tested (on Unix). Since there are still one or two more things left in Pkg2, I personally think this should be able to make it in. But I won't merge profile->master until someone tells me to.

@timholy
Copy link
Sponsor Member

timholy commented Jul 26, 2013

Merged. With docs, too!

@timholy timholy closed this as completed Jul 26, 2013
@quinnj
Copy link
Member

quinnj commented Jul 26, 2013

Awesome! Many thanks to @timholy and @vtjnash. I just gave it a test run on windows 64-bit and it's running great.

Quick question I didn't see mentioned in the docs:
If I get a result like the following, what does the "+1" and "+7" mean in the bottom two backtraces?

julia> Profile.print()
4    ...me.jl\src\Datetime.jl; setoffset; line: 120
2    profile.jl; anonymous; line: 9
3    profile.jl; anonymous; line: 14
 662  ...me.jl\src\Datetime.jl; datetime; line: 132
       541 ....jl\src\Datetime.jl; setoffset; line: 120
  10   ...me.jl\src\Datetime.jl; datetime; line: 132
        9 ....jl\src\Datetime.jl; setoffset; line: 120
   1    stream.jl; process_events; line: 481
                2526 none; test; line: 5
                       1   array.jl; splice!; line: 831
                       1   inference.jl; inlineable; line: 1642
                                     +1 3   ...ime.jl; datetime; line: 132
                                     +7 2 ...time.jl; setoffset; line: 120

@vtjnash
Copy link
Sponsor Member

vtjnash commented Jul 26, 2013

Cool. I like when stuff actually works on windows, without first needing testing and modification (I only wrote it against wine32).

@timholy
Copy link
Sponsor Member

timholy commented Jul 26, 2013

Great to hear! Nice work, Jameson!

If I get a result like the following, what does the "+1" and "+7" mean in the bottom two backtraces?

It's briefly described in the section on the cols option. Looks like I should expand that a bit.

@quinnj
Copy link
Member

quinnj commented Jul 26, 2013

Ah, got it. I think a quick code snippet showing what you were talking about is all that's needed. Reading through the first time, it didn't click and scanning the code examples I didn't see any examples of that. Great work though, I love having this functionality in Base.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants