new LLVM optimization pass list #34940

Merged: 1 commit merged into master from jb/optpasses on Mar 4, 2020

JeffBezanson
Sponsor Member

This is a rough cut of some possible changes to our pass list that should run a bit faster and possibly even produce better code. There is some inline commentary that will be removed; it just reflects notes from a discussion with an LLVM developer. Several variations are possible, but I'm curious to try some benchmarks first. This does make the build a bit faster, and reduces time-to-first-plot by about 4-5%.

@nanosoldier runbenchmarks(ALL, vs=":master")

@JeffBezanson added the compiler:codegen (Generation of LLVM IR and native code) and compiler:latency (Compiler latency) labels on Mar 1, 2020
@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@JeffBezanson
Sponsor Member Author

Some of the regressions were caused by moving IndVarSimplify and removing the second LoopDeletion, so I put those back. I'll also try replacing the internal InstCombines with InstSimplify, which is much faster. Passing false to InstCombine to disable "expensive" combines does not provide any real speedup.

@nanosoldier runbenchmarks(ALL, vs=":master")

@JeffBezanson
Sponsor Member Author

JeffBezanson commented Mar 1, 2020

With this pass list time-to-first-plot is ~8% faster instead of ~4% (due to using InstSimplify). Let's see how it does.

@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@JeffBezanson
Sponsor Member Author

Ok, one of those InstCombines was important for vectorizing loops over Union arrays. Let's put that back and try again.

@JeffBezanson
Sponsor Member Author

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@JeffBezanson
Sponsor Member Author

That's starting to look pretty good. We may have a solid candidate here.

- use InstSimplify instead of InstCombine in some cases to speed it up
- reorder some passes
- add LoopLoadElimination and DivRemPairs
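
For readers unfamiliar with DivRemPairs, here is a small sketch of the idea as I understand it, not the pass itself: when code computes both `x / y` and `x % y`, the remainder can be rewritten in terms of the already-computed quotient, so a target without a combined divide-and-remainder instruction performs only one division. The helper names (`sdiv`, `srem`, `divrem_separate`, `divrem_paired`) are hypothetical and model truncating signed division in plain Python; none of this is taken from the PR's diff.

```python
def sdiv(x, y):
    """Signed division that truncates toward zero (C-style), unlike Python's //."""
    q = abs(x) // abs(y)
    return q if (x < 0) == (y < 0) else -q


def srem(x, y):
    """Signed remainder whose sign follows the dividend (C-style)."""
    r = abs(x) % abs(y)
    return r if x >= 0 else -r


def divrem_separate(x, y):
    """Unoptimized form: quotient and remainder are computed independently."""
    return sdiv(x, y), srem(x, y)


def divrem_paired(x, y):
    """Rewritten form: the remainder reuses the quotient, so only one division runs."""
    q = sdiv(x, y)
    return q, x - q * y
```

Both forms agree for every nonzero divisor, which is what makes the rewrite safe; whether it pays off depends on the target's division cost.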
@JeffBezanson changed the title from "WIP: try some changes to our LLVM optimization pass list" to "new LLVM optimization pass list" on Mar 3, 2020
@JeffBezanson
Sponsor Member Author

Ok, @Keno @vtjnash and I went over this and we think it's ready to go.

@JeffBezanson merged commit 5e162d7 into master on Mar 4, 2020
@JeffBezanson deleted the jb/optpasses branch on March 4, 2020 19:56
ravibitsgoa pushed a commit to ravibitsgoa/julia that referenced this pull request Apr 9, 2020
KristofferC pushed a commit that referenced this pull request Apr 11, 2020