Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix performance problem in LLVM JumpThreading #25630

Merged
merged 1 commit into from
Jan 20, 2018
Merged

Conversation

Keno
Copy link
Member

@Keno Keno commented Jan 18, 2018

This patch is https://reviews.llvm.org/D42262 upstream. When we emit
unboxed unions, we tend to emit switch instructions on a small
integer which serves as a marker for which of types is currently active,
so we often emit things like:

block:
    ; Incoming from a union split branch
    %phi = i8 phi [1, %a], [2, %b]
    < Some other operation not on the union split>
    ; Union split again
    switch i8 %phi, label %boxed [
        i8 1, %abranch
        i8 2, %bbranch
    ]

In many situations the operation in the middle can get optimized away,
so we want to merge the two union split sections into one. LLVM's
jump threading pass can do this by keeping track of how control flow
behaves across a given basic block. Unfortunately, this optimization
wasn't taking effect. This is because InstCombine realized that the
range of possible values was rather small and turned the above into
something like:

   %trunc = truc i8 %phi to i2
   switch i2 %trunc, label %boxed [
       i2 1, %abranch
       i2 -2, %bbranch
   ]

which JumpThreading refused to look through (because of the i2 rather
than the i1). The included patch fixes this. On recent LLVM, we
additionally need https://reviews.llvm.org/D42260, for cases where
a union split branch happens to target a loop header. However,
LLVM 3.9.1 does not include the original commit that regressed that.

This fixes a number of the performance regressions seen in #25261.

This patch is https://reviews.llvm.org/D42262 upstream. When we emit
unboxed unions, we tend to emit `switch` instructions on a small
integer which serves as a marker for which of types is currently active,
so we often emit things like:

```
block:
    ; Incoming from a union split branch
    %phi = i8 phi [1, %a], [2, %b]
    < Some other operation not on the union split>
    ; Union split again
    switch i8 %phi, label %boxed [
        i8 1, %abranch
        i8 2, %bbranch
    ]
```

In many situations the operation in the middle can get optimized away,
so we want to merge the two union split sections into one. LLVM's
jump threading pass can do this by keeping track of how control flow
behaves across a given basic block. Unfortunately, this optimization
wasn't taking effect. This is because InstCombine realized that the
range of possible values was rather small and turned the above into
something like:

```
   %trunc = truc i8 %phi to i2
   switch i2 %trunc, label %boxed [
       i2 1, %abranch
       i2 -2, %bbranch
   ]
```

which JumpThreading refused to look through (because of the i2 rather
than the i1). The included patch fixes this. On recent LLVM, we
additionally need https://reviews.llvm.org/D42260, for cases where
a union split branch happens to target a loop header. However,
LLVM 3.9.1 does not include the original commit that regressed that.

This fixes a number of the performance regressions seen in #25261.
@Keno
Copy link
Member Author

Keno commented Jan 18, 2018

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Copy link
Collaborator

Something went wrong when running your job:

NanosoldierError: failed to run benchmarks against primary commit: failed process: Process(`sudo cset shield -e su nanosoldier -- -c ./benchscript.sh`, ProcessExited(1)) [1]

Logs and partial data can be found here
cc @ararslan

@Keno
Copy link
Member Author

Keno commented Jan 18, 2018

cc @ararslan LinAlg move broke nanosoldier

@ararslan
Copy link
Member

Ah, so it did. Thanks for the heads up, I'll get on that right away.

@ararslan
Copy link
Member

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Copy link
Collaborator

Something went wrong when running your job:

NanosoldierError: failed to run benchmarks against primary commit: failed process: Process(`sudo cset shield -e su nanosoldier -- -c ./benchscript.sh`, ProcessExited(1)) [1]

Logs and partial data can be found here
cc @ararslan

@ararslan
Copy link
Member

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Copy link
Collaborator

Something went wrong when running your job:

NanosoldierError: failed to run benchmarks against primary commit: failed process: Process(`sudo cset shield -e su nanosoldier -- -c ./benchscript.sh`, ProcessExited(1)) [1]

Logs and partial data can be found here
cc @ararslan

@Keno
Copy link
Member Author

Keno commented Jan 19, 2018

Looks like the Serializer move also broke it.

@ararslan
Copy link
Member

Thanks, working on that now.

@ararslan
Copy link
Member

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@Keno Keno merged commit 808e828 into master Jan 20, 2018
@martinholters martinholters deleted the kf/llvmunionsplit branch January 20, 2018 20:34
@maleadt maleadt mentioned this pull request Jul 4, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants