Skip to content

Commit

Permalink
Fix performance problem in LLVM JumpThreading
Browse files Browse the repository at this point in the history
This patch is https://reviews.llvm.org/D42262 upstream. When we emit
unboxed unions, we tend to emit `switch` instructions on a small
integer which serves as a marker for which of types is currently active,
so we often emit things like:

```
block:
    ; Incoming from a union split branch
    %phi = i8 phi [1, %a], [2, %b]
    < Some other operation not on the union split>
    ; Union split again
    switch i8 %phi, label %boxed [
        i8 1, %abranch
        i8 2, %bbranch
    ]
```

In many situations the operation in the middle can get optimized away,
so we want to merge the two union split sections into one. LLVM's
jump threading pass can do this by keeping track of how control flow
behaves across a given basic block. Unfortunately, this optimization
wasn't taking effect. This is because InstCombine realized that the
range of possible values was rather small and turned the above into
something like:

```
   %trunc = truc i8 %phi to i2
   switch i2 %trunc, label %boxed [
       i2 1, %abranch
       i2 -2, %bbranch
   ]
```

which JumpThreading refused to look through (because of the i2 rather
than the i1). The included patch fixes this. On recent LLVM, we
additionally need https://reviews.llvm.org/D42260, for cases where
a union split branch happens to target a loop header. However,
LLVM 3.9.1 does not include the original commit that regressed that.

This fixes a number of the performance regressions seen in JuliaLang#25261.
  • Loading branch information
Keno committed Jan 18, 2018
1 parent 7deac81 commit e94a1f8
Show file tree
Hide file tree
Showing 2 changed files with 85 additions and 0 deletions.
3 changes: 3 additions & 0 deletions deps/llvm.mk
Original file line number Diff line number Diff line change
Expand Up @@ -462,6 +462,7 @@ $(eval $(call LLVM_PATCH,llvm-PR29010-i386-xmm)) # Remove for 4.0
$(eval $(call LLVM_PATCH,llvm-3.9.0-D37576-NVPTX-sm_70)) # NVPTX, Remove for 6.0
$(eval $(call LLVM_PATCH,llvm-D37939-Mem2Reg-Also-handle-memcpy))
$(eval $(call LLVM_PATCH,llvm-D31524-sovers_4.0)) # Remove for 4.0
$(eval $(call LLVM_PATCH,llvm-D42262-jumpthreading-not-i1))
ifeq ($(BUILD_LLVM_CLANG),1)
$(eval $(call LLVM_PATCH,compiler_rt-3.9-glibc_2.25.90)) # Remove for 5.0
endif
Expand All @@ -487,6 +488,7 @@ $(eval $(call LLVM_PATCH,llvm-D33129-scevexpander-non-integral)) # Remove for 5.
$(eval $(call LLVM_PATCH,llvm-Yet-another-fix))
$(eval $(call LLVM_PATCH,llvm-4.0.0-D37576-NVPTX-sm_70)) # NVPTX, Remove for 6.0
$(eval $(call LLVM_PATCH,llvm-loadcse-addrspace_4.0))
$(eval $(call LLVM_PATCH,llvm-D42262-jumpthreading-not-i1))
ifeq ($(BUILD_LLVM_CLANG),1)
$(eval $(call LLVM_PATCH,compiler_rt-3.9-glibc_2.25.90)) # Remove for 5.0
endif
Expand All @@ -499,6 +501,7 @@ $(eval $(call LLVM_PATCH,llvm-loadcse-addrspace_5.0))
$(eval $(call LLVM_PATCH,llvm-D34078-vectorize-fdiv))
$(eval $(call LLVM_PATCH,llvm-4.0.0-D37576-NVPTX-sm_70)) # NVPTX, Remove for 6.0
$(eval $(call LLVM_PATCH,llvm-D38765-gvn_5.0)) # Remove for 6.0
$(eval $(call LLVM_PATCH,llvm-D42262-jumpthreading-not-i1))
endif # LLVM_VER

# Remove hardcoded OS X requirements in compilter-rt cmake build
Expand Down
82 changes: 82 additions & 0 deletions deps/patches/llvm-D42262-jumpthreading-not-i1.patch
Original file line number Diff line number Diff line change
@@ -0,0 +1,82 @@
commit 6a311a7a804831fea43cfb2f61322adcb407a1af
Author: Keno Fischer <[email protected]>
Date: Thu Jan 18 15:57:05 2018 -0500

[JumpThreading] Don't restrict cast-traversal to i1

Summary:
In D17663, JumpThreading learned to look trough simple cast instructions,
but only if the source of those cast instructions was a phi/cmp i1
(in an effort to limit compile time effects). I think this condition
is too restrictive. For switches with limited value range, InstCombine
will readily introduce an extra `trunc` instruction to a smaller
integer type (e.g. from i8 to i2), leaving us in the somewhat perverse
situation that jump-threading would work before running instcombine,
but not after. Since instcombine produces this pattern, I think we
need to consider it canonical and support it in JumpThreading.
In general, for limiting recursion, I think the existing restriction
to phi and cmp nodes should be sufficient to avoid looking through
unprofitable chains of instructions.

Reviewers: haicheng, gberry, bmakam, mcrosier

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42262

diff --git a/lib/Transforms/Scalar/JumpThreading.cpp b/lib/Transforms/Scalar/JumpThreading.cpp
index 95c4650..1155e18 100644
--- a/lib/Transforms/Scalar/JumpThreading.cpp
+++ b/lib/Transforms/Scalar/JumpThreading.cpp
@@ -647,11 +647,9 @@ bool JumpThreadingPass::ComputeValueKnownInPredecessors(
}

// Handle Cast instructions. Only see through Cast when the source operand is
- // PHI or Cmp and the source type is i1 to save the compilation time.
+ // PHI or Cmp to save the compilation time.
if (CastInst *CI = dyn_cast<CastInst>(I)) {
Value *Source = CI->getOperand(0);
- if (!Source->getType()->isIntegerTy(1))
- return false;
if (!isa<PHINode>(Source) && !isa<CmpInst>(Source))
return false;
ComputeValueKnownInPredecessors(Source, BB, Result, Preference, CxtI);
diff --git a/test/Transforms/JumpThreading/basic.ll b/test/Transforms/JumpThreading/basic.ll
index ce86cba..16e7549 100644
--- a/test/Transforms/JumpThreading/basic.ll
+++ b/test/Transforms/JumpThreading/basic.ll
@@ -547,6 +547,34 @@ l5:
; CHECK: }
}

+define i1 @trunc_switch(i1 %arg) {
+; CHECK-LABEL: @trunc_switch
+top:
+; CHECK: br i1 %arg, label %exitA, label %exitB
+ br i1 %arg, label %common, label %B
+
+B:
+ br label %common
+
+common:
+ %phi = phi i8 [ 2, %B ], [ 1, %top ]
+ %trunc = trunc i8 %phi to i2
+; CHECK-NOT: switch
+ switch i2 %trunc, label %unreach [
+ i2 1, label %exitA
+ i2 -2, label %exitB
+ ]
+
+unreach:
+ unreachable
+
+exitA:
+ ret i1 true
+
+exitB:
+ ret i1 false
+}
+
; CHECK-LABEL: define void @h_con(i32 %p) {
define void @h_con(i32 %p) {
%x = icmp ult i32 %p, 5

0 comments on commit e94a1f8

Please sign in to comment.