Skip to content

Commit

Permalink
cmd/compile: fix incorrect rewriting to if condition
Browse files Browse the repository at this point in the history
Some ARM64 rewriting rules convert 'comparing to zero' conditions of if
statements to a simplified version utilizing CMN and CMP instructions to
branch over condition flags, in order to save one Add or Sub caculation.

Such optimizations lead to wrong branching in case an overflow/underflow
occurs when executing CMN or CMP.

Fix the issue by introducing new block opcodes that don't honor the
overflow/underflow flag, in the following categories:

  Block-Op        Meaning                   ARM condition codes
  1. LTnoov        less than                 MI
  2. GEnoov        greater than or equal     PL
  3. LEnoov        less than or equal        MI || EQ
  4. GTnoov        greater than              NEQ & PL

The backend generates two consecutive branch instructions for 'LEnoov'
and 'GTnoov' to model their expected behavior. A slight change to 'gc'
and amd64/386 backends is made to unify the code generation.

Add a test 'TestCondRewrite' as justification, it covers 32 incorrect rules
identified on arm64, more might be needed on other arches, like 32-bit arm.

Add two benchmarks profiling the aforementioned category 1&2 and category
3&4 separetely, we expect the first two categories will show performance
improvement and the second will not result in visible regression compared with
the non-optimized version.

This change also updates TestFormats to support using %#x.

Examples exhibiting where does the issue come from:
  1: 'if x + 3 < 0' might be converted to:
  before:
    CMN $3, R0
    BGE <else branch> // wrong branch is taken if 'x+3' overflows
  after:
    CMN $3, R0
    BPL <else branch>

  2: 'if y - 3 > 0' might be converted to:
  before:
    CMP $3, R0
    BLE <else branch> // wrong branch is taken if 'y-3' underflows
  after:
    CMP $3, R0
    BMI <else branch>
    BEQ <else branch>

Benchmark data from different kinds of arm64 servers, 'old' is the non-optimized
version (not the parent commit), generally the optimization version outperforms.

S1:
name                    old time/op  new time/op  delta
CondRewrite/SoloJump  13.6ns ± 0%  12.9ns ± 0%  -5.15%  (p=0.000 n=10+10)
CondRewrite/CombJump  13.8ns ± 1%  12.9ns ± 0%  -6.32%  (p=0.000 n=10+10)

S2:
name                     old time/op  new time/op  delta
CondRewrite/SoloJump  11.6ns ± 0%  10.9ns ± 0%  -6.03%  (p=0.000 n=10+10)
CondRewrite/CombJump  11.4ns ± 0%  10.8ns ± 1%  -5.53%  (p=0.000 n=10+10)

S3:
name                     old time/op  new time/op  delta
CondRewrite/SoloJump  7.36ns ± 0%  7.50ns ± 0%  +1.79%  (p=0.000 n=9+10)
CondRewrite/CombJump  7.35ns ± 0%  7.75ns ± 0%  +5.51%  (p=0.000 n=8+9)

S4:
name                      old time/op  new time/op  delta
CondRewrite/SoloJump-224  11.5ns ± 1%  10.9ns ± 0%  -4.97%  (p=0.000 n=10+10)
CondRewrite/CombJump-224  11.9ns ± 0%  11.5ns ± 0%  -2.95%  (p=0.000 n=10+10)

S5:
name                     old time/op  new time/op  delta
CondRewrite/SoloJump  10.0ns ± 0%  10.0ns ± 0%  -0.45%  (p=0.000 n=9+10)
CondRewrite/CombJump  9.93ns ± 0%  9.77ns ± 0%  -1.53%  (p=0.000 n=10+9)

Go1 perf. data:

name                     old time/op    new time/op    delta
BinaryTree17              6.29s ± 1%     6.30s ± 1%    ~     (p=1.000 n=5+5)
Fannkuch11                5.40s ± 0%     5.40s ± 0%    ~     (p=0.841 n=5+5)
FmtFprintfEmpty          97.9ns ± 0%    98.9ns ± 3%    ~     (p=0.937 n=4+5)
FmtFprintfString          171ns ± 3%     171ns ± 2%    ~     (p=0.754 n=5+5)
FmtFprintfInt             212ns ± 0%     217ns ± 6%  +2.55%  (p=0.008 n=5+5)
FmtFprintfIntInt          296ns ± 1%     297ns ± 2%    ~     (p=0.516 n=5+5)
FmtFprintfPrefixedInt     371ns ± 2%     374ns ± 7%    ~     (p=1.000 n=5+5)
FmtFprintfFloat           435ns ± 1%     439ns ± 2%    ~     (p=0.056 n=5+5)
FmtManyArgs              1.37µs ± 1%    1.36µs ± 1%    ~     (p=0.730 n=5+5)
GobDecode                14.6ms ± 4%    14.4ms ± 4%    ~     (p=0.690 n=5+5)
GobEncode                11.8ms ±20%    11.6ms ±15%    ~     (p=1.000 n=5+5)
Gzip                      507ms ± 0%     491ms ± 0%  -3.22%  (p=0.008 n=5+5)
Gunzip                   73.8ms ± 0%    73.9ms ± 0%    ~     (p=0.690 n=5+5)
HTTPClientServer          116µs ± 0%     116µs ± 0%    ~     (p=0.686 n=4+4)
JSONEncode               21.8ms ± 1%    21.6ms ± 2%    ~     (p=0.151 n=5+5)
JSONDecode                104ms ± 1%     103ms ± 1%  -1.08%  (p=0.016 n=5+5)
Mandelbrot200            9.53ms ± 0%    9.53ms ± 0%    ~     (p=0.421 n=5+5)
GoParse                  7.55ms ± 1%    7.51ms ± 1%    ~     (p=0.151 n=5+5)
RegexpMatchEasy0_32       158ns ± 0%     158ns ± 0%    ~     (all equal)
RegexpMatchEasy0_1K       606ns ± 1%     608ns ± 3%    ~     (p=0.937 n=5+5)
RegexpMatchEasy1_32       143ns ± 0%     144ns ± 1%    ~     (p=0.095 n=5+4)
RegexpMatchEasy1_1K       927ns ± 2%     944ns ± 2%    ~     (p=0.056 n=5+5)
RegexpMatchMedium_32     16.0ns ± 0%    16.0ns ± 0%    ~     (all equal)
RegexpMatchMedium_1K     69.3µs ± 2%    69.7µs ± 0%    ~     (p=0.690 n=5+5)
RegexpMatchHard_32       3.73µs ± 0%    3.73µs ± 1%    ~     (p=0.984 n=5+5)
RegexpMatchHard_1K        111µs ± 1%     110µs ± 0%    ~     (p=0.151 n=5+5)
Revcomp                   1.91s ±47%     1.77s ±68%    ~     (p=1.000 n=5+5)
Template                  138ms ± 1%     138ms ± 1%    ~     (p=1.000 n=5+5)
TimeParse                 787ns ± 2%     785ns ± 1%    ~     (p=0.540 n=5+5)
TimeFormat                729ns ± 1%     726ns ± 1%    ~     (p=0.151 n=5+5)

Updates #38740
Change-Id: I06c604874acdc1e63e66452dadee5df053045222
Reviewed-on: https://go-review.googlesource.com/c/go/+/233097
Reviewed-by: Keith Randall <[email protected]>
Run-TryBot: Keith Randall <[email protected]>
  • Loading branch information
Xiangdong Ji authored and randall77 committed May 29, 2020
1 parent 65f514e commit e8f5a33
Show file tree
Hide file tree
Showing 11 changed files with 891 additions and 141 deletions.
2 changes: 2 additions & 0 deletions src/cmd/compile/fmtmap_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -151,9 +151,11 @@ var knownFormats = map[string]string{
"int %x": "",
"int16 %d": "",
"int16 %x": "",
"int32 %#x": "",
"int32 %d": "",
"int32 %v": "",
"int32 %x": "",
"int64 %#x": "",
"int64 %+d": "",
"int64 %-10d": "",
"int64 %.5d": "",
Expand Down
8 changes: 4 additions & 4 deletions src/cmd/compile/internal/amd64/ssa.go
Original file line number Diff line number Diff line change
Expand Up @@ -1252,11 +1252,11 @@ var blockJump = [...]struct {
ssa.BlockAMD64NAN: {x86.AJPS, x86.AJPC},
}

var eqfJumps = [2][2]gc.FloatingEQNEJump{
var eqfJumps = [2][2]gc.IndexJump{
{{Jump: x86.AJNE, Index: 1}, {Jump: x86.AJPS, Index: 1}}, // next == b.Succs[0]
{{Jump: x86.AJNE, Index: 1}, {Jump: x86.AJPC, Index: 0}}, // next == b.Succs[1]
}
var nefJumps = [2][2]gc.FloatingEQNEJump{
var nefJumps = [2][2]gc.IndexJump{
{{Jump: x86.AJNE, Index: 0}, {Jump: x86.AJPC, Index: 1}}, // next == b.Succs[0]
{{Jump: x86.AJNE, Index: 0}, {Jump: x86.AJPS, Index: 0}}, // next == b.Succs[1]
}
Expand Down Expand Up @@ -1296,10 +1296,10 @@ func ssaGenBlock(s *gc.SSAGenState, b, next *ssa.Block) {
p.To.Sym = b.Aux.(*obj.LSym)

case ssa.BlockAMD64EQF:
s.FPJump(b, next, &eqfJumps)
s.CombJump(b, next, &eqfJumps)

case ssa.BlockAMD64NEF:
s.FPJump(b, next, &nefJumps)
s.CombJump(b, next, &nefJumps)

case ssa.BlockAMD64EQ, ssa.BlockAMD64NE,
ssa.BlockAMD64LT, ssa.BlockAMD64GE,
Expand Down
21 changes: 20 additions & 1 deletion src/cmd/compile/internal/arm64/ssa.go
Original file line number Diff line number Diff line change
Expand Up @@ -998,6 +998,20 @@ var blockJump = map[ssa.BlockKind]struct {
ssa.BlockARM64FGE: {arm64.ABGE, arm64.ABLT},
ssa.BlockARM64FLE: {arm64.ABLS, arm64.ABHI},
ssa.BlockARM64FGT: {arm64.ABGT, arm64.ABLE},
ssa.BlockARM64LTnoov:{arm64.ABMI, arm64.ABPL},
ssa.BlockARM64GEnoov:{arm64.ABPL, arm64.ABMI},
}

// To model a 'LEnoov' ('<=' without overflow checking) branching
var leJumps = [2][2]gc.IndexJump{
{{Jump: arm64.ABEQ, Index: 0}, {Jump: arm64.ABPL, Index: 1}}, // next == b.Succs[0]
{{Jump: arm64.ABMI, Index: 0}, {Jump: arm64.ABEQ, Index: 0}}, // next == b.Succs[1]
}

// To model a 'GTnoov' ('>' without overflow checking) branching
var gtJumps = [2][2]gc.IndexJump{
{{Jump: arm64.ABMI, Index: 1}, {Jump: arm64.ABEQ, Index: 1}}, // next == b.Succs[0]
{{Jump: arm64.ABEQ, Index: 1}, {Jump: arm64.ABPL, Index: 0}}, // next == b.Succs[1]
}

func ssaGenBlock(s *gc.SSAGenState, b, next *ssa.Block) {
Expand Down Expand Up @@ -1045,7 +1059,8 @@ func ssaGenBlock(s *gc.SSAGenState, b, next *ssa.Block) {
ssa.BlockARM64Z, ssa.BlockARM64NZ,
ssa.BlockARM64ZW, ssa.BlockARM64NZW,
ssa.BlockARM64FLT, ssa.BlockARM64FGE,
ssa.BlockARM64FLE, ssa.BlockARM64FGT:
ssa.BlockARM64FLE, ssa.BlockARM64FGT,
ssa.BlockARM64LTnoov, ssa.BlockARM64GEnoov:
jmp := blockJump[b.Kind]
var p *obj.Prog
switch next {
Expand Down Expand Up @@ -1087,6 +1102,10 @@ func ssaGenBlock(s *gc.SSAGenState, b, next *ssa.Block) {
p.From.Type = obj.TYPE_CONST
p.Reg = b.Controls[0].Reg()

case ssa.BlockARM64LEnoov:
s.CombJump(b, next, &leJumps)
case ssa.BlockARM64GTnoov:
s.CombJump(b, next, &gtJumps)
default:
b.Fatalf("branch not implemented: %s", b.LongString())
}
Expand Down
37 changes: 21 additions & 16 deletions src/cmd/compile/internal/gc/ssa.go
Original file line number Diff line number Diff line change
Expand Up @@ -6313,34 +6313,39 @@ func defframe(s *SSAGenState, e *ssafn) {
thearch.ZeroRange(pp, p, frame+lo, hi-lo, &state)
}

type FloatingEQNEJump struct {
// For generating consecutive jump instructions to model a specific branching
type IndexJump struct {
Jump obj.As
Index int
}

func (s *SSAGenState) oneFPJump(b *ssa.Block, jumps *FloatingEQNEJump) {
p := s.Prog(jumps.Jump)
p.To.Type = obj.TYPE_BRANCH
func (s *SSAGenState) oneJump(b *ssa.Block, jump *IndexJump) {
p := s.Br(jump.Jump, b.Succs[jump.Index].Block())
p.Pos = b.Pos
to := jumps.Index
s.Branches = append(s.Branches, Branch{p, b.Succs[to].Block()})
}

func (s *SSAGenState) FPJump(b, next *ssa.Block, jumps *[2][2]FloatingEQNEJump) {
// CombJump generates combinational instructions (2 at present) for a block jump,
// thereby the behaviour of non-standard condition codes could be simulated
func (s *SSAGenState) CombJump(b, next *ssa.Block, jumps *[2][2]IndexJump) {
switch next {
case b.Succs[0].Block():
s.oneFPJump(b, &jumps[0][0])
s.oneFPJump(b, &jumps[0][1])
s.oneJump(b, &jumps[0][0])
s.oneJump(b, &jumps[0][1])
case b.Succs[1].Block():
s.oneFPJump(b, &jumps[1][0])
s.oneFPJump(b, &jumps[1][1])
s.oneJump(b, &jumps[1][0])
s.oneJump(b, &jumps[1][1])
default:
s.oneFPJump(b, &jumps[1][0])
s.oneFPJump(b, &jumps[1][1])
q := s.Prog(obj.AJMP)
var q *obj.Prog
if b.Likely != ssa.BranchUnlikely {
s.oneJump(b, &jumps[1][0])
s.oneJump(b, &jumps[1][1])
q = s.Br(obj.AJMP, b.Succs[1].Block())
} else {
s.oneJump(b, &jumps[0][0])
s.oneJump(b, &jumps[0][1])
q = s.Br(obj.AJMP, b.Succs[0].Block())
}
q.Pos = b.Pos
q.To.Type = obj.TYPE_BRANCH
s.Branches = append(s.Branches, Branch{q, b.Succs[1].Block()})
}
}

Expand Down
68 changes: 36 additions & 32 deletions src/cmd/compile/internal/ssa/gen/ARM64.rules
Original file line number Diff line number Diff line change
Expand Up @@ -598,31 +598,31 @@

(EQ (CMPconst [0] x:(ADDconst [c] y)) yes no) && x.Uses == 1 => (EQ (CMNconst [c] y) yes no)
(NE (CMPconst [0] x:(ADDconst [c] y)) yes no) && x.Uses == 1 => (NE (CMNconst [c] y) yes no)
(LT (CMPconst [0] x:(ADDconst [c] y)) yes no) && x.Uses == 1 => (LT (CMNconst [c] y) yes no)
(LE (CMPconst [0] x:(ADDconst [c] y)) yes no) && x.Uses == 1 => (LE (CMNconst [c] y) yes no)
(GT (CMPconst [0] x:(ADDconst [c] y)) yes no) && x.Uses == 1 => (GT (CMNconst [c] y) yes no)
(GE (CMPconst [0] x:(ADDconst [c] y)) yes no) && x.Uses == 1 => (GE (CMNconst [c] y) yes no)
(LT (CMPconst [0] x:(ADDconst [c] y)) yes no) && x.Uses == 1 => (LTnoov (CMNconst [c] y) yes no)
(LE (CMPconst [0] x:(ADDconst [c] y)) yes no) && x.Uses == 1 => (LEnoov (CMNconst [c] y) yes no)
(GT (CMPconst [0] x:(ADDconst [c] y)) yes no) && x.Uses == 1 => (GTnoov (CMNconst [c] y) yes no)
(GE (CMPconst [0] x:(ADDconst [c] y)) yes no) && x.Uses == 1 => (GEnoov (CMNconst [c] y) yes no)

(EQ (CMPWconst [0] x:(ADDconst [c] y)) yes no) && x.Uses == 1 => (EQ (CMNWconst [int32(c)] y) yes no)
(NE (CMPWconst [0] x:(ADDconst [c] y)) yes no) && x.Uses == 1 => (NE (CMNWconst [int32(c)] y) yes no)
(LT (CMPWconst [0] x:(ADDconst [c] y)) yes no) && x.Uses == 1 => (LT (CMNWconst [int32(c)] y) yes no)
(LE (CMPWconst [0] x:(ADDconst [c] y)) yes no) && x.Uses == 1 => (LE (CMNWconst [int32(c)] y) yes no)
(GT (CMPWconst [0] x:(ADDconst [c] y)) yes no) && x.Uses == 1 => (GT (CMNWconst [int32(c)] y) yes no)
(GE (CMPWconst [0] x:(ADDconst [c] y)) yes no) && x.Uses == 1 => (GE (CMNWconst [int32(c)] y) yes no)
(LT (CMPWconst [0] x:(ADDconst [c] y)) yes no) && x.Uses == 1 => (LTnoov (CMNWconst [int32(c)] y) yes no)
(LE (CMPWconst [0] x:(ADDconst [c] y)) yes no) && x.Uses == 1 => (LEnoov (CMNWconst [int32(c)] y) yes no)
(GT (CMPWconst [0] x:(ADDconst [c] y)) yes no) && x.Uses == 1 => (GTnoov (CMNWconst [int32(c)] y) yes no)
(GE (CMPWconst [0] x:(ADDconst [c] y)) yes no) && x.Uses == 1 => (GEnoov (CMNWconst [int32(c)] y) yes no)

(EQ (CMPconst [0] z:(ADD x y)) yes no) && z.Uses == 1 => (EQ (CMN x y) yes no)
(NE (CMPconst [0] z:(ADD x y)) yes no) && z.Uses == 1 => (NE (CMN x y) yes no)
(LT (CMPconst [0] z:(ADD x y)) yes no) && z.Uses == 1 => (LT (CMN x y) yes no)
(LE (CMPconst [0] z:(ADD x y)) yes no) && z.Uses == 1 => (LE (CMN x y) yes no)
(GT (CMPconst [0] z:(ADD x y)) yes no) && z.Uses == 1 => (GT (CMN x y) yes no)
(GE (CMPconst [0] z:(ADD x y)) yes no) && z.Uses == 1 => (GE (CMN x y) yes no)
(LT (CMPconst [0] z:(ADD x y)) yes no) && z.Uses == 1 => (LTnoov (CMN x y) yes no)
(LE (CMPconst [0] z:(ADD x y)) yes no) && z.Uses == 1 => (LEnoov (CMN x y) yes no)
(GT (CMPconst [0] z:(ADD x y)) yes no) && z.Uses == 1 => (GTnoov (CMN x y) yes no)
(GE (CMPconst [0] z:(ADD x y)) yes no) && z.Uses == 1 => (GEnoov (CMN x y) yes no)

(EQ (CMPWconst [0] z:(ADD x y)) yes no) && z.Uses == 1 => (EQ (CMNW x y) yes no)
(NE (CMPWconst [0] z:(ADD x y)) yes no) && z.Uses == 1 => (NE (CMNW x y) yes no)
(LT (CMPWconst [0] z:(ADD x y)) yes no) && z.Uses == 1 => (LT (CMNW x y) yes no)
(LE (CMPWconst [0] z:(ADD x y)) yes no) && z.Uses == 1 => (LE (CMNW x y) yes no)
(GT (CMPWconst [0] z:(ADD x y)) yes no) && z.Uses == 1 => (GT (CMNW x y) yes no)
(GE (CMPWconst [0] z:(ADD x y)) yes no) && z.Uses == 1 => (GE (CMNW x y) yes no)
(LT (CMPWconst [0] z:(ADD x y)) yes no) && z.Uses == 1 => (LTnoov (CMNW x y) yes no)
(LE (CMPWconst [0] z:(ADD x y)) yes no) && z.Uses == 1 => (LEnoov (CMNW x y) yes no)
(GT (CMPWconst [0] z:(ADD x y)) yes no) && z.Uses == 1 => (GTnoov (CMNW x y) yes no)
(GE (CMPWconst [0] z:(ADD x y)) yes no) && z.Uses == 1 => (GEnoov (CMNW x y) yes no)

(EQ (CMP x z:(NEG y)) yes no) && z.Uses == 1 => (EQ (CMN x y) yes no)
(NE (CMP x z:(NEG y)) yes no) && z.Uses == 1 => (NE (CMN x y) yes no)
Expand All @@ -645,31 +645,31 @@

(EQ (CMPconst [0] z:(MADD a x y)) yes no) && z.Uses==1 => (EQ (CMN a (MUL <x.Type> x y)) yes no)
(NE (CMPconst [0] z:(MADD a x y)) yes no) && z.Uses==1 => (NE (CMN a (MUL <x.Type> x y)) yes no)
(LT (CMPconst [0] z:(MADD a x y)) yes no) && z.Uses==1 => (LT (CMN a (MUL <x.Type> x y)) yes no)
(LE (CMPconst [0] z:(MADD a x y)) yes no) && z.Uses==1 => (LE (CMN a (MUL <x.Type> x y)) yes no)
(GT (CMPconst [0] z:(MADD a x y)) yes no) && z.Uses==1 => (GT (CMN a (MUL <x.Type> x y)) yes no)
(GE (CMPconst [0] z:(MADD a x y)) yes no) && z.Uses==1 => (GE (CMN a (MUL <x.Type> x y)) yes no)
(LT (CMPconst [0] z:(MADD a x y)) yes no) && z.Uses==1 => (LTnoov (CMN a (MUL <x.Type> x y)) yes no)
(LE (CMPconst [0] z:(MADD a x y)) yes no) && z.Uses==1 => (LEnoov (CMN a (MUL <x.Type> x y)) yes no)
(GT (CMPconst [0] z:(MADD a x y)) yes no) && z.Uses==1 => (GTnoov (CMN a (MUL <x.Type> x y)) yes no)
(GE (CMPconst [0] z:(MADD a x y)) yes no) && z.Uses==1 => (GEnoov (CMN a (MUL <x.Type> x y)) yes no)

(EQ (CMPconst [0] z:(MSUB a x y)) yes no) && z.Uses==1 => (EQ (CMP a (MUL <x.Type> x y)) yes no)
(NE (CMPconst [0] z:(MSUB a x y)) yes no) && z.Uses==1 => (NE (CMP a (MUL <x.Type> x y)) yes no)
(LE (CMPconst [0] z:(MSUB a x y)) yes no) && z.Uses==1 => (LE (CMP a (MUL <x.Type> x y)) yes no)
(LT (CMPconst [0] z:(MSUB a x y)) yes no) && z.Uses==1 => (LT (CMP a (MUL <x.Type> x y)) yes no)
(GE (CMPconst [0] z:(MSUB a x y)) yes no) && z.Uses==1 => (GE (CMP a (MUL <x.Type> x y)) yes no)
(GT (CMPconst [0] z:(MSUB a x y)) yes no) && z.Uses==1 => (GT (CMP a (MUL <x.Type> x y)) yes no)
(LE (CMPconst [0] z:(MSUB a x y)) yes no) && z.Uses==1 => (LEnoov (CMP a (MUL <x.Type> x y)) yes no)
(LT (CMPconst [0] z:(MSUB a x y)) yes no) && z.Uses==1 => (LTnoov (CMP a (MUL <x.Type> x y)) yes no)
(GE (CMPconst [0] z:(MSUB a x y)) yes no) && z.Uses==1 => (GEnoov (CMP a (MUL <x.Type> x y)) yes no)
(GT (CMPconst [0] z:(MSUB a x y)) yes no) && z.Uses==1 => (GTnoov (CMP a (MUL <x.Type> x y)) yes no)

(EQ (CMPWconst [0] z:(MADDW a x y)) yes no) && z.Uses==1 => (EQ (CMNW a (MULW <x.Type> x y)) yes no)
(NE (CMPWconst [0] z:(MADDW a x y)) yes no) && z.Uses==1 => (NE (CMNW a (MULW <x.Type> x y)) yes no)
(LE (CMPWconst [0] z:(MADDW a x y)) yes no) && z.Uses==1 => (LE (CMNW a (MULW <x.Type> x y)) yes no)
(LT (CMPWconst [0] z:(MADDW a x y)) yes no) && z.Uses==1 => (LT (CMNW a (MULW <x.Type> x y)) yes no)
(GE (CMPWconst [0] z:(MADDW a x y)) yes no) && z.Uses==1 => (GE (CMNW a (MULW <x.Type> x y)) yes no)
(GT (CMPWconst [0] z:(MADDW a x y)) yes no) && z.Uses==1 => (GT (CMNW a (MULW <x.Type> x y)) yes no)
(LE (CMPWconst [0] z:(MADDW a x y)) yes no) && z.Uses==1 => (LEnoov (CMNW a (MULW <x.Type> x y)) yes no)
(LT (CMPWconst [0] z:(MADDW a x y)) yes no) && z.Uses==1 => (LTnoov (CMNW a (MULW <x.Type> x y)) yes no)
(GE (CMPWconst [0] z:(MADDW a x y)) yes no) && z.Uses==1 => (GEnoov (CMNW a (MULW <x.Type> x y)) yes no)
(GT (CMPWconst [0] z:(MADDW a x y)) yes no) && z.Uses==1 => (GTnoov (CMNW a (MULW <x.Type> x y)) yes no)

(EQ (CMPWconst [0] z:(MSUBW a x y)) yes no) && z.Uses==1 => (EQ (CMPW a (MULW <x.Type> x y)) yes no)
(NE (CMPWconst [0] z:(MSUBW a x y)) yes no) && z.Uses==1 => (NE (CMPW a (MULW <x.Type> x y)) yes no)
(LE (CMPWconst [0] z:(MSUBW a x y)) yes no) && z.Uses==1 => (LE (CMPW a (MULW <x.Type> x y)) yes no)
(LT (CMPWconst [0] z:(MSUBW a x y)) yes no) && z.Uses==1 => (LT (CMPW a (MULW <x.Type> x y)) yes no)
(GE (CMPWconst [0] z:(MSUBW a x y)) yes no) && z.Uses==1 => (GE (CMPW a (MULW <x.Type> x y)) yes no)
(GT (CMPWconst [0] z:(MSUBW a x y)) yes no) && z.Uses==1 => (GT (CMPW a (MULW <x.Type> x y)) yes no)
(LE (CMPWconst [0] z:(MSUBW a x y)) yes no) && z.Uses==1 => (LEnoov (CMPW a (MULW <x.Type> x y)) yes no)
(LT (CMPWconst [0] z:(MSUBW a x y)) yes no) && z.Uses==1 => (LTnoov (CMPW a (MULW <x.Type> x y)) yes no)
(GE (CMPWconst [0] z:(MSUBW a x y)) yes no) && z.Uses==1 => (GEnoov (CMPW a (MULW <x.Type> x y)) yes no)
(GT (CMPWconst [0] z:(MSUBW a x y)) yes no) && z.Uses==1 => (GTnoov (CMPW a (MULW <x.Type> x y)) yes no)

// Absorb bit-tests into block
(Z (ANDconst [c] x) yes no) && oneBit(c) => (TBZ [int64(ntz64(c))] x yes no)
Expand Down Expand Up @@ -1503,6 +1503,10 @@
(FGT (InvertFlags cmp) yes no) -> (FLT cmp yes no)
(FLE (InvertFlags cmp) yes no) -> (FGE cmp yes no)
(FGE (InvertFlags cmp) yes no) -> (FLE cmp yes no)
(LTnoov (InvertFlags cmp) yes no) => (GTnoov cmp yes no)
(GEnoov (InvertFlags cmp) yes no) => (LEnoov cmp yes no)
(LEnoov (InvertFlags cmp) yes no) => (GEnoov cmp yes no)
(GTnoov (InvertFlags cmp) yes no) => (LTnoov cmp yes no)

// absorb InvertFlags into CSEL(0)
(CSEL {cc} x y (InvertFlags cmp)) -> (CSEL {arm64Invert(cc.(Op))} x y cmp)
Expand Down
4 changes: 4 additions & 0 deletions src/cmd/compile/internal/ssa/gen/ARM64Ops.go
Original file line number Diff line number Diff line change
Expand Up @@ -701,6 +701,10 @@ func init() {
{name: "FLE", controls: 1},
{name: "FGT", controls: 1},
{name: "FGE", controls: 1},
{name: "LTnoov", controls: 1}, // 'LT' but without honoring overflow
{name: "LEnoov", controls: 1}, // 'LE' but without honoring overflow
{name: "GTnoov", controls: 1}, // 'GT' but without honoring overflow
{name: "GEnoov", controls: 1}, // 'GE' but without honoring overflow
}

archs = append(archs, arch{
Expand Down
48 changes: 28 additions & 20 deletions src/cmd/compile/internal/ssa/opGen.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading

0 comments on commit e8f5a33

Please sign in to comment.