cmd/compile, runtime: mark R12 clobbered for write barrier call on PPC64
When external linking, for large binaries, the external linker
may insert a trampoline for the write barrier call, which looks like this:

0000000005a98cc8 <__long_branch_runtime.gcWriteBarrier>:
 5a98cc8:       86 01 82 3d     addis   r12,r2,390
 5a98ccc:       d8 bd 8c e9     ld      r12,-16936(r12)
 5a98cd0:       a6 03 89 7d     mtctr   r12
 5a98cd4:       20 04 80 4e     bctr

It clobbers R12 (and CTR, which is never live across a call).

Because we don't know at compile time whether the binary will be
large or which link mode will be used, I think we need to mark R12
as clobbered for the write barrier call. For extra safety
(future-proofing), we also mark the caller-saved registers that
cannot be used for function arguments, which include R11, as
potentially clobbered.

Fixes #40851.

Change-Id: Iedd901c5072f1127cc59b0a48cfeb4aaec81b519
Reviewed-on: https://go-review.googlesource.com/c/go/+/248917
Run-TryBot: Cherry Zhang <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Reviewed-by: Austin Clements <[email protected]>
cherrymui committed Aug 18, 2020
1 parent c12d9ed commit b58d297
Showing 3 changed files with 24 additions and 23 deletions.
4 changes: 2 additions & 2 deletions src/cmd/compile/internal/ssa/gen/PPC64Ops.go
@@ -645,9 +645,9 @@ func init() {
 {name: "LoweredAtomicOr8", argLength: 3, reg: gpstore, asm: "OR", faultOnNilArg0: true, hasSideEffects: true},

 // LoweredWB invokes runtime.gcWriteBarrier. arg0=destptr, arg1=srcptr, arg2=mem, aux=runtime.gcWriteBarrier
-// It preserves R0 through R15, g, and its arguments R20 and R21,
+// It preserves R0 through R17 (except special registers R1, R2, R11, R12, R13), g, and its arguments R20 and R21,
 // but may clobber anything else, including R31 (REGTMP).
-{name: "LoweredWB", argLength: 3, reg: regInfo{inputs: []regMask{buildReg("R20"), buildReg("R21")}, clobbers: (callerSave &^ buildReg("R0 R3 R4 R5 R6 R7 R8 R9 R10 R11 R12 R14 R15 R20 R21 g")) | buildReg("R31")}, clobberFlags: true, aux: "Sym", symEffect: "None"},
+{name: "LoweredWB", argLength: 3, reg: regInfo{inputs: []regMask{buildReg("R20"), buildReg("R21")}, clobbers: (callerSave &^ buildReg("R0 R3 R4 R5 R6 R7 R8 R9 R10 R14 R15 R16 R17 R20 R21 g")) | buildReg("R31")}, clobberFlags: true, aux: "Sym", symEffect: "None"},

 // There are three of these functions so that they can have three different register inputs.
 // When we check 0 <= c <= cap (A), then 0 <= b <= c (B), then 0 <= a <= b (C), we want the
2 changes: 1 addition & 1 deletion src/cmd/compile/internal/ssa/opGen.go

(Generated file; diff not shown.)

41 changes: 21 additions & 20 deletions src/runtime/asm_ppc64x.s
@@ -916,23 +916,23 @@ TEXT ·checkASM(SB),NOSPLIT,$0-1
 // - R20 is the destination of the write
 // - R21 is the value being written at R20.
 // It clobbers condition codes.
-// It does not clobber R0 through R15,
+// It does not clobber R0 through R17 (except special registers),
 // but may clobber any other register, *including* R31.
 TEXT runtime·gcWriteBarrier(SB),NOSPLIT,$112
 // The standard prologue clobbers R31.
-// We use R16 and R17 as scratch registers.
-MOVD g_m(g), R16
-MOVD m_p(R16), R16
-MOVD (p_wbBuf+wbBuf_next)(R16), R17
+// We use R18 and R19 as scratch registers.
+MOVD g_m(g), R18
+MOVD m_p(R18), R18
+MOVD (p_wbBuf+wbBuf_next)(R18), R19
 // Increment wbBuf.next position.
-ADD $16, R17
-MOVD R17, (p_wbBuf+wbBuf_next)(R16)
-MOVD (p_wbBuf+wbBuf_end)(R16), R16
-CMP R16, R17
+ADD $16, R19
+MOVD R19, (p_wbBuf+wbBuf_next)(R18)
+MOVD (p_wbBuf+wbBuf_end)(R18), R18
+CMP R18, R19
 // Record the write.
-MOVD R21, -16(R17) // Record value
-MOVD (R20), R16 // TODO: This turns bad writes into bad reads.
-MOVD R16, -8(R17) // Record *slot
+MOVD R21, -16(R19) // Record value
+MOVD (R20), R18 // TODO: This turns bad writes into bad reads.
+MOVD R18, -8(R19) // Record *slot
 // Is the buffer full? (flags set in CMP above)
 BEQ flush
 ret:
@@ -956,11 +956,12 @@ flush:
 MOVD R8, (FIXED_FRAME+56)(R1)
 MOVD R9, (FIXED_FRAME+64)(R1)
 MOVD R10, (FIXED_FRAME+72)(R1)
-MOVD R11, (FIXED_FRAME+80)(R1)
-MOVD R12, (FIXED_FRAME+88)(R1)
+// R11, R12 may be clobbered by external-linker-inserted trampoline
 // R13 is REGTLS
-MOVD R14, (FIXED_FRAME+96)(R1)
-MOVD R15, (FIXED_FRAME+104)(R1)
+MOVD R14, (FIXED_FRAME+80)(R1)
+MOVD R15, (FIXED_FRAME+88)(R1)
+MOVD R16, (FIXED_FRAME+96)(R1)
+MOVD R17, (FIXED_FRAME+104)(R1)

 // This takes arguments R20 and R21.
 CALL runtime·wbBufFlush(SB)
@@ -975,10 +976,10 @@ flush:
 MOVD (FIXED_FRAME+56)(R1), R8
 MOVD (FIXED_FRAME+64)(R1), R9
 MOVD (FIXED_FRAME+72)(R1), R10
-MOVD (FIXED_FRAME+80)(R1), R11
-MOVD (FIXED_FRAME+88)(R1), R12
-MOVD (FIXED_FRAME+96)(R1), R14
-MOVD (FIXED_FRAME+104)(R1), R15
+MOVD (FIXED_FRAME+80)(R1), R14
+MOVD (FIXED_FRAME+88)(R1), R15
+MOVD (FIXED_FRAME+96)(R1), R16
+MOVD (FIXED_FRAME+104)(R1), R17
 JMP ret

 // Note: these functions use a special calling convention to save generated code space.
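
For readers less familiar with PPC64 assembly, the fast path of `gcWriteBarrier` shown in the diff above can be modeled roughly in Go. This is a hedged sketch, not the runtime's implementation: the `wbBuf` type, its fields, and the `flush` behavior (resetting the buffer and counting flushes) are hypothetical simplifications, with slice indices standing in for the pointer arithmetic on `wbBuf.next` and `wbBuf.end`.

```go
package main

import "fmt"

// wbBuf is a toy model of the per-P write barrier buffer.
// ASSUMPTION: the real runtime structure uses raw pointers, not a slice.
type wbBuf struct {
	buf     []uintptr // backing storage; len(buf) plays the role of wbBuf.end
	next    int       // index of the next free slot, like wbBuf.next
	flushes int       // hypothetical counter, for illustration only
}

// flush stands in for the slow path that calls runtime.wbBufFlush.
// ASSUMPTION: here it simply empties the buffer.
func (b *wbBuf) flush() {
	b.next = 0
	b.flushes++
}

// record mirrors the fast path: reserve two slots, store the new value and
// the old contents of the slot, then flush if the buffer became full.
func (b *wbBuf) record(val, old uintptr) {
	b.next += 2           // ADD $16: two 8-byte entries
	b.buf[b.next-2] = val // MOVD R21, -16(R19): record value
	b.buf[b.next-1] = old // MOVD R18, -8(R19): record *slot
	if b.next == len(b.buf) { // CMP ...; BEQ flush
		b.flush()
	}
}

func main() {
	b := &wbBuf{buf: make([]uintptr, 4)}
	b.record(0x10, 0x20)
	b.record(0x30, 0x40) // fills the buffer and triggers a flush
	fmt.Println("next:", b.next, "flushes:", b.flushes)
}
```

In the assembly, the slow path is the only place that saves and restores registers around a normal Go call, which is why the set of spilled registers there has to match the registers the compiler assumes are preserved.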