cmd/compile: ARM SSA support #15365
Comments
CL https://golang.org/cl/22187 mentions this issue.
CL https://golang.org/cl/22186 mentions this issue.
CL https://golang.org/cl/22260 mentions this issue.
… on ARM Progress on SSA for ARM. Still not complete. Now Fibonacci function compiles and runs correctly. The old backend swaps the operands for CMP instruction. This CL does the same on SSA backend, and uses conditional branch accordingly. Updates #15365. Change-Id: I117e17feb22f03d936608bd232f76970e4bbe21a Reviewed-on: https://go-review.googlesource.com/22187 Reviewed-by: Keith Randall <[email protected]>
Progress on SSA backend for ARM. Still not complete. It compiles a Fibonacci function, but the caller picked the return value from an incorrect offset. This CL adjusts it to match the stack frame layout for architectures with link register. Updates #15365. Change-Id: I01e03c3e95f5503a185e8ac2b6d9caf4faf3d014 Reviewed-on: https://go-review.googlesource.com/22186 Reviewed-by: Keith Randall <[email protected]> Run-TryBot: Brad Fitzpatrick <[email protected]> TryBot-Result: Gobot Gobot <[email protected]>
CL https://golang.org/cl/22653 mentions this issue.
CL https://golang.org/cl/22654 mentions this issue.
CL https://golang.org/cl/22855 mentions this issue.
CL https://golang.org/cl/22854 mentions this issue.
CL https://golang.org/cl/22856 mentions this issue.
Progress on SSA backend for ARM. Still not complete. Now "helloworld" function compiles and runs. Updates #15365. Change-Id: I02f66983cefdf07a6aed262fb4af8add464d8e9a Reviewed-on: https://go-review.googlesource.com/22854 Reviewed-by: Keith Randall <[email protected]>
…Barrier check Use 32-bit load for writeBarrier check on all architectures. Padding added to runtime structure. Updates #15365, #15492. Change-Id: I5d3dadf8609923fe0fe4fcb384a418b7b9624998 Reviewed-on: https://go-review.googlesource.com/22855 Reviewed-by: Keith Randall <[email protected]> Run-TryBot: Cherry Zhang <[email protected]> TryBot-Result: Gobot Gobot <[email protected]>
- generic Ops: Phi, CALL variants, NilCheck
- generic Blocks: Plain, Check
- 32-bit arithmetics
- CMP and conditional branches
- load/store
- zero/sign-extensions (8 to 16, 8 to 32, 16 to 32)

Progress on SSA backend for ARM. Still not complete. Now "errors" package compiles and tests passed.

Updates #15365. Change-Id: If126fd17f8695cbf55d64085bb3f1a4a53205701 Reviewed-on: https://go-review.googlesource.com/22856 Reviewed-by: Keith Randall <[email protected]>
CL https://golang.org/cl/23093 mentions this issue.
CL https://golang.org/cl/23097 mentions this issue.
CL https://golang.org/cl/23096 mentions this issue.
Fix hardcoded flag register mask in ssa/flagalloc.go by auto-generating the mask. Also fix a mistake (in previous CL) about conditional branches. Progress on SSA backend for ARM. Still not complete. Now "container/ring" package compiles and tests passed. Updates #15365. Change-Id: Id7c8805c30dbb8107baedb485ed0f71f59ed6ea8 Reviewed-on: https://go-review.googlesource.com/23093 Reviewed-by: Keith Randall <[email protected]>
Implement shifts and multiplications for up to 32-bit values. Also handle Exit block. Progress on SSA backend for ARM. Still not complete. container/heap, crypto/subtle, hash/adler32 packages compile and tests passed. Updates #15365. Change-Id: I6bee4d5b0051e51d5de97e8a1938c4b87a36cbf8 Reviewed-on: https://go-review.googlesource.com/23096 Reviewed-by: Keith Randall <[email protected]>
CL https://golang.org/cl/23212 mentions this issue.
Generate load/stores for small zeroing/move, DUFFZERO/DUFFCOPY for medium zeroing/move, and loops for large zeroing/move. cmd/compile/internal/gc/testdata/{copy_ssa.go,zero_ssa.go} tests passed. Progress on SSA backend for ARM. Still not complete. A few packages in the standard library compile and tests passed, including container/list, hash/crc32, unicode/utf8, etc. Updates #15365. Change-Id: Ieb4b68b44ee7de66bf7b68f5f33a605349fcc6fa Reviewed-on: https://go-review.googlesource.com/23097 Reviewed-by: Keith Randall <[email protected]>
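The three-way split described in this commit (individual load/stores, Duff's device, explicit loop) can be pictured as a size-based dispatch. The following is only a sketch: the commit message does not state the cutoffs, so `smallLimit` and `mediumLimit` are hypothetical values chosen for illustration, and `chooseStrategy` is not a function from the compiler.

```go
package main

import "fmt"

// strategy names the three code-generation choices mentioned in the commit.
type strategy string

const (
	inlineStores strategy = "individual load/stores"
	duffDevice   strategy = "DUFFZERO/DUFFCOPY"
	explicitLoop strategy = "loop"
)

// Hypothetical thresholds in bytes. The real cutoffs are not given in the
// commit message; these exist only to make the sketch runnable.
const (
	smallLimit  = 16
	mediumLimit = 512
)

// chooseStrategy picks a zeroing/move strategy from the block size alone.
func chooseStrategy(size int64) strategy {
	switch {
	case size <= smallLimit:
		return inlineStores
	case size <= mediumLimit:
		return duffDevice
	default:
		return explicitLoop
	}
}

func main() {
	for _, n := range []int64{8, 128, 4096} {
		fmt.Printf("%4d bytes: %s\n", n, chooseStrategy(n))
	}
}
```

A real backend would likely consider more than size (alignment, and whether Duff's device is permitted on the target), so treat the single-parameter dispatch as a simplification.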
CL https://golang.org/cl/23292 mentions this issue.
CL https://golang.org/cl/23213 mentions this issue.
Also fix argument offset for runtime calls. Also fix LoadReg/StoreReg by generating instructions by type. Progress on SSA backend for ARM. Still not complete. Tests append_ssa.go, assert_ssa.go, loadstore_ssa.go, short_ssa.go, and deferNoReturn.go in cmd/compile/internal/gc/testdata passed. Updates #15365. Change-Id: I0f0a2398cab8bbb461772a55241a16a7da2ecedf Reviewed-on: https://go-review.googlesource.com/23212 Reviewed-by: David Chase <[email protected]>
CL https://golang.org/cl/23486 mentions this issue.
CL https://golang.org/cl/23542 mentions this issue.
Introduce dec64 rules to (generically) decompose 64-bit integers on 32-bit architectures. A 64-bit integer is composed/decomposed with Int64Make/Hi/Lo ops, as for complex types.

The idea of dealing with Add64 is the following:

	(Add64 (Int64Make xh xl) (Int64Make yh yl))
	->
	(Int64Make
		(Add32withcarry xh yh (Select0 (Add32carry xl yl)))
		(Select1 (Add32carry xl yl)))

where Add32carry returns a tuple (flags,uint32). Select0 and Select1 read the first and the second component of the tuple, respectively. The two Add32carry will be CSE'd.

Similarly for multiplication, Mul32uhilo returns a tuple (hi, lo).

Also add support for KeepAlive, to fix the build after merge.

Tests addressed_ssa.go, array_ssa.go, break_ssa.go, chan_ssa.go, cmp_ssa.go, ctl_ssa.go, map_ssa.go, and string_ssa.go in cmd/compile/internal/gc/testdata passed.

Progress on SSA for ARM. Still not complete.

Updates #15365. Change-Id: I7867c76785a456312de5d8398a6b3f7ca5a4f7ec Reviewed-on: https://go-review.googlesource.com/23213 Reviewed-by: Keith Randall <[email protected]>
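To make the Add64 decomposition concrete, here is a small sketch of the same arithmetic in ordinary Go. It uses `bits.Add32` and `bits.Mul32` from the standard `math/bits` package as stand-ins for Add32carry/Add32withcarry and Mul32uhilo; the helper names `add64` and `mul64` are made up for this illustration, and the multiplication formula is an assumption about how such a rule could look, since the commit message only spells out the Add64 case.

```go
package main

import (
	"fmt"
	"math/bits"
)

// add64 adds two 64-bit values given as (hi, lo) pairs of 32-bit words.
// The low-word add produces a carry (the "flags" half of the tuple), which
// the high-word add then consumes.
func add64(xh, xl, yh, yl uint32) (uint32, uint32) {
	lo, carry := bits.Add32(xl, yl, 0) // plays the role of Add32carry
	hi, _ := bits.Add32(xh, yh, carry) // plays the role of Add32withcarry
	return hi, lo
}

// mul64 returns the low 64 bits of a 64x64 multiplication, again as (hi, lo).
// bits.Mul32 yields the full 64-bit product of the low words as a (hi, lo)
// tuple; the cross terms only affect the high word.
func mul64(xh, xl, yh, yl uint32) (uint32, uint32) {
	h, l := bits.Mul32(xl, yl) // plays the role of Mul32uhilo
	return h + xh*yl + xl*yh, l
}

func main() {
	x, y := uint64(0x00000001ffffffff), uint64(1)
	hi, lo := add64(uint32(x>>32), uint32(x), uint32(y>>32), uint32(y))
	fmt.Printf("add64: %#x, native: %#x\n", uint64(hi)<<32|uint64(lo), x+y)

	mh, ml := mul64(uint32(x>>32), uint32(x), 0, 3)
	fmt.Printf("mul64: %#x, native: %#x\n", uint64(mh)<<32|uint64(ml), x*3)
}
```

Note that `add64` calls the low-word add once and uses both of its results, which mirrors the effect of CSE merging the two Add32carry occurrences in the rewrite rule.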
Auto-generate register masks and load them through Config. Passed toolstash -cmp on AMD64. Tests phi_ssa.go and regalloc_ssa.go in cmd/compile/internal/gc/testdata passed on ARM. Updates #15365. Change-Id: I393924d68067f2dbb13dab82e569fb452c986593 Reviewed-on: https://go-review.googlesource.com/23292 Reviewed-by: David Chase <[email protected]>
Also fix a mistake in previous CL about x8 and x16 shifts: the shift needs ZeroExt. Progress on SSA for ARM. Still not complete. Updates #15365. Change-Id: Ibc352760023d38bc6b9c5251e929fe26e016637a Reviewed-on: https://go-review.googlesource.com/23486 Reviewed-by: David Chase <[email protected]>
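The hazard behind the ZeroExt fix can be shown with a toy model. The sketch below assumes (the commit message does not say this explicitly) that an 8-bit value lives in a 32-bit register whose upper bits may hold leftover data; shifting right without first zero-extending drags those bits into the result. The function names are hypothetical.

```go
package main

import "fmt"

// shiftRight8NoExt shifts the whole 32-bit register and then truncates,
// so stale upper bits leak into the 8-bit result.
func shiftRight8NoExt(reg uint32, s uint) uint8 {
	return uint8(reg >> s)
}

// shiftRight8ZeroExt zero-extends the low byte first (the ZeroExt step),
// then shifts, which gives the correct unsigned 8-bit result.
func shiftRight8ZeroExt(reg uint32, s uint) uint8 {
	return uint8((reg & 0xff) >> s)
}

func main() {
	reg := uint32(0xabcdef80) // low byte 0x80, upper bits are junk
	fmt.Printf("without ZeroExt: %#02x\n", shiftRight8NoExt(reg, 4))
	fmt.Printf("with ZeroExt:    %#02x\n", shiftRight8ZeroExt(reg, 4))
}
```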
CL https://golang.org/cl/23652 mentions this issue.
SSA treats SP as constant throughout a function, and likewise OffPtr [off] SP. When the stack moves, spilled OffPtr values become invalid if they are not pointer-typed. (Currently it is fine because of the optimization rules that fold OffPtr into Load/Store. But that had better be an optimization, not a requirement.) Updates #15365. Change-Id: I76cf4008dfdc169e1cb5a55a2605b6678efc915d Reviewed-on: https://go-review.googlesource.com/23941 Run-TryBot: Cherry Zhang <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: David Chase <[email protected]>
Introduce an op MOVWaddr for addresses on ARM, instead of overusing ADDconst. Mark MOVWaddr as rematerializable. This fixes a liveness problem: if it were not rematerializable, the address of a variable might be spilled, a later use of the address might just load the spilled value without mentioning the variable, and the liveness code might conclude prematurely that the variable is dead. Update #15365. Change-Id: Ib0b0fa826bdb75c9e6bb362b95c6cf132cc6b1c0 Reviewed-on: https://go-review.googlesource.com/23942 Reviewed-by: David Chase <[email protected]>
Use hardware g register (R10) for GetG, allow g to appear at LHS of some ops. Progress on SSA backend for ARM. Now everything compiles and runs. Updates #15365. Change-Id: Icdf93585579faa86cc29b1e17ab7c90f0119fc4e Reviewed-on: https://go-review.googlesource.com/23952 Reviewed-by: David Chase <[email protected]>
CL https://golang.org/cl/24137 mentions this issue.
CL https://golang.org/cl/24210 mentions this issue.
CSE may substitute a tuple generator with another one in a different block. In this case, since we want tuple selectors to stay together with the tuple generator, copy the selector to the new generator's block and rewrite its use. Op.isTupleGenerator and Op.isTupleSelector are introduced to assert tuple ops. Use it in tighten as well. Updates #15365. Change-Id: Ia9e8c734b9cc3bc9fca4a2750041eef9cdfac5a5 Reviewed-on: https://go-review.googlesource.com/24137 Reviewed-by: David Chase <[email protected]>
CL https://golang.org/cl/24451 mentions this issue.
Like AMD64, don't issue NilCheck instruction if the subsequent block has a load or store at the same address. Pass test/nilptr3_ssa.go. Updates #15365. Change-Id: Ic88780dab8c4893c57d1c95f663760cc185fe51e Reviewed-on: https://go-review.googlesource.com/24451 Reviewed-by: David Chase <[email protected]> Run-TryBot: David Chase <[email protected]>
CL https://golang.org/cl/24511 mentions this issue.
CL https://golang.org/cl/24512 mentions this issue.
CL https://golang.org/cl/24646 mentions this issue.
Encode the size and the alignment into AuxInt of Zero and Move ops. On AMD64, we simply don't look at the alignment. On ARM and PPC64, we only generate aligned stores. Updates #15365. Change-Id: Ifdcc205c364f67c4516b9adebfe7d50d223b6863 Reviewed-on: https://go-review.googlesource.com/24511 Reviewed-by: David Chase <[email protected]> Reviewed-by: Keith Randall <[email protected]> Run-TryBot: Cherry Zhang <[email protected]> TryBot-Result: Gobot Gobot <[email protected]>
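One way to picture "size and alignment in AuxInt" is to pack both into a single int64. The sketch below is illustrative only: the bit layout (alignment in the top 8 bits, size in the low 56) and the names `sizeAndAlign` and `makeSizeAndAlign` are assumptions for this example, not something the commit message specifies.

```go
package main

import "fmt"

// sizeAndAlign packs a byte count and an alignment into one int64, the
// width of an AuxInt field. Layout assumed here: alignment in the top
// 8 bits, size in the low 56 bits.
type sizeAndAlign int64

// makeSizeAndAlign validates and packs the two values.
func makeSizeAndAlign(size, align int64) sizeAndAlign {
	if size < 0 || size >= 1<<56 {
		panic("size out of range")
	}
	if align < 0 || align >= 1<<7 {
		panic("alignment out of range")
	}
	return sizeAndAlign(size | align<<56)
}

// size extracts the byte count from the low 56 bits.
func (x sizeAndAlign) size() int64 { return int64(x) & (1<<56 - 1) }

// align extracts the alignment from the top 8 bits.
func (x sizeAndAlign) align() int64 { return int64(uint64(x) >> 56) }

func main() {
	v := makeSizeAndAlign(24, 4)
	fmt.Printf("size=%d align=%d\n", v.size(), v.align())
}
```

With this shape, a backend that ignores alignment (as the commit says AMD64 does) simply reads `size()`, while one that must emit aligned stores consults `align()` to pick the store width.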
As Josh mentioned in CL 24716, there have been requests for using SSA for ARM. SSA can still be disabled by setting -ssa=0 for cmd/compile, or partially enabled with GOSSAFUNC, GOSSAPKG, and GOSSAHASH. Do not enable SSA by default on NaCl, which is not supported yet. Enable SSA-specific tests on ARM: live_ssa.go and nilptr3_ssa.go; disable non-SSA tests: live.go, nilptr3.go, and sliceopt.go. Updates #15365. Change-Id: Ic2ca8d166aeca8517b9d262a55e92f2130683a16 Reviewed-on: https://go-review.googlesource.com/23953 Run-TryBot: Cherry Zhang <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Josh Bleecher Snyder <[email protected]> Reviewed-by: David Chase <[email protected]>
Mostly constant folding rules, analogous to the AMD64 ones, along with some simplifications. Updates #15365. Change-Id: If83bc1188bb05acb982ef3a1c21704c187e3eb24 Reviewed-on: https://go-review.googlesource.com/24210 Run-TryBot: David Chase <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: David Chase <[email protected]>
CL https://golang.org/cl/24790 mentions this issue.
CL https://golang.org/cl/24859 mentions this issue.
CL https://golang.org/cl/24909 mentions this issue.
This CL implements the following optimizations for ARM:
- use shifted ops (e.g. ADD R1<<2, R2) and indexed load/stores
- break up shift ops. Shifts used to be one SSA op that generates multiple instructions. We break them up to multiple ops, which allows constant folding and CSE for comparisons. Conditional moves are introduced for this.
- simplify zero/sign-extension ops.

Updates #15365. Change-Id: I55e262a776a7ef2a1505d75e04d1208913c35d39 Reviewed-on: https://go-review.googlesource.com/24512 Run-TryBot: Cherry Zhang <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: David Chase <[email protected]>
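As a rough illustration of why shifted operands help, the sketch below models what a single `ADD R1<<2, R2` computes and how that matches the address arithmetic for indexing an array of 4-byte elements. The function names are hypothetical, and the model covers only the arithmetic, not instruction selection.

```go
package main

import "fmt"

// addShifted models ADD Rm<<shift, Rn: a single instruction that computes
// Rn + (Rm << shift), with no separate shift instruction needed.
func addShifted(rn, rm uint32, shift uint) uint32 {
	return rn + rm<<shift
}

// elemAddr computes the address of element i in an array of 4-byte
// elements starting at base, i.e. base + i*4, using one shifted add.
func elemAddr(base, i uint32) uint32 {
	return addShifted(base, i, 2)
}

func main() {
	fmt.Printf("%#x\n", elemAddr(0x1000, 3))
}
```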
The argument size for a runtime call incorrectly included the size of LR (FixedFrameSize in general). This made the stack frame sometimes unnecessarily 4 bytes larger on ARM. For example,

	func f(b []byte) byte { return b[0] }

compiles to

	0x0000 00000 (h.go:6) TEXT "".f(SB), $4-16 // <-- framesize = 4
	0x0000 00000 (h.go:6) MOVW 8(g), R1
	0x0004 00004 (h.go:6) CMP R1, R13
	0x0008 00008 (h.go:6) BLS 52
	0x000c 00012 (h.go:6) MOVW.W R14, -8(R13)
	0x0010 00016 (h.go:6) FUNCDATA $0, gclocals·8355ad952265fec823c17fcf739bd009(SB)
	0x0010 00016 (h.go:6) FUNCDATA $1, gclocals·69c1753bd5f81501d95132d08af04464(SB)
	0x0010 00016 (h.go:6) MOVW "".b+4(FP), R0
	0x0014 00020 (h.go:6) CMP $0, R0
	0x0018 00024 (h.go:6) BLS 44
	0x001c 00028 (h.go:6) MOVW "".b(FP), R0
	0x0020 00032 (h.go:6) MOVBU (R0), R0
	0x0024 00036 (h.go:6) MOVB R0, "".~r1+12(FP)
	0x0028 00040 (h.go:6) MOVW.P 8(R13), R15
	0x002c 00044 (h.go:6) PCDATA $0, $1
	0x002c 00044 (h.go:6) CALL runtime.panicindex(SB)
	0x0030 00048 (h.go:6) UNDEF
	0x0034 00052 (h.go:6) NOP
	0x0034 00052 (h.go:6) MOVW R14, R3
	0x0038 00056 (h.go:6) CALL runtime.morestack_noctxt(SB)
	0x003c 00060 (h.go:6) JMP 0

Note that the frame size is 4, but there is actually no local. The compiler incorrectly thinks the call to runtime.panicindex needs 4 bytes of space for arguments. This CL fixes it.

Updates #15365. Change-Id: Ic65d55283a6aa8a7861d7a3fbc7b63c35785eeec Reviewed-on: https://go-review.googlesource.com/24909 Run-TryBot: Cherry Zhang <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: David Chase <[email protected]>
Add some simplification rules for floating point ops. cmd/internal/obj/arm supports instructions that compare FP register to 0, but runtime softfloat simulator does not. This CL adds these instructions to softfloat simulator as well. Updates #15365. Change-Id: I29405b2bfcb4c8cf106cb7a1a811409fec91b170 Reviewed-on: https://go-review.googlesource.com/24790 Run-TryBot: Cherry Zhang <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: David Chase <[email protected]>
NaCl code runs in a sandbox and there are restrictions on the instructions it may use (https://developer.chrome.com/native-client/reference/sandbox_internals/arm-32-bit-sandbox). Like the legacy backend, on NaCl,
- don't use R9, which is used as NaCl's "thread pointer".
- don't use Duff's device.
- don't use indexed load/stores.
- the assembler rewrites DIV/MOD to runtime calls, which on NaCl clobbers R12, so R12 is marked as clobbered for DIV/MOD.
- other restrictions are satisfied by the assembler.

Enable SSA specific tests on nacl/arm, and disable non-SSA ones.

Updates #15365. Change-Id: I9262693ec6756b89ca29d3ae4e52a96fe5403b02 Reviewed-on: https://go-review.googlesource.com/24859 Reviewed-by: Josh Bleecher Snyder <[email protected]>
… on ARM Updates #15365. Change-Id: I372a5617c2c7d91de545cac0464809b96711b63a Reviewed-on: https://go-review.googlesource.com/24646 Run-TryBot: Cherry Zhang <[email protected]> Reviewed-by: David Chase <[email protected]>
CL https://golang.org/cl/25059 mentions this issue.
For register-register move, if there is only one use, allocate it in the same register so we don't need to emit an instruction. Updates #15365. Change-Id: Iad41843854a506c521d577ad93fcbe73e8de8065 Reviewed-on: https://go-review.googlesource.com/25059 Run-TryBot: Cherry Zhang <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: David Chase <[email protected]>
Cherry, is this done?
Yes, it is done. We can close the issue, now or when it is merged to master.
Implement SSA backend for ARM architecture.