Message ID | eb78dcd6-ef63-40b1-a7e2-8ec2422dcefb@codesourcery.com |
---|---|
State | New |
Headers | show |
Series | [committed] amdgcn: align TImode registers | expand |
This is now backported to GCC 10. Andrew On 11/09/2020 11:17, Andrew Stubbs wrote: > This patch fixes an execution failure in which the compiler would > corrupt TImode values due to missed early clobber problems with > partially overlapping register allocations. In fact, adding early > clobber constraints does not fix the issue because IRA doesn't check > that for move instructions properly in all circumstances anyway (the > constraint causes an ICE in postreload). > > This patch fixes the problem by ensuring that TImode values are always > aligned to 4-register boundaries, meaning that inputs and outputs will > either overlap completely, or not at all, neither of which have > early-clobber issues. > > This is an artificial restriction the hardware not present in hardware , > but it is the same solution we use for DImode values where we had a lot > of the same problems. > > With the patch I see the following new test passes: > > PASS: gfortran.dg/PR95331.f90 -O3 -fomit-frame-pointer -funroll-loops > -fpeel-loops -ftracer -finline-functions execution test > PASS: gfortran.dg/PR95331.f90 -O3 -g execution test > PASS: gfortran.dg/gamma_5.f90 -O0 execution test > PASS: gfortran.dg/gamma_5.f90 -O1 execution test > PASS: gfortran.dg/gamma_5.f90 -O2 execution test > PASS: gfortran.dg/gamma_5.f90 -O3 -fomit-frame-pointer -funroll-loops > -fpeel-loops -ftracer -finline-functions execution test > PASS: gfortran.dg/gamma_5.f90 -O3 -g execution test > PASS: gfortran.dg/gamma_5.f90 -Os execution test > PASS: gfortran.dg/intrinsic_pack_1.f90 -O3 -fomit-frame-pointer > -funroll-loops -fpeel-loops -ftracer -finline-functions execution test > PASS: gfortran.dg/intrinsic_pack_1.f90 -O3 -g execution test > PASS: gfortran.dg/optional_absent_5.f90 -O3 -fomit-frame-pointer > -funroll-loops -fpeel-loops -ftracer -finline-functions execution test > PASS: gfortran.dg/optional_absent_5.f90 -O3 -g execution test > > Andrew
amdgcn: align TImode registers This prevents execution failures caused by partially overlapping input and output registers. This is the same solution already used for DImode. gcc/ChangeLog: * config/gcn/gcn.c (gcn_hard_regno_mode_ok): Align TImode registers. * config/gcn/gcn.md: Assert that TImode registers do not early clobber. diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c index 8b3c4544dd5..84d1fd9a354 100644 --- a/gcc/config/gcn/gcn.c +++ b/gcc/config/gcn/gcn.c @@ -475,7 +475,8 @@ gcn_hard_regno_mode_ok (unsigned int regno, machine_mode mode) return (vgpr_1reg_mode_p (mode) || (!((regno - FIRST_VGPR_REG) & 1) && vgpr_2reg_mode_p (mode)) /* TImode is used by DImode compare_and_swap. */ - || mode == TImode); + || (mode == TImode + && !((regno - FIRST_VGPR_REG) & 3))); return false; } diff --git a/gcc/config/gcn/gcn.md b/gcc/config/gcn/gcn.md index aeb25fbb931..0e73fea93cf 100644 --- a/gcc/config/gcn/gcn.md +++ b/gcc/config/gcn/gcn.md @@ -677,6 +677,8 @@ (define_insn_and_split "*movti_insn" (set (match_dup 4) (match_dup 5)) (set (match_dup 6) (match_dup 7))] { + gcc_assert (rtx_equal_p (operands[0], operands[1]) + || !reg_overlap_mentioned_p (operands[0], operands[1])); operands[6] = gcn_operand_part (TImode, operands[0], 3); operands[7] = gcn_operand_part (TImode, operands[1], 3); operands[4] = gcn_operand_part (TImode, operands[0], 2);