Message ID | 201311111432.rABEWoQ7016136@d06av02.portsmouth.uk.ibm.com |
---|---|
State | New |
Headers | show |
On 11/11/13 07:32, Ulrich Weigand wrote: > Hello, > > when implementing the new ABI for powerpc64le-linux, I ran into an assertion > failure in store_unaligned_arguments_into_pseudos: > gcc_assert (args[i].partial % UNITS_PER_WORD == 0); > > This can happen in the new ABI since we pass "homogeneous structures" > consisting of soleley floating point elements of the same type in > floating-point registers until they run out, and the rest of the > structure in memory. If the structure member type is a 4-byte float, > and we only have an odd number of floating point registers available, > then args[i].partial can legitimately end up not being a multiple > of UNITS_PER_WORD (i.e. 8 on the platform). > > Now, there are a number of similar checks that args[i].partial is > aligned elsewhere in calls.c and functions.c. But for all of those > the logic is: if args[i].reg is a REG, then args[i].partial must be > a multiple of the word size; but if args[i].regs is a PARALLEL, then > args[i].partial can be any arbitrary value. In the powerpc64le-linux > use case, the back-end always generates PARALLEL in this situation, > so it seemed the middle-end ought to support this -- and it does, > except for this one case. > > However, looking more closely, it seems store_unaligned_arguments_into_pseudos > is not really useful for PARALLEL arguments in the first place. What this > routine does is load arguments into args[i].aligned_regs. But if we have > an argument where args[i].reg is a PARALLEL, args[i].aligned_regs will in > fact never be used later on at all! Instead, PARALLEL will always be > handled directly via emit_group_move (in load_register_parameters), so > the code generated by store_unaligned_arguments_into_pseudos for such cases > is simply dead anyway. > > Thus, I'd suggest to simply have store_unaligned_arguments_into_pseudos > skip PARALLEL arguments. > > Tested on powerpc64-linux and powerpc64le-linux. > > OK for mainline? > > Bye, > Ulrich > > > > ChangeLog: > > 2013-11-11 Ulrich Weigand <Ulrich.Weigand@de.ibm.com> > > * calls.c (store_unaligned_arguments_into_pseudos): Skip PARALLEL > arguments. Does this work on the PA, particularly the 32bit ABI? /* Structures 5 to 8 bytes in size are passed in the general registers in the same manner as other non floating-point objects. The data is right-justified and zero-extended to 64 bits. This is opposite to the normal justification used on big endian targets and requires special treatment. We now define BLOCK_REG_PADDING to pad these objects. Aggregates, complex and vector types are passed in the same manner as structures. */ if (mode == BLKmode || (type && (AGGREGATE_TYPE_P (type) || TREE_CODE (type) == COMPLEX_TYPE || TREE_CODE (type) == VECTOR_TYPE))) { rtx loc = gen_rtx_EXPR_LIST (VOIDmode, gen_rtx_REG (DImode, gpr_reg_base), const0_rtx); return gen_rtx_PARALLEL (BLKmode, gen_rtvec (1, loc)); } It's been eons since I looked at this stuff, so if it's not relevant, then just say so and I'll put those concerns aside. jeff
On 11/11/13 07:32, Ulrich Weigand wrote: > Hello, > > when implementing the new ABI for powerpc64le-linux, I ran into an assertion > failure in store_unaligned_arguments_into_pseudos: > gcc_assert (args[i].partial % UNITS_PER_WORD == 0); > > This can happen in the new ABI since we pass "homogeneous structures" > consisting of soleley floating point elements of the same type in > floating-point registers until they run out, and the rest of the > structure in memory. If the structure member type is a 4-byte float, > and we only have an odd number of floating point registers available, > then args[i].partial can legitimately end up not being a multiple > of UNITS_PER_WORD (i.e. 8 on the platform). > > Now, there are a number of similar checks that args[i].partial is > aligned elsewhere in calls.c and functions.c. But for all of those > the logic is: if args[i].reg is a REG, then args[i].partial must be > a multiple of the word size; but if args[i].regs is a PARALLEL, then > args[i].partial can be any arbitrary value. In the powerpc64le-linux > use case, the back-end always generates PARALLEL in this situation, > so it seemed the middle-end ought to support this -- and it does, > except for this one case. > > However, looking more closely, it seems store_unaligned_arguments_into_pseudos > is not really useful for PARALLEL arguments in the first place. What this > routine does is load arguments into args[i].aligned_regs. But if we have > an argument where args[i].reg is a PARALLEL, args[i].aligned_regs will in > fact never be used later on at all! Instead, PARALLEL will always be > handled directly via emit_group_move (in load_register_parameters), so > the code generated by store_unaligned_arguments_into_pseudos for such cases > is simply dead anyway. > > Thus, I'd suggest to simply have store_unaligned_arguments_into_pseudos > skip PARALLEL arguments. > > Tested on powerpc64-linux and powerpc64le-linux. > > OK for mainline? > > Bye, > Ulrich > > > > ChangeLog: > > 2013-11-11 Ulrich Weigand <Ulrich.Weigand@de.ibm.com> > > * calls.c (store_unaligned_arguments_into_pseudos): Skip PARALLEL > arguments. OK, so after a lot of worrying, I think this is OK. I kept thinking this had to tie into the BLKmode return value braindamage that we have to support on the PA. But that's not the case here. This is strictly arguments. If the argument is a PARALLEL, then yes, there's no sense in using store_unaligned_arguments_into_pseudos, at least AFAICT, they are handled by emit_group_move. OK for the trunk, thanks for your patience, Jeff
Index: gcc/gcc/calls.c =================================================================== --- gcc.orig/gcc/calls.c +++ gcc/gcc/calls.c @@ -981,6 +981,7 @@ store_unaligned_arguments_into_pseudos ( for (i = 0; i < num_actuals; i++) if (args[i].reg != 0 && ! args[i].pass_on_stack + && GET_CODE (args[i].reg) != PARALLEL && args[i].mode == BLKmode && MEM_P (args[i].value) && (MEM_ALIGN (args[i].value)