diff mbox

Fix failing assertion in calls.c:store_unaligned_arguments_into_pseudos

Message ID 201311111432.rABEWoQ7016136@d06av02.portsmouth.uk.ibm.com
State New
Headers show

Commit Message

Ulrich Weigand Nov. 11, 2013, 2:32 p.m. UTC
Hello,

when implementing the new ABI for powerpc64le-linux, I ran into an assertion
failure in store_unaligned_arguments_into_pseudos:
            gcc_assert (args[i].partial % UNITS_PER_WORD == 0);

This can happen in the new ABI since we pass "homogeneous structures"
consisting of soleley floating point elements of the same type in
floating-point registers until they run out, and the rest of the
structure in memory.  If the structure member type is a 4-byte float,
and we only have an odd number of floating point registers available,
then args[i].partial can legitimately end up not being a multiple
of UNITS_PER_WORD (i.e. 8 on the platform).

Now, there are a number of similar checks that args[i].partial is
aligned elsewhere in calls.c and functions.c.  But for all of those
the logic is: if args[i].reg is a REG, then args[i].partial must be
a multiple of the word size; but if args[i].regs is a PARALLEL, then
args[i].partial can be any arbitrary value.  In the powerpc64le-linux
use case, the back-end always generates PARALLEL in this situation,
so it seemed the middle-end ought to support this -- and it does,
except for this one case.

However, looking more closely, it seems store_unaligned_arguments_into_pseudos
is not really useful for PARALLEL arguments in the first place.  What this
routine does is load arguments into args[i].aligned_regs.  But if we have
an argument where args[i].reg is a PARALLEL, args[i].aligned_regs will in
fact never be used later on at all!   Instead, PARALLEL will always be
handled directly via emit_group_move (in load_register_parameters), so
the code generated by store_unaligned_arguments_into_pseudos for such cases
is simply dead anyway.

Thus, I'd suggest to simply have store_unaligned_arguments_into_pseudos
skip PARALLEL arguments.

Tested on powerpc64-linux and powerpc64le-linux.

OK for mainline?

Bye,
Ulrich



ChangeLog:

2013-11-11  Ulrich Weigand  <Ulrich.Weigand@de.ibm.com>

	* calls.c (store_unaligned_arguments_into_pseudos): Skip PARALLEL
	arguments.

Comments

Jeff Law Nov. 11, 2013, 6:31 p.m. UTC | #1
On 11/11/13 07:32, Ulrich Weigand wrote:
> Hello,
>
> when implementing the new ABI for powerpc64le-linux, I ran into an assertion
> failure in store_unaligned_arguments_into_pseudos:
>              gcc_assert (args[i].partial % UNITS_PER_WORD == 0);
>
> This can happen in the new ABI since we pass "homogeneous structures"
> consisting of soleley floating point elements of the same type in
> floating-point registers until they run out, and the rest of the
> structure in memory.  If the structure member type is a 4-byte float,
> and we only have an odd number of floating point registers available,
> then args[i].partial can legitimately end up not being a multiple
> of UNITS_PER_WORD (i.e. 8 on the platform).
>
> Now, there are a number of similar checks that args[i].partial is
> aligned elsewhere in calls.c and functions.c.  But for all of those
> the logic is: if args[i].reg is a REG, then args[i].partial must be
> a multiple of the word size; but if args[i].regs is a PARALLEL, then
> args[i].partial can be any arbitrary value.  In the powerpc64le-linux
> use case, the back-end always generates PARALLEL in this situation,
> so it seemed the middle-end ought to support this -- and it does,
> except for this one case.
>
> However, looking more closely, it seems store_unaligned_arguments_into_pseudos
> is not really useful for PARALLEL arguments in the first place.  What this
> routine does is load arguments into args[i].aligned_regs.  But if we have
> an argument where args[i].reg is a PARALLEL, args[i].aligned_regs will in
> fact never be used later on at all!   Instead, PARALLEL will always be
> handled directly via emit_group_move (in load_register_parameters), so
> the code generated by store_unaligned_arguments_into_pseudos for such cases
> is simply dead anyway.
>
> Thus, I'd suggest to simply have store_unaligned_arguments_into_pseudos
> skip PARALLEL arguments.
>
> Tested on powerpc64-linux and powerpc64le-linux.
>
> OK for mainline?
>
> Bye,
> Ulrich
>
>
>
> ChangeLog:
>
> 2013-11-11  Ulrich Weigand  <Ulrich.Weigand@de.ibm.com>
>
> 	* calls.c (store_unaligned_arguments_into_pseudos): Skip PARALLEL
> 	arguments.
Does this work on the PA, particularly the 32bit ABI?

          /* Structures 5 to 8 bytes in size are passed in the general
              registers in the same manner as other non floating-point
              objects.  The data is right-justified and zero-extended
              to 64 bits.  This is opposite to the normal justification
              used on big endian targets and requires special treatment.
              We now define BLOCK_REG_PADDING to pad these objects.
              Aggregates, complex and vector types are passed in the same
              manner as structures.  */
           if (mode == BLKmode
               || (type && (AGGREGATE_TYPE_P (type)
                            || TREE_CODE (type) == COMPLEX_TYPE
                            || TREE_CODE (type) == VECTOR_TYPE)))
             {
               rtx loc = gen_rtx_EXPR_LIST (VOIDmode,
                                            gen_rtx_REG (DImode, 
gpr_reg_base),
                                            const0_rtx);
               return gen_rtx_PARALLEL (BLKmode, gen_rtvec (1, loc));
             }



It's been eons since I looked at this stuff, so if it's not relevant, 
then just say so and I'll put those concerns aside.

jeff
Jeff Law Nov. 14, 2013, 6:40 a.m. UTC | #2
On 11/11/13 07:32, Ulrich Weigand wrote:
> Hello,
>
> when implementing the new ABI for powerpc64le-linux, I ran into an assertion
> failure in store_unaligned_arguments_into_pseudos:
>              gcc_assert (args[i].partial % UNITS_PER_WORD == 0);
>
> This can happen in the new ABI since we pass "homogeneous structures"
> consisting of soleley floating point elements of the same type in
> floating-point registers until they run out, and the rest of the
> structure in memory.  If the structure member type is a 4-byte float,
> and we only have an odd number of floating point registers available,
> then args[i].partial can legitimately end up not being a multiple
> of UNITS_PER_WORD (i.e. 8 on the platform).
>
> Now, there are a number of similar checks that args[i].partial is
> aligned elsewhere in calls.c and functions.c.  But for all of those
> the logic is: if args[i].reg is a REG, then args[i].partial must be
> a multiple of the word size; but if args[i].regs is a PARALLEL, then
> args[i].partial can be any arbitrary value.  In the powerpc64le-linux
> use case, the back-end always generates PARALLEL in this situation,
> so it seemed the middle-end ought to support this -- and it does,
> except for this one case.
>
> However, looking more closely, it seems store_unaligned_arguments_into_pseudos
> is not really useful for PARALLEL arguments in the first place.  What this
> routine does is load arguments into args[i].aligned_regs.  But if we have
> an argument where args[i].reg is a PARALLEL, args[i].aligned_regs will in
> fact never be used later on at all!   Instead, PARALLEL will always be
> handled directly via emit_group_move (in load_register_parameters), so
> the code generated by store_unaligned_arguments_into_pseudos for such cases
> is simply dead anyway.
>
> Thus, I'd suggest to simply have store_unaligned_arguments_into_pseudos
> skip PARALLEL arguments.
>
> Tested on powerpc64-linux and powerpc64le-linux.
>
> OK for mainline?
>
> Bye,
> Ulrich
>
>
>
> ChangeLog:
>
> 2013-11-11  Ulrich Weigand  <Ulrich.Weigand@de.ibm.com>
>
> 	* calls.c (store_unaligned_arguments_into_pseudos): Skip PARALLEL
> 	arguments.
OK, so after a lot of worrying, I think this is OK.  I kept thinking 
this had to tie into the BLKmode return value braindamage that we have 
to support on the PA.  But that's not the case here.  This is strictly 
arguments.

If the argument is a PARALLEL, then yes, there's no sense in using 
store_unaligned_arguments_into_pseudos, at least AFAICT, they are 
handled by emit_group_move.

OK for the trunk, thanks for your patience,
Jeff
diff mbox

Patch

Index: gcc/gcc/calls.c
===================================================================
--- gcc.orig/gcc/calls.c
+++ gcc/gcc/calls.c
@@ -981,6 +981,7 @@  store_unaligned_arguments_into_pseudos (
 
   for (i = 0; i < num_actuals; i++)
     if (args[i].reg != 0 && ! args[i].pass_on_stack
+	&& GET_CODE (args[i].reg) != PARALLEL
 	&& args[i].mode == BLKmode
 	&& MEM_P (args[i].value)
 	&& (MEM_ALIGN (args[i].value)