diff mbox

Fix ICE on unaligned record field

Message ID 20141203140207.GL8214@virgil.suse
State New
Headers show

Commit Message

Martin Jambor Dec. 3, 2014, 2:02 p.m. UTC
Hi,

On Mon, Dec 01, 2014 at 12:00:14PM +0100, Richard Biener wrote:
> On Fri, Nov 28, 2014 at 5:20 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
> > Hi,
> >
> > the attached Ada testcase triggers an assertion in the RTL expander for the
> > address operator because the operator has been applied to a non-byte-aligned
> > record field.  The problematic ADDR_EXPR is built by ipa_modify_call_arguments
> > which has a hole when get_addr_base_and_unit_offset returns NULL_TREE: the
> > variable offset case is handled but not the non-byte-aligned case, which can
> > rountinely happen in Ada, hence the proposed fix.
> >
> > Tested on x86_64-suse-linux, OK for the mainline?
> 
> Umm.  So you are doing a possibly aggregate copy here?  Or how
> are we sure we are dealing with a register op only?  (the function
> is always a twisted maze to me...)
> 
> That said - I suppose this is called from IPA-SRA?  In that case,
> can't we please avoid doing the transform in the first place?
> 

I suppose that could be done by something like the following, which I
have tested only very mildly so far, in particular I have not double
checked that get_inner_reference is cfun-agnostic.

Hope it helps,

Martin



2014-12-03  Martin Jambor  <mjambor@suse.cz>

	* tree-sra.c (ipa_sra_check_caller_data): New type.
	(has_caller_p): Removed.
	(ipa_sra_check_caller): New function.
	(ipa_sra_preliminary_function_checks): Use it.

Comments

Eric Botcazou Dec. 10, 2014, 11:27 a.m. UTC | #1
> I suppose that could be done by something like the following, which I
> have tested only very mildly so far, in particular I have not double
> checked that get_inner_reference is cfun-agnostic.

Thanks, this works fine on the testcase and I believe that get_inner_reference 
is indeed cfun-agnostic (for example it's called from front-ends).

> 2014-12-03  Martin Jambor  <mjambor@suse.cz>
> 
> 	* tree-sra.c (ipa_sra_check_caller_data): New type.
> 	(has_caller_p): Removed.
> 	(ipa_sra_check_caller): New function.
> 	(ipa_sra_preliminary_function_checks): Use it.

If Richard and you think it's the way to go, then fine by me.
Richard Biener Dec. 10, 2014, 2:01 p.m. UTC | #2
On Wed, Dec 3, 2014 at 3:02 PM, Martin Jambor <mjambor@suse.cz> wrote:
> Hi,
>
> On Mon, Dec 01, 2014 at 12:00:14PM +0100, Richard Biener wrote:
>> On Fri, Nov 28, 2014 at 5:20 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> > Hi,
>> >
>> > the attached Ada testcase triggers an assertion in the RTL expander for the
>> > address operator because the operator has been applied to a non-byte-aligned
>> > record field.  The problematic ADDR_EXPR is built by ipa_modify_call_arguments
>> > which has a hole when get_addr_base_and_unit_offset returns NULL_TREE: the
>> > variable offset case is handled but not the non-byte-aligned case, which can
>> > rountinely happen in Ada, hence the proposed fix.
>> >
>> > Tested on x86_64-suse-linux, OK for the mainline?
>>
>> Umm.  So you are doing a possibly aggregate copy here?  Or how
>> are we sure we are dealing with a register op only?  (the function
>> is always a twisted maze to me...)
>>
>> That said - I suppose this is called from IPA-SRA?  In that case,
>> can't we please avoid doing the transform in the first place?
>>
>
> I suppose that could be done by something like the following, which I
> have tested only very mildly so far, in particular I have not double
> checked that get_inner_reference is cfun-agnostic.
>
> Hope it helps,
>
> Martin
>
>
>
> 2014-12-03  Martin Jambor  <mjambor@suse.cz>
>
>         * tree-sra.c (ipa_sra_check_caller_data): New type.
>         (has_caller_p): Removed.
>         (ipa_sra_check_caller): New function.
>         (ipa_sra_preliminary_function_checks): Use it.
>
> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index f213c80..900f3c3 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -4977,13 +4977,54 @@ modify_function (struct cgraph_node *node, ipa_parm_adjustment_vec adjustments)
>    return cfg_changed;
>  }
>
> -/* If NODE has a caller, return true.  */
> +/* Means of communication between ipa_sra_check_caller and
> +   ipa_sra_preliminary_function_checks.  */
> +
> +struct ipa_sra_check_caller_data
> +{
> +  bool has_callers;
> +  bool bad_arg_alignment;
> +};
> +
> +/* If NODE has a caller, mark that fact in DATA which is pointer to
> +   ipa_sra_check_caller_data.  Also check all aggregate arguments in all known
> +   calls if they are unit aligned and if not, set the appropriate flag in DATA
> +   too. */
>
>  static bool
> -has_caller_p (struct cgraph_node *node, void *data ATTRIBUTE_UNUSED)
> +ipa_sra_check_caller (struct cgraph_node *node, void *data)
>  {
> -  if (node->callers)
> -    return true;
> +  if (!node->callers)
> +    return false;
> +
> +  struct ipa_sra_check_caller_data *iscc;
> +  iscc = (struct ipa_sra_check_caller_data *) data;
> +  iscc->has_callers = true;
> +
> +  for (cgraph_edge *cs = node->callers; cs; cs = cs->next_caller)
> +    {
> +      gimple call_stmt = cs->call_stmt;
> +      unsigned count = gimple_call_num_args (call_stmt);
> +      for (unsigned i = 0; i < count; i++)
> +       {
> +         tree arg = gimple_call_arg (call_stmt, i);
> +         if (is_gimple_reg (arg))

!handled_component_p (arg)

would better match what you are checking below.

Note that I think the place of the check is unfortunate as you for example
will not remove the argument if it is unused.  In fact I'm not yet sure
what transform exactly we are disabling.  I am guessing we are
passing an aggregate by value that resides at a bit-aligned offset
of some outer object:

  foo (x.aggr);

and the function then does

foo (Aggr a)
{
  int i = a.foo;
...
}

thus use only a part of the aggregate.  Then IPA SRA would like to
pass x.aggr.foo instead of x.aggr and thus tries to materialize a
load from x.aggr.foo at all callers but fails to do that in a valid way.

Erics fix did, at all callers

  Aggr tem = x.aggr;
  foo (tem.foo);

?

While we should be able to simply do

  foo (BIT_FIELD_REF <x.aggr, .....>)

with the appropriate bit offset and size?  (if that's of register type
you need to do the load in a separate stmt of couse).

Thus similar to Erics fix but avoiding the aggregate copy.

But maybe I am not really understanding the issue (didn't look at the
testcase).

Richard.

> +             continue;
> +
> +         tree offset;
> +         HOST_WIDE_INT bitsize, bitpos;
> +         machine_mode mode;
> +         int unsignedp, volatilep = 0;
> +         get_inner_reference (arg, &bitsize, &bitpos, &offset, &mode,
> +                              &unsignedp, &volatilep, false);
> +         if (bitpos % BITS_PER_UNIT)
> +           {
> +             iscc->bad_arg_alignment = true;
> +             return true;
> +           }
> +       }
> +    }
> +
>    return false;
>  }
>
> @@ -5038,14 +5079,6 @@ ipa_sra_preliminary_function_checks (struct cgraph_node *node)
>        return false;
>      }
>
> -  if (!node->call_for_symbol_thunks_and_aliases (has_caller_p, NULL, true))
> -    {
> -      if (dump_file)
> -       fprintf (dump_file,
> -                "Function has no callers in this compilation unit.\n");
> -      return false;
> -    }
> -
>    if (cfun->stdarg)
>      {
>        if (dump_file)
> @@ -5064,6 +5097,25 @@ ipa_sra_preliminary_function_checks (struct cgraph_node *node)
>        return false;
>      }
>
> +  struct ipa_sra_check_caller_data iscc;
> +  memset (&iscc, 0, sizeof(iscc));
> +  node->call_for_symbol_thunks_and_aliases (ipa_sra_check_caller, &iscc, true);
> +  if (!iscc.has_callers)
> +    {
> +      if (dump_file)
> +       fprintf (dump_file,
> +                "Function has no callers in this compilation unit.\n");
> +      return false;
> +    }
> +
> +  if (iscc.bad_arg_alignment)
> +    {
> +      if (dump_file)
> +       fprintf (dump_file,
> +                "A function call has an argument with non-unit alignemnt.\n");
> +      return false;
> +    }
> +
>    return true;
>  }
>
Eric Botcazou Dec. 11, 2014, 9:52 p.m. UTC | #3
> Note that I think the place of the check is unfortunate as you for example
> will not remove the argument if it is unused.  In fact I'm not yet sure
> what transform exactly we are disabling.  I am guessing we are
> passing an aggregate by value that resides at a bit-aligned offset
> of some outer object:
> 
>   foo (x.aggr);
> 
> and the function then does
> 
> foo (Aggr a)
> {
>   int i = a.foo;
> ...
> }
> 
> thus use only a part of the aggregate.  Then IPA SRA would like to
> pass x.aggr.foo instead of x.aggr and thus tries to materialize a
> load from x.aggr.foo at all callers but fails to do that in a valid way.

Right, it's the usual MEM_EXPR business creating ADDR_EXPRs out of nowhere and 
miserably failing on something not addressable.

> Erics fix did, at all callers
> 
>   Aggr tem = x.aggr;
>   foo (tem.foo);
> 
> ?

Yes, because the code wants to take &tem afterwards.

> While we should be able to simply do
> 
>   foo (BIT_FIELD_REF <x.aggr, .....>)
> 
> with the appropriate bit offset and size?  (if that's of register type
> you need to do the load in a separate stmt of couse).
> 
> Thus similar to Erics fix but avoiding the aggregate copy.

Yes, that should be doable, but I'm not sure it's worth the hassle.
Richard Biener Dec. 12, 2014, 9:22 a.m. UTC | #4
On Thu, Dec 11, 2014 at 10:52 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> Note that I think the place of the check is unfortunate as you for example
>> will not remove the argument if it is unused.  In fact I'm not yet sure
>> what transform exactly we are disabling.  I am guessing we are
>> passing an aggregate by value that resides at a bit-aligned offset
>> of some outer object:
>>
>>   foo (x.aggr);
>>
>> and the function then does
>>
>> foo (Aggr a)
>> {
>>   int i = a.foo;
>> ...
>> }
>>
>> thus use only a part of the aggregate.  Then IPA SRA would like to
>> pass x.aggr.foo instead of x.aggr and thus tries to materialize a
>> load from x.aggr.foo at all callers but fails to do that in a valid way.
>
> Right, it's the usual MEM_EXPR business creating ADDR_EXPRs out of nowhere and
> miserably failing on something not addressable.

Well, I call it a convenience that MEM_EXPR, unlike INDIRECT_REF, can
be used to encapsulate an arbitrary byte-offset and view-conversion.  Of course
it's still a dereference of an address so that convenience doesn't work on sth
non-addressable.

>> Erics fix did, at all callers
>>
>>   Aggr tem = x.aggr;
>>   foo (tem.foo);
>>
>> ?
>
> Yes, because the code wants to take &tem afterwards.
>
>> While we should be able to simply do
>>
>>   foo (BIT_FIELD_REF <x.aggr, .....>)
>>
>> with the appropriate bit offset and size?  (if that's of register type
>> you need to do the load in a separate stmt of couse).
>>
>> Thus similar to Erics fix but avoiding the aggregate copy.
>
> Yes, that should be doable, but I'm not sure it's worth the hassle.

I'll leave that to you two to decide - Martins patch is ok if you are fine
with disabling the optimization (also removing an unused parameter).

Thanks,
Richard.

> --
> Eric Botcazou
Eric Botcazou Dec. 12, 2014, 10:50 a.m. UTC | #5
> Well, I call it a convenience that MEM_EXPR, unlike INDIRECT_REF, can
> be used to encapsulate an arbitrary byte-offset and view-conversion.  Of
> course it's still a dereference of an address so that convenience doesn't
> work on sth non-addressable.

No discussion on the merits of MEM_EXPR vs INDIRECT_REF but on the pertinence 
of creating ADDR_EXPRs out of nowhere just to use them.

> I'll leave that to you two to decide - Martins patch is ok if you are fine
> with disabling the optimization (also removing an unused parameter).

I'm fine with disabling it: the aggregate is passed directly so it's probably 
small and, in the case at hand, the optimized caller would do 2 extractions 
instead of only 1 so the gain is not obvious.
Eric Botcazou Jan. 6, 2015, 5:07 p.m. UTC | #6
Martin,

> I suppose that could be done by something like the following, which I
> have tested only very mildly so far, in particular I have not double
> checked that get_inner_reference is cfun-agnostic.

The patch introduces no regressions on x86-64/Linux and makes the testcase 
(gnat.dg/specs/pack12.ads attached to the first message) pass.

Do you plan to install it (along with the testcase)?

> 2014-12-03  Martin Jambor  <mjambor@suse.cz>
> 
> 	* tree-sra.c (ipa_sra_check_caller_data): New type.
> 	(has_caller_p): Removed.
> 	(ipa_sra_check_caller): New function.
> 	(ipa_sra_preliminary_function_checks): Use it.
diff mbox

Patch

diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index f213c80..900f3c3 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -4977,13 +4977,54 @@  modify_function (struct cgraph_node *node, ipa_parm_adjustment_vec adjustments)
   return cfg_changed;
 }
 
-/* If NODE has a caller, return true.  */
+/* Means of communication between ipa_sra_check_caller and
+   ipa_sra_preliminary_function_checks.  */
+
+struct ipa_sra_check_caller_data
+{
+  bool has_callers;
+  bool bad_arg_alignment;
+};
+
+/* If NODE has a caller, mark that fact in DATA which is pointer to
+   ipa_sra_check_caller_data.  Also check all aggregate arguments in all known
+   calls if they are unit aligned and if not, set the appropriate flag in DATA
+   too. */
 
 static bool
-has_caller_p (struct cgraph_node *node, void *data ATTRIBUTE_UNUSED)
+ipa_sra_check_caller (struct cgraph_node *node, void *data)
 {
-  if (node->callers)
-    return true;
+  if (!node->callers)
+    return false;
+
+  struct ipa_sra_check_caller_data *iscc;
+  iscc = (struct ipa_sra_check_caller_data *) data;
+  iscc->has_callers = true;
+
+  for (cgraph_edge *cs = node->callers; cs; cs = cs->next_caller)
+    {
+      gimple call_stmt = cs->call_stmt;
+      unsigned count = gimple_call_num_args (call_stmt);
+      for (unsigned i = 0; i < count; i++)
+	{
+	  tree arg = gimple_call_arg (call_stmt, i);
+	  if (is_gimple_reg (arg))
+	      continue;
+
+	  tree offset;
+	  HOST_WIDE_INT bitsize, bitpos;
+	  machine_mode mode;
+	  int unsignedp, volatilep = 0;
+	  get_inner_reference (arg, &bitsize, &bitpos, &offset, &mode,
+			       &unsignedp, &volatilep, false);
+	  if (bitpos % BITS_PER_UNIT)
+	    {
+	      iscc->bad_arg_alignment = true;
+	      return true;
+	    }
+	}
+    }
+
   return false;
 }
 
@@ -5038,14 +5079,6 @@  ipa_sra_preliminary_function_checks (struct cgraph_node *node)
       return false;
     }
 
-  if (!node->call_for_symbol_thunks_and_aliases (has_caller_p, NULL, true))
-    {
-      if (dump_file)
-	fprintf (dump_file,
-		 "Function has no callers in this compilation unit.\n");
-      return false;
-    }
-
   if (cfun->stdarg)
     {
       if (dump_file)
@@ -5064,6 +5097,25 @@  ipa_sra_preliminary_function_checks (struct cgraph_node *node)
       return false;
     }
 
+  struct ipa_sra_check_caller_data iscc;
+  memset (&iscc, 0, sizeof(iscc));
+  node->call_for_symbol_thunks_and_aliases (ipa_sra_check_caller, &iscc, true);
+  if (!iscc.has_callers)
+    {
+      if (dump_file)
+	fprintf (dump_file,
+		 "Function has no callers in this compilation unit.\n");
+      return false;
+    }
+
+  if (iscc.bad_arg_alignment)
+    {
+      if (dump_file)
+	fprintf (dump_file,
+		 "A function call has an argument with non-unit alignemnt.\n");
+      return false;
+    }
+
   return true;
 }