diff mbox series

Do not give up early on access path oracle

Message ID 20190617091710.wochzapcojjpuwfz@kam.mff.cuni.cz
State New
Headers show
Series Do not give up early on access path oracle | expand

Commit Message

Jan Hubicka June 17, 2019, 9:17 a.m. UTC
Hi,
while working on testcases for nonoverlapping_component_refs_p I run into issue
that we often bypass it because the indirect-decl and indirect-indirect oracles
give up if they manage to match bases and ranges_maybe_overlap_p return true.
(one testcase is included in patch).

The issue is that decl-decl oracle first test base+offsets and if that
fails it proceeds to nonoverlapping_component_refs_of_decl_p which is
still able to do some useful disambiguations when the access paths
contains non-constat ARRAY_REFs where max_size is not very informative.

I modified the other two oracles to avoid the early return which increased
effectivity of nonoverlapping_component_refs_p and aliasing_component_refs_p
somewhat but also about doubled number of calls of them.

I can't say if the early return is just omision or intended to save compile time
but it appeared to me that most of those nondisambiguated comparsions acutally
covers the case we know that access paths are the same and in such case
we could still return early.

This patch adds same_access_paths_p which implements simple logic matching
max_size with type size (thus ruling out variable array accesses) and then
verifying that there are no unions on the way (in that case we still may have
different access paths to same memory location.

With this patch I get relatively reasonable increase in number of querries
and disambiguations.

For tramp3d:

Alias oracle query stats:
  refs_may_alias_p: 3021539 disambiguations, 3321136 queries
  ref_maybe_used_by_call_p: 7118 disambiguations, 3047133 queries
  call_may_clobber_ref_p: 817 disambiguations, 817 queries
  nonoverlapping_component_refs_p: 22 disambiguations, 61735 queries
  nonoverlapping_component_refs_of_decl_p: 19 disambiguations, 2192 queries
  aliasing_component_refs_p: 2050 disambiguations, 19908 queries
  TBAA oracle: 1419961 disambiguations 2916692 queries
               555158 are in alias set 0
               575103 queries asked about the same object
               0 queries asked about the same alias set
               0 access volatile
               252874 are dependent in the DAG
               113596 are aritificially in conflict with void *

PTA query stats:
  pt_solution_includes: 671982 disambiguations, 952513 queries
  pt_solutions_intersect: 97060 disambiguations, 437912 queries

to:

Alias oracle query stats:
  refs_may_alias_p: 3021842 disambiguations, 3321308 queries
  ref_maybe_used_by_call_p: 7118 disambiguations, 3047440 queries
  call_may_clobber_ref_p: 817 disambiguations, 817 queries
  nonoverlapping_component_refs_p: 25 disambiguations, 63108 queries
  nonoverlapping_component_refs_of_decl_p: 19 disambiguations, 2192 queries
  aliasing_component_refs_p: 2236 disambiguations, 20594 queries
  TBAA oracle: 1420030 disambiguations 2918103 queries
               555158 are in alias set 0
               575109 queries asked about the same object
               0 queries asked about the same alias set
               0 access volatile
               253260 are dependent in the DAG
               114546 are aritificially in conflict with void *

PTA query stats:
  pt_solution_includes: 671990 disambiguations, 952532 queries
  pt_solutions_intersect: 97060 disambiguations, 438091 queries

So 3% new querries on wich we seems to have have about 30% disambiguation rate
(190 new disambiguations)

For cc1:

Alias oracle query stats:
  refs_may_alias_p: 38345354 disambiguations, 46034874 queries
  ref_maybe_used_by_call_p: 57452 disambiguations, 38905700 queries
  call_may_clobber_ref_p: 5856 disambiguations, 8685 queries
  nonoverlapping_component_refs_p: 9674 disambiguations, 743745 queries
  nonoverlapping_component_refs_of_decl_p: 6962 disambiguations, 212637 queries
  aliasing_component_refs_p: 103234 disambiguations, 291017 queries
  TBAA oracle: 10719978 disambiguations 32885735 queries
               13521045 are in alias set 0
               5233132 queries asked about the same object
               200 queries asked about the same alias set
               0 access volatile
               1459189 are dependent in the DAG
               1952191 are aritificially in conflict with void *

PTA query stats:
  pt_solution_includes: 465494 disambiguations, 6918046 queries
  pt_solutions_intersect: 342384 disambiguations, 6435014 queries

to:

Alias oracle query stats:
  refs_may_alias_p: 38345581 disambiguations, 46035002 queries
  ref_maybe_used_by_call_p: 57452 disambiguations, 38905936 queries
  call_may_clobber_ref_p: 5856 disambiguations, 8685 queries
  nonoverlapping_component_refs_p: 9754 disambiguations, 768725 queries
  nonoverlapping_component_refs_of_decl_p: 6962 disambiguations, 212637 queries
  aliasing_component_refs_p: 103316 disambiguations, 314129 queries
  TBAA oracle: 10720037 disambiguations 32893789 queries
               13521037 are in alias set 0
               5233163 queries asked about the same object
               200 queries asked about the same alias set
               0 access volatile
               1462737 are dependent in the DAG
               1956615 are aritificially in conflict with void *

PTA query stats:
  pt_solution_includes: 465494 disambiguations, 6918049 queries
  pt_solutions_intersect: 342386 disambiguations, 6435109 queries

So 23112 (7%) increase in the number of querries to access path and 162 more
disambiguations which is not great, but I hope it will still increase
as the access path oracle starts to work better.

It will also let me to glue aliasing_coponent_refs into decl-decl oracle
w/o wasting too much of compile time. This seems useful since we now
have a lot of non-trivial MEM_REFs on decls resulting from inlining of
member functions which we do not disambiguate very well if they mix with
arrays (nonoverlapping_component_refs_of_decl_p gives up and we do
nothing except for base/offset/maxsize tests).

Bootstrapped/regtested x86_64-linux, seems reasonable?

	* gcc.dg/tree-ssa/alias-access-path-3.c
	* tree-ssa-alias.c (same_access_paths_p): New predicate.
	(indirect_ref_may_alias_decl_p, indirect_refs_may_alias_p):
	Use it.

Comments

Richard Biener June 17, 2019, 12:33 p.m. UTC | #1
On Mon, 17 Jun 2019, Jan Hubicka wrote:

> Hi,
> while working on testcases for nonoverlapping_component_refs_p I run into issue
> that we often bypass it because the indirect-decl and indirect-indirect oracles
> give up if they manage to match bases and ranges_maybe_overlap_p return true.
> (one testcase is included in patch).
> 
> The issue is that decl-decl oracle first test base+offsets and if that
> fails it proceeds to nonoverlapping_component_refs_of_decl_p which is
> still able to do some useful disambiguations when the access paths
> contains non-constat ARRAY_REFs where max_size is not very informative.
> 
> I modified the other two oracles to avoid the early return which increased
> effectivity of nonoverlapping_component_refs_p and aliasing_component_refs_p
> somewhat but also about doubled number of calls of them.
> 
> I can't say if the early return is just omision or intended to save compile time
> but it appeared to me that most of those nondisambiguated comparsions acutally
> covers the case we know that access paths are the same and in such case
> we could still return early.
> 
> This patch adds same_access_paths_p which implements simple logic matching
> max_size with type size (thus ruling out variable array accesses) and then
> verifying that there are no unions on the way (in that case we still may have
> different access paths to same memory location.
> 
> With this patch I get relatively reasonable increase in number of querries
> and disambiguations.
> 
> For tramp3d:
> 
> Alias oracle query stats:
>   refs_may_alias_p: 3021539 disambiguations, 3321136 queries
>   ref_maybe_used_by_call_p: 7118 disambiguations, 3047133 queries
>   call_may_clobber_ref_p: 817 disambiguations, 817 queries
>   nonoverlapping_component_refs_p: 22 disambiguations, 61735 queries
>   nonoverlapping_component_refs_of_decl_p: 19 disambiguations, 2192 queries
>   aliasing_component_refs_p: 2050 disambiguations, 19908 queries
>   TBAA oracle: 1419961 disambiguations 2916692 queries
>                555158 are in alias set 0
>                575103 queries asked about the same object
>                0 queries asked about the same alias set
>                0 access volatile
>                252874 are dependent in the DAG
>                113596 are aritificially in conflict with void *
> 
> PTA query stats:
>   pt_solution_includes: 671982 disambiguations, 952513 queries
>   pt_solutions_intersect: 97060 disambiguations, 437912 queries
> 
> to:
> 
> Alias oracle query stats:
>   refs_may_alias_p: 3021842 disambiguations, 3321308 queries
>   ref_maybe_used_by_call_p: 7118 disambiguations, 3047440 queries
>   call_may_clobber_ref_p: 817 disambiguations, 817 queries
>   nonoverlapping_component_refs_p: 25 disambiguations, 63108 queries
>   nonoverlapping_component_refs_of_decl_p: 19 disambiguations, 2192 queries
>   aliasing_component_refs_p: 2236 disambiguations, 20594 queries
>   TBAA oracle: 1420030 disambiguations 2918103 queries
>                555158 are in alias set 0
>                575109 queries asked about the same object
>                0 queries asked about the same alias set
>                0 access volatile
>                253260 are dependent in the DAG
>                114546 are aritificially in conflict with void *
> 
> PTA query stats:
>   pt_solution_includes: 671990 disambiguations, 952532 queries
>   pt_solutions_intersect: 97060 disambiguations, 438091 queries
> 
> So 3% new querries on wich we seems to have have about 30% disambiguation rate
> (190 new disambiguations)
> 
> For cc1:
> 
> Alias oracle query stats:
>   refs_may_alias_p: 38345354 disambiguations, 46034874 queries
>   ref_maybe_used_by_call_p: 57452 disambiguations, 38905700 queries
>   call_may_clobber_ref_p: 5856 disambiguations, 8685 queries
>   nonoverlapping_component_refs_p: 9674 disambiguations, 743745 queries
>   nonoverlapping_component_refs_of_decl_p: 6962 disambiguations, 212637 queries
>   aliasing_component_refs_p: 103234 disambiguations, 291017 queries
>   TBAA oracle: 10719978 disambiguations 32885735 queries
>                13521045 are in alias set 0
>                5233132 queries asked about the same object
>                200 queries asked about the same alias set
>                0 access volatile
>                1459189 are dependent in the DAG
>                1952191 are aritificially in conflict with void *
> 
> PTA query stats:
>   pt_solution_includes: 465494 disambiguations, 6918046 queries
>   pt_solutions_intersect: 342384 disambiguations, 6435014 queries
> 
> to:
> 
> Alias oracle query stats:
>   refs_may_alias_p: 38345581 disambiguations, 46035002 queries
>   ref_maybe_used_by_call_p: 57452 disambiguations, 38905936 queries
>   call_may_clobber_ref_p: 5856 disambiguations, 8685 queries
>   nonoverlapping_component_refs_p: 9754 disambiguations, 768725 queries
>   nonoverlapping_component_refs_of_decl_p: 6962 disambiguations, 212637 queries
>   aliasing_component_refs_p: 103316 disambiguations, 314129 queries
>   TBAA oracle: 10720037 disambiguations 32893789 queries
>                13521037 are in alias set 0
>                5233163 queries asked about the same object
>                200 queries asked about the same alias set
>                0 access volatile
>                1462737 are dependent in the DAG
>                1956615 are aritificially in conflict with void *
> 
> PTA query stats:
>   pt_solution_includes: 465494 disambiguations, 6918049 queries
>   pt_solutions_intersect: 342386 disambiguations, 6435109 queries
> 
> So 23112 (7%) increase in the number of querries to access path and 162 more
> disambiguations which is not great, but I hope it will still increase
> as the access path oracle starts to work better.
> 
> It will also let me to glue aliasing_coponent_refs into decl-decl oracle
> w/o wasting too much of compile time. This seems useful since we now
> have a lot of non-trivial MEM_REFs on decls resulting from inlining of
> member functions which we do not disambiguate very well if they mix with
> arrays (nonoverlapping_component_refs_of_decl_p gives up and we do
> nothing except for base/offset/maxsize tests).
> 
> Bootstrapped/regtested x86_64-linux, seems reasonable?
> 
> 	* gcc.dg/tree-ssa/alias-access-path-3.c
> 	* tree-ssa-alias.c (same_access_paths_p): New predicate.
> 	(indirect_ref_may_alias_decl_p, indirect_refs_may_alias_p):
> 	Use it.
> 
> Index: testsuite/gcc.dg/tree-ssa/alias-access-path-3.c
> ===================================================================
> --- testsuite/gcc.dg/tree-ssa/alias-access-path-3.c	(nonexistent)
> +++ testsuite/gcc.dg/tree-ssa/alias-access-path-3.c	(working copy)
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-fre1" } */
> +struct a {int v1;
> +	  int v2;};
> +struct b {struct a a[0];};
> +
> +int
> +test (struct b *bptr1, struct b *bptr2, int i, int j)
> +{
> +  bptr1->a[i].v1=123;
> +  bptr2->a[j].v2=1;
> +  return bptr1->a[i].v1;
> +}
> +int
> +test2 (struct b *bptr1, struct b *bptr2, int i, int j)
> +{
> +  bptr1->a[i].v1=124;
> +  bptr2->a[j].v1=1;
> +  return bptr1->a[i].v1;
> +}
> +/* test should be optimized, while test2 should not.  */
> +/* { dg-final { scan-tree-dump-times "return 123" 1 "fre1"} } */
> +/* { dg-final { scan-tree-dump-not "return 124" "fre1"} } */
> Index: tree-ssa-alias.c
> ===================================================================
> --- tree-ssa-alias.c	(revision 272358)
> +++ tree-ssa-alias.c	(working copy)
> @@ -1413,6 +1413,36 @@ decl_refs_may_alias_p (tree ref1, tree b
>    return true;     
>  }
>  
> +/* Return true if access paths are same and thus access path
> +   oracles can be skiped.  This is just a quick check for common
> +   cases where we assume that outer types or addresses has been
> +   previously matched.  */
> +
> +bool
> +same_access_paths_p (tree ref1, poly_int64 max_size1,
> +		     tree ref2, poly_int64 max_size2)
> +{
> +  poly_uint64 size;
> +  if (!ref1 || !ref2
> +      || !known_eq (max_size1, max_size2)
> +      || TYPE_SIZE (TREE_TYPE (ref1)) != TYPE_SIZE (TREE_TYPE (ref2))
> +      || !poly_int_tree_p (TYPE_SIZE (TREE_TYPE (ref1)), &size)
> +      || !known_eq ((poly_int64)size, max_size1))
> +    return false;
> +  while (handled_component_p (ref1))
> +    {
> +      if (!handled_component_p (ref2))
> +	return false;
> +      if (TREE_CODE (TREE_TYPE (ref1)) == UNION_TYPE
> +	  || TREE_CODE (TREE_TYPE (ref1))
> +	     != TREE_CODE (TREE_TYPE (ref2)))
> +	return false;
> +      ref1 = TREE_OPERAND (ref1, 0);
> +      ref2 = TREE_OPERAND (ref2, 0);
> +    }

But part of the expensiveness we want to avoid is this
(repeated) walking of the ref tree...

So...

> +  return !handled_component_p (ref2);
> +}
> +
>  /* Return true if an indirect reference based on *PTR1 constrained
>     to [OFFSET1, OFFSET1 + MAX_SIZE1) may alias a variable based on BASE2
>     constrained to [OFFSET2, OFFSET2 + MAX_SIZE2).  *PTR1 and BASE2 have
> @@ -1533,7 +1563,12 @@ indirect_ref_may_alias_decl_p (tree ref1
>        && (TREE_CODE (TREE_TYPE (base1)) != ARRAY_TYPE
>  	  || (TYPE_SIZE (TREE_TYPE (base1))
>  	      && TREE_CODE (TYPE_SIZE (TREE_TYPE (base1))) == INTEGER_CST)))
> -    return ranges_maybe_overlap_p (doffset1, max_size1, doffset2, max_size2);
> +    {
> +      if (!ranges_maybe_overlap_p (doffset1, max_size1, doffset2, max_size2))
> +	return false;
> +      if (same_access_paths_p (ref1, max_size1, ref2, max_size2))
> +	return true;

how about a simpler test like

         if (known_size_p (max_size1) && known_size_p (max_size2))
           return true;
         /* If there's an unconstrained variable access in the ref fall
	    through to access-path based disambiguation.  */

?

I'd certainly like to see testcases btw...

A more stricter test would be

	if (!maybe_eq (max_size1, size1) && !maybe_eq (max_size2, size2))
          return true;
        /* If there's a variable access in one of the refs fall through
           to access-path based disambiguation.  */

where you'd need to pass down ao_ref_size in addition to max_size as well.

Richard.

> +    }
>  
>    if (ref1 && ref2
>        && nonoverlapping_component_refs_p (ref1, ref2))
> @@ -1613,8 +1648,9 @@ indirect_refs_may_alias_p (tree ref1 ATT
>      {
>        poly_offset_int moff1 = mem_ref_offset (base1) << LOG2_BITS_PER_UNIT;
>        poly_offset_int moff2 = mem_ref_offset (base2) << LOG2_BITS_PER_UNIT;
> -      return ranges_maybe_overlap_p (offset1 + moff1, max_size1,
> -				     offset2 + moff2, max_size2);
> +      if (!ranges_maybe_overlap_p (offset1 + moff1, max_size1,
> +				   offset2 + moff2, max_size2))
> +	return false;
>      }
>    if (!ptr_derefs_may_alias_p (ptr1, ptr2))
>      return false;
> @@ -1654,7 +1690,12 @@ indirect_refs_may_alias_p (tree ref1 ATT
>           can overlap by an exact multiple of their element size.
>           See gcc.dg/torture/alias-2.c.  */
>        && TREE_CODE (TREE_TYPE (ptrtype1)) != ARRAY_TYPE)
> -    return ranges_maybe_overlap_p (offset1, max_size1, offset2, max_size2);
> +    {
> +      if (!ranges_maybe_overlap_p (offset1, max_size1, offset2, max_size2))
> +	return false;
> +      if (same_access_paths_p (ref1, max_size1, ref2, max_size2))
> +	return true;
> +    }
>  
>    if (ref1 && ref2
>        && nonoverlapping_component_refs_p (ref1, ref2))
>
Jan Hubicka June 17, 2019, 12:43 p.m. UTC | #2
> 
> But part of the expensiveness we want to avoid is this
> (repeated) walking of the ref tree...

I was considering to pass contains_union_p down from one of
earlier walks, but did not find suitable one for that...
> 
> So...
> 
> > +  return !handled_component_p (ref2);
> > +}
> > +
> >  /* Return true if an indirect reference based on *PTR1 constrained
> >     to [OFFSET1, OFFSET1 + MAX_SIZE1) may alias a variable based on BASE2
> >     constrained to [OFFSET2, OFFSET2 + MAX_SIZE2).  *PTR1 and BASE2 have
> > @@ -1533,7 +1563,12 @@ indirect_ref_may_alias_decl_p (tree ref1
> >        && (TREE_CODE (TREE_TYPE (base1)) != ARRAY_TYPE
> >  	  || (TYPE_SIZE (TREE_TYPE (base1))
> >  	      && TREE_CODE (TYPE_SIZE (TREE_TYPE (base1))) == INTEGER_CST)))
> > -    return ranges_maybe_overlap_p (doffset1, max_size1, doffset2, max_size2);
> > +    {
> > +      if (!ranges_maybe_overlap_p (doffset1, max_size1, doffset2, max_size2))
> > +	return false;
> > +      if (same_access_paths_p (ref1, max_size1, ref2, max_size2))
> > +	return true;
> 
> how about a simpler test like
> 
>          if (known_size_p (max_size1) && known_size_p (max_size2))
>            return true;
>          /* If there's an unconstrained variable access in the ref fall
> 	    through to access-path based disambiguation.  */

If I have something like
 struct a {int a[10];int b;}
and then
 aptr->a[i]
in the access path, won't be max_size known (40) where type size is 4?
In this case I want to contiue to access path.
> 
> ?
> 
> I'd certainly like to see testcases btw...

There is a testcase for variable array access included in the patch,
would you like to have one with union in it?
> 
> A more stricter test would be
> 
> 	if (!maybe_eq (max_size1, size1) && !maybe_eq (max_size2, size2))
>           return true;
>         /* If there's a variable access in one of the refs fall through
>            to access-path based disambiguation.  */
> 
> where you'd need to pass down ao_ref_size in addition to max_size as well.

Proably || here?

Honza
Jan Hubicka June 17, 2019, 1 p.m. UTC | #3
Hi,
this is testcase for unions.  It is somewhat artificial because
nonoverlapping_component_refs is conservative about matching union
fields and aliasing_component_refs_p resorts to offset/max_size
test which of course suceeds.
So I had to add the wrapping struct a (which is matched by
the earlier) and disambiguated.

ICC optimizes it as expected.

/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-fre1" } */
struct a {int v1;
	  int v2;};
struct b {int x; struct a a;};
struct c {struct a a;};
union d {struct b b; struct c c;};

int
test (union d *dptr1, union d *dptr2)
{
  dptr1->b.a.v1=123;
  dptr2->c.a.v2=1;
  return dptr1->b.a.v1;
}
/* { dg-final { scan-tree-dump-times "return 123" 1 "fre1"} } */
Richard Biener June 17, 2019, 1:22 p.m. UTC | #4
On Mon, 17 Jun 2019, Jan Hubicka wrote:

> > 
> > But part of the expensiveness we want to avoid is this
> > (repeated) walking of the ref tree...
> 
> I was considering to pass contains_union_p down from one of
> earlier walks, but did not find suitable one for that...
> > 
> > So...
> > 
> > > +  return !handled_component_p (ref2);
> > > +}
> > > +
> > >  /* Return true if an indirect reference based on *PTR1 constrained
> > >     to [OFFSET1, OFFSET1 + MAX_SIZE1) may alias a variable based on BASE2
> > >     constrained to [OFFSET2, OFFSET2 + MAX_SIZE2).  *PTR1 and BASE2 have
> > > @@ -1533,7 +1563,12 @@ indirect_ref_may_alias_decl_p (tree ref1
> > >        && (TREE_CODE (TREE_TYPE (base1)) != ARRAY_TYPE
> > >  	  || (TYPE_SIZE (TREE_TYPE (base1))
> > >  	      && TREE_CODE (TYPE_SIZE (TREE_TYPE (base1))) == INTEGER_CST)))
> > > -    return ranges_maybe_overlap_p (doffset1, max_size1, doffset2, max_size2);
> > > +    {
> > > +      if (!ranges_maybe_overlap_p (doffset1, max_size1, doffset2, max_size2))
> > > +	return false;
> > > +      if (same_access_paths_p (ref1, max_size1, ref2, max_size2))
> > > +	return true;
> > 
> > how about a simpler test like
> > 
> >          if (known_size_p (max_size1) && known_size_p (max_size2))
> >            return true;
> >          /* If there's an unconstrained variable access in the ref fall
> > 	    through to access-path based disambiguation.  */
> 
> If I have something like
>  struct a {int a[10];int b;}
> and then
>  aptr->a[i]
> in the access path, won't be max_size known (40) where type size is 4?

Yes.

> In this case I want to contiue to access path.
> > 
> > ?
> > 
> > I'd certainly like to see testcases btw...
> 
> There is a testcase for variable array access included in the patch,
> would you like to have one with union in it?

Oh, I missed the testcase ;)

> > 
> > A more stricter test would be
> > 
> > 	if (!maybe_eq (max_size1, size1) && !maybe_eq (max_size2, size2))
> >           return true;
> >         /* If there's a variable access in one of the refs fall through
> >            to access-path based disambiguation.  */
> > 
> > where you'd need to pass down ao_ref_size in addition to max_size as well.
> 
> Proably || here?

Hmm, !maybe_eq () -> ! max_size1 == size1 -> max_size != size1 thus
I think && is correct if you want to disambiguate a[1].v2 and a[i].v1

But yes, if you don't want that then || is cheaper.  Probably add
another testcase with one of the accesses with a constant index?

Richard.
Jan Hubicka June 17, 2019, 1:34 p.m. UTC | #5
> 
> Hmm, !maybe_eq () -> ! max_size1 == size1 -> max_size != size1 thus
> I think && is correct if you want to disambiguate a[1].v2 and a[i].v1
> 
> But yes, if you don't want that then || is cheaper.  Probably add
> another testcase with one of the accesses with a constant index?

Hmm, OK, without unions I do not see how I could disambiguate something
with || and not &&.
Do we care about the testcase with union and constant accesses I sent?

Honza
> 
> Richard.
Richard Biener June 17, 2019, 1:43 p.m. UTC | #6
On Mon, 17 Jun 2019, Jan Hubicka wrote:

> > 
> > Hmm, !maybe_eq () -> ! max_size1 == size1 -> max_size != size1 thus
> > I think && is correct if you want to disambiguate a[1].v2 and a[i].v1
> > 
> > But yes, if you don't want that then || is cheaper.  Probably add
> > another testcase with one of the accesses with a constant index?
> 
> Hmm, OK, without unions I do not see how I could disambiguate something
> with || and not &&.
> Do we care about the testcase with union and constant accesses I sent?

Wouldn't it be as simple as

struct S { int i; int j; };

struct S a[10];

and a[i].i vs. a[1].j?  The first has offset = 0, max_size = sizeof(a)-4
while the second has offset = 12, max_size = 4 and thus they overlap.

Richard.
Jan Hubicka June 17, 2019, 1:52 p.m. UTC | #7
> On Mon, 17 Jun 2019, Jan Hubicka wrote:
> 
> > > 
> > > Hmm, !maybe_eq () -> ! max_size1 == size1 -> max_size != size1 thus
> > > I think && is correct if you want to disambiguate a[1].v2 and a[i].v1
> > > 
> > > But yes, if you don't want that then || is cheaper.  Probably add
> > > another testcase with one of the accesses with a constant index?
> > 
> > Hmm, OK, without unions I do not see how I could disambiguate something
> > with || and not &&.
> > Do we care about the testcase with union and constant accesses I sent?
> 
> Wouldn't it be as simple as
> 
> struct S { int i; int j; };
> 
> struct S a[10];
> 
> and a[i].i vs. a[1].j?  The first has offset = 0, max_size = sizeof(a)-4
> while the second has offset = 12, max_size = 4 and thus they overlap.

Ha, you are right of course (and in fact similar testcases was why
I started to look into this :)

Honza
Richard Sandiford June 17, 2019, 2:26 p.m. UTC | #8
Richard Biener <rguenther@suse.de> writes:
> On Mon, 17 Jun 2019, Jan Hubicka wrote:
>
>> > 
>> > But part of the expensiveness we want to avoid is this
>> > (repeated) walking of the ref tree...
>> 
>> I was considering to pass contains_union_p down from one of
>> earlier walks, but did not find suitable one for that...
>> > 
>> > So...
>> > 
>> > > +  return !handled_component_p (ref2);
>> > > +}
>> > > +
>> > >  /* Return true if an indirect reference based on *PTR1 constrained
>> > >     to [OFFSET1, OFFSET1 + MAX_SIZE1) may alias a variable based on BASE2
>> > >     constrained to [OFFSET2, OFFSET2 + MAX_SIZE2).  *PTR1 and BASE2 have
>> > > @@ -1533,7 +1563,12 @@ indirect_ref_may_alias_decl_p (tree ref1
>> > >        && (TREE_CODE (TREE_TYPE (base1)) != ARRAY_TYPE
>> > >  	  || (TYPE_SIZE (TREE_TYPE (base1))
>> > >  	      && TREE_CODE (TYPE_SIZE (TREE_TYPE (base1))) == INTEGER_CST)))
>> > > -    return ranges_maybe_overlap_p (doffset1, max_size1, doffset2, max_size2);
>> > > +    {
>> > > +      if (!ranges_maybe_overlap_p (doffset1, max_size1, doffset2, max_size2))
>> > > +	return false;
>> > > +      if (same_access_paths_p (ref1, max_size1, ref2, max_size2))
>> > > +	return true;
>> > 
>> > how about a simpler test like
>> > 
>> >          if (known_size_p (max_size1) && known_size_p (max_size2))
>> >            return true;
>> >          /* If there's an unconstrained variable access in the ref fall
>> > 	    through to access-path based disambiguation.  */
>> 
>> If I have something like
>>  struct a {int a[10];int b;}
>> and then
>>  aptr->a[i]
>> in the access path, won't be max_size known (40) where type size is 4?
>
> Yes.
>
>> In this case I want to contiue to access path.
>> > 
>> > ?
>> > 
>> > I'd certainly like to see testcases btw...
>> 
>> There is a testcase for variable array access included in the patch,
>> would you like to have one with union in it?
>
> Oh, I missed the testcase ;)
>
>> > 
>> > A more stricter test would be
>> > 
>> > 	if (!maybe_eq (max_size1, size1) && !maybe_eq (max_size2, size2))
>> >           return true;
>> >         /* If there's a variable access in one of the refs fall through
>> >            to access-path based disambiguation.  */
>> > 
>> > where you'd need to pass down ao_ref_size in addition to max_size as well.
>> 
>> Proably || here?
>
> Hmm, !maybe_eq () -> ! max_size1 == size1 -> max_size != size1 thus
> I think && is correct if you want to disambiguate a[1].v2 and a[i].v1
>
> But yes, if you don't want that then || is cheaper.  Probably add
> another testcase with one of the accesses with a constant index?

Might be misunderstanding, but isn't the check for a variable access
!known_eq (max_size1, size1) == maybe_ne (max_size1, size1)?  "maybe_eq"
means "could be equal in some circumstances, even if they're not always
equal".

Richard
Jan Hubicka June 17, 2019, 2:33 p.m. UTC | #9
> >> > 
> >> > A more stricter test would be
> >> > 
> >> > 	if (!maybe_eq (max_size1, size1) && !maybe_eq (max_size2, size2))
> >> >           return true;
> >> >         /* If there's a variable access in one of the refs fall through
> >> >            to access-path based disambiguation.  */
> >> > 
> >> > where you'd need to pass down ao_ref_size in addition to max_size as well.
> >> 
> >> Proably || here?
> >
> > Hmm, !maybe_eq () -> ! max_size1 == size1 -> max_size != size1 thus
> > I think && is correct if you want to disambiguate a[1].v2 and a[i].v1
> >
> > But yes, if you don't want that then || is cheaper.  Probably add
> > another testcase with one of the accesses with a constant index?
> 
> Might be misunderstanding, but isn't the check for a variable access
> !known_eq (max_size1, size1) == maybe_ne (max_size1, size1)?  "maybe_eq"
> means "could be equal in some circumstances, even if they're not always
> equal".

Seems I am getting progressively more confused.  I think I want to pass
the size down from ao_ref to the functions and then see if they are
always equal.

Why would I want to use maybe variant?

Honza

> 
> Richard
Richard Sandiford June 17, 2019, 4:23 p.m. UTC | #10
Jan Hubicka <hubicka@ucw.cz> writes:
>> >> > 
>> >> > A more stricter test would be
>> >> > 
>> >> > 	if (!maybe_eq (max_size1, size1) && !maybe_eq (max_size2, size2))
>> >> >           return true;
>> >> >         /* If there's a variable access in one of the refs fall through
>> >> >            to access-path based disambiguation.  */
>> >> > 
>> >> > where you'd need to pass down ao_ref_size in addition to max_size as well.
>> >> 
>> >> Proably || here?
>> >
>> > Hmm, !maybe_eq () -> ! max_size1 == size1 -> max_size != size1 thus
>> > I think && is correct if you want to disambiguate a[1].v2 and a[i].v1
>> >
>> > But yes, if you don't want that then || is cheaper.  Probably add
>> > another testcase with one of the accesses with a constant index?
>> 
>> Might be misunderstanding, but isn't the check for a variable access
>> !known_eq (max_size1, size1) == maybe_ne (max_size1, size1)?  "maybe_eq"
>> means "could be equal in some circumstances, even if they're not always
>> equal".
>
> Seems I am getting progressively more confused.

Well, me too :-)  I didn't really understand the choice of the original
condition above.  It seemed to be "return true if both access sizes are
variable", but the comment implied something else.

But I was just picking up on the choice of maybe_eq vs. known_eq in the
condition, which as things stand would only matter for SVE.  And...

> I think I want to pass the size down from ao_ref to the functions and
> then see if they are always equal.

...right, that's what I mean.  known_eq gives you "always equal",
i.e. "not variable".  The maybe_eq in the original condition seemed odd
because it includes the constant case and (potentially) variable cases
that are size1 or bigger.

In inverted conditions, !known_eq can be written maybe_ne (but doesn't
need to be, if !known_eq seems clearer).

Thanks,
Richard

>
> Why would I want to use maybe variant?
>
> Honza
>
>> 
>> Richard
Richard Biener June 17, 2019, 4:40 p.m. UTC | #11
On June 17, 2019 6:23:03 PM GMT+02:00, Richard Sandiford <richard.sandiford@arm.com> wrote:
>Jan Hubicka <hubicka@ucw.cz> writes:
>>> >> > 
>>> >> > A more stricter test would be
>>> >> > 
>>> >> > 	if (!maybe_eq (max_size1, size1) && !maybe_eq (max_size2,
>size2))
>>> >> >           return true;
>>> >> >         /* If there's a variable access in one of the refs fall
>through
>>> >> >            to access-path based disambiguation.  */
>>> >> > 
>>> >> > where you'd need to pass down ao_ref_size in addition to
>max_size as well.
>>> >> 
>>> >> Proably || here?
>>> >
>>> > Hmm, !maybe_eq () -> ! max_size1 == size1 -> max_size != size1
>thus
>>> > I think && is correct if you want to disambiguate a[1].v2 and
>a[i].v1
>>> >
>>> > But yes, if you don't want that then || is cheaper.  Probably add
>>> > another testcase with one of the accesses with a constant index?
>>> 
>>> Might be misunderstanding, but isn't the check for a variable access
>>> !known_eq (max_size1, size1) == maybe_ne (max_size1, size1)? 
>"maybe_eq"
>>> means "could be equal in some circumstances, even if they're not
>always
>>> equal".
>>
>> Seems I am getting progressively more confused.
>
>Well, me too :-)  I didn't really understand the choice of the original
>condition above.  It seemed to be "return true if both access sizes are
>variable", but the comment implied something else.

Sorry,! Must_eq is obviously fine. 

>But I was just picking up on the choice of maybe_eq vs. known_eq in the
>condition, which as things stand would only matter for SVE.  And...
>
>> I think I want to pass the size down from ao_ref to the functions and
>> then see if they are always equal.
>
>...right, that's what I mean.  known_eq gives you "always equal",
>i.e. "not variable".  The maybe_eq in the original condition seemed odd
>because it includes the constant case and (potentially) variable cases
>that are size1 or bigger.
>
>In inverted conditions, !known_eq can be written maybe_ne (but doesn't
>need to be, if !known_eq seems clearer).
>
>Thanks,
>Richard
>
>>
>> Why would I want to use maybe variant?
>>
>> Honza
>>
>>> 
>>> Richard
Jan Hubicka June 18, 2019, 3:31 p.m. UTC | #12
> >Well, me too :-)  I didn't really understand the choice of the original
> >condition above.  It seemed to be "return true if both access sizes are
> >variable", but the comment implied something else.
> 
> Sorry,! Must_eq is obviously fine. 

Thanks, good to know we are on same page :)
After some thinking I however believe we could reorganize oracle to be
faster.  What we do currently is

1) try to match base pointers and if they do we go for rangle_overlap_p only
2) try to match base types and if they do go for rangle_overlap_p only
3) try nonoverlapping_component_refs_p
4) try aliasing_component_refs_p.

Now I think we could do

1) try to match base pointers, if they do and rangle_overlap_p is false,
   return false. Otherwise remember that we can not have partial overlap
   of arrays which we handle conservatively otherwise.
2) try to match base types. If they do
    a) return false if range_overlap_p is false
    b) do variant of nonoverlaping_component_refs_decl_p starting from
       the known match of types and using LTO friendly
       same_types_for_tbaa_p compare.

       If it does not trip over array_range_refs or unions I do not
       think we need to do sorting like nonoverlapping_component_refs_p
       does. This perhaps the sorting path can be dropped compoetely.
    c) return true if failed
3) continue by looking for match of basetype of one path with inner type
   of the ohter (aliasing_component_refs_p) and if match is found
   repeat a),b),c)
4) do the access path continuation test.

I do not think this scheme should miss any cases where
nonoverlapping_component_refs_p matches since if we have match in the
middle of access path then either the access paths are incompatible
(will be ruled out by 4) or mathing types are found in 2/3.

2) is kind of redundant with 3), but since basetype match is common and
it saves some extra walk I think it makes sense to do it first and make
3) to skip outermost REF.

So I think I will drop this patch, fix divergences between
nonoverlapping_component_refs_p and and nonoverlaping_component_refs_decl_p,
hopefully enable aliasing_component_refs and nonoverlapping
and then see if I can reorgnaize the code this way.

Honza
diff mbox series

Patch

Index: testsuite/gcc.dg/tree-ssa/alias-access-path-3.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/alias-access-path-3.c	(nonexistent)
+++ testsuite/gcc.dg/tree-ssa/alias-access-path-3.c	(working copy)
@@ -0,0 +1,23 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-fre1" } */
+struct a {int v1;
+	  int v2;};
+struct b {struct a a[0];};
+
+int
+test (struct b *bptr1, struct b *bptr2, int i, int j)
+{
+  bptr1->a[i].v1=123;
+  bptr2->a[j].v2=1;
+  return bptr1->a[i].v1;
+}
+int
+test2 (struct b *bptr1, struct b *bptr2, int i, int j)
+{
+  bptr1->a[i].v1=124;
+  bptr2->a[j].v1=1;
+  return bptr1->a[i].v1;
+}
+/* test should be optimized, while test2 should not.  */
+/* { dg-final { scan-tree-dump-times "return 123" 1 "fre1"} } */
+/* { dg-final { scan-tree-dump-not "return 124" "fre1"} } */
Index: tree-ssa-alias.c
===================================================================
--- tree-ssa-alias.c	(revision 272358)
+++ tree-ssa-alias.c	(working copy)
@@ -1413,6 +1413,36 @@  decl_refs_may_alias_p (tree ref1, tree b
   return true;     
 }
 
+/* Return true if access paths are same and thus access path
+   oracles can be skiped.  This is just a quick check for common
+   cases where we assume that outer types or addresses has been
+   previously matched.  */
+
+bool
+same_access_paths_p (tree ref1, poly_int64 max_size1,
+		     tree ref2, poly_int64 max_size2)
+{
+  poly_uint64 size;
+  if (!ref1 || !ref2
+      || !known_eq (max_size1, max_size2)
+      || TYPE_SIZE (TREE_TYPE (ref1)) != TYPE_SIZE (TREE_TYPE (ref2))
+      || !poly_int_tree_p (TYPE_SIZE (TREE_TYPE (ref1)), &size)
+      || !known_eq ((poly_int64)size, max_size1))
+    return false;
+  while (handled_component_p (ref1))
+    {
+      if (!handled_component_p (ref2))
+	return false;
+      if (TREE_CODE (TREE_TYPE (ref1)) == UNION_TYPE
+	  || TREE_CODE (TREE_TYPE (ref1))
+	     != TREE_CODE (TREE_TYPE (ref2)))
+	return false;
+      ref1 = TREE_OPERAND (ref1, 0);
+      ref2 = TREE_OPERAND (ref2, 0);
+    }
+  return !handled_component_p (ref2);
+}
+
 /* Return true if an indirect reference based on *PTR1 constrained
    to [OFFSET1, OFFSET1 + MAX_SIZE1) may alias a variable based on BASE2
    constrained to [OFFSET2, OFFSET2 + MAX_SIZE2).  *PTR1 and BASE2 have
@@ -1533,7 +1563,12 @@  indirect_ref_may_alias_decl_p (tree ref1
       && (TREE_CODE (TREE_TYPE (base1)) != ARRAY_TYPE
 	  || (TYPE_SIZE (TREE_TYPE (base1))
 	      && TREE_CODE (TYPE_SIZE (TREE_TYPE (base1))) == INTEGER_CST)))
-    return ranges_maybe_overlap_p (doffset1, max_size1, doffset2, max_size2);
+    {
+      if (!ranges_maybe_overlap_p (doffset1, max_size1, doffset2, max_size2))
+	return false;
+      if (same_access_paths_p (ref1, max_size1, ref2, max_size2))
+	return true;
+    }
 
   if (ref1 && ref2
       && nonoverlapping_component_refs_p (ref1, ref2))
@@ -1613,8 +1648,9 @@  indirect_refs_may_alias_p (tree ref1 ATT
     {
       poly_offset_int moff1 = mem_ref_offset (base1) << LOG2_BITS_PER_UNIT;
       poly_offset_int moff2 = mem_ref_offset (base2) << LOG2_BITS_PER_UNIT;
-      return ranges_maybe_overlap_p (offset1 + moff1, max_size1,
-				     offset2 + moff2, max_size2);
+      if (!ranges_maybe_overlap_p (offset1 + moff1, max_size1,
+				   offset2 + moff2, max_size2))
+	return false;
     }
   if (!ptr_derefs_may_alias_p (ptr1, ptr2))
     return false;
@@ -1654,7 +1690,12 @@  indirect_refs_may_alias_p (tree ref1 ATT
          can overlap by an exact multiple of their element size.
          See gcc.dg/torture/alias-2.c.  */
       && TREE_CODE (TREE_TYPE (ptrtype1)) != ARRAY_TYPE)
-    return ranges_maybe_overlap_p (offset1, max_size1, offset2, max_size2);
+    {
+      if (!ranges_maybe_overlap_p (offset1, max_size1, offset2, max_size2))
+	return false;
+      if (same_access_paths_p (ref1, max_size1, ref2, max_size2))
+	return true;
+    }
 
   if (ref1 && ref2
       && nonoverlapping_component_refs_p (ref1, ref2))