From patchwork Sat Aug 28 19:08:58 2021
Date: Sat, 28 Aug 2021 21:08:58 +0200
From: Jan Hubicka
To: gcc-patches@gcc.gnu.org
Subject: Improve merging of modref_access_node
Message-ID: <20210828190858.GA76987@kam.mff.cuni.cz>

Hi,
this should be the final bit of the fancy access merging.  We limit the
number of accesses to 16, and on overflow we currently just throw away
the whole table.  This patch instead looks for the closest pair of
entries in the table and merges them (losing some precision).  This does
not happen very often during a normal GCC bootstrap, but with
-fno-strict-aliasing the overflows are much more common and happen 272
times (for stuff like our autogenerated param handling).  I hope this
may be useful for some real-world code that does a lot of array
manipulation.

Bootstrapped/regtested x86_64-linux, committed.

Since I produced aliasing stats for a cc1plus build with lto-bootstrap
and -fno-strict-aliasing, I am attaching them:
Alias oracle query stats:
  refs_may_alias_p: 73520790 disambiguations, 96898146 queries
  ref_maybe_used_by_call_p: 515551 disambiguations, 74487359 queries
  call_may_clobber_ref_p: 348504 disambiguations, 352504 queries
  nonoverlapping_component_refs_p: 0 disambiguations, 0 queries
  nonoverlapping_refs_since_match_p: 33251 disambiguations, 42895 must overlaps, 76905 queries
  aliasing_component_refs_p: 0 disambiguations, 0 queries
  TBAA oracle: 0 disambiguations 128 queries
               0 are in alias set 0
               0 queries asked about the same object
               128 queries asked about the same alias set
               0 access volatile
               0 are dependent in the DAG
               0 are artificially in conflict with void *

Modref stats:
  modref use: 7260 disambiguations, 640326 queries
  modref clobber: 591264 disambiguations, 20039893 queries
  0 tbaa queries (0.000000 per modref query)
  1145567 base compares (0.057164 per modref query)

PTA query stats:
  pt_solution_includes: 13729755 disambiguations, 35737015 queries
  pt_solutions_intersect: 1703678 disambiguations, 13200534 queries

It seems that modref is still quite effective at handling clobbers.

gcc/ChangeLog:

	* ipa-modref-tree.h (modref_access_node::merge): Break out logic
	combining offsets and logic merging ranges to ...
	(modref_access_node::combined_offsets): ... here and ...
	(modref_access_node::update2): ... here.
	(modref_access_node::closer_pair_p): New member function.
	(modref_access_node::forced_merge): New member function.
	(modref_ref_node::insert): Do merging when table is full.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/modref-9.c: New test.

diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h
index a86e684a030..6a9ed5ce54b 100644
--- a/gcc/ipa-modref-tree.h
+++ b/gcc/ipa-modref-tree.h
@@ -196,8 +196,9 @@ struct GTY(()) modref_access_node
      was prolonged and punt when there are too many.
      */
   bool merge (const modref_access_node &a, bool record_adjustments)
     {
-      poly_int64 aoffset_adj = 0, offset_adj = 0;
-      poly_int64 new_parm_offset = parm_offset;
+      poly_int64 offset1 = 0;
+      poly_int64 aoffset1 = 0;
+      poly_int64 new_parm_offset = 0;
 
       /* We assume that containment was tested earlier.  */
       gcc_checking_assert (!contains (a) && !a.contains (*this));
@@ -209,29 +210,13 @@ struct GTY(()) modref_access_node
 	{
 	  if (!a.parm_offset_known)
 	    return false;
-	  if (known_le (a.parm_offset, parm_offset))
-	    {
-	      offset_adj = (parm_offset - a.parm_offset)
-			   << LOG2_BITS_PER_UNIT;
-	      aoffset_adj = 0;
-	      new_parm_offset = a.parm_offset;
-	    }
-	  else if (known_le (parm_offset, a.parm_offset))
-	    {
-	      aoffset_adj = (a.parm_offset - parm_offset)
-			    << LOG2_BITS_PER_UNIT;
-	      offset_adj = 0;
-	    }
-	  else
+	  if (!combined_offsets (a, &new_parm_offset, &offset1, &aoffset1))
 	    return false;
 	}
     }
   /* See if we can merge ranges.  */
   if (range_info_useful_p ())
     {
-      poly_int64 offset1 = offset + offset_adj;
-      poly_int64 aoffset1 = a.offset + aoffset_adj;
-
       /* In this case we have containment that should be handled earlier.
	 */
       gcc_checking_assert (a.range_info_useful_p ());
@@ -255,46 +240,211 @@ struct GTY(()) modref_access_node
 	  return false;
 	if (known_le (offset1, aoffset1))
 	  {
-	    if (!known_size_p (max_size))
+	    if (!known_size_p (max_size)
+		|| known_ge (offset1 + max_size, aoffset1))
 	      {
-		update (new_parm_offset, offset1, size, max_size,
-			record_adjustments);
-		return true;
-	      }
-	    else if (known_ge (offset1 + max_size, aoffset1))
-	      {
-		poly_int64 new_max_size = max_size;
-		if (known_le (max_size, a.max_size + aoffset1 - offset1))
-		  new_max_size = a.max_size + aoffset1 - offset1;
-		update (new_parm_offset, offset1, size, new_max_size,
-			record_adjustments);
+		update2 (new_parm_offset, offset1, size, max_size,
+			 aoffset1, a.size, a.max_size,
+			 record_adjustments);
 		return true;
 	      }
 	  }
 	else if (known_le (aoffset1, offset1))
 	  {
-	    if (!known_size_p (a.max_size))
-	      {
-		update (new_parm_offset, aoffset1, size, a.max_size,
-			record_adjustments);
-		return true;
-	      }
-	    else if (known_ge (aoffset1 + a.max_size, offset1))
+	    if (!known_size_p (a.max_size)
+		|| known_ge (aoffset1 + a.max_size, offset1))
 	      {
-		poly_int64 new_max_size = a.max_size;
-		if (known_le (a.max_size, max_size + offset1 - aoffset1))
-		  new_max_size = max_size + offset1 - aoffset1;
-		update (new_parm_offset, aoffset1, size, new_max_size,
-			record_adjustments);
+		update2 (new_parm_offset, offset1, size, max_size,
+			 aoffset1, a.size, a.max_size,
+			 record_adjustments);
 		return true;
 	      }
 	  }
 	return false;
       }
-    update (new_parm_offset, offset + offset_adj,
+    update (new_parm_offset, offset1,
 	    size, max_size, record_adjustments);
     return true;
   }
+  /* Return true if A1 and B1 can be merged with lower information
+     loss than A2 and B2.
+     Assume that no containment or lossless merging is possible.  */
+  static bool closer_pair_p (const modref_access_node &a1,
+			     const modref_access_node &b1,
+			     const modref_access_node &a2,
+			     const modref_access_node &b2)
+  {
+    /* Merging different parm indexes comes to complete loss
+       of range info.
       */
+    if (a1.parm_index != b1.parm_index)
+      return false;
+    if (a2.parm_index != b2.parm_index)
+      return true;
+    /* If parm is known and parm indexes are the same we should
+       already have containment.  */
+    gcc_checking_assert (a1.parm_offset_known && b1.parm_offset_known);
+    gcc_checking_assert (a2.parm_offset_known && b2.parm_offset_known);
+
+    /* First normalize offsets for parm offsets.  */
+    poly_int64 new_parm_offset, offseta1, offsetb1, offseta2, offsetb2;
+    if (!a1.combined_offsets (b1, &new_parm_offset, &offseta1, &offsetb1)
+	|| !a2.combined_offsets (b2, &new_parm_offset, &offseta2, &offsetb2))
+      gcc_unreachable ();
+
+    /* Now compute distance of the intervals.  */
+    poly_int64 dist1, dist2;
+    if (known_le (offseta1, offsetb1))
+      {
+	if (!known_size_p (a1.max_size))
+	  dist1 = 0;
+	else
+	  dist1 = offsetb1 - offseta1 - a1.max_size;
+      }
+    else
+      {
+	if (!known_size_p (b1.max_size))
+	  dist1 = 0;
+	else
+	  dist1 = offseta1 - offsetb1 - b1.max_size;
+      }
+    if (known_le (offseta2, offsetb2))
+      {
+	if (!known_size_p (a2.max_size))
+	  dist2 = 0;
+	else
+	  dist2 = offsetb2 - offseta2 - a2.max_size;
+      }
+    else
+      {
+	if (!known_size_p (b2.max_size))
+	  dist2 = 0;
+	else
+	  dist2 = offseta2 - offsetb2 - b2.max_size;
+      }
+    /* It may happen that intervals overlap in case size
+       is different.  Prefer the overlap to non-overlap.  */
+    if (known_lt (dist1, 0) && known_ge (dist2, 0))
+      return true;
+    if (known_lt (dist2, 0) && known_ge (dist1, 0))
+      return false;
+    if (known_lt (dist1, 0))
+      /* If both overlap, minimize the overlap.  */
+      return known_le (dist2, dist1);
+    else
+      /* If both are disjoint look for smaller distance.  */
+      return known_le (dist1, dist2);
+  }
+
+  /* Merge in access A while losing precision.  */
+  void forced_merge (const modref_access_node &a, bool record_adjustments)
+  {
+    if (parm_index != a.parm_index)
+      {
+	gcc_checking_assert (parm_index != -1);
+	parm_index = -1;
+	return;
+      }
+
+    /* We assume that containment and lossless merging
+       was tested earlier.
       */
+    gcc_checking_assert (!contains (a) && !a.contains (*this)
+			 && !merge (a, record_adjustments));
+    gcc_checking_assert (parm_offset_known && a.parm_offset_known);
+
+    poly_int64 new_parm_offset, offset1, aoffset1;
+    if (!combined_offsets (a, &new_parm_offset, &offset1, &aoffset1))
+      {
+	parm_offset_known = false;
+	return;
+      }
+    gcc_checking_assert (range_info_useful_p ()
+			 && a.range_info_useful_p ());
+    if (record_adjustments)
+      adjustments += a.adjustments;
+    update2 (new_parm_offset,
+	     offset1, size, max_size,
+	     aoffset1, a.size, a.max_size,
+	     record_adjustments);
+  }
+private:
+  /* Merge two ranges both starting at parm_offset1 and update THIS
+     with the result.  */
+  void update2 (poly_int64 parm_offset1,
+		poly_int64 offset1, poly_int64 size1, poly_int64 max_size1,
+		poly_int64 offset2, poly_int64 size2, poly_int64 max_size2,
+		bool record_adjustments)
+  {
+    poly_int64 new_size = size1;
+
+    if (!known_size_p (size2)
+	|| known_le (size2, size1))
+      new_size = size2;
+    else
+      gcc_checking_assert (known_le (size1, size2));
+
+    if (known_le (offset1, offset2))
+      ;
+    else if (known_le (offset2, offset1))
+      {
+	std::swap (offset1, offset2);
+	std::swap (max_size1, max_size2);
+      }
+    else
+      gcc_unreachable ();
+
+    poly_int64 new_max_size;
+
+    if (!known_size_p (max_size1))
+      new_max_size = max_size1;
+    else if (!known_size_p (max_size2))
+      new_max_size = max_size2;
+    else
+      {
+	max_size2 = max_size2 + offset2 - offset1;
+	if (known_le (max_size, max_size2))
+	  new_max_size = max_size2;
+	else if (known_le (max_size2, max_size))
+	  new_max_size = max_size;
+	else
+	  gcc_unreachable ();
+      }
+
+    update (parm_offset1, offset1,
+	    new_size, new_max_size, record_adjustments);
+  }
+  /* Given access nodes THIS and A, return true if they
+     can be expressed with a common parm_offset.  In this case
+     return the parm offset in new_parm_offset, new_offset
+     which is start of range in THIS and new_aoffset that
+     is start of range in A.
     */
+  bool combined_offsets (const modref_access_node &a,
+			 poly_int64 *new_parm_offset,
+			 poly_int64 *new_offset,
+			 poly_int64 *new_aoffset) const
+  {
+    gcc_checking_assert (parm_offset_known && a.parm_offset_known);
+    if (known_le (a.parm_offset, parm_offset))
+      {
+	*new_offset = offset
+		      + ((parm_offset - a.parm_offset)
+			 << LOG2_BITS_PER_UNIT);
+	*new_aoffset = a.offset;
+	*new_parm_offset = a.parm_offset;
+	return true;
+      }
+    else if (known_le (parm_offset, a.parm_offset))
+      {
+	*new_aoffset = a.offset
+		       + ((a.parm_offset - parm_offset)
+			  << LOG2_BITS_PER_UNIT);
+	*new_offset = offset;
+	*new_parm_offset = parm_offset;
+	return true;
+      }
+    else
+      return false;
+  }
 };
 
 /* Access node specifying no useful info.  */
@@ -348,7 +498,7 @@ struct GTY((user)) modref_ref_node
 	return false;
 
       /* Otherwise, insert a node for the ref of the access under the base.  */
-      size_t i;
+      size_t i, j;
       modref_access_node *a2;
 
       if (flag_checking)
@@ -390,11 +540,61 @@ struct GTY((user)) modref_ref_node
 	 all accesses and bail out.  */
       if (accesses && accesses->length () >= max_accesses)
 	{
-	  if (dump_file)
+	  if (max_accesses < 2)
+	    {
+	      collapse ();
+	      if (dump_file)
+		fprintf (dump_file,
+			 "--param param=modref-max-accesses limit reached;"
+			 " collapsing\n");
+	      return true;
+	    }
+	  /* Find least harmful merge and perform it.  */
+	  int best1 = -1, best2 = -1;
+	  FOR_EACH_VEC_SAFE_ELT (accesses, i, a2)
+	    {
+	      for (j = i + 1; j < accesses->length (); j++)
+		if (best1 < 0
+		    || modref_access_node::closer_pair_p
+			 (*a2, (*accesses)[j],
+			  (*accesses)[best1],
+			  best2 < 0 ? a : (*accesses)[best2]))
+		  {
+		    best1 = i;
+		    best2 = j;
+		  }
+	      if (modref_access_node::closer_pair_p
+		    (*a2, a,
+		     (*accesses)[best1],
+		     best2 < 0 ? a : (*accesses)[best2]))
+		{
+		  best1 = i;
+		  best2 = -1;
+		}
+	    }
+	  (*accesses)[best1].forced_merge (best2 < 0 ?
+					   a : (*accesses)[best2],
+					   record_adjustments);
+	  if (!(*accesses)[best1].useful_p ())
+	    {
+	      collapse ();
+	      if (dump_file)
+		fprintf (dump_file,
+			 "--param param=modref-max-accesses limit reached;"
+			 " collapsing\n");
+	      return true;
+	    }
+	  if (dump_file && best2 >= 0)
 	    fprintf (dump_file,
-		     "--param param=modref-max-accesses limit reached\n");
-	  collapse ();
-	  return true;
+		     "--param param=modref-max-accesses limit reached;"
+		     " merging %i and %i\n", best1, best2);
+	  else if (dump_file)
+	    fprintf (dump_file,
+		     "--param param=modref-max-accesses limit reached;"
+		     " merging with %i\n", best1);
+	  try_merge_with (best1);
+	  if (best2 >= 0)
+	    insert_access (a, max_accesses, record_adjustments);
+	  return 1;
 	}
       a.adjustments = 0;
       vec_safe_push (accesses, a);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-9.c b/gcc/testsuite/gcc.dg/tree-ssa/modref-9.c
new file mode 100644
index 00000000000..02de2f09288
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-9.c
@@ -0,0 +1,15 @@
+/* { dg-options "-O2 --param modref-max-accesses=2 -fdump-tree-modref1" } */
+/* { dg-do compile } */
+void
+test(char *a)
+{
+  a[0] = 0;
+  a[1] = 1;
+  a[3] = 3;
+  a[7] = 7;
+  a[9] = 9;
+}
+/* We allow only two accesses per function.
+   It is best to group together {0,1,3} and {7,9}.  */
+/* { dg-final { scan-tree-dump "access: Parm 0 param offset:0 offset:0 size:8 max_size:32" "modref1" } } */
+/* { dg-final { scan-tree-dump "access: Parm 0 param offset:7 offset:0 size:8 max_size:24" "modref1" } } */