
[rs6000] Don't optimize swaps when a swap has mixed use

Message ID 2a44ada4-e1d0-3b39-7a88-7f82f55f7066@linux.vnet.ibm.com
State New

Commit Message

Bill Schmidt Dec. 19, 2017, 1:59 p.m. UTC
Hi,

Carl Love is working on a patch to add missing flavors of the
vec_xst_be intrinsic and test cases to cover all flavors.  He ran
into a latent bug in swap optimization that this patch addresses.
Swap optimization operates on the principle that a computation can
have swaps removed if all permuting loads are accompanied by a swap,
all permuting stores are accompanied by a swap, and the remaining
vector computations are lane-insensitive or easy to adjust if lanes
are swapped across doublewords.

A new problem that arises with vec_xl_be and vec_xst_be is that the
same swap may accompany both a load and a store, so that removing
that swap changes the semantics of the program.  Suppose we have a
vec_xl from *(a+b) followed by a vec_xst_be to *(c+d).  The code at
expand time then looks like:

   lxvd2x x,a,b
   xxswapd x,x
   stxvd2x x,c,d

The first two instructions are generated by vec_xl, while the last
is generated by vec_xst_be.  Swap optimization removes the xxswapd
because this sequence satisfies the rules, but now we have the same
result as if the vec_xst_be were actually a vec_xst.

To avoid this, this patch marks a computation as unoptimizable if
it contains a swap that is both fed by a permuting load and feeds
into a permuting store.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu for POWER8
with no regressions.  Carl has verified this fixes the related
problems in his test cases under development.  Is this okay for
trunk?

Thanks,
Bill


2017-12-19  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000-p8swap.c (swap_feeds_both_load_and_store):
	New function.
	(rs6000_analyze_swaps): Mark a web unoptimizable if it contains a
	swap associated with both a load and a store.

Comments

Bill Schmidt Jan. 2, 2018, 10:57 p.m. UTC | #1
Segher was kind enough to give me an offline review on his vacation.
I made some small changes and committed the following.  Thanks!

Bill

2018-01-02  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000-p8swap.c (swap_feeds_both_load_and_store):
	New function.
	(rs6000_analyze_swaps): Mark a web unoptimizable if it contains a
	swap associated with both a load and a store.

Index: gcc/config/rs6000/rs6000-p8swap.c
===================================================================
--- gcc/config/rs6000/rs6000-p8swap.c	(revision 256110)
+++ gcc/config/rs6000/rs6000-p8swap.c	(working copy)
@@ -328,6 +328,38 @@ insn_is_swap_p (rtx insn)
   return 1;
 }
 
+/* Return 1 iff INSN_ENTRY, known to reference a swap, is both fed by a
+   load and a feeder of a store.  */
+static unsigned int
+swap_feeds_both_load_and_store (swap_web_entry *insn_entry)
+{
+  rtx insn = insn_entry->insn;
+  struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
+  df_ref def, use;
+  struct df_link *link = 0;
+  rtx_insn *load = 0, *store = 0;
+  bool fed_by_load = 0;
+  bool feeds_store = 0;
+
+  FOR_EACH_INSN_INFO_USE (use, insn_info)
+    {
+      link = DF_REF_CHAIN (use);
+      load = DF_REF_INSN (link->ref);
+      if (insn_is_load_p (load) && insn_is_swap_p (load))
+	fed_by_load = 1;
+    }
+
+  FOR_EACH_INSN_INFO_DEF (def, insn_info)
+    {
+      link = DF_REF_CHAIN (def);
+      store = DF_REF_INSN (link->ref);
+      if (insn_is_store_p (store) && insn_is_swap_p (store))
+	feeds_store = 1;
+    }
+
+  return fed_by_load && feeds_store;
+}
+
 /* Return TRUE if insn is a swap fed by a load from the constant pool.  */
 static bool
 const_load_sequence_p (swap_web_entry *insn_entry, rtx insn)
@@ -2030,6 +2062,14 @@ rs6000_analyze_swaps (function *fun)
 	  && !insn_entry[i].is_swap && !insn_entry[i].is_swappable)
 	root->web_not_optimizable = 1;
 
+      /* If we have a swap that is both fed by a permuting load
+	 and a feeder of a permuting store, then the optimization
+	 isn't appropriate.  (Consider vec_xl followed by vec_xst_be.)  */
+      else if (insn_entry[i].is_swap && !insn_entry[i].is_load
+	       && !insn_entry[i].is_store
+	       && swap_feeds_both_load_and_store (&insn_entry[i]))
+	root->web_not_optimizable = 1;
+
       /* If we have permuting loads or stores that are not accompanied
 	 by a register swap, the optimization isn't appropriate.  */
       else if (insn_entry[i].is_load && insn_entry[i].is_swap)



Patch

Index: gcc/config/rs6000/rs6000-p8swap.c
===================================================================
--- gcc/config/rs6000/rs6000-p8swap.c	(revision 255801)
+++ gcc/config/rs6000/rs6000-p8swap.c	(working copy)
@@ -327,6 +327,38 @@  insn_is_swap_p (rtx insn)
   return 1;
 }
 
+/* Return 1 iff INSN_ENTRY, known to reference a swap, is both fed by a
+   load and a feeder of a store.  */
+static unsigned int
+swap_feeds_both_load_and_store (swap_web_entry *insn_entry)
+{
+  rtx insn = insn_entry->insn;
+  struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
+  df_ref def, use;
+  struct df_link *link = 0;
+  rtx_insn *load = 0, *store = 0;
+  unsigned int fed_by_load = 0;
+  unsigned int feeds_store = 0;
+
+  FOR_EACH_INSN_INFO_USE (use, insn_info)
+    {
+      link = DF_REF_CHAIN (use);
+      load = DF_REF_INSN (link->ref);
+      if (insn_is_load_p (load) && insn_is_swap_p (load))
+	fed_by_load = 1;
+    }
+
+  FOR_EACH_INSN_INFO_DEF (def, insn_info)
+    {
+      link = DF_REF_CHAIN (def);
+      store = DF_REF_INSN (link->ref);
+      if (insn_is_store_p (store) && insn_is_swap_p (store))
+	feeds_store = 1;
+    }
+
+  return fed_by_load & feeds_store;
+}
+
 /* Return TRUE if insn is a swap fed by a load from the constant pool.  */
 static bool
 const_load_sequence_p (swap_web_entry *insn_entry, rtx insn)
@@ -2029,6 +2061,14 @@  rs6000_analyze_swaps (function *fun)
 	  && !insn_entry[i].is_swap && !insn_entry[i].is_swappable)
 	root->web_not_optimizable = 1;
 
+      /* If we have a swap that is both fed by a permuting load
+	 and a feeder of a permuting store, then the optimization
+	 isn't appropriate.  (Consider vec_xl followed by vec_xst_be.)  */
+      else if (insn_entry[i].is_swap && !insn_entry[i].is_load
+	       && !insn_entry[i].is_store
+	       && swap_feeds_both_load_and_store (&insn_entry[i]))
+	root->web_not_optimizable = 1;
+
       /* If we have permuting loads or stores that are not accompanied
 	 by a register swap, the optimization isn't appropriate.  */
       else if (insn_entry[i].is_load && insn_entry[i].is_swap)