diff mbox series

[2/8,RS6000] rs6000_rtx_costs for AND

Message ID 20201007225800.9536-3-amodra@gmail.com
State New
Headers show
Series rs6000_rtx_costs V2 | expand

Commit Message

Alan Modra Oct. 7, 2020, 10:57 p.m. UTC
The existing "case AND" in this function is not sufficient for
optabs.c:avoid_expensive_constant usage, where the AND is passed in
outer_code.  We'd like to cost AND of rs6000_is_valid_and_mask
or rs6000_is_valid_2insn_and variety there, so that those masks aren't
seen as expensive (ie. better to load to a reg then AND).

	* config/rs6000/rs6000.c (rs6000_rtx_costs): Combine CONST_INT
	AND handling with IOR/XOR.  Move costing for AND with
	rs6000_is_valid_and_mask or rs6000_is_valid_2insn_and to
	CONST_INT.

Comments

Segher Boessenkool Oct. 20, 2020, 6:55 p.m. UTC | #1
On Thu, Oct 08, 2020 at 09:27:54AM +1030, Alan Modra wrote:
> The existing "case AND" in this function is not sufficient for
> optabs.c:avoid_expensive_constant usage, where the AND is passed in
> outer_code.  We'd like to cost AND of rs6000_is_valid_and_mask
> or rs6000_is_valid_2insn_and variety there, so that those masks aren't
> seen as expensive (ie. better to load to a reg then AND).
> 
> 	* config/rs6000/rs6000.c (rs6000_rtx_costs): Combine CONST_INT
> 	AND handling with IOR/XOR.  Move costing for AND with
> 	rs6000_is_valid_and_mask or rs6000_is_valid_2insn_and to
> 	CONST_INT.

Sorry this took so long to review :-(

On 64-bit BE this leads to *bigger* code, and closer observation shows
that some common sequences degrade on all configs.  This seems to mostly
be about "andc" (and its dot form).  It wasn't costed properly before,
but after your patch, a single instruction is replaced by three.

Could you look into this?


Segher
Alan Modra Oct. 21, 2020, 2:57 a.m. UTC | #2
On Tue, Oct 20, 2020 at 01:55:56PM -0500, Segher Boessenkool wrote:
> On Thu, Oct 08, 2020 at 09:27:54AM +1030, Alan Modra wrote:
> > The existing "case AND" in this function is not sufficient for
> > optabs.c:avoid_expensive_constant usage, where the AND is passed in
> > outer_code.  We'd like to cost AND of rs6000_is_valid_and_mask
> > or rs6000_is_valid_2insn_and variety there, so that those masks aren't
> > seen as expensive (ie. better to load to a reg then AND).
> > 
> > 	* config/rs6000/rs6000.c (rs6000_rtx_costs): Combine CONST_INT
> > 	AND handling with IOR/XOR.  Move costing for AND with
> > 	rs6000_is_valid_and_mask or rs6000_is_valid_2insn_and to
> > 	CONST_INT.
> 
> Sorry this took so long to review :-(
> 
> On 64-bit BE this leads to *bigger* code, and closer observation shows
> that some common sequences degrade on all configs.  This seems to mostly
> be about "andc" (and its dot form).  It wasn't costed properly before,
> but after your patch, a single instruction is replaced by three.
> 
> Could you look into this?

~/build/gcc-alan/gcc$ for z in *.o; do if test `objdump -dr $z | grep andc | wc -l` != `objdump -dr ../../gcc/gcc/$z | grep andc | wc -l`; then echo $z; fi; done
gimplify.o
insn-emit.o
insn-opinit.o
insn-recog.o
rs6000-string.o

All of these are exactly the case I talked about in
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553919.html

"Sometimes correct insn cost leads to unexpected results.  For
example:

extern unsigned bar (void);
unsigned
f1 (unsigned a)
{
  if ((a & 0x01000200) == 0x01000200)
    return bar ();
  return 0;
}

emits for a & 0x01000200
 (set (reg) (and (reg) (const_int 0x01000200)))
at expand time (two rlwinm insns) rather than the older
 (set (reg) (const_int 0x01000200))
 (set (reg) (and (reg) (reg)))
which is three insns.  However, since 0x01000200 is needed later the
older code after optimisation is smaller."

Things have changed slightly since I wrote the above, with the two
rlwinm insns being emitted at expand time, so you see
 (set (reg) (and (reg) (const_int 0xffffffffff0003ff)))
 (set (reg) (and (reg) (const_int 0x0000000001fffe00)))
but of course that doesn't change anything regarding the cost of
"a & 0x01000200".
Segher Boessenkool Oct. 21, 2020, 8:29 p.m. UTC | #3
On Wed, Oct 21, 2020 at 01:27:42PM +1030, Alan Modra wrote:
> On Tue, Oct 20, 2020 at 01:55:56PM -0500, Segher Boessenkool wrote:
> > On Thu, Oct 08, 2020 at 09:27:54AM +1030, Alan Modra wrote:
> > > The existing "case AND" in this function is not sufficient for
> > > optabs.c:avoid_expensive_constant usage, where the AND is passed in
> > > outer_code.  We'd like to cost AND of rs6000_is_valid_and_mask
> > > or rs6000_is_valid_2insn_and variety there, so that those masks aren't
> > > seen as expensive (ie. better to load to a reg then AND).
> > > 
> > > 	* config/rs6000/rs6000.c (rs6000_rtx_costs): Combine CONST_INT
> > > 	AND handling with IOR/XOR.  Move costing for AND with
> > > 	rs6000_is_valid_and_mask or rs6000_is_valid_2insn_and to
> > > 	CONST_INT.
> > 
> > Sorry this took so long to review :-(
> > 
> > On 64-bit BE this leads to *bigger* code, and closer observation shows
> > that some common sequences degrade on all configs.  This seems to mostly
> > be about "andc" (and its dot form).  It wasn't costed properly before,
> > but after your patch, a single instruction is replaced by three.
> > 
> > Could you look into this?
> 
> ~/build/gcc-alan/gcc$ for z in *.o; do if test `objdump -dr $z | grep andc | wc -l` != `objdump -dr ../../gcc/gcc/$z | grep andc | wc -l`; then echo $z; fi; done
> gimplify.o
> insn-emit.o
> insn-opinit.o
> insn-recog.o
> rs6000-string.o
> 
> All of these are exactly the case I talked about in
> https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553919.html

For a kernel build (my testcase) it happens more often.

> "Sometimes correct insn cost leads to unexpected results.  For
> example:
> 
> extern unsigned bar (void);
> unsigned
> f1 (unsigned a)
> {
>   if ((a & 0x01000200) == 0x01000200)
>     return bar ();
>   return 0;
> }
> 
> emits for a & 0x01000200
>  (set (reg) (and (reg) (const_int 0x01000200)))
> at expand time (two rlwinm insns) rather than the older
>  (set (reg) (const_int 0x01000200))
>  (set (reg) (and (reg) (reg)))

And that is bad.  Why on earth does expand "optimise" this?  It should
not, it hinders various *real* optimisations!

> which is three insns.  However, since 0x01000200 is needed later the
> older code after optimisation is smaller."
> 
> Things have changed slightly since I wrote the above, with the two
> rlwinm insns being emitted at expand time, so you see
>  (set (reg) (and (reg) (const_int 0xffffffffff0003ff)))
>  (set (reg) (and (reg) (const_int 0x0000000001fffe00)))

It has done that for many years?

> but of course that doesn't change anything regarding the cost of
> "a & 0x01000200".

Yeah.  But the problem is that cost that are "better", "closer to
reality", sometimes result in worse results :-(

Anyway:

+              || (outer_code == AND
+                  && rs6000_is_valid_2insn_and (x, mode)))
        {
          *total = COSTS_N_INSNS (1);
          return true;

It should return COSTS_N_INSNS (2) for that?

Testing with that now.


Segher
Alan Modra Oct. 21, 2020, 10:11 p.m. UTC | #4
On Wed, Oct 21, 2020 at 03:29:11PM -0500, Segher Boessenkool wrote:
> Anyway:
> 
> +              || (outer_code == AND
> +                  && rs6000_is_valid_2insn_and (x, mode)))
>         {
>           *total = COSTS_N_INSNS (1);
>           return true;
> 
> It should return COSTS_N_INSNS (2) for that?

No, it should not!

      /* (reg) is costed at zero by rtlanal.c:rtx_cost.  That sets a
	 baseline for rtx costs:  If a constant is valid in an insn,
	 it is free.  */
Segher Boessenkool Oct. 22, 2020, 1:29 p.m. UTC | #5
On Thu, Oct 22, 2020 at 08:41:50AM +1030, Alan Modra wrote:
> On Wed, Oct 21, 2020 at 03:29:11PM -0500, Segher Boessenkool wrote:
> > Anyway:
> > 
> > +              || (outer_code == AND
> > +                  && rs6000_is_valid_2insn_and (x, mode)))
> >         {
> >           *total = COSTS_N_INSNS (1);
> >           return true;
> > 
> > It should return COSTS_N_INSNS (2) for that?
> 
> No, it should not!
> 
>       /* (reg) is costed at zero by rtlanal.c:rtx_cost.  That sets a
> 	 baseline for rtx costs:  If a constant is valid in an insn,
> 	 it is free.  */

Yeah, this keeps tripping me up.

The patch is okay for trunk (as I told you offline).  Thanks!

The regressions are really from existing bad behaviour in expand :-(


Segher
Hans-Peter Nilsson Oct. 23, 2020, 11:18 p.m. UTC | #6
On Thu, 22 Oct 2020, Alan Modra via Gcc-patches wrote:

Hi!

> On Wed, Oct 21, 2020 at 03:29:11PM -0500, Segher Boessenkool wrote:
> > Anyway:
> >
> > +              || (outer_code == AND
> > +                  && rs6000_is_valid_2insn_and (x, mode)))
> >         {
> >           *total = COSTS_N_INSNS (1);
> >           return true;
> >
> > It should return COSTS_N_INSNS (2) for that?
>
> No, it should not!
>
>       /* (reg) is costed at zero by rtlanal.c:rtx_cost.  That sets a
> 	 baseline for rtx costs:  If a constant is valid in an insn,
> 	 it is free.  */

From where is this quote?  My "git grep" fails to find it for me
(on master).  It seems like a port-specific commment so I'd
have expected to find it somewhere in config/rs6000.

brgds, H-P
Alan Modra Oct. 24, 2020, 1:04 a.m. UTC | #7
On Fri, Oct 23, 2020 at 07:18:44PM -0400, Hans-Peter Nilsson wrote:
> On Thu, 22 Oct 2020, Alan Modra via Gcc-patches wrote:
> >       /* (reg) is costed at zero by rtlanal.c:rtx_cost.  That sets a
> > 	 baseline for rtx costs:  If a constant is valid in an insn,
> > 	 it is free.  */
> 
> >From where is this quote?  My "git grep" fails to find it for me
> (on master).  It seems like a port-specific commment so I'd
> have expected to find it somewhere in config/rs6000.

It's in a patch in the rs6000_rtx_costs series that hasn't yet been
committed.
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555757.html
diff mbox series

Patch

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 97180bb3819..e870ba0039a 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -21264,16 +21264,13 @@  rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
 	    || outer_code == MINUS)
 	   && (satisfies_constraint_I (x)
 	       || satisfies_constraint_L (x)))
-	  || (outer_code == AND
-	      && (satisfies_constraint_K (x)
-		  || (mode == SImode
-		      ? satisfies_constraint_L (x)
-		      : satisfies_constraint_J (x))))
-	  || ((outer_code == IOR || outer_code == XOR)
+	  || ((outer_code == AND || outer_code == IOR || outer_code == XOR)
 	      && (satisfies_constraint_K (x)
 		  || (mode == SImode
 		      ? satisfies_constraint_L (x)
 		      : satisfies_constraint_J (x))))
+	  || (outer_code == AND
+	      && rs6000_is_valid_and_mask (x, mode))
 	  || outer_code == ASHIFT
 	  || outer_code == ASHIFTRT
 	  || outer_code == LSHIFTRT
@@ -21310,7 +21307,9 @@  rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
 		    || outer_code == IOR
 		    || outer_code == XOR)
 		   && (INTVAL (x)
-		       & ~ (unsigned HOST_WIDE_INT) 0xffffffff) == 0))
+		       & ~ (unsigned HOST_WIDE_INT) 0xffffffff) == 0)
+	       || (outer_code == AND
+		   && rs6000_is_valid_2insn_and (x, mode)))
 	{
 	  *total = COSTS_N_INSNS (1);
 	  return true;
@@ -21448,26 +21447,6 @@  rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
 	      *total += COSTS_N_INSNS (1);
 	      return true;
 	    }
-
-	  /* rotate-and-mask (no rotate), andi., andis.: 1 insn.  */
-	  HOST_WIDE_INT val = INTVAL (XEXP (x, 1));
-	  if (rs6000_is_valid_and_mask (XEXP (x, 1), mode)
-	      || (val & 0xffff) == val
-	      || (val & 0xffff0000) == val
-	      || ((val & 0xffff) == 0 && mode == SImode))
-	    {
-	      *total = rtx_cost (left, mode, AND, 0, speed);
-	      *total += COSTS_N_INSNS (1);
-	      return true;
-	    }
-
-	  /* 2 insns.  */
-	  if (rs6000_is_valid_2insn_and (XEXP (x, 1), mode))
-	    {
-	      *total = rtx_cost (left, mode, AND, 0, speed);
-	      *total += COSTS_N_INSNS (2);
-	      return true;
-	    }
 	}
 
       *total = COSTS_N_INSNS (1);