Message ID | 20201007225800.9536-3-amodra@gmail.com |
---|---|
State | New |
Headers | show |
Series | rs6000_rtx_costs V2 | expand |
On Thu, Oct 08, 2020 at 09:27:54AM +1030, Alan Modra wrote: > The existing "case AND" in this function is not sufficient for > optabs.c:avoid_expensive_constant usage, where the AND is passed in > outer_code. We'd like to cost AND of rs6000_is_valid_and_mask > or rs6000_is_valid_2insn_and variety there, so that those masks aren't > seen as expensive (ie. better to load to a reg then AND). > > * config/rs6000/rs6000.c (rs6000_rtx_costs): Combine CONST_INT > AND handling with IOR/XOR. Move costing for AND with > rs6000_is_valid_and_mask or rs6000_is_valid_2insn_and to > CONST_INT. Sorry this took so long to review :-( On 64-bit BE this leads to *bigger* code, and closer observation shows that some common sequences degrade on all configs. This seems to mostly be about "andc" (and its dot form). It wasn't costed properly before, but after your patch, a single instruction is replaced by three. Could you look into this? Segher
On Tue, Oct 20, 2020 at 01:55:56PM -0500, Segher Boessenkool wrote: > On Thu, Oct 08, 2020 at 09:27:54AM +1030, Alan Modra wrote: > > The existing "case AND" in this function is not sufficient for > > optabs.c:avoid_expensive_constant usage, where the AND is passed in > > outer_code. We'd like to cost AND of rs6000_is_valid_and_mask > > or rs6000_is_valid_2insn_and variety there, so that those masks aren't > > seen as expensive (ie. better to load to a reg then AND). > > > > * config/rs6000/rs6000.c (rs6000_rtx_costs): Combine CONST_INT > > AND handling with IOR/XOR. Move costing for AND with > > rs6000_is_valid_and_mask or rs6000_is_valid_2insn_and to > > CONST_INT. > > Sorry this took so long to review :-( > > On 64-bit BE this leads to *bigger* code, and closer observation shows > that some common sequences degrade on all configs. This seems to mostly > be about "andc" (and its dot form). It wasn't costed properly before, > but after your patch, a single instruction is replaced by three. > > Could you look into this? ~/build/gcc-alan/gcc$ for z in *.o; do if test `objdump -dr $z | grep andc | wc -l` != `objdump -dr ../../gcc/gcc/$z | grep andc | wc -l`; then echo $z; fi; done gimplify.o insn-emit.o insn-opinit.o insn-recog.o rs6000-string.o All of these are exactly the case I talked about in https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553919.html "Sometimes correct insn cost leads to unexpected results. For example: extern unsigned bar (void); unsigned f1 (unsigned a) { if ((a & 0x01000200) == 0x01000200) return bar (); return 0; } emits for a & 0x01000200 (set (reg) (and (reg) (const_int 0x01000200))) at expand time (two rlwinm insns) rather than the older (set (reg) (const_int 0x01000200)) (set (reg) (and (reg) (reg))) which is three insns. However, since 0x01000200 is needed later the older code after optimisation is smaller." Things have changed slightly since I wrote the above, with the two rlwinm insns being emitted at expand time, so you see (set (reg) (and (reg) (const_int 0xffffffffff0003ff))) (set (reg) (and (reg) (const_int 0x0000000001fffe00))) but of course that doesn't change anything regarding the cost of "a & 0x01000200".
On Wed, Oct 21, 2020 at 01:27:42PM +1030, Alan Modra wrote: > On Tue, Oct 20, 2020 at 01:55:56PM -0500, Segher Boessenkool wrote: > > On Thu, Oct 08, 2020 at 09:27:54AM +1030, Alan Modra wrote: > > > The existing "case AND" in this function is not sufficient for > > > optabs.c:avoid_expensive_constant usage, where the AND is passed in > > > outer_code. We'd like to cost AND of rs6000_is_valid_and_mask > > > or rs6000_is_valid_2insn_and variety there, so that those masks aren't > > > seen as expensive (ie. better to load to a reg then AND). > > > > > > * config/rs6000/rs6000.c (rs6000_rtx_costs): Combine CONST_INT > > > AND handling with IOR/XOR. Move costing for AND with > > > rs6000_is_valid_and_mask or rs6000_is_valid_2insn_and to > > > CONST_INT. > > > > Sorry this took so long to review :-( > > > > On 64-bit BE this leads to *bigger* code, and closer observation shows > > that some common sequences degrade on all configs. This seems to mostly > > be about "andc" (and its dot form). It wasn't costed properly before, > > but after your patch, a single instruction is replaced by three. > > > > Could you look into this? > > ~/build/gcc-alan/gcc$ for z in *.o; do if test `objdump -dr $z | grep andc | wc -l` != `objdump -dr ../../gcc/gcc/$z | grep andc | wc -l`; then echo $z; fi; done > gimplify.o > insn-emit.o > insn-opinit.o > insn-recog.o > rs6000-string.o > > All of these are exactly the case I talked about in > https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553919.html For a kernel build (my testcase) it happens more often. > "Sometimes correct insn cost leads to unexpected results. For > example: > > extern unsigned bar (void); > unsigned > f1 (unsigned a) > { > if ((a & 0x01000200) == 0x01000200) > return bar (); > return 0; > } > > emits for a & 0x01000200 > (set (reg) (and (reg) (const_int 0x01000200))) > at expand time (two rlwinm insns) rather than the older > (set (reg) (const_int 0x01000200)) > (set (reg) (and (reg) (reg))) And that is bad. Why on earth does expand "optimise" this? It should not, it hinders various *real* optimisations! > which is three insns. However, since 0x01000200 is needed later the > older code after optimisation is smaller." > > Things have changed slightly since I wrote the above, with the two > rlwinm insns being emitted at expand time, so you see > (set (reg) (and (reg) (const_int 0xffffffffff0003ff))) > (set (reg) (and (reg) (const_int 0x0000000001fffe00))) It has done that for many years? > but of course that doesn't change anything regarding the cost of > "a & 0x01000200". Yeah. But the problem is that cost that are "better", "closer to reality", sometimes result in worse results :-( Anyway: + || (outer_code == AND + && rs6000_is_valid_2insn_and (x, mode))) { *total = COSTS_N_INSNS (1); return true; It should return COSTS_N_INSNS (2) for that? Testing with that now. Segher
On Wed, Oct 21, 2020 at 03:29:11PM -0500, Segher Boessenkool wrote: > Anyway: > > + || (outer_code == AND > + && rs6000_is_valid_2insn_and (x, mode))) > { > *total = COSTS_N_INSNS (1); > return true; > > It should return COSTS_N_INSNS (2) for that? No, it should not! /* (reg) is costed at zero by rtlanal.c:rtx_cost. That sets a baseline for rtx costs: If a constant is valid in an insn, it is free. */
On Thu, Oct 22, 2020 at 08:41:50AM +1030, Alan Modra wrote: > On Wed, Oct 21, 2020 at 03:29:11PM -0500, Segher Boessenkool wrote: > > Anyway: > > > > + || (outer_code == AND > > + && rs6000_is_valid_2insn_and (x, mode))) > > { > > *total = COSTS_N_INSNS (1); > > return true; > > > > It should return COSTS_N_INSNS (2) for that? > > No, it should not! > > /* (reg) is costed at zero by rtlanal.c:rtx_cost. That sets a > baseline for rtx costs: If a constant is valid in an insn, > it is free. */ Yeah, this keeps tripping me up. The patch is okay for trunk (as I told you offline). Thanks! The regressions are really from existing bad behaviour in expand :-( Segher
On Thu, 22 Oct 2020, Alan Modra via Gcc-patches wrote: Hi! > On Wed, Oct 21, 2020 at 03:29:11PM -0500, Segher Boessenkool wrote: > > Anyway: > > > > + || (outer_code == AND > > + && rs6000_is_valid_2insn_and (x, mode))) > > { > > *total = COSTS_N_INSNS (1); > > return true; > > > > It should return COSTS_N_INSNS (2) for that? > > No, it should not! > > /* (reg) is costed at zero by rtlanal.c:rtx_cost. That sets a > baseline for rtx costs: If a constant is valid in an insn, > it is free. */ From where is this quote? My "git grep" fails to find it for me (on master). It seems like a port-specific commment so I'd have expected to find it somewhere in config/rs6000. brgds, H-P
On Fri, Oct 23, 2020 at 07:18:44PM -0400, Hans-Peter Nilsson wrote: > On Thu, 22 Oct 2020, Alan Modra via Gcc-patches wrote: > > /* (reg) is costed at zero by rtlanal.c:rtx_cost. That sets a > > baseline for rtx costs: If a constant is valid in an insn, > > it is free. */ > > >From where is this quote? My "git grep" fails to find it for me > (on master). It seems like a port-specific commment so I'd > have expected to find it somewhere in config/rs6000. It's in a patch in the rs6000_rtx_costs series that hasn't yet been committed. https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555757.html
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 97180bb3819..e870ba0039a 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -21264,16 +21264,13 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code, || outer_code == MINUS) && (satisfies_constraint_I (x) || satisfies_constraint_L (x))) - || (outer_code == AND - && (satisfies_constraint_K (x) - || (mode == SImode - ? satisfies_constraint_L (x) - : satisfies_constraint_J (x)))) - || ((outer_code == IOR || outer_code == XOR) + || ((outer_code == AND || outer_code == IOR || outer_code == XOR) && (satisfies_constraint_K (x) || (mode == SImode ? satisfies_constraint_L (x) : satisfies_constraint_J (x)))) + || (outer_code == AND + && rs6000_is_valid_and_mask (x, mode)) || outer_code == ASHIFT || outer_code == ASHIFTRT || outer_code == LSHIFTRT @@ -21310,7 +21307,9 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code, || outer_code == IOR || outer_code == XOR) && (INTVAL (x) - & ~ (unsigned HOST_WIDE_INT) 0xffffffff) == 0)) + & ~ (unsigned HOST_WIDE_INT) 0xffffffff) == 0) + || (outer_code == AND + && rs6000_is_valid_2insn_and (x, mode))) { *total = COSTS_N_INSNS (1); return true; @@ -21448,26 +21447,6 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code, *total += COSTS_N_INSNS (1); return true; } - - /* rotate-and-mask (no rotate), andi., andis.: 1 insn. */ - HOST_WIDE_INT val = INTVAL (XEXP (x, 1)); - if (rs6000_is_valid_and_mask (XEXP (x, 1), mode) - || (val & 0xffff) == val - || (val & 0xffff0000) == val - || ((val & 0xffff) == 0 && mode == SImode)) - { - *total = rtx_cost (left, mode, AND, 0, speed); - *total += COSTS_N_INSNS (1); - return true; - } - - /* 2 insns. */ - if (rs6000_is_valid_2insn_and (XEXP (x, 1), mode)) - { - *total = rtx_cost (left, mode, AND, 0, speed); - *total += COSTS_N_INSNS (2); - return true; - } } *total = COSTS_N_INSNS (1);