Message ID | 55B8D9EB.3060806@arm.com |
---|---|
State | New |
Headers | show |
On 07/29/2015 07:49 AM, Kyrill Tkachov wrote: > Hi all, > > This patch improves RTL if-conversion on sequences that perform a > conditional select on integer constants. > Most of the smart logic to do that already exists in the > noce_try_store_flag_constants function. > However, currently that function is tried after noce_try_cmove. > noce_try_cmove is a simple catch-all function that just loads the two > immediates and performs a conditional > select between them. It returns true and then the caller > noce_process_if_block doesn't try any other transformations, > completely skipping the more aggressive transformations that > noce_try_store_flag_constants allows! > > Calling noce_try_store_flag_constants before noce_try_cmove allows for > the smarter if-conversion transformations > to be used. An example that occurs a lot in the gcc code itself is for > the C code: > int > foo (int a, int b) > { > return ((a & (1 << 25)) ? 5 : 4); > } > > i.e. test a bit in a and return 5 or 4. Currently on aarch64 this > generates the naive: > and w2, w0, 33554432 // mask away all bits except bit 25 > mov w1, 4 > cmp w2, wzr > mov w0, 5 > csel w0, w0, w1, ne > > > whereas with this patch this can be transformed into the much better: > ubfx x0, x0, 25, 1 // extract bit 25 > add w0, w0, 4 I suspect the PA would benefit from this as well, probably several other architectures with reasonable bitfield extraction capabilities. > > Another issue I encountered is that the code that claims to perform the > transformation: > /* if (test) x = 3; else x = 4; > => x = 3 + (test == 0); */ > > doesn't seem to do exactly that in all cases. In fact for that case it > will try something like: > x = 4 - (test == 0) > which is suboptimal for targets like aarch64 which have a conditional > increment operation. I vaguely recall targets that don't have const - X insns, but do have X + const style insns. And more generally I think we're better off generating the PLUS version. > This patch tweaks that code to always try to generate an addition of the > condition rather than > a subtraction. > > Anyway, for code: > int > fooinc (int x) > { > return x ? 1025 : 1026; > } > > we currently generate: > mov w2, 1025 > mov w1, 1026 > cmp w0, wzr > csel w0, w2, w1, ne > > whereas with this patch we will generate: > cmp w0, wzr > cset w0, eq > add w0, w0, 1025 > > Bootstrapped and tested on arm, aarch64, x86_64. > Ok for trunk? > > Thanks, > Kyrill > > P.S. noce_try_store_flag_constants is currently gated on > !targetm.have_conditional_execution () but I don't see > any reason to restrict it on targets with conditional execution. For > example, I think the first example above > would show a benefit on arm if it was enabled there. But that can be a > separate investigation. > > 2015-07-29 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > * ifcvt.c (noce_try_store_flag_constants): Reverse when diff is > -STORE_FLAG and condition is reversable. Prefer to add to the > flag value. > (noce_process_if_block): Try noce_try_store_flag_constants before > noce_try_cmove. > > 2015-07-29 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > * gcc.target/aarch64/csel_bfx_1.c: New test. > * gcc.target/aarch64/csel_imms_inc_1.c: Likewise. > > ifcvt-csel-imms.patch > > > commit 0164ef164483bdf0b2f73e267e2ff1df7800dd6d > Author: Kyrylo Tkachov<kyrylo.tkachov@arm.com> > Date: Tue Jul 28 14:59:46 2015 +0100 > > [RTL-ifcvt] Improve conditional increment ops on immediates > > diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c > index a57d78c..80d0285 100644 > --- a/gcc/ifcvt.c > +++ b/gcc/ifcvt.c > @@ -1222,7 +1222,7 @@ noce_try_store_flag_constants (struct noce_if_info *if_info) > > reversep = 0; > if (diff == STORE_FLAG_VALUE || diff == -STORE_FLAG_VALUE) > - normalize = 0; > + normalize = 0, reversep = (diff == -STORE_FLAG_VALUE) && can_reverse; We generally avoid using a ',' operator like this. However, I can see that you're just following existing convention in that code. So I won't object. > else if (ifalse == 0 && exact_log2 (itrue) >= 0 > && (STORE_FLAG_VALUE == 1 > || if_info->branch_cost >= 2)) > @@ -1261,10 +1261,13 @@ noce_try_store_flag_constants (struct noce_if_info *if_info) > => x = 3 + (test == 0); */ > if (diff == STORE_FLAG_VALUE || diff == -STORE_FLAG_VALUE) > { > - target = expand_simple_binop (mode, > - (diff == STORE_FLAG_VALUE > - ? PLUS : MINUS), > - gen_int_mode (ifalse, mode), target, > + rtx_code code = reversep ? PLUS : > + (diff == STORE_FLAG_VALUE ? PLUS > + : MINUS); > + HOST_WIDE_INT to_add = reversep ? MIN (ifalse, itrue) : ifalse; > + > + target = expand_simple_binop (mode, code, > + gen_int_mode (to_add, mode), target, > if_info->x, 0, OPTAB_WIDEN); > } Note that STORE_FLAG_VALUE can potentially be any value. Most targets use "1", but others use -1 (m68k for example, there are others). I'm concerned that the reversep ? MIN (ifalse, itrue) won't do what we want on targets such as the m68k. The logic here is also somewhat convoluted. When reversep is true, we always use PLUS, but is that actually correct? Normally we'd compute the code, then invert the code. ie, if we normally want PLUS, then we've flip to MINUS and vice-versa. > > @@ -3120,13 +3123,14 @@ noce_process_if_block (struct noce_if_info *if_info) > goto success; > if (noce_try_abs (if_info)) > goto success; > + if (!targetm.have_conditional_execution () > + && noce_try_store_flag_constants (if_info)) > + goto success; > if (HAVE_conditional_move > && noce_try_cmove (if_info)) > goto success; > if (! targetm.have_conditional_execution ()) > { > - if (noce_try_store_flag_constants (if_info)) > - goto success; > if (noce_try_addcc (if_info)) > goto success; > if (noce_try_store_flag_mask (if_info)) This part seems fine and ought to be able to go forward independently. Consider this hunk and its associated testcase approved if you want to test it independently and get it installed while we iterate on the other hunks, jeff
commit 0164ef164483bdf0b2f73e267e2ff1df7800dd6d Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com> Date: Tue Jul 28 14:59:46 2015 +0100 [RTL-ifcvt] Improve conditional increment ops on immediates diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c index a57d78c..80d0285 100644 --- a/gcc/ifcvt.c +++ b/gcc/ifcvt.c @@ -1222,7 +1222,7 @@ noce_try_store_flag_constants (struct noce_if_info *if_info) reversep = 0; if (diff == STORE_FLAG_VALUE || diff == -STORE_FLAG_VALUE) - normalize = 0; + normalize = 0, reversep = (diff == -STORE_FLAG_VALUE) && can_reverse; else if (ifalse == 0 && exact_log2 (itrue) >= 0 && (STORE_FLAG_VALUE == 1 || if_info->branch_cost >= 2)) @@ -1261,10 +1261,13 @@ noce_try_store_flag_constants (struct noce_if_info *if_info) => x = 3 + (test == 0); */ if (diff == STORE_FLAG_VALUE || diff == -STORE_FLAG_VALUE) { - target = expand_simple_binop (mode, - (diff == STORE_FLAG_VALUE - ? PLUS : MINUS), - gen_int_mode (ifalse, mode), target, + rtx_code code = reversep ? PLUS : + (diff == STORE_FLAG_VALUE ? PLUS + : MINUS); + HOST_WIDE_INT to_add = reversep ? MIN (ifalse, itrue) : ifalse; + + target = expand_simple_binop (mode, code, + gen_int_mode (to_add, mode), target, if_info->x, 0, OPTAB_WIDEN); } @@ -3120,13 +3123,14 @@ noce_process_if_block (struct noce_if_info *if_info) goto success; if (noce_try_abs (if_info)) goto success; + if (!targetm.have_conditional_execution () + && noce_try_store_flag_constants (if_info)) + goto success; if (HAVE_conditional_move && noce_try_cmove (if_info)) goto success; if (! targetm.have_conditional_execution ()) { - if (noce_try_store_flag_constants (if_info)) - goto success; if (noce_try_addcc (if_info)) goto success; if (noce_try_store_flag_mask (if_info)) diff --git a/gcc/testsuite/gcc.target/aarch64/csel_bfx_1.c b/gcc/testsuite/gcc.target/aarch64/csel_bfx_1.c new file mode 100644 index 0000000..c20597f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/csel_bfx_1.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-save-temps -O2" } */ + +int +foo (int a, int b) +{ + return ((a & (1 << 25)) ? 5 : 4); +} + +/* { dg-final { scan-assembler "ubfx\t\[xw\]\[0-9\]*.*" } } */ +/* { dg-final { scan-assembler-not "csel\tw\[0-9\]*.*" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/csel_imms_inc_1.c b/gcc/testsuite/gcc.target/aarch64/csel_imms_inc_1.c new file mode 100644 index 0000000..2ae434d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/csel_imms_inc_1.c @@ -0,0 +1,42 @@ +/* { dg-do run } */ +/* { dg-options "-save-temps -O2 -fno-inline" } */ + +extern void abort (void); + +int +fooinc (int x) +{ + if (x) + return 1025; + else + return 1026; +} + +int +fooinc2 (int x) +{ + if (x) + return 1026; + else + return 1025; +} + +int +main (void) +{ + if (fooinc (0) != 1026) + abort (); + + if (fooinc (1) != 1025) + abort (); + + if (fooinc2 (0) != 1025) + abort (); + + if (fooinc2 (1) != 1026) + abort (); + + return 0; +} + +/* { dg-final { scan-assembler-not "csel\tw\[0-9\]*.*" } } */