Message ID | 55E092D9.8070700@arm.com |
---|---|
State | New |
Headers | show |
(not a review, I haven't even read the whole patch) On Fri, 28 Aug 2015, Andre Vieira wrote: > 2015-08-03 Andre Vieira <andre.simoesdiasvieira@arm.com> > > * match.pd: Added new patterns: > ((X {&,<<,>>} C0) {|,^} C1) {^,|} C2) > (X {|,^,&} C0) {<<,>>} C1 -> (X {<<,>>} C1) {|,^,&} (C0 {<<,>>} C1) +(for op0 (rshift rshift lshift lshift bit_and bit_and) + op1 (bit_ior bit_xor bit_ior bit_xor bit_ior bit_xor) + op2 (bit_xor bit_ior bit_xor bit_ior bit_xor bit_ior) You can nest for-loops, it seems clearer as: (for op0 (rshift lshift bit_and) (for op1 (bit_ior bit_xor) op2 (bit_xor bit_ior) +(simplify + (op2:c + (op1:c + (op0 @0 INTEGER_CST@1) INTEGER_CST@2) INTEGER_CST@3) I suspect you will want more :s (single_use) and less :c (canonicalization should put constants in second position). + C1 = wi::bit_and_not (C1,C2); Space after ','. Having wide_int_storage in many places is surprising, I can't find similar code anywhere else in gcc.
Hi Marc, On 28/08/15 19:07, Marc Glisse wrote: > (not a review, I haven't even read the whole patch) > > On Fri, 28 Aug 2015, Andre Vieira wrote: > >> 2015-08-03 Andre Vieira <andre.simoesdiasvieira@arm.com> >> >> * match.pd: Added new patterns: >> ((X {&,<<,>>} C0) {|,^} C1) {^,|} C2) >> (X {|,^,&} C0) {<<,>>} C1 -> (X {<<,>>} C1) {|,^,&} (C0 {<<,>>} C1) > > +(for op0 (rshift rshift lshift lshift bit_and bit_and) > + op1 (bit_ior bit_xor bit_ior bit_xor bit_ior bit_xor) > + op2 (bit_xor bit_ior bit_xor bit_ior bit_xor bit_ior) > > You can nest for-loops, it seems clearer as: > (for op0 (rshift lshift bit_and) > (for op1 (bit_ior bit_xor) > op2 (bit_xor bit_ior) Will do, thank you for pointing it out. > > +(simplify > + (op2:c > + (op1:c > + (op0 @0 INTEGER_CST@1) INTEGER_CST@2) INTEGER_CST@3) > > I suspect you will want more :s (single_use) and less :c (canonicalization > should put constants in second position). > I can't find the definition of :s (single_use). GCC internals do point out that canonicalization does put constants in the second position, didnt see that first. Thank you for pointing it out. > + C1 = wi::bit_and_not (C1,C2); > > Space after ','. > Will do. > Having wide_int_storage in many places is surprising, I can't find similar > code anywhere else in gcc. > > I tried looking for examples of something similar, I think I ended up using wide_int because I was able to convert easily to and from it and it has the "mask" and "wide_int_to_tree" functions. I welcome any suggestions on what I should be using here for integer constant transformations and comparisons. BR, Andre
On Tue, Sep 1, 2015 at 3:40 PM, Andre Vieira <Andre.SimoesDiasVieira@arm.com> wrote: > Hi Marc, > > On 28/08/15 19:07, Marc Glisse wrote: >> >> (not a review, I haven't even read the whole patch) >> >> On Fri, 28 Aug 2015, Andre Vieira wrote: >> >>> 2015-08-03 Andre Vieira <andre.simoesdiasvieira@arm.com> >>> >>> * match.pd: Added new patterns: >>> ((X {&,<<,>>} C0) {|,^} C1) {^,|} C2) >>> (X {|,^,&} C0) {<<,>>} C1 -> (X {<<,>>} C1) {|,^,&} (C0 {<<,>>} C1) >> >> >> +(for op0 (rshift rshift lshift lshift bit_and bit_and) >> + op1 (bit_ior bit_xor bit_ior bit_xor bit_ior bit_xor) >> + op2 (bit_xor bit_ior bit_xor bit_ior bit_xor bit_ior) >> >> You can nest for-loops, it seems clearer as: >> (for op0 (rshift lshift bit_and) >> (for op1 (bit_ior bit_xor) >> op2 (bit_xor bit_ior) > > > Will do, thank you for pointing it out. >> >> >> +(simplify >> + (op2:c >> + (op1:c >> + (op0 @0 INTEGER_CST@1) INTEGER_CST@2) INTEGER_CST@3) >> >> I suspect you will want more :s (single_use) and less :c (canonicalization >> should put constants in second position). >> > I can't find the definition of :s (single_use). Sorry for that - didn't get along updating it yet :/ It restricts matching to sub-expressions that have a single-use. So + a &= 0xd123; + unsigned short tem = a ^ 0x6040; + a = tem | 0xc031; /* Simplify _not_ to ((a & 0xd123) | 0xe071). */ ... use of tem ... we shouldn't do the simplifcation here because the expression (a & 0x123) ^ 0x6040 is kept live by 'tem'. > GCC internals do point out > that canonicalization does put constants in the second position, didnt see > that first. Thank you for pointing it out. > >> + C1 = wi::bit_and_not (C1,C2); >> >> Space after ','. >> > Will do. > >> Having wide_int_storage in many places is surprising, I can't find similar >> code anywhere else in gcc. >> >> > > I tried looking for examples of something similar, I think I ended up using > wide_int because I was able to convert easily to and from it and it has the > "mask" and "wide_int_to_tree" functions. I welcome any suggestions on what I > should be using here for integer constant transformations and comparisons. Using wide-ints is fine, but you shouldn't need 'wide_int_storage' constructors - those are indeed odd. Is it just for the initializers of wide-ints? + wide_int zero_mask = wi::zero (prec); + wide_int C0 = wide_int_storage (@1); + wide_int C1 = wide_int_storage (@2); + wide_int C2 = wide_int_storage (@3); ... + zero_mask = wide_int_storage (wi::mask (C0.to_uhwi (), false, prec)); tree_to_uhwi (@1) should do the trick as well + C1 = wi::bit_and_not (C1,C2); + cst_emit = wi::bit_or (C1, C2); the ops should be replacable with @2 and @3, the store to C1 obviously not (but you can use a tree temporary and use wide_int_to_tree here to avoid the back-and-forth for the case where C1 is not assigned to). Note that transforms only doing association are prone to endless recursion in case some other pattern does the reverse op... Richard. > BR, > Andre >
On Fri, 2015-08-28 at 17:56 +0100, Andre Vieira wrote: [..snip..] > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-33.c b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-33.c > new file mode 100644 > index 0000000000000000000000000000000000000000..984d8b37a01defe0e6852070a7dfa7ace5027c51 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-33.c > @@ -0,0 +1,42 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O -fdump-tree-forwprop1" } */ > + > +unsigned short > +foo (unsigned short a) > +{ > + a ^= 0x4002; > + a >>= 1; > + a |= 0x8000; /* Simplify to ((a >> 1) ^ 0xa001). */ > + return a; > +} > + > +unsigned short > +bar (unsigned short a) > +{ > + a |= 0x4002; > + a <<= 1; > + a ^= 0x0001; /* Simplify to ((a << 1) | 0x8005). */ > + return a; > +} > + > +unsigned short > +baz (unsigned short a) > +{ > + a &= 0xd123; > + a ^= 0x6040; > + a |= 0xc031; /* Simplify to ((a & 0xd123) | 0xe071). */ > + return a; > +} > + > +short > +qux (short a) > +{ > + a ^= 0x8002; > + a >>= 1; > + a |= 0x8000; /* Only move shift inward: (((a >> 1) ^ 0x4001 |) 0x8000). */ > + return a; > +} > +/* { dg-final { scan-tree-dump "\\^ 40961" "forwprop1" } } */ > +/* { dg-final { scan-tree-dump "\\| 32773" "forwprop1" } } */ > +/* { dg-final { scan-tree-dump "\\| 57457" "forwprop1" } } */ > +/* { dg-final { scan-tree-dump "\\^ -16383" "forwprop1" } } */ > -- I can't comment on the patch itself, but I noticed that in the testsuite addition, you've gathered all the "dg-final" clauses at the end. I think that this is consistent with existing practice in gcc, but AFAIK, the dg-final clauses can appear anywhere in the file. Would it make test cases more readable if we put the dg-final clauses next to the code that they relate to? Or, at least after the function that they relate to? For example: unsigned short foo (unsigned short a) { a ^= 0x4002; a >>= 1; a |= 0x8000; /* Simplify to ((a >> 1) ^ 0xa001). */ return a; } /* { dg-final { scan-tree-dump "\\^ 40961" "forwprop1" } } */ unsigned short bar (unsigned short a) { /* snip */ /* Simplify to ((a << 1) | 0x8005). */ } /* { dg-final { scan-tree-dump "\\| 32773" "forwprop1" } } */ ... etc ... I think this may be more readable than the "group them all at the end of the file approach", especially with testcases with many files and dg-final directives. Hope this is constructive and not too "bikesheddy" Dave
On Tue, Sep 01, 2015 at 12:50:27PM -0400, David Malcolm wrote: > I can't comment on the patch itself, but I noticed that in the testsuite > addition, you've gathered all the "dg-final" clauses at the end. > > I think that this is consistent with existing practice in gcc, but > AFAIK, the dg-final clauses can appear anywhere in the file. > > Would it make test cases more readable if we put the dg-final clauses > next to the code that they relate to? Or, at least after the function > that they relate to? > > For example: > > unsigned short > foo (unsigned short a) > { > a ^= 0x4002; > a >>= 1; > a |= 0x8000; /* Simplify to ((a >> 1) ^ 0xa001). */ > return a; > } > /* { dg-final { scan-tree-dump "\\^ 40961" "forwprop1" } } */ > > unsigned short > bar (unsigned short a) > { > /* snip */ /* Simplify to ((a << 1) | 0x8005). */ > } > /* { dg-final { scan-tree-dump "\\| 32773" "forwprop1" } } */ > > ... etc ... > > I think this may be more readable than the "group them all at the end of > the file approach", especially with testcases with many files and > dg-final directives. Yeah, it's probably somewhat more readable. Same for dg-output. E.g. some ubsan tests already use dg-outputs after functions they relate to. Marek
On 01/09/15 17:54, Marek Polacek wrote: > On Tue, Sep 01, 2015 at 12:50:27PM -0400, David Malcolm wrote: >> I can't comment on the patch itself, but I noticed that in the testsuite >> addition, you've gathered all the "dg-final" clauses at the end. >> >> I think that this is consistent with existing practice in gcc, but >> AFAIK, the dg-final clauses can appear anywhere in the file. >> >> Would it make test cases more readable if we put the dg-final clauses >> next to the code that they relate to? Or, at least after the function >> that they relate to? >> >> For example: >> >> unsigned short >> foo (unsigned short a) >> { >> a ^= 0x4002; >> a >>= 1; >> a |= 0x8000; /* Simplify to ((a >> 1) ^ 0xa001). */ >> return a; >> } >> /* { dg-final { scan-tree-dump "\\^ 40961" "forwprop1" } } */ >> >> unsigned short >> bar (unsigned short a) >> { >> /* snip */ /* Simplify to ((a << 1) | 0x8005). */ >> } >> /* { dg-final { scan-tree-dump "\\| 32773" "forwprop1" } } */ >> >> ... etc ... >> >> I think this may be more readable than the "group them all at the end of >> the file approach", especially with testcases with many files and >> dg-final directives. > > Yeah, it's probably somewhat more readable. Same for dg-output. E.g. some > ubsan tests already use dg-outputs after functions they relate to. > > Marek > If no one objects Ill make those changes too. Sounds reasonable to me. Andre
From 15f86df5b3561edf26ae79cedbe160fd46596fd9 Mon Sep 17 00:00:00 2001 From: Andre Simoes Dias Vieira <andsim01@arm.com> Date: Wed, 26 Aug 2015 16:27:31 +0100 Subject: [PATCH] algorithmic optimization --- gcc/match.pd | 70 +++++++++++++++++++++++++++++ gcc/testsuite/gcc.dg/tree-ssa/forwprop-33.c | 42 +++++++++++++++++ 2 files changed, 112 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/forwprop-33.c diff --git a/gcc/match.pd b/gcc/match.pd index eb0ba9d10a9b8ca66c23c56da0678477379daf80..3d9a8f52713e8dfb2189aad76bce709c924fa286 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -708,6 +708,76 @@ along with GCC; see the file COPYING3. If not see && tree_nop_conversion_p (type, TREE_TYPE (@1))) (convert (bit_and (bit_not @1) @0)))) +/* (X bit_op C0) rshift C1 -> (X rshift C0) bit_op (C0 rshift C1) */ +(for bit_op (bit_ior bit_xor bit_and) +(simplify + (rshift (bit_op:c @0 INTEGER_CST@1) INTEGER_CST@2) + (bit_op + (rshift @0 @2) + { wide_int_to_tree (type, wi::rshift (@1, @2, TYPE_SIGN (type))); }))) + +/* (X bit_op C0) lshift C1 -> (X lshift C0) bit_op (C0 lshift C1) */ +(for bit_op (bit_ior bit_xor bit_and) +(simplify + (lshift (bit_op:c @0 INTEGER_CST@1) INTEGER_CST@2) + (bit_op + (lshift @0 @2) + { wide_int_to_tree (type, wi::lshift (@1, @2)); }))) + + +/* ((X op0 C0) op1 C1) op2 C2) + with op0 = {&, >>, <<}, op1 = {|,^}, op2 = {|,^} and op1 != op2 + zero_mask has 1's for all bits that are sure to be 0 in (X op0 C0) + and 0's otherwise. + if (op1 == '^') C1 &= ~C2; + if ((C1 & ~zero_mask) == 0) then emit (X op0 C0) op2 (C1 op2 C2) + if ((C2 & ~zero_mask) == 0) then emit (X op0 C0) op1 (C1 op2 C2) +*/ +(for op0 (rshift rshift lshift lshift bit_and bit_and) + op1 (bit_ior bit_xor bit_ior bit_xor bit_ior bit_xor) + op2 (bit_xor bit_ior bit_xor bit_ior bit_xor bit_ior) +(simplify + (op2:c + (op1:c + (op0 @0 INTEGER_CST@1) INTEGER_CST@2) INTEGER_CST@3) + (with + { + unsigned int prec = TYPE_PRECISION (type); + wide_int zero_mask = wi::zero (prec); + wide_int C0 = wide_int_storage (@1); + wide_int C1 = wide_int_storage (@2); + wide_int C2 = wide_int_storage (@3); + wide_int cst_emit; + + if (op0 == BIT_AND_EXPR) + { + zero_mask = wide_int_storage (wi::neg (@1)); + } + else if (op0 == LSHIFT_EXPR && wi::fits_uhwi_p (@1)) + { + zero_mask = wide_int_storage (wi::mask (C0.to_uhwi (), false, prec)); + } + else if (op0 == RSHIFT_EXPR && TYPE_UNSIGNED (type) && wi::fits_uhwi_p (@1)) + { + unsigned HOST_WIDE_INT m = prec - C0.to_uhwi (); + zero_mask = wide_int_storage (wi::mask (m, true, prec)); + } + + if (op1 == BIT_XOR_EXPR) + { + C1 = wi::bit_and_not (C1,C2); + cst_emit = wi::bit_or (C1, C2); + } + else + { + cst_emit = wi::bit_xor (C1, C2); + } + } + (if ((C1 & ~zero_mask) == 0) + (op2 (op0 @0 @1) { wide_int_to_tree (type, cst_emit); }) + (if ((C2 & ~zero_mask) == 0) + (op1 (op0 @0 @1) { wide_int_to_tree (type, cst_emit); })))))) + /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)). */ (simplify (pointer_plus (pointer_plus:s @0 @1) @3) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-33.c b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-33.c new file mode 100644 index 0000000000000000000000000000000000000000..984d8b37a01defe0e6852070a7dfa7ace5027c51 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-33.c @@ -0,0 +1,42 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-forwprop1" } */ + +unsigned short +foo (unsigned short a) +{ + a ^= 0x4002; + a >>= 1; + a |= 0x8000; /* Simplify to ((a >> 1) ^ 0xa001). */ + return a; +} + +unsigned short +bar (unsigned short a) +{ + a |= 0x4002; + a <<= 1; + a ^= 0x0001; /* Simplify to ((a << 1) | 0x8005). */ + return a; +} + +unsigned short +baz (unsigned short a) +{ + a &= 0xd123; + a ^= 0x6040; + a |= 0xc031; /* Simplify to ((a & 0xd123) | 0xe071). */ + return a; +} + +short +qux (short a) +{ + a ^= 0x8002; + a >>= 1; + a |= 0x8000; /* Only move shift inward: (((a >> 1) ^ 0x4001 |) 0x8000). */ + return a; +} +/* { dg-final { scan-tree-dump "\\^ 40961" "forwprop1" } } */ +/* { dg-final { scan-tree-dump "\\| 32773" "forwprop1" } } */ +/* { dg-final { scan-tree-dump "\\| 57457" "forwprop1" } } */ +/* { dg-final { scan-tree-dump "\\^ -16383" "forwprop1" } } */ -- 1.9.1