Message ID | CAMZc-byf4rYBdOtgC2enpVymG6iRcrdyRPYHCUfn6J=F72jfCg@mail.gmail.com |
---|---|
State | New |
Headers | show |
Series | Simplify (view_convert ~a) < 0 to (view_convert a) >= 0 [PR middle-end/100738] | expand |
On Mon, May 31, 2021 at 10:21 PM Hongtao Liu via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > Hi: > This patch is about to simplify (view_convert:type ~a) < 0 to > (view_convert:type a) >= 0 when type is signed integer. Similar for > (view_convert:type ~a) >= 0. > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. > Ok for the trunk? Why not just do instead: /* ((view_convert)~a) is just ~(view_convert)a . */ (simplify (view_convert (bit_not @0)) (if ((VECTOR_TYPE_P (type) && INTEGRAL_TYPE_P (TREE_TYPE (type))) || INTEGRAL_TYPE_P (type)) (bit_not (view_convert @0)))) And then the other patterns for converting ~a < 0 to a >= 0 should happen (if they don't add a few) (for wrapping types). Thanks, Andrew Pinski > > gcc/ChangeLog: > > PR middle-end/100738 > * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0, > (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE > simplification. > > gcc/testsuite/ChangeLog: > > PR middle-end/100738 > * g++.target/i386/avx2-pr100738-1.C: New test. > * g++.target/i386/sse4_1-pr100738-1.C: New test. > > -- > BR, > Hongtao
On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote: > Hi: > This patch is about to simplify (view_convert:type ~a) < 0 to > (view_convert:type a) >= 0 when type is signed integer. Similar for > (view_convert:type ~a) >= 0. > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. > Ok for the trunk? > > gcc/ChangeLog: > > PR middle-end/100738 > * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0, > (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE > simplification. We already have /* Fold ~X op C as X op' ~C, where op' is the swapped comparison. */ (for cmp (simple_comparison) scmp (swapped_simple_comparison) (simplify (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1) (if (single_use (@2) && (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST)) (scmp @0 (bit_not @1))))) Would it make sense to try and generalize it a bit, say with (cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P) (scmp (view_convert:XXX @0) (bit_not @1)) (I still believe that it is a bad idea that SSA_NAMEs are strongly typed, encoding the type in operations would be more convenient, but I think the time for that choice has long gone)
On Tue, Jun 1, 2021 at 1:29 PM Andrew Pinski <pinskia@gmail.com> wrote: > > On Mon, May 31, 2021 at 10:21 PM Hongtao Liu via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: > > > > Hi: > > This patch is about to simplify (view_convert:type ~a) < 0 to > > (view_convert:type a) >= 0 when type is signed integer. Similar for > > (view_convert:type ~a) >= 0. > > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. > > Ok for the trunk? > > Why not just do instead: > > /* ((view_convert)~a) is just ~(view_convert)a . */ > (simplify > (view_convert (bit_not @0)) > (if ((VECTOR_TYPE_P (type) && INTEGRAL_TYPE_P (TREE_TYPE (type))) > || INTEGRAL_TYPE_P (type)) > (bit_not (view_convert @0)))) > > And then the other patterns for converting ~a < 0 to a >= 0 should > happen (if they don't add a few) (for wrapping types). > It fails c11-atomic-exec-3.c where @0 is _Bool and type is unsigned char. > Thanks, > Andrew Pinski > > > > > gcc/ChangeLog: > > > > PR middle-end/100738 > > * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0, > > (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE > > simplification. > > > > gcc/testsuite/ChangeLog: > > > > PR middle-end/100738 > > * g++.target/i386/avx2-pr100738-1.C: New test. > > * g++.target/i386/sse4_1-pr100738-1.C: New test. > > > > -- > > BR, > > Hongtao
On Tue, Jun 1, 2021 at 6:17 PM Marc Glisse <marc.glisse@inria.fr> wrote: > > On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote: > > > Hi: > > This patch is about to simplify (view_convert:type ~a) < 0 to > > (view_convert:type a) >= 0 when type is signed integer. Similar for > > (view_convert:type ~a) >= 0. > > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. > > Ok for the trunk? > > > > gcc/ChangeLog: > > > > PR middle-end/100738 > > * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0, > > (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE > > simplification. > > We already have > > /* Fold ~X op C as X op' ~C, where op' is the swapped comparison. */ > (for cmp (simple_comparison) > scmp (swapped_simple_comparison) > (simplify > (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1) > (if (single_use (@2) > && (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST)) > (scmp @0 (bit_not @1))))) > > Would it make sense to try and generalize it a bit, say with > > (cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P) > > (scmp (view_convert:XXX @0) (bit_not @1)) > Thanks for your advice, it looks great. And can I use *view_convert1?* instead of *nop_convert1?* here, because the original case is view_convert, and nop_convert would fail to simplify the case. > (I still believe that it is a bad idea that SSA_NAMEs are strongly typed, > encoding the type in operations would be more convenient, but I think the > time for that choice has long gone) > > -- > Marc Glisse
On Fri, Jun 4, 2021 at 1:01 PM Hongtao Liu <crazylht@gmail.com> wrote: > > On Tue, Jun 1, 2021 at 6:17 PM Marc Glisse <marc.glisse@inria.fr> wrote: > > > > On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote: > > > > > Hi: > > > This patch is about to simplify (view_convert:type ~a) < 0 to > > > (view_convert:type a) >= 0 when type is signed integer. Similar for > > > (view_convert:type ~a) >= 0. > > > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. > > > Ok for the trunk? > > > > > > gcc/ChangeLog: > > > > > > PR middle-end/100738 > > > * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0, > > > (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE > > > simplification. > > > > We already have > > > > /* Fold ~X op C as X op' ~C, where op' is the swapped comparison. */ > > (for cmp (simple_comparison) > > scmp (swapped_simple_comparison) > > (simplify > > (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1) > > (if (single_use (@2) > > && (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST)) > > (scmp @0 (bit_not @1))))) > > > > Would it make sense to try and generalize it a bit, say with > > > > (cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P) > > > > (scmp (view_convert:XXX @0) (bit_not @1)) > > > Thanks for your advice, it looks great. > And can I use *view_convert1?* instead of *nop_convert1?* here, > because the original case is view_convert, and nop_convert would fail > to simplify the case. Here is updated patch gcc/ChangeLog: PR middle-end/100738 * match.pd (Fold ~X op C as X op' ~C): Extend GIMPLE simplification to handle view_convert ~X. gcc/testsuite/ChangeLog: PR middle-end/100738 * g++.target/i386/avx2-pr100738-1.C: New test. * g++.target/i386/sse4_1-pr100738-1.C: New test. > > (I still believe that it is a bad idea that SSA_NAMEs are strongly typed, > > encoding the type in operations would be more convenient, but I think the > > time for that choice has long gone) > > > > -- > > Marc Glisse > > > > -- > BR, > Hongtao
On Fri, 4 Jun 2021, Hongtao Liu via Gcc-patches wrote: > On Tue, Jun 1, 2021 at 6:17 PM Marc Glisse <marc.glisse@inria.fr> wrote: >> >> On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote: >> >>> Hi: >>> This patch is about to simplify (view_convert:type ~a) < 0 to >>> (view_convert:type a) >= 0 when type is signed integer. Similar for >>> (view_convert:type ~a) >= 0. >>> Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. >>> Ok for the trunk? >>> >>> gcc/ChangeLog: >>> >>> PR middle-end/100738 >>> * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0, >>> (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE >>> simplification. >> >> We already have >> >> /* Fold ~X op C as X op' ~C, where op' is the swapped comparison. */ >> (for cmp (simple_comparison) >> scmp (swapped_simple_comparison) >> (simplify >> (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1) >> (if (single_use (@2) >> && (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST)) >> (scmp @0 (bit_not @1))))) >> >> Would it make sense to try and generalize it a bit, say with >> >> (cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P) >> >> (scmp (view_convert:XXX @0) (bit_not @1)) >> > Thanks for your advice, it looks great. > And can I use *view_convert1?* instead of *nop_convert1?* here, > because the original case is view_convert, and nop_convert would fail > to simplify the case. Near the top of match.pd, you can see /* With nop_convert? combine convert? and view_convert? in one pattern plus conditionalize on tree_nop_conversion_p conversions. */ (match (nop_convert @0) (convert @0) (if (tree_nop_conversion_p (type, TREE_TYPE (@0))))) (match (nop_convert @0) (view_convert @0) (if (VECTOR_TYPE_P (type) && VECTOR_TYPE_P (TREE_TYPE (@0)) && known_eq (TYPE_VECTOR_SUBPARTS (type), TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0))) && tree_nop_conversion_p (TREE_TYPE (type), TREE_TYPE (TREE_TYPE (@0)))))) So at least the intention is that it can handle both NOP_EXPR for scalars and VIEW_CONVERT_EXPR for vectors, and I think we alread use it that way in some places in match.pd, like (simplify (negate (nop_convert? (bit_not @0))) (plus (view_convert @0) { build_each_one_cst (type); })) (simplify (bit_xor:c (nop_convert?:s (bit_not:s @0)) @1) (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) (bit_not (bit_xor (view_convert @0) @1)))) (the 'if' seems redundant for this one) (simplify (negate (nop_convert? (negate @1))) (if (!TYPE_OVERFLOW_SANITIZED (type) && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@1))) (view_convert @1))) etc. At some point this got some genmatch help, to handle '?' and numbers, so I don't remember all the details, but following these examples should work.
On Fri, Jun 4, 2021 at 4:18 PM Marc Glisse <marc.glisse@inria.fr> wrote: > > On Fri, 4 Jun 2021, Hongtao Liu via Gcc-patches wrote: > > > On Tue, Jun 1, 2021 at 6:17 PM Marc Glisse <marc.glisse@inria.fr> wrote: > >> > >> On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote: > >> > >>> Hi: > >>> This patch is about to simplify (view_convert:type ~a) < 0 to > >>> (view_convert:type a) >= 0 when type is signed integer. Similar for > >>> (view_convert:type ~a) >= 0. > >>> Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. > >>> Ok for the trunk? > >>> > >>> gcc/ChangeLog: > >>> > >>> PR middle-end/100738 > >>> * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0, > >>> (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE > >>> simplification. > >> > >> We already have > >> > >> /* Fold ~X op C as X op' ~C, where op' is the swapped comparison. */ > >> (for cmp (simple_comparison) > >> scmp (swapped_simple_comparison) > >> (simplify > >> (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1) > >> (if (single_use (@2) > >> && (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST)) > >> (scmp @0 (bit_not @1))))) > >> > >> Would it make sense to try and generalize it a bit, say with > >> > >> (cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P) > >> > >> (scmp (view_convert:XXX @0) (bit_not @1)) > >> > > Thanks for your advice, it looks great. > > And can I use *view_convert1?* instead of *nop_convert1?* here, > > because the original case is view_convert, and nop_convert would fail > > to simplify the case. > > Near the top of match.pd, you can see > > /* With nop_convert? combine convert? and view_convert? in one pattern > plus conditionalize on tree_nop_conversion_p conversions. */ > (match (nop_convert @0) > (convert @0) > (if (tree_nop_conversion_p (type, TREE_TYPE (@0))))) > (match (nop_convert @0) > (view_convert @0) > (if (VECTOR_TYPE_P (type) && VECTOR_TYPE_P (TREE_TYPE (@0)) > && known_eq (TYPE_VECTOR_SUBPARTS (type), > TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0))) > && tree_nop_conversion_p (TREE_TYPE (type), TREE_TYPE (TREE_TYPE (@0)))))) > Oh, it's restricted to the same number of elements which is not the case i tested. That's why nop_convert failed to simplify the case. Guess we can define another nop1_convert to handle vector types with different number of elements, but still tree_nop_convertion_p? > So at least the intention is that it can handle both NOP_EXPR for scalars > and VIEW_CONVERT_EXPR for vectors, and I think we alread use it that way > in some places in match.pd, like > > (simplify > (negate (nop_convert? (bit_not @0))) > (plus (view_convert @0) { build_each_one_cst (type); })) > > (simplify > (bit_xor:c (nop_convert?:s (bit_not:s @0)) @1) > (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) > (bit_not (bit_xor (view_convert @0) @1)))) > > (the 'if' seems redundant for this one) > > (simplify > (negate (nop_convert? (negate @1))) > (if (!TYPE_OVERFLOW_SANITIZED (type) > && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@1))) > (view_convert @1))) > > etc. > > > At some point this got some genmatch help, to handle '?' and numbers, so I > don't remember all the details, but following these examples should work. > > -- > Marc Glisse
On Mon, Jun 7, 2021 at 2:22 PM Hongtao Liu <crazylht@gmail.com> wrote: > > On Fri, Jun 4, 2021 at 4:18 PM Marc Glisse <marc.glisse@inria.fr> wrote: > > > > On Fri, 4 Jun 2021, Hongtao Liu via Gcc-patches wrote: > > > > > On Tue, Jun 1, 2021 at 6:17 PM Marc Glisse <marc.glisse@inria.fr> wrote: > > >> > > >> On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote: > > >> > > >>> Hi: > > >>> This patch is about to simplify (view_convert:type ~a) < 0 to > > >>> (view_convert:type a) >= 0 when type is signed integer. Similar for > > >>> (view_convert:type ~a) >= 0. > > >>> Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. > > >>> Ok for the trunk? > > >>> > > >>> gcc/ChangeLog: > > >>> > > >>> PR middle-end/100738 > > >>> * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0, > > >>> (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE > > >>> simplification. > > >> > > >> We already have > > >> > > >> /* Fold ~X op C as X op' ~C, where op' is the swapped comparison. */ > > >> (for cmp (simple_comparison) > > >> scmp (swapped_simple_comparison) > > >> (simplify > > >> (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1) > > >> (if (single_use (@2) > > >> && (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST)) > > >> (scmp @0 (bit_not @1))))) > > >> > > >> Would it make sense to try and generalize it a bit, say with > > >> > > >> (cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P) > > >> > > >> (scmp (view_convert:XXX @0) (bit_not @1)) > > >> > > > Thanks for your advice, it looks great. > > > And can I use *view_convert1?* instead of *nop_convert1?* here, > > > because the original case is view_convert, and nop_convert would fail > > > to simplify the case. > > > > Near the top of match.pd, you can see > > > > /* With nop_convert? combine convert? and view_convert? in one pattern > > plus conditionalize on tree_nop_conversion_p conversions. */ > > (match (nop_convert @0) > > (convert @0) > > (if (tree_nop_conversion_p (type, TREE_TYPE (@0))))) > > (match (nop_convert @0) > > (view_convert @0) > > (if (VECTOR_TYPE_P (type) && VECTOR_TYPE_P (TREE_TYPE (@0)) > > && known_eq (TYPE_VECTOR_SUBPARTS (type), > > TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0))) > > && tree_nop_conversion_p (TREE_TYPE (type), TREE_TYPE (TREE_TYPE (@0)))))) > > > Oh, it's restricted to the same number of elements which is not the > case i tested. > That's why nop_convert failed to simplify the case. And tree_nop_conversion_p also doesn't handle vector types with different element numbers. Shouldn't v4si --> v16qi a nop conversion for all targets? > > Guess we can define another nop1_convert to handle vector types with > different number of elements, but still tree_nop_convertion_p? > > > So at least the intention is that it can handle both NOP_EXPR for scalars > > and VIEW_CONVERT_EXPR for vectors, and I think we alread use it that way > > in some places in match.pd, like > > > > (simplify > > (negate (nop_convert? (bit_not @0))) > > (plus (view_convert @0) { build_each_one_cst (type); })) > > > > (simplify > > (bit_xor:c (nop_convert?:s (bit_not:s @0)) @1) > > (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) > > (bit_not (bit_xor (view_convert @0) @1)))) > > > > (the 'if' seems redundant for this one) > > > > (simplify > > (negate (nop_convert? (negate @1))) > > (if (!TYPE_OVERFLOW_SANITIZED (type) > > && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@1))) > > (view_convert @1))) > > > > etc. > > > > > > At some point this got some genmatch help, to handle '?' and numbers, so I > > don't remember all the details, but following these examples should work. > > > > -- > > Marc Glisse > > > > -- > BR, > Hongtao
On Mon, Jun 7, 2021 at 9:02 AM Hongtao Liu via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > On Mon, Jun 7, 2021 at 2:22 PM Hongtao Liu <crazylht@gmail.com> wrote: > > > > On Fri, Jun 4, 2021 at 4:18 PM Marc Glisse <marc.glisse@inria.fr> wrote: > > > > > > On Fri, 4 Jun 2021, Hongtao Liu via Gcc-patches wrote: > > > > > > > On Tue, Jun 1, 2021 at 6:17 PM Marc Glisse <marc.glisse@inria.fr> wrote: > > > >> > > > >> On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote: > > > >> > > > >>> Hi: > > > >>> This patch is about to simplify (view_convert:type ~a) < 0 to > > > >>> (view_convert:type a) >= 0 when type is signed integer. Similar for > > > >>> (view_convert:type ~a) >= 0. > > > >>> Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. > > > >>> Ok for the trunk? > > > >>> > > > >>> gcc/ChangeLog: > > > >>> > > > >>> PR middle-end/100738 > > > >>> * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0, > > > >>> (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE > > > >>> simplification. > > > >> > > > >> We already have > > > >> > > > >> /* Fold ~X op C as X op' ~C, where op' is the swapped comparison. */ > > > >> (for cmp (simple_comparison) > > > >> scmp (swapped_simple_comparison) > > > >> (simplify > > > >> (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1) > > > >> (if (single_use (@2) > > > >> && (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST)) > > > >> (scmp @0 (bit_not @1))))) > > > >> > > > >> Would it make sense to try and generalize it a bit, say with > > > >> > > > >> (cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P) > > > >> > > > >> (scmp (view_convert:XXX @0) (bit_not @1)) > > > >> > > > > Thanks for your advice, it looks great. > > > > And can I use *view_convert1?* instead of *nop_convert1?* here, > > > > because the original case is view_convert, and nop_convert would fail > > > > to simplify the case. > > > > > > Near the top of match.pd, you can see > > > > > > /* With nop_convert? combine convert? and view_convert? in one pattern > > > plus conditionalize on tree_nop_conversion_p conversions. */ > > > (match (nop_convert @0) > > > (convert @0) > > > (if (tree_nop_conversion_p (type, TREE_TYPE (@0))))) > > > (match (nop_convert @0) > > > (view_convert @0) > > > (if (VECTOR_TYPE_P (type) && VECTOR_TYPE_P (TREE_TYPE (@0)) > > > && known_eq (TYPE_VECTOR_SUBPARTS (type), > > > TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0))) > > > && tree_nop_conversion_p (TREE_TYPE (type), TREE_TYPE (TREE_TYPE (@0)))))) > > > > > Oh, it's restricted to the same number of elements which is not the > > case i tested. > > That's why nop_convert failed to simplify the case. > And tree_nop_conversion_p also doesn't handle vector types with > different element numbers. > Shouldn't v4si --> v16qi a nop conversion for all targets? It's a conversion with semantics, like v4si + V_C_E<v4si> (v16qi) doesn't make sense when you strip the V_C_E as you'll get v4si + v16qi. We don't consider those a nop conversion. Richard. > > > > Guess we can define another nop1_convert to handle vector types with > > different number of elements, but still tree_nop_convertion_p? > > > > > So at least the intention is that it can handle both NOP_EXPR for scalars > > > and VIEW_CONVERT_EXPR for vectors, and I think we alread use it that way > > > in some places in match.pd, like > > > > > > (simplify > > > (negate (nop_convert? (bit_not @0))) > > > (plus (view_convert @0) { build_each_one_cst (type); })) > > > > > > (simplify > > > (bit_xor:c (nop_convert?:s (bit_not:s @0)) @1) > > > (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) > > > (bit_not (bit_xor (view_convert @0) @1)))) > > > > > > (the 'if' seems redundant for this one) > > > > > > (simplify > > > (negate (nop_convert? (negate @1))) > > > (if (!TYPE_OVERFLOW_SANITIZED (type) > > > && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@1))) > > > (view_convert @1))) > > > > > > etc. > > > > > > > > > At some point this got some genmatch help, to handle '?' and numbers, so I > > > don't remember all the details, but following these examples should work. > > > > > > -- > > > Marc Glisse > > > > > > > > -- > > BR, > > Hongtao > > > > -- > BR, > Hongtao
From 8c13f61c25821aca63ef2920fddce9704bfadeec Mon Sep 17 00:00:00 2001 From: liuhongt <hongtao.liu@intel.com> Date: Thu, 27 May 2021 15:21:06 +0800 Subject: [PATCH] Optimize (view_convert:type ~a) < 0 to (view_convert:type a) >= 0 when type is signed integer. Similar for (view_convert:type ~a) >= 0. gcc/ChangeLog: PR middle-end/100738 * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0, (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE simplification. gcc/testsuite/ChangeLog: PR middle-end/100738 * g++.target/i386/avx2-pr100738-1.C: New test. * g++.target/i386/sse4_1-pr100738-1.C: New test. --- gcc/match.pd | 9 ++ .../g++.target/i386/avx2-pr100738-1.C | 120 ++++++++++++++++++ .../g++.target/i386/sse4_1-pr100738-1.C | 120 ++++++++++++++++++ 3 files changed, 249 insertions(+) create mode 100644 gcc/testsuite/g++.target/i386/avx2-pr100738-1.C create mode 100644 gcc/testsuite/g++.target/i386/sse4_1-pr100738-1.C diff --git a/gcc/match.pd b/gcc/match.pd index cdb87636951..d1c6b4ea2b4 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -3983,6 +3983,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) wide_int_to_tree (TREE_TYPE (cst), wi::to_wide (cst) - 1)); }))))) +/* ((view_convert:signed_type)~a) < 0 is just (view_convert) a >= 0. */ +(for cmp (lt ge) + acmp (ge lt) + (simplify + (cmp (view_convert (bit_not @0)) integer_zerop@1) + (if (!TYPE_UNSIGNED (TREE_TYPE (@1))) + (with { tree stype = TREE_TYPE (@1); } + (acmp (view_convert:stype @0) @1))))) + /* We can simplify a logical negation of a comparison to the inverted comparison. As we cannot compute an expression operator using invert_tree_comparison we have to simulate diff --git a/gcc/testsuite/g++.target/i386/avx2-pr100738-1.C b/gcc/testsuite/g++.target/i386/avx2-pr100738-1.C new file mode 100644 index 00000000000..80fdad3e5f0 --- /dev/null +++ b/gcc/testsuite/g++.target/i386/avx2-pr100738-1.C @@ -0,0 +1,120 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx2 -std=c++14 -O2 -mno-avx512f -mno-xop" } */ +/* { dg-final { scan-assembler-not "pxor" } } */ +/* { dg-final { scan-assembler-not "pcmpgt\[bdq]" } } */ +/* { dg-final { scan-assembler-times "pblendvb" 6 } } */ +/* { dg-final { scan-assembler-times "blendvps" 6 } } */ +/* { dg-final { scan-assembler-times "blendvpd" 6 } } */ + +typedef char v32qi __attribute__ ((vector_size (32))); +typedef short v16hi __attribute__ ((vector_size (32))); +typedef int v8si __attribute__ ((vector_size (32))); +typedef long long v4di __attribute__ ((vector_size (32))); + +v8si +f1 (v32qi a, v8si b, v8si c) +{ + return ((v8si)~a) < 0 ? b : c; +} + +v4di +f2 (v32qi a, v4di b, v4di c) +{ + return ((v4di)~a) < 0 ? b : c; +} + +v32qi +f3 (v16hi a, v32qi b, v32qi c) +{ + return ((v32qi)~a) < 0 ? b : c; +} + +v8si +f4 (v16hi a, v8si b, v8si c) +{ + return ((v8si)~a) < 0 ? b : c; +} + +v4di +f5 (v16hi a, v4di b, v4di c) +{ + return ((v4di)~a) < 0 ? b : c; +} + +v32qi +f6 (v8si a, v32qi b, v32qi c) +{ + return ((v32qi)~a) < 0 ? b : c; +} + +v4di +f7 (v8si a, v4di b, v4di c) +{ + return ((v4di)~a) < 0 ? b : c; +} + +v32qi +f8 (v4di a, v32qi b, v32qi c) +{ + return ((v32qi)~a) < 0 ? b : c; +} + +v8si +f9 (v4di a, v8si b, v8si c) +{ + return ((v8si)~a) < 0 ? b : c; +} + +v8si +f10 (v32qi a, v8si b, v8si c) +{ + return ((v8si)~a) >= 0 ? b : c; +} + +v4di +f11 (v32qi a, v4di b, v4di c) +{ + return ((v4di)~a) >= 0 ? b : c; +} + +v32qi +f12 (v16hi a, v32qi b, v32qi c) +{ + return ((v32qi)~a) >= 0 ? b : c; +} + +v8si +f13 (v16hi a, v8si b, v8si c) +{ + return ((v8si)~a) >= 0 ? b : c; +} + +v4di +f14 (v16hi a, v4di b, v4di c) +{ + return ((v4di)~a) >= 0 ? b : c; +} + +v32qi +f15 (v8si a, v32qi b, v32qi c) +{ + return ((v32qi)~a) >= 0 ? b : c; +} + +v4di +f16 (v8si a, v4di b, v4di c) +{ + return ((v4di)~a) >= 0 ? b : c; +} + +v32qi +f17 (v4di a, v32qi b, v32qi c) +{ + return ((v32qi)~a) >= 0 ? b : c; +} + +v8si +f18 (v4di a, v8si b, v8si c) +{ + return ((v8si)~a) >= 0 ? b : c; +} diff --git a/gcc/testsuite/g++.target/i386/sse4_1-pr100738-1.C b/gcc/testsuite/g++.target/i386/sse4_1-pr100738-1.C new file mode 100644 index 00000000000..d3454c264cd --- /dev/null +++ b/gcc/testsuite/g++.target/i386/sse4_1-pr100738-1.C @@ -0,0 +1,120 @@ +/* { dg-do compile } */ +/* { dg-options "-msse4 -std=c++14 -mno-avx2 -O2 -mno-xop" } */ +/* { dg-final { scan-assembler-not "pxor" } } */ +/* { dg-final { scan-assembler-not "pcmpgt\[bdq]" } } */ +/* { dg-final { scan-assembler-times "pblendvb" 6 } } */ +/* { dg-final { scan-assembler-times "blendvps" 6 } } */ +/* { dg-final { scan-assembler-times "blendvpd" 6 } } */ + +typedef char v16qi __attribute__ ((vector_size (16))); +typedef short v8hi __attribute__ ((vector_size (16))); +typedef int v4si __attribute__ ((vector_size (16))); +typedef long long v2di __attribute__ ((vector_size (16))); + +v4si +f1 (v16qi a, v4si b, v4si c) +{ + return ((v4si)~a) < 0 ? b : c; +} + +v2di +f2 (v16qi a, v2di b, v2di c) +{ + return ((v2di)~a) < 0 ? b : c; +} + +v16qi +f3 (v8hi a, v16qi b, v16qi c) +{ + return ((v16qi)~a) < 0 ? b : c; +} + +v4si +f4 (v8hi a, v4si b, v4si c) +{ + return ((v4si)~a) < 0 ? b : c; +} + +v2di +f5 (v8hi a, v2di b, v2di c) +{ + return ((v2di)~a) < 0 ? b : c; +} + +v16qi +f6 (v4si a, v16qi b, v16qi c) +{ + return ((v16qi)~a) < 0 ? b : c; +} + +v2di +f7 (v4si a, v2di b, v2di c) +{ + return ((v2di)~a) < 0 ? b : c; +} + +v16qi +f8 (v2di a, v16qi b, v16qi c) +{ + return ((v16qi)~a) < 0 ? b : c; +} + +v4si +f9 (v2di a, v4si b, v4si c) +{ + return ((v4si)~a) < 0 ? b : c; +} + +v4si +f10 (v16qi a, v4si b, v4si c) +{ + return ((v4si)~a) >= 0 ? b : c; +} + +v2di +f11 (v16qi a, v2di b, v2di c) +{ + return ((v2di)~a) >= 0 ? b : c; +} + +v16qi +f12 (v8hi a, v16qi b, v16qi c) +{ + return ((v16qi)~a) >= 0 ? b : c; +} + +v4si +f13 (v8hi a, v4si b, v4si c) +{ + return ((v4si)~a) >= 0 ? b : c; +} + +v2di +f14 (v8hi a, v2di b, v2di c) +{ + return ((v2di)~a) >= 0 ? b : c; +} + +v16qi +f15 (v4si a, v16qi b, v16qi c) +{ + return ((v16qi)~a) >= 0 ? b : c; +} + +v2di +f16 (v4si a, v2di b, v2di c) +{ + return ((v2di)~a) >= 0 ? b : c; +} + +v16qi +f17 (v2di a, v16qi b, v16qi c) +{ + return ((v16qi)~a) >= 0 ? b : c; +} + +v4si +f18 (v2di a, v4si b, v4si c) +{ + return ((v4si)~a) >= 0 ? b : c; +} -- 2.18.1