Message ID | 20221124091557.514727-1-linkw@linux.ibm.com |
---|---|
Series | rs6000: Rework rs6000_emit_vector_compare |
Hi,

Gentle ping this series:

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607146.html

BR,
Kewen

on 2022/11/24 17:15, Kewen Lin wrote:
> Hi,
>
> Following Segher's suggestion, this patch series is to rework
> function rs6000_emit_vector_compare for vector float and int
> in multiple steps, it's based on the previous attempts [1][2].
> As mentioned in [1], the need to rework this for float is to
> make a centralized place for vector float comparison handlings
> instead of supporting with swapping ops and reversing code etc.
> dispersedly. It's also for a subsequent patch to handle
> comparison operators with or without trapping math (PR105480).
> With the handling on vector float reworked, we can further make
> the handling on vector int simplified as shown.
>
> For Segher's concern about whether this rework causes any
> assembly change, I constructed two testcases for vector float[3]
> and int[4] respectively before, it showed the most are fine
> excepting for the difference on LE and UNGT, it's demonstrated
> as improvement since it uses GE instead of GT ior EQ. The
> associated test case in patch 3/9 is a good example.
>
> Besides, w/ and w/o the whole patch series, I built the whole
> SPEC2017 at options -O3 and -Ofast separately, checked the
> differences on object assembly. The result showed that the
> most are unchanged, except for:
>
> * at -O3, 521.wrf_r has 9 object files and 526.blender_r has
> 9 object files with differences.
>
> * at -Ofast, 521.wrf_r has 12 object files, 526.blender_r has
> one and 527.cam4_r has 4 object files with differences.
>
> By looking into these differences, all significant differences
> are caused by the known improvement mentioned above transforming
> GT ior EQ to GE, which can also affect unrolling decision due
> to insn count. Some other trivial differences are branch
> target offset difference, nop difference for alignment, vsx
> register number differences etc.
>
> I also evaluated the runtime performance for these changed
> benchmarks, the result is neutral.
>
> These patches are bootstrapped and regress-tested
> incrementally on powerpc64-linux-gnu P7 & P8, and
> powerpc64le-linux-gnu P9 & P10.
>
> Is it ok for trunk?
>
> BR,
> Kewen
> -----
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606375.html
> [2] https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606376.html
> [3] https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606504.html
> [4] https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606506.html
>
> Kewen Lin (9):
> rs6000: Rework vector float comparison in rs6000_emit_vector_compare - p1
> rs6000: Rework vector float comparison in rs6000_emit_vector_compare - p2
> rs6000: Rework vector float comparison in rs6000_emit_vector_compare - p3
> rs6000: Rework vector float comparison in rs6000_emit_vector_compare - p4
> rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p1
> rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p2
> rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p3
> rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p4
> rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p5
>
> gcc/config/rs6000/rs6000.cc | 180 ++++++--------------
> gcc/testsuite/gcc.target/powerpc/vcond-fp.c | 25 +++
> 2 files changed, 74 insertions(+), 131 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/powerpc/vcond-fp.c
I get the following warning which prevents gcc from bootstrapping due to
-Werror:

/home/meissner/fsf-src/work124-sfsplat/gcc/config/rs6000/rs6000-p10sfopt.cc: In function ‘void {anonymous}::process_chain_from_load(gimple*)’:
/home/meissner/fsf-src/work124-sfsplat/gcc/config/rs6000/rs6000-p10sfopt.cc:505:30: warning: zero-length gcc_dump_printf format string [-Wformat-zero-length]
  505 |       dump_printf (MSG_NOTE, "");
      |                              ^~

I just commented out the dump_printf call.