[v3,0/5] LoongArch: SIMD fixes and optimizations

Message ID

20231120004728.205167-1-xry111@xry111.site

Headers

DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 3C56B3858C56
From: Xi Ruoyao <xry111@xry111.site>
To: gcc-patches@gcc.gnu.org
Cc: chenglulu <chenglulu@loongson.cn>, i@xen0n.name, xuchenghua@loongson.cn,
 Xi Ruoyao <xry111@xry111.site>
Subject: [PATCH v3 0/5] LoongArch: SIMD fixes and optimizations
Date: Mon, 20 Nov 2023 08:47:23 +0800
Message-ID: <20231120004728.205167-1-xry111@xry111.site>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: list
Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org

Series

LoongArch: SIMD fixes and optimizations

Message

Xi Ruoyao Nov. 20, 2023, 12:47 a.m. UTC

The [1/5] patch is the PR112578 fix posted at
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637097.html.
It has been changed to remove the nearbyint pattern (because nearbyint
should not raise FE_INEXACT even with -ffp-int-builtin-inexact).
Because the other patches depend on the simd.md file introduced by this
patch, it is sent as the first of this series.

Since many LASX instructions differ from the corresponding LSX
instruction only in operand length, a new simd.md file is created to
hold the RTX templates that LSX and LASX can share.  This makes the code
cleaner and easier to maintain.

The [2/5] and [3/5] patches make the vector multiply highpart and rotate
shift operations available to GNU vectors and auto vectorization.
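
As a rough illustration (not copied from the new vect-muh.c or
vect-rotr.c tests), these are the kinds of loops the two operations
correspond to in plain C.  The function names are hypothetical, and
whether the vectorizer actually selects the LSX/LASX muh or rotate
instructions depends on the target options and optimization level.

```c
#include <stddef.h>
#include <stdint.h>

/* High 32 bits of each unsigned 32x32-bit product.  */
void
mul_high_u32 (uint32_t *restrict dst, const uint32_t *restrict a,
              const uint32_t *restrict b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (uint32_t) (((uint64_t) a[i] * b[i]) >> 32);
}

/* Rotate each element right by a per-element amount (taken modulo 32).
   The second shift count is masked so that an amount of 0 never shifts
   by the full width, which would be undefined in C.  */
void
rotr_u32 (uint32_t *restrict dst, const uint32_t *restrict src,
          const uint32_t *restrict amt, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      uint32_t s = amt[i] & 31;
      dst[i] = (src[i] >> s) | (src[i] << ((32 - s) & 31));
    }
}
```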

The [4/5] patch is a simple code cleanup, with no functional change.

The [5/5] patch uses LSX for scalar FP rounding operations if LSX is
available and -ffp-int-builtin-inexact is in effect.  We do this because
the base FP ISA has no instructions that round with an explicit rounding
mode.  Using LSX is overkill, but still much faster than calling libc
functions.

Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk?

Xi Ruoyao (5):
  LoongArch: Fix usage of LSX and LASX frint/ftint instructions
    [PR112578]
  LoongArch: Use standard pattern name and RTX code for LSX/LASX muh
    instructions
  LoongArch: Use standard pattern name and RTX code for LSX/LASX rotate
    shift
  LoongArch: Remove lrint_allow_inexact
  LoongArch: Use LSX for scalar FP rounding with explicit rounding mode

 gcc/config/loongarch/lasx.md                  | 283 -----------------
 gcc/config/loongarch/loongarch-builtins.cc    |  52 ++--
 gcc/config/loongarch/loongarch.md             |  12 +-
 gcc/config/loongarch/lsx.md                   | 293 ------------------
 gcc/config/loongarch/simd.md                  | 268 ++++++++++++++++
 .../loongarch/vect-frint-no-inexact.c         |  48 +++
 .../loongarch/vect-frint-scalar-no-inexact.c  |  23 ++
 .../gcc.target/loongarch/vect-frint-scalar.c  |  43 +++
 .../gcc.target/loongarch/vect-frint.c         |  85 +++++
 .../loongarch/vect-ftint-no-inexact.c         |  44 +++
 .../gcc.target/loongarch/vect-ftint.c         |  83 +++++
 gcc/testsuite/gcc.target/loongarch/vect-muh.c |  36 +++
 .../gcc.target/loongarch/vect-rotr.c          |  36 +++
 13 files changed, 701 insertions(+), 605 deletions(-)
 create mode 100644 gcc/config/loongarch/simd.md
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vect-frint-no-inexact.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vect-frint-scalar-no-inexact.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vect-frint-scalar.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vect-frint.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vect-ftint-no-inexact.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vect-ftint.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vect-muh.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vect-rotr.c

Comments

Xi Ruoyao Nov. 29, 2023, 7:12 a.m. UTC | #1

On Mon, 2023-11-20 at 08:47 +0800, Xi Ruoyao wrote:
> Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk?

Pushed r14-5950 .. r14-5954 with minor changes: an FSF copyright
disclaimer is added into simd.md in the 1st patch, and an unused
match_scratch is removed from <simd_frint_pattern><mode>2 in the 5th
patch.

Lulu Cheng Nov. 29, 2023, 7:45 a.m. UTC | #2

On 2023/11/29 at 3:12 PM, Xi Ruoyao wrote:
> On Mon, 2023-11-20 at 08:47 +0800, Xi Ruoyao wrote:
>> Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk?
> Pushed r14-5950 .. r14-5954 with minor changes: an FSF copyright
> disclaimer is added into simd.md in the 1st patch, and an unused
> match_scratch is removed from <simd_frint_pattern><mode>2 in the 5th
> patch.
>
Thank you very much!:-)