From patchwork Tue Dec 5 07:01:42 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiahao Xu
X-Patchwork-Id: 1871851
From: Jiahao Xu
To: gcc-patches@gcc.gnu.org
Cc: xry111@xry111.site, i@xen0n.name, chenglulu@loongson.cn, xuchenghua@loongson.cn, Jiahao Xu
Subject: [PATCH v2 0/5] Add support for approximate instructions and optimize divf/sqrtf/rsqrt operations.
Date: Tue, 5 Dec 2023 15:01:42 +0800
Message-Id: <20231205070147.53352-1-xujiahao@loongson.cn>
X-Mailer: git-send-email 2.20.1
List-Id: Gcc-patches mailing list

LoongArch V1.1 adds support for approximate instructions. This series
uses them, together with additional Newton-Raphson steps, to implement
single-precision floating-point division, square root and reciprocal
square root operations for better throughput.

This series is based on the following patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639243.html

Jiahao Xu (5):
  LoongArch: Add support for LoongArch V1.1 approximate instructions.
  LoongArch: Use standard pattern name for xvfrsqrt/vfrsqrt instructions.
  LoongArch: Redefine pattern for xvfrecip/vfrecip instructions.
  LoongArch: New options -mrecip and -mrecip= with ffast-math.
  LoongArch: Vectorized loop unrolling is disable for divf/sqrtf/rsqrtf
    when -mrecip is enabled.

 gcc/config/loongarch/genopts/isa-evolution.in |   1 +
 gcc/config/loongarch/genopts/loongarch.opt.in |  11 +
 gcc/config/loongarch/larchintrin.h            |  38 +++
 gcc/config/loongarch/lasx.md                  |  89 ++++++-
 gcc/config/loongarch/lasxintrin.h             |  34 +++
 gcc/config/loongarch/loongarch-builtins.cc    |  66 +++++
 gcc/config/loongarch/loongarch-cpucfg-map.h   |   1 +
 gcc/config/loongarch/loongarch-protos.h       |   2 +
 gcc/config/loongarch/loongarch-str.h          |   1 +
 gcc/config/loongarch/loongarch.cc             | 252 +++++++++++++++++-
 gcc/config/loongarch/loongarch.h              |  18 ++
 gcc/config/loongarch/loongarch.md             | 104 ++++++--
 gcc/config/loongarch/loongarch.opt            |  15 ++
 gcc/config/loongarch/lsx.md                   |  89 ++++++-
 gcc/config/loongarch/lsxintrin.h              |  34 +++
 gcc/config/loongarch/predicates.md            |   8 +
 gcc/doc/extend.texi                           |  18 ++
 gcc/doc/invoke.texi                           |  54 ++++
 gcc/testsuite/gcc.target/loongarch/divf.c     |  10 +
 .../loongarch/larch-frecipe-builtin.c         |  28 ++
 .../gcc.target/loongarch/recip-divf.c         |   9 +
 .../gcc.target/loongarch/recip-sqrtf.c        |  23 ++
 gcc/testsuite/gcc.target/loongarch/sqrtf.c    |  24 ++
 .../loongarch/vector/lasx/lasx-divf.c         |  13 +
 .../vector/lasx/lasx-frecipe-builtin.c        |  30 +++
 .../loongarch/vector/lasx/lasx-recip-divf.c   |  12 +
 .../loongarch/vector/lasx/lasx-recip-sqrtf.c  |  28 ++
 .../loongarch/vector/lasx/lasx-recip.c        |  24 ++
 .../loongarch/vector/lasx/lasx-rsqrt.c        |  26 ++
 .../loongarch/vector/lasx/lasx-sqrtf.c        |  29 ++
 .../loongarch/vector/lsx/lsx-divf.c           |  13 +
 .../vector/lsx/lsx-frecipe-builtin.c          |  30 +++
 .../loongarch/vector/lsx/lsx-recip-divf.c     |  12 +
 .../loongarch/vector/lsx/lsx-recip-sqrtf.c    |  28 ++
 .../loongarch/vector/lsx/lsx-recip.c          |  24 ++
 .../loongarch/vector/lsx/lsx-rsqrt.c          |  26 ++
 .../loongarch/vector/lsx/lsx-sqrtf.c          |  29 ++
 37 files changed, 1212 insertions(+), 41 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/divf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/larch-frecipe-builtin.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/recip-divf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/recip-sqrtf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/sqrtf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-divf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-frecipe-builtin.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-recip-divf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-recip-sqrtf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-recip.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-rsqrt.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-sqrtf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-divf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-frecipe-builtin.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-recip-divf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-recip-sqrtf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-recip.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-rsqrt.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-sqrtf.c
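
For readers unfamiliar with the technique named in the cover letter, here
is a rough C sketch of how a coarse estimate plus Newton-Raphson steps can
replace a divide or square root. It is not taken from the patches. All
function names are hypothetical, and the two estimate functions are
software stand-ins (well-known bit tricks with a few percent relative
error) used only so the sketch runs anywhere; real code would take the
starting value from the hardware approximate instruction. Because this
stand-in is coarse, the sketch applies two refinement steps. The number of
steps the patches use is not stated above. Inputs are assumed to be
positive, finite and of moderate magnitude.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

static uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Assumption: stand-in for a hardware reciprocal estimate.  A classic
   bit trick, roughly 5% relative error for positive normal x.  */
static float recip_estimate(float x)
{
    return bits_to_float(UINT32_C(0x7EF311C7) - float_to_bits(x));
}

/* Assumption: stand-in for a hardware reciprocal square root estimate.
   A classic bit trick, roughly 3.5% relative error.  */
static float rsqrt_estimate(float x)
{
    return bits_to_float(UINT32_C(0x5F3759DF) - (float_to_bits(x) >> 1));
}

/* One Newton-Raphson step for y ~ 1/x:  y' = y * (2 - x*y).  */
static float recip_step(float x, float y)
{
    return y * (2.0f - x * y);
}

/* One Newton-Raphson step for y ~ 1/sqrt(x):  y' = y * (1.5 - 0.5*x*y*y).  */
static float rsqrt_step(float x, float y)
{
    return y * (1.5f - 0.5f * x * y * y);
}

/* a / b computed as a * (refined estimate of 1/b).  */
float approx_div(float a, float b)
{
    assert(b > 0.0f);               /* the stand-in estimate needs b > 0 */
    float y = recip_estimate(b);
    y = recip_step(b, y);
    y = recip_step(b, y);
    return a * y;
}

/* 1/sqrt(x) from the estimate plus two refinement steps.  */
float approx_rsqrt(float x)
{
    assert(x > 0.0f);               /* zero would need separate handling */
    float y = rsqrt_estimate(x);
    y = rsqrt_step(x, y);
    y = rsqrt_step(x, y);
    return y;
}

/* sqrt(x) computed as x * (1/sqrt(x)).  */
float approx_sqrt(float x)
{
    return x * approx_rsqrt(x);
}
```

Each step roughly squares the relative error of the current estimate, so
a more accurate starting value needs fewer steps. With the stand-ins
above, approx_div(1.0f, 3.0f) lands close to 0.3333 and approx_sqrt(4.0f)
close to 2.0, but the results are not correctly rounded, which is
presumably why the series ties -mrecip to ffast-math.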