From patchwork Tue Aug 4 12:20:00 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roger Sayle
X-Patchwork-Id: 1340850
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org;
 spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org
 (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org;
 envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=)
Authentication-Results: ozlabs.org;
 dmarc=none (p=none dis=none) header.from=nextmovesoftware.com
Authentication-Results: ozlabs.org;
 dkim=fail reason="signature verification failed" (2048-bit key; unprotected)
 header.d=nextmovesoftware.com header.i=@nextmovesoftware.com
 header.a=rsa-sha256 header.s=default header.b=XwhWQRnX;
 dkim-atps=neutral
Received: from sourceware.org (server2.sourceware.org
 [IPv6:2620:52:3:1:0:246e:9693:128c])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256)
 (No client certificate requested)
 by ozlabs.org (Postfix) with ESMTPS id 4BLYjz72C8z9sRR
 for ; Tue, 4 Aug 2020 22:20:23 +1000 (AEST)
Received: from server2.sourceware.org (localhost [IPv6:::1])
 by sourceware.org (Postfix) with ESMTP id E633B3857C56;
 Tue, 4 Aug 2020 12:20:21 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from server.nextmovesoftware.com (server.nextmovesoftware.com
 [162.254.253.69])
 by sourceware.org (Postfix) with ESMTPS id 95C163857C49
 for ; Tue, 4 Aug 2020 12:20:02 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 95C163857C49
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=nextmovesoftware.com
Authentication-Results: sourceware.org;
 spf=pass smtp.mailfrom=roger@nextmovesoftware.com
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
 d=nextmovesoftware.com; s=default; h=Content-Type:MIME-Version:Message-ID:
 Date:Subject:Cc:To:From:Sender:Reply-To:Content-Transfer-Encoding:Content-ID:
 Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc
 :Resent-Message-ID:In-Reply-To:References:List-Id:List-Help:List-Unsubscribe:
 List-Subscribe:List-Post:List-Owner:List-Archive;
 bh=Z9g96qvqpPcpDWmIWs/7H6NQ2OSIW0iMLvxzk0Wj0ic=; b=XwhWQRnX2KG/Q0DzqpcdLnyvZq
 +lu4qzrSMclgWjQUM1iXJMVzngnMK7B1Wth1V3DbNQ/mr+btfvvmBNPpRfl9RY84AFO2vJbZibP2A
 nUAKWYPLCEBsQG+vL1coGMZF0RBLOtfKQwAp4LS23D6SFhbNWGget38h8eXzU9CoxCnSwZlJNR3L4
 EV6FEFxzT7zG4oGvqPoNAJxuieeGqAeOVF17GXCJsYkCe+TjmWyBpux1TbiDhnymOfiKvMfeBR0eQ
 utk7+AQcWt1Kw+vXFNcPu+9BWR/vKYQykiMmRT3ncwXKCdY1N53Bs7XphUw0SWCy64yJ9Sa/GcuMC
 tesjeu/Q==;
Received: from host86-137-89-56.range86-137.btcentralplus.com
 ([86.137.89.56]:55331 helo=Dell)
 by server.nextmovesoftware.com with esmtpsa (TLS1.2) tls
 TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
 (Exim 4.93) (envelope-from )
 id 1k2vvF-0004Zq-KW; Tue, 04 Aug 2020 08:20:02 -0400
From: "Roger Sayle"
To: "'GCC Patches'"
Subject: [PATCH] nvptx: Add support for PTX highpart multiplications (e.g.
 mul.hi.s32)
Date: Tue, 4 Aug 2020 13:20:00 +0100
Message-ID: <001301d66a59$93af4860$bb0dd920$@nextmovesoftware.com>
MIME-Version: 1.0
X-Mailer: Microsoft Outlook 16.0
Thread-Index: AdZqWYH7YZhWdZbcTcWOTOhVLjrnwQ==
Content-Language: en-gb
X-AntiAbuse: This header was added to track abuse,
 please include it with any abuse report
X-AntiAbuse: Primary Hostname - server.nextmovesoftware.com
X-AntiAbuse: Original Domain - gcc.gnu.org
X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12]
X-AntiAbuse: Sender Address Domain - nextmovesoftware.com
X-Get-Message-Sender-Via: server.nextmovesoftware.com: authenticated_id:
 roger@nextmovesoftware.com
X-Authenticated-Sender: server.nextmovesoftware.com:
 roger@nextmovesoftware.com
X-Source:
X-Source-Args:
X-Source-Dir:
X-Spam-Status: No, score=-11.1 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE,
 SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: gcc-patches-bounces@gcc.gnu.org
Sender: "Gcc-patches"

This patch adds support for signed and unsigned, HImode, SImode and
DImode highpart multiplications to the nvptx backend.  Without the
middle-end patch that I've just posted, the middle-end is able to
(easily) make use of the narrow four of the six instructions, but with
that patch, all six of these instructions are generated in the provided
test cases.

This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu
with a "make" and "make -k check" with no new failures with the above
patch, and just the two failures to find mul.hi.?64 against current
mainline.
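For readers less familiar with the term, here is a minimal C sketch (not part of the patch) of what a 32-bit highpart multiplication computes: the upper half of the double-width product. The helper names mulhi_s32, mulhi_u32 and check_highpart_examples are hypothetical, chosen only for illustration, and the signed version assumes that right-shifting a negative value is an arithmetic shift, which GCC provides but ISO C leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the patch): upper 32 bits of the signed
   64-bit product.  Assumes >> on a negative value is an arithmetic
   shift, as GCC implements it.  */
int32_t mulhi_s32(int32_t x, int32_t y)
{
  return (int32_t)(((int64_t)x * (int64_t)y) >> 32);
}

/* Hypothetical helper: upper 32 bits of the unsigned 64-bit product.  */
uint32_t mulhi_u32(uint32_t x, uint32_t y)
{
  return (uint32_t)(((uint64_t)x * (uint64_t)y) >> 32);
}

/* The same bit pattern (all ones) gives a different high part depending
   on whether it is read as -1 or as 0xFFFFFFFF.  */
void check_highpart_examples(void)
{
  assert(mulhi_s32(0x40000000, 4) == 1);     /* 2^30 * 4 = 2^32 */
  assert(mulhi_s32(-1, 1) == -1);            /* product is -1 */
  assert(mulhi_u32(0xFFFFFFFFu, 1u) == 0u);  /* product fits in 32 bits */
  assert(mulhi_u32(0xFFFFFFFFu, 0xFFFFFFFFu) == 0xFFFFFFFEu);
}
```

The signed and unsigned results differ for identical input bits, which is the reason the two forms need distinct sign_extend and zero_extend patterns rather than a single shared one.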
I'd considered submitting this patch either without support for the
64-bit variants, or without tests for them, but it seemed more
reasonable to make both enhancements at the same time.

Ok for mainline (once the previous patch has been approved/pushed)?

2020-08-04  Roger Sayle

gcc/ChangeLog
	* config/nvptx/nvptx.md (smulhi3_highpart, smulsi3_highpart,
	smuldi3_highpart, umulhi3_highpart, umulsi3_highpart,
	umuldi3_highpart): New instructions.

gcc/testsuite/ChangeLog
	* gcc.target/nvptx/mul-hi.c: New test.
	* gcc.target/nvptx/umul-hi.c: New test.

Thanks in advance,
Roger
---
Roger Sayle
NextMove Software
Cambridge, UK

diff --git a/gcc/testsuite/gcc.target/nvptx/mul-hi.c b/gcc/testsuite/gcc.target/nvptx/mul-hi.c
new file mode 100644
index 0000000..2cc35af
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/mul-hi.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wno-long-long" } */
+
+typedef int __attribute ((mode(TI))) ti_t;
+
+short smulhi3_highpart(short x, short y)
+{
+  return ((int)x * (int)y) >> 16;
+}
+
+int smulsi3_highpart(int x, int y)
+{
+  return ((long)x * (long)y) >> 32;
+}
+
+long smuldi3_highpart(long x, long y)
+{
+  return ((ti_t)x * (ti_t)y) >> 64;
+}
+
+/* { dg-final { scan-assembler-times "mul.hi.s16" 1 } } */
+/* { dg-final { scan-assembler-times "mul.hi.s32" 1 } } */
+/* { dg-final { scan-assembler-times "mul.hi.s64" 1 } } */
diff --git a/gcc/testsuite/gcc.target/nvptx/umul-hi.c b/gcc/testsuite/gcc.target/nvptx/umul-hi.c
new file mode 100644
index 0000000..148d1ce
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/umul-hi.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wno-long-long" } */
+
+typedef unsigned int __attribute ((mode(TI))) uti_t;
+
+unsigned short umulhi3_highpart(unsigned short x, unsigned short y)
+{
+  return ((unsigned int)x * (unsigned int)y) >> 16;
+}
+
+unsigned int umulsi3_highpart(unsigned int x, unsigned int y)
+{
+  return ((unsigned long)x * (unsigned long)y) >> 32;
+}
+
+unsigned long umuldi3_highpart(unsigned long x, unsigned long y)
+{
+  return ((uti_t)x * (uti_t)y) >> 64;
+}
+
+/* { dg-final { scan-assembler-times "mul.hi.u16" 1 } } */
+/* { dg-final { scan-assembler-times "mul.hi.u32" 1 } } */
+/* { dg-final { scan-assembler-times "mul.hi.u64" 1 } } */
diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index c23edcf..0459549 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -568,6 +568,78 @@
   ""
   "%.\\tmul.wide.u32\\t%0, %1, %2;")
 
+(define_insn "smulhi3_highpart"
+  [(set (match_operand:HI 0 "nvptx_register_operand" "=R")
+        (truncate:HI
+         (lshiftrt:SI
+          (mult:SI (sign_extend:SI
+                    (match_operand:HI 1 "nvptx_register_operand" "R"))
+                   (sign_extend:SI
+                    (match_operand:HI 2 "nvptx_register_operand" "R")))
+          (const_int 16))))]
+  ""
+  "%.\\tmul.hi.s16\\t%0, %1, %2;")
+
+(define_insn "smulsi3_highpart"
+  [(set (match_operand:SI 0 "nvptx_register_operand" "=R")
+        (truncate:SI
+         (lshiftrt:DI
+          (mult:DI (sign_extend:DI
+                    (match_operand:SI 1 "nvptx_register_operand" "R"))
+                   (sign_extend:DI
+                    (match_operand:SI 2 "nvptx_register_operand" "R")))
+          (const_int 32))))]
+  ""
+  "%.\\tmul.hi.s32\\t%0, %1, %2;")
+
+(define_insn "smuldi3_highpart"
+  [(set (match_operand:DI 0 "nvptx_register_operand" "=R")
+        (truncate:DI
+         (lshiftrt:TI
+          (mult:TI (sign_extend:TI
+                    (match_operand:DI 1 "nvptx_register_operand" "R"))
+                   (sign_extend:TI
+                    (match_operand:DI 2 "nvptx_register_operand" "R")))
+          (const_int 64))))]
+  ""
+  "%.\\tmul.hi.s64\\t%0, %1, %2;")
+
+(define_insn "umulhi3_highpart"
+  [(set (match_operand:HI 0 "nvptx_register_operand" "=R")
+        (truncate:HI
+         (lshiftrt:SI
+          (mult:SI (zero_extend:SI
+                    (match_operand:HI 1 "nvptx_register_operand" "R"))
+                   (zero_extend:SI
+                    (match_operand:HI 2 "nvptx_register_operand" "R")))
+          (const_int 16))))]
+  ""
+  "%.\\tmul.hi.u16\\t%0, %1, %2;")
+
+(define_insn "umulsi3_highpart"
+  [(set (match_operand:SI 0 "nvptx_register_operand" "=R")
+        (truncate:SI
+         (lshiftrt:DI
+          (mult:DI (zero_extend:DI
+                    (match_operand:SI 1 "nvptx_register_operand" "R"))
+                   (zero_extend:DI
+                    (match_operand:SI 2 "nvptx_register_operand" "R")))
+          (const_int 32))))]
+  ""
+  "%.\\tmul.hi.u32\\t%0, %1, %2;")
+
+(define_insn "umuldi3_highpart"
+  [(set (match_operand:DI 0 "nvptx_register_operand" "=R")
+        (truncate:DI
+         (lshiftrt:TI
+          (mult:TI (zero_extend:TI
+                    (match_operand:DI 1 "nvptx_register_operand" "R"))
+                   (zero_extend:TI
+                    (match_operand:DI 2 "nvptx_register_operand" "R")))
+          (const_int 64))))]
+  ""
+  "%.\\tmul.hi.u64\\t%0, %1, %2;")
+
 ;; Shifts
 
 (define_insn "ashl3"