From patchwork Thu Feb 27 17:18:10 2020
X-Patchwork-Submitter: Andrew Stubbs
X-Patchwork-Id: 1245962
From: Andrew Stubbs
Subject: [committed] amdgcn: sub-dword vector min/max/shift/bit operators
To: "gcc-patches@gcc.gnu.org"
Date: Thu, 27 Feb 2020 17:18:10 +0000

This patch adds V64QI and V64HI implementations of smin, umin, smax, umax, ashift, ashiftrt, lshiftrt, and, ior, xor, not, and popcount.

None of these operators have a specific machine instruction, so they need to use V64SI instructions.
For scalar code expr.c can DTRT automatically, but not so for vector operations.

The min/max and shift operators emit explicit extends and truncates around the actual operator. I don't believe those are needed for the bit operators, but they can easily be added if needed.

There can be more optimal implementations in future, but right now I'm interested in correctness. For example, some of the instructions can have the extend and/or truncate combined into one "DPP" instruction, so I intend to add patterns for the combine pass to use. Similarly, there are load instructions with built-in extends, and I can change the representation of the stores to allow combining truncates.

Andrew

amdgcn: sub-dword vector min/max/shift/bit operators

2020-02-27  Andrew Stubbs

	gcc/
	* config/gcn/gcn-valu.md (VEC_SUBDWORD_MODE): New mode iterator.
	(2): Change modes to VEC_ALL1REG_INT_MODE.
	(3): Likewise.
	(3): New.
	(v3): New.
	(3): New.
	(3): Rename to ...
	(v64si3): ... this, and change modes to V64SI.
	* config/gcn/gcn.md (mnemonic): Use '%B' for not.
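To illustrate the extend/operate/truncate approach described above, here is a minimal scalar sketch in plain C of what happens to a single 8-bit lane. It is not code from the patch: the function names are hypothetical, and the real expanders work on whole V64QI/V64HI vectors via convert_move. The point is only that the signedness of the extension has to match the operator.

```c
#include <stdint.h>

/* Signed min on one 8-bit lane: sign-extend both inputs to 32 bits,
   do the 32-bit operation, then truncate the result back to 8 bits.  */
int8_t smin_qi (int8_t a, int8_t b)
{
  int32_t wa = a;                 /* sign-extend */
  int32_t wb = b;                 /* sign-extend */
  int32_t r = wa < wb ? wa : wb;  /* 32-bit signed min */
  return (int8_t) r;              /* truncate; value is already in range */
}

/* Unsigned min on one 8-bit lane: the inputs must be zero-extended,
   otherwise 0xff would be treated as -1 and win the comparison.  */
uint8_t umin_qi (uint8_t a, uint8_t b)
{
  uint32_t wa = a;                /* zero-extend */
  uint32_t wb = b;                /* zero-extend */
  uint32_t r = wa < wb ? wa : wb; /* 32-bit unsigned min */
  return (uint8_t) r;             /* truncate */
}

/* Logical shift right: zero-extend so that no copies of the sign bit
   are shifted into the low 8 bits.  Assumes n < 32.  */
uint8_t lshiftrt_qi (uint8_t a, uint32_t n)
{
  uint32_t wa = a;                /* zero-extend */
  return (uint8_t) (wa >> n);     /* shift, then truncate */
}

/* Arithmetic shift right: sign-extend so that the sign bit is
   replicated.  Right-shifting a negative value is
   implementation-defined in ISO C; this assumes an arithmetic shift,
   which is what GCC provides.  Assumes n < 32.  */
int8_t ashiftrt_qi (int8_t a, uint32_t n)
{
  int32_t wa = a;                 /* sign-extend */
  return (int8_t) (wa >> n);      /* shift, then truncate */
}
```

For example, umin_qi (0xff, 1) yields 1, whereas sign-extending the same inputs would have compared -1 against 1 and returned 0xff after truncation.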
diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md
index a0cc9a2d8fc..40e864a8de7 100644
--- a/gcc/config/gcn/gcn-valu.md
+++ b/gcc/config/gcn/gcn-valu.md
@@ -16,6 +16,10 @@
 
 ;; {{{ Vector iterators
 
+; Vector modes for sub-dword modes
+(define_mode_iterator VEC_SUBDWORD_MODE
+  [V64QI V64HI])
+
 ; Vector modes for one vector register
 (define_mode_iterator VEC_1REG_MODE
   [V64SI V64HF V64SF])
@@ -1881,20 +1885,20 @@
 (define_code_iterator minmaxop [smin smax umin umax])
 
 (define_insn "2"
-  [(set (match_operand:VEC_1REG_INT_MODE 0 "gcn_valu_dst_operand" "= v")
-        (bitunop:VEC_1REG_INT_MODE
-          (match_operand:VEC_1REG_INT_MODE 1 "gcn_valu_src0_operand" "vSvB")))]
+  [(set (match_operand:VEC_ALL1REG_INT_MODE 0 "gcn_valu_dst_operand" "= v")
+        (bitunop:VEC_ALL1REG_INT_MODE
+          (match_operand:VEC_ALL1REG_INT_MODE 1 "gcn_valu_src0_operand" "vSvB")))]
   ""
   "v_0\t%0, %1"
   [(set_attr "type" "vop1")
    (set_attr "length" "8")])
 
 (define_insn "3"
-  [(set (match_operand:VEC_1REG_INT_MODE 0 "gcn_valu_dst_operand" "= v,RD")
-        (bitop:VEC_1REG_INT_MODE
-          (match_operand:VEC_1REG_INT_MODE 1 "gcn_valu_src0_operand"
+  [(set (match_operand:VEC_ALL1REG_INT_MODE 0 "gcn_valu_dst_operand" "= v,RD")
+        (bitop:VEC_ALL1REG_INT_MODE
+          (match_operand:VEC_ALL1REG_INT_MODE 1 "gcn_valu_src0_operand"
                                                 "% v, 0")
-          (match_operand:VEC_1REG_INT_MODE 2 "gcn_valu_src1com_operand"
+          (match_operand:VEC_ALL1REG_INT_MODE 2 "gcn_valu_src1com_operand"
                                                 "vSvB, v")))]
   ""
   "@
@@ -1967,6 +1971,27 @@
   [(set_attr "type" "vmult,ds")
    (set_attr "length" "16,8")])
 
+(define_expand "3"
+  [(set (match_operand:VEC_SUBDWORD_MODE 0 "register_operand" "= v")
+        (shiftop:VEC_SUBDWORD_MODE
+          (match_operand:VEC_SUBDWORD_MODE 1 "gcn_alu_operand" " v")
+          (vec_duplicate:VEC_SUBDWORD_MODE
+            (match_operand:SI 2 "gcn_alu_operand" "SvB"))))]
+  ""
+  {
+    enum {ashift, lshiftrt, ashiftrt};
+    bool unsignedp = ( == lshiftrt);
+    rtx insi1 = gen_reg_rtx (V64SImode);
+    rtx insi2 = gen_reg_rtx (SImode);
+    rtx outsi = gen_reg_rtx (V64SImode);
+
+    convert_move (insi1, operands[1], unsignedp);
+    convert_move (insi2, operands[2], unsignedp);
+    emit_insn (gen_v64si3 (outsi, insi1, insi2));
+    convert_move (operands[0], outsi, unsignedp);
+    DONE;
+  })
+
 (define_insn "v64si3"
   [(set (match_operand:V64SI 0 "register_operand" "= v")
         (shiftop:V64SI
@@ -1978,6 +2003,26 @@
   [(set_attr "type" "vop2")
    (set_attr "length" "8")])
 
+(define_expand "v3"
+  [(set (match_operand:VEC_SUBDWORD_MODE 0 "register_operand" "=v")
+        (shiftop:VEC_SUBDWORD_MODE
+          (match_operand:VEC_SUBDWORD_MODE 1 "gcn_alu_operand" " v")
+          (match_operand:VEC_SUBDWORD_MODE 2 "gcn_alu_operand" "vB")))]
+  ""
+  {
+    enum {ashift, lshiftrt, ashiftrt};
+    bool unsignedp = ( == ashift || == ashiftrt);
+    rtx insi1 = gen_reg_rtx (V64SImode);
+    rtx insi2 = gen_reg_rtx (V64SImode);
+    rtx outsi = gen_reg_rtx (V64SImode);
+
+    convert_move (insi1, operands[1], unsignedp);
+    convert_move (insi2, operands[2], unsignedp);
+    emit_insn (gen_vv64si3 (outsi, insi1, insi2));
+    convert_move (operands[0], outsi, unsignedp);
+    DONE;
+  })
+
 (define_insn "vv64si3"
   [(set (match_operand:V64SI 0 "register_operand" "=v")
         (shiftop:V64SI
@@ -1988,13 +2033,31 @@
   [(set_attr "type" "vop2")
    (set_attr "length" "8")])
 
-(define_insn "3"
-  [(set (match_operand:VEC_1REG_INT_MODE 0 "gcn_valu_dst_operand" "= v,RD")
-        (minmaxop:VEC_1REG_INT_MODE
-          (match_operand:VEC_1REG_INT_MODE 1 "gcn_valu_src0_operand"
-                                                "% v, 0")
-          (match_operand:VEC_1REG_INT_MODE 2 "gcn_valu_src1com_operand"
-                                                "vSvB, v")))]
+(define_expand "3"
+  [(set (match_operand:VEC_SUBDWORD_MODE 0 "gcn_valu_dst_operand")
+        (minmaxop:VEC_SUBDWORD_MODE
+          (match_operand:VEC_SUBDWORD_MODE 1 "gcn_valu_src0_operand")
+          (match_operand:VEC_SUBDWORD_MODE 2 "gcn_valu_src1com_operand")))]
+  ""
+  {
+    enum {smin, umin, smax, umax};
+    bool unsignedp = ( == umax || == umin);
+    rtx insi1 = gen_reg_rtx (V64SImode);
+    rtx insi2 = gen_reg_rtx (V64SImode);
+    rtx outsi = gen_reg_rtx (V64SImode);
+
+    convert_move (insi1, operands[1], unsignedp);
+    convert_move (insi2, operands[2], unsignedp);
+    emit_insn (gen_v64si3 (outsi, insi1, insi2));
+    convert_move (operands[0], outsi, unsignedp);
+    DONE;
+  })
+
+(define_insn "v64si3"
+  [(set (match_operand:V64SI 0 "gcn_valu_dst_operand" "= v,RD")
+        (minmaxop:V64SI
+          (match_operand:V64SI 1 "gcn_valu_src0_operand" "% v, 0")
+          (match_operand:V64SI 2 "gcn_valu_src1com_operand" "vSvB, v")))]
   ""
   "@
    v_0\t%0, %2, %1
diff --git a/gcc/config/gcn/gcn.md b/gcc/config/gcn/gcn.md
index d8b49dfd640..1bd3330f90b 100644
--- a/gcc/config/gcn/gcn.md
+++ b/gcc/config/gcn/gcn.md
@@ -319,7 +319,7 @@
    (smax "max%i")
    (umin "min%u")
    (umax "max%u")
-   (not "not%b")
+   (not "not%B")
    (popcount "bcnt_u32%b")])
 
 (define_code_attr bare_mnemonic