From patchwork Fri Mar 19 06:29:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Oliva X-Patchwork-Id: 1455600 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=PR2mMx5f; dkim-atps=neutral Received: from sourceware.org (unknown [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4F1vBz1b19z9sW1 for ; Fri, 19 Mar 2021 17:30:00 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id F1986386F00D; Fri, 19 Mar 2021 06:29:55 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org F1986386F00D DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1616135396; bh=gx2bLw1Up6ABjZJesUS9cVmLlUBl2Rr56ESIDBmLSZ8=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=PR2mMx5f2mEbL6zbtFp2UmieP6GBVs9qomQWbNx7OVgC+QBWFRyfH1furXkxIqfrw 7hR2Hb3gXNGKstnrthjm/fmRcGFeVwyY1ZRpmIr5NloimIMtMhykLfNrpHZ3K+zbl+ CdMEjsq3R99lNlHIRGC6xhFsaZrf5w1fx/a+FlrM= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from eggs.gnu.org (eggs.gnu.org [IPv6:2001:470:142:3::10]) by sourceware.org (Postfix) with ESMTPS id C6FFB385041C for ; Fri, 19 Mar 2021 06:29:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org C6FFB385041C Received: from linux-libre.fsfla.org ([209.51.188.54]:57320 helo=free.home) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lN8do-0004lV-PC; Fri, 19 Mar 2021 02:29:51 -0400 Received: from livre (livre.home [172.31.160.2]) by free.home (8.15.2/8.15.2) with ESMTPS id 12J6Td7u364188 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 19 Mar 2021 03:29:39 -0300 To: gcc-patches@gcc.gnu.org Subject: fix ssse3_pshufbv8qi3 post-reload const pool load Organization: Free thinker, not speaking for the GNU Project Date: Fri, 19 Mar 2021 03:29:39 -0300 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux) MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 X-Spam-Status: No, score=-10.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Alexandre Oliva via Gcc-patches From: Alexandre Oliva Reply-To: Alexandre Oliva Cc: Jan Hubicka Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" The split in ssse3_pshufbv8qi3 forces a const vector into the constant pool, and loads from it. That runs after reload, so if the load requires any reloading, we're out of luck. Indeed, if the load address is not legitimate, e.g. -mcmodel=large, the insn is no longer recognized. This patch turns the constant into an input operand, introduces an expander to generate the constant unconditionally, and arranges for this input operand to be retained as an unused immediate in the alternatives that don't undergo splitting, and for it to be loaded into the scratch register for those that do. It is now the register allocator that arranges to load the const vector into a register, so it deals with whatever legitimizing steps needed for the target configuration. Regstrapped on x86_64-linux-gnu. Ok to install? for gcc/ChangeLog * config/i386/predicates.md (register_or_const_vec_operand): New. * config/i386/sse.md (ssse3_pshufbv8qi3): Add an expander for the now *-prefixed insn_and_split, turn the splitter const vec into an input for the insn, making it an ignored immediate for non-split cases, and loaded into the scratch register otherwise. --- gcc/config/i386/predicates.md | 6 ++++++ gcc/config/i386/sse.md | 26 +++++++++++++++++++------- 2 files changed, 25 insertions(+), 7 deletions(-) diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md index b6dd5e9d3b243..f1da005c95cf3 100644 --- a/gcc/config/i386/predicates.md +++ b/gcc/config/i386/predicates.md @@ -1153,6 +1153,12 @@ (define_predicate "nonimmediate_or_const_vector_operand" (ior (match_operand 0 "nonimmediate_operand") (match_code "const_vector"))) +;; Return true when OP is either register operand, or any +;; CONST_VECTOR. +(define_predicate "register_or_const_vector_operand" + (ior (match_operand 0 "register_operand") + (match_code "const_vector"))) + ;; Return true when OP is nonimmediate or standard SSE constant. (define_predicate "nonimmediate_or_sse_const_operand" (ior (match_operand 0 "nonimmediate_operand") diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 43e4d57ec6a3d..b693864e62d1b 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -17159,10 +17159,26 @@ (define_insn "_pshufb3" (set_attr "btver2_decode" "vector") (set_attr "mode" "")]) -(define_insn_and_split "ssse3_pshufbv8qi3" +(define_expand "ssse3_pshufbv8qi3" + [(parallel + [(set (match_operand:V8QI 0 "register_operand" "=") + (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "") + (match_operand:V8QI 2 "register_mmxmem_operand" "") + (const_vector:V4SI [(match_dup 3) (match_dup 3) + (match_dup 3) (match_dup 3)])] + UNSPEC_PSHUFB)) + (clobber (match_scratch:V4SI 4 "="))])] + "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" +{ + operands[3] = gen_int_mode (0xf7f7f7f7, SImode); +}) + +(define_insn_and_split "*ssse3_pshufbv8qi3" [(set (match_operand:V8QI 0 "register_operand" "=y,x,Yv") (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "0,0,Yv") - (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv")] + (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv") + (match_operand:V4SI 4 "register_or_const_vector_operand" + "i,3,3")] UNSPEC_PSHUFB)) (clobber (match_scratch:V4SI 3 "=X,&x,&Yv"))] "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" @@ -17172,8 +17188,7 @@ (define_insn_and_split "ssse3_pshufbv8qi3" #" "TARGET_SSSE3 && reload_completed && SSE_REGNO_P (REGNO (operands[0]))" - [(set (match_dup 3) (match_dup 5)) - (set (match_dup 3) + [(set (match_dup 3) (and:V4SI (match_dup 3) (match_dup 2))) (set (match_dup 0) (unspec:V16QI [(match_dup 1) (match_dup 4)] UNSPEC_PSHUFB))] @@ -17188,9 +17203,6 @@ (define_insn_and_split "ssse3_pshufbv8qi3" GET_MODE (operands[2])); operands[4] = lowpart_subreg (V16QImode, operands[3], GET_MODE (operands[3])); - rtx vec_const = ix86_build_const_vector (V4SImode, true, - gen_int_mode (0xf7f7f7f7, SImode)); - operands[5] = force_const_mem (V4SImode, vec_const); } [(set_attr "mmx_isa" "native,sse_noavx,avx") (set_attr "prefix_extra" "1")