From patchwork Thu Jan 31 11:09:45 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 1034111
From: Uros Bizjak
Date: Thu, 31 Jan 2019 12:09:45 +0100
Subject: [PATCH, i386]: Fix PR 89071, AVX vcvtsd2ss lets us avoid PXOR
 dependency breaking for scalar float<->double and other scalar xmm, xmm
 instructions
To: "gcc-patches@gcc.gnu.org"
Cc: "H. J. Lu", Peter Cordes

Hello!

The attached patch (partially) avoids emitting the XOR dependency-breaking
insn by removing the SSE register dependency in the AVX instructions
themselves.
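
For illustration only (not part of the patch): a minimal C sketch of source
code that should reach the reg->reg alternatives of the two conversion
patterns changed in the diff that follows. The function names are
hypothetical. The expected assembly mentioned in the comments is an
assumption inferred from the output templates in the diff (where %d1
duplicates the source operand under AVX), not captured compiler output;
register allocation and options such as -O2 -mavx may change the result.

```c
/* Hypothetical test functions, not taken from the patch or the PR.
   On x86_64 the value to convert arrives in %xmm1 and the result is
   returned in %xmm0, so source and destination registers differ.  */

/* float -> double, matched by *extendsfdf2 (cvtss2sd).
   Assumed AVX output with the new alternative, roughly:
     vcvtss2sd %xmm1, %xmm1, %xmm0
   with no preceding vxorps on %xmm0.  */
double
extend_second (float unused, float x)
{
  (void) unused;
  return (double) x;
}

/* double -> float, matched by truncdfsf2 (cvtsd2ss).
   Assumed AVX output with the new alternative, roughly:
     vcvtsd2ss %xmm1, %xmm1, %xmm0  */
float
truncate_second (double unused, double x)
{
  (void) unused;
  return (float) x;
}
```

Using the second argument keeps the source register distinct from the
return register, which is the case where the dependency-breaking XOR
would otherwise be considered.
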

2019-01-31  Uroš Bizjak

    PR target/89071
    * config/i386/i386.md (*extendsfdf2): Split out reg->reg
    alternative to avoid partial SSE register stall for TARGET_AVX.
    (truncdfsf2): Ditto.
    (sse4_1_round2): Ditto.

Bootstrapped on x86_64-linux-gnu {,-m32}, regression test in progress.

Uros.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index d085e88bc61d..744f155fca6f 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -4370,9 +4370,9 @@
 })
 
 (define_insn "*extendsfdf2"
-  [(set (match_operand:DF 0 "nonimm_ssenomem_operand" "=f,m,v")
+  [(set (match_operand:DF 0 "nonimm_ssenomem_operand" "=f,m,v,v")
         (float_extend:DF
-          (match_operand:SF 1 "nonimmediate_operand" "fm,f,vm")))]
+          (match_operand:SF 1 "nonimmediate_operand" "fm,f,v,m")))]
   "TARGET_80387 || (TARGET_SSE2 && TARGET_SSE_MATH)"
 {
   switch (which_alternative)
@@ -4382,15 +4382,17 @@
       return output_387_reg_move (insn, operands);
 
     case 2:
+      return "%vcvtss2sd\t{%d1, %0|%0, %d1}";
+    case 3:
       return "%vcvtss2sd\t{%1, %d0|%d0, %1}";
 
     default:
       gcc_unreachable ();
     }
 }
-  [(set_attr "type" "fmov,fmov,ssecvt")
-   (set_attr "prefix" "orig,orig,maybe_vex")
-   (set_attr "mode" "SF,XF,DF")
+  [(set_attr "type" "fmov,fmov,ssecvt,ssecvt")
+   (set_attr "prefix" "orig,orig,maybe_vex,maybe_vex")
+   (set_attr "mode" "SF,XF,DF,DF")
    (set (attr "enabled")
      (if_then_else
        (match_test ("TARGET_SSE2 && TARGET_SSE_MATH"))
@@ -4481,7 +4483,7 @@
   "TARGET_SSE_PARTIAL_REG_DEPENDENCY && epilogue_completed
    && optimize_function_for_speed_p (cfun)
    && (!REG_P (operands[1])
-       || REGNO (operands[0]) != REGNO (operands[1]))
+       || (!TARGET_AVX && REGNO (operands[0]) != REGNO (operands[1])))
    && (!EXT_REX_SSE_REG_P (operands[0])
        || TARGET_AVX512VL)"
   [(set (match_dup 0)
@@ -4534,9 +4536,9 @@
 ;; Conversion from DFmode to SFmode.
 
 (define_insn "truncdfsf2"
-  [(set (match_operand:SF 0 "nonimm_ssenomem_operand" "=m,f,v")
+  [(set (match_operand:SF 0 "nonimm_ssenomem_operand" "=m,f,v,v")
         (float_truncate:SF
-          (match_operand:DF 1 "register_ssemem_operand" "f,f,vm")))]
+          (match_operand:DF 1 "register_ssemem_operand" "f,f,v,m")))]
   "TARGET_80387 || (TARGET_SSE2 && TARGET_SSE_MATH)"
 {
   switch (which_alternative)
@@ -4546,13 +4548,15 @@
       return output_387_reg_move (insn, operands);
 
     case 2:
+      return "%vcvtsd2ss\t{%d1, %0|%0, %d1}";
+    case 3:
       return "%vcvtsd2ss\t{%1, %d0|%d0, %1}";
 
     default:
       gcc_unreachable ();
     }
 }
-  [(set_attr "type" "fmov,fmov,ssecvt")
+  [(set_attr "type" "fmov,fmov,ssecvt,ssecvt")
    (set_attr "mode" "SF")
    (set (attr "enabled")
      (if_then_else
@@ -4639,7 +4643,7 @@
   "TARGET_SSE_PARTIAL_REG_DEPENDENCY && epilogue_completed
    && optimize_function_for_speed_p (cfun)
    && (!REG_P (operands[1])
-       || REGNO (operands[0]) != REGNO (operands[1]))
+       || (!TARGET_AVX && REGNO (operands[0]) != REGNO (operands[1])))
    && (!EXT_REX_SSE_REG_P (operands[0])
        || TARGET_AVX512VL)"
   [(set (match_dup 0)
@@ -16171,19 +16175,20 @@
 
 
 (define_insn "sse4_1_round2"
-  [(set (match_operand:MODEF 0 "register_operand" "=x,v")
-        (unspec:MODEF [(match_operand:MODEF 1 "nonimmediate_operand" "xm,vm")
-                       (match_operand:SI 2 "const_0_to_15_operand" "n,n")]
+  [(set (match_operand:MODEF 0 "register_operand" "=x,x,v")
+        (unspec:MODEF [(match_operand:MODEF 1 "nonimmediate_operand" "x,m,vm")
+                       (match_operand:SI 2 "const_0_to_15_operand" "n,n,n")]
                       UNSPEC_ROUND))]
   "TARGET_SSE4_1"
   "@
+   %vround\t{%2, %d1, %0|%0, %d1, %2}
    %vround\t{%2, %1, %d0|%d0, %1, %2}
    vrndscale\t{%2, %1, %d0|%d0, %1, %2}"
   [(set_attr "type" "ssecvt")
-   (set_attr "prefix_extra" "1,*")
-   (set_attr "length_immediate" "*,1")
-   (set_attr "prefix" "maybe_vex,evex")
-   (set_attr "isa" "noavx512f,avx512f")
+   (set_attr "prefix_extra" "1,1,*")
+   (set_attr "length_immediate" "*,*,1")
+   (set_attr "prefix" "maybe_vex,maybe_vex,evex")
+   (set_attr "isa" "noavx512f,noavx512f,avx512f")
    (set_attr "mode" "")])
 
 (define_insn "rintxf2"