From patchwork Tue Jun 7 07:41:33 2022
X-Patchwork-Submitter: "Liu, Hongtao"
X-Patchwork-Id: 1639825
X-Patchwork-Original-From: liuhongt via Gcc-patches
From: "Liu, Hongtao"
Reply-To: liuhongt
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] Disparage SSE_REGS alternatives slightly with ?v instead of *v in *mov{si, di}_internal.
Date: Tue, 7 Jun 2022 15:41:33 +0800
Message-Id: <20220607074133.3296-1-hongtao.liu@intel.com>
X-Mailer: git-send-email 2.18.1

So alternative v won't be ignored in record_reg_classes.

Similar for *r alternatives in some vector patterns.
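
For readers less familiar with the two modifiers: per the GCC constraint documentation, `*` says the following character should be ignored when choosing register preferences, while `?` only disparages the alternative slightly. The sketch below is a toy model of that difference and is not code from record_reg_classes. The function name `sse_alt_penalty` and the penalty weight of 2 per `?` are assumptions made for illustration.

```c
/* Toy model, NOT GCC source: scan one constraint alternative and report
   how an SSE-register ('v') choice would be seen by a preference pass.
   Returns -1 when no visible 'v' is found (for example because '*' hid
   it), otherwise the penalty accumulated from '?' modifiers.
   The weight of 2 per '?' is an assumption for illustration.  */
int sse_alt_penalty(const char *alt)
{
    int penalty = 0;

    for (const char *p = alt; *p != '\0'; p++) {
        if (*p == '*') {
            /* '*' hides the next letter from preference computation.  */
            if (p[1] != '\0')
                p++;
            continue;
        }
        if (*p == '?') {
            /* '?' keeps the alternative visible but makes it costlier.  */
            penalty += 2;
            continue;
        }
        if (*p == 'v')
            return penalty;
    }
    return -1;
}
```

Under this model `sse_alt_penalty("*v")` reports that the SSE class never enters the computation, while `sse_alt_penalty("?v")` reports a small positive penalty, so an SSE register can still be weighed against a spill.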

It helps the testcase in the PR; RA also now makes better decisions for
gcc.target/i386/extract-insert-combining.c:

        movd    %esi, %xmm0
        movd    %edi, %xmm1
-       movl    %esi, -12(%rsp)
        paddd   %xmm0, %xmm1
        pinsrd  $0, %esi, %xmm0
        paddd   %xmm1, %xmm0

The patch has no big impact on SPEC2017 for either the -O2 or the
-Ofast -march=native run.

I also noticed some changes in SPEC2017:

Before:

        mov     mem, %eax
        vmovd   %eax, %xmm0
        ..
        mov     %eax, 64(%rsp)

After:

        vmovd   mem, %xmm0
        ..
        vmovd   %xmm0, 64(%rsp)

Which should be exactly what we want?

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

	* config/i386/i386.md (*movsi_internal): Change alternative
	from *v to ?v.
	(*movdi_internal): Ditto.
	* config/i386/sse.md (vec_set_0): Change alternative *r to ?r.
	(*vec_extractv4sf_mem): Ditto.
	(*vec_extracthf): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr105513-1.c: New test.
	* gcc.target/i386/extract-insert-combining.c: Add new
	scan-assembler-not for spill.
---
 gcc/config/i386/i386.md                          |  8 ++++----
 gcc/config/i386/sse.md                           |  8 ++++----
 .../gcc.target/i386/extract-insert-combining.c   |  1 +
 gcc/testsuite/gcc.target/i386/pr105513-1.c       | 16 ++++++++++++++++
 4 files changed, 25 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr105513-1.c

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 48a98e1b68b..5b538413942 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2251,9 +2251,9 @@ (define_split
 
 (define_insn "*movdi_internal"
   [(set (match_operand:DI 0 "nonimmediate_operand"
-    "=r ,o ,r,r ,r,m ,*y,*y,?*y,?m,?r,?*y,*v,*v,*v,m ,m,?r ,?*Yd,?r,?*v,?*y,?*x,*k,*k ,*r,*m,*k")
+    "=r ,o ,r,r ,r,m ,*y,*y,?*y,?m,?r,?*y,?v,?v,?v,m ,m,?r ,?*Yd,?r,?v,?*y,?*x,*k,*k ,*r,*m,*k")
         (match_operand:DI 1 "general_operand"
-    "riFo,riF,Z,rem,i,re,C ,*y,Bk ,*y,*y,r ,C ,*v,Bk,*v,v,*Yd,r ,*v,r ,*x ,*y ,*r,*kBk,*k,*k,CBC"))]
+    "riFo,riF,Z,rem,i,re,C ,*y,Bk ,*y,*y,r ,C ,?v,Bk,?v,v,*Yd,r ,?v,r ,*x ,*y ,*r,*kBk,*k,*k,CBC"))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))
    && ix86_hardreg_mov_ok (operands[0], operands[1])"
 {
@@ -2472,9 +2472,9 @@ (define_peephole2
 
 (define_insn "*movsi_internal"
   [(set (match_operand:SI 0 "nonimmediate_operand"
-    "=r,m ,*y,*y,?*y,?m,?r,?*y,*v,*v,*v,m ,?r,?*v,*k,*k ,*rm,*k")
+    "=r,m ,*y,*y,?*y,?m,?r,?*y,?v,?v,?v,m ,?r,?v,*k,*k ,*rm,*k")
         (match_operand:SI 1 "general_operand"
-    "g ,re,C ,*y,Bk ,*y,*y,r ,C ,*v,Bk,*v,*v,r ,*r,*kBk,*k ,CBC"))]
+    "g ,re,C ,*y,Bk ,*y,*y,r ,C ,?v,Bk,?v,?v,r ,*r,*kBk,*k ,CBC"))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))
    && ix86_hardreg_mov_ok (operands[0], operands[1])"
 {
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 62688f8e29d..d41ce2e1a9b 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -10590,11 +10590,11 @@ (define_insn "*vec_concatv4sf_0"
 ;; see comment above inline_secondary_memory_needed function in i386.cc
 (define_insn "vec_set_0"
   [(set (match_operand:VI4F_128 0 "nonimmediate_operand"
-          "=Yr,*x,v,v,v,x,x,v,Yr ,*x ,x ,m ,m ,m")
+          "=Yr,*x,v,v,v,x,x,v,Yr ,?x ,x ,m ,m ,m")
         (vec_merge:VI4F_128
           (vec_duplicate:VI4F_128
             (match_operand: 2 "general_operand"
-          " Yr,*x,v,m,r ,m,x,v,*rm,*rm,*rm,!x,!*re,!*fF"))
+          " Yr,*x,v,m,r ,m,x,v,?rm,?rm,?rm,!x,?re,!*fF"))
           (match_operand:VI4F_128 1 "nonimm_or_0_operand"
           " C , C,C,C,C ,C,0,v,0 ,0 ,x ,0 ,0 ,0")
           (const_int 1)))]
@@ -11056,7 +11056,7 @@ (define_insn_and_split "*sse4_1_extractps"
    (set_attr "mode" "V4SF,V4SF,V4SF,*,*")])
 
 (define_insn_and_split "*vec_extractv4sf_mem"
-  [(set (match_operand:SF 0 "register_operand" "=v,*r,f")
+  [(set (match_operand:SF 0 "register_operand" "=v,?r,f")
         (vec_select:SF
           (match_operand:V4SF 1 "memory_operand" "o,o,o")
           (parallel [(match_operand 2 "const_0_to_3_operand")])))]
@@ -11933,7 +11933,7 @@ (define_insn_and_split "*vec_extract_0"
   "operands[1] = gen_lowpart (HFmode, operands[1]);")
 
 (define_insn "*vec_extracthf"
-  [(set (match_operand:HF 0 "register_sse4nonimm_operand" "=*r,m,x,v")
+  [(set (match_operand:HF 0 "register_sse4nonimm_operand" "=?r,m,x,v")
         (vec_select:HF
           (match_operand:V8HF 1 "register_operand" "v,v,0,v")
           (parallel
diff --git a/gcc/testsuite/gcc.target/i386/extract-insert-combining.c b/gcc/testsuite/gcc.target/i386/extract-insert-combining.c
index 32d951e6832..5a53d4cbf06 100644
--- a/gcc/testsuite/gcc.target/i386/extract-insert-combining.c
+++ b/gcc/testsuite/gcc.target/i386/extract-insert-combining.c
@@ -4,6 +4,7 @@
 /* { dg-final { scan-assembler-times "(?:vpaddd|paddd)\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]" 2 } } */
 /* { dg-final { scan-assembler-times "(?:vpinsrd|pinsrd)\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]" 1 } } */
 /* { dg-final { scan-assembler-not "vmovss" } } */
+/* { dg-final { scan-assembler-not {(?n)mov.*(%rsp)} { target { ! ia32 } } } } */
 
 #include
 
diff --git a/gcc/testsuite/gcc.target/i386/pr105513-1.c b/gcc/testsuite/gcc.target/i386/pr105513-1.c
new file mode 100644
index 00000000000..530f5292252
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr105513-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -msse2 -mtune=skylake -mfpmath=sse" } */
+/* { dg-final { scan-assembler-not "\\(%rsp\\)" } } */
+
+static int as_int(float x)
+{
+  return (union{float x; int i;}){x}.i;
+}
+
+float f(double y, float x)
+{
+  int i = as_int(x);
+  if (__builtin_expect(i > 99, 0)) return 0;
+  if (i*2u < 77) if (i==2) return 0;
+  return y*x;
+}