From patchwork Thu Aug 29 07:33:20 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: liuhongt X-Patchwork-Id: 1978280 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.a=rsa-sha256 header.s=Intel header.b=fu92/bpF; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WvY1X3s41z1yZ9 for ; Thu, 29 Aug 2024 17:33:51 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 0A282385B50D for ; Thu, 29 Aug 2024 07:33:49 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.12]) by sourceware.org (Postfix) with ESMTPS id 1DE27385B50D for ; Thu, 29 Aug 2024 07:33:25 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1DE27385B50D Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 1DE27385B50D Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=198.175.65.12 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1724916807; cv=none; b=tHX/JC/RWl61yB1pmv3OCuxX8nLRfvbMVVUK0QiwuXkIyf4bxp7e9MqUJ89kE5MgU1XugM4g1tD8acLrbQzBvRUfvlmOuNgMr8wX+HXaMN6WWoUU3DMgL48KcYDSTdad1FWUC/w6fF51Tois7in3ka2ZBcrVaF54iCVVkeJcsy4= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1724916807; c=relaxed/simple; bh=xAXS04tOdJwrYHUgDQGyTjXiz5J4CRcRQcX6zd8uOWI=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=A0e4vyKnkRpjQBvJ1MKt3NuJ9NQpu2utkplmMhKJ2dSSvZHJVijSVcyi2SlDO3XIOpdA987gBjG/nrqB33aDwGhqcfWsUufApSbrp2U6d/uu1C2QIgKiJszKnm3i/Z5ij0+CrXecDDtRJ3Fb4xGbjvIiwYatVdVE/afftC96VdU= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1724916806; x=1756452806; h=from:to:cc:subject:date:message-id:mime-version: content-transfer-encoding; bh=xAXS04tOdJwrYHUgDQGyTjXiz5J4CRcRQcX6zd8uOWI=; b=fu92/bpFf5d7R1kehrx7qSaXf+cLdfkeMKv8tbr/xv/VZoWRhutS//DA fYnoqCwJxDbU8REa1FJligxtZmKjog4hQDP4OCJrXYFrjJDeyvBhQEkBn xVtE4Ut1Pvl5Jawh1kDetB2QVlEBb/NNFTCczdx2O+Yyy0z7E7fB3DyJQ 56YDJDRDhGcr2/Tmpg2Zh0C6rMsBEbrPu9LwIGI4sk0yYGWUDtvzyar29 8hMW8ChIp3nFyO4Ku1m7vnpfjjyJHhWPQKuM2G5vHOtGgXwLxKbLFBx08 9N/kYLnloLCoXcSSmIk+FUbMqhkuZvnYYVsXXbi4J4YYHFv6ITA+TND7F Q==; X-CSE-ConnectionGUID: 8MpFR1pOSTqV9sCClTzkrA== X-CSE-MsgGUID: w4Rlv+tITe+xKCmQTbXdKw== X-IronPort-AV: E=McAfee;i="6700,10204,11178"; a="34867784" X-IronPort-AV: E=Sophos;i="6.10,185,1719903600"; d="scan'208";a="34867784" Received: from fmviesa010.fm.intel.com ([10.60.135.150]) by orvoesa104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2024 00:33:23 -0700 X-CSE-ConnectionGUID: XhnKx18xRgCvFVB/94pERA== X-CSE-MsgGUID: q5QOXIzJTHas9QfU4PqFuA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.10,185,1719903600"; d="scan'208";a="63682082" Received: from shliclel4217.sh.intel.com ([10.239.240.127]) by fmviesa010.fm.intel.com with ESMTP; 29 Aug 2024 00:33:21 -0700 From: liuhongt To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com Subject: [PATCH] [x86] Check avx upper register for parallel. Date: Thu, 29 Aug 2024 15:33:20 +0800 Message-Id: <20240829073320.2188675-1-hongtao.liu@intel.com> X-Mailer: git-send-email 2.31.1 MIME-Version: 1.0 X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org For function arguments/return, when it's BLK mode, it's put in a parallel with an expr_list, and the expr_list contains the real mode and registers. Current ix86_check_avx_upper_register only checked for SSE_REG_P, and failed to handle that. The patch extend the handle to each subrtx. Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. Ok for trunk? gcc/ChangeLog: PR target/116512 * config/i386/i386.cc (ix86_avx_u128_mode_entry): Iterate each subrtx for potential rtx parallel to check avx upper register. (ix86_avx_u128_mode_exit): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr116512.c: New test. --- gcc/config/i386/i386.cc | 28 ++++++++++++++++++++---- gcc/testsuite/gcc.target/i386/pr116512.c | 26 ++++++++++++++++++++++ 2 files changed, 50 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr116512.c diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 224a78cc832..94d1a14056e 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -15148,8 +15148,18 @@ ix86_avx_u128_mode_entry (void) { rtx incoming = DECL_INCOMING_RTL (arg); - if (incoming && ix86_check_avx_upper_register (incoming)) - return AVX_U128_DIRTY; + if (incoming) + { + /* construct_container may return a parallel with expr_list + which contains the real reg and mode */ + subrtx_var_iterator::array_type array; + FOR_EACH_SUBRTX_VAR (iter, array, incoming, ALL) + { + rtx x = *iter; + if (ix86_check_avx_upper_register (x)) + return AVX_U128_DIRTY; + } + } } return AVX_U128_CLEAN; @@ -15184,8 +15194,18 @@ ix86_avx_u128_mode_exit (void) /* Exit mode is set to AVX_U128_DIRTY if there are 256bit or 512 bit modes used in the function return register. */ - if (reg && ix86_check_avx_upper_register (reg)) - return AVX_U128_DIRTY; + if (reg) + { + /* construct_container may return a parallel with expr_list + which contains the real reg and mode */ + subrtx_var_iterator::array_type array; + FOR_EACH_SUBRTX_VAR (iter, array, reg, ALL) + { + rtx x = *iter; + if (ix86_check_avx_upper_register (x)) + return AVX_U128_DIRTY; + } + } /* Exit mode is set to AVX_U128_DIRTY if there are 256bit or 512bit modes used in function arguments, otherwise return AVX_U128_CLEAN. diff --git a/gcc/testsuite/gcc.target/i386/pr116512.c b/gcc/testsuite/gcc.target/i386/pr116512.c new file mode 100644 index 00000000000..c2bc6c91b64 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr116512.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-march=x86-64-v4 -O2" } */ +/* { dg-final { scan-assembler-not "vzeroupper" { target { ! ia32 } } } } */ + +#include + +struct B { + union { + __m512 f; + __m512i s; + }; +}; + +struct B foo(int n) { + struct B res; + res.s = _mm512_set1_epi32(n); + + return res; +} + +__m512i bar(int n) { + struct B res; + res.s = _mm512_set1_epi32(n); + + return res.s; +}