From patchwork Fri Jul 26 06:25:22 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Haochen Jiang
X-Patchwork-Id: 1965192
From: Haochen Jiang
To: gcc-patches@gcc.gnu.org
Cc: hongtao.liu@intel.com, ubizjak@gmail.com
Subject: [PATCH] i386: Fix AVX512 intrin macro typo
Date: Fri, 26 Jul 2024 14:25:22 +0800
Message-Id: <20240726062522.3853519-1-haochen.jiang@intel.com>
X-Mailer: git-send-email 2.31.1

Hi all,

There are several typos in the AVX512 intrinsic macro definitions. They
eventually result in errors at -O0, where these macro forms are used
instead of the inline functions. This patch fixes them.

Bootstrapped on x86-64-pc-linux-gnu. Ok for trunk and backport to GCC 14,
GCC 13 and GCC 12?

Thx,
Haochen

gcc/ChangeLog:

	* config/i386/avx512dqintrin.h
	(_mm_mask_fpclass_ss_mask): Correct operand order.
	(_mm_mask_fpclass_sd_mask): Ditto.
	(_mm_reduce_round_sd): Use -1 as mask since it is non-mask.
	(_mm_reduce_round_ss): Ditto.
	* config/i386/avx512vlbwintrin.h
	(_mm256_mask_alignr_epi8): Correct operand usage.
	(_mm_mask_alignr_epi8): Ditto.
	* config/i386/avx512vlintrin.h
	(_mm_mask_alignr_epi64): Ditto.
---
 gcc/config/i386/avx512dqintrin.h   | 16 +++++++++-------
 gcc/config/i386/avx512vlbwintrin.h |  4 ++--
 gcc/config/i386/avx512vlintrin.h   |  2 +-
 3 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/gcc/config/i386/avx512dqintrin.h b/gcc/config/i386/avx512dqintrin.h
index 3beed7e649a..d9890c6da1d 100644
--- a/gcc/config/i386/avx512dqintrin.h
+++ b/gcc/config/i386/avx512dqintrin.h
@@ -572,11 +572,11 @@ _mm_mask_fpclass_sd_mask (__mmask8 __U, __m128d __A, const int __imm)
   ((__mmask8) __builtin_ia32_fpclasssd_mask ((__v2df) (__m128d) (X), \
     (int) (C), (__mmask8) (-1))) \
 
-#define _mm_mask_fpclass_ss_mask(X, C, U) \
+#define _mm_mask_fpclass_ss_mask(U, X, C) \
   ((__mmask8) __builtin_ia32_fpclassss_mask ((__v4sf) (__m128) (X), \
     (int) (C), (__mmask8) (U)))
 
-#define _mm_mask_fpclass_sd_mask(X, C, U) \
+#define _mm_mask_fpclass_sd_mask(U, X, C) \
   ((__mmask8) __builtin_ia32_fpclasssd_mask ((__v2df) (__m128d) (X), \
     (int) (C), (__mmask8) (U)))
 #define _mm_reduce_sd(A, B, C) \
@@ -594,8 +594,9 @@ _mm_mask_fpclass_sd_mask (__mmask8 __U, __m128d __A, const int __imm)
     (__mmask8)(U)))
 
 #define _mm_reduce_round_sd(A, B, C, R) \
-  ((__m128d) __builtin_ia32_reducesd_round ((__v2df)(__m128d)(A), \
-    (__v2df)(__m128d)(B), (int)(C), (__mmask8)(U), (int)(R)))
+  ((__m128d) __builtin_ia32_reducesd_mask_round ((__v2df)(__m128d)(A), \
+    (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_avx512_setzero_pd (), \
+    (__mmask8)(-1), (int)(R)))
 
 #define _mm_mask_reduce_round_sd(W, U, A, B, C, R) \
   ((__m128d) __builtin_ia32_reducesd_mask_round ((__v2df)(__m128d)(A), \
@@ -622,8 +623,9 @@ _mm_mask_fpclass_sd_mask (__mmask8 __U, __m128d __A, const int __imm)
     (__mmask8)(U)))
 
 #define _mm_reduce_round_ss(A, B, C, R) \
-  ((__m128) __builtin_ia32_reducess_round ((__v4sf)(__m128)(A), \
-    (__v4sf)(__m128)(B), (int)(C), (__mmask8)(U), (int)(R)))
+  ((__m128) __builtin_ia32_reducess_mask_round ((__v4sf)(__m128)(A), \
+    (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_avx512_setzero_ps (), \
+    (__mmask8)(-1), (int)(R)))
 
 #define _mm_mask_reduce_round_ss(W, U, A, B, C, R) \
   ((__m128) __builtin_ia32_reducess_mask_round ((__v4sf)(__m128)(A), \
@@ -631,7 +633,7 @@ _mm_mask_fpclass_sd_mask (__mmask8 __U, __m128d __A, const int __imm)
     (__mmask8)(U), (int)(R)))
 
 #define _mm_maskz_reduce_round_ss(U, A, B, C, R) \
-  ((__m128) __builtin_ia32_reducesd_mask_round ((__v4sf)(__m128)(A), \
+  ((__m128) __builtin_ia32_reducess_mask_round ((__v4sf)(__m128)(A), \
     (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_avx512_setzero_ps (), \
     (__mmask8)(U), (int)(R)))
 
diff --git a/gcc/config/i386/avx512vlbwintrin.h b/gcc/config/i386/avx512vlbwintrin.h
index 56740054aa1..98b9099e343 100644
--- a/gcc/config/i386/avx512vlbwintrin.h
+++ b/gcc/config/i386/avx512vlbwintrin.h
@@ -2089,7 +2089,7 @@ _mm_maskz_slli_epi16 (__mmask8 __U, __m128i __A, unsigned int __B)
 #define _mm256_mask_alignr_epi8(W, U, X, Y, N) \
   ((__m256i) __builtin_ia32_palignr256_mask ((__v4di)(__m256i)(X), \
     (__v4di)(__m256i)(Y), (int)((N) * 8), \
-    (__v4di)(__m256i)(X), (__mmask32)(U)))
+    (__v4di)(__m256i)(W), (__mmask32)(U)))
 
 #define _mm256_mask_srli_epi16(W, U, A, B) \
   ((__m256i) __builtin_ia32_psrlwi256_mask ((__v16hi)(__m256i)(A), \
@@ -2172,7 +2172,7 @@ _mm_maskz_slli_epi16 (__mmask8 __U, __m128i __A, unsigned int __B)
 #define _mm_mask_alignr_epi8(W, U, X, Y, N) \
   ((__m128i) __builtin_ia32_palignr128_mask ((__v2di)(__m128i)(X), \
     (__v2di)(__m128i)(Y), (int)((N) * 8), \
-    (__v2di)(__m128i)(X), (__mmask16)(U)))
+    (__v2di)(__m128i)(W), (__mmask16)(U)))
 
 #define _mm_maskz_alignr_epi8(U, X, Y, N) \
   ((__m128i) __builtin_ia32_palignr128_mask ((__v2di)(__m128i)(X), \
diff --git a/gcc/config/i386/avx512vlintrin.h b/gcc/config/i386/avx512vlintrin.h
index 409a5d166b3..ca3b578f113 100644
--- a/gcc/config/i386/avx512vlintrin.h
+++ b/gcc/config/i386/avx512vlintrin.h
@@ -13404,7 +13404,7 @@ _mm256_permutex_pd (__m256d __X, const int __M)
 
 #define _mm_mask_alignr_epi64(W, U, X, Y, C) \
   ((__m128i)__builtin_ia32_alignq128_mask ((__v2di)(__m128i)(X), \
-    (__v2di)(__m128i)(Y), (int)(C), (__v2di)(__m128i)(X), (__mmask8)-1))
+    (__v2di)(__m128i)(Y), (int)(C), (__v2di)(__m128i)(W), (__mmask8)(U)))
 
 #define _mm_maskz_alignr_epi64(U, X, Y, C) \
   ((__m128i)__builtin_ia32_alignq128_mask ((__v2di)(__m128i)(X), \
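
For reviewers checking the "Correct operand usage" entries, here is a
minimal scalar sketch of what the 128-bit masked byte align is expected
to compute: bytes whose mask bit is clear should come from W, not from X.
The function name model_mask_alignr_epi8 is hypothetical and exists only
for this illustration. It is a plain C reference model written from the
documented semantics of the intrinsic, not GCC's implementation, and it
uses no AVX512 instructions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar reference model (an assumption for illustration,
   not part of GCC).  Concatenates X (high 16 bytes) and Y (low 16 bytes),
   shifts the 32-byte value right by N bytes, and keeps the low 16 bytes.
   Byte i of the result is written only if bit i of U is set; otherwise
   byte i of the passthrough operand W is kept.  */
static void
model_mask_alignr_epi8 (uint8_t dst[16], const uint8_t w[16], uint16_t u,
                        const uint8_t x[16], const uint8_t y[16],
                        unsigned n)
{
  uint8_t tmp[32];
  uint8_t out[16];

  assert (dst && w && x && y);

  memcpy (tmp, y, 16);       /* Low half of the concatenation.  */
  memcpy (tmp + 16, x, 16);  /* High half of the concatenation.  */

  for (unsigned i = 0; i < 16; i++)
    {
      unsigned src = i + n;
      /* Bytes shifted in from beyond the 32-byte temporary are zero.  */
      uint8_t shifted = src < 32 ? tmp[src] : 0;
      /* Merge masking: unselected bytes come from W.  */
      out[i] = ((u >> i) & 1) ? shifted : w[i];
    }

  /* Copy through a temporary so dst may alias any input.  */
  memcpy (dst, out, 16);
}
```

With W filled with 0xAA, U = 0x00FF and N = 4, the upper eight result
bytes stay 0xAA in this model. The pre-patch macro passed X as the
passthrough operand, so those bytes would have been taken from X instead.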