From patchwork Mon Sep 2 08:41:22 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Levy Hsu
X-Patchwork-Id: 1979574
From: Levy Hsu
To: gcc-patches@gcc.gnu.org
Cc: admin@levyhsu.com, liwei.xu@intel.com, crazylht@gmail.com, ubizjak@gmail.com
Subject: [PATCH] i386: Support partial vectorized V2BF/V4BF smaxmin
Date: Mon, 2 Sep 2024 08:41:22 +0000
Message-ID: <20240902084202.1862005-1-admin@levyhsu.com>
X-Mailer: git-send-email 2.43.0
List-Id: Gcc-patches mailing list

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

This patch supports smaxmin for partial vectorized V2BF/V4BF.

gcc/ChangeLog:

	* config/i386/mmx.md (<code><mode>3): New define_expand for
	V2BF/V4BF smaxmin.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx10_2-partial-bf-vector-smaxmin-1.c: New test.
---
 gcc/config/i386/mmx.md                        | 19 ++++++++++
 .../avx10_2-partial-bf-vector-smaxmin-1.c     | 36 +++++++++++++++++++
 2 files changed, 55 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_2-partial-bf-vector-smaxmin-1.c

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 9116ddb5321..3f12a1349ab 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -2098,6 +2098,25 @@
   DONE;
 })
 
+(define_expand "<code><mode>3"
+  [(set (match_operand:VBF_32_64 0 "register_operand")
+	(smaxmin:VBF_32_64
+	  (match_operand:VBF_32_64 1 "nonimmediate_operand")
+	  (match_operand:VBF_32_64 2 "nonimmediate_operand")))]
+  "TARGET_AVX10_2_256"
+{
+  rtx op0 = gen_reg_rtx (V8BFmode);
+  rtx op1 = lowpart_subreg (V8BFmode,
+			    force_reg (<MODE>mode, operands[1]), <MODE>mode);
+  rtx op2 = lowpart_subreg (V8BFmode,
+			    force_reg (<MODE>mode, operands[2]), <MODE>mode);
+
+  emit_insn (gen_<code>v8bf3 (op0, op1, op2));
+
+  emit_move_insn (operands[0], lowpart_subreg (<MODE>mode, op0, V8BFmode));
+  DONE;
+})
+
 (define_expand "sqrt<mode>2"
   [(set (match_operand:VHF_32_64 0 "register_operand")
 	(sqrt:VHF_32_64
diff --git a/gcc/testsuite/gcc.target/i386/avx10_2-partial-bf-vector-smaxmin-1.c b/gcc/testsuite/gcc.target/i386/avx10_2-partial-bf-vector-smaxmin-1.c
new file mode 100644
index 00000000000..0a7cc58e29d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_2-partial-bf-vector-smaxmin-1.c
@@ -0,0 +1,36 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-mavx10.2 -Ofast" } */
+/* { dg-final { scan-assembler-times "vmaxpbf16" 2 } } */
+/* { dg-final { scan-assembler-times "vminpbf16" 2 } } */
+
+void
+maxpbf16_64 (__bf16* restrict dest, __bf16* restrict src1, __bf16* restrict src2)
+{
+  int i;
+  for (i = 0; i < 4; i++)
+    dest[i] = src1[i] > src2[i] ? src1[i] : src2[i];
+}
+
+void
+maxpbf16_32 (__bf16* restrict dest, __bf16* restrict src1, __bf16* restrict src2)
+{
+  int i;
+  for (i = 0; i < 2; i++)
+    dest[i] = src1[i] > src2[i] ? src1[i] : src2[i];
+}
+
+void
+minpbf16_64 (__bf16* restrict dest, __bf16* restrict src1, __bf16* restrict src2)
+{
+  int i;
+  for (i = 0; i < 4; i++)
+    dest[i] = src1[i] < src2[i] ? src1[i] : src2[i];
+}
+
+void
+minpbf16_32 (__bf16* restrict dest, __bf16* restrict src1, __bf16* restrict src2)
+{
+  int i;
+  for (i = 0; i < 2; i++)
+    dest[i] = src1[i] < src2[i] ? src1[i] : src2[i];
+}
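
Note for readers: the expander in this patch views the 2- or 4-element operands as the low part of a V8BF register, emits the full-width max/min, and moves the low part of the result back. The plain C below is only a rough scalar model of that widen/operate/narrow idea, not code from the patch. The names full_width_max, partial_max and FULL_LANES are hypothetical, float stands in for __bf16 so that it builds with any C99 compiler, and the upper lanes are zero-filled here for determinism, whereas the real lowpart_subreg leaves them unspecified.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lane count of the full-width operation (V8BF has 8).  */
enum { FULL_LANES = 8 };

/* Hypothetical stand-in for the full-width instruction: element-wise
   maximum over all 8 lanes, written the same way as the test loops.  */
static void
full_width_max (float *dst, const float *a, const float *b)
{
  for (size_t i = 0; i < FULL_LANES; i++)
    dst[i] = a[i] > b[i] ? a[i] : b[i];
}

/* Model of the expander: place the n-lane operands (n = 2 or 4 in the
   patch) in the low lanes of 8-lane temporaries, run the full-width
   operation, and keep only the low n lanes of the result.  */
static void
partial_max (float *dst, const float *a, const float *b, size_t n)
{
  /* Upper lanes are zeroed only to keep this sketch deterministic.  */
  float wa[FULL_LANES] = { 0 };
  float wb[FULL_LANES] = { 0 };
  float wd[FULL_LANES];

  assert (n <= FULL_LANES);
  memcpy (wa, a, n * sizeof *a);
  memcpy (wb, b, n * sizeof *b);
  full_width_max (wd, wa, wb);
  memcpy (dst, wd, n * sizeof *dst);
}
```

Calling partial_max (d, a, b, 4) corresponds to the V4BF case and partial_max (d, a, b, 2) to the V2BF case; a min variant would differ only in the comparison. The upper lanes never reach the destination, which is why the expander can leave them unspecified.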