From patchwork Tue Sep 10 06:35:28 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Levy Hsu
X-Patchwork-Id: 1982966
From: Levy Hsu
To: gcc-patches@gcc.gnu.org
Cc: admin@levyhsu.com, liwei.xu@intel.com, crazylht@gmail.com, ubizjak@gmail.com
Subject: [PATCH] i386: Enable V2BF/V4BF vec_cmp with AVX10.2 vcmppbf16
Date: Tue, 10 Sep 2024 06:35:28 +0000
Message-ID: <20240910063548.18245-1-admin@levyhsu.com>
X-Mailer: git-send-email 2.43.0

gcc/ChangeLog:

	* config/i386/i386.cc (ix86_get_mask_mode): Enable BFmode for
	targetm.vectorize.get_mask_mode with AVX10.2.
	* config/i386/mmx.md (vec_cmp<mode>qi): Implement vec_cmpv2bfqi
	and vec_cmpv4bfqi.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/part-vect-vec_cmpbf.c: New test.
---
 gcc/config/i386/i386.cc                       |  3 ++-
 gcc/config/i386/mmx.md                        | 17 ++++++++++++
 .../gcc.target/i386/part-vect-vec_cmpbf.c     | 26 +++++++++++++++++++
 3 files changed, 45 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/part-vect-vec_cmpbf.c

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 45320124b91..82267552474 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -24682,7 +24682,8 @@ ix86_get_mask_mode (machine_mode data_mode)
       /* AVX512FP16 only supports vector comparison
          to kmask for _Float16.  */
       || (TARGET_AVX512VL && TARGET_AVX512FP16
-          && GET_MODE_INNER (data_mode) == E_HFmode))
+          && GET_MODE_INNER (data_mode) == E_HFmode)
+      || TARGET_AVX10_2_256 && GET_MODE_INNER (data_mode) == E_BFmode)
     {
       if (elem_size == 4
           || elem_size == 8
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 4bc191b874b..95d9356694a 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -2290,6 +2290,23 @@
   DONE;
 })
 
+;;This instruction does not generate floating point exceptions
+(define_expand "vec_cmp<mode>qi"
+  [(set (match_operand:QI 0 "register_operand")
+        (match_operator:QI 1 ""
+          [(match_operand:VBF_32_64 2 "register_operand")
+           (match_operand:VBF_32_64 3 "nonimmediate_operand")]))]
+  "TARGET_AVX10_2_256"
+{
+  rtx op2 = lowpart_subreg (V8BFmode,
+                            force_reg (<MODE>mode, operands[2]), <MODE>mode);
+  rtx op3 = lowpart_subreg (V8BFmode,
+                            force_reg (<MODE>mode, operands[3]), <MODE>mode);
+
+  emit_insn (gen_vec_cmpv8bfqi (operands[0], operands[1], op2, op3));
+  DONE;
+})
+
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;
 ;; Parallel half-precision floating point rounding operations.
diff --git a/gcc/testsuite/gcc.target/i386/part-vect-vec_cmpbf.c b/gcc/testsuite/gcc.target/i386/part-vect-vec_cmpbf.c
new file mode 100644
index 00000000000..0bb720b6432
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/part-vect-vec_cmpbf.c
@@ -0,0 +1,26 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mavx10.2" } */
+/* { dg-final { scan-assembler-times "vcmppbf16" 10 } } */
+
+typedef __bf16 __attribute__((__vector_size__ (4))) v2bf;
+typedef __bf16 __attribute__((__vector_size__ (8))) v4bf;
+
+
+#define VCMPMN(type, op, name)        \
+type                                  \
+__attribute__ ((noinline, noclone))   \
+vec_cmp_##type##type##name (type a, type b) \
+{                                     \
+  return a op b;                      \
+}
+
+VCMPMN (v4bf, <, lt)
+VCMPMN (v2bf, <, lt)
+VCMPMN (v4bf, <=, le)
+VCMPMN (v2bf, <=, le)
+VCMPMN (v4bf, >, gt)
+VCMPMN (v2bf, >, gt)
+VCMPMN (v4bf, >=, ge)
+VCMPMN (v2bf, >=, ge)
+VCMPMN (v4bf, ==, eq)
+VCMPMN (v2bf, ==, eq)
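
Reviewer note: the widening step in the new mmx.md expander (view the 2- or 4-element BF16 vector as the low lanes of a V8BF vector, then reuse the existing 8-lane compare into a QImode mask) can be modelled in portable C. What follows is only a sketch under stated assumptions, not GCC code: bf16_to_float, cmp_lt_v8bf and cmp_lt_partial are hypothetical names, bfloat16 values are passed as raw 16-bit patterns, and the upper lanes are zero-filled and masked off here, whereas the lowpart_subreg in the expander leaves them unspecified. It assumes the consumer of the mask reads only the low bits that correspond to live lanes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen a raw bfloat16 bit pattern to float.
   bfloat16 is the upper 16 bits of an IEEE-754 binary32 value.  */
static float
bf16_to_float (uint16_t bits)
{
  uint32_t wide = (uint32_t) bits << 16;
  float f;
  memcpy (&f, &wide, sizeof f);
  return f;
}

/* Hypothetical model of an 8-lane "less than" compare into a mask,
   one result bit per lane (bit i corresponds to lane i).  */
static uint8_t
cmp_lt_v8bf (const uint16_t a[8], const uint16_t b[8])
{
  uint8_t mask = 0;
  for (int i = 0; i < 8; i++)
    if (bf16_to_float (a[i]) < bf16_to_float (b[i]))
      mask |= (uint8_t) (1u << i);
  return mask;
}

/* Model of the partial-vector case: place the NLANES live elements in
   the low lanes of an 8-lane vector, compare all 8 lanes, and keep only
   the low NLANES bits of the mask.  Zero-filling the upper lanes is an
   assumption of this model only.  */
static uint8_t
cmp_lt_partial (const uint16_t *a, const uint16_t *b, int nlanes)
{
  assert (nlanes >= 0 && nlanes <= 8);

  uint16_t wa[8] = { 0 };
  uint16_t wb[8] = { 0 };
  memcpy (wa, a, (size_t) nlanes * sizeof *a);
  memcpy (wb, b, (size_t) nlanes * sizeof *b);

  return (uint8_t) (cmp_lt_v8bf (wa, wb) & ((1u << nlanes) - 1u));
}
```

With a = {1.0, 2.0, -1.0, 0.5} and b = {2.0, 1.0, 0.0, 0.5} encoded as {0x3f80, 0x4000, 0xbf80, 0x3f00} and {0x4000, 0x3f80, 0x0000, 0x3f00}, cmp_lt_partial (a, b, 4) gives 0x5 and cmp_lt_partial (a, b, 2) gives 0x1.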