From patchwork Wed May 15 08:20:53 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 1935379
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, hongtao.liu@intel.com
Subject: [PATCH 2/3] [APX CCMP] Adjust strategy for selecting ccmp candidates
Date: Wed, 15 May 2024 16:20:53 +0800
Message-Id: <20240515082054.3934069-3-hongyu.wang@intel.com>
In-Reply-To: <20240515082054.3934069-1-hongyu.wang@intel.com>
References: <20240515082054.3934069-1-hongyu.wang@intel.com>
X-Mailer: git-send-email 2.31.1

For the general ccmp scenario, the tree sequence looks like

  _1 = (a < b)
  _2 = (c < d)
  _3 = _1 & _2

The current ccmp expansion tries both compare orders for _1 and _2,
computes the costs (cost1 and cost2) of expanding with _1 first and
with _2 first, and returns the sequence with the lower cost.

For x86 ccmp, an FP compare is not supported as a ccmp operand, but
an fp comi + int ccmp sequence is supported. With the current cost
comparison model, the fp comi + int ccmp sequence can never be
generated: the model does not check whether expand_ccmp_next returned
a valid result, and the rtl cost of the empty ccmp sequence is always
smaller.

Check the expand_ccmp_next results ret and ret2, and return the valid
one before comparing costs.

gcc/ChangeLog:

	* ccmp.cc (expand_ccmp_expr_1): Check ret and ret2 of
	expand_ccmp_next, and return the valid one before
	comparing cost.
---
 gcc/ccmp.cc | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/gcc/ccmp.cc b/gcc/ccmp.cc
index 7cb525addf4..4b424220068 100644
--- a/gcc/ccmp.cc
+++ b/gcc/ccmp.cc
@@ -247,7 +247,17 @@ expand_ccmp_expr_1 (gimple *g, rtx_insn **prep_seq, rtx_insn **gen_seq)
           cost2 = seq_cost (prep_seq_2, speed_p);
           cost2 += seq_cost (gen_seq_2, speed_p);
         }
-      if (cost2 < cost1)
+
+      /* For the x86 target, ccmp does not support fp operands, but
+         the fcomi insn can produce eflags for a following int
+         ccmp.  So if one of the ops is an fp compare, ret or ret2 can
+         fail, and the cost of the corresponding empty seq will
+         always be smaller, so the NULL sequence would be returned.
+         Check ret and ret2, and return the available one if
+         the other is NULL.  */
+      if ((!ret && ret2)
+          || (!(ret && !ret2)
+              && cost2 < cost1))
         {
           *prep_seq = prep_seq_2;
           *gen_seq = gen_seq_2;
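
For illustration only, the new selection condition can be modelled outside of GCC as a small standalone predicate. This is a sketch, not code from the patch: the names use_swapped_order, use_swapped_order_old, ret_ok and ret2_ok are hypothetical, with the two booleans standing in for "expand_ccmp_next returned non-NULL" for the original and the swapped compare order. It assumes, as the description above states, that a failed expansion leaves an empty sequence whose cost is lower than that of any valid one.

```cpp
#include <cassert>

// Hypothetical model of the old condition: only the costs are compared,
// so a failed (empty, cheap) expansion wins over a valid one.
constexpr bool
use_swapped_order_old (int cost1, int cost2)
{
  return cost2 < cost1;
}

// Hypothetical model of the new condition from the patch.
// ret_ok  : the original compare order expanded successfully (ret != NULL).
// ret2_ok : the swapped compare order expanded successfully (ret2 != NULL).
constexpr bool
use_swapped_order (bool ret_ok, bool ret2_ok, int cost1, int cost2)
{
  return (!ret_ok && ret2_ok)
         || (!(ret_ok && !ret2_ok) && cost2 < cost1);
}

// Walk through the four validity combinations with made-up costs.
inline void
demo ()
{
  // Only the swapped order is valid: it must be chosen even though the
  // empty first sequence is cheaper.  The old condition rejects it.
  assert (use_swapped_order (false, true, 0, 8));
  assert (!use_swapped_order_old (0, 8));

  // Only the original order is valid: keep it even though the empty
  // swapped sequence is cheaper.  The old condition picks the empty one.
  assert (!use_swapped_order (true, false, 8, 0));
  assert (use_swapped_order_old (8, 0));

  // Both valid: fall back to the plain cost comparison.
  assert (use_swapped_order (true, true, 8, 4));
  assert (!use_swapped_order (true, true, 4, 8));

  // Both failed with equal (empty) costs: the original order is kept.
  assert (!use_swapped_order (false, false, 0, 0));
}
```

The first two cases in demo are the ones the patch targets: when exactly one order expands, validity decides and the costs are ignored, which is what lets the fp comi + int ccmp sequence survive.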