From patchwork Wed May 15 08:20:52 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 1935380
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, hongtao.liu@intel.com
Subject: [PATCH 1/3] [APX CCMP] Support APX CCMP
Date: Wed, 15 May 2024 16:20:52 +0800
Message-Id: <20240515082054.3934069-2-hongyu.wang@intel.com>
In-Reply-To: <20240515082054.3934069-1-hongyu.wang@intel.com>
References: <20240515082054.3934069-1-hongyu.wang@intel.com>

The APX CCMP feature implements a conditional compare: the compare is
executed only when EFLAGS matches a given condition.

CCMP introduces a default flags value (dfv). When the conditional compare
does not execute, the instruction sets the flags directly from dfv.

The instruction looks like

  ccmpeq {dfv=sf,of,cf,zf} %rax, %r16

This instruction tests whether EFLAGS matches condition code EQ. If it
does, %rax and %r16 are compared as with a legacy cmp. If it does not,
EFLAGS is updated according to dfv, which here means SF, OF, CF and ZF are
set. PF is set according to CF in dfv, and AF is always cleared.

The dfv part can be any combination of sf, of, cf and zf, such as
{dfv=cf,zf}, which sets only CF and ZF and clears the others, or {dfv=},
which clears all of the status flags.
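The behaviour described above can be modelled in a few lines of C. This is only an illustrative sketch under stated assumptions, not GCC code and not an ISA definition: the `flags` struct, the `DFV_*` bit values and the function names `legacy_cmp` and `ccmp` are made up for the example, and only the six status flags are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values for the dfv operand (assumption of this sketch).  */
enum { DFV_CF = 1, DFV_ZF = 2, DFV_SF = 4, DFV_OF = 8 };

/* Hypothetical container for the six status flags.  */
struct flags { bool cf, zf, sf, of, pf, af; };

/* Flags as a legacy 64-bit cmp sets them for the subtraction a - b.  */
struct flags
legacy_cmp (uint64_t a, uint64_t b)
{
  uint64_t r = a - b;
  struct flags f;
  f.cf = a < b;                               /* unsigned borrow */
  f.zf = r == 0;
  f.sf = (r >> 63) != 0;
  f.of = (((a ^ b) & (a ^ r)) >> 63) != 0;    /* signed overflow */
  /* PF is set when the low result byte has an even number of 1 bits.  */
  unsigned low = (unsigned) (r & 0xff);
  low ^= low >> 4;
  low ^= low >> 2;
  low ^= low >> 1;
  f.pf = (low & 1) == 0;
  f.af = ((a ^ b ^ r) & 0x10) != 0;           /* borrow out of bit 3 */
  return f;
}

/* Model of ccmp: compare if the source condition holds, otherwise load
   the flags from dfv (PF follows CF, AF is cleared).  */
struct flags
ccmp (bool cond_holds, unsigned dfv, uint64_t a, uint64_t b)
{
  assert (dfv < 16);
  if (cond_holds)
    return legacy_cmp (a, b);

  struct flags f;
  f.cf = (dfv & DFV_CF) != 0;
  f.zf = (dfv & DFV_ZF) != 0;
  f.sf = (dfv & DFV_SF) != 0;
  f.of = (dfv & DFV_OF) != 0;
  f.pf = f.cf;
  f.af = false;
  return f;
}
```

For instance, `ccmp (false, DFV_CF | DFV_ZF, x, y)` ignores `x` and `y` and yields CF, ZF and PF set with everything else clear, which corresponds to the `{dfv=cf,zf}` case in the text.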

To enable CCMP, we implement the target hooks TARGET_GEN_CCMP_FIRST and
TARGET_GEN_CCMP_NEXT so that the existing ccmp infrastructure can be
reused. We also extend the cstorem4 optab to support storing from the
different CC modes, as the current ccmp infrastructure requires.

gcc/ChangeLog:

	* config/i386/i386-expand.cc (ix86_gen_ccmp_first): New function
	that tests whether the first compare can be generated.
	(ix86_gen_ccmp_next): New function to emit a single compare and
	ccmp sequence.
	* config/i386/i386-opts.h (enum apx_features): Add apx_ccmp.
	* config/i386/i386-protos.h (ix86_gen_ccmp_first): New prototype.
	(ix86_gen_ccmp_next): Likewise.
	(ix86_get_flags_cc): Likewise.
	* config/i386/i386.cc (ix86_flags_cc): New enum.
	(ix86_ccmp_dfv_mapping): New string array to map condition codes
	to dfv.
	(ix86_print_operand): Handle special dfv flag for CCMP.
	(ix86_get_flags_cc): New function to return x86 CC enum.
	(TARGET_GEN_CCMP_FIRST): Define.
	(TARGET_GEN_CCMP_NEXT): Likewise.
	* config/i386/i386.h (TARGET_APX_CCMP): Define.
	* config/i386/i386.md (@ccmp): New define_insn to support ccmp.
	(UNSPEC_APX_DFV): New unspec for ccmp dfv.
	(ALL_CC): New mode iterator.
	(cstorecc4): Change to ...
	(cstore4) ... this, use ALL_CC to loop through all available
	CCmodes.
	* config/i386/i386.opt (apx_ccmp): Add enum value for ccmp.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-ccmp-1.c: New compile test.
	* gcc.target/i386/apx-ccmp-2.c: New runtime test.
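
As a rough illustration of the decision the TARGET_GEN_CCMP_NEXT hook makes in this patch, the sketch below models how the dfv value is chosen so that a skipped ccmp still leaves the right final answer in the flags. It is a stand-alone model, not the GCC implementation: the table is the patch's ix86_ccmp_dfv_mapping rewritten as bit masks, while the `F_*` bit values and the helper names `cc_holds`, `dfv_index` and `result_when_skipped` are assumptions introduced for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition codes in the same order as the patch's ix86_flags_cc enum.  */
enum cc { CC_O, CC_NO, CC_B, CC_NB, CC_E, CC_NE, CC_BE, CC_NBE,
          CC_S, CC_NS, CC_P, CC_NP, CC_L, CC_NL, CC_LE, CC_NLE };

/* Bit values are an assumption made for this sketch.  */
enum { F_CF = 1, F_ZF = 2, F_SF = 4, F_OF = 8 };

/* The patch's ix86_ccmp_dfv_mapping strings, rewritten as bit masks.  */
static const unsigned dfv_mask[16] = {
  F_OF, 0, F_CF, 0,
  F_ZF, 0, F_CF | F_ZF, 0,
  F_SF, 0, F_CF, 0,
  F_SF, F_SF | F_OF, F_SF | F_OF | F_ZF, F_SF | F_OF
};

/* Evaluate condition CC on the flags that a skipped ccmp leaves behind.  */
static bool
cc_holds (enum cc cc, unsigned dfv)
{
  bool cf = dfv & F_CF, zf = dfv & F_ZF, sf = dfv & F_SF, of = dfv & F_OF;
  bool pf = cf;  /* the commit message states that PF follows CF */
  switch (cc)
    {
    case CC_O:   return of;
    case CC_NO:  return !of;
    case CC_B:   return cf;
    case CC_NB:  return !cf;
    case CC_E:   return zf;
    case CC_NE:  return !zf;
    case CC_BE:  return cf || zf;
    case CC_NBE: return !cf && !zf;
    case CC_S:   return sf;
    case CC_NS:  return !sf;
    case CC_P:   return pf;
    case CC_NP:  return !pf;
    case CC_L:   return sf != of;
    case CC_NL:  return sf == of;
    case CC_LE:  return zf || sf != of;
    case CC_NLE: return !zf && sf == of;
    }
  return false;
}

/* Mirrors the patch: for AND the table index is flipped (dfv ^ 1), so
   the skipped compare reads as false; otherwise it reads as true.  */
int
dfv_index (enum cc cmp_cc, bool is_and)
{
  return is_and ? (cmp_cc ^ 1) : cmp_cc;
}

/* Truth value the final condition sees when the ccmp was skipped.  */
bool
result_when_skipped (enum cc cmp_cc, bool is_and)
{
  return cc_holds (cmp_cc, dfv_mask[dfv_index (cmp_cc, is_and)]);
}
```

The point of the model is the invariant: in an `a && b` chain a skipped second compare must read as false, and in an `a || b` chain it must read as true, which is what the `dfv ^ 1` step in ix86_gen_ccmp_next arranges.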
---
 gcc/config/i386/i386-expand.cc             | 121 +++++++++++++++++++++
 gcc/config/i386/i386-opts.h                |   6 +-
 gcc/config/i386/i386-protos.h              |   5 +
 gcc/config/i386/i386.cc                    |  50 +++++++++
 gcc/config/i386/i386.h                     |   1 +
 gcc/config/i386/i386.md                    |  35 +++++-
 gcc/config/i386/i386.opt                   |   3 +
 gcc/testsuite/gcc.target/i386/apx-ccmp-1.c |  63 +++++++++++
 gcc/testsuite/gcc.target/i386/apx-ccmp-2.c |  57 ++++++++++
 9 files changed, 337 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-ccmp-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-ccmp-2.c

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 1ab22fe7973..f00525e449f 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -25554,4 +25554,125 @@ ix86_expand_fast_convert_bf_to_sf (rtx val)
   return ret;
 }
 
+rtx
+ix86_gen_ccmp_first (rtx_insn **prep_seq, rtx_insn **gen_seq,
+                     rtx_code code, tree treeop0, tree treeop1)
+{
+  if (!TARGET_APX_CCMP)
+    return NULL_RTX;
+
+  rtx op0, op1, res;
+  machine_mode op_mode;
+
+  start_sequence ();
+  expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1, EXPAND_NORMAL);
+
+  op_mode = GET_MODE (op0);
+  if (op_mode == VOIDmode)
+    op_mode = GET_MODE (op1);
+
+  if (!(op_mode == DImode || op_mode == SImode || op_mode == HImode
+        || op_mode == QImode))
+    {
+      end_sequence ();
+      return NULL_RTX;
+    }
+
+  /* Canonicalize the operands according to mode.  */
+  if (!nonimmediate_operand (op0, op_mode))
+    op0 = force_reg (op_mode, op0);
+  if (!x86_64_general_operand (op1, op_mode))
+    op1 = force_reg (op_mode, op1);
+
+  *prep_seq = get_insns ();
+  end_sequence ();
+
+  start_sequence ();
+
+  res = ix86_expand_compare (code, op0, op1);
+
+  if (!res)
+    {
+      end_sequence ();
+      return NULL_RTX;
+    }
+  *gen_seq = get_insns ();
+  end_sequence ();
+
+  return res;
+}
+
+rtx
+ix86_gen_ccmp_next (rtx_insn **prep_seq, rtx_insn **gen_seq, rtx prev,
+                    rtx_code cmp_code, tree treeop0, tree treeop1,
+                    rtx_code bit_code)
+{
+  if (!TARGET_APX_CCMP)
+    return NULL_RTX;
+
+  rtx op0, op1, target;
+  machine_mode op_mode, cmp_mode, cc_mode = CCmode;
+  int unsignedp = TYPE_UNSIGNED (TREE_TYPE (treeop0));
+  insn_code icode;
+  rtx_code prev_code;
+  struct expand_operand ops[5];
+  int dfv;
+
+  push_to_sequence (*prep_seq);
+  expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1, EXPAND_NORMAL);
+
+  cmp_mode = op_mode = GET_MODE (op0);
+
+  if (!(op_mode == DImode || op_mode == SImode || op_mode == HImode
+        || op_mode == QImode))
+    {
+      end_sequence ();
+      return NULL_RTX;
+    }
+
+  icode = code_for_ccmp (op_mode);
+
+  op0 = prepare_operand (icode, op0, 2, op_mode, cmp_mode, unsignedp);
+  op1 = prepare_operand (icode, op1, 3, op_mode, cmp_mode, unsignedp);
+  if (!op0 || !op1)
+    {
+      end_sequence ();
+      return NULL_RTX;
+    }
+
+  *prep_seq = get_insns ();
+  end_sequence ();
+
+  target = gen_rtx_REG (cc_mode, FLAGS_REG);
+  dfv = ix86_get_flags_cc ((rtx_code) cmp_code);
+
+  prev_code = GET_CODE (prev);
+
+  if (bit_code != AND)
+    prev_code = reverse_condition (prev_code);
+  else
+    dfv = (int)(dfv ^ 1);
+
+  prev = gen_rtx_fmt_ee (prev_code, VOIDmode, XEXP (prev, 0),
+                         const0_rtx);
+
+  create_fixed_operand (&ops[0], target);
+  create_fixed_operand (&ops[1], prev);
+  create_fixed_operand (&ops[2], op0);
+  create_fixed_operand (&ops[3], op1);
+  create_fixed_operand (&ops[4], GEN_INT (dfv));
+
+  push_to_sequence (*gen_seq);
+  if (!maybe_expand_insn (icode, 5, ops))
+    {
+      end_sequence ();
+      return NULL_RTX;
+    }
+
+  *gen_seq = get_insns ();
+  end_sequence ();
+
+  return gen_rtx_fmt_ee ((rtx_code) cmp_code, VOIDmode, target, const0_rtx);
+}
+
 #include "gt-i386-expand.h"
diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h
index 60176ce609f..5fcc4927978 100644
--- a/gcc/config/i386/i386-opts.h
+++ b/gcc/config/i386/i386-opts.h
@@ -140,8 +140,10 @@ enum apx_features {
   apx_push2pop2 = 1 << 1,
   apx_ndd = 1 << 2,
   apx_ppx = 1 << 3,
-  apx_nf = 1<< 4,
-  apx_all = apx_egpr | apx_push2pop2 | apx_ndd | apx_ppx | apx_nf,
+  apx_nf = 1 << 4,
+  apx_ccmp = 1 << 5,
+  apx_all = apx_egpr | apx_push2pop2 | apx_ndd
+            | apx_ppx | apx_nf | apx_ccmp,
 };
 
 #endif
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index dbc861fb1ea..26e29df7312 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -242,6 +242,11 @@ extern void ix86_expand_atomic_fetch_op_loop (rtx, rtx, rtx, enum rtx_code,
 extern void ix86_expand_cmpxchg_loop (rtx *, rtx, rtx, rtx, rtx, rtx,
                                       bool, rtx_code_label *);
 extern rtx ix86_expand_fast_convert_bf_to_sf (rtx);
+extern rtx ix86_gen_ccmp_first (rtx_insn **, rtx_insn **, enum rtx_code,
+                                tree, tree);
+extern rtx ix86_gen_ccmp_next (rtx_insn **, rtx_insn **, rtx,
+                               enum rtx_code, tree, tree, enum rtx_code);
+extern int ix86_get_flags_cc (enum rtx_code);
 extern rtx ix86_memtag_untagged_pointer (rtx, rtx);
 extern bool ix86_memtag_can_tag_addresses (void);
 
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index b4838b7939e..2363cab1eae 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -433,6 +433,22 @@ static bool i386_asm_output_addr_const_extra (FILE *, rtx);
 static bool ix86_can_inline_p (tree, tree);
 static unsigned int ix86_minimum_incoming_stack_boundary (bool);
 
+typedef enum ix86_flags_cc
+{
+  X86_CCO = 0, X86_CCNO, X86_CCB, X86_CCNB,
+  X86_CCE, X86_CCNE, X86_CCBE, X86_CCNBE,
+  X86_CCS, X86_CCNS, X86_CCP, X86_CCNP,
+  X86_CCL, X86_CCNL, X86_CCLE, X86_CCNLE
+} ix86_cc;
+
+static const char *ix86_ccmp_dfv_mapping[] =
+{
+  "{dfv=of}", "{dfv=}", "{dfv=cf}", "{dfv=}",
+  "{dfv=zf}", "{dfv=}", "{dfv=cf, zf}", "{dfv=}",
+  "{dfv=sf}", "{dfv=}", "{dfv=cf}", "{dfv=}",
+  "{dfv=sf}", "{dfv=sf, of}", "{dfv=sf, of, zf}", "{dfv=sf, of}"
+};
+
 /* Whether -mtune= or -march= were specified */
 int ix86_tune_defaulted;
 
@@ -13690,6 +13706,7 @@ print_reg (rtx x, int code, FILE *file)
    M -- print addr32 prefix for TARGET_X32 with VSIB address.
    ! -- print NOTRACK prefix for jxx/call/ret instructions if required.
    N -- print maskz if it's constant 0 operand.
+   G -- print embedded flag for ccmp/ctest.
  */
 
 void
@@ -14083,6 +14100,14 @@ ix86_print_operand (FILE *file, rtx x, int code)
                  file);
           return;
 
+        case 'G':
+          {
+            int dfv = INTVAL (x);
+            const char *dfv_suffix = ix86_ccmp_dfv_mapping[dfv];
+            fputs (dfv_suffix, file);
+          }
+          return;
+
         case 'H':
           if (!offsettable_memref_p (x))
             {
@@ -16466,6 +16491,24 @@ ix86_convert_const_vector_to_integer (rtx op, machine_mode mode)
   return val.to_shwi ();
 }
 
+int ix86_get_flags_cc (rtx_code code)
+{
+  switch (code)
+    {
+      case NE: return X86_CCNE;
+      case EQ: return X86_CCE;
+      case GE: return X86_CCNL;
+      case GT: return X86_CCNLE;
+      case LE: return X86_CCLE;
+      case LT: return X86_CCL;
+      case GEU: return X86_CCNB;
+      case GTU: return X86_CCNBE;
+      case LEU: return X86_CCBE;
+      case LTU: return X86_CCB;
+      default: return -1;
+    }
+}
+
 /* Return TRUE or FALSE depending on whether the first SET in INSN
    has source and destination with matching CC modes, and that the
    CC mode is at least as constrained as REQ_MODE.  */
@@ -26765,6 +26808,13 @@ ix86_libgcc_floating_mode_supported_p
 #undef TARGET_MEMTAG_TAG_SIZE
 #define TARGET_MEMTAG_TAG_SIZE ix86_memtag_tag_size
 
+#undef TARGET_GEN_CCMP_FIRST
+#define TARGET_GEN_CCMP_FIRST ix86_gen_ccmp_first
+
+#undef TARGET_GEN_CCMP_NEXT
+#define TARGET_GEN_CCMP_NEXT ix86_gen_ccmp_next
+
+
 static bool
 ix86_libc_has_fast_function (int fcode ATTRIBUTE_UNUSED)
 {
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index f20ae4726da..5631bc4695a 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -56,6 +56,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
 #define TARGET_APX_NDD (ix86_apx_features & apx_ndd)
 #define TARGET_APX_PPX (ix86_apx_features & apx_ppx)
 #define TARGET_APX_NF (ix86_apx_features & apx_nf)
+#define TARGET_APX_CCMP (ix86_apx_features & apx_ccmp)
 
 #include "config/vxworks-dummy.h"
 
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index ddde83e57f5..49978d1f383 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -217,6 +217,10 @@ (define_c_enum "unspec" [
 
   ;; For APX PPX support
   UNSPEC_APX_PPX
+
+  ;; For APX CCMP support
+  ;; DFV = default flag value
+  UNSPEC_APX_DFV
 ])
 
 (define_c_enum "unspecv" [
@@ -1504,6 +1508,25 @@ (define_expand "cstore4"
   DONE;
 })
 
+(define_insn "@ccmp"
+ [(set (match_operand:CC 0 "flags_reg_operand")
+       (if_then_else:CC
+         (match_operator 1 "comparison_operator"
+          [(reg:CC FLAGS_REG) (const_int 0)])
+       (compare:CC
+         (minus:SWI (match_operand:SWI 2 "nonimmediate_operand" "m,")
+                    (match_operand:SWI 3 "" ","))
+         (const_int 0))
+       (unspec:SI
+         [(match_operand:SI 4 "const_0_to_15_operand")]
+         UNSPEC_APX_DFV)))]
+ "TARGET_APX_CCMP"
+ "ccmp%C1{}\t%G4 {%3, %2|%2, %3}"
+ [(set_attr "type" "icmp")
+  (set_attr "mode" "")
+  (set_attr "length_immediate" "1")
+  (set_attr "prefix" "evex")])
+
 (define_expand "@cmp_1"
   [(set (reg:CC FLAGS_REG)
         (compare:CC (match_operand:SWI48 0 "nonimmediate_operand")
@@ -1850,10 +1873,18 @@ (define_expand "cbranchcc4"
   DONE;
 })
 
-(define_expand "cstorecc4"
+;; For conditional compare, the middle-end hook will convert
+;; CCmode to sub-CCmode using SELECT_CC_MODE macro and try
+;; to find cstore in optab. Add ALL_CC to support
+;; the cstore after ccmp sequence.
+
+(define_mode_iterator ALL_CC
+  [CCGC CCGOC CCNO CCGZ CCA CCC CCO CCP CCS CCZ CC])
+
+(define_expand "cstore4"
   [(set (match_operand:QI 0 "register_operand")
         (match_operator 1 "comparison_operator"
-         [(match_operand 2 "flags_reg_operand")
+         [(match_operand:ALL_CC 2 "flags_reg_operand")
           (match_operand 3 "const0_operand")]))]
   ""
 {
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 66021d59d4e..7e6fe91d1d6 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -1359,6 +1359,9 @@ Enum(apx_features) String(ppx) Value(apx_ppx) Set(5)
 EnumValue
 Enum(apx_features) String(nf) Value(apx_nf) Set(6)
 
+EnumValue
+Enum(apx_features) String(ccmp) Value(apx_ccmp) Set(7)
+
 EnumValue
 Enum(apx_features) String(all) Value(apx_all) Set(1)
 
diff --git a/gcc/testsuite/gcc.target/i386/apx-ccmp-1.c b/gcc/testsuite/gcc.target/i386/apx-ccmp-1.c
new file mode 100644
index 00000000000..5a2dad89f1f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/apx-ccmp-1.c
@@ -0,0 +1,63 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mapx-features=ccmp" } */
+
+int
+f1 (int a)
+{
+  return a < 17 || a == 32;
+}
+
+int
+f2 (int a)
+{
+  return a > 33 || a == 18;
+}
+
+int
+f3 (int a, int b)
+{
+  return a != 19 && b > 34;
+}
+
+int
+f4 (int a, int b)
+{
+  return a < 35 && b == 20;
+}
+
+int
+f5 (short a)
+{
+  return a == 0 || a == 5;
+}
+
+int
+f6 (long long a)
+{
+  return a == 6 || a == 0;
+}
+
+int
+f7 (char a, char b)
+{
+  return a > 0 && b <= 7;
+}
+
+int
+f8 (int a, int b)
+{
+  return a == 9 && b > 0;
+}
+
+int
+f9 (int a, int b)
+{
+  a += b;
+  return a == 3 || a == 0;
+}
+
+/* { dg-final { scan-assembler-times "ccmpg" 2 } } */
+/* { dg-final { scan-assembler-times "ccmple" 2 } } */
+/* { dg-final { scan-assembler-times "ccmpne" 4 } } */
+/* { dg-final { scan-assembler-times "ccmpe" 1 } } */
+
diff --git a/gcc/testsuite/gcc.target/i386/apx-ccmp-2.c b/gcc/testsuite/gcc.target/i386/apx-ccmp-2.c
new file mode 100644
index 00000000000..30a1c216c1b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/apx-ccmp-2.c
@@ -0,0 +1,57 @@
+/* { dg-do run { target { ! ia32 } } } */
+/* { dg-require-effective-target apxf } */
+/* { dg-options "-O3 -mno-apxf" } */
+
+__attribute__((noinline, noclone, target("apxf")))
+int foo_apx(int a, int b, int c, int d)
+{
+  int sum = a;
+
+  if (a != c)
+    {
+      c += d;
+      a += b;
+      sum += a + c;
+      if (b != d && sum < c || sum > d)
+        {
+          b += d;
+          sum += b;
+        }
+    }
+
+  return sum;
+}
+
+__attribute__((noinline, noclone, target("no-apxf")))
+int foo_noapx(int a, int b, int c, int d)
+{
+  int sum = a;
+
+  if (a != c)
+    {
+      c += d;
+      a += b;
+      sum += a + c;
+      if (b != d && sum < c || sum > d)
+        {
+          b += d;
+          sum += b;
+        }
+    }
+
+  return sum;
+}
+
+int main (void)
+{
+  if (!__builtin_cpu_supports ("apxf"))
+    return 0;
+
+  int val1 = foo_noapx (23, 17, 32, 44);
+  int val2 = foo_apx (23, 17, 32, 44);
+
+  if (val1 != val2)
+    __builtin_abort ();
+
+  return 0;
+}

From patchwork Wed May 15 08:20:53 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 1935379
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, hongtao.liu@intel.com
Subject: [PATCH 2/3] [APX CCMP] Adjust strategy for selecting ccmp candidates
Date: Wed, 15 May 2024 16:20:53 +0800
Message-Id: <20240515082054.3934069-3-hongyu.wang@intel.com>
In-Reply-To: <20240515082054.3934069-1-hongyu.wang@intel.com>
References: <20240515082054.3934069-1-hongyu.wang@intel.com>

For the general ccmp scenario, the tree sequence looks like

  _1 = (a < b)
  _2 = (c < d)
  _3 = _1 & _2

The current ccmp expansion tries both compare orders for _1 and _2,
computes cost and cost2 for comparing _1 first and _2 first respectively,
and returns the sequence with the lower cost.

For x86 ccmp, an FP compare is not supported as a ccmp operand, but an
fp comi + int ccmp sequence is. With the current cost comparison model,
the fp comi + int ccmp sequence can never be generated: the model does not
check whether expand_ccmp_next returned a valid result, and the rtl cost
of the empty ccmp sequence is always smaller.

Check the expand_ccmp_next results ret and ret2, and return the valid one
before comparing costs.

gcc/ChangeLog:

	* ccmp.cc (expand_ccmp_expr_1): Check ret and ret2 of
	expand_ccmp_next, and return the valid one before
	comparing cost.
---
 gcc/ccmp.cc | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/gcc/ccmp.cc b/gcc/ccmp.cc
index 7cb525addf4..4b424220068 100644
--- a/gcc/ccmp.cc
+++ b/gcc/ccmp.cc
@@ -247,7 +247,17 @@ expand_ccmp_expr_1 (gimple *g, rtx_insn **prep_seq, rtx_insn **gen_seq)
               cost2 = seq_cost (prep_seq_2, speed_p);
               cost2 += seq_cost (gen_seq_2, speed_p);
             }
-          if (cost2 < cost1)
+
+          /* On x86, ccmp does not support FP operands, but the fcomi
+             insn can produce EFLAGS for a following integer ccmp.  So
+             if one of the operands is an FP compare, ret or ret2 can
+             fail; the cost of the corresponding empty sequence is then
+             always smaller, and the NULL sequence would be returned.
+             Check ret and ret2, and return the available one if the
+             other is NULL.  */
+          if ((!ret && ret2)
+              || (!(ret && !ret2)
+                  && cost2 < cost1))
             {
               *prep_seq = prep_seq_2;
               *gen_seq = gen_seq_2;

From patchwork Wed May 15 08:20:54 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 1935378
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, hongtao.liu@intel.com
Subject: [PATCH 3/3] [APX CCMP] Support ccmp for float compare
Date: Wed, 15 May 2024 16:20:54 +0800
Message-Id: <20240515082054.3934069-4-hongyu.wang@intel.com>
In-Reply-To: <20240515082054.3934069-1-hongyu.wang@intel.com>
References: <20240515082054.3934069-1-hongyu.wang@intel.com>

The ccmp insn itself does not support FP compares, but x86 has the fp comi
insn, which sets EFLAGS and can therefore provide the scc input to ccmp.
Allow scalar FP compares in ix86_gen_ccmp_first, except ORDERED/UNORDERED
compares, which cannot be identified by ccmp.

gcc/ChangeLog:

	* config/i386/i386-expand.cc (ix86_gen_ccmp_first): Add fp
	compare and check the allowed fp compare type.
	(ix86_gen_ccmp_next): Adjust compare_code input to ccmp for
	fp compare.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-ccmp-1.c: Add test for fp compare.
	* gcc.target/i386/apx-ccmp-2.c: Likewise.
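
To illustrate why the FP path needs care about which condition code feeds ccmp, here is a small stand-alone model of the flags an x86 comi-style compare produces. It is a sketch, not GCC code: the struct and function names are hypothetical, and it assumes (as an illustration of the operand swap this patch performs when NaNs are honored) that `a < b` is evaluated as "b above a" rather than "a below b".

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical container for the flags a comi-style compare writes.  */
struct comi_flags { bool zf, pf, cf; };

/* Flag results of an x86 comi-style compare of a with b:
   unordered sets ZF, PF and CF; a < b sets CF; a == b sets ZF;
   a > b clears all three.  */
struct comi_flags
comi (double a, double b)
{
  struct comi_flags f = { false, false, false };
  if (isnan (a) || isnan (b))
    f.zf = f.pf = f.cf = true;
  else if (a < b)
    f.cf = true;
  else if (a == b)
    f.zf = true;
  return f;
}

/* Unsigned "above" condition: CF = 0 and ZF = 0.  */
bool
cc_above (struct comi_flags f)
{
  return !f.cf && !f.zf;
}

/* Unsigned "below" condition: CF = 1.  */
bool
cc_below (struct comi_flags f)
{
  return f.cf;
}

/* a < b computed with swapped operands and the "above" condition, so
   that an unordered result reads as false.  */
bool
less_than_via_swap (double a, double b)
{
  return cc_above (comi (b, a));
}
```

In this model "below" is also true for an unordered result because CF is set, so testing `a < b` directly with "below" would give the wrong answer for NaN inputs, while the swapped "above" form stays false.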
---
 gcc/config/i386/i386-expand.cc             | 53 ++++++++++++++++++++--
 gcc/testsuite/gcc.target/i386/apx-ccmp-1.c | 45 +++++++++++++++++-
 gcc/testsuite/gcc.target/i386/apx-ccmp-2.c | 47 +++++++++++++++++++
 3 files changed, 138 insertions(+), 7 deletions(-)

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index f00525e449f..7507034dc91 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -25571,18 +25571,58 @@ ix86_gen_ccmp_first (rtx_insn **prep_seq, rtx_insn **gen_seq,
   if (op_mode == VOIDmode)
     op_mode = GET_MODE (op1);

+  /* We only support the following scalar comparisons that use just 1
+     instruction: DI/SI/QI/HI/DF/SF/HF.
+     Unordered/Ordered compares cannot be correctly identified by
+     ccmp so they are not supported.  */
   if (!(op_mode == DImode || op_mode == SImode || op_mode == HImode
-        || op_mode == QImode))
+        || op_mode == QImode || op_mode == DFmode || op_mode == SFmode
+        || op_mode == HFmode)
+      || code == ORDERED
+      || code == UNORDERED)
     {
       end_sequence ();
       return NULL_RTX;
     }

   /* Canonicalize the operands according to mode.  */
-  if (!nonimmediate_operand (op0, op_mode))
-    op0 = force_reg (op_mode, op0);
-  if (!x86_64_general_operand (op1, op_mode))
-    op1 = force_reg (op_mode, op1);
+  if (SCALAR_INT_MODE_P (op_mode))
+    {
+      if (!nonimmediate_operand (op0, op_mode))
+        op0 = force_reg (op_mode, op0);
+      if (!x86_64_general_operand (op1, op_mode))
+        op1 = force_reg (op_mode, op1);
+    }
+  else
+    {
+      /* op0/op1 can be canonicalized from expand_fp_compare, so
+         just adjust the code to make it generate a supported fp
+         condition.  */
+      if (ix86_fp_compare_code_to_integer (code) == UNKNOWN)
+        {
+          /* First try to split condition if we don't need to honor
+             NaNs, as the ORDERED/UNORDERED check always falls
+             through.  */
+          if (!HONOR_NANS (op_mode))
+            {
+              rtx_code first_code;
+              split_comparison (code, op_mode, &first_code, &code);
+            }
+          /* Otherwise try to swap the operand order and check if
+             the comparison is supported.  */
+          else
+            {
+              code = swap_condition (code);
+              std::swap (op0, op1);
+            }
+
+          if (ix86_fp_compare_code_to_integer (code) == UNKNOWN)
+            {
+              end_sequence ();
+              return NULL_RTX;
+            }
+        }
+    }

   *prep_seq = get_insns ();
   end_sequence ();
@@ -25647,6 +25687,9 @@ ix86_gen_ccmp_next (rtx_insn **prep_seq, rtx_insn **gen_seq, rtx prev,
   dfv = ix86_get_flags_cc ((rtx_code) cmp_code);

   prev_code = GET_CODE (prev);
+  /* Fixup FP compare code here.  */
+  if (GET_MODE (XEXP (prev, 0)) == CCFPmode)
+    prev_code = ix86_fp_compare_code_to_integer (prev_code);

   if (bit_code != AND)
     prev_code = reverse_condition (prev_code);
diff --git a/gcc/testsuite/gcc.target/i386/apx-ccmp-1.c b/gcc/testsuite/gcc.target/i386/apx-ccmp-1.c
index 5a2dad89f1f..e4e112f07e0 100644
--- a/gcc/testsuite/gcc.target/i386/apx-ccmp-1.c
+++ b/gcc/testsuite/gcc.target/i386/apx-ccmp-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { ! ia32 } } } */
-/* { dg-options "-O2 -mapx-features=ccmp" } */
+/* { dg-options "-O2 -ffast-math -mapx-features=ccmp" } */

 int
 f1 (int a)
@@ -56,8 +56,49 @@ f9 (int a, int b)
   return a == 3 || a == 0;
 }

+int
+f10 (float a, int b, float c)
+{
+  return a > c || b < 19;
+}
+
+int
+f11 (float a, int b)
+{
+  return a == 0.0 && b > 21;
+}
+
+int
+f12 (double a, int b)
+{
+  return a < 3.0 && b != 23;
+}
+
+int
+f13 (double a, double b, int c, int d)
+{
+  a += b;
+  c += d;
+  return a != b || c == d;
+}
+
+int
+f14 (double a, int b)
+{
+  return b != 0 && a < 1.5;
+}
+
+int
+f15 (double a, double b, int c, int d)
+{
+  return c != d || a <= b;
+}
+
 /* { dg-final { scan-assembler-times "ccmpg" 2 } } */
 /* { dg-final { scan-assembler-times "ccmple" 2 } } */
 /* { dg-final { scan-assembler-times "ccmpne" 4 } } */
-/* { dg-final { scan-assembler-times "ccmpe" 1 } } */
+/* { dg-final { scan-assembler-times "ccmpe" 3 } } */
+/* { dg-final { scan-assembler-times "ccmpbe" 1 } } */
+/* { dg-final { scan-assembler-times "ccmpa" 1 } } */
+/* { dg-final { scan-assembler-times "ccmpbl" 2 } } */

diff --git a/gcc/testsuite/gcc.target/i386/apx-ccmp-2.c b/gcc/testsuite/gcc.target/i386/apx-ccmp-2.c
index 30a1c216c1b..0123a686d2c 100644
--- a/gcc/testsuite/gcc.target/i386/apx-ccmp-2.c
+++ b/gcc/testsuite/gcc.target/i386/apx-ccmp-2.c
@@ -42,6 +42,47 @@ int foo_noapx(int a, int b, int c, int d)
   return sum;
 }

+__attribute__((noinline, noclone,
+               optimize(("finite-math-only")), target("apxf")))
+double foo_fp_apx(int a, double b, int c, double d)
+{
+  int sum = a;
+  double sumd = b;
+
+  if (a != c)
+    {
+      sum += a;
+      if (a < c || sumd != d || sum > c)
+        {
+          c += a;
+          sum += a + c;
+        }
+    }
+
+  return sum + sumd;
+}
+
+__attribute__((noinline, noclone,
+               optimize(("finite-math-only")), target("no-apxf")))
+double foo_fp_noapx(int a, double b, int c, double d)
+{
+  int sum = a;
+  double sumd = b;
+
+  if (a != c)
+    {
+      sum += a;
+      if (a < c || sumd != d || sum > c)
+        {
+          c += a;
+          sum += a + c;
+        }
+    }
+
+  return sum + sumd;
+}
+
+
 int main (void)
 {
   if (!__builtin_cpu_supports ("apxf"))
@@ -53,5 +94,11 @@ int main (void)
   if (val1 != val2)
     __builtin_abort ();

+  double val3 = foo_fp_noapx (24, 7.5, 32, 2.0);
+  double val4 = foo_fp_apx (24, 7.5, 32, 2.0);
+
+  if (val3 != val4)
+    __builtin_abort ();
+
   return 0;
 }
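
The operand-swap path in ix86_gen_ccmp_first (taken when NaNs must be honored) can be illustrated with a simplified standalone model. Everything below is hypothetical and is not the GCC implementation: the enums stand in for rtx codes, and `direct_cond` assumes that only GT and GE map directly to a flag condition. That assumption follows from comi setting ZF=PF=CF=1 for unordered operands, which makes the "above" conditions false on NaN while "below" and "equal" would be true.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for GCC's rtx comparison codes.  */
enum fp_code { FP_GT, FP_GE, FP_LT, FP_LE, FP_EQ, FP_NE };
enum int_cond { COND_UNKNOWN, COND_A, COND_AE };

/* Assumed mapping: only GT/GE read the comi flags correctly when an
   operand may be NaN.  */
static enum int_cond
direct_cond (enum fp_code code)
{
  switch (code)
    {
    case FP_GT: return COND_A;
    case FP_GE: return COND_AE;
    default:    return COND_UNKNOWN;
    }
}

/* Mirror of a comparison under operand exchange: a < b <=> b > a.  */
static enum fp_code
swapped (enum fp_code code)
{
  switch (code)
    {
    case FP_GT: return FP_LT;
    case FP_GE: return FP_LE;
    case FP_LT: return FP_GT;
    case FP_LE: return FP_GE;
    default:    return code;   /* EQ and NE are symmetric.  */
    }
}

/* Pick a condition for "op0 CODE op1" when NaNs must be honored.
   Sets *swap_operands when the operands have to be exchanged and
   returns COND_UNKNOWN when the comparison cannot be handled.  */
static enum int_cond
pick_cond (enum fp_code code, bool *swap_operands)
{
  *swap_operands = false;
  enum int_cond c = direct_cond (code);
  if (c == COND_UNKNOWN)
    {
      c = direct_cond (swapped (code));
      *swap_operands = (c != COND_UNKNOWN);
    }
  return c;
}
```

Under these assumptions `pick_cond (FP_LT, &swap)` yields `COND_A` with `swap` set, while `FP_EQ` and `FP_NE` stay `COND_UNKNOWN` because swapping a symmetric comparison changes nothing. This mirrors the patch's second check, which returns NULL_RTX when the code is still unsupported after the adjustment.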