From patchwork Thu Apr 16 12:25:25 2020
X-Patchwork-Submitter: "Andre Vieira (lists)"
X-Patchwork-Id: 1271683
From: "Andre Vieira (lists)"
Subject: [PATCH 3/19] aarch64: Improve cas generation
To: "gcc-patches@gcc.gnu.org"
Message-ID: <18c55fbb-6f57-952b-8226-3edceebdf5ae@arm.com>
Date: Thu, 16 Apr 2020 13:25:25 +0100
List-Id: Gcc-patches mailing list

Do not zero-extend the input to the cas for subword operations;
instead, use the appropriate zero-extending compare insns.
Correct the predicates and constraints for immediate expected operand.

2020-04-16  Andre Vieira

    Backport from mainline.
    2018-10-31  Richard Henderson

    * config/aarch64/aarch64.c (aarch64_gen_compare_reg_maybe_ze): New.
    (aarch64_split_compare_and_swap): Use it.
    (aarch64_expand_compare_and_swap): Likewise.  Remove convert_modes;
    test oldval against the proper predicate.
    * config/aarch64/atomics.md (atomic_compare_and_swap<mode>):
    Use nonmemory_operand for expected.
    (cas_short_expected_pred): New.
    (aarch64_compare_and_swap<mode>): Use it; use "rn" not "rI" to match.
    (aarch64_compare_and_swap<mode>): Use "rn" not "rI" for expected.
    * config/aarch64/predicates.md (aarch64_plushi_immediate): New.
    (aarch64_plushi_operand): New.

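For readers following the description above, here is a rough C model of the comparison this patch changes. It is not part of the patch, and the function name and parameters are made up for illustration. The assumption is that a subword CAS leaves the loaded value zero-extended in a 32-bit register, while a QImode or HImode constant "expected" value is held sign-extended. The constant therefore has to be masked to the mode width before the compare, which is what aarch64_gen_compare_reg_maybe_ze does with GET_MODE_MASK.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not GCC code.  LOADED is the value a subword CAS
   returned, already zero-extended to 32 bits.  EXPECTED is the constant
   as the compiler holds it, i.e. sign-extended from BITS (8 or 16).  */
static bool
subword_cas_matched (uint32_t loaded, int64_t expected, unsigned bits)
{
  /* Mirrors INTVAL (y) & GET_MODE_MASK (y_mode) for QImode/HImode.  */
  uint32_t mask = (bits == 8) ? 0xffu : 0xffffu;
  return loaded == ((uint32_t) expected & mask);
}
```

Under this model subword_cas_matched (0xff, -1, 8) holds, whereas a plain 32-bit compare of 0xff against -1 would report a mismatch.
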
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index c83a9f7ae78d4ed3da6636fce4d1f57c27048756..b6a6e314153ecf4a7ae1b83cfb64e6192197edc5 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1524,6 +1524,33 @@ aarch64_gen_compare_reg (RTX_CODE code, rtx x, rtx y)
   return cc_reg;
 }
 
+/* Similarly, but maybe zero-extend Y if Y_MODE < SImode.  */
+
+static rtx
+aarch64_gen_compare_reg_maybe_ze (RTX_CODE code, rtx x, rtx y,
+				  machine_mode y_mode)
+{
+  if (y_mode == E_QImode || y_mode == E_HImode)
+    {
+      if (CONST_INT_P (y))
+	y = GEN_INT (INTVAL (y) & GET_MODE_MASK (y_mode));
+      else
+	{
+	  rtx t, cc_reg;
+	  machine_mode cc_mode;
+
+	  t = gen_rtx_ZERO_EXTEND (SImode, y);
+	  t = gen_rtx_COMPARE (CC_SWPmode, t, x);
+	  cc_mode = CC_SWPmode;
+	  cc_reg = gen_rtx_REG (cc_mode, CC_REGNUM);
+	  emit_set_insn (cc_reg, t);
+	  return cc_reg;
+	}
+    }
+
+  return aarch64_gen_compare_reg (code, x, y);
+}
+
 /* Build the SYMBOL_REF for __tls_get_addr.  */
 
 static GTY(()) rtx tls_get_addr_libfunc;
@@ -14167,20 +14194,11 @@ aarch64_emit_unlikely_jump (rtx insn)
 void
 aarch64_expand_compare_and_swap (rtx operands[])
 {
-  rtx bval, rval, mem, oldval, newval, is_weak, mod_s, mod_f, x;
-  machine_mode mode, cmp_mode;
-  typedef rtx (*gen_split_cas_fn) (rtx, rtx, rtx, rtx, rtx, rtx, rtx);
+  rtx bval, rval, mem, oldval, newval, is_weak, mod_s, mod_f, x, cc_reg;
+  machine_mode mode, r_mode;
   typedef rtx (*gen_atomic_cas_fn) (rtx, rtx, rtx, rtx);
   int idx;
-  gen_split_cas_fn split_gen;
   gen_atomic_cas_fn atomic_gen;
-  const gen_split_cas_fn split_cas[] =
-  {
-    gen_aarch64_compare_and_swapqi,
-    gen_aarch64_compare_and_swaphi,
-    gen_aarch64_compare_and_swapsi,
-    gen_aarch64_compare_and_swapdi
-  };
   const gen_atomic_cas_fn atomic_cas[] =
   {
     gen_aarch64_compare_and_swapqi_lse,
@@ -14198,36 +14216,19 @@ aarch64_expand_compare_and_swap (rtx operands[])
   mod_s = operands[6];
   mod_f = operands[7];
   mode = GET_MODE (mem);
-  cmp_mode = mode;
 
   /* Normally the succ memory model must be stronger than fail, but in the
      unlikely event of fail being ACQUIRE and succ being RELEASE we need to
      promote succ to ACQ_REL so that we don't lose the acquire semantics.  */
-
   if (is_mm_acquire (memmodel_from_int (INTVAL (mod_f)))
       && is_mm_release (memmodel_from_int (INTVAL (mod_s))))
     mod_s = GEN_INT (MEMMODEL_ACQ_REL);
 
-  switch (mode)
+  r_mode = mode;
+  if (mode == QImode || mode == HImode)
     {
-    case E_QImode:
-    case E_HImode:
-      /* For short modes, we're going to perform the comparison in SImode,
-	 so do the zero-extension now.  */
-      cmp_mode = SImode;
-      rval = gen_reg_rtx (SImode);
-      oldval = convert_modes (SImode, mode, oldval, true);
-      /* Fall through.  */
-
-    case E_SImode:
-    case E_DImode:
-      /* Force the value into a register if needed.  */
-      if (!aarch64_plus_operand (oldval, mode))
-	oldval = force_reg (cmp_mode, oldval);
-      break;
-
-    default:
-      gcc_unreachable ();
+      r_mode = SImode;
+      rval = gen_reg_rtx (r_mode);
     }
 
   switch (mode)
@@ -14245,27 +14246,49 @@ aarch64_expand_compare_and_swap (rtx operands[])
       /* The CAS insn requires oldval and rval overlap, but we need to
 	 have a copy of oldval saved across the operation to tell if
 	 the operation is successful.  */
-      if (mode == QImode || mode == HImode)
-	rval = copy_to_mode_reg (SImode, gen_lowpart (SImode, oldval));
-      else if (reg_overlap_mentioned_p (rval, oldval))
-	rval = copy_to_mode_reg (mode, oldval);
+      if (reg_overlap_mentioned_p (rval, oldval))
+	rval = copy_to_mode_reg (r_mode, oldval);
       else
-	emit_move_insn (rval, oldval);
+	emit_move_insn (rval, gen_lowpart (r_mode, oldval));
+
       emit_insn (atomic_gen (rval, mem, newval, mod_s));
-      aarch64_gen_compare_reg (EQ, rval, oldval);
+
+      cc_reg = aarch64_gen_compare_reg_maybe_ze (NE, rval, oldval, mode);
     }
   else
     {
-      split_gen = split_cas[idx];
-      emit_insn (split_gen (rval, mem, oldval, newval, is_weak, mod_s, mod_f));
+      /* The oldval predicate varies by mode.  Test it and force to reg.  */
+      insn_code code;
+      switch (mode)
+	{
+	case E_QImode:
+	  code = CODE_FOR_aarch64_compare_and_swapqi;
+	  break;
+	case E_HImode:
+	  code = CODE_FOR_aarch64_compare_and_swaphi;
+	  break;
+	case E_SImode:
+	  code = CODE_FOR_aarch64_compare_and_swapsi;
+	  break;
+	case E_DImode:
+	  code = CODE_FOR_aarch64_compare_and_swapdi;
+	  break;
+	default:
+	  gcc_unreachable ();
+	}
+      if (!insn_data[code].operand[2].predicate (oldval, mode))
+	oldval = force_reg (mode, oldval);
+
+      emit_insn (GEN_FCN (code) (rval, mem, oldval, newval,
+				 is_weak, mod_s, mod_f));
+      cc_reg = gen_rtx_REG (CCmode, CC_REGNUM);
     }
 
-  if (mode == QImode || mode == HImode)
+  if (r_mode != mode)
     rval = gen_lowpart (mode, rval);
   emit_move_insn (operands[1], rval);
 
-  x = gen_rtx_REG (CCmode, CC_REGNUM);
-  x = gen_rtx_EQ (SImode, x, const0_rtx);
+  x = gen_rtx_EQ (SImode, cc_reg, const0_rtx);
   emit_insn (gen_rtx_SET (bval, x));
 }
 
@@ -14374,10 +14397,10 @@ aarch64_split_compare_and_swap (rtx operands[])
     }
   else
     {
-      cond = aarch64_gen_compare_reg (NE, rval, oldval);
+      cond = aarch64_gen_compare_reg_maybe_ze (NE, rval, oldval, mode);
       x = gen_rtx_NE (VOIDmode, cond, const0_rtx);
       x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
-			    gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
+				gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
       aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
     }
 
diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index be970d105ffbe218afb044bef900494454dd8d37..b0e84b8addd809598b3e358a265b86582ce96462 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -24,8 +24,8 @@
   [(match_operand:SI 0 "register_operand" "")			;; bool out
    (match_operand:ALLI 1 "register_operand" "")			;; val out
    (match_operand:ALLI 2 "aarch64_sync_memory_operand" "")	;; memory
-   (match_operand:ALLI 3 "general_operand" "")			;; expected
-   (match_operand:ALLI 4 "aarch64_reg_or_zero" "")			;; desired
+   (match_operand:ALLI 3 "nonmemory_operand" "")		;; expected
+   (match_operand:ALLI 4 "aarch64_reg_or_zero" "")		;; desired
    (match_operand:SI 5 "const_int_operand")			;; is_weak
    (match_operand:SI 6 "const_int_operand")			;; mod_s
    (match_operand:SI 7 "const_int_operand")]			;; mod_f
@@ -36,19 +36,22 @@
   }
 )
 
+(define_mode_attr cas_short_expected_pred
+  [(QI "aarch64_reg_or_imm") (HI "aarch64_plushi_operand")])
+
 (define_insn_and_split "aarch64_compare_and_swap<mode>"
   [(set (reg:CC CC_REGNUM)					;; bool out
     (unspec_volatile:CC [(const_int 0)] UNSPECV_ATOMIC_CMPSW))
-   (set (match_operand:SI 0 "register_operand" "=&r")	   ;; val out
+   (set (match_operand:SI 0 "register_operand" "=&r")		;; val out
     (zero_extend:SI
       (match_operand:SHORT 1 "aarch64_sync_memory_operand" "+Q"))) ;; memory
    (set (match_dup 1)
     (unspec_volatile:SHORT
-      [(match_operand:SI 2 "aarch64_plus_operand" "rI")	;; expected
+      [(match_operand:SHORT 2 "<cas_short_expected_pred>" "rn")	;; expected
        (match_operand:SHORT 3 "aarch64_reg_or_zero" "rZ")	;; desired
-       (match_operand:SI 4 "const_int_operand")		;; is_weak
-       (match_operand:SI 5 "const_int_operand")		;; mod_s
-       (match_operand:SI 6 "const_int_operand")]	;; mod_f
+       (match_operand:SI 4 "const_int_operand")			;; is_weak
+       (match_operand:SI 5 "const_int_operand")			;; mod_s
+       (match_operand:SI 6 "const_int_operand")]		;; mod_f
       UNSPECV_ATOMIC_CMPSW))
    (clobber (match_scratch:SI 7 "=&r"))]
   ""
@@ -68,7 +71,7 @@
       (match_operand:GPI 1 "aarch64_sync_memory_operand" "+Q"))   ;; memory
    (set (match_dup 1)
     (unspec_volatile:GPI
-      [(match_operand:GPI 2 "aarch64_plus_operand" "rI")	;; expect
+      [(match_operand:GPI 2 "aarch64_plus_operand" "rn")	;; expect
        (match_operand:GPI 3 "aarch64_reg_or_zero" "rZ")		;; desired
        (match_operand:SI 4 "const_int_operand")			;; is_weak
        (match_operand:SI 5 "const_int_operand")			;; mod_s
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index 5d41d4350402b2a9e5941f160c6ab6f933bfff90..7b0565a00b14f17b856c4d2c89335300dfa53e4e 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -110,6 +110,18 @@
   (ior (match_operand 0 "register_operand")
        (match_operand 0 "aarch64_plus_immediate")))
 
+(define_predicate "aarch64_plushi_immediate"
+  (match_code "const_int")
+{
+  HOST_WIDE_INT val = INTVAL (op);
+  /* The HImode value must be zero-extendable to an SImode plus_operand.  */
+  return ((val & 0xfff) == val || sext_hwi (val & 0xf000, 16) == val);
+})
+
+(define_predicate "aarch64_plushi_operand"
+  (ior (match_operand 0 "register_operand")
+       (match_operand 0 "aarch64_plushi_immediate")))
+
 (define_predicate "aarch64_pluslong_immediate"
   (and (match_code "const_int")
        (match_test "(INTVAL (op) < 0xffffff && INTVAL (op) > -0xffffff)")))