From patchwork Thu Sep 12 13:32:24 2024
X-Patchwork-Submitter: Georg-Johann Lay
X-Patchwork-Id: 1984689
Date: Thu, 12 Sep 2024 15:32:24 +0200
From: Georg-Johann Lay
To: "gcc-patches@gcc.gnu.org", Denis Chertykov
Subject: [patch,avr] Rework avr_out_compare
List-Id: Gcc-patches mailing list

This patch reworks
avr_out_compare: Use new convenient helper functions that
may be useful in other output functions, too.

Generalized some special cases that only work for EQ and NE
comparisons.  For example, with the patch

;; R24:SI == -1 (unused after)
        adiw r26,1
        sbci r25,hi8(-1)
        sbci r24,lo8(-1)

;; R18:SI == -1
        cpi r18,-1
        cpc r19,r18
        cpc r20,r18
        cpc r21,r18

Without the patch, we had:

;; R24:SI == -1 (unused after)
        cpi r24,-1
        sbci r25,-1
        sbci r26,-1
        sbci r27,-1

;; R18:SI == -1
        cpi r18,-1
        ldi r24,-1
        cpc r19,r24
        cpc r20,r24
        cpc r21,r24

Ok for trunk?

This patch requires "Tweak 32-bit comparisons".
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662738.html

Johann

---

AVR: Rework avr_out_compare.

16-bit comparisons like R25:24 == -1 are currently performed like
    cpi R24, -1
    cpc R25, R24
Similar is possible for wider modes.  ADIW can be used like SBIW when
the compare code is EQ or NE because such comparisons are just about
(propagating) the Z flag.  The patch adds helper functions like avr_byte()
that may be useful in other functions than avr_out_compare().

gcc/
        * config/avr/avr.cc (avr_chunk, avr_byte, avr_word)
        (avr_int8, avr_uint8, avr_int16): New helper functions.
        (avr_out_compare): Overhaul.

diff --git a/gcc/config/avr/avr.cc b/gcc/config/avr/avr.cc
index 99657911171..1cfbfe6ec3b 100644
--- a/gcc/config/avr/avr.cc
+++ b/gcc/config/avr/avr.cc
@@ -322,6 +322,68 @@ avr_to_int_mode (rtx x)
 }
 
 
+/* Return chunk of mode MODE of X as an rtx.  N specifies the subreg
+   byte at which the chunk starts.  N must be an integral multiple
+   of the mode size.  */
+
+static rtx
+avr_chunk (machine_mode mode, rtx x, int n)
+{
+  gcc_assert (n % GET_MODE_SIZE (mode) == 0);
+  machine_mode xmode = GET_MODE (x) == VOIDmode ? DImode : GET_MODE (x);
+  return simplify_gen_subreg (mode, x, xmode, n);
+}
+
+
+/* Return the N-th byte of X as an rtx.  */
+
+static rtx
+avr_byte (rtx x, int n)
+{
+  return avr_chunk (QImode, x, n);
+}
+
+
+/* Return the sub-word of X starting at byte number N.  */
+
+static rtx
+avr_word (rtx x, int n)
+{
+  return avr_chunk (HImode, x, n);
+}
+
+
+/* Return the N-th byte of compile-time constant X as an int8_t.  */
+
+static int8_t
+avr_int8 (rtx x, int n)
+{
+  gcc_assert (CONST_INT_P (x) || CONST_FIXED_P (x) || CONST_DOUBLE_P (x));
+
+  return (int8_t) trunc_int_for_mode (INTVAL (avr_byte (x, n)), QImode);
+}
+
+/* Return the N-th byte of compile-time constant X as an uint8_t.  */
+
+static uint8_t
+avr_uint8 (rtx x, int n)
+{
+  return (uint8_t) avr_int8 (x, n);
+}
+
+
+/* Return the sub-word of compile-time constant X that starts
+   at byte N as an int16_t.  */
+
+static int16_t
+avr_int16 (rtx x, int n)
+{
+  gcc_assert (CONST_INT_P (x) || CONST_FIXED_P (x) || CONST_DOUBLE_P (x));
+
+  return (int16_t) trunc_int_for_mode (INTVAL (avr_word (x, n)), HImode);
+}
+
+
 /* Return true if hard register REG supports the ADIW and SBIW instructions.  */
 
 bool
@@ -5574,9 +5636,6 @@ avr_out_compare (rtx_insn *insn, rtx *xop, int *plen)
       xval = avr_to_int_mode (xop[1]);
     }
 
-  /* MODE of the comparison.  */
-  machine_mode mode = GET_MODE (xreg);
-
   gcc_assert (REG_P (xreg));
   gcc_assert ((CONST_INT_P (xval) && n_bytes <= 4)
               || (const_double_operand (xval, VOIDmode) && n_bytes == 8));
@@ -5584,13 +5643,15 @@ avr_out_compare (rtx_insn *insn, rtx *xop, int *plen)
   if (plen)
     *plen = 0;
 
+  const bool eqne_p = compare_eq_p (insn);
+
   /* Comparisons == +/-1 and != +/-1 can be done similar to camparing
      against 0 by ORing the bytes.  This is one instruction shorter.
      Notice that 64-bit comparisons are always against reg:ALL8 18 (ACC_A)
      and therefore don't use this.  */
 
-  if (!test_hard_reg_class (LD_REGS, xreg)
-      && compare_eq_p (insn)
+  if (eqne_p
+      && ! test_hard_reg_class (LD_REGS, xreg)
       && reg_unused_after (insn, xreg))
     {
       if (xval == const1_rtx)
@@ -5619,39 +5680,11 @@ avr_out_compare (rtx_insn *insn, rtx *xop, int *plen)
         }
     }
 
-  /* Comparisons == -1 and != -1 of a d-register that's used after the
-     comparison.  (If it's unused after we use CPI / SBCI or ADIW sequence
-     from below.)  Instead of CPI Rlo,-1 / LDI Rx,-1 / CPC Rhi,Rx we can
-     use CPI Rlo,-1 / CPC Rhi,Rlo which is 1 instruction shorter:
-     If CPI is true then Rlo contains -1 and we can use Rlo instead of Rx
-     when CPC'ing the high part.  If CPI is false then CPC cannot render
-     the result to true.  This also works for the more generic case where
-     the constant is of the form 0xabab.  */
-
-  if (n_bytes == 2
-      && xval != const0_rtx
-      && test_hard_reg_class (LD_REGS, xreg)
-      && compare_eq_p (insn)
-      && !reg_unused_after (insn, xreg))
-    {
-      rtx xlo8 = simplify_gen_subreg (QImode, xval, mode, 0);
-      rtx xhi8 = simplify_gen_subreg (QImode, xval, mode, 1);
-
-      if (INTVAL (xlo8) == INTVAL (xhi8))
-        {
-          xop[0] = xreg;
-          xop[1] = xlo8;
-
-          return avr_asm_len ("cpi %A0,%1" CR_TAB
-                              "cpc %B0,%A0", xop, plen, 2);
-        }
-    }
-
   /* Comparisons == and != may change the order in which the sub-bytes are
      being compared.  Start with the high 16 bits so we can use SBIW.  */
 
   if (n_bytes == 4
-      && compare_eq_p (insn)
+      && eqne_p
       && AVR_HAVE_ADIW
       && REGNO (xreg) >= REG_22)
     {
@@ -5660,56 +5693,57 @@ avr_out_compare (rtx_insn *insn, rtx *xop, int *plen)
                             "cpc %B0,__zero_reg__" CR_TAB
                             "cpc %A0,__zero_reg__", xop, plen, 3);
 
-      rtx xhi16 = simplify_gen_subreg (HImode, xval, mode, 2);
-      if (IN_RANGE (UINTVAL (xhi16) & GET_MODE_MASK (HImode), 0, 63)
-          && reg_unused_after (insn, xreg))
+      int16_t hi16 = avr_int16 (xval, 2);
+      if (reg_unused_after (insn, xreg)
+          && (IN_RANGE (hi16, 0, 63)
+              || (eqne_p
+                  && IN_RANGE (hi16, -63, -1))))
         {
-          xop[1] = xhi16;
-          avr_asm_len ("sbiw %C0,%1", xop, plen, 1);
-          xop[1] = xval;
+          rtx op[] = { xop[0], avr_word (xval, 2) };
+          avr_asm_len (hi16 < 0 ? "adiw %C0,%n1" : "sbiw %C0,%1",
+                       op, plen, 1);
           return avr_asm_len ("sbci %B0,hi8(%1)" CR_TAB
                               "sbci %A0,lo8(%1)", xop, plen, 2);
         }
     }
 
+  bool changed[8] = { 0, 0, 0, 0, 0, 0, 0, 0 };
+
   for (int i = 0; i < n_bytes; i++)
     {
       /* We compare byte-wise.  */
-      rtx reg8 = simplify_gen_subreg (QImode, xreg, mode, i);
-      rtx xval8 = simplify_gen_subreg (QImode, xval, mode, i);
+      xop[0] = avr_byte (xreg, i);
+      xop[1] = avr_byte (xval, i);
 
       /* 8-bit value to compare with this byte.  */
-      unsigned int val8 = UINTVAL (xval8) & GET_MODE_MASK (QImode);
-
-      /* Registers R16..R31 can operate with immediate.  */
-      bool ld_reg_p = test_hard_reg_class (LD_REGS, reg8);
-
-      xop[0] = reg8;
-      xop[1] = gen_int_mode (val8, QImode);
+      unsigned int val8 = avr_uint8 (xval, i);
 
       /* Word registers >= R24 can use SBIW/ADIW with 0..63.  */
 
       if (i == 0
-          && avr_adiw_reg_p (reg8))
+          && n_bytes >= 2
+          && avr_adiw_reg_p (xop[0]))
         {
-          int val16 = trunc_int_for_mode (INTVAL (xval), HImode);
+          int val16 = avr_int16 (xval, 0);
 
           if (IN_RANGE (val16, 0, 63)
               && (val8 == 0
                   || reg_unused_after (insn, xreg)))
             {
               avr_asm_len ("sbiw %0,%1", xop, plen, 1);
-
+              changed[0] = changed[1] = val8 != 0;
               i++;
               continue;
             }
 
-          if (n_bytes == 2
-              && IN_RANGE (val16, -63, -1)
-              && compare_eq_p (insn)
+          if (IN_RANGE (val16, -63, -1)
+              && eqne_p
               && reg_unused_after (insn, xreg))
             {
-              return avr_asm_len ("adiw %0,%n1", xop, plen, 1);
+              avr_asm_len ("adiw %0,%n1", xop, plen, 1);
+              changed[0] = changed[1] = true;
+              i++;
+              continue;
             }
         }
 
@@ -5728,7 +5762,7 @@ avr_out_compare (rtx_insn *insn, rtx *xop, int *plen)
          instruction; the only difference is that comparisons don't write
          the result back to the target register.  */
 
-      if (ld_reg_p)
+      if (test_hard_reg_class (LD_REGS, xop[0]))
         {
           if (i == 0)
             {
@@ -5738,10 +5772,37 @@ avr_out_compare (rtx_insn *insn, rtx *xop, int *plen)
           else if (reg_unused_after (insn, xreg))
             {
               avr_asm_len ("sbci %0,%1", xop, plen, 1);
+              changed[i] = true;
               continue;
             }
         }
 
+      /* When byte comparisons for an EQ or NE comparison look like
+             compare (x[i], C)
+             compare (x[j], C)
+         then we can instead use
+             compare (x[i], C)
+             compare (x[j], x[i])
+         which is shorter, and the outcome of the comparison is the same.  */
+
+      if (eqne_p)
+        {
+          bool done = false;
+
+          for (int j = 0; j < i && ! done; ++j)
+            if (val8 == avr_uint8 (xval, j)
+                // Make sure that we didn't clobber x[j] above.
+                && ! changed[j])
+              {
+                rtx op[] = { xop[0], avr_byte (xreg, j) };
+                avr_asm_len ("cpc %0,%1", op, plen, 1);
+                done = true;
+              }
+
+          if (done)
+            continue;
+        }
+
       /* Must load the value into the scratch register.  */
 
       gcc_assert (REG_P (xop[2]));
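The 16-bit rule from the commit message ("cpi R24, -1 / cpc R25, R24", and ADIW in place of SBIW for EQ/NE) can be sanity-checked off-target. What follows is only a rough sketch and not part of the patch. It models just the Z and C flag behaviour of CPI, CPC and ADIW as I understand them from the AVR instruction set description. The names Flags, cpi, cpc, eq_via_cpi_cpc, z_after_adiw and count_mismatches are made up for illustration and do not exist in avr.cc. The sketch deliberately covers only the 16-bit cases; how the carry of ADIW interacts with following SBCI/CPC instructions in wider modes is not modelled here.

```cpp
#include <cstdint>

// Hypothetical model of the two AVR status flags that matter here.
struct Flags { bool z; bool c; };

// CPI Rd,K computes Rd - K: Z is set iff the result is zero,
// C is set iff a borrow occurred.
static Flags cpi (uint8_t rd, uint8_t k)
{
  return { rd == k, rd < k };
}

// CPC Rd,Rr computes Rd - Rr - C.  Z survives only if it was already
// set and the result is zero, i.e. CPC can clear Z but never set it.
static Flags cpc (uint8_t rd, uint8_t rr, Flags f)
{
  unsigned borrow = f.c ? 1 : 0;
  unsigned res = (unsigned) rd - rr - borrow;
  return { f.z && (res & 0xff) == 0, (unsigned) rr + borrow > rd };
}

// "cpi Rlo,C / cpc Rhi,Rlo": EQ test of a 16-bit value against a
// constant whose two bytes are both C (the 0xabab form).
static bool eq_via_cpi_cpc (uint16_t x, uint8_t c)
{
  uint8_t lo = x & 0xff;
  uint8_t hi = x >> 8;
  Flags f = cpi (lo, c);
  f = cpc (hi, lo, f);   // Reuse Rlo instead of loading C into a scratch reg.
  return f.z;
}

// Z flag after "adiw Rd,n" on a 16-bit register pair holding x.
static bool z_after_adiw (uint16_t x, unsigned n)
{
  return (uint16_t) (x + n) == 0;
}

// Exhaustively compare the modelled sequences against plain ==.
// Returns the number of inputs where they disagree.
static unsigned long count_mismatches ()
{
  unsigned long bad = 0;

  for (unsigned c = 0; c < 256; ++c)
    for (unsigned x = 0; x < 0x10000; ++x)
      bad += eq_via_cpi_cpc (x, c) != (x == (c << 8 | c));

  // ADIW with n in 1..63 tests x == -n for a 16-bit x.
  for (unsigned n = 1; n <= 63; ++n)
    for (unsigned x = 0; x < 0x10000; ++x)
      bad += z_after_adiw (x, n) != (x == 0x10000 - n);

  return bad;
}
```

Under this model count_mismatches () is expected to return 0. If CPI already found Rlo != C then Z is clear and CPC cannot set it again; otherwise Rlo holds C and serves as the second operand.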