From patchwork Sat Jul 6 10:15:10 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Georg-Johann Lay
X-Patchwork-Id: 1957509
Date: Sat, 6 Jul 2024 12:15:10 +0200
From: Georg-Johann Lay
To: "gcc-patches@gcc.gnu.org"
Subject: [patch,avr,applied] Create more opportunities for the -mfuse-add optimization pass
List-Id: Gcc-patches mailing list

Up to now, a
post-reload split for fake PLUS addresses was only run on AVR_TINY.
However, non-AVR_TINY cores also have address registers that don't
support PLUS addressing, namely the X register, and the Z register
when used with [E]LPM.  This patch splits these patterns, too.  The
fuse-add pass can already handle all the generated RTXes.

Johann

---

AVR: Create more opportunities for -mfuse-add optimization.

avr_split_tiny_move() was only run for AVR_TINY because it has no
PLUS addressing modes.  The same applies to the X register on ordinary
cores, and also to the Z register when used with [E]LPM.

For example, without this patch

long long addLL (long long *a, long long *b)
{
    return *a + *b;
}

compiles with "-mmcu=atmega128 -Os -dp" to:

    ...
    movw r26,r24     ;  80 [c=4 l=1]  *movhi/0
    movw r30,r22     ;  81 [c=4 l=1]  *movhi/0
    ld r18,X         ;  82 [c=4 l=1]  movqi_insn/3
    adiw r26,1       ;  83 [c=4 l=3]  movqi_insn/3
    ld r19,X
    sbiw r26,1
    adiw r26,2       ;  84 [c=4 l=3]  movqi_insn/3
    ld r20,X
    sbiw r26,2
    adiw r26,3       ;  85 [c=4 l=3]  movqi_insn/3
    ld r21,X
    sbiw r26,3
    adiw r26,4       ;  86 [c=4 l=3]  movqi_insn/3
    ld r22,X
    sbiw r26,4
    adiw r26,5       ;  87 [c=4 l=3]  movqi_insn/3
    ld r23,X
    sbiw r26,5
    adiw r26,6       ;  88 [c=4 l=3]  movqi_insn/3
    ld r24,X
    sbiw r26,6
    adiw r26,7       ;  89 [c=4 l=2]  movqi_insn/3
    ld r25,X
    ld r10,Z         ;  90 [c=4 l=1]  movqi_insn/3
    ...

whereas with this patch it becomes:

    ...
    movw r26,r24     ;  80 [c=4 l=1]  *movhi/0
    movw r30,r22     ;  81 [c=4 l=1]  *movhi/0
    ld r18,X+        ;  140 [c=4 l=1]  movqi_insn/3
    ld r19,X+        ;  142 [c=4 l=1]  movqi_insn/3
    ld r20,X+        ;  144 [c=4 l=1]  movqi_insn/3
    ld r21,X+        ;  146 [c=4 l=1]  movqi_insn/3
    ld r22,X+        ;  148 [c=4 l=1]  movqi_insn/3
    ld r23,X+        ;  150 [c=4 l=1]  movqi_insn/3
    ld r24,X+        ;  152 [c=4 l=1]  movqi_insn/3
    ld r25,X         ;  109 [c=4 l=1]  movqi_insn/3
    ld r10,Z         ;  111 [c=4 l=1]  movqi_insn/3
    ...

gcc/
        * config/avr/avr.md: Also split with avr_split_tiny_move()
        for non-AVR_TINY.
        * config/avr/avr.cc (avr_split_tiny_move): Don't change memory
        references with base regs that can do PLUS addressing.
        (avr_out_lpm_no_lpmx) [POST_INC]: Don't output final ADIW when
        the address register is unused after.

gcc/testsuite/
        * gcc.target/avr/torture/fuse-add.c: New test.

diff --git a/gcc/config/avr/avr.cc b/gcc/config/avr/avr.cc
index f048bf5fd41..d299fceb782 100644
--- a/gcc/config/avr/avr.cc
+++ b/gcc/config/avr/avr.cc
@@ -4471,28 +4471,21 @@ avr_out_lpm_no_lpmx (rtx_insn *insn, rtx *xop, int *plen)
       gcc_assert (REG_Z == REGNO (XEXP (addr, 0))
                   && n_bytes <= 4);
 
-      if (regno_dest == LPM_REGNO)
-        avr_asm_len ("%4lpm"      CR_TAB
-                     "adiw %2,1", xop, plen, 2);
-      else
-        avr_asm_len ("%4lpm"      CR_TAB
-                     "mov %A0,%3" CR_TAB
-                     "adiw %2,1", xop, plen, 3);
+      for (int i = 0; i < n_bytes; ++i)
+        {
+          rtx reg = simplify_gen_subreg (QImode, dest, GET_MODE (dest), i);
 
-      if (n_bytes >= 2)
-        avr_asm_len ("%4lpm"      CR_TAB
-                     "mov %B0,%3" CR_TAB
-                     "adiw %2,1", xop, plen, 3);
+          if (i > 0)
+            avr_asm_len ("adiw %2,1", xop, plen, 1);
 
-      if (n_bytes >= 3)
-        avr_asm_len ("%4lpm"      CR_TAB
-                     "mov %C0,%3" CR_TAB
-                     "adiw %2,1", xop, plen, 3);
+          avr_asm_len ("%4lpm", xop, plen, 1);
 
-      if (n_bytes >= 4)
-        avr_asm_len ("%4lpm"      CR_TAB
-                     "mov %D0,%3" CR_TAB
-                     "adiw %2,1", xop, plen, 3);
+          if (REGNO (reg) != LPM_REGNO)
+            avr_asm_len ("mov %0,r0", &reg, plen, 1);
+        }
+
+      if (! _reg_unused_after (insn, xop[2], false))
+        avr_asm_len ("adiw %2,1", xop, plen, 1);
 
       break; /* POST_INC */
 
@@ -6685,6 +6678,14 @@ avr_split_tiny_move (rtx_insn * /*insn*/, rtx *xop)
   if (REGNO (base) > REG_Z)
     return false;
 
+  if (! AVR_TINY
+      // Only keep base registers that can't do PLUS addressing.
+      && ((REGNO (base) != REG_X
+           && ADDR_SPACE_GENERIC_P (MEM_ADDR_SPACE (mem)))
+          || avr_load_libgcc_p (mem)
+          || avr_mem_memx_p (mem)))
+    return false;
+
   bool volatile_p = MEM_VOLATILE_P (mem);
   bool mem_volatile_p = false;
   if (frame_pointer_needed
diff --git a/gcc/config/avr/avr.md b/gcc/config/avr/avr.md
index dabf4c0fc5a..2783b8c986f 100644
--- a/gcc/config/avr/avr.md
+++ b/gcc/config/avr/avr.md
@@ -1035,8 +1035,7 @@ (define_split
   [(parallel [(set (match_operand:MOVMODE 0 "nonimmediate_operand")
                    (match_operand:MOVMODE 1 "general_operand"))
               (clobber (reg:CC REG_CC))])]
-  "AVR_TINY
-   && reload_completed
+  "reload_completed
    && avr_fuse_add > 0
    // Only split this for .split2 when we are before
    // pass .avr-fuse-add (which runs after proep).
diff --git a/gcc/testsuite/gcc.target/avr/torture/fuse-add.c b/gcc/testsuite/gcc.target/avr/torture/fuse-add.c
new file mode 100644
index 00000000000..b78b1aa9fc9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/avr/torture/fuse-add.c
@@ -0,0 +1,59 @@
+/* { dg-do run } */
+/* { dg-additional-options "-std=gnu99" } */
+
+typedef __UINT64_TYPE__ uint64_t;
+
+extern const uint64_t aa __asm ("real_aa");
+extern const uint64_t bb __asm ("real_bb");
+
+__attribute__((used)) const uint64_t real_aa = 0x1122334455667788;
+__attribute__((used)) const uint64_t real_bb = 0x0908070605040302;
+
+__attribute__((noinline,noclone))
+uint64_t add1 (const uint64_t *aa, const uint64_t *bb)
+{
+  return *aa + *bb;
+}
+
+#ifdef __FLASH
+extern const __flash uint64_t fa __asm ("real_fa");
+extern const __flash uint64_t fb __asm ("real_fb");
+
+__attribute__((used)) const __flash uint64_t real_fa = 0x1122334455667788;
+__attribute__((used)) const __flash uint64_t real_fb = 0x0908070605040302;
+
+__attribute__((noinline,noclone))
+uint64_t add2 (const __flash uint64_t *aa, const uint64_t *bb)
+{
+  return *aa + *bb;
+}
+
+uint64_t add3 (const uint64_t *aa, const __flash uint64_t *bb)
+{
+  return *aa + *bb;
+}
+
+uint64_t add4 (const __flash uint64_t *aa, const __flash uint64_t *bb)
+{
+  return *aa + *bb;
+}
+#endif /* have __flash */
+
+int main (void)
+{
+  if (add1 (&aa, &bb) != real_aa + real_bb)
+    __builtin_exit (__LINE__);
+
+#ifdef __FLASH
+  if (add2 (&fa, &bb) != real_fa + real_bb)
+    __builtin_exit (__LINE__);
+
+  if (add3 (&aa, &fb) != real_aa + real_fb)
+    __builtin_exit (__LINE__);
+
+  if (add4 (&fa, &fb) != real_fa + real_fb)
+    __builtin_exit (__LINE__);
+#endif
+
+  return 0;
+}