From patchwork Sun Sep 8 10:21:22 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Georg-Johann Lay
X-Patchwork-Id: 1982200
Date: Sun, 8 Sep 2024 12:21:22 +0200
From: Georg-Johann Lay
Subject: [patch,reload,v2] PR116326 Introduce RELOAD_ELIMINABLE_REGS
To: "gcc-patches@gcc.gnu.org", Richard Sandiford, Jeff Law, Denis Chertykov
List-Id: Gcc-patches mailing list
Errors-To:
 gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org

The reason for PR116326 is that LRA and reload require different
ELIMINABLE_REGS for a multi-register frame pointer.  As ELIMINABLE_REGS
is used to initialize static const objects, it is not possible to make
ELIMINABLE_REGS dependent on -mlra.

It was also concluded that it is not desirable to adjust reload so that
it behaves like LRA, see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116326#c8

This patch adds a new macro RELOAD_ELIMINABLE_REGS that takes
precedence over ELIMINABLE_REGS in reload1.cc.

The patch was proposed by H.J. Lu and is only required for trunk.

Johann

---

reload1.cc: rtl-optimization/116326 - Use RELOAD_ELIMINABLE_REGS.

The new macro is required because reload and LRA are using different
representations for a multi-register frame pointer.  As ELIMINABLE_REGS
is used to initialize static const objects, it can't depend on -mlra.

        PR rtl-optimization/116326
gcc/
        * reload1.cc (reg_eliminate_1): Initialize from
        RELOAD_ELIMINABLE_REGS if defined.
        * config/avr/avr.h (RELOAD_ELIMINABLE_REGS): Copy from ELIMINABLE_REGS.
        (ELIMINABLE_REGS): Don't mention sub-regnos of the frame pointer.

gcc/testsuite/
        * gcc.target/avr/torture/lra-pr116324.c: New test.
        * gcc.target/avr/torture/lra-pr116325.c: New test.

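The mechanism described above can be sketched in isolation as follows.
This is only a rough stand-alone illustration, not code from the patch:
the register numbers, the struct name elim_pair, the table names and the
two helper functions are made up for the example.  It shows why the
choice has to happen in the preprocessor: both tables are static const
arrays whose size is fixed at compile time, so a run-time option such as
-mlra cannot select between initializers.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */

/* Hypothetical register numbers, chosen for illustration only.  */
enum { ARG_POINTER = 34, FRAME_POINTER = 28, STACK_POINTER = 32 };

/* LRA-style table: the frame pointer is mentioned once.  */
#define ELIMINABLE_REGS                       \
  { { ARG_POINTER, STACK_POINTER },           \
    { ARG_POINTER, FRAME_POINTER },           \
    { FRAME_POINTER, STACK_POINTER } }

/* Reload-style table: the second sub-register is listed as well.  */
#define RELOAD_ELIMINABLE_REGS                \
  { { ARG_POINTER, STACK_POINTER },           \
    { ARG_POINTER, FRAME_POINTER },           \
    { FRAME_POINTER, STACK_POINTER },         \
    { FRAME_POINTER + 1, STACK_POINTER + 1 } }

struct elim_pair { int from; int to; };

#define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))

/* The reload-side table prefers RELOAD_ELIMINABLE_REGS when a target
   defines it and falls back to ELIMINABLE_REGS otherwise.  */
static const struct elim_pair reload_table[] =
#ifdef RELOAD_ELIMINABLE_REGS
  RELOAD_ELIMINABLE_REGS;
#else
  ELIMINABLE_REGS;
#endif

/* The LRA-side table always uses ELIMINABLE_REGS.  */
static const struct elim_pair lra_table[] = ELIMINABLE_REGS;

/* Both sizes are compile-time constants.  */
static_assert (ARRAY_SIZE (reload_table) == 4, "reload sees 4 pairs");
static_assert (ARRAY_SIZE (lra_table) == 3, "LRA sees 3 pairs");

size_t reload_elim_count (void) { return ARRAY_SIZE (reload_table); }
size_t lra_elim_count (void) { return ARRAY_SIZE (lra_table); }
```

A target that does not define RELOAD_ELIMINABLE_REGS would get the same
table on both sides, which is the behaviour before this patch.
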
diff --git a/gcc/config/avr/avr.h b/gcc/config/avr/avr.h
index 1cf4180e534..3fa2ee76c43 100644
--- a/gcc/config/avr/avr.h
+++ b/gcc/config/avr/avr.h
@@ -308,12 +308,19 @@ enum reg_class {
 
 #define STATIC_CHAIN_REGNUM ((AVR_TINY) ? 18 :2)
 
-#define ELIMINABLE_REGS {                                       \
+#define RELOAD_ELIMINABLE_REGS {                                \
     { ARG_POINTER_REGNUM, STACK_POINTER_REGNUM },               \
     { ARG_POINTER_REGNUM, FRAME_POINTER_REGNUM },               \
     { FRAME_POINTER_REGNUM, STACK_POINTER_REGNUM },             \
     { FRAME_POINTER_REGNUM + 1, STACK_POINTER_REGNUM + 1 } }
 
+#define ELIMINABLE_REGS                                         \
+  {                                                             \
+    { ARG_POINTER_REGNUM, STACK_POINTER_REGNUM },               \
+    { ARG_POINTER_REGNUM, FRAME_POINTER_REGNUM },               \
+    { FRAME_POINTER_REGNUM, STACK_POINTER_REGNUM }              \
+  }
+
 #define INITIAL_ELIMINATION_OFFSET(FROM, TO, OFFSET)            \
   OFFSET = avr_initial_elimination_offset (FROM, TO)
 
diff --git a/gcc/reload1.cc b/gcc/reload1.cc
index 2e059b09970..b0ae64e10b2 100644
--- a/gcc/reload1.cc
+++ b/gcc/reload1.cc
@@ -283,7 +283,13 @@ static const struct elim_table_1
   const int to;
 } reg_eliminate_1[] =
 
+  // Reload and LRA don't agree on how a multi-register frame pointer
+  // is represented for elimination.  See avr.h for a use case.
+#ifdef RELOAD_ELIMINABLE_REGS
+  RELOAD_ELIMINABLE_REGS;
+#else
   ELIMINABLE_REGS;
+#endif
 
 #define NUM_ELIMINABLE_REGS ARRAY_SIZE (reg_eliminate_1)
 
diff --git a/gcc/testsuite/gcc.target/avr/torture/lra-pr116324.c b/gcc/testsuite/gcc.target/avr/torture/lra-pr116324.c
new file mode 100644
index 00000000000..b54eb402d8b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/avr/torture/lra-pr116324.c
@@ -0,0 +1,86 @@
+/* { dg-options { -std=gnu99 } } */
+
+void f7_clr (void *cc)
+{
+  __asm ("%~call __f7_clr_asm" :: "z" (cc) : "memory");
+}
+
+void* f7_copy (void *cc, const void *aa)
+{
+  extern void __f7_copy_asm (void);
+  __asm ("%~call __f7_copy_asm" :: "z" (cc), "x" (aa) : "memory");
+  return cc;
+}
+
+typedef _Bool bool;
+typedef unsigned int uint16_t;
+typedef unsigned char uint8_t;
+typedef int int16_t;
+
+typedef struct f7_t
+{
+  union
+  {
+    struct
+    {
+      uint8_t sign :1;
+      uint8_t reserved1 :1;
+      uint8_t is_nan :1;
+      uint8_t reserved2 :4;
+      uint8_t is_inf :1;
+    };
+    uint8_t flags;
+  };
+
+  uint8_t mant[7];
+  int16_t expo;
+} f7_t;
+
+
+static inline __attribute__((__always_inline__))
+void __f7_clr (f7_t *cc)
+{
+  extern void __f7_clr_asm (void);
+  __asm ("%~call %x[f]"
+         :
+         : [f] "i" (__f7_clr_asm), "z" (cc)
+         : "memory");
+}
+
+static inline __attribute__((__always_inline__))
+bool __f7_signbit (const f7_t *aa)
+{
+  return aa->flags & (1 << 0);
+}
+
+static inline __attribute__((__always_inline__))
+int16_t sub_ssat16 (int16_t a, int16_t b)
+{
+  _Sat _Fract sa = __builtin_avr_rbits (a);
+  _Sat _Fract sb = __builtin_avr_rbits (b);
+  return __builtin_avr_bitsr (sa - sb);
+}
+
+extern void __f7_Iadd (f7_t*, const f7_t*);
+extern void __f7_addsub (f7_t*, const f7_t*, const f7_t*, bool neg_b);
+extern uint8_t __f7_mulx (f7_t*, const f7_t*, const f7_t*, bool);
+extern f7_t* __f7_normalize_asm (f7_t*);
+
+void __f7_madd_msub (f7_t *cc, const f7_t *aa, const f7_t *bb, const f7_t *dd,
+                     bool neg_d)
+{
+  f7_t xx7, *xx = &xx7;
+  uint8_t x_lsb = __f7_mulx (xx, aa, bb, 1 );
+  uint8_t x_sign = __f7_signbit (xx);
+  int16_t x_expo = xx->expo;
+  __f7_addsub (xx, xx, dd, neg_d);
+
+
+  __f7_clr (cc);
+  cc->expo = sub_ssat16 (x_expo, (8 * 7));
+  cc->mant[7 - 1] = x_lsb;
+  cc = __f7_normalize_asm (cc);
+  cc->flags = x_sign;
+  __f7_Iadd (cc, xx);
+}
+
diff --git a/gcc/testsuite/gcc.target/avr/torture/lra-pr116325.c b/gcc/testsuite/gcc.target/avr/torture/lra-pr116325.c
new file mode 100644
index 00000000000..747e9a0f219
--- /dev/null
+++ b/gcc/testsuite/gcc.target/avr/torture/lra-pr116325.c
@@ -0,0 +1,117 @@
+typedef __SIZE_TYPE__ size_t;
+typedef __UINT8_TYPE__ uint8_t;
+
+void swapfunc (char *a, char *b, int n)
+{
+  do
+    {
+      char t = *a;
+      *a++ = *b;
+      *b++ = t;
+    } while (--n > 0);
+}
+
+
+typedef int cmp_t (const void*, const void*);
+
+#define min(a, b) ((a) < (b) ? (a) : (b))
+
+#define swap(a, b) \
+  swapfunc (a, b, es)
+
+#define vecswap(a, b, n) \
+  if ((n) > 0) swapfunc (a, b, n)
+
+static char*
+med3 (char *a, char *b, char *c, cmp_t *cmp)
+{
+  return cmp (a, b) < 0
+    ? (cmp (b, c) < 0 ? b : (cmp (a, c) < 0 ? c : a ))
+    : (cmp (b, c) > 0 ? b : (cmp (a, c) < 0 ? a : c ));
+}
+
+void
+qsort (void *a, size_t n, size_t es, cmp_t *cmp)
+{
+  char *pa, *pb, *pc, *pd, *pl, *pm, *pn;
+  int d, r, swap_cnt;
+
+loop:
+  swap_cnt = 0;
+  if (n < 7)
+    {
+      for (pm = (char*) a + es; pm < (char*) a + n * es; pm += es)
+        for (pl = pm; pl > (char*) a && cmp (pl - es, pl) > 0; pl -= es)
+          swap (pl, pl - es);
+      return;
+    }
+  pm = (char*) a + (n / 2) * es;
+  if (n > 7)
+    {
+      pl = a;
+      pn = (char*) a + (n - 1) * es;
+      if (n > 40)
+        {
+          d = (n / 8) * es;
+          pl = med3 (pl, pl + d, pl + 2 * d, cmp);
+          pm = med3 (pm - d, pm, pm + d, cmp);
+          pn = med3 (pn - 2 * d, pn - d, pn, cmp);
+        }
+      pm = med3 (pl, pm, pn, cmp);
+    }
+  swap (a, pm);
+  pa = pb = (char*) a + es;
+
+  pc = pd = (char*) a + (n - 1) * es;
+  for (;;)
+    {
+      while (pb <= pc && (r = cmp (pb, a)) <= 0)
+        {
+          if (r == 0)
+            {
+              swap_cnt = 1;
+              swap (pa, pb);
+              pa += es;
+            }
+          pb += es;
+        }
+      while (pb <= pc && (r = cmp (pc, a)) >= 0)
+        {
+          if (r == 0)
+            {
+              swap_cnt = 1;
+              swap (pc, pd);
+              pd -= es;
+            }
+          pc -= es;
+        }
+      if (pb > pc)
+        break;
+      swap (pb, pc);
+      swap_cnt = 1;
+      pb += es;
+      pc -= es;
+    }
+  if (swap_cnt == 0)
+    {
+      for (pm = (char*) a + es; pm < (char*) a + n * es; pm += es)
+        for (pl = pm; pl > (char*) a && cmp (pl - es, pl) > 0; pl -= es)
+          swap (pl, pl - es);
+      return;
+    }
+
+  pn = (char*) a + n * es;
+  r = min (pa - (char*) a, pb - pa);
+  vecswap (a, pb - r, r);
+  r = min (pd - pc, (int) (pn - pd - es));
+  vecswap (pb, pn - r, r);
+  if ((r = pb - pa) > (int) es)
+    qsort(a, r / es, es, cmp);
+  if ((r = pd - pc) > (int) es)
+    {
+      /* Iterate rather than recurse to save stack space */
+      a = pn - r;
+      n = r / es;
+      goto loop;
+    }
+}