From patchwork Tue Sep 10 14:21:01 2024
X-Patchwork-Submitter: Stefan Schulze Frielinghaus
X-Patchwork-Id: 1983278
From: Stefan Schulze Frielinghaus
To: gcc-patches@gcc.gnu.org
Cc: Stefan Schulze Frielinghaus
Subject: [RFC 4/4] Rewrite register asm into hard register constraints
Date: Tue, 10 Sep 2024 16:21:01 +0200
Message-ID: <20240910142121.3285492-5-stefansf@gcc.gnu.org>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20240910142121.3285492-1-stefansf@gcc.gnu.org>
References: <20240910142121.3285492-1-stefansf@gcc.gnu.org>

Currently a register asm already materializes during expand.  This
means that a hard register is allocated at the very first access of a
register asm, e.g. in an assignment.  As a consequence, this might lead
to suboptimal register allocation if the assignment and the asm
statement using it are far apart.  Even more problematic are function
calls between a register asm assignment and the asm statement using it,
since hard registers may be clobbered by a call.  The former may be
solved by moving register asm assignments close to their asm
statements.  However, the latter is not easily solved since function
calls are sometimes implicit.  For example

int
foo (int *x)
{
  register int y asm ("0") = 42;
  register int z asm ("1") = *x;
  asm ("bar\t%0,%1" : "+r" (z) : "r" (y));
  return z;
}

If compiled with address sanitizer, a function call is introduced for
the memory load, which in turn may interfere with the initialization of
register asm y.  Likewise, for some targets and configurations even an
operation like an addition may lead to an implicit library call.

In contrast, hard register constraints materialize during register
allocation and therefore do not suffer from this, i.e., asm operands
are kept in pseudos until RA.

This patch adds the feature of rewriting local register asm into code
which exploits hard register constraints.
For example

register int global asm ("r3");

int foo (int x0)
{
  register int x asm ("r4") = x0;
  register int y asm ("r5");

  asm ("bar\t%0,%1,%2" : "=r" (x) : "0" (x), "r" (global));
  x += 42;
  asm ("baz\t%0,%1" : "=r" (y) : "r" (x));

  return y;
}

is rewritten during gimplification into

register int global asm ("r3");

int foo (int x0)
{
  int x = x0;
  int y;

  asm ("bar\t%0,%1,%2" : "={r4}" (x) : "0" (x), "r" (global));
  x += 42;
  asm ("baz\t%0,%1" : "={r5}" (y) : "{r4}" (x));

  return y;
}

The resulting code relies solely on hard register constraints, modulo
global register asm.

Since I consider this an experimental feature, it is hidden behind the
new flag -fdemote-register-asm (I'm open to other naming suggestions).
---
 gcc/common.opt                                |  4 +
 gcc/gimplify.cc                               | 78 +++++++++++++++++++
 .../gcc.dg/asm-hard-reg-demotion-1.c          | 19 +++++
 .../gcc.dg/asm-hard-reg-demotion-2.c          | 19 +++++
 gcc/testsuite/gcc.dg/asm-hard-reg-demotion.h  | 52 +++++++++++++
 5 files changed, 172 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-demotion-1.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-demotion-2.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-demotion.h

diff --git a/gcc/common.opt b/gcc/common.opt
index ea39f87ae71..859a735a0b7 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -3422,6 +3422,10 @@ fverbose-asm
 Common Var(flag_verbose_asm)
 Add extra commentary to assembler output.
 
+fdemote-register-asm
+Common Var(flag_demote_register_asm) Init(0)
+Demote local register asm and use hard register constraints instead
+
 fvisibility=
 Common Joined RejectNegative Enum(symbol_visibility) Var(default_visibility) Init(VISIBILITY_DEFAULT)
 -fvisibility=[default|internal|hidden|protected]	Set the default symbol visibility.
diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index 08e0b5d047b..c9bd1769c28 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -7049,6 +7049,73 @@ num_alternatives (const_tree link)
   return num + 1;
 }
 
+static hash_set<tree> demote_register_asm;
+
+static void
+gimplify_demote_register_asm (tree link)
+{
+  if (!flag_demote_register_asm)
+    return;
+  tree op = TREE_VALUE (link);
+  if (!VAR_P (op) || !DECL_HARD_REGISTER (op) || is_global_var (op))
+    return;
+  tree id = DECL_ASSEMBLER_NAME (op);
+  const char *regname = IDENTIFIER_POINTER (id);
+  ++regname;
+  int regno = decode_reg_name (regname);
+  if (regno < 0)
+    /* This indicates an error and we error out later on.  */
+    return;
+  const char *constraint = TREE_STRING_POINTER (TREE_VALUE (TREE_PURPOSE (link)));
+  auto_vec<char> constraint_new;
+  for (const char *p = constraint; *p; )
+    {
+      bool pushed = false;
+      switch (*p)
+        {
+        case '+': case '=': case '%': case '?': case '!': case '*': case '&':
+        case '#': case '$': case '^': case '{': case 'E': case 'F': case 'G':
+        case 'H': case 's': case 'i': case 'n': case 'I': case 'J': case 'K':
+        case 'L': case 'M': case 'N': case 'O': case 'P': case ',': case '0':
+        case '1': case '2': case '3': case '4': case '5': case '6': case '7':
+        case '8': case '9': case '[': case '<': case '>': case 'g': case 'X':
+          break;
+
+        default:
+          if (!ISALPHA (*p))
+            break;
+          enum constraint_num cn = lookup_constraint (p);
+          enum reg_class rclass = reg_class_for_constraint (cn);
+          if (rclass != NO_REGS || insn_extra_address_constraint (cn))
+            {
+              gcc_assert (reg_class_subset_p (REGNO_REG_CLASS (regno), rclass));
+              constraint_new.safe_push ('{');
+              size_t len = strlen (regname);
+              for (size_t i = 0; i < len; ++i)
+                constraint_new.safe_push (regname[i]);
+              constraint_new.safe_push ('}');
+              pushed = true;
+            }
+          break;
+        }
+
+      for (size_t len = CONSTRAINT_LEN (*p, p); len; len--, p++)
+        {
+          if (!pushed)
+            constraint_new.safe_push (*p);
+          if (*p == '\0')
+            break;
+        }
+    }
+  unsigned int len = constraint_new.length ();
+  char *new_constraint = new char[len + 1];
+  memcpy (new_constraint, &constraint_new[0], len);
+  new_constraint[len] = '\0';
+  tree str = build_string (len + 1, new_constraint);
+  TREE_VALUE (TREE_PURPOSE (link)) = str;
+  demote_register_asm.add (op);
+}
+
 /* Gimplify the operands of an ASM_EXPR.  Input operands should be a gimple
    value; output operands should be a gimple lvalue.  */
 
@@ -7100,6 +7167,8 @@ gimplify_asm_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p)
       bool ok;
       size_t constraint_len;
 
+      gimplify_demote_register_asm (link);
+
       link_next = TREE_CHAIN (link);
 
       oconstraints[i]
@@ -7285,6 +7354,8 @@ gimplify_asm_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p)
   int input_num = 0;
   for (link = ASM_INPUTS (expr); link; ++input_num, ++i, link = link_next)
     {
+      gimplify_demote_register_asm (link);
+
       link_next = TREE_CHAIN (link);
       constraint = TREE_STRING_POINTER (TREE_VALUE (TREE_PURPOSE (link)));
       reg_info.operand = TREE_VALUE (link);
@@ -19525,6 +19596,13 @@ gimplify_body (tree fndecl, bool do_parms)
         }
     }
 
+  for (auto op : demote_register_asm)
+    {
+      DECL_REGISTER (op) = 0;
+      DECL_HARD_REGISTER (op) = 0;
+    }
+  demote_register_asm.empty ();
+
   if ((flag_openacc || flag_openmp || flag_openmp_simd)
       && gimplify_omp_ctxp)
     {
diff --git a/gcc/testsuite/gcc.dg/asm-hard-reg-demotion-1.c b/gcc/testsuite/gcc.dg/asm-hard-reg-demotion-1.c
new file mode 100644
index 00000000000..541a66a8d05
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/asm-hard-reg-demotion-1.c
@@ -0,0 +1,19 @@
+/* { dg-do run { target aarch64*-*-* powerpc64*-*-* riscv64-*-* s390*-*-* x86_64-*-* } } */
+/* { dg-options "-O2 -fdemote-register-asm" } */
+
+#include "asm-hard-reg-demotion.h"
+
+int
+main (void)
+{
+  if (bar (0) != 0
+      || bar (1) != 1
+      || bar (2) != 2
+      || bar (32) != 32
+      || baz (0) != 0
+      || baz (1) != 1
+      || baz (2) != 2
+      || baz (32) != 32)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/asm-hard-reg-demotion-2.c b/gcc/testsuite/gcc.dg/asm-hard-reg-demotion-2.c
new file mode 100644
index 00000000000..3d216d440af
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/asm-hard-reg-demotion-2.c
@@ -0,0 +1,19 @@
+/* { dg-do run { target aarch64*-*-* powerpc64*-*-* riscv64-*-* s390*-*-* x86_64-*-* } } */
+/* { dg-options "-O2 -fno-demote-register-asm" } */
+
+#include "asm-hard-reg-demotion.h"
+
+int
+main (void)
+{
+  if (bar (0) != 42
+      || bar (1) != 42
+      || bar (2) != 42
+      || bar (32) != 42
+      || baz (0) != 0
+      || baz (1) != 1
+      || baz (2) != 2
+      || baz (32) != 32)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/asm-hard-reg-demotion.h b/gcc/testsuite/gcc.dg/asm-hard-reg-demotion.h
new file mode 100644
index 00000000000..6d72f622ce9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/asm-hard-reg-demotion.h
@@ -0,0 +1,52 @@
+/* Pass parameter x in the first general argument register to the assembler
+   instruction.
+
+   In function bar we fail to do so because after the function call to foo,
+   variable argreg1 does not contain the value of x but rather 42 which got
+   passed to foo.  Thus, the function always returns 42.  In contrast in
+   function baz, variable x is saved over the function call and materializes in
+   the asm statement and therefore is returned.  */
+
+#if defined (__aarch64__)
+# define REG register int argreg1 __asm__ ("x0") = x;
+# define MOVE1 __asm__ ("mov\t%0,%1" : "=r" (out) : "r" (argreg1));
+# define MOVE2 __asm__ ("mov\t%0,%1" : "=r" (out) : "{x0}" (x));
+#elif defined (__powerpc__) || defined (__POWERPC__)
+# define REG register int argreg1 __asm__ ("r3") = x;
+# define MOVE1 __asm__ ("mr\t%0,%1" : "=r" (out) : "r" (argreg1));
+# define MOVE2 __asm__ ("mr\t%0,%1" : "=r" (out) : "{r3}" (x));
+#elif defined (__riscv)
+# define REG register int argreg1 __asm__ ("a0") = x;
+# define MOVE1 __asm__ ("mv\t%0,%1" : "=r" (out) : "r" (argreg1));
+# define MOVE2 __asm__ ("mv\t%0,%1" : "=r" (out) : "{a0}" (x));
+#elif defined (__s390__)
+# define REG register int argreg1 __asm__ ("r2") = x;
+# define MOVE1 __asm__ ("lr\t%0,%1" : "=r" (out) : "r" (argreg1));
+# define MOVE2 __asm__ ("lr\t%0,%1" : "=r" (out) : "{r2}" (x));
+#elif defined (__x86_64__)
+# define REG register int argreg1 __asm__ ("edi") = x;
+# define MOVE1 __asm__ ("mov\t%1,%0" : "=r" (out) : "r" (argreg1));
+# define MOVE2 __asm__ ("mov\t%1,%0" : "=r" (out) : "{edi}" (x));
+#endif
+
+__attribute__ ((noipa))
+int foo (int unused) { }
+
+int
+bar (int x)
+{
+  int out;
+  REG
+  foo (42);
+  MOVE1
+  return out;
+}
+
+int
+baz (int x)
+{
+  int out;
+  foo (42);
+  MOVE2
+  return out;
+}
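
As a reading aid for the constraint rewriting loop in
gimplify_demote_register_asm above, here is a minimal stand-alone sketch
of the same idea outside of GCC.  It is not the patch's code: the names
rewrite_constraint and is_register_constraint are hypothetical, only 'r'
is treated as a register-class constraint (the patch instead asks the
target via lookup_constraint and reg_class_for_constraint), and every
constraint is assumed to be a single character unless it is already a
hard register constraint in braces.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the target's constraint lookup: only 'r' is
   treated as a register-class constraint in this sketch.  */
static int
is_register_constraint (char c)
{
  return c == 'r';
}

/* Copy CONSTRAINT into OUT, replacing each register-class letter by
   "{REGNAME}".  Modifiers, digits, other letters, and existing "{...}"
   hard register constraints are copied unchanged.  Returns 0 on success
   and -1 if OUT (OUT_SIZE bytes) is too small.  */
static int
rewrite_constraint (const char *constraint, const char *regname,
                    char *out, size_t out_size)
{
  size_t used = 0;
  size_t reglen = strlen (regname);

  for (const char *p = constraint; *p; ++p)
    {
      if (*p == '{')
        {
          /* Already a hard register constraint: copy it verbatim.  */
          const char *end = strchr (p, '}');
          size_t n = end ? (size_t) (end - p) + 1 : strlen (p);
          if (used + n + 1 > out_size)
            return -1;
          memcpy (out + used, p, n);
          used += n;
          p += n - 1;
        }
      else if (is_register_constraint (*p))
        {
          /* Room for '{', the register name, '}', and the final NUL.  */
          if (used + reglen + 3 > out_size)
            return -1;
          out[used++] = '{';
          memcpy (out + used, regname, reglen);
          used += reglen;
          out[used++] = '}';
        }
      else
        {
          /* Room for this character and the final NUL.  */
          if (used + 2 > out_size)
            return -1;
          out[used++] = *p;
        }
    }

  if (used >= out_size)
    return -1;
  out[used] = '\0';
  return 0;
}
```

With this sketch, rewrite_constraint ("=r", "r4", buf, sizeof buf) would
store "={r4}" in buf, while a matching constraint such as "0" is left
untouched; this mirrors the "=r" (x) to "={r4}" (x) rewrite in the
commit message example.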