From patchwork Thu Nov 7 10:24:39 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 2007939
Date: Thu, 7 Nov 2024 11:24:39 +0100
From: Jakub Jelinek
To: Uros Bizjak, Richard Biener, Jan Hubicka
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] inline-asm, i386, v2: Add "redzone" clobber support
List-Id: Gcc-patches mailing list
Reply-To: Jakub Jelinek

On Thu, Nov 07, 2024 at 09:12:34AM +0100, Uros Bizjak wrote:
> On Thu, Nov 7, 2024 at 9:00 AM Jakub Jelinek wrote:
> >
> > On Thu, Nov 07, 2024 at 08:47:34AM +0100, Uros Bizjak wrote:
> > > Maybe we should always recognize "redzone", even for targets without
> > > it. This is the way we recognize "cc" even for targets without CC reg
> > > (e.g. alpha). This would simplify the definition and processing - if
> > > the hook returns NULL_RTX (the default), then it (obviously) won't be
> > > added to the clobber list.
> >
> > Dunno, am open to that, but thought it would be just weird if one says
> > "redzone" on targets which don't have such a concept.
>
> Let's look at the situation with x86_32 and x86_64. The "redzone" for
> the former is just an afterthought, so we can safely say that it
> doesn't support it. So, the code that targets both targets (e.g. linux
> kernel) would (in a pedantic way) have to redefine many shared asm
> defines, one to have clobber and one without it. We don't want that,
> we want one definition and "let's compiler sort it out".
>
> For targets without clobber concept, well - don't add it to the
> clobber list if it is always ineffective. One *can* add "cc" to all
> alpha asms, but well.. ;)

Ok, here is a variant of the patch which just ignores "redzone" clobber
if it doesn't make sense.

2024-11-07  Jakub Jelinek

gcc/
	* target.def (redzone_clobber): New target hook.
	* varasm.cc (decode_reg_name_and_count): Return -5 for "redzone".
	* cfgexpand.cc (expand_asm_stmt): Handle redzone clobber.
	* config/i386/i386.h (struct machine_function): Add
	asm_redzone_clobber_seen member.
	* config/i386/i386.cc (ix86_compute_frame_layout): Don't use
	red zone if cfun->machine->asm_redzone_clobber_seen.
	(ix86_redzone_clobber): New function.
	(TARGET_REDZONE_CLOBBER): Redefine.
	* doc/extend.texi (Clobbers and Scratch Registers): Document
	the "redzone" clobber.
	* doc/tm.texi.in: Add @hook TARGET_REDZONE_CLOBBER.
	* doc/tm.texi: Regenerate.
gcc/testsuite/
	* gcc.dg/asm-redzone-1.c: New test.
	* gcc.target/i386/asm-redzone-1.c: New test.

	Jakub

--- gcc/target.def.jj	2024-11-06 18:53:10.836843793 +0100
+++ gcc/target.def	2024-11-07 10:57:58.697898800 +0100
@@ -3376,6 +3376,16 @@ to be used.",
  bool, (machine_mode mode),
  NULL)
 
+DEFHOOK
+(redzone_clobber,
+ "Define this to return some RTL for the @code{redzone} @code{asm} clobber\n\
+if target has a red zone and wants to support the @code{redzone} clobber\n\
+or return NULL if the clobber should be ignored.\n\
+\n\
+The default is to ignore the @code{redzone} clobber.",
+ rtx, (),
+ NULL)
+
 /* Support for named address spaces.  */
 #undef HOOK_PREFIX
 #define HOOK_PREFIX "TARGET_ADDR_SPACE_"
--- gcc/varasm.cc.jj	2024-11-06 18:53:10.838843765 +0100
+++ gcc/varasm.cc	2024-11-07 10:55:46.858763724 +0100
@@ -965,9 +965,11 @@ set_user_assembler_name (tree decl, cons
 
 /* Decode an `asm' spec for a declaration as a register name.
    Return the register number, or -1 if nothing specified,
-   or -2 if the ASMSPEC is not `cc' or `memory' and is not recognized,
+   or -2 if the ASMSPEC is not `cc' or `memory' or `redzone' and is not
+   recognized,
    or -3 if ASMSPEC is `cc' and is not recognized,
-   or -4 if ASMSPEC is `memory' and is not recognized.
+   or -4 if ASMSPEC is `memory' and is not recognized,
+   or -5 if ASMSPEC is `redzone' and is not recognized.
    Accept an exact spelling or a decimal number.
    Prefixes such as % are optional.  */
 
@@ -1034,6 +1036,9 @@ decode_reg_name_and_count (const char *a
       }
 #endif /* ADDITIONAL_REGISTER_NAMES */
 
+      if (!strcmp (asmspec, "redzone"))
+	return -5;
+
       if (!strcmp (asmspec, "memory"))
 	return -4;
 
--- gcc/cfgexpand.cc.jj	2024-11-06 18:53:10.803844259 +0100
+++ gcc/cfgexpand.cc	2024-11-07 11:00:16.212953571 +0100
@@ -3205,6 +3205,12 @@ expand_asm_stmt (gasm *stmt)
 		  rtx x = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (VOIDmode));
 		  clobber_rvec.safe_push (x);
 		}
+	      else if (j == -5)
+		{
+		  if (targetm.redzone_clobber)
+		    if (rtx x = targetm.redzone_clobber ())
+		      clobber_rvec.safe_push (x);
+		}
 	      else
 		{
 		  /* Otherwise we should have -1 == empty string
--- gcc/config/i386/i386.h.jj	2024-11-06 18:53:10.807844203 +0100
+++ gcc/config/i386/i386.h	2024-11-07 10:55:46.904763076 +0100
@@ -2881,6 +2881,9 @@ struct GTY(()) machine_function {
   /* True if red zone is used.  */
   BOOL_BITFIELD red_zone_used : 1;
 
+  /* True if inline asm with redzone clobber has been seen.  */
+  BOOL_BITFIELD asm_redzone_clobber_seen : 1;
+
   /* The largest alignment, in bytes, of stack slot actually used.  */
   unsigned int max_used_stack_alignment;
 
--- gcc/config/i386/i386.cc.jj	2024-11-06 18:53:10.807844203 +0100
+++ gcc/config/i386/i386.cc	2024-11-07 10:55:46.947762468 +0100
@@ -7171,6 +7171,7 @@ ix86_compute_frame_layout (void)
   if (ix86_using_red_zone ()
       && crtl->sp_is_unchanging
       && crtl->is_leaf
+      && !cfun->machine->asm_redzone_clobber_seen
       && !ix86_pc_thunk_call_expanded
       && !ix86_current_function_calls_tls_descriptor)
     {
@@ -26268,6 +26269,22 @@ ix86_mode_can_transfer_bits (machine_mod
   return true;
 }
 
+/* Implement TARGET_REDZONE_CLOBBER.  */
+static rtx
+ix86_redzone_clobber ()
+{
+  cfun->machine->asm_redzone_clobber_seen = true;
+  if (ix86_using_red_zone ())
+    {
+      rtx base = plus_constant (Pmode, stack_pointer_rtx,
+				GEN_INT (-RED_ZONE_SIZE));
+      rtx mem = gen_rtx_MEM (BLKmode, base);
+      set_mem_size (mem, RED_ZONE_SIZE);
+      return mem;
+    }
+  return NULL_RTX;
+}
+
 /* Target-specific selftests.  */
 
 #if CHECKING_P
@@ -27121,6 +27138,9 @@ ix86_libgcc_floating_mode_supported_p
 #undef TARGET_MODE_CAN_TRANSFER_BITS
 #define TARGET_MODE_CAN_TRANSFER_BITS ix86_mode_can_transfer_bits
 
+#undef TARGET_REDZONE_CLOBBER
+#define TARGET_REDZONE_CLOBBER ix86_redzone_clobber
+
 static bool
 ix86_libc_has_fast_function (int fcode ATTRIBUTE_UNUSED)
 {
--- gcc/doc/extend.texi.jj	2024-11-06 18:53:10.826843934 +0100
+++ gcc/doc/extend.texi	2024-11-07 10:57:07.550622297 +0100
@@ -11827,7 +11827,7 @@ asm volatile ("movc3 %0, %1, %2"
                    : "r0", "r1", "r2", "r3", "r4", "r5", "memory");
 @end example
 
-Also, there are two special clobber arguments:
+Also, there are three special clobber arguments:
 
 @table @code
 @item "cc"
@@ -11855,6 +11855,18 @@ Note that this clobber does not prevent
 speculative reads past the @code{asm} statement.  To prevent that, you need
 processor-specific fence instructions.
 
+@item "redzone"
+The @code{"redzone"} clobber tells the compiler that the assembly code
+may write to the stack red zone, area below the stack pointer which on
+some architectures in some calling conventions is guaranteed not to be
+changed by signal handlers, interrupts or exceptions and so the compiler
+can store there temporaries in leaf functions.  On targets which have
+no concept of the stack red zone, the clobber is ignored.
+It should be used e.g.@: in case the assembly code uses call instructions
+or pushes something to the stack without taking the red zone into account
+by subtracting red zone size from the stack pointer first and restoring
+it afterwards.
+
 @end table
 
 Flushing registers to memory has performance implications and may be
--- gcc/doc/tm.texi.in.jj	2024-11-06 18:53:10.833843835 +0100
+++ gcc/doc/tm.texi.in	2024-11-07 10:55:46.998761746 +0100
@@ -3464,6 +3464,8 @@ stack.
 
 @hook TARGET_MODE_CAN_TRANSFER_BITS
 
+@hook TARGET_REDZONE_CLOBBER
+
 @hook TARGET_TRANSLATE_MODE_ATTRIBUTE
 
 @hook TARGET_SCALAR_MODE_SUPPORTED_P
--- gcc/doc/tm.texi.jj	2024-11-06 18:53:10.831843863 +0100
+++ gcc/doc/tm.texi	2024-11-07 10:55:47.008761605 +0100
@@ -4563,6 +4563,14 @@ The default is to assume modes with the
 to be used.
 @end deftypefn
 
+@deftypefn {Target Hook} rtx TARGET_REDZONE_CLOBBER ()
+Define this to return some RTL for the @code{redzone} @code{asm} clobber
+if target has a red zone and wants to support the @code{redzone} clobber
+or return NULL if the clobber should be ignored.
+
+The default is to ignore the @code{redzone} clobber.
+@end deftypefn
+
 @deftypefn {Target Hook} machine_mode TARGET_TRANSLATE_MODE_ATTRIBUTE (machine_mode @var{mode})
 Define this hook if during mode attribute processing, the port should
 translate machine_mode @var{mode} to another mode.  For example, rs6000's
--- gcc/testsuite/gcc.dg/asm-redzone-1.c.jj	2024-11-07 11:07:52.873493855 +0100
+++ gcc/testsuite/gcc.dg/asm-redzone-1.c	2024-11-07 11:07:45.697595356 +0100
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+void
+foo (void)
+{
+  asm ("" : : : "cc", "memory", "redzone");
+}
--- gcc/testsuite/gcc.target/i386/asm-redzone-1.c.jj	2024-11-07 10:55:47.018761463 +0100
+++ gcc/testsuite/gcc.target/i386/asm-redzone-1.c	2024-11-07 10:55:47.018761463 +0100
@@ -0,0 +1,38 @@
+/* { dg-do run { target lp64 } } */
+/* { dg-options "-O2" } */
+
+__attribute__((noipa)) int
+foo (void)
+{
+  int a = 1;
+  int b = 2;
+  int c = 3;
+  int d = 4;
+  int e = 5;
+  int f = 6;
+  int g = 7;
+  int h = 8;
+  int i = 9;
+  int j = 10;
+  int k = 11;
+  int l = 12;
+  int m = 13;
+  int n = 14;
+  asm volatile ("" : "+g" (a), "+g" (b), "+g" (c), "+g" (d), "+g" (e));
+  asm volatile ("" : "+g" (f), "+g" (g), "+g" (h), "+g" (i), "+g" (j));
+  asm volatile ("" : "+g" (k), "+g" (l), "+g" (m), "+g" (n));
+  asm volatile ("{pushq %%rax; pushq %%rax; popq %%rax; popq %%rax"
+		"|push rax;push rax;pop rax;pop rax}"
+		: : : "ax", "si", "di", "r10", "r11", "redzone");
+  asm volatile ("" : "+g" (a), "+g" (b), "+g" (c), "+g" (d), "+g" (e));
+  asm volatile ("" : "+g" (f), "+g" (g), "+g" (h), "+g" (i), "+g" (j));
+  asm volatile ("" : "+g" (k), "+g" (l), "+g" (m), "+g" (n));
+  return a + b + c + d + e + f + g + h + i + j + k + l + m + n;
+}
+
+int
+main ()
+{
+  if (foo () != 105)
+    __builtin_abort ();
+}
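
For illustration only, and not part of the patch: a minimal sketch of how
user code might pick between the "redzone" clobber and the manual
alternative that the extend.texi text mentions (subtracting the red zone
size from the stack pointer first and restoring it afterwards).  The names
HAVE_REDZONE_CLOBBER and push_pop_roundtrip are hypothetical, and the
"GCC 15 or later" version check is an assumption of this sketch that
should be adjusted to the toolchain actually in use.  The 128-byte figure
is the x86-64 System V red zone size; the asm uses AT&T syntax only.

```c
#include <assert.h>	/* Only needed for the usage check shown below.  */

/* Assumption of this sketch: the "redzone" clobber is accepted by GCC 15
   and later.  Other compilers take the manual fallback path.  */
#if defined (__GNUC__) && !defined (__clang__) && __GNUC__ >= 15
# define HAVE_REDZONE_CLOBBER 1
#else
# define HAVE_REDZONE_CLOBBER 0
#endif

/* Hypothetical helper: push V onto the stack from inline asm and pop it
   back.  On targets other than x86-64 the asm is omitted entirely.  */
long long
push_pop_roundtrip (long long v)
{
#if defined (__x86_64__)
# if HAVE_REDZONE_CLOBBER
  /* Tell the compiler the asm may overwrite the red zone, so it keeps
     its own temporaries out of it in this function.  */
  __asm__ volatile ("pushq %0\n\t"
		    "popq %0"
		    : "+r" (v) : : "memory", "redzone");
# else
  /* No clobber available: step over the 128-byte red zone by hand and
     restore the stack pointer afterwards.  leaq leaves flags alone.  */
  __asm__ volatile ("leaq -128(%%rsp), %%rsp\n\t"
		    "pushq %0\n\t"
		    "popq %0\n\t"
		    "leaq 128(%%rsp), %%rsp"
		    : "+r" (v) : : "memory");
# endif
#endif
  return v;
}
```

Either path should hand the value back unchanged, so a caller can check
it with e.g. assert (push_pop_roundtrip (42) == 42).  With the clobber a
single asm definition is enough, and targets without a red zone simply
ignore it, which is the "one definition" case discussed in the thread.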