Message ID | Zyt0I7BIL+dPhMm/@tucnak |
---|---|
State | New |
Headers | show |
Series | inline-asm, i386: Add "redzone" clobber support | expand |
> Hi! > > The following patch adds a "redzone" clobber (recognized just > on targets which choose to recognize it, right now just x86), > with which one can mark the rare case where inline asm pushes > something on the stack or uses call instruction without taking > red zone into account (i.e. addq $-128, %rsp; and addq $128, %rsp > around that). > > 2024-11-06 Jakub Jelinek <jakub@redhat.com> > > gcc/ > * target.def (redzone_clobber): New target hook. > * varasm.cc (decode_reg_name_and_count): Return -5 for > "redzone". > * cfgexpand.cc (expand_asm_stmt): Handle redzone clobber. > * config/i386/i386.h (struct machine_function): Add > asm_redzone_clobber_seen member. > * config/i386/i386.cc (ix86_compute_frame_layout): Don't > use red zone if cfun->machine->asm_redzone_clobber_seen. > (ix86_redzone_clobber): New function. > (TARGET_REDZONE_CLOBBER): Redefine. > * doc/extend.texi (Clobbers and Scratch Registers): Document > the "redzone" clobber. > * doc/tm.texi.in: Add @hook TARGET_REDZONE_CLOBBER. > * doc/tm.texi: Regenerate. > gcc/testsuite/ > * gcc.target/i386/asm-redzone-1.c: New test. LGTM, though if asm needs temporary memory it can ask for it explicitly. I guess this can be practical to hide actual call in the asm statement (which will break unwinding I guess). Honza
On Wed, Nov 06, 2024 at 06:04:43PM +0100, Jan Hubicka wrote: > LGTM, though if asm needs temporary memory it can ask for it explicitly. Sure, but the "redzone" is not about needing extra temporary memory, but about asking the compiler not to put any of its own temporaries in the red zone. The temporary memory could be a hack to avoid moving inline asm with call into the prologue or epilogue. > I guess this can be practical to hide actual call in the asm statement > (which will break unwinding I guess). When using .cfi_* directives, inline asm can be written such that it has accurate unwind info. When not using .cfi_* directives, the inline asm could e.g. jump to a subsection, do call there with hand written .eh_frame section for it and then jump back. I guess Linux kernel doesn't care, for whatever reason they hate DWARF. Jakub
On Wed, Nov 6, 2024 at 2:50 PM Jakub Jelinek <jakub@redhat.com> wrote: > > Hi! > > The following patch adds a "redzone" clobber (recognized just > on targets which choose to recognize it, right now just x86), > with which one can mark the rare case where inline asm pushes > something on the stack or uses call instruction without taking > red zone into account (i.e. addq $-128, %rsp; and addq $128, %rsp > around that). Maybe we should always recognize "redzone", even for targets without it. This is the way we recognize "cc" even for targets without CC reg (e.g. alpha). This would simplify the definition and processing - if the hook returns NULL_RTX (the default), then it (obviously) won't be added to the clobber list. If someone adds "redzone" clobber to the target without redzone it won't hurt if it is simply ignored. > 2024-11-06 Jakub Jelinek <jakub@redhat.com> > > gcc/ > * target.def (redzone_clobber): New target hook. > * varasm.cc (decode_reg_name_and_count): Return -5 for > "redzone". > * cfgexpand.cc (expand_asm_stmt): Handle redzone clobber. > * config/i386/i386.h (struct machine_function): Add > asm_redzone_clobber_seen member. > * config/i386/i386.cc (ix86_compute_frame_layout): Don't > use red zone if cfun->machine->asm_redzone_clobber_seen. > (ix86_redzone_clobber): New function. > (TARGET_REDZONE_CLOBBER): Redefine. > * doc/extend.texi (Clobbers and Scratch Registers): Document > the "redzone" clobber. > * doc/tm.texi.in: Add @hook TARGET_REDZONE_CLOBBER. > * doc/tm.texi: Regenerate. > gcc/testsuite/ > * gcc.target/i386/asm-redzone-1.c: New test. > > --- gcc/target.def.jj 2024-10-25 10:00:29.525767041 +0200 > +++ gcc/target.def 2024-11-06 13:32:39.376320955 +0100 > @@ -3376,6 +3376,18 @@ to be used.", > bool, (machine_mode mode), > NULL) > > +DEFHOOK > +(redzone_clobber, > + "Define this to return some RTL for the @code{redzone} @code{asm} clobber\n\ > +if target has a red zone and wants to support the @code{redzone} clobber\n\ > +or return NULL if the clobber shouldn't be recognized or return\n\ > +@code{pc_rtx} if the clobber should be recognized but should not be added\n\ > +to the list of clobbers.\n\ > +\n\ > +The default is not to support the @code{redzone} clobber.", > + rtx, (), > + NULL) > + > /* Support for named address spaces. */ > #undef HOOK_PREFIX > #define HOOK_PREFIX "TARGET_ADDR_SPACE_" > --- gcc/varasm.cc.jj 2024-11-06 10:23:21.215728007 +0100 > +++ gcc/varasm.cc 2024-11-06 12:21:17.891908352 +0100 > @@ -967,9 +967,11 @@ set_user_assembler_name (tree decl, cons > > /* Decode an `asm' spec for a declaration as a register name. > Return the register number, or -1 if nothing specified, > - or -2 if the ASMSPEC is not `cc' or `memory' and is not recognized, > + or -2 if the ASMSPEC is not `cc' or `memory' or `redzone' and is not > + recognized, > or -3 if ASMSPEC is `cc' and is not recognized, > or -4 if ASMSPEC is `memory' and is not recognized. Comma at the end of this part of the sentence. > + or -5 if ASMSPEC is `redzone' and is not recognized. > Accept an exact spelling or a decimal number. > Prefixes such as % are optional. */ > > @@ -1036,6 +1038,9 @@ decode_reg_name_and_count (const char *a > } > #endif /* ADDITIONAL_REGISTER_NAMES */ > > + if (!strcmp (asmspec, "redzone")) > + return -5; > + > if (!strcmp (asmspec, "memory")) > return -4; > > --- gcc/cfgexpand.cc.jj 2024-11-05 08:55:41.610187084 +0100 > +++ gcc/cfgexpand.cc 2024-11-06 14:26:07.684918717 +0100 > @@ -3193,6 +3193,19 @@ expand_asm_stmt (gasm *stmt) > j = decode_reg_name_and_count (regname, &nregs); > if (j < 0) > { > + if (j == -5) > + { > + rtx x = NULL_RTX; > + if (targetm.redzone_clobber) > + x = targetm.redzone_clobber (); > + if (x == pc_rtx) > + /* Recognize it but don't add anything to > + clobber_rvec. */; > + else if (x) > + clobber_rvec.safe_push (x); > + else > + j = -2; > + } > if (j == -2) > { > /* ??? Diagnose during gimplification? */ > @@ -3208,8 +3221,9 @@ expand_asm_stmt (gasm *stmt) > else > { > /* Otherwise we should have -1 == empty string > - or -3 == cc, which is not a register. */ > - gcc_assert (j == -1 || j == -3); > + or -3 == cc, which is not a register or > + -5 == redzone, ditto. */ > + gcc_assert (j == -1 || j == -3 || j == -5); > } > } > else > --- gcc/config/i386/i386.h.jj 2024-11-05 08:55:41.663186325 +0100 > +++ gcc/config/i386/i386.h 2024-11-06 14:11:42.893154474 +0100 > @@ -2881,6 +2881,9 @@ struct GTY(()) machine_function { > /* True if red zone is used. */ > BOOL_BITFIELD red_zone_used : 1; > > + /* True if inline asm with redzone clobber has been seen. */ > + BOOL_BITFIELD asm_redzone_clobber_seen : 1; > + > /* The largest alignment, in bytes, of stack slot actually used. */ > unsigned int max_used_stack_alignment; > > --- gcc/config/i386/i386.cc.jj 2024-11-05 08:55:41.661186354 +0100 > +++ gcc/config/i386/i386.cc 2024-11-06 14:12:52.560168428 +0100 > @@ -7172,7 +7172,8 @@ ix86_compute_frame_layout (void) > && crtl->sp_is_unchanging > && crtl->is_leaf > && !ix86_pc_thunk_call_expanded > - && !ix86_current_function_calls_tls_descriptor) > + && !ix86_current_function_calls_tls_descriptor > + && !cfun->machine->asm_redzone_clobber_seen) Please put the new check just after crtl-> checks. Thanks, Uros.
On Thu, Nov 07, 2024 at 08:47:34AM +0100, Uros Bizjak wrote: > Maybe we should always recognize "redzone", even for targets without > it. This is the way we recognize "cc" even for targets without CC reg > (e.g. alpha). This would simplify the definition and processing - if > the hook returns NULL_RTX (the default), then it (obviously) won't be > added to the clobber list. Dunno, am open to that, but thought it would be just weird if one says "redzone" on targets which don't have such a concept. > > --- gcc/varasm.cc.jj 2024-11-06 10:23:21.215728007 +0100 > > +++ gcc/varasm.cc 2024-11-06 12:21:17.891908352 +0100 > > @@ -967,9 +967,11 @@ set_user_assembler_name (tree decl, cons > > > > /* Decode an `asm' spec for a declaration as a register name. > > Return the register number, or -1 if nothing specified, > > - or -2 if the ASMSPEC is not `cc' or `memory' and is not recognized, > > + or -2 if the ASMSPEC is not `cc' or `memory' or `redzone' and is not > > + recognized, > > or -3 if ASMSPEC is `cc' and is not recognized, > > or -4 if ASMSPEC is `memory' and is not recognized. > > Comma at the end of this part of the sentence. Fixed in my copy. > > --- gcc/config/i386/i386.cc.jj 2024-11-05 08:55:41.661186354 +0100 > > +++ gcc/config/i386/i386.cc 2024-11-06 14:12:52.560168428 +0100 > > @@ -7172,7 +7172,8 @@ ix86_compute_frame_layout (void) > > && crtl->sp_is_unchanging > > && crtl->is_leaf > > && !ix86_pc_thunk_call_expanded > > - && !ix86_current_function_calls_tls_descriptor) > > + && !ix86_current_function_calls_tls_descriptor > > + && !cfun->machine->asm_redzone_clobber_seen) > > Please put the new check just after crtl-> checks. Changed in my copy too. The patch passed bootstrap/regtest on x86_64-linux and i686-linux btw. Jakub
On Thu, Nov 7, 2024 at 9:00 AM Jakub Jelinek <jakub@redhat.com> wrote: > > On Thu, Nov 07, 2024 at 08:47:34AM +0100, Uros Bizjak wrote: > > Maybe we should always recognize "redzone", even for targets without > > it. This is the way we recognize "cc" even for targets without CC reg > > (e.g. alpha). This would simplify the definition and processing - if > > the hook returns NULL_RTX (the default), then it (obviously) won't be > > added to the clobber list. > > Dunno, am open to that, but thought it would be just weird if one says > "redzone" on targets which don't have such a concept. Let's look at the situation with x86_32 and x86_64. The "redzone" for the former is just an afterthought, so we can safely say that it doesn't support it. So, the code that targets both targets (e.g. linux kernel) would (in a pedantic way) have to redefine many shared asm defines, one to have clobber and one without it. We don't want that, we want one definition and "let's compiler sort it out". For targets without clobber concept, well - don't add it to the clobber list if it is always ineffective. One *can* add "cc" to all alpha asms, but well.. ;) Uros.
--- gcc/target.def.jj 2024-10-25 10:00:29.525767041 +0200 +++ gcc/target.def 2024-11-06 13:32:39.376320955 +0100 @@ -3376,6 +3376,18 @@ to be used.", bool, (machine_mode mode), NULL) +DEFHOOK +(redzone_clobber, + "Define this to return some RTL for the @code{redzone} @code{asm} clobber\n\ +if target has a red zone and wants to support the @code{redzone} clobber\n\ +or return NULL if the clobber shouldn't be recognized or return\n\ +@code{pc_rtx} if the clobber should be recognized but should not be added\n\ +to the list of clobbers.\n\ +\n\ +The default is not to support the @code{redzone} clobber.", + rtx, (), + NULL) + /* Support for named address spaces. */ #undef HOOK_PREFIX #define HOOK_PREFIX "TARGET_ADDR_SPACE_" --- gcc/varasm.cc.jj 2024-11-06 10:23:21.215728007 +0100 +++ gcc/varasm.cc 2024-11-06 12:21:17.891908352 +0100 @@ -967,9 +967,11 @@ set_user_assembler_name (tree decl, cons /* Decode an `asm' spec for a declaration as a register name. Return the register number, or -1 if nothing specified, - or -2 if the ASMSPEC is not `cc' or `memory' and is not recognized, + or -2 if the ASMSPEC is not `cc' or `memory' or `redzone' and is not + recognized, or -3 if ASMSPEC is `cc' and is not recognized, or -4 if ASMSPEC is `memory' and is not recognized. + or -5 if ASMSPEC is `redzone' and is not recognized. Accept an exact spelling or a decimal number. Prefixes such as % are optional. */ @@ -1036,6 +1038,9 @@ decode_reg_name_and_count (const char *a } #endif /* ADDITIONAL_REGISTER_NAMES */ + if (!strcmp (asmspec, "redzone")) + return -5; + if (!strcmp (asmspec, "memory")) return -4; --- gcc/cfgexpand.cc.jj 2024-11-05 08:55:41.610187084 +0100 +++ gcc/cfgexpand.cc 2024-11-06 14:26:07.684918717 +0100 @@ -3193,6 +3193,19 @@ expand_asm_stmt (gasm *stmt) j = decode_reg_name_and_count (regname, &nregs); if (j < 0) { + if (j == -5) + { + rtx x = NULL_RTX; + if (targetm.redzone_clobber) + x = targetm.redzone_clobber (); + if (x == pc_rtx) + /* Recognize it but don't add anything to + clobber_rvec. */; + else if (x) + clobber_rvec.safe_push (x); + else + j = -2; + } if (j == -2) { /* ??? Diagnose during gimplification? */ @@ -3208,8 +3221,9 @@ expand_asm_stmt (gasm *stmt) else { /* Otherwise we should have -1 == empty string - or -3 == cc, which is not a register. */ - gcc_assert (j == -1 || j == -3); + or -3 == cc, which is not a register or + -5 == redzone, ditto. */ + gcc_assert (j == -1 || j == -3 || j == -5); } } else --- gcc/config/i386/i386.h.jj 2024-11-05 08:55:41.663186325 +0100 +++ gcc/config/i386/i386.h 2024-11-06 14:11:42.893154474 +0100 @@ -2881,6 +2881,9 @@ struct GTY(()) machine_function { /* True if red zone is used. */ BOOL_BITFIELD red_zone_used : 1; + /* True if inline asm with redzone clobber has been seen. */ + BOOL_BITFIELD asm_redzone_clobber_seen : 1; + /* The largest alignment, in bytes, of stack slot actually used. */ unsigned int max_used_stack_alignment; --- gcc/config/i386/i386.cc.jj 2024-11-05 08:55:41.661186354 +0100 +++ gcc/config/i386/i386.cc 2024-11-06 14:12:52.560168428 +0100 @@ -7172,7 +7172,8 @@ ix86_compute_frame_layout (void) && crtl->sp_is_unchanging && crtl->is_leaf && !ix86_pc_thunk_call_expanded - && !ix86_current_function_calls_tls_descriptor) + && !ix86_current_function_calls_tls_descriptor + && !cfun->machine->asm_redzone_clobber_seen) { frame->red_zone_size = to_allocate; if (frame->save_regs_using_mov) @@ -26268,6 +26269,25 @@ ix86_mode_can_transfer_bits (machine_mod return true; } +/* Implement TARGET_REDZONE_CLOBBER. */ +static rtx +ix86_redzone_clobber () +{ + cfun->machine->asm_redzone_clobber_seen = true; + if (ix86_using_red_zone ()) + { + rtx base = plus_constant (Pmode, stack_pointer_rtx, + GEN_INT (-RED_ZONE_SIZE)); + rtx mem = gen_rtx_MEM (BLKmode, base); + set_mem_size (mem, RED_ZONE_SIZE); + return mem; + } + /* Otherwise accept it. We certainly don't want an error + on "redzone" clobber in e.g. -mno-red-zone compiled function, + and even 32-bit code can use -mred-zone. */ + return pc_rtx; +} + /* Target-specific selftests. */ #if CHECKING_P @@ -27121,6 +27141,9 @@ ix86_libgcc_floating_mode_supported_p #undef TARGET_MODE_CAN_TRANSFER_BITS #define TARGET_MODE_CAN_TRANSFER_BITS ix86_mode_can_transfer_bits +#undef TARGET_REDZONE_CLOBBER +#define TARGET_REDZONE_CLOBBER ix86_redzone_clobber + static bool ix86_libc_has_fast_function (int fcode ATTRIBUTE_UNUSED) { --- gcc/doc/extend.texi.jj 2024-11-06 11:09:46.433407349 +0100 +++ gcc/doc/extend.texi 2024-11-06 14:37:01.426682958 +0100 @@ -11823,7 +11823,7 @@ asm volatile ("movc3 %0, %1, %2" : "r0", "r1", "r2", "r3", "r4", "r5", "memory"); @end example -Also, there are two special clobber arguments: +Also, there are three special clobber arguments: @table @code @item "cc" @@ -11851,6 +11851,18 @@ Note that this clobber does not prevent speculative reads past the @code{asm} statement. To prevent that, you need processor-specific fence instructions. +@item "redzone" +The @code{"redzone"} clobber tells the compiler that the assembly code +may write to the stack red zone, area below the stack pointer which on +some architectures in some calling conventions is guaranteed not to be +changed by signal handlers, interrupts or exceptions and so the compiler +can store there temporaries in leaf functions. The @code{"redzone"} +clobber may be used just on targets which do have a concept of a red zone +and should be used e.g.@: in case the assembly code uses call instructions +or pushes something to the stack without taking the red zone into account +by subtracting red zone size from the stack pointer first and restoring +it afterwards. + @end table Flushing registers to memory has performance implications and may be --- gcc/doc/tm.texi.in.jj 2024-09-24 11:31:48.701622348 +0200 +++ gcc/doc/tm.texi.in 2024-11-06 14:02:19.110134054 +0100 @@ -3464,6 +3464,8 @@ stack. @hook TARGET_MODE_CAN_TRANSFER_BITS +@hook TARGET_REDZONE_CLOBBER + @hook TARGET_TRANSLATE_MODE_ATTRIBUTE @hook TARGET_SCALAR_MODE_SUPPORTED_P --- gcc/doc/tm.texi.jj 2024-09-24 11:31:48.699622375 +0200 +++ gcc/doc/tm.texi 2024-11-06 14:07:09.391025522 +0100 @@ -4563,6 +4563,16 @@ The default is to assume modes with the to be used. @end deftypefn +@deftypefn {Target Hook} rtx TARGET_REDZONE_CLOBBER () +Define this to return some RTL for the @code{redzone} @code{asm} clobber +if target has a red zone and wants to support the @code{redzone} clobber +or return NULL if the clobber shouldn't be recognized or return +@code{pc_rtx} if the clobber should be recognized but should not be added +to the list of clobbers. + +The default is not to support the @code{redzone} clobber. +@end deftypefn + @deftypefn {Target Hook} machine_mode TARGET_TRANSLATE_MODE_ATTRIBUTE (machine_mode @var{mode}) Define this hook if during mode attribute processing, the port should translate machine_mode @var{mode} to another mode. For example, rs6000's --- gcc/testsuite/gcc.target/i386/asm-redzone-1.c.jj 2024-11-06 14:41:39.970752016 +0100 +++ gcc/testsuite/gcc.target/i386/asm-redzone-1.c 2024-11-06 14:42:18.625206505 +0100 @@ -0,0 +1,38 @@ +/* { dg-do run { target lp64 } } */ +/* { dg-options "-O2" } */ + +__attribute__((noipa)) int +foo (void) +{ + int a = 1; + int b = 2; + int c = 3; + int d = 4; + int e = 5; + int f = 6; + int g = 7; + int h = 8; + int i = 9; + int j = 10; + int k = 11; + int l = 12; + int m = 13; + int n = 14; + asm volatile ("" : "+g" (a), "+g" (b), "+g" (c), "+g" (d), "+g" (e)); + asm volatile ("" : "+g" (f), "+g" (g), "+g" (h), "+g" (i), "+g" (j)); + asm volatile ("" : "+g" (k), "+g" (l), "+g" (m), "+g" (n)); + asm volatile ("{pushq %%rax; pushq %%rax; popq %%rax; popq %%rax" + "|push rax;push rax;pop rax;pop rax}" + : : : "ax", "si", "di", "r10", "r11", "redzone"); + asm volatile ("" : "+g" (a), "+g" (b), "+g" (c), "+g" (d), "+g" (e)); + asm volatile ("" : "+g" (f), "+g" (g), "+g" (h), "+g" (i), "+g" (j)); + asm volatile ("" : "+g" (k), "+g" (l), "+g" (m), "+g" (n)); + return a + b + c + d + e + f + g + h + i + j + k + l + m + n; +} + +int +main () +{ + if (foo () != 105) + __builtin_abort (); +}