[03/11] aarch64: Use br instead of ret for eh_return

Message ID 913cd5eb33e01ad279915b4a1f0ce4bd7afd5ad7.1692699125.git.szabolcs.nagy@arm.com
State New
Series aarch64 GCS preliminary patches

Commit Message

Szabolcs Nagy Aug. 22, 2023, 10:38 a.m. UTC
The expected way to handle eh_return is to pass the stack adjustment
offset and landing pad address via

  EH_RETURN_STACKADJ_RTX
  EH_RETURN_HANDLER_RTX

to the epilogue that is shared between normal return paths and the
eh_return paths.  EH_RETURN_HANDLER_RTX is the stack slot of the
return address that is overwritten with the landing pad in the
eh_return case and EH_RETURN_STACKADJ_RTX is a register added to sp
right before return and it is set to 0 in the normal return case.

The issue with this design is that eh_return and normal return may
require different return sequences, but there is no way to distinguish
the two cases in the epilogue (the stack adjustment may be 0 in the
eh_return case too).

The reason eh_return and normal return require different return
sequences is that control flow integrity hardening may need to treat
eh_return as a forward-edge transfer (it is not returning to the
previous stack frame) and normal return as a backward-edge one.
On AArch64, forward edges are protected by BTI and require a br
instruction, while backward edges are protected by PAUTH or GCS and
require a ret (or authenticated ret) instruction.

This patch resolves the issue by using the EH_RETURN_STACKADJ_RTX
register only as a flag that is set to 1 in the eh_return paths
(it is 0 in normal return paths) and introduces

  AARCH64_EH_RETURN_STACKADJ_RTX
  AARCH64_EH_RETURN_HANDLER_RTX

to pass the actual stack adjustment and landing pad address to the
epilogue in the eh_return case. Then the epilogue can use the right
return sequence based on the EH_RETURN_STACKADJ_RTX flag.
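
The flag-based dispatch described above can be sketched as a small
standalone model.  This is only an illustration and not code from the
patch: the struct, enum and function names are hypothetical, and the
register roles (x4 flag, x5 adjustment, x6 handler) are taken from the
aarch64.h hunk of this patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the registers the shared epilogue inspects.
struct EpilogueInputs {
  std::uint64_t x4_flag;      // EH_RETURN_STACKADJ_RTX, used only as a flag
  std::uint64_t x5_stackadj;  // AARCH64_EH_RETURN_STACKADJ_RTX
  std::uint64_t x6_handler;   // AARCH64_EH_RETURN_HANDLER_RTX
  std::uint64_t x30_lr;       // normal return address
  std::uint64_t sp;           // sp after the frame has been destroyed
};

// Which kind of control transfer the epilogue ends with.
enum class Transfer { Ret, Br };

struct EpilogueResult {
  Transfer kind;         // ret (backward edge) or br (forward edge)
  std::uint64_t target;  // address control is transferred to
  std::uint64_t sp;      // final stack pointer
};

// Model of the tail of the epilogue: the flag, not the adjustment value,
// selects the return sequence, so an adjustment of 0 is still an EH return.
inline EpilogueResult model_epilogue(const EpilogueInputs &in) {
  // In this model the flag is only ever written as 0 or 1.
  assert(in.x4_flag == 0 || in.x4_flag == 1);
  if (in.x4_flag != 0) {
    // eh_return path: adjust sp, then branch indirectly to the handler.
    return {Transfer::Br, in.x6_handler, in.sp + in.x5_stackadj};
  }
  // Normal path: x5 and x6 are ignored and the function returns via x30.
  return {Transfer::Ret, in.x30_lr, in.sp};
}
```

With x4 == 1 and x5 == 0 the model still ends in a br to x6, which is
the case the old adjustment-only interface could not tell apart from a
normal return.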

The handler could be passed the old way via clobbering the return
address, but since now the eh_return case can be distinguished, the
handler can be in a different register than x30 and no stack frame
is needed for eh_return.

The new code generation for functions with eh_return is not amazing,
since x5 and x6 are assumed to be used by the epilogue even in the
normal return path, not just for eh_return.  But only the unwinder
is expected to use eh_return so this is fine.

This patch fixes a return-to-anywhere gadget in the unwinder with
existing standard branch protection and makes EH return compatible
with the Guarded Control Stack (GCS) extension.

gcc/ChangeLog:

	* config/aarch64/aarch64-protos.h (aarch64_eh_return_handler_rtx):
	Remove.
	(aarch64_eh_return): New.
	* config/aarch64/aarch64.cc (aarch64_return_address_signing_enabled):
	Sign return address even in functions with eh_return.
	(aarch64_epilogue_uses): Mark two registers as used.
	(aarch64_expand_epilogue): Conditionally return with br or ret.
	(aarch64_eh_return_handler_rtx): Remove.
	(aarch64_eh_return): New.
	* config/aarch64/aarch64.h (EH_RETURN_HANDLER_RTX): Remove.
	(AARCH64_EH_RETURN_STACKADJ_REGNUM): Define.
	(AARCH64_EH_RETURN_STACKADJ_RTX): Define.
	(AARCH64_EH_RETURN_HANDLER_REGNUM): Define.
	(AARCH64_EH_RETURN_HANDLER_RTX): Define.
	* config/aarch64/aarch64.md (eh_return): New.
---
 gcc/config/aarch64/aarch64-protos.h |   2 +-
 gcc/config/aarch64/aarch64.cc       | 106 +++++++++++++++-------------
 gcc/config/aarch64/aarch64.h        |  11 ++-
 gcc/config/aarch64/aarch64.md       |   8 +++
 4 files changed, 73 insertions(+), 54 deletions(-)

Comments

Richard Sandiford Aug. 23, 2023, 9:28 a.m. UTC | #1
Szabolcs Nagy <szabolcs.nagy@arm.com> writes:
> The expected way to handle eh_return is to pass the stack adjustment
> offset and landing pad address via
>
>   EH_RETURN_STACKADJ_RTX
>   EH_RETURN_HANDLER_RTX
>
> to the epilogue that is shared between normal return paths and the
> eh_return paths.  EH_RETURN_HANDLER_RTX is the stack slot of the
> return address that is overwritten with the landing pad in the
> eh_return case and EH_RETURN_STACKADJ_RTX is a register added to sp
> right before return and it is set to 0 in the normal return case.
>
> The issue with this design is that eh_return and normal return may
> require different return sequences, but there is no way to distinguish
> the two cases in the epilogue (the stack adjustment may be 0 in the
> eh_return case too).
>
> The reason eh_return and normal return require different return
> sequences is that control flow integrity hardening may need to treat
> eh_return as a forward-edge transfer (it is not returning to the
> previous stack frame) and normal return as a backward-edge one.
> On AArch64, forward edges are protected by BTI and require a br
> instruction, while backward edges are protected by PAUTH or GCS and
> require a ret (or authenticated ret) instruction.
>
> This patch resolves the issue by using the EH_RETURN_STACKADJ_RTX
> register only as a flag that is set to 1 in the eh_return paths
> (it is 0 in normal return paths) and introduces
>
>   AARCH64_EH_RETURN_STACKADJ_RTX
>   AARCH64_EH_RETURN_HANDLER_RTX
>
> to pass the actual stack adjustment and landing pad address to the
> epilogue in the eh_return case. Then the epilogue can use the right
> return sequence based on the EH_RETURN_STACKADJ_RTX flag.
>
> The handler could be passed the old way via clobbering the return
> address, but since now the eh_return case can be distinguished, the
> handler can be in a different register than x30 and no stack frame
> is needed for eh_return.

I don't think there's any specific target-independent requirement for
EH_RETURN_HANDLER_RTX to be a stack slot.  df-scan.cc has code to handle
registers.

So couldn't we just use EH_RETURN_HANDLER_RTX for this, rather than
making it AARCH64_EH_RETURN_HANDLER_RTX?

> The new code generation for functions with eh_return is not amazing,
> since x5 and x6 are assumed to be used by the epilogue even in the
> normal return path, not just for eh_return.  But only the unwinder
> is expected to use eh_return so this is fine.

I guess the problem here is that x5 and x6 are upwards-exposed on
the non-eh_return paths, and so are treated as live for most of the
function.  Is that right?

The patch seems to be using the existing interfaces to implement
a slightly different model.  E.g. it feels like a hack (but a neat hack)
that EH_RETURN_STACKADJ_RTX is now a flag rather than an adjustment,
with AARCH64_EH_RETURN_STACKADJ_RTX then being the "real" stack
adjustment.  And the reason for the upwards exposure of the new
registers on normal return paths is that the existing model has
no hook into the normal return path.

Rather than hiding this in target code, perhaps we should add a
target-independent concept of an "eh_return taken" flag, say
EH_RETURN_TAKEN_RTX.

We could define it so that, on targets that define EH_RETURN_TAKEN_RTX,
a register EH_RETURN_STACKADJ_RTX and a register EH_RETURN_HANDLER_RTX
are only meaningful when the flag is true.  E.g. we could have:

#ifdef EH_RETURN_HANDLER_RTX
  for (rtx tmp : { EH_RETURN_STACKADJ_RTX, EH_RETURN_HANDLER_RTX })
    if (tmp && REG_P (tmp))
      emit_clobber (tmp);
#endif

in the "normal return" part of expand_eh_return.  (If some other target
wants a flag with different semantics, it'd be up to them to add it.)

That should avoid most of the bad code-quality effects, since the
specialness of x4-x6 will be confined to the code immediately before
the pre-epilogue exit edges.
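
The liveness argument above can be illustrated with a toy backward
dataflow over a four-block CFG.  This is a sketch under stated
assumptions, not GCC's df implementation: the Block type, the
live_in and make_cfg helpers and the block layout are hypothetical,
and a clobber is modelled simply as a definition of the register.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical basic block: registers are bits in a 32-bit mask.
struct Block {
  std::uint32_t use;       // registers read before being written
  std::uint32_t def;       // registers written (a clobber counts as a def)
  std::vector<int> succs;  // indices of successor blocks
};

// Classic backward liveness: in[b] = use[b] | (out[b] & ~def[b]),
// where out[b] is the union of in[s] over all successors s.
inline std::vector<std::uint32_t> live_in(const std::vector<Block> &cfg) {
  std::vector<std::uint32_t> in(cfg.size(), 0);
  bool changed = true;
  while (changed) {
    changed = false;
    for (int i = static_cast<int>(cfg.size()) - 1; i >= 0; --i) {
      std::uint32_t out = 0;
      for (int s : cfg[i].succs)
        out |= in[s];
      std::uint32_t next = cfg[i].use | (out & ~cfg[i].def);
      if (next != in[i]) {
        in[i] = next;
        changed = true;
      }
    }
  }
  return in;
}

constexpr std::uint32_t X4 = 1u << 4, X5 = 1u << 5, X6 = 1u << 6;

// Blocks: 0 = function body, 1 = normal-return stub (sets x4 to 0),
// 2 = eh_return stub (sets x4, x5, x6), 3 = shared epilogue (uses all three).
inline std::vector<Block> make_cfg(bool clobber_on_normal_path) {
  Block body{0, 0, {1, 2}};
  Block normal{0, clobber_on_normal_path ? (X4 | X5 | X6) : X4, {3}};
  Block eh{0, X4 | X5 | X6, {3}};
  Block epilogue{X4 | X5 | X6, 0, {}};
  return {body, normal, eh, epilogue};
}
```

In this model, live_in(make_cfg(false))[0] contains x5 and x6, i.e. both
are live throughout the body, while live_in(make_cfg(true))[0] is empty
because the clobbers on the normal-return path end their live ranges
just before the epilogue.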

Thanks,
Richard

> This patch fixes a return-to-anywhere gadget in the unwinder with
> existing standard branch protection and makes EH return compatible
> with the Guarded Control Stack (GCS) extension.
>
> gcc/ChangeLog:
>
> 	* config/aarch64/aarch64-protos.h (aarch64_eh_return_handler_rtx):
> 	Remove.
> 	(aarch64_eh_return): New.
> 	* config/aarch64/aarch64.cc (aarch64_return_address_signing_enabled):
> 	Sign return address even in functions with eh_return.
> 	(aarch64_epilogue_uses): Mark two registers as used.
> 	(aarch64_expand_epilogue): Conditionally return with br or ret.
> 	(aarch64_eh_return_handler_rtx): Remove.
> 	(aarch64_eh_return): New.
> 	* config/aarch64/aarch64.h (EH_RETURN_HANDLER_RTX): Remove.
> 	(AARCH64_EH_RETURN_STACKADJ_REGNUM): Define.
> 	(AARCH64_EH_RETURN_STACKADJ_RTX): Define.
> 	(AARCH64_EH_RETURN_HANDLER_REGNUM): Define.
> 	(AARCH64_EH_RETURN_HANDLER_RTX): Define.
> 	* config/aarch64/aarch64.md (eh_return): New.
> ---
>  gcc/config/aarch64/aarch64-protos.h |   2 +-
>  gcc/config/aarch64/aarch64.cc       | 106 +++++++++++++++-------------
>  gcc/config/aarch64/aarch64.h        |  11 ++-
>  gcc/config/aarch64/aarch64.md       |   8 +++
>  4 files changed, 73 insertions(+), 54 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
> index 70303d6fd95..5d1834162a4 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -855,7 +855,7 @@ machine_mode aarch64_hard_regno_caller_save_mode (unsigned, unsigned,
>  						       machine_mode);
>  int aarch64_uxt_size (int, HOST_WIDE_INT);
>  int aarch64_vec_fpconst_pow_of_2 (rtx);
> -rtx aarch64_eh_return_handler_rtx (void);
> +void aarch64_eh_return (rtx);
>  rtx aarch64_mask_from_zextract_ops (rtx, rtx);
>  const char *aarch64_output_move_struct (rtx *operands);
>  rtx aarch64_return_addr_rtx (void);
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index eba5d4a7e04..36cd172d182 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -8972,17 +8972,6 @@ aarch64_return_address_signing_enabled (void)
>    /* This function should only be called after frame laid out.   */
>    gcc_assert (cfun->machine->frame.laid_out);
>  
> -  /* Turn return address signing off in any function that uses
> -     __builtin_eh_return.  The address passed to __builtin_eh_return
> -     is not signed so either it has to be signed (with original sp)
> -     or the code path that uses it has to avoid authenticating it.
> -     Currently eh return introduces a return to anywhere gadget, no
> -     matter what we do here since it uses ret with user provided
> -     address. An ideal fix for that is to use indirect branch which
> -     can be protected with BTI j (to some extent).  */
> -  if (crtl->calls_eh_return)
> -    return false;
> -
>    /* If signing scope is AARCH_FUNCTION_NON_LEAF, we only sign a leaf function
>       if its LR is pushed onto stack.  */
>    return (aarch_ra_sign_scope == AARCH_FUNCTION_ALL
> @@ -9932,9 +9921,8 @@ aarch64_allocate_and_probe_stack_space (rtx temp1, rtx temp2,
>     Note that in the case of sibcalls, the values "used by the epilogue" are
>     considered live at the start of the called function.
>  
> -   For SIMD functions we need to return 1 for FP registers that are saved and
> -   restored by a function but are not zero in call_used_regs.  If we do not do 
> -   this optimizations may remove the restore of the register.  */
> +   For EH return we need to keep two registers alive for stack adjustment
> +   and return address.  */
>  
>  int
>  aarch64_epilogue_uses (int regno)
> @@ -9944,6 +9932,13 @@ aarch64_epilogue_uses (int regno)
>        if (regno == LR_REGNUM)
>  	return 1;
>      }
> +
> +  if (!epilogue_completed && crtl->calls_eh_return)
> +    {
> +      if (regno == AARCH64_EH_RETURN_STACKADJ_REGNUM
> +	  || regno == AARCH64_EH_RETURN_HANDLER_REGNUM)
> +	return 1;
> +    }
>    return 0;
>  }
>  
> @@ -10342,6 +10337,30 @@ aarch64_expand_epilogue (bool for_sibcall)
>        RTX_FRAME_RELATED_P (insn) = 1;
>      }
>  
> +  /* Stack adjustment for exception handler.  */
> +  if (crtl->calls_eh_return && !for_sibcall)
> +    {
> +      /* If the EH_RETURN_STACKADJ_RTX flag is set then we need
> +	 to unwind the stack and jump to the handler, otherwise
> +	 skip this eh_return logic and continue with normal
> +	 return after the label.  We have already reset the CFA
> +	 to be SP; letting the CFA move during this adjustment
> +	 is just as correct as retaining the CFA from the body
> +	 of the function.  Therefore, do nothing special.  */
> +      rtx label = gen_label_rtx ();
> +      rtx x = gen_rtx_EQ (VOIDmode, EH_RETURN_STACKADJ_RTX, const0_rtx);
> +      x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
> +				gen_rtx_LABEL_REF (Pmode, label), pc_rtx);
> +      rtx jump = emit_jump_insn (gen_rtx_SET (pc_rtx, x));
> +      JUMP_LABEL (jump) = label;
> +      LABEL_NUSES (label)++;
> +      emit_insn (gen_add2_insn (stack_pointer_rtx,
> +				AARCH64_EH_RETURN_STACKADJ_RTX));
> +      emit_jump_insn (gen_indirect_jump (AARCH64_EH_RETURN_HANDLER_RTX));
> +      emit_barrier ();
> +      emit_label (label);
> +    }
> +
>    /* We prefer to emit the combined return/authenticate instruction RETAA,
>       however there are three cases in which we must instead emit an explicit
>       authentication instruction.
> @@ -10371,56 +10390,41 @@ aarch64_expand_epilogue (bool for_sibcall)
>        RTX_FRAME_RELATED_P (insn) = 1;
>      }
>  
> -  /* Stack adjustment for exception handler.  */
> -  if (crtl->calls_eh_return && !for_sibcall)
> -    {
> -      /* We need to unwind the stack by the offset computed by
> -	 EH_RETURN_STACKADJ_RTX.  We have already reset the CFA
> -	 to be SP; letting the CFA move during this adjustment
> -	 is just as correct as retaining the CFA from the body
> -	 of the function.  Therefore, do nothing special.  */
> -      emit_insn (gen_add2_insn (stack_pointer_rtx, EH_RETURN_STACKADJ_RTX));
> -    }
> -
>    emit_use (gen_rtx_REG (DImode, LR_REGNUM));
>    if (!for_sibcall)
>      emit_jump_insn (ret_rtx);
>  }
>  
> -/* Implement EH_RETURN_HANDLER_RTX.  EH returns need to either return
> -   normally or return to a previous frame after unwinding.
> +/* Implement the eh_return instruction pattern.  Functions with EH returns
> +   either return normally or return to a previous frame after unwinding.
>  
> -   An EH return uses a single shared return sequence.  The epilogue is
> +   The two cases use a single shared return sequence.  The epilogue is
>     exactly like a normal epilogue except that it has an extra input
>     register (EH_RETURN_STACKADJ_RTX) which contains the stack adjustment
>     that must be applied after the frame has been destroyed.  An extra label
>     is inserted before the epilogue which initializes this register to zero,
>     and this is the entry point for a normal return.
>  
> -   An actual EH return updates the return address, initializes the stack
> -   adjustment and jumps directly into the epilogue (bypassing the zeroing
> -   of the adjustment).  Since the return address is typically saved on the
> -   stack when a function makes a call, the saved LR must be updated outside
> -   the epilogue.
> -
> -   This poses problems as the store is generated well before the epilogue,
> -   so the offset of LR is not known yet.  Also optimizations will remove the
> -   store as it appears dead, even after the epilogue is generated (as the
> -   base or offset for loading LR is different in many cases).
> -
> -   To avoid these problems this implementation forces the frame pointer
> -   in eh_return functions so that the location of LR is fixed and known early.
> -   It also marks the store volatile, so no optimization is permitted to
> -   remove the store.  */
> -rtx
> -aarch64_eh_return_handler_rtx (void)
> -{
> -  rtx tmp = gen_frame_mem (Pmode,
> -    plus_constant (Pmode, hard_frame_pointer_rtx, UNITS_PER_WORD));
> +   An actual EH return initializes the stack adjustment then invokes this
> +   target hook (which is supposed to overwrite the return address) and then
> +   jumps directly into the epilogue (bypassing the zeroing of the adjustment).
> +
> +   We depart from the intended EH return logic by using two additional
> +   registers to pass the handler and stack adjustment to the epilogue
>  
> -  /* Mark the store volatile, so no optimization is permitted to remove it.  */
> -  MEM_VOLATILE_P (tmp) = true;
> -  return tmp;
> +     AARCH64_EH_RETURN_HANDLER_RTX
> +     AARCH64_EH_RETURN_STACKADJ_RTX
> +
> +   and set EH_RETURN_STACKADJ_RTX to 1 in the EH return path so it is a
> +   flag that the epilogue can use to distinguish normal and EH returns.
> +   This allows different return instructions in the two cases.  The return
> +   address is not modified for EH returns.  */
> +void
> +aarch64_eh_return (rtx handler)
> +{
> +  emit_move_insn (AARCH64_EH_RETURN_HANDLER_RTX, handler);
> +  emit_move_insn (AARCH64_EH_RETURN_STACKADJ_RTX, EH_RETURN_STACKADJ_RTX);
> +  emit_move_insn (EH_RETURN_STACKADJ_RTX, const1_rtx);
>  }
>  
>  /* Output code to add DELTA to the first argument, and then jump
> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> index c783cb96c48..fa68ef0057a 100644
> --- a/gcc/config/aarch64/aarch64.h
> +++ b/gcc/config/aarch64/aarch64.h
> @@ -583,9 +583,16 @@ enum class aarch64_feature : unsigned char {
>  /* Output assembly strings after .cfi_startproc is emitted.  */
>  #define ASM_POST_CFI_STARTPROC  aarch64_post_cfi_startproc
>  
> -/* For EH returns X4 contains the stack adjustment.  */
> +/* For EH returns X4 is a flag that is set in the EH return
> +   code paths and then X5 and X6 contain the stack adjustment
> +   and return address respectively.  */
>  #define EH_RETURN_STACKADJ_RTX	gen_rtx_REG (Pmode, R4_REGNUM)
> -#define EH_RETURN_HANDLER_RTX  aarch64_eh_return_handler_rtx ()
> +#define AARCH64_EH_RETURN_STACKADJ_REGNUM	R5_REGNUM
> +#define AARCH64_EH_RETURN_STACKADJ_RTX	\
> +  gen_rtx_REG (Pmode, AARCH64_EH_RETURN_STACKADJ_REGNUM)
> +#define AARCH64_EH_RETURN_HANDLER_REGNUM	R6_REGNUM
> +#define AARCH64_EH_RETURN_HANDLER_RTX	\
> +  gen_rtx_REG (Pmode, AARCH64_EH_RETURN_HANDLER_REGNUM)
>  
>  #undef TARGET_COMPUTE_FRAME_LAYOUT
>  #define TARGET_COMPUTE_FRAME_LAYOUT aarch64_layout_frame
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 01cf989641f..0a3474776f0 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -877,6 +877,14 @@ (define_expand "sibcall_epilogue"
>    "
>  )
>  
> +(define_expand "eh_return"
> +  [(use (match_operand 0 "general_operand"))]
> +  ""
> +{
> +  aarch64_eh_return (operands[0]);
> +  DONE;
> +})
> +
>  (define_insn "*do_return"
>    [(return)]
>    ""
Richard Sandiford Aug. 24, 2023, 9:43 a.m. UTC | #2
Richard Sandiford <richard.sandiford@arm.com> writes:
> Rather than hiding this in target code, perhaps we should add a
> target-independent concept of an "eh_return taken" flag, say
> EH_RETURN_TAKEN_RTX.
>
> We could define it so that, on targets that define EH_RETURN_TAKEN_RTX,
> a register EH_RETURN_STACKADJ_RTX and a register EH_RETURN_HANDLER_RTX
> are only meaningful when the flag is true.  E.g. we could have:
>
> #ifdef EH_RETURN_HANDLER_RTX

Gah, I meant #ifdef EH_RETURN_TAKEN_RTX here

>   for (rtx tmp : { EH_RETURN_STACKADJ_RTX, EH_RETURN_HANDLER_RTX })
>     if (tmp && REG_P (tmp))
>       emit_clobber (tmp);
> #endif
>
> in the "normal return" part of expand_eh_return.  (If some other target
> wants a flag with different semantics, it'd be up to them to add it.)
>
> That should avoid most of the bad code-quality effects, since the
> specialness of x4-x6 will be confined to the code immediately before
> the pre-epilogue exit edges.
>
> Thanks,
> Richard
Patch

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 70303d6fd95..5d1834162a4 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -855,7 +855,7 @@  machine_mode aarch64_hard_regno_caller_save_mode (unsigned, unsigned,
 						       machine_mode);
 int aarch64_uxt_size (int, HOST_WIDE_INT);
 int aarch64_vec_fpconst_pow_of_2 (rtx);
-rtx aarch64_eh_return_handler_rtx (void);
+void aarch64_eh_return (rtx);
 rtx aarch64_mask_from_zextract_ops (rtx, rtx);
 const char *aarch64_output_move_struct (rtx *operands);
 rtx aarch64_return_addr_rtx (void);
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index eba5d4a7e04..36cd172d182 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -8972,17 +8972,6 @@  aarch64_return_address_signing_enabled (void)
   /* This function should only be called after frame laid out.   */
   gcc_assert (cfun->machine->frame.laid_out);
 
-  /* Turn return address signing off in any function that uses
-     __builtin_eh_return.  The address passed to __builtin_eh_return
-     is not signed so either it has to be signed (with original sp)
-     or the code path that uses it has to avoid authenticating it.
-     Currently eh return introduces a return to anywhere gadget, no
-     matter what we do here since it uses ret with user provided
-     address. An ideal fix for that is to use indirect branch which
-     can be protected with BTI j (to some extent).  */
-  if (crtl->calls_eh_return)
-    return false;
-
   /* If signing scope is AARCH_FUNCTION_NON_LEAF, we only sign a leaf function
      if its LR is pushed onto stack.  */
   return (aarch_ra_sign_scope == AARCH_FUNCTION_ALL
@@ -9932,9 +9921,8 @@  aarch64_allocate_and_probe_stack_space (rtx temp1, rtx temp2,
    Note that in the case of sibcalls, the values "used by the epilogue" are
    considered live at the start of the called function.
 
-   For SIMD functions we need to return 1 for FP registers that are saved and
-   restored by a function but are not zero in call_used_regs.  If we do not do 
-   this optimizations may remove the restore of the register.  */
+   For EH return we need to keep two registers alive for stack adjustment
+   and return address.  */
 
 int
 aarch64_epilogue_uses (int regno)
@@ -9944,6 +9932,13 @@  aarch64_epilogue_uses (int regno)
       if (regno == LR_REGNUM)
 	return 1;
     }
+
+  if (!epilogue_completed && crtl->calls_eh_return)
+    {
+      if (regno == AARCH64_EH_RETURN_STACKADJ_REGNUM
+	  || regno == AARCH64_EH_RETURN_HANDLER_REGNUM)
+	return 1;
+    }
   return 0;
 }
 
@@ -10342,6 +10337,30 @@  aarch64_expand_epilogue (bool for_sibcall)
       RTX_FRAME_RELATED_P (insn) = 1;
     }
 
+  /* Stack adjustment for exception handler.  */
+  if (crtl->calls_eh_return && !for_sibcall)
+    {
+      /* If the EH_RETURN_STACKADJ_RTX flag is set then we need
+	 to unwind the stack and jump to the handler, otherwise
+	 skip this eh_return logic and continue with normal
+	 return after the label.  We have already reset the CFA
+	 to be SP; letting the CFA move during this adjustment
+	 is just as correct as retaining the CFA from the body
+	 of the function.  Therefore, do nothing special.  */
+      rtx label = gen_label_rtx ();
+      rtx x = gen_rtx_EQ (VOIDmode, EH_RETURN_STACKADJ_RTX, const0_rtx);
+      x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
+				gen_rtx_LABEL_REF (Pmode, label), pc_rtx);
+      rtx jump = emit_jump_insn (gen_rtx_SET (pc_rtx, x));
+      JUMP_LABEL (jump) = label;
+      LABEL_NUSES (label)++;
+      emit_insn (gen_add2_insn (stack_pointer_rtx,
+				AARCH64_EH_RETURN_STACKADJ_RTX));
+      emit_jump_insn (gen_indirect_jump (AARCH64_EH_RETURN_HANDLER_RTX));
+      emit_barrier ();
+      emit_label (label);
+    }
+
   /* We prefer to emit the combined return/authenticate instruction RETAA,
      however there are three cases in which we must instead emit an explicit
      authentication instruction.
@@ -10371,56 +10390,41 @@  aarch64_expand_epilogue (bool for_sibcall)
       RTX_FRAME_RELATED_P (insn) = 1;
     }
 
-  /* Stack adjustment for exception handler.  */
-  if (crtl->calls_eh_return && !for_sibcall)
-    {
-      /* We need to unwind the stack by the offset computed by
-	 EH_RETURN_STACKADJ_RTX.  We have already reset the CFA
-	 to be SP; letting the CFA move during this adjustment
-	 is just as correct as retaining the CFA from the body
-	 of the function.  Therefore, do nothing special.  */
-      emit_insn (gen_add2_insn (stack_pointer_rtx, EH_RETURN_STACKADJ_RTX));
-    }
-
   emit_use (gen_rtx_REG (DImode, LR_REGNUM));
   if (!for_sibcall)
     emit_jump_insn (ret_rtx);
 }
 
-/* Implement EH_RETURN_HANDLER_RTX.  EH returns need to either return
-   normally or return to a previous frame after unwinding.
+/* Implement the eh_return instruction pattern.  Functions with EH returns
+   either return normally or return to a previous frame after unwinding.
 
-   An EH return uses a single shared return sequence.  The epilogue is
+   The two cases use a single shared return sequence.  The epilogue is
    exactly like a normal epilogue except that it has an extra input
    register (EH_RETURN_STACKADJ_RTX) which contains the stack adjustment
    that must be applied after the frame has been destroyed.  An extra label
    is inserted before the epilogue which initializes this register to zero,
    and this is the entry point for a normal return.
 
-   An actual EH return updates the return address, initializes the stack
-   adjustment and jumps directly into the epilogue (bypassing the zeroing
-   of the adjustment).  Since the return address is typically saved on the
-   stack when a function makes a call, the saved LR must be updated outside
-   the epilogue.
-
-   This poses problems as the store is generated well before the epilogue,
-   so the offset of LR is not known yet.  Also optimizations will remove the
-   store as it appears dead, even after the epilogue is generated (as the
-   base or offset for loading LR is different in many cases).
-
-   To avoid these problems this implementation forces the frame pointer
-   in eh_return functions so that the location of LR is fixed and known early.
-   It also marks the store volatile, so no optimization is permitted to
-   remove the store.  */
-rtx
-aarch64_eh_return_handler_rtx (void)
-{
-  rtx tmp = gen_frame_mem (Pmode,
-    plus_constant (Pmode, hard_frame_pointer_rtx, UNITS_PER_WORD));
+   An actual EH return initializes the stack adjustment then invokes this
+   target hook (which is supposed to overwrite the return address) and then
+   jumps directly into the epilogue (bypassing the zeroing of the adjustment).
+
+   We depart from the intended EH return logic by using two additional
+   registers to pass the handler and stack adjustment to the epilogue
 
-  /* Mark the store volatile, so no optimization is permitted to remove it.  */
-  MEM_VOLATILE_P (tmp) = true;
-  return tmp;
+     AARCH64_EH_RETURN_HANDLER_RTX
+     AARCH64_EH_RETURN_STACKADJ_RTX
+
+   and set EH_RETURN_STACKADJ_RTX to 1 in the EH return path so it is a
+   flag that the epilogue can use to distinguish normal and EH returns.
+   This allows different return instructions in the two cases.  The return
+   address is not modified for EH returns.  */
+void
+aarch64_eh_return (rtx handler)
+{
+  emit_move_insn (AARCH64_EH_RETURN_HANDLER_RTX, handler);
+  emit_move_insn (AARCH64_EH_RETURN_STACKADJ_RTX, EH_RETURN_STACKADJ_RTX);
+  emit_move_insn (EH_RETURN_STACKADJ_RTX, const1_rtx);
 }
 
 /* Output code to add DELTA to the first argument, and then jump
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index c783cb96c48..fa68ef0057a 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -583,9 +583,16 @@  enum class aarch64_feature : unsigned char {
 /* Output assembly strings after .cfi_startproc is emitted.  */
 #define ASM_POST_CFI_STARTPROC  aarch64_post_cfi_startproc
 
-/* For EH returns X4 contains the stack adjustment.  */
+/* For EH returns X4 is a flag that is set in the EH return
+   code paths and then X5 and X6 contain the stack adjustment
+   and return address respectively.  */
 #define EH_RETURN_STACKADJ_RTX	gen_rtx_REG (Pmode, R4_REGNUM)
-#define EH_RETURN_HANDLER_RTX  aarch64_eh_return_handler_rtx ()
+#define AARCH64_EH_RETURN_STACKADJ_REGNUM	R5_REGNUM
+#define AARCH64_EH_RETURN_STACKADJ_RTX	\
+  gen_rtx_REG (Pmode, AARCH64_EH_RETURN_STACKADJ_REGNUM)
+#define AARCH64_EH_RETURN_HANDLER_REGNUM	R6_REGNUM
+#define AARCH64_EH_RETURN_HANDLER_RTX	\
+  gen_rtx_REG (Pmode, AARCH64_EH_RETURN_HANDLER_REGNUM)
 
 #undef TARGET_COMPUTE_FRAME_LAYOUT
 #define TARGET_COMPUTE_FRAME_LAYOUT aarch64_layout_frame
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 01cf989641f..0a3474776f0 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -877,6 +877,14 @@  (define_expand "sibcall_epilogue"
   "
 )
 
+(define_expand "eh_return"
+  [(use (match_operand 0 "general_operand"))]
+  ""
+{
+  aarch64_eh_return (operands[0]);
+  DONE;
+})
+
 (define_insn "*do_return"
   [(return)]
   ""