diff mbox series

[v1,1/3] aarch64: store signing key and signing method in DWARF _Unwind_FrameState

Message ID 20240719145404.683815-2-matthieu.longo@arm.com
State New
Headers show
Series store signing key and signing method in DWARF _Unwind_FrameState | expand

Commit Message

Matthieu Longo July 19, 2024, 2:54 p.m. UTC
This patch is only a refactoring of the existing implementation
of PAuth and returned-address signing. The existing behavior is
preserved.

_Unwind_FrameState already contains several CIE and FDE information
(see the attributes below the comment "The information we care
about from the CIE/FDE" in libgcc/unwind-dw2.h).
The patch aims at moving the information from DWARF CIE (signing
key stored in the augmentation string) and FDE (the used signing
method) into _Unwind_FrameState along the already-stored CIE and
FDE information.
Note: those information have to be saved in frame_state_reg_info
instead of _Unwind_FrameState as they need to be savable by
DW_CFA_remember_state and restorable by DW_CFA_restore_state, that
both rely on the attribute "prev".

Those new information in _Unwind_FrameState simplifies the look-up
of the signing key when the return address is demangled. It also
allows future signing methods to be easily added.

_Unwind_FrameState is not a part of the public API of libunwind,
so the change is backward compatible.

A new architecture-specific handler MD_ARCH_EXTENSION_FRAME_INIT
allows to reset values (if needed) in the frame state and unwind
context before changing the frame state to the caller context.

A new architecture-specific handler MD_ARCH_EXTENSION_CIE_AUG_HANDLER
isolates the architecture-specific augmentation strings in AArch64
backend, and allows others architectures to reuse augmentation
strings that would have clashed with AArch64 DWARF extensions.

aarch64_demangle_return_addr, DW_CFA_AARCH64_negate_ra_state and
DW_CFA_val_expression cases in libgcc/unwind-dw2-execute_cfa.h
were documented to clarify where the value of the RA state register
is stored (FS and CONTEXT respectively).

libgcc/ChangeLog:

  * config/aarch64/aarch64-unwind.h
  (AARCH64_DWARF_RA_STATE_MASK): The mask for RA state register.
  (aarch64_RA_signing_method_t): The diversifiers used to sign a
  function's return address.
  (aarch64_pointer_auth_key): The key used to sign a function's
  return address.
  (aarch64_cie_signed_with_b_key): Deleted as the signing key is
  available now in _Unwind_FrameState.
  (MD_ARCH_EXTENSION_CIE_AUG_HANDLER): New CIE augmentation string
  handler for architecture extensions.
  (MD_ARCH_EXTENSION_FRAME_INIT): New architecture-extension
  initialization routine for DWARF frame state and context before
  execution of DWARF instructions.
  (aarch64_context_RA_state_get): Read RA state register from CONTEXT.
  (aarch64_RA_state_get): Read RA state register from FS.
  (aarch64_RA_state_set): Write RA state register into FS.
  (aarch64_RA_state_toggle): Toggle RA state register in FS.
  (aarch64_cie_aug_handler): Handler AArch64 augmentation strings.
  (aarch64_arch_extension_frame_init): Initialize defaults for the
  signing key (PAUTH_KEY_A), and RA state register (RA_no_signing).
  (aarch64_demangle_return_addr): Rely on the frame registers and
  the signing_key attribute in _Unwind_FrameState.
  * unwind-dw2-execute_cfa.h:
  Use the right alias DW_CFA_AARCH64_negate_ra_state for __aarch64__
  instead of DW_CFA_GNU_window_save.
  (DW_CFA_AARCH64_negate_ra_state): Save the signing method in RA
  state register. Toggle RA state register without resetting 'how'
  to REG_UNSAVED.
  * unwind-dw2.c:
  (extract_cie_info): Save the signing key in the current
  _Unwind_FrameState while parsing the augmentation data.
  (uw_frame_state_for): Reset some attributes related to architecture
  extensions in _Unwind_FrameState.
  (uw_update_context): Move authentication code to AArch64 unwinding.
  * unwind-dw2.h (enum register_rule): Give a name to the existing
  enum for the register rules, and replace 'unsigned char' by 'enum
  register_rule' to facilitate debugging in GDB.
  (_Unwind_FrameState): Add a new architecture-extension attribute
  to store the signing key.
---
 libgcc/config/aarch64/aarch64-unwind.h | 154 ++++++++++++++++++++-----
 libgcc/unwind-dw2-execute_cfa.h        |  34 ++++--
 libgcc/unwind-dw2.c                    |  19 ++-
 libgcc/unwind-dw2.h                    |  17 ++-
 4 files changed, 175 insertions(+), 49 deletions(-)

Comments

Matthieu Longo July 29, 2024, 9:56 a.m. UTC | #1
On 2024-07-19 15:54, Matthieu Longo wrote:
> This patch is only a refactoring of the existing implementation
> of PAuth and returned-address signing. The existing behavior is
> preserved.
> 
> _Unwind_FrameState already contains several CIE and FDE information
> (see the attributes below the comment "The information we care
> about from the CIE/FDE" in libgcc/unwind-dw2.h).
> The patch aims at moving the information from DWARF CIE (signing
> key stored in the augmentation string) and FDE (the used signing
> method) into _Unwind_FrameState along the already-stored CIE and
> FDE information.
> Note: those information have to be saved in frame_state_reg_info
> instead of _Unwind_FrameState as they need to be savable by
> DW_CFA_remember_state and restorable by DW_CFA_restore_state, that
> both rely on the attribute "prev".
> 
> Those new information in _Unwind_FrameState simplifies the look-up
> of the signing key when the return address is demangled. It also
> allows future signing methods to be easily added.
> 
> _Unwind_FrameState is not a part of the public API of libunwind,
> so the change is backward compatible.
> 
> A new architecture-specific handler MD_ARCH_EXTENSION_FRAME_INIT
> allows to reset values (if needed) in the frame state and unwind
> context before changing the frame state to the caller context.
> 
> A new architecture-specific handler MD_ARCH_EXTENSION_CIE_AUG_HANDLER
> isolates the architecture-specific augmentation strings in AArch64
> backend, and allows others architectures to reuse augmentation
> strings that would have clashed with AArch64 DWARF extensions.
> 
> aarch64_demangle_return_addr, DW_CFA_AARCH64_negate_ra_state and
> DW_CFA_val_expression cases in libgcc/unwind-dw2-execute_cfa.h
> were documented to clarify where the value of the RA state register
> is stored (FS and CONTEXT respectively).
> 
> libgcc/ChangeLog:
> 
>    * config/aarch64/aarch64-unwind.h
>    (AARCH64_DWARF_RA_STATE_MASK): The mask for RA state register.
>    (aarch64_RA_signing_method_t): The diversifiers used to sign a
>    function's return address.
>    (aarch64_pointer_auth_key): The key used to sign a function's
>    return address.
>    (aarch64_cie_signed_with_b_key): Deleted as the signing key is
>    available now in _Unwind_FrameState.
>    (MD_ARCH_EXTENSION_CIE_AUG_HANDLER): New CIE augmentation string
>    handler for architecture extensions.
>    (MD_ARCH_EXTENSION_FRAME_INIT): New architecture-extension
>    initialization routine for DWARF frame state and context before
>    execution of DWARF instructions.
>    (aarch64_context_RA_state_get): Read RA state register from CONTEXT.
>    (aarch64_RA_state_get): Read RA state register from FS.
>    (aarch64_RA_state_set): Write RA state register into FS.
>    (aarch64_RA_state_toggle): Toggle RA state register in FS.
>    (aarch64_cie_aug_handler): Handler AArch64 augmentation strings.
>    (aarch64_arch_extension_frame_init): Initialize defaults for the
>    signing key (PAUTH_KEY_A), and RA state register (RA_no_signing).
>    (aarch64_demangle_return_addr): Rely on the frame registers and
>    the signing_key attribute in _Unwind_FrameState.
>    * unwind-dw2-execute_cfa.h:
>    Use the right alias DW_CFA_AARCH64_negate_ra_state for __aarch64__
>    instead of DW_CFA_GNU_window_save.
>    (DW_CFA_AARCH64_negate_ra_state): Save the signing method in RA
>    state register. Toggle RA state register without resetting 'how'
>    to REG_UNSAVED.
>    * unwind-dw2.c:
>    (extract_cie_info): Save the signing key in the current
>    _Unwind_FrameState while parsing the augmentation data.
>    (uw_frame_state_for): Reset some attributes related to architecture
>    extensions in _Unwind_FrameState.
>    (uw_update_context): Move authentication code to AArch64 unwinding.
>    * unwind-dw2.h (enum register_rule): Give a name to the existing
>    enum for the register rules, and replace 'unsigned char' by 'enum
>    register_rule' to facilitate debugging in GDB.
>    (_Unwind_FrameState): Add a new architecture-extension attribute
>    to store the signing key.
> ---
>   libgcc/config/aarch64/aarch64-unwind.h | 154 ++++++++++++++++++++-----
>   libgcc/unwind-dw2-execute_cfa.h        |  34 ++++--
>   libgcc/unwind-dw2.c                    |  19 ++-
>   libgcc/unwind-dw2.h                    |  17 ++-
>   4 files changed, 175 insertions(+), 49 deletions(-)
> 
> diff --git a/libgcc/config/aarch64/aarch64-unwind.h b/libgcc/config/aarch64/aarch64-unwind.h
> index daf96624b5e..cc225a7e207 100644
> --- a/libgcc/config/aarch64/aarch64-unwind.h
> +++ b/libgcc/config/aarch64/aarch64-unwind.h
> @@ -25,55 +25,155 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>   #if !defined (AARCH64_UNWIND_H) && !defined (__ILP32__)
>   #define AARCH64_UNWIND_H
>   
> -#define DWARF_REGNUM_AARCH64_RA_STATE 34
> +#include "ansidecl.h"
> +#include <stdbool.h>
> +
> +#define AARCH64_DWARF_REGNUM_RA_STATE 34
> +#define AARCH64_DWARF_RA_STATE_MASK   0x1
> +
> +/* The diversifiers used to sign a function's return address. */
> +typedef enum
> +{
> +  AARCH64_RA_no_signing = 0x0,
> +  AARCH64_RA_signing_SP = 0x1,
> +} __attribute__((packed)) aarch64_RA_signing_method_t;
> +
> +/* The key used to sign a function's return address. */
> +typedef enum {
> +  AARCH64_PAUTH_KEY_A,
> +  AARCH64_PAUTH_KEY_B,
> +} __attribute__((packed)) aarch64_pointer_auth_key;
> +
> +#define MD_ARCH_EXTENSION_CIE_AUG_HANDLER(fs, aug) \
> +  aarch64_cie_aug_handler(fs, aug)
> +
> +#define MD_ARCH_EXTENSION_FRAME_INIT(context, fs) \
> +  aarch64_arch_extension_frame_init(context, fs)
>   
>   #define MD_DEMANGLE_RETURN_ADDR(context, fs, addr) \
>     aarch64_demangle_return_addr (context, fs, addr)
>   
> -static inline int
> -aarch64_cie_signed_with_b_key (struct _Unwind_Context *context)
> +static inline aarch64_RA_signing_method_t
> +aarch64_context_RA_state_get(struct _Unwind_Context *context)
> +{
> +  const int index = AARCH64_DWARF_REGNUM_RA_STATE;
> +  return _Unwind_GetGR (context, index) & AARCH64_DWARF_RA_STATE_MASK;
> +}
> +
> +static inline aarch64_RA_signing_method_t
> +aarch64_fs_RA_state_get(_Unwind_FrameState const* fs)
>   {
> -  const struct dwarf_fde *fde = _Unwind_Find_FDE (context->bases.func,
> -						  &context->bases);
> -  if (fde != NULL)
> +  const int index = AARCH64_DWARF_REGNUM_RA_STATE;
> +  return fs->regs.reg[index].loc.offset & AARCH64_DWARF_RA_STATE_MASK;
> +}
> +
> +static inline void
> +aarch64_fs_RA_state_set(_Unwind_FrameState *fs,
> +			aarch64_RA_signing_method_t signing_method)
> +{
> +  fs->regs.reg[AARCH64_DWARF_REGNUM_RA_STATE].loc.offset = signing_method;
> +}
> +
> +static inline void
> +aarch64_fs_RA_state_toggle(_Unwind_FrameState *fs)
> +{
> +  aarch64_RA_signing_method_t signing_method = aarch64_fs_RA_state_get(fs);
> +  gcc_assert (signing_method == AARCH64_RA_no_signing ||
> +	      signing_method == AARCH64_RA_signing_SP);
> +  aarch64_fs_RA_state_set(fs, (signing_method == AARCH64_RA_no_signing)
> +			  ? AARCH64_RA_signing_SP
> +			  : AARCH64_RA_no_signing);
> +}
> +
> +/* CIE handler for custom augmentation string.  */
> +static inline bool
> +aarch64_cie_aug_handler(_Unwind_FrameState *fs, unsigned char aug)
> +{
> +  // AArch64 B-key pointer authentication.
> +  if (aug == 'B')
>       {
> -      const struct dwarf_cie *cie = get_cie (fde);
> -      if (cie != NULL)
> -	{
> -	  const unsigned char *aug_str = cie->augmentation;
> -	  return __builtin_strchr ((const char *) aug_str,
> -				   'B') == NULL ? 0 : 1;
> -	}
> +      fs->regs.signing_key = AARCH64_PAUTH_KEY_B;
> +      return true;
>       }
> -  return 0;
> +  return false;
>   }
>   
> -/* Do AArch64 private extraction on ADDR_WORD based on context info CONTEXT and
> -   unwind frame info FS.  If ADDR_WORD is signed, we do address authentication
> -   on it using CFA of current frame.  */
> +/* At the entrance of a new frame, some cached information from the CIE/FDE,
> + * and registers values related to architectural extensions require a default
> + * initialization.
> + * If any of those values related to architecture extensions had to be saved
> + * for the next frame, it should be done via the architecture extensions handler
> + * MD_FROB_UPDATE_CONTEXT in uw_update_context_1 (libgcc/unwind-dw2.c).  */
> +static inline void
> +aarch64_arch_extension_frame_init (struct _Unwind_Context *context ATTRIBUTE_UNUSED,
> +				   _Unwind_FrameState *fs)
> +{
> +  // Reset previously cached CIE/FDE information.
> +  fs->regs.signing_key = AARCH64_PAUTH_KEY_A;
> +
> +  // Reset some registers.
> +  // Note: the associated fs->how table is automatically reset to REG_UNSAVED in
> +  // uw_frame_state_for (libgcc/unwind-dw2.c). REG_UNSAVED means that whatever
> +  // was in the previous context is the current register value. In other words,
> +  // the next context inherits by default the previous value for a register.
> +  // To keep this logic coherent with some architecture extensions, we need to
> +  // reinitialize fs->how for some registers to REG_ARCHEXT.
> +  const int reg = AARCH64_DWARF_REGNUM_RA_STATE;
> +  fs->regs.how[reg] = REG_ARCHEXT;
> +  aarch64_fs_RA_state_set(fs, AARCH64_RA_no_signing);
> +}
>   
> +/* Do AArch64 private extraction on ADDR_WORD based on context info CONTEXT and
> +   unwind frame info FS. If ADDR_WORD is signed, we do address authentication
> +   on it using CFA of current frame.
> +   Note: this function is called after uw_update_context_1 in uw_update_context.
> +   uw_update_context_1 is the function in which val expression are computed. Thus
> +   if DW_CFA_val_expression is used, the value of the RA state register is stored
> +   in CONTEXT, not FS. (more details about when DW_CFA_val_expression is used in
> +   the next comment below)  */
>   static inline void *
>   aarch64_demangle_return_addr (struct _Unwind_Context *context,
>   			      _Unwind_FrameState *fs,
>   			      _Unwind_Word addr_word)
>   {
>     void *addr = (void *)addr_word;
> -  const int reg = DWARF_REGNUM_AARCH64_RA_STATE;
> -
> -  if (fs->regs.how[reg] == REG_UNSAVED)
> -    return addr;
> +  const int reg = AARCH64_DWARF_REGNUM_RA_STATE;
> +
> +  // In libgcc, REG_ARCHEXT means that the RA state register was set by an AArch64
> +  // DWARF instruction and contains a valid value.
> +  // Return-address signing state is normally toggled by DW_CFA_AARCH64_negate_ra_state
> +  // (also knwon by its alias as DW_CFA_GNU_window_save).
> +  // However, RA state register can be set directly via DW_CFA_val_expression
> +  // too. GCC does not generate such CFI but some other compilers reportedly do.
> +  // (see PR104689 for more details).
> +  // Any other value than REG_ARCHEXT should be interpreted as if the RA state register
> +  // is set by another DWARF instruction, and the value is fetchable via _Unwind_GetGR.
> +  aarch64_RA_signing_method_t signing_method = AARCH64_RA_no_signing;
> +  if (fs->regs.how[reg] == REG_ARCHEXT)
> +    {
> +      signing_method = aarch64_fs_RA_state_get(fs);
> +    }
> +  else if (fs->regs.how[reg] != REG_UNSAVED)
> +    {
> +      signing_method = aarch64_context_RA_state_get(context);
> +
> +      // If the value was set in context, CONTEXT->by_value was set to 1.
> +      // uw_install_context_1 contains an assert on by_value, to check that all
> +      // registers values have been resolved before installing the context.
> +      // This will make the unwinding crash if we don't reset CONTEXT->by_value,
> +      // and CONTEXT->reg[reg].
> +      context->reg[reg] = _Unwind_Get_Unwind_Context_Reg_Val(0x0);
> +      context->by_value[reg] = 0;
> +    }
>   
> -  /* Return-address signing state is toggled by DW_CFA_GNU_window_save (where
> -     REG_UNSAVED/REG_UNSAVED_ARCHEXT means RA signing is disabled/enabled),
> -     or set by a DW_CFA_expression.  */
> -  if (fs->regs.how[reg] == REG_UNSAVED_ARCHEXT
> -      || (_Unwind_GetGR (context, reg) & 0x1) != 0)
> +  if (signing_method == AARCH64_RA_signing_SP)
>       {
>         _Unwind_Word salt = (_Unwind_Word) context->cfa;
> -      if (aarch64_cie_signed_with_b_key (context) != 0)
> +      if (fs->regs.signing_key == AARCH64_PAUTH_KEY_B)
>   	return __builtin_aarch64_autib1716 (addr, salt);
>         return __builtin_aarch64_autia1716 (addr, salt);
>       }
> +  // else {} // signing_method == AARCH64_RA_no_signing, nothing to do.
>   
>     return addr;
>   }
> diff --git a/libgcc/unwind-dw2-execute_cfa.h b/libgcc/unwind-dw2-execute_cfa.h
> index a6b249f9ad6..2ea78c4ef8d 100644
> --- a/libgcc/unwind-dw2-execute_cfa.h
> +++ b/libgcc/unwind-dw2-execute_cfa.h
> @@ -271,23 +271,33 @@
>   	      fs->regs.how[reg] = REG_SAVED_VAL_EXP;
>   	      fs->regs.reg[reg].loc.exp = insn_ptr;
>   	    }
> +	  // Don't execute the expression, but jump over it by adding
> +	  // DW_FORM_block's size to insn_ptr.
>   	  insn_ptr = read_uleb128 (insn_ptr, &utmp);
>   	  insn_ptr += utmp;
>   	  break;
>   
> -	case DW_CFA_GNU_window_save:
>   #if defined (__aarch64__) && !defined (__ILP32__)
> -	  /* This CFA is multiplexed with Sparc.  On AArch64 it's used to toggle
> -	     return address signing status.  REG_UNSAVED/REG_UNSAVED_ARCHEXT
> -	     mean RA signing is disabled/enabled.  */
> -	  reg = DWARF_REGNUM_AARCH64_RA_STATE;
> -	  gcc_assert (fs->regs.how[reg] == REG_UNSAVED
> -		      || fs->regs.how[reg] == REG_UNSAVED_ARCHEXT);
> -	  if (fs->regs.how[reg] == REG_UNSAVED)
> -	    fs->regs.how[reg] = REG_UNSAVED_ARCHEXT;
> -	  else
> -	    fs->regs.how[reg] = REG_UNSAVED;
> +	case DW_CFA_AARCH64_negate_ra_state:
> +	  // This CFA is multiplexed with Sparc.
> +	  // On AArch64 it's used to toggle the status of return address signing
> +	  // with SP as a diversifier.
> +	  // - REG_ARCHEXT means that the RA state register in FS contains a
> +	  // valid value, and that no other DWARF directive has changed the value
> +	  // of this register.
> +	  // - any other value is not compatible with negating the RA state.
> +	  reg = AARCH64_DWARF_REGNUM_RA_STATE;
> +
> +	  // /!\ Mixing DW_CFA_val_expression with DW_CFA_AARCH64_negate_ra_state
> +	  // will result in an undefined behavior (likely an unwinding failure),
> +	  // as the chronology of the DWARF directives will be broken.
> +	  gcc_assert (fs->regs.how[reg] == REG_ARCHEXT);
> +
> +	  // Toggle current state
> +	  aarch64_fs_RA_state_toggle(fs);
> +	  break;
>   #else
> +	case DW_CFA_GNU_window_save:
>   	  /* ??? Hardcoded for SPARC register window configuration.  */
>   	  if (__LIBGCC_DWARF_FRAME_REGISTERS__ >= 32)
>   	    for (reg = 16; reg < 32; ++reg)
> @@ -295,8 +305,8 @@
>   		fs->regs.how[reg] = REG_SAVED_OFFSET;
>   		fs->regs.reg[reg].loc.offset = (reg - 16) * sizeof (void *);
>   	      }
> -#endif
>   	  break;
> +#endif
>   
>   	case DW_CFA_GNU_args_size:
>   	  insn_ptr = read_uleb128 (insn_ptr, &utmp);
> diff --git a/libgcc/unwind-dw2.c b/libgcc/unwind-dw2.c
> index 0849e89cd34..fb60853825d 100644
> --- a/libgcc/unwind-dw2.c
> +++ b/libgcc/unwind-dw2.c
> @@ -501,11 +501,11 @@ extract_cie_info (const struct dwarf_cie *cie, struct _Unwind_Context *context,
>   	  fs->signal_frame = 1;
>   	  aug += 1;
>   	}
> -      /* aarch64 B-key pointer authentication.  */
> -      else if (aug[0] == 'B')
> -	{
> -	  aug += 1;
> -      }
> +
> +#if defined(MD_ARCH_EXTENSION_CIE_AUG_HANDLER)
> +      else if (MD_ARCH_EXTENSION_CIE_AUG_HANDLER(fs, aug[0]))
> +	aug += 1;
> +#endif
>   
>         /* Otherwise we have an unknown augmentation string.
>   	 Bail unless we saw a 'z' prefix.  */
> @@ -996,6 +996,9 @@ uw_frame_state_for (struct _Unwind_Context *context, _Unwind_FrameState *fs)
>   
>     memset (&fs->regs.how[0], 0,
>   	  sizeof (*fs) - offsetof (_Unwind_FrameState, regs.how[0]));
> +#if defined(MD_ARCH_EXTENSION_FRAME_INIT)
> +  MD_ARCH_EXTENSION_FRAME_INIT(context, fs);
> +#endif
>     context->args_size = 0;
>     context->lsda = 0;
>   
> @@ -1197,7 +1200,11 @@ uw_update_context_1 (struct _Unwind_Context *context, _Unwind_FrameState *fs)
>         {
>         case REG_UNSAVED:
>         case REG_UNDEFINED:
> -      case REG_UNSAVED_ARCHEXT:
> +      case REG_ARCHEXT: // If the value depends on an augmenter, then there is
> +			// no processing to do here, and the value computation
> +			// should be delayed until the architecture handler
> +			// computes the value correctly based on the augmenter
> +			// information.
>   	break;
>   
>         case REG_SAVED_OFFSET:
> diff --git a/libgcc/unwind-dw2.h b/libgcc/unwind-dw2.h
> index 0dd8611f697..b75f4c65f98 100644
> --- a/libgcc/unwind-dw2.h
> +++ b/libgcc/unwind-dw2.h
> @@ -22,16 +22,17 @@
>      see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>      <http://www.gnu.org/licenses/>.  */
>   
> -enum {
> +enum register_rule
> +{
>     REG_UNSAVED,
>     REG_SAVED_OFFSET,
>     REG_SAVED_REG,
>     REG_SAVED_EXP,
>     REG_SAVED_VAL_OFFSET,
>     REG_SAVED_VAL_EXP,
> -  REG_UNSAVED_ARCHEXT,		/* Target specific extension.  */
> +  REG_ARCHEXT,		/* Target specific extension.  */
>     REG_UNDEFINED
> -};
> +} __attribute__((packed));
>   
>   /* The result of interpreting the frame unwind info for a frame.
>      This is all symbolic at this point, as none of the values can
> @@ -49,7 +50,7 @@ typedef struct
>   	const unsigned char *exp;
>         } loc;
>       } reg[__LIBGCC_DWARF_FRAME_REGISTERS__+1];
> -    unsigned char how[__LIBGCC_DWARF_FRAME_REGISTERS__+1];
> +    enum register_rule how[__LIBGCC_DWARF_FRAME_REGISTERS__+1];
>   
>       enum {
>         CFA_UNSET,
> @@ -65,6 +66,14 @@ typedef struct
>       _Unwind_Sword cfa_offset;
>       _Unwind_Word cfa_reg;
>       const unsigned char *cfa_exp;
> +
> +    /* Architecture extensions information from CIE/FDE.
> +     * Note: those information have to be saved in struct frame_state_reg_info
> +     * instead of _Unwind_FrameState as DW_CFA_restore_state has to be able to
> +     * restore them.  */
> +#if defined(__aarch64__) && !defined (__ILP32__)
> +    unsigned char signing_key;
> +#endif
>     } regs;
>   
>     /* The PC described by the current frame state.  */

Ping.
Richard Sandiford Aug. 6, 2024, 10:06 a.m. UTC | #2
Sorry for the slow review.

Matthieu Longo <matthieu.longo@arm.com> writes:
> This patch is only a refactoring of the existing implementation
> of PAuth and returned-address signing. The existing behavior is
> preserved.
>
> _Unwind_FrameState already contains several CIE and FDE information
> (see the attributes below the comment "The information we care
> about from the CIE/FDE" in libgcc/unwind-dw2.h).
> The patch aims at moving the information from DWARF CIE (signing
> key stored in the augmentation string) and FDE (the used signing
> method) into _Unwind_FrameState along the already-stored CIE and
> FDE information.
> Note: those information have to be saved in frame_state_reg_info
> instead of _Unwind_FrameState as they need to be savable by
> DW_CFA_remember_state and restorable by DW_CFA_restore_state, that
> both rely on the attribute "prev".
>
> Those new information in _Unwind_FrameState simplifies the look-up
> of the signing key when the return address is demangled. It also
> allows future signing methods to be easily added.
>
> _Unwind_FrameState is not a part of the public API of libunwind,
> so the change is backward compatible.
>
> A new architecture-specific handler MD_ARCH_EXTENSION_FRAME_INIT
> allows to reset values (if needed) in the frame state and unwind
> context before changing the frame state to the caller context.
>
> A new architecture-specific handler MD_ARCH_EXTENSION_CIE_AUG_HANDLER
> isolates the architecture-specific augmentation strings in AArch64
> backend, and allows others architectures to reuse augmentation
> strings that would have clashed with AArch64 DWARF extensions.
>
> aarch64_demangle_return_addr, DW_CFA_AARCH64_negate_ra_state and
> DW_CFA_val_expression cases in libgcc/unwind-dw2-execute_cfa.h
> were documented to clarify where the value of the RA state register
> is stored (FS and CONTEXT respectively).

The abstraction generally looks good.  My main comment is that,
if we are putting the new behaviour behind target macros (which I
agree is a good idea), could we also have a target macro to abstract:

> +#if defined(__aarch64__) && !defined (__ILP32__)
> +    unsigned char signing_key;
> +#endif

?  E.g. something like:

#ifdef MD_CFI_STATE
    MD_CFI_STATE md;
#endif

with aarch64 defining:

#define MD_CFI_STATE struct { unsigned char signing_key; }

(Names and organisation are just suggestions.)

It might be good to try running the patch through

  contrig/check_GNU_style.py

since the GCC coding standards are quite picky about formatting and
stylistic issues.  The main ones I spotted were:

- In files that already use block comments, all comments should be
  block comments rather than // comments.  The comments should end
  with ".  " (full stop and two spaces).

- Block comments are formatted as:

    /* Line 1
       Line 2.  */

  rather than as:

    /* Line 1
     * Line 2.  */

- Function names are generally all lowrecase (e.g. "ra" rather than "RA").

- In function calls, there should be a space between the function and
  the opening "("

- For pointer types, there should be a space before a "*" (or string of
  "*"s), but no space afterwards.

- "const" qualifiers generally go before the type that they qualify,
  rather than afterwards.

- The line width should be 80 characters or fewer.  (The patch was pretty
  good about this, but there were a couple of long lines).

- In multi-line conditions, the ||s and &&s go at the beginning of lines,
  rather than at the end.  (Same for infix operators in general.)

More detailed comments below.

>
> libgcc/ChangeLog:
>
>   * config/aarch64/aarch64-unwind.h
>   (AARCH64_DWARF_RA_STATE_MASK): The mask for RA state register.
>   (aarch64_RA_signing_method_t): The diversifiers used to sign a
>   function's return address.
>   (aarch64_pointer_auth_key): The key used to sign a function's
>   return address.
>   (aarch64_cie_signed_with_b_key): Deleted as the signing key is
>   available now in _Unwind_FrameState.
>   (MD_ARCH_EXTENSION_CIE_AUG_HANDLER): New CIE augmentation string
>   handler for architecture extensions.
>   (MD_ARCH_EXTENSION_FRAME_INIT): New architecture-extension
>   initialization routine for DWARF frame state and context before
>   execution of DWARF instructions.
>   (aarch64_context_RA_state_get): Read RA state register from CONTEXT.
>   (aarch64_RA_state_get): Read RA state register from FS.
>   (aarch64_RA_state_set): Write RA state register into FS.
>   (aarch64_RA_state_toggle): Toggle RA state register in FS.
>   (aarch64_cie_aug_handler): Handler AArch64 augmentation strings.
>   (aarch64_arch_extension_frame_init): Initialize defaults for the
>   signing key (PAUTH_KEY_A), and RA state register (RA_no_signing).
>   (aarch64_demangle_return_addr): Rely on the frame registers and
>   the signing_key attribute in _Unwind_FrameState.
>   * unwind-dw2-execute_cfa.h:
>   Use the right alias DW_CFA_AARCH64_negate_ra_state for __aarch64__
>   instead of DW_CFA_GNU_window_save.
>   (DW_CFA_AARCH64_negate_ra_state): Save the signing method in RA
>   state register. Toggle RA state register without resetting 'how'
>   to REG_UNSAVED.
>   * unwind-dw2.c:
>   (extract_cie_info): Save the signing key in the current
>   _Unwind_FrameState while parsing the augmentation data.
>   (uw_frame_state_for): Reset some attributes related to architecture
>   extensions in _Unwind_FrameState.
>   (uw_update_context): Move authentication code to AArch64 unwinding.
>   * unwind-dw2.h (enum register_rule): Give a name to the existing
>   enum for the register rules, and replace 'unsigned char' by 'enum
>   register_rule' to facilitate debugging in GDB.
>   (_Unwind_FrameState): Add a new architecture-extension attribute
>   to store the signing key.
> ---
>  libgcc/config/aarch64/aarch64-unwind.h | 154 ++++++++++++++++++++-----
>  libgcc/unwind-dw2-execute_cfa.h        |  34 ++++--
>  libgcc/unwind-dw2.c                    |  19 ++-
>  libgcc/unwind-dw2.h                    |  17 ++-
>  4 files changed, 175 insertions(+), 49 deletions(-)
>
> diff --git a/libgcc/config/aarch64/aarch64-unwind.h b/libgcc/config/aarch64/aarch64-unwind.h
> index daf96624b5e..cc225a7e207 100644
> --- a/libgcc/config/aarch64/aarch64-unwind.h
> +++ b/libgcc/config/aarch64/aarch64-unwind.h
> @@ -25,55 +25,155 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>  #if !defined (AARCH64_UNWIND_H) && !defined (__ILP32__)
>  #define AARCH64_UNWIND_H
>  
> -#define DWARF_REGNUM_AARCH64_RA_STATE 34
> +#include "ansidecl.h"
> +#include <stdbool.h>
> +
> +#define AARCH64_DWARF_REGNUM_RA_STATE 34
> +#define AARCH64_DWARF_RA_STATE_MASK   0x1
> +
> +/* The diversifiers used to sign a function's return address. */
> +typedef enum
> +{
> +  AARCH64_RA_no_signing = 0x0,
> +  AARCH64_RA_signing_SP = 0x1,
> +} __attribute__((packed)) aarch64_RA_signing_method_t;
> +
> +/* The key used to sign a function's return address. */
> +typedef enum {
> +  AARCH64_PAUTH_KEY_A,
> +  AARCH64_PAUTH_KEY_B,
> +} __attribute__((packed)) aarch64_pointer_auth_key;
> +
> +#define MD_ARCH_EXTENSION_CIE_AUG_HANDLER(fs, aug) \
> +  aarch64_cie_aug_handler(fs, aug)
> +
> +#define MD_ARCH_EXTENSION_FRAME_INIT(context, fs) \
> +  aarch64_arch_extension_frame_init(context, fs)
>  
>  #define MD_DEMANGLE_RETURN_ADDR(context, fs, addr) \
>    aarch64_demangle_return_addr (context, fs, addr)
>  
> -static inline int
> -aarch64_cie_signed_with_b_key (struct _Unwind_Context *context)
> +static inline aarch64_RA_signing_method_t
> +aarch64_context_RA_state_get(struct _Unwind_Context *context)
> +{
> +  const int index = AARCH64_DWARF_REGNUM_RA_STATE;
> +  return _Unwind_GetGR (context, index) & AARCH64_DWARF_RA_STATE_MASK;
> +}
> +
> +static inline aarch64_RA_signing_method_t
> +aarch64_fs_RA_state_get(_Unwind_FrameState const* fs)
>  {
> -  const struct dwarf_fde *fde = _Unwind_Find_FDE (context->bases.func,
> -						  &context->bases);
> -  if (fde != NULL)
> +  const int index = AARCH64_DWARF_REGNUM_RA_STATE;
> +  return fs->regs.reg[index].loc.offset & AARCH64_DWARF_RA_STATE_MASK;
> +}
> +
> +static inline void
> +aarch64_fs_RA_state_set(_Unwind_FrameState *fs,
> +			aarch64_RA_signing_method_t signing_method)
> +{
> +  fs->regs.reg[AARCH64_DWARF_REGNUM_RA_STATE].loc.offset = signing_method;
> +}
> +
> +static inline void
> +aarch64_fs_RA_state_toggle(_Unwind_FrameState *fs)
> +{
> +  aarch64_RA_signing_method_t signing_method = aarch64_fs_RA_state_get(fs);
> +  gcc_assert (signing_method == AARCH64_RA_no_signing ||
> +	      signing_method == AARCH64_RA_signing_SP);
> +  aarch64_fs_RA_state_set(fs, (signing_method == AARCH64_RA_no_signing)
> +			  ? AARCH64_RA_signing_SP
> +			  : AARCH64_RA_no_signing);
> +}
> +
> +/* CIE handler for custom augmentation string.  */
> +static inline bool
> +aarch64_cie_aug_handler(_Unwind_FrameState *fs, unsigned char aug)
> +{
> +  // AArch64 B-key pointer authentication.
> +  if (aug == 'B')
>      {
> -      const struct dwarf_cie *cie = get_cie (fde);
> -      if (cie != NULL)
> -	{
> -	  const unsigned char *aug_str = cie->augmentation;
> -	  return __builtin_strchr ((const char *) aug_str,
> -				   'B') == NULL ? 0 : 1;
> -	}
> +      fs->regs.signing_key = AARCH64_PAUTH_KEY_B;
> +      return true;
>      }
> -  return 0;
> +  return false;
>  }
>  
> -/* Do AArch64 private extraction on ADDR_WORD based on context info CONTEXT and
> -   unwind frame info FS.  If ADDR_WORD is signed, we do address authentication
> -   on it using CFA of current frame.  */
> +/* At the entrance of a new frame, some cached information from the CIE/FDE,
> + * and registers values related to architectural extensions require a default
> + * initialization.
> + * If any of those values related to architecture extensions had to be saved
> + * for the next frame, it should be done via the architecture extensions handler
> + * MD_FROB_UPDATE_CONTEXT in uw_update_context_1 (libgcc/unwind-dw2.c).  */
> +static inline void
> +aarch64_arch_extension_frame_init (struct _Unwind_Context *context ATTRIBUTE_UNUSED,
> +				   _Unwind_FrameState *fs)
> +{
> +  // Reset previously cached CIE/FDE information.
> +  fs->regs.signing_key = AARCH64_PAUTH_KEY_A;

How about making the comment:

  /* By default, DW_CFA_AARCH64_negate_ra_state assumes key A is being used
     for signing.  This can be overridden by adding 'B' to the augmentation
     string.  */

(if you agree that's a reasonable summary).

> +
> +  // Reset some registers.
> +  // Note: the associated fs->how table is automatically reset to REG_UNSAVED in
> +  // uw_frame_state_for (libgcc/unwind-dw2.c). REG_UNSAVED means that whatever
> +  // was in the previous context is the current register value. In other words,
> +  // the next context inherits by default the previous value for a register.
> +  // To keep this logic coherent with some architecture extensions, we need to
> +  // reinitialize fs->how for some registers to REG_ARCHEXT.

How about being a bit more specific about the intent:

  /* All registers are initially in state REG_UNSAVED, which indicates that
     they inherit register values from the previous frame.  However, the
     return address starts every frame in the "unsigned" state.  It also
     starts every frame in a state that supports the original toggle-based
     DW_CFA_AARCH64_negate_ra_state method of controlling RA signing.  */

> +  const int reg = AARCH64_DWARF_REGNUM_RA_STATE;
> +  fs->regs.how[reg] = REG_ARCHEXT;

Very minor, but I think this would be easier to read without the temporary.

> +  aarch64_fs_RA_state_set(fs, AARCH64_RA_no_signing);
> +}
>  
> +/* Do AArch64 private extraction on ADDR_WORD based on context info CONTEXT and
> +   unwind frame info FS. If ADDR_WORD is signed, we do address authentication
> +   on it using CFA of current frame.
> +   Note: this function is called after uw_update_context_1 in uw_update_context.
> +   uw_update_context_1 is the function in which val expression are computed. Thus

expressions

> +   if DW_CFA_val_expression is used, the value of the RA state register is stored
> +   in CONTEXT, not FS. (more details about when DW_CFA_val_expression is used in
> +   the next comment below)  */
>  static inline void *
>  aarch64_demangle_return_addr (struct _Unwind_Context *context,
>  			      _Unwind_FrameState *fs,
>  			      _Unwind_Word addr_word)
>  {
>    void *addr = (void *)addr_word;
> -  const int reg = DWARF_REGNUM_AARCH64_RA_STATE;
> -
> -  if (fs->regs.how[reg] == REG_UNSAVED)
> -    return addr;
> +  const int reg = AARCH64_DWARF_REGNUM_RA_STATE;
> +
> +  // In libgcc, REG_ARCHEXT means that the RA state register was set by an AArch64
> +  // DWARF instruction and contains a valid value.

It's also used to describe the initial state.

> +  // Return-address signing state is normally toggled by DW_CFA_AARCH64_negate_ra_state
> +  // (also knwon by its alias as DW_CFA_GNU_window_save).
> +  // However, RA state register can be set directly via DW_CFA_val_expression
> +  // too. GCC does not generate such CFI but some other compilers reportedly do.

nit: drop the trailing full stop, since the following line continues
the sentence.

> +  // (see PR104689 for more details).
> +  // Any other value than REG_ARCHEXT should be interpreted as if the RA state register
> +  // is set by another DWARF instruction, and the value is fetchable via _Unwind_GetGR.
> +  aarch64_RA_signing_method_t signing_method = AARCH64_RA_no_signing;
> +  if (fs->regs.how[reg] == REG_ARCHEXT)
> +    {
> +      signing_method = aarch64_fs_RA_state_get(fs);
> +    }
> +  else if (fs->regs.how[reg] != REG_UNSAVED)
> +    {
> +      signing_method = aarch64_context_RA_state_get(context);
> +
> +      // If the value was set in context, CONTEXT->by_value was set to 1.
> +      // uw_install_context_1 contains an assert on by_value, to check that all
> +      // registers values have been resolved before installing the context.
> +      // This will make the unwinding crash if we don't reset CONTEXT->by_value,
> +      // and CONTEXT->reg[reg].
> +      context->reg[reg] = _Unwind_Get_Unwind_Context_Reg_Val(0x0);
> +      context->by_value[reg] = 0;

Is this change to the context necessary for the refactor, or is it
instead needed for the follow-on patches?  If it's part of the refactor,
could you explain why it's needed now but wasn't before?

This is the only part that didn't look like a no-op change.

> +    }
>  
> -  /* Return-address signing state is toggled by DW_CFA_GNU_window_save (where
> -     REG_UNSAVED/REG_UNSAVED_ARCHEXT means RA signing is disabled/enabled),
> -     or set by a DW_CFA_expression.  */
> -  if (fs->regs.how[reg] == REG_UNSAVED_ARCHEXT
> -      || (_Unwind_GetGR (context, reg) & 0x1) != 0)
> +  if (signing_method == AARCH64_RA_signing_SP)
>      {
>        _Unwind_Word salt = (_Unwind_Word) context->cfa;
> -      if (aarch64_cie_signed_with_b_key (context) != 0)
> +      if (fs->regs.signing_key == AARCH64_PAUTH_KEY_B)
>  	return __builtin_aarch64_autib1716 (addr, salt);
>        return __builtin_aarch64_autia1716 (addr, salt);
>      }
> +  // else {} // signing_method == AARCH64_RA_no_signing, nothing to do.

IMO it'd be clearer without this line.  Alternatively, we could make it:

  else if (signing_method == AARCH64_RA_no_signing)
    return addr;

  gcc_unreachable ();

>  }
> diff --git a/libgcc/unwind-dw2-execute_cfa.h b/libgcc/unwind-dw2-execute_cfa.h
> index a6b249f9ad6..2ea78c4ef8d 100644
> --- a/libgcc/unwind-dw2-execute_cfa.h
> +++ b/libgcc/unwind-dw2-execute_cfa.h
> @@ -271,23 +271,33 @@
>  	      fs->regs.how[reg] = REG_SAVED_VAL_EXP;
>  	      fs->regs.reg[reg].loc.exp = insn_ptr;
>  	    }
> +	  // Don't execute the expression, but jump over it by adding
> +	  // DW_FORM_block's size to insn_ptr.
>  	  insn_ptr = read_uleb128 (insn_ptr, &utmp);
>  	  insn_ptr += utmp;
>  	  break;
>  
> -	case DW_CFA_GNU_window_save:
>  #if defined (__aarch64__) && !defined (__ILP32__)
> -	  /* This CFA is multiplexed with Sparc.  On AArch64 it's used to toggle
> -	     return address signing status.  REG_UNSAVED/REG_UNSAVED_ARCHEXT
> -	     mean RA signing is disabled/enabled.  */
> -	  reg = DWARF_REGNUM_AARCH64_RA_STATE;
> -	  gcc_assert (fs->regs.how[reg] == REG_UNSAVED
> -		      || fs->regs.how[reg] == REG_UNSAVED_ARCHEXT);
> -	  if (fs->regs.how[reg] == REG_UNSAVED)
> -	    fs->regs.how[reg] = REG_UNSAVED_ARCHEXT;
> -	  else
> -	    fs->regs.how[reg] = REG_UNSAVED;
> +	case DW_CFA_AARCH64_negate_ra_state:
> +	  // This CFA is multiplexed with Sparc.
> +	  // On AArch64 it's used to toggle the status of return address signing
> +	  // with SP as a diversifier.
> +	  // - REG_ARCHEXT means that the RA state register in FS contains a
> +	  // valid value, and that no other DWARF directive has changed the value
> +	  // of this register.
> +	  // - any other value is not compatible with negating the RA state.
> +	  reg = AARCH64_DWARF_REGNUM_RA_STATE;
> +
> +	  // /!\ Mixing DW_CFA_val_expression with DW_CFA_AARCH64_negate_ra_state
> +	  // will result in an undefined behavior (likely an unwinding failure),
> +	  // as the chronology of the DWARF directives will be broken.
> +	  gcc_assert (fs->regs.how[reg] == REG_ARCHEXT);

Could we move the assert into aarch64_fs_RA_state_toggle, or do later
patches not want that?

> +
> +	  // Toggle current state
> +	  aarch64_fs_RA_state_toggle(fs);
> +	  break;
>  #else
> +	case DW_CFA_GNU_window_save:
>  	  /* ??? Hardcoded for SPARC register window configuration.  */
>  	  if (__LIBGCC_DWARF_FRAME_REGISTERS__ >= 32)
>  	    for (reg = 16; reg < 32; ++reg)
> @@ -295,8 +305,8 @@
>  		fs->regs.how[reg] = REG_SAVED_OFFSET;
>  		fs->regs.reg[reg].loc.offset = (reg - 16) * sizeof (void *);
>  	      }
> -#endif
>  	  break;
> +#endif
>  
>  	case DW_CFA_GNU_args_size:
>  	  insn_ptr = read_uleb128 (insn_ptr, &utmp);
> diff --git a/libgcc/unwind-dw2.c b/libgcc/unwind-dw2.c
> index 0849e89cd34..fb60853825d 100644
> --- a/libgcc/unwind-dw2.c
> +++ b/libgcc/unwind-dw2.c
> @@ -501,11 +501,11 @@ extract_cie_info (const struct dwarf_cie *cie, struct _Unwind_Context *context,
>  	  fs->signal_frame = 1;
>  	  aug += 1;
>  	}
> -      /* aarch64 B-key pointer authentication.  */
> -      else if (aug[0] == 'B')
> -	{
> -	  aug += 1;
> -      }
> +
> +#if defined(MD_ARCH_EXTENSION_CIE_AUG_HANDLER)
> +      else if (MD_ARCH_EXTENSION_CIE_AUG_HANDLER(fs, aug[0]))
> +	aug += 1;
> +#endif
>  
>        /* Otherwise we have an unknown augmentation string.
>  	 Bail unless we saw a 'z' prefix.  */
> @@ -996,6 +996,9 @@ uw_frame_state_for (struct _Unwind_Context *context, _Unwind_FrameState *fs)
>  
>    memset (&fs->regs.how[0], 0,
>  	  sizeof (*fs) - offsetof (_Unwind_FrameState, regs.how[0]));
> +#if defined(MD_ARCH_EXTENSION_FRAME_INIT)
> +  MD_ARCH_EXTENSION_FRAME_INIT(context, fs);
> +#endif
>    context->args_size = 0;
>    context->lsda = 0;
>  
> @@ -1197,7 +1200,11 @@ uw_update_context_1 (struct _Unwind_Context *context, _Unwind_FrameState *fs)
>        {
>        case REG_UNSAVED:
>        case REG_UNDEFINED:
> -      case REG_UNSAVED_ARCHEXT:
> +      case REG_ARCHEXT: // If the value depends on an augmenter, then there is
> +			// no processing to do here, and the value computation
> +			// should be delayed until the architecture handler
> +			// computes the value correctly based on the augmenter
> +			// information.

This would more usually be written as a comment above the case, rather than
to the right.

>  	break;
>  
>        case REG_SAVED_OFFSET:
> diff --git a/libgcc/unwind-dw2.h b/libgcc/unwind-dw2.h
> index 0dd8611f697..b75f4c65f98 100644
> --- a/libgcc/unwind-dw2.h
> +++ b/libgcc/unwind-dw2.h
> @@ -22,16 +22,17 @@
>     see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -enum {
> +enum register_rule
> +{
>    REG_UNSAVED,
>    REG_SAVED_OFFSET,
>    REG_SAVED_REG,
>    REG_SAVED_EXP,
>    REG_SAVED_VAL_OFFSET,
>    REG_SAVED_VAL_EXP,
> -  REG_UNSAVED_ARCHEXT,		/* Target specific extension.  */
> +  REG_ARCHEXT,		/* Target specific extension.  */
>    REG_UNDEFINED
> -};
> +} __attribute__((packed));
>  
>  /* The result of interpreting the frame unwind info for a frame.
>     This is all symbolic at this point, as none of the values can
> @@ -49,7 +50,7 @@ typedef struct
>  	const unsigned char *exp;
>        } loc;
>      } reg[__LIBGCC_DWARF_FRAME_REGISTERS__+1];
> -    unsigned char how[__LIBGCC_DWARF_FRAME_REGISTERS__+1];
> +    enum register_rule how[__LIBGCC_DWARF_FRAME_REGISTERS__+1];
>  
>      enum {
>        CFA_UNSET,
> @@ -65,6 +66,14 @@ typedef struct
>      _Unwind_Sword cfa_offset;
>      _Unwind_Word cfa_reg;
>      const unsigned char *cfa_exp;
> +
> +    /* Architecture extensions information from CIE/FDE.
> +     * Note: those information have to be saved in struct frame_state_reg_info

this information (singular)

Thanks for doing this.  It's definitely a cleaner structure than what
we had before.

Richard

> +     * instead of _Unwind_FrameState as DW_CFA_restore_state has to be able to
> +     * restore them.  */
> +#if defined(__aarch64__) && !defined (__ILP32__)
> +    unsigned char signing_key;
> +#endif
>    } regs;
>  
>    /* The PC described by the current frame state.  */
Matthieu Longo Sept. 17, 2024, 1:28 p.m. UTC | #3
On 2024-08-06 11:06, Richard Sandiford wrote:
> Sorry for the slow review.
> 
> Matthieu Longo <matthieu.longo@arm.com> writes:
>> This patch is only a refactoring of the existing implementation
>> of PAuth and returned-address signing. The existing behavior is
>> preserved.
>>
>> _Unwind_FrameState already contains several CIE and FDE information
>> (see the attributes below the comment "The information we care
>> about from the CIE/FDE" in libgcc/unwind-dw2.h).
>> The patch aims at moving the information from DWARF CIE (signing
>> key stored in the augmentation string) and FDE (the used signing
>> method) into _Unwind_FrameState along the already-stored CIE and
>> FDE information.
>> Note: those information have to be saved in frame_state_reg_info
>> instead of _Unwind_FrameState as they need to be savable by
>> DW_CFA_remember_state and restorable by DW_CFA_restore_state, that
>> both rely on the attribute "prev".
>>
>> Those new information in _Unwind_FrameState simplifies the look-up
>> of the signing key when the return address is demangled. It also
>> allows future signing methods to be easily added.
>>
>> _Unwind_FrameState is not a part of the public API of libunwind,
>> so the change is backward compatible.
>>
>> A new architecture-specific handler MD_ARCH_EXTENSION_FRAME_INIT
>> allows to reset values (if needed) in the frame state and unwind
>> context before changing the frame state to the caller context.
>>
>> A new architecture-specific handler MD_ARCH_EXTENSION_CIE_AUG_HANDLER
>> isolates the architecture-specific augmentation strings in AArch64
>> backend, and allows others architectures to reuse augmentation
>> strings that would have clashed with AArch64 DWARF extensions.
>>
>> aarch64_demangle_return_addr, DW_CFA_AARCH64_negate_ra_state and
>> DW_CFA_val_expression cases in libgcc/unwind-dw2-execute_cfa.h
>> were documented to clarify where the value of the RA state register
>> is stored (FS and CONTEXT respectively).
> 
> The abstraction generally looks good.  My main comment is that,
> if we are putting the new behaviour behind target macros (which I
> agree is a good idea), could we also have a target macro to abstract:
> 
>> +#if defined(__aarch64__) && !defined (__ILP32__)
>> +    unsigned char signing_key;
>> +#endif
> 
> ?  E.g. something like:
> 
> #ifdef MD_CFI_STATE
>      MD_CFI_STATE md;
> #endif
> 
> with aarch64 defining:
> 
> #define MD_CFI_STATE struct { unsigned char signing_key; }
> 
> (Names and organisation are just suggestions.)
> 
> It might be good to try running the patch through
> 
>    contrig/check_GNU_style.py
> 
> since the GCC coding standards are quite picky about formatting and
> stylistic issues.  The main ones I spotted were:
> 
> - In files that already use block comments, all comments should be
>    block comments rather than // comments.  The comments should end
>    with ".  " (full stop and two spaces).
> 
> - Block comments are formatted as:
> 
>      /* Line 1
>         Line 2.  */
> 
>    rather than as:
> 
>      /* Line 1
>       * Line 2.  */
> 
> - Function names are generally all lowrecase (e.g. "ra" rather than "RA").
> 
> - In function calls, there should be a space between the function and
>    the opening "("
> 
> - For pointer types, there should be a space before a "*" (or string of
>    "*"s), but no space afterwards.
> 
> - "const" qualifiers generally go before the type that they qualify,
>    rather than afterwards.
> 
> - The line width should be 80 characters or fewer.  (The patch was pretty
>    good about this, but there were a couple of long lines).
> 
> - In multi-line conditions, the ||s and &&s go at the beginning of lines,
>    rather than at the end.  (Same for infix operators in general.)
> 
> More detailed comments below.
> 
>>
>> libgcc/ChangeLog:
>>
>>    * config/aarch64/aarch64-unwind.h
>>    (AARCH64_DWARF_RA_STATE_MASK): The mask for RA state register.
>>    (aarch64_RA_signing_method_t): The diversifiers used to sign a
>>    function's return address.
>>    (aarch64_pointer_auth_key): The key used to sign a function's
>>    return address.
>>    (aarch64_cie_signed_with_b_key): Deleted as the signing key is
>>    available now in _Unwind_FrameState.
>>    (MD_ARCH_EXTENSION_CIE_AUG_HANDLER): New CIE augmentation string
>>    handler for architecture extensions.
>>    (MD_ARCH_EXTENSION_FRAME_INIT): New architecture-extension
>>    initialization routine for DWARF frame state and context before
>>    execution of DWARF instructions.
>>    (aarch64_context_RA_state_get): Read RA state register from CONTEXT.
>>    (aarch64_RA_state_get): Read RA state register from FS.
>>    (aarch64_RA_state_set): Write RA state register into FS.
>>    (aarch64_RA_state_toggle): Toggle RA state register in FS.
>>    (aarch64_cie_aug_handler): Handler AArch64 augmentation strings.
>>    (aarch64_arch_extension_frame_init): Initialize defaults for the
>>    signing key (PAUTH_KEY_A), and RA state register (RA_no_signing).
>>    (aarch64_demangle_return_addr): Rely on the frame registers and
>>    the signing_key attribute in _Unwind_FrameState.
>>    * unwind-dw2-execute_cfa.h:
>>    Use the right alias DW_CFA_AARCH64_negate_ra_state for __aarch64__
>>    instead of DW_CFA_GNU_window_save.
>>    (DW_CFA_AARCH64_negate_ra_state): Save the signing method in RA
>>    state register. Toggle RA state register without resetting 'how'
>>    to REG_UNSAVED.
>>    * unwind-dw2.c:
>>    (extract_cie_info): Save the signing key in the current
>>    _Unwind_FrameState while parsing the augmentation data.
>>    (uw_frame_state_for): Reset some attributes related to architecture
>>    extensions in _Unwind_FrameState.
>>    (uw_update_context): Move authentication code to AArch64 unwinding.
>>    * unwind-dw2.h (enum register_rule): Give a name to the existing
>>    enum for the register rules, and replace 'unsigned char' by 'enum
>>    register_rule' to facilitate debugging in GDB.
>>    (_Unwind_FrameState): Add a new architecture-extension attribute
>>    to store the signing key.
>> ---
>>   libgcc/config/aarch64/aarch64-unwind.h | 154 ++++++++++++++++++++-----
>>   libgcc/unwind-dw2-execute_cfa.h        |  34 ++++--
>>   libgcc/unwind-dw2.c                    |  19 ++-
>>   libgcc/unwind-dw2.h                    |  17 ++-
>>   4 files changed, 175 insertions(+), 49 deletions(-)
>>
>> diff --git a/libgcc/config/aarch64/aarch64-unwind.h b/libgcc/config/aarch64/aarch64-unwind.h
>> index daf96624b5e..cc225a7e207 100644
>> --- a/libgcc/config/aarch64/aarch64-unwind.h
>> +++ b/libgcc/config/aarch64/aarch64-unwind.h
>> @@ -25,55 +25,155 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>>   #if !defined (AARCH64_UNWIND_H) && !defined (__ILP32__)
>>   #define AARCH64_UNWIND_H
>>   
>> -#define DWARF_REGNUM_AARCH64_RA_STATE 34
>> +#include "ansidecl.h"
>> +#include <stdbool.h>
>> +
>> +#define AARCH64_DWARF_REGNUM_RA_STATE 34
>> +#define AARCH64_DWARF_RA_STATE_MASK   0x1
>> +
>> +/* The diversifiers used to sign a function's return address. */
>> +typedef enum
>> +{
>> +  AARCH64_RA_no_signing = 0x0,
>> +  AARCH64_RA_signing_SP = 0x1,
>> +} __attribute__((packed)) aarch64_RA_signing_method_t;
>> +
>> +/* The key used to sign a function's return address. */
>> +typedef enum {
>> +  AARCH64_PAUTH_KEY_A,
>> +  AARCH64_PAUTH_KEY_B,
>> +} __attribute__((packed)) aarch64_pointer_auth_key;
>> +
>> +#define MD_ARCH_EXTENSION_CIE_AUG_HANDLER(fs, aug) \
>> +  aarch64_cie_aug_handler(fs, aug)
>> +
>> +#define MD_ARCH_EXTENSION_FRAME_INIT(context, fs) \
>> +  aarch64_arch_extension_frame_init(context, fs)
>>   
>>   #define MD_DEMANGLE_RETURN_ADDR(context, fs, addr) \
>>     aarch64_demangle_return_addr (context, fs, addr)
>>   
>> -static inline int
>> -aarch64_cie_signed_with_b_key (struct _Unwind_Context *context)
>> +static inline aarch64_RA_signing_method_t
>> +aarch64_context_RA_state_get(struct _Unwind_Context *context)
>> +{
>> +  const int index = AARCH64_DWARF_REGNUM_RA_STATE;
>> +  return _Unwind_GetGR (context, index) & AARCH64_DWARF_RA_STATE_MASK;
>> +}
>> +
>> +static inline aarch64_RA_signing_method_t
>> +aarch64_fs_RA_state_get(_Unwind_FrameState const* fs)
>>   {
>> -  const struct dwarf_fde *fde = _Unwind_Find_FDE (context->bases.func,
>> -						  &context->bases);
>> -  if (fde != NULL)
>> +  const int index = AARCH64_DWARF_REGNUM_RA_STATE;
>> +  return fs->regs.reg[index].loc.offset & AARCH64_DWARF_RA_STATE_MASK;
>> +}
>> +
>> +static inline void
>> +aarch64_fs_RA_state_set(_Unwind_FrameState *fs,
>> +			aarch64_RA_signing_method_t signing_method)
>> +{
>> +  fs->regs.reg[AARCH64_DWARF_REGNUM_RA_STATE].loc.offset = signing_method;
>> +}
>> +
>> +static inline void
>> +aarch64_fs_RA_state_toggle(_Unwind_FrameState *fs)
>> +{
>> +  aarch64_RA_signing_method_t signing_method = aarch64_fs_RA_state_get(fs);
>> +  gcc_assert (signing_method == AARCH64_RA_no_signing ||
>> +	      signing_method == AARCH64_RA_signing_SP);
>> +  aarch64_fs_RA_state_set(fs, (signing_method == AARCH64_RA_no_signing)
>> +			  ? AARCH64_RA_signing_SP
>> +			  : AARCH64_RA_no_signing);
>> +}
>> +
>> +/* CIE handler for custom augmentation string.  */
>> +static inline bool
>> +aarch64_cie_aug_handler(_Unwind_FrameState *fs, unsigned char aug)
>> +{
>> +  // AArch64 B-key pointer authentication.
>> +  if (aug == 'B')
>>       {
>> -      const struct dwarf_cie *cie = get_cie (fde);
>> -      if (cie != NULL)
>> -	{
>> -	  const unsigned char *aug_str = cie->augmentation;
>> -	  return __builtin_strchr ((const char *) aug_str,
>> -				   'B') == NULL ? 0 : 1;
>> -	}
>> +      fs->regs.signing_key = AARCH64_PAUTH_KEY_B;
>> +      return true;
>>       }
>> -  return 0;
>> +  return false;
>>   }
>>   
>> -/* Do AArch64 private extraction on ADDR_WORD based on context info CONTEXT and
>> -   unwind frame info FS.  If ADDR_WORD is signed, we do address authentication
>> -   on it using CFA of current frame.  */
>> +/* At the entrance of a new frame, some cached information from the CIE/FDE,
>> + * and registers values related to architectural extensions require a default
>> + * initialization.
>> + * If any of those values related to architecture extensions had to be saved
>> + * for the next frame, it should be done via the architecture extensions handler
>> + * MD_FROB_UPDATE_CONTEXT in uw_update_context_1 (libgcc/unwind-dw2.c).  */
>> +static inline void
>> +aarch64_arch_extension_frame_init (struct _Unwind_Context *context ATTRIBUTE_UNUSED,
>> +				   _Unwind_FrameState *fs)
>> +{
>> +  // Reset previously cached CIE/FDE information.
>> +  fs->regs.signing_key = AARCH64_PAUTH_KEY_A;
> 
> How about making the comment:
> 
>    /* By default, DW_CFA_AARCH64_negate_ra_state assumes key A is being used
>       for signing.  This can be overridden by adding 'B' to the augmentation
>       string.  */
> 
> (if you agree that's a reasonable summary).
> 
>> +
>> +  // Reset some registers.
>> +  // Note: the associated fs->how table is automatically reset to REG_UNSAVED in
>> +  // uw_frame_state_for (libgcc/unwind-dw2.c). REG_UNSAVED means that whatever
>> +  // was in the previous context is the current register value. In other words,
>> +  // the next context inherits by default the previous value for a register.
>> +  // To keep this logic coherent with some architecture extensions, we need to
>> +  // reinitialize fs->how for some registers to REG_ARCHEXT.
> 
> How about being a bit more specific about the intent:
> 
>    /* All registers are initially in state REG_UNSAVED, which indicates that
>       they inherit register values from the previous frame.  However, the
>       return address starts every frame in the "unsigned" state.  It also
>       starts every frame in a state that supports the original toggle-based
>       DW_CFA_AARCH64_negate_ra_state method of controlling RA signing.  */
> 
>> +  const int reg = AARCH64_DWARF_REGNUM_RA_STATE;
>> +  fs->regs.how[reg] = REG_ARCHEXT;
> 
> Very minor, but I think this would be easier to read without the temporary.
> 
>> +  aarch64_fs_RA_state_set(fs, AARCH64_RA_no_signing);
>> +}
>>   
>> +/* Do AArch64 private extraction on ADDR_WORD based on context info CONTEXT and
>> +   unwind frame info FS. If ADDR_WORD is signed, we do address authentication
>> +   on it using CFA of current frame.
>> +   Note: this function is called after uw_update_context_1 in uw_update_context.
>> +   uw_update_context_1 is the function in which val expression are computed. Thus
> 
> expressions
> 
>> +   if DW_CFA_val_expression is used, the value of the RA state register is stored
>> +   in CONTEXT, not FS. (more details about when DW_CFA_val_expression is used in
>> +   the next comment below)  */
>>   static inline void *
>>   aarch64_demangle_return_addr (struct _Unwind_Context *context,
>>   			      _Unwind_FrameState *fs,
>>   			      _Unwind_Word addr_word)
>>   {
>>     void *addr = (void *)addr_word;
>> -  const int reg = DWARF_REGNUM_AARCH64_RA_STATE;
>> -
>> -  if (fs->regs.how[reg] == REG_UNSAVED)
>> -    return addr;
>> +  const int reg = AARCH64_DWARF_REGNUM_RA_STATE;
>> +
>> +  // In libgcc, REG_ARCHEXT means that the RA state register was set by an AArch64
>> +  // DWARF instruction and contains a valid value.
> 
> It's also used to describe the initial state.
> 
>> +  // Return-address signing state is normally toggled by DW_CFA_AARCH64_negate_ra_state
>> +  // (also knwon by its alias as DW_CFA_GNU_window_save).
>> +  // However, RA state register can be set directly via DW_CFA_val_expression
>> +  // too. GCC does not generate such CFI but some other compilers reportedly do.
> 
> nit: drop the trailing full stop, since the following line continues
> the sentence.
> 
>> +  // (see PR104689 for more details).
>> +  // Any other value than REG_ARCHEXT should be interpreted as if the RA state register
>> +  // is set by another DWARF instruction, and the value is fetchable via _Unwind_GetGR.
>> +  aarch64_RA_signing_method_t signing_method = AARCH64_RA_no_signing;
>> +  if (fs->regs.how[reg] == REG_ARCHEXT)
>> +    {
>> +      signing_method = aarch64_fs_RA_state_get(fs);
>> +    }
>> +  else if (fs->regs.how[reg] != REG_UNSAVED)
>> +    {
>> +      signing_method = aarch64_context_RA_state_get(context);
>> +
>> +      // If the value was set in context, CONTEXT->by_value was set to 1.
>> +      // uw_install_context_1 contains an assert on by_value, to check that all
>> +      // registers values have been resolved before installing the context.
>> +      // This will make the unwinding crash if we don't reset CONTEXT->by_value,
>> +      // and CONTEXT->reg[reg].
>> +      context->reg[reg] = _Unwind_Get_Unwind_Context_Reg_Val(0x0);
>> +      context->by_value[reg] = 0;
> 
> Is this change to the context necessary for the refactor, or is it
> instead needed for the follow-on patches?  If it's part of the refactor,
> could you explain why it's needed now but wasn't before?
> 
> This is the only part that didn't look like a no-op change.
>

The Cranelift project decided to set the RA state register with 
DW_CFA_val_expression instead of DW_CFA_AARCH64_negate_ra_state.
Something like:
const RA_SIGN_STATE := 34
DW_CFA_val_expression, RA_SIGN_STATE, 1, DW_OP_lit1

When the value is set in CONTEXT, CONTEXT->by_value is also set to 1.
uw_install_context_1 contains an assert on by_value.
The restoration of the context has apparently never been tested as it 
triggers the assert. There is only one test [1] in the GCC testsuite 
that tests the backtracing.

[1]: gcc/testsuite/gcc.target/aarch64/pr104689.c

The idea of the fix was to reset the RA state register, as it is a local 
frame register, and does not need to be copied into the target frame.

I discussed offline this issue with Richard Sandiford, and here is his 
explanation of why there is this assertion in uw_install_context_1.

** begin quotation **

I think the reason for the assert is that the current frame passed to 
uw_install_context_1 is always the function that calls 
__builtin_eh_return, such as _Unwind_RaiseException. That function is 
supposed to create save slots on the stack for each register whose 
contents need to be tracked and unwound (generally all call-preserved 
registers). So I suppose it was a reasonable assumption that that frame 
would never have registers stored by value, since that would prevent the 
register values from being unwound properly.  And the current code 
wouldn't do something sensible for current->by_value[i] != 0.

Here we're using a register to track a property of a frame, rather than 
to track registers between frames. In that sense it's a new use case.

Personally I think it'd be reasonable for the loop in 
uw_install_context_1 to continue on current->by_value[i] (with a 
comment). That seems preferable to changing the register entry to avoid 
the assert. I don't think the loop in uw_install_context_1 should even 
be trying to copy the RA signing state, although I suppose it doesn't 
matter much either way.

** end quotation **

As recommended by Richard, I added a handler MD_FRAME_LOCAL_REGISTER_P 
(in the next revision) to explicitly skip the RA state register from the 
context copy.

>> +    }
>>   
>> -  /* Return-address signing state is toggled by DW_CFA_GNU_window_save (where
>> -     REG_UNSAVED/REG_UNSAVED_ARCHEXT means RA signing is disabled/enabled),
>> -     or set by a DW_CFA_expression.  */
>> -  if (fs->regs.how[reg] == REG_UNSAVED_ARCHEXT
>> -      || (_Unwind_GetGR (context, reg) & 0x1) != 0)
>> +  if (signing_method == AARCH64_RA_signing_SP)
>>       {
>>         _Unwind_Word salt = (_Unwind_Word) context->cfa;
>> -      if (aarch64_cie_signed_with_b_key (context) != 0)
>> +      if (fs->regs.signing_key == AARCH64_PAUTH_KEY_B)
>>   	return __builtin_aarch64_autib1716 (addr, salt);
>>         return __builtin_aarch64_autia1716 (addr, salt);
>>       }
>> +  // else {} // signing_method == AARCH64_RA_no_signing, nothing to do.
> 
> IMO it'd be clearer without this line.  Alternatively, we could make it:
> 
>    else if (signing_method == AARCH64_RA_no_signing)
>      return addr;
> 
>    gcc_unreachable ();
> 
>>   }
>> diff --git a/libgcc/unwind-dw2-execute_cfa.h b/libgcc/unwind-dw2-execute_cfa.h
>> index a6b249f9ad6..2ea78c4ef8d 100644
>> --- a/libgcc/unwind-dw2-execute_cfa.h
>> +++ b/libgcc/unwind-dw2-execute_cfa.h
>> @@ -271,23 +271,33 @@
>>   	      fs->regs.how[reg] = REG_SAVED_VAL_EXP;
>>   	      fs->regs.reg[reg].loc.exp = insn_ptr;
>>   	    }
>> +	  // Don't execute the expression, but jump over it by adding
>> +	  // DW_FORM_block's size to insn_ptr.
>>   	  insn_ptr = read_uleb128 (insn_ptr, &utmp);
>>   	  insn_ptr += utmp;
>>   	  break;
>>   
>> -	case DW_CFA_GNU_window_save:
>>   #if defined (__aarch64__) && !defined (__ILP32__)
>> -	  /* This CFA is multiplexed with Sparc.  On AArch64 it's used to toggle
>> -	     return address signing status.  REG_UNSAVED/REG_UNSAVED_ARCHEXT
>> -	     mean RA signing is disabled/enabled.  */
>> -	  reg = DWARF_REGNUM_AARCH64_RA_STATE;
>> -	  gcc_assert (fs->regs.how[reg] == REG_UNSAVED
>> -		      || fs->regs.how[reg] == REG_UNSAVED_ARCHEXT);
>> -	  if (fs->regs.how[reg] == REG_UNSAVED)
>> -	    fs->regs.how[reg] = REG_UNSAVED_ARCHEXT;
>> -	  else
>> -	    fs->regs.how[reg] = REG_UNSAVED;
>> +	case DW_CFA_AARCH64_negate_ra_state:
>> +	  // This CFA is multiplexed with Sparc.
>> +	  // On AArch64 it's used to toggle the status of return address signing
>> +	  // with SP as a diversifier.
>> +	  // - REG_ARCHEXT means that the RA state register in FS contains a
>> +	  // valid value, and that no other DWARF directive has changed the value
>> +	  // of this register.
>> +	  // - any other value is not compatible with negating the RA state.
>> +	  reg = AARCH64_DWARF_REGNUM_RA_STATE;
>> +
>> +	  // /!\ Mixing DW_CFA_val_expression with DW_CFA_AARCH64_negate_ra_state
>> +	  // will result in an undefined behavior (likely an unwinding failure),
>> +	  // as the chronology of the DWARF directives will be broken.
>> +	  gcc_assert (fs->regs.how[reg] == REG_ARCHEXT);
> 
> Could we move the assert into aarch64_fs_RA_state_toggle, or do later
> patches not want that?
> 
>> +
>> +	  // Toggle current state
>> +	  aarch64_fs_RA_state_toggle(fs);
>> +	  break;
>>   #else
>> +	case DW_CFA_GNU_window_save:
>>   	  /* ??? Hardcoded for SPARC register window configuration.  */
>>   	  if (__LIBGCC_DWARF_FRAME_REGISTERS__ >= 32)
>>   	    for (reg = 16; reg < 32; ++reg)
>> @@ -295,8 +305,8 @@
>>   		fs->regs.how[reg] = REG_SAVED_OFFSET;
>>   		fs->regs.reg[reg].loc.offset = (reg - 16) * sizeof (void *);
>>   	      }
>> -#endif
>>   	  break;
>> +#endif
>>   
>>   	case DW_CFA_GNU_args_size:
>>   	  insn_ptr = read_uleb128 (insn_ptr, &utmp);
>> diff --git a/libgcc/unwind-dw2.c b/libgcc/unwind-dw2.c
>> index 0849e89cd34..fb60853825d 100644
>> --- a/libgcc/unwind-dw2.c
>> +++ b/libgcc/unwind-dw2.c
>> @@ -501,11 +501,11 @@ extract_cie_info (const struct dwarf_cie *cie, struct _Unwind_Context *context,
>>   	  fs->signal_frame = 1;
>>   	  aug += 1;
>>   	}
>> -      /* aarch64 B-key pointer authentication.  */
>> -      else if (aug[0] == 'B')
>> -	{
>> -	  aug += 1;
>> -      }
>> +
>> +#if defined(MD_ARCH_EXTENSION_CIE_AUG_HANDLER)
>> +      else if (MD_ARCH_EXTENSION_CIE_AUG_HANDLER(fs, aug[0]))
>> +	aug += 1;
>> +#endif
>>   
>>         /* Otherwise we have an unknown augmentation string.
>>   	 Bail unless we saw a 'z' prefix.  */
>> @@ -996,6 +996,9 @@ uw_frame_state_for (struct _Unwind_Context *context, _Unwind_FrameState *fs)
>>   
>>     memset (&fs->regs.how[0], 0,
>>   	  sizeof (*fs) - offsetof (_Unwind_FrameState, regs.how[0]));
>> +#if defined(MD_ARCH_EXTENSION_FRAME_INIT)
>> +  MD_ARCH_EXTENSION_FRAME_INIT(context, fs);
>> +#endif
>>     context->args_size = 0;
>>     context->lsda = 0;
>>   
>> @@ -1197,7 +1200,11 @@ uw_update_context_1 (struct _Unwind_Context *context, _Unwind_FrameState *fs)
>>         {
>>         case REG_UNSAVED:
>>         case REG_UNDEFINED:
>> -      case REG_UNSAVED_ARCHEXT:
>> +      case REG_ARCHEXT: // If the value depends on an augmenter, then there is
>> +			// no processing to do here, and the value computation
>> +			// should be delayed until the architecture handler
>> +			// computes the value correctly based on the augmenter
>> +			// information.
> 
> This would more usually be written as a comment above the case, rather than
> to the right.
> 
>>   	break;
>>   
>>         case REG_SAVED_OFFSET:
>> diff --git a/libgcc/unwind-dw2.h b/libgcc/unwind-dw2.h
>> index 0dd8611f697..b75f4c65f98 100644
>> --- a/libgcc/unwind-dw2.h
>> +++ b/libgcc/unwind-dw2.h
>> @@ -22,16 +22,17 @@
>>      see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>>      <http://www.gnu.org/licenses/>.  */
>>   
>> -enum {
>> +enum register_rule
>> +{
>>     REG_UNSAVED,
>>     REG_SAVED_OFFSET,
>>     REG_SAVED_REG,
>>     REG_SAVED_EXP,
>>     REG_SAVED_VAL_OFFSET,
>>     REG_SAVED_VAL_EXP,
>> -  REG_UNSAVED_ARCHEXT,		/* Target specific extension.  */
>> +  REG_ARCHEXT,		/* Target specific extension.  */
>>     REG_UNDEFINED
>> -};
>> +} __attribute__((packed));
>>   
>>   /* The result of interpreting the frame unwind info for a frame.
>>      This is all symbolic at this point, as none of the values can
>> @@ -49,7 +50,7 @@ typedef struct
>>   	const unsigned char *exp;
>>         } loc;
>>       } reg[__LIBGCC_DWARF_FRAME_REGISTERS__+1];
>> -    unsigned char how[__LIBGCC_DWARF_FRAME_REGISTERS__+1];
>> +    enum register_rule how[__LIBGCC_DWARF_FRAME_REGISTERS__+1];
>>   
>>       enum {
>>         CFA_UNSET,
>> @@ -65,6 +66,14 @@ typedef struct
>>       _Unwind_Sword cfa_offset;
>>       _Unwind_Word cfa_reg;
>>       const unsigned char *cfa_exp;
>> +
>> +    /* Architecture extensions information from CIE/FDE.
>> +     * Note: those information have to be saved in struct frame_state_reg_info
> 
> this information (singular)
> 
> Thanks for doing this.  It's definitely a cleaner structure than what
> we had before.
> 
> Richard
> 
>> +     * instead of _Unwind_FrameState as DW_CFA_restore_state has to be able to
>> +     * restore them.  */
>> +#if defined(__aarch64__) && !defined (__ILP32__)
>> +    unsigned char signing_key;
>> +#endif
>>     } regs;
>>   
>>     /* The PC described by the current frame state.  */

All the comments above were addressed in the next revision.
diff mbox series

Patch

diff --git a/libgcc/config/aarch64/aarch64-unwind.h b/libgcc/config/aarch64/aarch64-unwind.h
index daf96624b5e..cc225a7e207 100644
--- a/libgcc/config/aarch64/aarch64-unwind.h
+++ b/libgcc/config/aarch64/aarch64-unwind.h
@@ -25,55 +25,155 @@  see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #if !defined (AARCH64_UNWIND_H) && !defined (__ILP32__)
 #define AARCH64_UNWIND_H
 
-#define DWARF_REGNUM_AARCH64_RA_STATE 34
+#include "ansidecl.h"
+#include <stdbool.h>
+
+#define AARCH64_DWARF_REGNUM_RA_STATE 34
+#define AARCH64_DWARF_RA_STATE_MASK   0x1
+
+/* The diversifiers used to sign a function's return address. */
+typedef enum
+{
+  AARCH64_RA_no_signing = 0x0,
+  AARCH64_RA_signing_SP = 0x1,
+} __attribute__((packed)) aarch64_RA_signing_method_t;
+
+/* The key used to sign a function's return address. */
+typedef enum {
+  AARCH64_PAUTH_KEY_A,
+  AARCH64_PAUTH_KEY_B,
+} __attribute__((packed)) aarch64_pointer_auth_key;
+
+#define MD_ARCH_EXTENSION_CIE_AUG_HANDLER(fs, aug) \
+  aarch64_cie_aug_handler(fs, aug)
+
+#define MD_ARCH_EXTENSION_FRAME_INIT(context, fs) \
+  aarch64_arch_extension_frame_init(context, fs)
 
 #define MD_DEMANGLE_RETURN_ADDR(context, fs, addr) \
   aarch64_demangle_return_addr (context, fs, addr)
 
-static inline int
-aarch64_cie_signed_with_b_key (struct _Unwind_Context *context)
+static inline aarch64_RA_signing_method_t
+aarch64_context_RA_state_get(struct _Unwind_Context *context)
+{
+  const int index = AARCH64_DWARF_REGNUM_RA_STATE;
+  return _Unwind_GetGR (context, index) & AARCH64_DWARF_RA_STATE_MASK;
+}
+
+static inline aarch64_RA_signing_method_t
+aarch64_fs_RA_state_get(_Unwind_FrameState const* fs)
 {
-  const struct dwarf_fde *fde = _Unwind_Find_FDE (context->bases.func,
-						  &context->bases);
-  if (fde != NULL)
+  const int index = AARCH64_DWARF_REGNUM_RA_STATE;
+  return fs->regs.reg[index].loc.offset & AARCH64_DWARF_RA_STATE_MASK;
+}
+
+static inline void
+aarch64_fs_RA_state_set(_Unwind_FrameState *fs,
+			aarch64_RA_signing_method_t signing_method)
+{
+  fs->regs.reg[AARCH64_DWARF_REGNUM_RA_STATE].loc.offset = signing_method;
+}
+
+static inline void
+aarch64_fs_RA_state_toggle(_Unwind_FrameState *fs)
+{
+  aarch64_RA_signing_method_t signing_method = aarch64_fs_RA_state_get(fs);
+  gcc_assert (signing_method == AARCH64_RA_no_signing ||
+	      signing_method == AARCH64_RA_signing_SP);
+  aarch64_fs_RA_state_set(fs, (signing_method == AARCH64_RA_no_signing)
+			  ? AARCH64_RA_signing_SP
+			  : AARCH64_RA_no_signing);
+}
+
+/* CIE handler for custom augmentation string.  */
+static inline bool
+aarch64_cie_aug_handler(_Unwind_FrameState *fs, unsigned char aug)
+{
+  // AArch64 B-key pointer authentication.
+  if (aug == 'B')
     {
-      const struct dwarf_cie *cie = get_cie (fde);
-      if (cie != NULL)
-	{
-	  const unsigned char *aug_str = cie->augmentation;
-	  return __builtin_strchr ((const char *) aug_str,
-				   'B') == NULL ? 0 : 1;
-	}
+      fs->regs.signing_key = AARCH64_PAUTH_KEY_B;
+      return true;
     }
-  return 0;
+  return false;
 }
 
-/* Do AArch64 private extraction on ADDR_WORD based on context info CONTEXT and
-   unwind frame info FS.  If ADDR_WORD is signed, we do address authentication
-   on it using CFA of current frame.  */
+/* At the entrance of a new frame, some cached information from the CIE/FDE,
+ * and registers values related to architectural extensions require a default
+ * initialization.
+ * If any of those values related to architecture extensions had to be saved
+ * for the next frame, it should be done via the architecture extensions handler
+ * MD_FROB_UPDATE_CONTEXT in uw_update_context_1 (libgcc/unwind-dw2.c).  */
+static inline void
+aarch64_arch_extension_frame_init (struct _Unwind_Context *context ATTRIBUTE_UNUSED,
+				   _Unwind_FrameState *fs)
+{
+  // Reset previously cached CIE/FDE information.
+  fs->regs.signing_key = AARCH64_PAUTH_KEY_A;
+
+  // Reset some registers.
+  // Note: the associated fs->how table is automatically reset to REG_UNSAVED in
+  // uw_frame_state_for (libgcc/unwind-dw2.c). REG_UNSAVED means that whatever
+  // was in the previous context is the current register value. In other words,
+  // the next context inherits by default the previous value for a register.
+  // To keep this logic coherent with some architecture extensions, we need to
+  // reinitialize fs->how for some registers to REG_ARCHEXT.
+  const int reg = AARCH64_DWARF_REGNUM_RA_STATE;
+  fs->regs.how[reg] = REG_ARCHEXT;
+  aarch64_fs_RA_state_set(fs, AARCH64_RA_no_signing);
+}
 
+/* Do AArch64 private extraction on ADDR_WORD based on context info CONTEXT and
+   unwind frame info FS. If ADDR_WORD is signed, we do address authentication
+   on it using CFA of current frame.
+   Note: this function is called after uw_update_context_1 in uw_update_context.
+   uw_update_context_1 is the function in which val expression are computed. Thus
+   if DW_CFA_val_expression is used, the value of the RA state register is stored
+   in CONTEXT, not FS. (more details about when DW_CFA_val_expression is used in
+   the next comment below)  */
 static inline void *
 aarch64_demangle_return_addr (struct _Unwind_Context *context,
 			      _Unwind_FrameState *fs,
 			      _Unwind_Word addr_word)
 {
   void *addr = (void *)addr_word;
-  const int reg = DWARF_REGNUM_AARCH64_RA_STATE;
-
-  if (fs->regs.how[reg] == REG_UNSAVED)
-    return addr;
+  const int reg = AARCH64_DWARF_REGNUM_RA_STATE;
+
+  // In libgcc, REG_ARCHEXT means that the RA state register was set by an AArch64
+  // DWARF instruction and contains a valid value.
+  // Return-address signing state is normally toggled by DW_CFA_AARCH64_negate_ra_state
+  // (also knwon by its alias as DW_CFA_GNU_window_save).
+  // However, RA state register can be set directly via DW_CFA_val_expression
+  // too. GCC does not generate such CFI but some other compilers reportedly do.
+  // (see PR104689 for more details).
+  // Any other value than REG_ARCHEXT should be interpreted as if the RA state register
+  // is set by another DWARF instruction, and the value is fetchable via _Unwind_GetGR.
+  aarch64_RA_signing_method_t signing_method = AARCH64_RA_no_signing;
+  if (fs->regs.how[reg] == REG_ARCHEXT)
+    {
+      signing_method = aarch64_fs_RA_state_get(fs);
+    }
+  else if (fs->regs.how[reg] != REG_UNSAVED)
+    {
+      signing_method = aarch64_context_RA_state_get(context);
+
+      // If the value was set in context, CONTEXT->by_value was set to 1.
+      // uw_install_context_1 contains an assert on by_value, to check that all
+      // registers values have been resolved before installing the context.
+      // This will make the unwinding crash if we don't reset CONTEXT->by_value,
+      // and CONTEXT->reg[reg].
+      context->reg[reg] = _Unwind_Get_Unwind_Context_Reg_Val(0x0);
+      context->by_value[reg] = 0;
+    }
 
-  /* Return-address signing state is toggled by DW_CFA_GNU_window_save (where
-     REG_UNSAVED/REG_UNSAVED_ARCHEXT means RA signing is disabled/enabled),
-     or set by a DW_CFA_expression.  */
-  if (fs->regs.how[reg] == REG_UNSAVED_ARCHEXT
-      || (_Unwind_GetGR (context, reg) & 0x1) != 0)
+  if (signing_method == AARCH64_RA_signing_SP)
     {
       _Unwind_Word salt = (_Unwind_Word) context->cfa;
-      if (aarch64_cie_signed_with_b_key (context) != 0)
+      if (fs->regs.signing_key == AARCH64_PAUTH_KEY_B)
 	return __builtin_aarch64_autib1716 (addr, salt);
       return __builtin_aarch64_autia1716 (addr, salt);
     }
+  // else {} // signing_method == AARCH64_RA_no_signing, nothing to do.
 
   return addr;
 }
diff --git a/libgcc/unwind-dw2-execute_cfa.h b/libgcc/unwind-dw2-execute_cfa.h
index a6b249f9ad6..2ea78c4ef8d 100644
--- a/libgcc/unwind-dw2-execute_cfa.h
+++ b/libgcc/unwind-dw2-execute_cfa.h
@@ -271,23 +271,33 @@ 
 	      fs->regs.how[reg] = REG_SAVED_VAL_EXP;
 	      fs->regs.reg[reg].loc.exp = insn_ptr;
 	    }
+	  // Don't execute the expression, but jump over it by adding
+	  // DW_FORM_block's size to insn_ptr.
 	  insn_ptr = read_uleb128 (insn_ptr, &utmp);
 	  insn_ptr += utmp;
 	  break;
 
-	case DW_CFA_GNU_window_save:
 #if defined (__aarch64__) && !defined (__ILP32__)
-	  /* This CFA is multiplexed with Sparc.  On AArch64 it's used to toggle
-	     return address signing status.  REG_UNSAVED/REG_UNSAVED_ARCHEXT
-	     mean RA signing is disabled/enabled.  */
-	  reg = DWARF_REGNUM_AARCH64_RA_STATE;
-	  gcc_assert (fs->regs.how[reg] == REG_UNSAVED
-		      || fs->regs.how[reg] == REG_UNSAVED_ARCHEXT);
-	  if (fs->regs.how[reg] == REG_UNSAVED)
-	    fs->regs.how[reg] = REG_UNSAVED_ARCHEXT;
-	  else
-	    fs->regs.how[reg] = REG_UNSAVED;
+	case DW_CFA_AARCH64_negate_ra_state:
+	  // This CFA is multiplexed with Sparc.
+	  // On AArch64 it's used to toggle the status of return address signing
+	  // with SP as a diversifier.
+	  // - REG_ARCHEXT means that the RA state register in FS contains a
+	  // valid value, and that no other DWARF directive has changed the value
+	  // of this register.
+	  // - any other value is not compatible with negating the RA state.
+	  reg = AARCH64_DWARF_REGNUM_RA_STATE;
+
+	  // /!\ Mixing DW_CFA_val_expression with DW_CFA_AARCH64_negate_ra_state
+	  // will result in an undefined behavior (likely an unwinding failure),
+	  // as the chronology of the DWARF directives will be broken.
+	  gcc_assert (fs->regs.how[reg] == REG_ARCHEXT);
+
+	  // Toggle current state
+	  aarch64_fs_RA_state_toggle(fs);
+	  break;
 #else
+	case DW_CFA_GNU_window_save:
 	  /* ??? Hardcoded for SPARC register window configuration.  */
 	  if (__LIBGCC_DWARF_FRAME_REGISTERS__ >= 32)
 	    for (reg = 16; reg < 32; ++reg)
@@ -295,8 +305,8 @@ 
 		fs->regs.how[reg] = REG_SAVED_OFFSET;
 		fs->regs.reg[reg].loc.offset = (reg - 16) * sizeof (void *);
 	      }
-#endif
 	  break;
+#endif
 
 	case DW_CFA_GNU_args_size:
 	  insn_ptr = read_uleb128 (insn_ptr, &utmp);
diff --git a/libgcc/unwind-dw2.c b/libgcc/unwind-dw2.c
index 0849e89cd34..fb60853825d 100644
--- a/libgcc/unwind-dw2.c
+++ b/libgcc/unwind-dw2.c
@@ -501,11 +501,11 @@  extract_cie_info (const struct dwarf_cie *cie, struct _Unwind_Context *context,
 	  fs->signal_frame = 1;
 	  aug += 1;
 	}
-      /* aarch64 B-key pointer authentication.  */
-      else if (aug[0] == 'B')
-	{
-	  aug += 1;
-      }
+
+#if defined(MD_ARCH_EXTENSION_CIE_AUG_HANDLER)
+      else if (MD_ARCH_EXTENSION_CIE_AUG_HANDLER(fs, aug[0]))
+	aug += 1;
+#endif
 
       /* Otherwise we have an unknown augmentation string.
 	 Bail unless we saw a 'z' prefix.  */
@@ -996,6 +996,9 @@  uw_frame_state_for (struct _Unwind_Context *context, _Unwind_FrameState *fs)
 
   memset (&fs->regs.how[0], 0,
 	  sizeof (*fs) - offsetof (_Unwind_FrameState, regs.how[0]));
+#if defined(MD_ARCH_EXTENSION_FRAME_INIT)
+  MD_ARCH_EXTENSION_FRAME_INIT(context, fs);
+#endif
   context->args_size = 0;
   context->lsda = 0;
 
@@ -1197,7 +1200,11 @@  uw_update_context_1 (struct _Unwind_Context *context, _Unwind_FrameState *fs)
       {
       case REG_UNSAVED:
       case REG_UNDEFINED:
-      case REG_UNSAVED_ARCHEXT:
+      case REG_ARCHEXT: // If the value depends on an augmenter, then there is
+			// no processing to do here, and the value computation
+			// should be delayed until the architecture handler
+			// computes the value correctly based on the augmenter
+			// information.
 	break;
 
       case REG_SAVED_OFFSET:
diff --git a/libgcc/unwind-dw2.h b/libgcc/unwind-dw2.h
index 0dd8611f697..b75f4c65f98 100644
--- a/libgcc/unwind-dw2.h
+++ b/libgcc/unwind-dw2.h
@@ -22,16 +22,17 @@ 
    see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
    <http://www.gnu.org/licenses/>.  */
 
-enum {
+enum register_rule
+{
   REG_UNSAVED,
   REG_SAVED_OFFSET,
   REG_SAVED_REG,
   REG_SAVED_EXP,
   REG_SAVED_VAL_OFFSET,
   REG_SAVED_VAL_EXP,
-  REG_UNSAVED_ARCHEXT,		/* Target specific extension.  */
+  REG_ARCHEXT,		/* Target specific extension.  */
   REG_UNDEFINED
-};
+} __attribute__((packed));
 
 /* The result of interpreting the frame unwind info for a frame.
    This is all symbolic at this point, as none of the values can
@@ -49,7 +50,7 @@  typedef struct
 	const unsigned char *exp;
       } loc;
     } reg[__LIBGCC_DWARF_FRAME_REGISTERS__+1];
-    unsigned char how[__LIBGCC_DWARF_FRAME_REGISTERS__+1];
+    enum register_rule how[__LIBGCC_DWARF_FRAME_REGISTERS__+1];
 
     enum {
       CFA_UNSET,
@@ -65,6 +66,14 @@  typedef struct
     _Unwind_Sword cfa_offset;
     _Unwind_Word cfa_reg;
     const unsigned char *cfa_exp;
+
+    /* Architecture extensions information from CIE/FDE.
+     * Note: those information have to be saved in struct frame_state_reg_info
+     * instead of _Unwind_FrameState as DW_CFA_restore_state has to be able to
+     * restore them.  */
+#if defined(__aarch64__) && !defined (__ILP32__)
+    unsigned char signing_key;
+#endif
   } regs;
 
   /* The PC described by the current frame state.  */