[PATCHv2,4/7,ARM,V8M] ARMv8-M Security Extension's cmse_nonsecure_entry: clear registers

On 26/10/16 17:30, Kyrill Tkachov wrote:
> 
> On 26/10/16 17:26, Andre Vieira (lists) wrote:
>> On 26/10/16 13:51, Kyrill Tkachov wrote:
>>> Hi Andre,
>>>
>>> On 25/10/16 17:29, Andre Vieira (lists) wrote:
>>>> On 24/08/16 12:01, Andre Vieira (lists) wrote:
>>>>> On 25/07/16 14:23, Andre Vieira (lists) wrote:
>>>>>> This patch extends support for the ARMv8-M Security Extensions
>>>>>> 'cmse_nonsecure_entry' attribute to safeguard against leak of
>>>>>> information through unbanked registers.
>>>>>>
>>>>>> When returning from a nonsecure entry function we clear all
>>>>>> caller-saved
>>>>>> registers that are not used to pass return values, by writing either
>>>>>> the
>>>>>> LR, in case of general purpose registers, or the value 0, in case
>>>>>> of FP
>>>>>> registers. We use the LR to write to APSR and FPSCR too. We
>>>>>> currently do
>>>>>> not support entry functions that pass arguments or return
>>>>>> variables on
>>>>>> the stack and we diagnose this. This patch relies on the existing
>>>>>> code
>>>>>> to make sure callee-saved registers used in cmse_nonsecure_entry
>>>>>> functions are saved and restored thus retaining their nonsecure mode
>>>>>> value, this should be happening already as it is required by AAPCS.
>>>>>>
>>>>>> This patch also clears padding bits for cmse_nonsecure_entry
>>>>>> functions
>>>>>> with struct and union return types. For unions a bit is only
>>>>>> considered
>>>>>> a padding bit if it is an unused bit in every field of that union.
>>>>>> The
>>>>>> function that calculates these is used in a later patch to do the
>>>>>> same
>>>>>> for arguments of cmse_nonsecure_call's.
>>>>>>
>>>>>> *** gcc/ChangeLog ***
>>>>>> 2016-07-25  Andre Vieira        <andre.simoesdiasvieira@arm.com>
>>>>>>               Thomas Preud'homme  <thomas.preudhomme@arm.com>
>>>>>>
>>>>>>           * config/arm/arm.c (output_return_instruction): Clear
>>>>>>           registers.
>>>>>>           (thumb2_expand_return): Likewise.
>>>>>>           (thumb1_expand_epilogue): Likewise.
>>>>>>           (thumb_exit): Likewise.
>>>>>>           (arm_expand_epilogue): Likewise.
>>>>>>           (cmse_nonsecure_entry_clear_before_return): New.
>>>>>>           (comp_not_to_clear_mask_str_un): New.
>>>>>>           (compute_not_to_clear_mask): New.
>>>>>>           * config/arm/thumb1.md (*epilogue_insns): Change length
>>>>>> attribute.
>>>>>>           * config/arm/thumb2.md (*thumb2_return): Likewise.
>>>>>>
>>>>>> *** gcc/testsuite/ChangeLog ***
>>>>>> 2016-07-25  Andre Vieira        <andre.simoesdiasvieira@arm.com>
>>>>>>               Thomas Preud'homme  <thomas.preudhomme@arm.com>
>>>>>>
>>>>>>           * gcc.target/arm/cmse/cmse.exp: Test different multilibs
>>>>>> separate.
>>>>>>           * gcc.target/arm/cmse/struct-1.c: New.
>>>>>>           * gcc.target/arm/cmse/bitfield-1.c: New.
>>>>>>           * gcc.target/arm/cmse/bitfield-2.c: New.
>>>>>>           * gcc.target/arm/cmse/bitfield-3.c: New.
>>>>>>           * gcc.target/arm/cmse/baseline/cmse-2.c: Test that
>>>>>> registers are
>>>>>> cleared.
>>>>>>           * gcc.target/arm/cmse/mainline/soft/cmse-5.c: New.
>>>>>>           * gcc.target/arm/cmse/mainline/hard/cmse-5.c: New.
>>>>>>           * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: New.
>>>>>>           * gcc.target/arm/cmse/mainline/softfp/cmse-5.c: New.
>>>>>>           * gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: New.
>>>>>>
>>>>> Updated this patch to correctly clear only the cumulative
>>>>> exception-status (0-4,7) and the condition code bits (28-31) of the
>>>>> FPSCR. I also adapted the code to be handle the bigger floating point
>>>>> register files.
>>>>>
>>>>> ----
>>>>>
>>>>> This patch extends support for the ARMv8-M Security Extensions
>>>>> 'cmse_nonsecure_entry' attribute to safeguard against leak of
>>>>> information through unbanked registers.
>>>>>
>>>>> When returning from a nonsecure entry function we clear all
>>>>> caller-saved
>>>>> registers that are not used to pass return values, by writing
>>>>> either the
>>>>> LR, in case of general purpose registers, or the value 0, in case
>>>>> of FP
>>>>> registers. We use the LR to write to APSR. For FPSCR we clear only the
>>>>> cumulative exception-status (0-4, 7) and the condition code bits
>>>>> (28-31). We currently do not support entry functions that pass
>>>>> arguments
>>>>> or return variables on the stack and we diagnose this. This patch
>>>>> relies
>>>>> on the existing code to make sure callee-saved registers used in
>>>>> cmse_nonsecure_entry functions are saved and restored thus retaining
>>>>> their nonsecure mode value, this should be happening already as it is
>>>>> required by AAPCS.
>>>>>
>>>>> This patch also clears padding bits for cmse_nonsecure_entry functions
>>>>> with struct and union return types. For unions a bit is only
>>>>> considered
>>>>> a padding bit if it is an unused bit in every field of that union. The
>>>>> function that calculates these is used in a later patch to do the same
>>>>> for arguments of cmse_nonsecure_call's.
>>>>>
>>>>> *** gcc/ChangeLog ***
>>>>> 2016-07-xx  Andre Vieira        <andre.simoesdiasvieira@arm.com>
>>>>>               Thomas Preud'homme  <thomas.preudhomme@arm.com>
>>>>>
>>>>>           * config/arm/arm.c (output_return_instruction): Clear
>>>>>           registers.
>>>>>           (thumb2_expand_return): Likewise.
>>>>>           (thumb1_expand_epilogue): Likewise.
>>>>>           (thumb_exit): Likewise.
>>>>>           (arm_expand_epilogue): Likewise.
>>>>>           (cmse_nonsecure_entry_clear_before_return): New.
>>>>>           (comp_not_to_clear_mask_str_un): New.
>>>>>           (compute_not_to_clear_mask): New.
>>>>>           * config/arm/thumb1.md (*epilogue_insns): Change length
>>>>> attribute.
>>>>>           * config/arm/thumb2.md (*thumb2_return): Duplicate
>>>>> pattern for
>>>>>           cmse_nonsecure_entry functions.
>>>>>
>>>>> *** gcc/testsuite/ChangeLog ***
>>>>> 2016-07-xx  Andre Vieira        <andre.simoesdiasvieira@arm.com>
>>>>>               Thomas Preud'homme  <thomas.preudhomme@arm.com>
>>>>>
>>>>>           * gcc.target/arm/cmse/cmse.exp: Test different multilibs
>>>>> separate.
>>>>>           * gcc.target/arm/cmse/struct-1.c: New.
>>>>>           * gcc.target/arm/cmse/bitfield-1.c: New.
>>>>>           * gcc.target/arm/cmse/bitfield-2.c: New.
>>>>>           * gcc.target/arm/cmse/bitfield-3.c: New.
>>>>>           * gcc.target/arm/cmse/baseline/cmse-2.c: Test that registers
>>>>> are
>>>>> cleared.
>>>>>           * gcc.target/arm/cmse/mainline/soft/cmse-5.c: New.
>>>>>           * gcc.target/arm/cmse/mainline/hard/cmse-5.c: New.
>>>>>           * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: New.
>>>>>           * gcc.target/arm/cmse/mainline/softfp/cmse-5.c: New.
>>>>>           * gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: New.
>>>>>
>>>> Hi,
>>>>
>>>> Rebased previous patch on top of trunk as requested. No changes to
>>>> ChangeLog.
>>>>
>>>> Cheers,
>>>> Andre
>>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>>> index
>>> bb81e5662e81a26c7d3ccf9f749e8e356e6de35e..c6260323ecfd2f2842e6a5aab06b67da16619c73
>>>
>>> 100644
>>> --- a/gcc/config/arm/arm.c
>>> +++ b/gcc/config/arm/arm.c
>>> @@ -17496,6 +17496,279 @@ note_invalid_constants (rtx_insn *insn,
>>> HOST_WIDE_INT address, int do_pushes)
>>>     return;
>>>   }
>>>   +/* This function computes the clear mask and PADDING_BITS_TO_CLEAR
>>> for
>>> structs
>>> +   and unions in the context of ARMv8-M Security Extensions.  It is
>>> used as a
>>> +   helper function for both 'cmse_nonsecure_call' and
>>> 'cmse_nonsecure_entry'
>>> +   functions.  The PADDING_BITS_TO_CLEAR pointer can be the base to
>>> either one
>>> +   or four masks, depending on whether it is being computed for a
>>> +   'cmse_nonsecure_entry' return value or a 'cmse_nonsecure_call'
>>> argument
>>> +   respectively.  The tree for the type of the argument or a field
>>> within an
>>> +   argument is passed in ARG_TYPE, the current register this argument
>>> or field
>>> +   starts in is kept in the pointer REGNO and updated accordingly, the
>>> bit this
>>> +   argument or field starts at is passed in STARTING_BIT and the last
>>> used bit
>>> +   is kept in LAST_USED_BIT which is also updated accordingly.  */
>>> +
>>> +static unsigned HOST_WIDE_INT
>>> +comp_not_to_clear_mask_str_un (tree arg_type, int * regno,
>>> +                   uint32_t * padding_bits_to_clear,
>>> +                   unsigned starting_bit, int * last_used_bit)
>>> +
>>> +{
>>> +  unsigned HOST_WIDE_INT not_to_clear_reg_mask = 0;
>>> +
>>> +  if (TREE_CODE (arg_type) == RECORD_TYPE)
>>> +    {
>>> +      unsigned current_bit = starting_bit;
>>> +      tree field;
>>> +      long int offset, size;
>>> +
>>> +
>>> +      field = TYPE_FIELDS (arg_type);
>>> +      while (field)
>>> +    {
>>> +      /* The offset within a structure is always an offset from
>>> +         the start of that structure.  Make sure we take that into the
>>> +         calculation of the register based offset that we use here.  */
>>> +      offset = starting_bit;
>>> +      offset += TREE_INT_CST_ELT (DECL_FIELD_BIT_OFFSET (field), 0);
>>> +      offset %= 32;
>>> +
>>> +      /* This is the actual size of the field, for bitfields this is
>>> the
>>> +         bitfield width and not the container size.  */
>>> +      size = TREE_INT_CST_ELT (DECL_SIZE (field), 0);
>>> +
>>> +      if (*last_used_bit != offset)
>>> +        {
>>> +          if (offset < *last_used_bit)
>>> +        {
>>> +          /* This field's offset is before the 'last_used_bit', that
>>> +             means this field goes on the next register.  So we need to
>>> +             pad the rest of the current register and increase the
>>> +             register number.  */
>>> +          uint32_t mask;
>>> +          mask  = UINT32_MAX - ((uint32_t) 1 << *last_used_bit);
>>> +          mask++;
>>> +
>>> +          *(padding_bits_to_clear + *regno) |= mask;
>>>
>>> padding_bits_to_clear[*regno] |= mask;
>>>
>>> +          not_to_clear_reg_mask |= HOST_WIDE_INT_1U << *regno;
>>> +          (*regno)++;
>>> +        }
>>> +          else
>>> +        {
>>> +          /* Otherwise we pad the bits between the last field's end and
>>> +             the start of the new field.  */
>>> +          uint32_t mask;
>>> +
>>> +          mask = UINT32_MAX >> (32 - offset);
>>> +          mask -= ((uint32_t) 1 << *last_used_bit) - 1;
>>> +          *(padding_bits_to_clear + *regno) |= mask;
>>>
>>> Likewise.
>>>
>>> +        }
>>> +          current_bit = offset;
>>> +        }
>>> +
>>> +      /* Calculate further padding bits for inner structs/unions
>>> too.  */
>>> +      if (RECORD_OR_UNION_TYPE_P (TREE_TYPE (field)))
>>> +        {
>>> +          *last_used_bit = current_bit;
>>> +          not_to_clear_reg_mask
>>> +        |= comp_not_to_clear_mask_str_un (TREE_TYPE (field), regno,
>>> +                          padding_bits_to_clear, offset,
>>> +                          last_used_bit);
>>> +        }
>>> +      else
>>> +        {
>>> +          /* Update 'current_bit' with this field's size.  If the
>>> +         'current_bit' lies in a subsequent register, update 'regno'
>>> and
>>> +         reset 'current_bit' to point to the current bit in that new
>>> +         register.  */
>>> +          current_bit += size;
>>> +          while (current_bit >= 32)
>>> +        {
>>> +          current_bit-=32;
>>> +          not_to_clear_reg_mask |= HOST_WIDE_INT_1U << *regno;
>>> +          (*regno)++;
>>> +        }
>>> +          *last_used_bit = current_bit;
>>> +        }
>>> +
>>> +      field = TREE_CHAIN (field);
>>> +    }
>>> +      not_to_clear_reg_mask |= HOST_WIDE_INT_1U << *regno;
>>> +    }
>>> +  else if (TREE_CODE (arg_type) == UNION_TYPE)
>>> +    {
>>> +      tree field, field_t;
>>> +      int i, regno_t, field_size;
>>> +      int max_reg = -1;
>>> +      int max_bit = -1;
>>> +      uint32_t mask;
>>> +      uint32_t padding_bits_to_clear_res[NUM_ARG_REGS]
>>> +    = {UINT32_MAX, UINT32_MAX, UINT32_MAX, UINT32_MAX};
>>> +
>>> +      /* To compute the padding bits in a union we only consider
>>> bits as
>>> +     padding bits if they are always either a padding bit or fall
>>> outside a
>>> +     fields size for all fields in the union.  */
>>> +      field = TYPE_FIELDS (arg_type);
>>> +      while (field)
>>> +    {
>>> +      uint32_t padding_bits_to_clear_t[NUM_ARG_REGS]
>>> +        = {0U, 0U, 0U, 0U};
>>> +      int last_used_bit_t = *last_used_bit;
>>> +      regno_t = *regno;
>>> +      field_t = TREE_TYPE (field);
>>> +
>>> +      /* If the field's type is either a record or a union make sure to
>>> +         compute their padding bits too.  */
>>> +      if (RECORD_OR_UNION_TYPE_P (field_t))
>>> +        not_to_clear_reg_mask
>>> +          |= comp_not_to_clear_mask_str_un (field_t, &regno_t,
>>> +                        &padding_bits_to_clear_t[0],
>>> +                        starting_bit, &last_used_bit_t);
>>> +      else
>>> +        {
>>> +          field_size = TREE_INT_CST_ELT (DECL_SIZE (field), 0);
>>> +          regno_t = (field_size / 32) + *regno;
>>> +          last_used_bit_t = (starting_bit + field_size) % 32;
>>> +        }
>>> +
>>> +      for (i = *regno; i < regno_t; i++)
>>> +        {
>>> +          /* For all but the last register used by this field only keep
>>> the
>>> +         padding bits that were padding bits in this field.  */
>>> +          padding_bits_to_clear_res[i] &= padding_bits_to_clear_t[i];
>>> +        }
>>> +
>>> +        /* For the last register, keep all padding bits that were
>>> padding
>>> +           bits in this field and any padding bits that are still valid
>>> +           as padding bits but fall outside of this field's size.  */
>>> +        mask = (UINT32_MAX - ((uint32_t) 1 << last_used_bit_t)) + 1;
>>> +        padding_bits_to_clear_res[regno_t]
>>> +          &= padding_bits_to_clear_t[regno_t] | mask;
>>> +
>>> +      /* Update the maximum size of the fields in terms of registers
>>> used
>>> +         ('max_reg') and the 'last_used_bit' in said register.  */
>>> +      if (max_reg < regno_t)
>>> +        {
>>> +          max_reg = regno_t;
>>> +          max_bit = last_used_bit_t;
>>> +        }
>>> +      else if (max_reg == regno_t && max_bit < last_used_bit_t)
>>> +        max_bit = last_used_bit_t;
>>> +
>>> +      field = TREE_CHAIN (field);
>>> +    }
>>> +
>>> +      /* Update the current padding_bits_to_clear using the
>>> intersection of the
>>> +     padding bits of all the fields.  */
>>> +      for (i=*regno; i < max_reg; i++)
>>> +    padding_bits_to_clear[i] |= padding_bits_to_clear_res[i];
>>> +
>>>
>>> watch the spacing in the 'for' definition.
>>>
>>>   +      /* Do not keep trailing padding bits, we do not know yet
>>> whether
>>> this
>>> +     is the end of the argument.  */
>>> +      mask = ((uint32_t) 1 << max_bit) - 1;
>>> +      padding_bits_to_clear[max_reg]
>>> +    |= padding_bits_to_clear_res[max_reg] & mask;
>>> +
>>> +      *regno = max_reg;
>>> +      *last_used_bit = max_bit;
>>> +    }
>>> +  else
>>> +    /* This function should only be used for structs and unions.  */
>>> +    gcc_unreachable ();
>>> +
>>> +  return not_to_clear_reg_mask;
>>> +}
>>> +
>>> +/* In the context of ARMv8-M Security Extensions, this function is used
>>> for both
>>> +   'cmse_nonsecure_call' and 'cmse_nonsecure_entry' functions to
>>> compute what
>>> +   registers are used when returning or passing arguments, which is
>>> then
>>> +   returned as a mask.  It will also compute a mask to indicate
>>> padding/unused
>>> +   bits for each of these registers, and passes this through the
>>> +   PADDING_BITS_TO_CLEAR pointer.  The tree of the argument type is
>>> passed in
>>> +   ARG_TYPE, the rtl representation of the argument is passed in
>>> ARG_RTX and
>>> +   the starting register used to pass this argument or return value is
>>> passed
>>> +   in REGNO.  It makes use of 'comp_not_to_clear_mask_str_un' to
>>> compute these
>>> +   for struct and union types.  */
>>> +
>>> +static unsigned HOST_WIDE_INT
>>> +compute_not_to_clear_mask (tree arg_type, rtx arg_rtx, int regno,
>>> +                 uint32_t * padding_bits_to_clear)
>>> +
>>> +{
>>> +  int last_used_bit = 0;
>>> +  unsigned HOST_WIDE_INT not_to_clear_mask;
>>> +
>>> +  if (RECORD_OR_UNION_TYPE_P (arg_type))
>>> +    {
>>> +      not_to_clear_mask
>>> +    = comp_not_to_clear_mask_str_un (arg_type, &regno,
>>> +                     padding_bits_to_clear, 0,
>>> +                     &last_used_bit);
>>> +
>>> +
>>> +      /* If the 'last_used_bit' is not zero, that means we are still
>>> using a
>>> +     part of the last 'regno'.  In such cases we must clear the
>>> trailing
>>> +     bits.  Otherwise we are not using regno and we should mark it
>>> as to
>>> +     clear.  */
>>> +      if (last_used_bit != 0)
>>> +    *(padding_bits_to_clear + regno)
>>> +      |= UINT32_MAX - ((uint32_t) 1 << last_used_bit) + 1;
>>>
>>> padding_bits_to_clear[regno] |= ...
>>>
>>> +      else
>>> +    not_to_clear_mask &= ~(HOST_WIDE_INT_1U << regno);
>>> +    }
>>> +  else
>>> +    {
>>> +      not_to_clear_mask = 0;
>>> +      /* We are not dealing with structs nor unions.  So these
>>> arguments may be
>>> +     passed in floating point registers too.  In some cases a
>>> BLKmode is
>>> +     used when returning or passing arguments in multiple VFP
>>> registers.  */
>>> +      if (GET_MODE (arg_rtx) == BLKmode)
>>> +    {
>>> +      int i, arg_regs;
>>> +      rtx reg;
>>> +
>>> +      /* This should really only occur when dealing with the hard-float
>>> +         ABI.  */
>>> +      gcc_assert (TARGET_HARD_FLOAT_ABI);
>>> +
>>> +      for (i = 0; i < XVECLEN (arg_rtx, 0); i++)
>>> +        {
>>> +          reg = XEXP (XVECEXP (arg_rtx, 0, i), 0);
>>> +          gcc_assert (REG_P (reg));
>>> +
>>> +          not_to_clear_mask |= HOST_WIDE_INT_1U << REGNO (reg);
>>> +
>>> +          /* If we are dealing with DF mode, make sure we don't
>>> +         clear either of the registers it addresses.  */
>>> +          arg_regs = ARM_NUM_REGS (GET_MODE (reg));
>>>
>>> Better assert here that you're indeed dealing with DFmode and/or you
>>> have 2 registers.
>>>
>> The current code actually works for larger modes too. I don't think code
>> will ever be generated with larger types, but why assert if it works
>> anyway?
> 
> ok, no need to assert here then.
> I suppose it doesn't add much if the code handles the other modes fine.
> 
> Kyrill
> 
Hi,

Reworked comments. No change to ChangeLogs.

Cheers,
Andre

[PATCHv2,4/7,ARM,V8M] ARMv8-M Security Extension's cmse_nonsecure_entry: clear registers

Commit Message

Comments

Patch