diff mbox series

vect: Make vect_check_gather_scatter reject offsets that aren't multiples of BITS_PER_UNIT [PR107346]

Message ID 129db1b0-0d2a-b768-bc80-9f73d665e8f8@arm.com
State New
Headers show
Series vect: Make vect_check_gather_scatter reject offsets that aren't multiples of BITS_PER_UNIT [PR107346] | expand

Commit Message

Andre Vieira (lists) Oct. 21, 2022, 4:42 p.m. UTC
Hi,

The ada failure reported in the PR was being caused by 
vect_check_gather_scatter failing to deal with bit offsets that weren't 
multiples of BITS_PER_UNIT. This patch makes vect_check_gather_scatter 
reject memory accesses with such offsets.

Bootstrapped and regression tested on aarch64 and x86_64.

I wasn't sure whether I should add a new Ada test that shows the same 
failure without the bitfield lowering, I suspect this is such a rare 
form of data-structure that is why no other tests have highlighted the 
failure. Let me know if you would like me to add it still, the change is 
quite simple, just change the Int24 -> Int32 type in the structure. The 
'thing' that causes the failure is the 4-bit member inside the packed 
structure before the field we access, giving it a 4-bit offset. I 
attempted but failed to create a C test using __attribute__((packed)).

Kind Regards,
Andre

gcc/ChangeLog:

         PR tree-optimization/107346
         * tree-vect-data-refs.cc (vect_check_gather_scatter): Reject 
offsets that aren't
         multiples of BITS_PER_UNIT.

Comments

Richard Biener Oct. 24, 2022, 7:17 a.m. UTC | #1
On Fri, 21 Oct 2022, Andre Vieira (lists) wrote:

> Hi,
> 
> The ada failure reported in the PR was being caused by
> vect_check_gather_scatter failing to deal with bit offsets that weren't
> multiples of BITS_PER_UNIT. This patch makes vect_check_gather_scatter reject
> memory accesses with such offsets.
> 
> Bootstrapped and regression tested on aarch64 and x86_64.
> 
> I wasn't sure whether I should add a new Ada test that shows the same failure
> without the bitfield lowering, I suspect this is such a rare form of
> data-structure that is why no other tests have highlighted the failure. Let me
> know if you would like me to add it still, the change is quite simple, just
> change the Int24 -> Int32 type in the structure. The 'thing' that causes the
> failure is the 4-bit member inside the packed structure before the field we
> access, giving it a 4-bit offset. I attempted but failed to create a C test
> using __attribute__((packed)).

Can you check why vect_find_stmt_data_reference doesn't trip on the

  if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF
      && DECL_BIT_FIELD (TREE_OPERAND (DR_REF (dr), 1)))
    {
      free_data_ref (dr);
      return opt_result::failure_at (stmt,
                                     "not vectorized:"
                                     " statement is an unsupported"
                                     " bitfield access %G", stmt);
    }

?  I think we should amend this check and I guess that
checking multiple_p on DECL_FIELD_BIT_OFFSET should be enough?

Eric - the docs of DECL_BIT_FIELD are vague enough "must be accessed
specially" but ISTR it might eventually only apply to the fields
(bit) size and not it's position.  OTOH the Ada frontend might not
be too careful in setting this flag for bit-packed structs?

Richard.

> Kind Regards,
> Andre
> 
> gcc/ChangeLog:
> 
>         PR tree-optimization/107346
>         * tree-vect-data-refs.cc (vect_check_gather_scatter): Reject 
> offsets that aren't
>         multiples of BITS_PER_UNIT.
>
Eric Botcazou Oct. 24, 2022, 8:31 a.m. UTC | #2
> Eric - the docs of DECL_BIT_FIELD are vague enough "must be accessed
> specially" but ISTR it might eventually only apply to the fields
> (bit) size and not it's position.  OTOH the Ada frontend might not
> be too careful in setting this flag for bit-packed structs?

It sets the flag when the alignment or size of the field does not match that 
of its type, which indeed means that it needs to be accessed specially.
Andre Vieira (lists) Oct. 24, 2022, 10:31 a.m. UTC | #3
On 24/10/2022 08:17, Richard Biener wrote:
>
> Can you check why vect_find_stmt_data_reference doesn't trip on the
>
>    if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF
>        && DECL_BIT_FIELD (TREE_OPERAND (DR_REF (dr), 1)))
>      {
>        free_data_ref (dr);
>        return opt_result::failure_at (stmt,
>                                       "not vectorized:"
>                                       " statement is an unsupported"
>                                       " bitfield access %G", stmt);
>      }

It used to, which is why this test didn't trigger the error before my 
patch, but we lower it to BIT_FIELD_REFs in ifcvt now so it is no longer 
a DECL_BIT_FIELD.

But that is a red-herring, if you change the test structure's 'type 
Int24 is mod 2**24;' to 'type Int24 is mod 2**32;', thus making the 
field we access a normal 32-bit integer, the field no longer is a 
DECL_BIT_FIELD and thus my lowering does nothing. However, you will 
still get the failure because the field before it is a packed 4-bit 
field, making the offset to the field we are accessing less than 
BITS_PER_UNIT.

> ?  I think we should amend this check and I guess that
> checking multiple_p on DECL_FIELD_BIT_OFFSET should be enough?
That won't work either, unless we do the same walk-through the full 
access as we do in get_inner_reference.

Let me elaborate, the 'offending' stmt here is:
_ifc__23 = (*x_7(D))[_1].b.D.3707;

And the struct in question is:
package Loop_Optimization23_Pkg is
   type Nibble is mod 2**4;
   type Int24  is mod 2**24;
   type StructA is record
     a : Nibble;
     b : Int24;
   end record;
   pragma Pack(StructA);
   type StructB is record
     a : Nibble;
     b : StructA;
   end record;
   pragma Pack(StructB);
   type ArrayOfStructB is array(0..100) of StructB;
   procedure Foo (X : in out ArrayOfStructB);
end Loop_Optimization23_Pkg;

That D.3707 is the 'container'm i.e. the DECL_BIT_FIELD_REPRESENTATIVE 
of the original bitfield of type Int24.
So in vect_find_stmt_data_reference , the dr is: (*x_7(D))[_1].b.D.3707 and
TREE_OPERAND (DR_REF (dr), 1): D.3707,
which has DECL_FIELD_BIT_OFFSET: 0

So that check would also pass. However, get_inner_reference, walks the 
full access and comes across '.b', the member access for StructA inside 
StructB, that has DECL_FIELD_BIT_OFFSET: 4
Which is where we get into trouble. So to catch that here, we would need 
to do the same type of walking through all the member accesses, like 
get_inner_reference does.
Richard Biener Oct. 24, 2022, 12:46 p.m. UTC | #4
On Mon, 24 Oct 2022, Andre Vieira (lists) wrote:

> 
> On 24/10/2022 08:17, Richard Biener wrote:
> >
> > Can you check why vect_find_stmt_data_reference doesn't trip on the
> >
> >    if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF
> >        && DECL_BIT_FIELD (TREE_OPERAND (DR_REF (dr), 1)))
> >      {
> >        free_data_ref (dr);
> >        return opt_result::failure_at (stmt,
> >                                       "not vectorized:"
> >                                       " statement is an unsupported"
> >                                       " bitfield access %G", stmt);
> >      }
> 
> It used to, which is why this test didn't trigger the error before my patch,
> but we lower it to BIT_FIELD_REFs in ifcvt now so it is no longer a
> DECL_BIT_FIELD.
> 
> But that is a red-herring, if you change the test structure's 'type Int24 is
> mod 2**24;' to 'type Int24 is mod 2**32;', thus making the field we access a
> normal 32-bit integer, the field no longer is a DECL_BIT_FIELD and thus my
> lowering does nothing. However, you will still get the failure because the
> field before it is a packed 4-bit field, making the offset to the field we are
> accessing less than BITS_PER_UNIT.

Hmm, so the _intent_ of DECL_BIT_FIELD_REPRESENTATIVE is to definitely
_not_ be a DECL_BIT_FIELD (well, that's the whole point!).   So this
shows an issue with setting up DECL_BIT_FIELD_REPRESENTATIVE?  Of course
for a type with an alignment less than BITS_PER_UNIT (is StructB actually
such a type?) there cannot be a representative that isn't, so maybe
we should then set DECL_BIT_FIELD on it with a condition like Eric
mentions?

> > ?  I think we should amend this check and I guess that
> > checking multiple_p on DECL_FIELD_BIT_OFFSET should be enough?
> That won't work either, unless we do the same walk-through the full access as
> we do in get_inner_reference.

I suppose we should not "if-convert" bit field accesses with a
DECL_BIT_FIELD representative.  There isn't any benefit doing that
(not for general bitfield lowering either).

Richard.

> Let me elaborate, the 'offending' stmt here is:
> _ifc__23 = (*x_7(D))[_1].b.D.3707;
> 
> And the struct in question is:
> package Loop_Optimization23_Pkg is
>   type Nibble is mod 2**4;
>   type Int24  is mod 2**24;
>   type StructA is record
>     a : Nibble;
>     b : Int24;
>   end record;
>   pragma Pack(StructA);
>   type StructB is record
>     a : Nibble;
>     b : StructA;
>   end record;
>   pragma Pack(StructB);
>   type ArrayOfStructB is array(0..100) of StructB;
>   procedure Foo (X : in out ArrayOfStructB);
> end Loop_Optimization23_Pkg;
> 
> That D.3707 is the 'container'm i.e. the DECL_BIT_FIELD_REPRESENTATIVE of the
> original bitfield of type Int24.
> So in vect_find_stmt_data_reference , the dr is: (*x_7(D))[_1].b.D.3707 and
> TREE_OPERAND (DR_REF (dr), 1): D.3707,
> which has DECL_FIELD_BIT_OFFSET: 0
> 
> So that check would also pass. However, get_inner_reference, walks the full
> access and comes across '.b', the member access for StructA inside StructB,
> that has DECL_FIELD_BIT_OFFSET: 4
> Which is where we get into trouble. So to catch that here, we would need to do
> the same type of walking through all the member accesses, like
> get_inner_reference does.
> 
> 
>
Andre Vieira (lists) Oct. 24, 2022, 1:24 p.m. UTC | #5
On 24/10/2022 13:46, Richard Biener wrote:
> On Mon, 24 Oct 2022, Andre Vieira (lists) wrote:
>
>> On 24/10/2022 08:17, Richard Biener wrote:
>>> Can you check why vect_find_stmt_data_reference doesn't trip on the
>>>
>>>     if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF
>>>         && DECL_BIT_FIELD (TREE_OPERAND (DR_REF (dr), 1)))
>>>       {
>>>         free_data_ref (dr);
>>>         return opt_result::failure_at (stmt,
>>>                                        "not vectorized:"
>>>                                        " statement is an unsupported"
>>>                                        " bitfield access %G", stmt);
>>>       }
>> It used to, which is why this test didn't trigger the error before my patch,
>> but we lower it to BIT_FIELD_REFs in ifcvt now so it is no longer a
>> DECL_BIT_FIELD.
>>
>> But that is a red-herring, if you change the test structure's 'type Int24 is
>> mod 2**24;' to 'type Int24 is mod 2**32;', thus making the field we access a
>> normal 32-bit integer, the field no longer is a DECL_BIT_FIELD and thus my
>> lowering does nothing. However, you will still get the failure because the
>> field before it is a packed 4-bit field, making the offset to the field we are
>> accessing less than BITS_PER_UNIT.
> Hmm, so the _intent_ of DECL_BIT_FIELD_REPRESENTATIVE is to definitely
> _not_ be a DECL_BIT_FIELD (well, that's the whole point!).   So this
> shows an issue with setting up DECL_BIT_FIELD_REPRESENTATIVE?  Of course
> for a type with an alignment less than BITS_PER_UNIT (is StructB actually
> such a type?) there cannot be a representative that isn't, so maybe
> we should then set DECL_BIT_FIELD on it with a condition like Eric
> mentions?
I could do this, but it would not resolve the latent issue as I could 
still reproduce it without using any of the bitfield lowering code, see 
below.
>
>>> ?  I think we should amend this check and I guess that
>>> checking multiple_p on DECL_FIELD_BIT_OFFSET should be enough?
>> That won't work either, unless we do the same walk-through the full access as
>> we do in get_inner_reference.
> I suppose we should not "if-convert" bit field accesses with a
> DECL_BIT_FIELD representative.  There isn't any benefit doing that
> (not for general bitfield lowering either).
Changing if-convert would merely change this testcase but we could still 
trigger using a different structure type, changing the size of Int24 to 
32 bits rather than 24:
package Loop_Optimization23_Pkg is
   type Nibble is mod 2**4;
   type Int24  is mod 2**32;  -- Changed this from 24->32
   type StructA is record
     a : Nibble;
     b : Int24;
   end record;
   pragma Pack(StructA);
   type StructB is record
     a : Nibble;
     b : StructA;
   end record;
   pragma Pack(StructB);
   type ArrayOfStructB is array(0..100) of StructB;
   procedure Foo (X : in out ArrayOfStructB);
end Loop_Optimization23_Pkg;

This would yield a DR_REF (dr): (*x_7(D))[_1].b.b  where the last 'b' 
isn't a DECL_BIT_FIELD anymore, but the first one still is and still has 
the non-multiple of BITS_PER_UNIT offset. Thus passing the 
vect_find_stmt_data_reference check and triggering the 
vect_check_gather_scatter failure. So unless we go and make sure we 
always set the DECL_BIT_FIELD on all subsequent accesses of a 
DECL_BIT_FIELD 'struct' (which is odd enough on its own) then we are 
better off catching the issue in vect_check_gather_scatter ?
Richard Biener Oct. 24, 2022, 1:29 p.m. UTC | #6
On Mon, 24 Oct 2022, Andre Vieira (lists) wrote:

> 
> On 24/10/2022 13:46, Richard Biener wrote:
> > On Mon, 24 Oct 2022, Andre Vieira (lists) wrote:
> >
> >> On 24/10/2022 08:17, Richard Biener wrote:
> >>> Can you check why vect_find_stmt_data_reference doesn't trip on the
> >>>
> >>>     if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF
> >>>         && DECL_BIT_FIELD (TREE_OPERAND (DR_REF (dr), 1)))
> >>>       {
> >>>         free_data_ref (dr);
> >>>         return opt_result::failure_at (stmt,
> >>>                                        "not vectorized:"
> >>>                                        " statement is an unsupported"
> >>>                                        " bitfield access %G", stmt);
> >>>       }
> >> It used to, which is why this test didn't trigger the error before my
> >> patch,
> >> but we lower it to BIT_FIELD_REFs in ifcvt now so it is no longer a
> >> DECL_BIT_FIELD.
> >>
> >> But that is a red-herring, if you change the test structure's 'type Int24
> >> is
> >> mod 2**24;' to 'type Int24 is mod 2**32;', thus making the field we access
> >> a
> >> normal 32-bit integer, the field no longer is a DECL_BIT_FIELD and thus my
> >> lowering does nothing. However, you will still get the failure because the
> >> field before it is a packed 4-bit field, making the offset to the field we
> >> are
> >> accessing less than BITS_PER_UNIT.
> > Hmm, so the _intent_ of DECL_BIT_FIELD_REPRESENTATIVE is to definitely
> > _not_ be a DECL_BIT_FIELD (well, that's the whole point!).   So this
> > shows an issue with setting up DECL_BIT_FIELD_REPRESENTATIVE?  Of course
> > for a type with an alignment less than BITS_PER_UNIT (is StructB actually
> > such a type?) there cannot be a representative that isn't, so maybe
> > we should then set DECL_BIT_FIELD on it with a condition like Eric
> > mentions?
> I could do this, but it would not resolve the latent issue as I could still
> reproduce it without using any of the bitfield lowering code, see below.
> >
> >>> ?  I think we should amend this check and I guess that
> >>> checking multiple_p on DECL_FIELD_BIT_OFFSET should be enough?
> >> That won't work either, unless we do the same walk-through the full access
> >> as
> >> we do in get_inner_reference.
> > I suppose we should not "if-convert" bit field accesses with a
> > DECL_BIT_FIELD representative.  There isn't any benefit doing that
> > (not for general bitfield lowering either).
> Changing if-convert would merely change this testcase but we could still
> trigger using a different structure type, changing the size of Int24 to 32
> bits rather than 24:
> package Loop_Optimization23_Pkg is
>   type Nibble is mod 2**4;
>   type Int24  is mod 2**32;  -- Changed this from 24->32
>   type StructA is record
>     a : Nibble;
>     b : Int24;
>   end record;
>   pragma Pack(StructA);
>   type StructB is record
>     a : Nibble;
>     b : StructA;
>   end record;
>   pragma Pack(StructB);
>   type ArrayOfStructB is array(0..100) of StructB;
>   procedure Foo (X : in out ArrayOfStructB);
> end Loop_Optimization23_Pkg;
> 
> This would yield a DR_REF (dr): (*x_7(D))[_1].b.b  where the last 'b' isn't a
> DECL_BIT_FIELD anymore, but the first one still is and still has the
> non-multiple of BITS_PER_UNIT offset. Thus passing the
> vect_find_stmt_data_reference check and triggering the
> vect_check_gather_scatter failure. So unless we go and make sure we always set
> the DECL_BIT_FIELD on all subsequent accesses of a DECL_BIT_FIELD 'struct'
> (which is odd enough on its own) then we are better off catching the issue in
> vect_check_gather_scatter ?

But it's not only an issue with scatter-gather, other load/store handling
assumes it can create a pointer to the start of the access and thus
requires BITS_PER_UNIT alignment for each of them.  So we need to fail
at data-ref analysis somehow.

Richard.
Andre Vieira (lists) Oct. 28, 2022, 1:43 p.m. UTC | #7
On 24/10/2022 14:29, Richard Biener wrote:
> On Mon, 24 Oct 2022, Andre Vieira (lists) wrote:
>
>> Changing if-convert would merely change this testcase but we could still
>> trigger using a different structure type, changing the size of Int24 to 32
>> bits rather than 24:
>> package Loop_Optimization23_Pkg is
>>    type Nibble is mod 2**4;
>>    type Int24  is mod 2**32;  -- Changed this from 24->32
>>    type StructA is record
>>      a : Nibble;
>>      b : Int24;
>>    end record;
>>    pragma Pack(StructA);
>>    type StructB is record
>>      a : Nibble;
>>      b : StructA;
>>    end record;
>>    pragma Pack(StructB);
>>    type ArrayOfStructB is array(0..100) of StructB;
>>    procedure Foo (X : in out ArrayOfStructB);
>> end Loop_Optimization23_Pkg;
>>
>> This would yield a DR_REF (dr): (*x_7(D))[_1].b.b  where the last 'b' isn't a
>> DECL_BIT_FIELD anymore, but the first one still is and still has the
>> non-multiple of BITS_PER_UNIT offset. Thus passing the
>> vect_find_stmt_data_reference check and triggering the
>> vect_check_gather_scatter failure. So unless we go and make sure we always set
>> the DECL_BIT_FIELD on all subsequent accesses of a DECL_BIT_FIELD 'struct'
>> (which is odd enough on its own) then we are better off catching the issue in
>> vect_check_gather_scatter ?
> But it's not only an issue with scatter-gather, other load/store handling
> assumes it can create a pointer to the start of the access and thus
> requires BITS_PER_UNIT alignment for each of them.  So we need to fail
> at data-ref analysis somehow.
>
> Richard.

Sorry for the delay on this, had some other things come in between. 
After our IRC discussion I believe we agreed that it would be neater to 
check this in vect_check_gather_scatter as I did in the original patch 
in https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604139.html
The main reasons being that to check earlier we'd need to walk the 
DR_REF to look for any FIELD_DECL that has DECL_BIT_FIELD set and we 
decided against that.

Can you confirm the original patch is OK for trunk?

Kind regards,
Andre
Richard Biener Oct. 28, 2022, 1:46 p.m. UTC | #8
On Fri, 28 Oct 2022, Andre Vieira (lists) wrote:

> 
> On 24/10/2022 14:29, Richard Biener wrote:
> > On Mon, 24 Oct 2022, Andre Vieira (lists) wrote:
> >
> >> Changing if-convert would merely change this testcase but we could still
> >> trigger using a different structure type, changing the size of Int24 to 32
> >> bits rather than 24:
> >> package Loop_Optimization23_Pkg is
> >>    type Nibble is mod 2**4;
> >>    type Int24  is mod 2**32;  -- Changed this from 24->32
> >>    type StructA is record
> >>      a : Nibble;
> >>      b : Int24;
> >>    end record;
> >>    pragma Pack(StructA);
> >>    type StructB is record
> >>      a : Nibble;
> >>      b : StructA;
> >>    end record;
> >>    pragma Pack(StructB);
> >>    type ArrayOfStructB is array(0..100) of StructB;
> >>    procedure Foo (X : in out ArrayOfStructB);
> >> end Loop_Optimization23_Pkg;
> >>
> >> This would yield a DR_REF (dr): (*x_7(D))[_1].b.b  where the last 'b' isn't
> >> a
> >> DECL_BIT_FIELD anymore, but the first one still is and still has the
> >> non-multiple of BITS_PER_UNIT offset. Thus passing the
> >> vect_find_stmt_data_reference check and triggering the
> >> vect_check_gather_scatter failure. So unless we go and make sure we always
> >> set
> >> the DECL_BIT_FIELD on all subsequent accesses of a DECL_BIT_FIELD 'struct'
> >> (which is odd enough on its own) then we are better off catching the issue
> >> in
> >> vect_check_gather_scatter ?
> > But it's not only an issue with scatter-gather, other load/store handling
> > assumes it can create a pointer to the start of the access and thus
> > requires BITS_PER_UNIT alignment for each of them.  So we need to fail
> > at data-ref analysis somehow.
> >
> > Richard.
> 
> Sorry for the delay on this, had some other things come in between. After our
> IRC discussion I believe we agreed that it would be neater to check this in
> vect_check_gather_scatter as I did in the original patch in
> https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604139.html
> The main reasons being that to check earlier we'd need to walk the DR_REF to
> look for any FIELD_DECL that has DECL_BIT_FIELD set and we decided against
> that.
> 
> Can you confirm the original patch is OK for trunk?

Yes.

Thanks,
Richard.

> Kind regards,
> Andre
diff mbox series

Patch

diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index 4a23d6172aaa12ad7049dc626e5c4afbd5ca3f74..6c892791bd4c39f672add4e4c22a9d7835e292d6 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -4016,6 +4016,11 @@  vect_check_gather_scatter (stmt_vec_info stmt_info, loop_vec_info loop_vinfo,
   if (reversep)
     return false;
 
+  /* PR 107346.  Packed structs can have fields at offsets that are not
+     multiples of BITS_PER_UNIT.  Do not use gather/scatters in such cases.  */
+  if (!multiple_p (pbitpos, BITS_PER_UNIT))
+    return false;
+
   poly_int64 pbytepos = exact_div (pbitpos, BITS_PER_UNIT);
 
   if (TREE_CODE (base) == MEM_REF)