Message ID | 129db1b0-0d2a-b768-bc80-9f73d665e8f8@arm.com |
---|---|
State | New |
Headers | show |
Series | vect: Make vect_check_gather_scatter reject offsets that aren't multiples of BITS_PER_UNIT [PR107346] | expand |
On Fri, 21 Oct 2022, Andre Vieira (lists) wrote: > Hi, > > The ada failure reported in the PR was being caused by > vect_check_gather_scatter failing to deal with bit offsets that weren't > multiples of BITS_PER_UNIT. This patch makes vect_check_gather_scatter reject > memory accesses with such offsets. > > Bootstrapped and regression tested on aarch64 and x86_64. > > I wasn't sure whether I should add a new Ada test that shows the same failure > without the bitfield lowering, I suspect this is such a rare form of > data-structure that is why no other tests have highlighted the failure. Let me > know if you would like me to add it still, the change is quite simple, just > change the Int24 -> Int32 type in the structure. The 'thing' that causes the > failure is the 4-bit member inside the packed structure before the field we > access, giving it a 4-bit offset. I attempted but failed to create a C test > using __attribute__((packed)). Can you check why vect_find_stmt_data_reference doesn't trip on the if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF && DECL_BIT_FIELD (TREE_OPERAND (DR_REF (dr), 1))) { free_data_ref (dr); return opt_result::failure_at (stmt, "not vectorized:" " statement is an unsupported" " bitfield access %G", stmt); } ? I think we should amend this check and I guess that checking multiple_p on DECL_FIELD_BIT_OFFSET should be enough? Eric - the docs of DECL_BIT_FIELD are vague enough "must be accessed specially" but ISTR it might eventually only apply to the fields (bit) size and not it's position. OTOH the Ada frontend might not be too careful in setting this flag for bit-packed structs? Richard. > Kind Regards, > Andre > > gcc/ChangeLog: > > PR tree-optimization/107346 > * tree-vect-data-refs.cc (vect_check_gather_scatter): Reject > offsets that aren't > multiples of BITS_PER_UNIT. >
> Eric - the docs of DECL_BIT_FIELD are vague enough "must be accessed > specially" but ISTR it might eventually only apply to the fields > (bit) size and not it's position. OTOH the Ada frontend might not > be too careful in setting this flag for bit-packed structs? It sets the flag when the alignment or size of the field does not match that of its type, which indeed means that it needs to be accessed specially.
On 24/10/2022 08:17, Richard Biener wrote: > > Can you check why vect_find_stmt_data_reference doesn't trip on the > > if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF > && DECL_BIT_FIELD (TREE_OPERAND (DR_REF (dr), 1))) > { > free_data_ref (dr); > return opt_result::failure_at (stmt, > "not vectorized:" > " statement is an unsupported" > " bitfield access %G", stmt); > } It used to, which is why this test didn't trigger the error before my patch, but we lower it to BIT_FIELD_REFs in ifcvt now so it is no longer a DECL_BIT_FIELD. But that is a red-herring, if you change the test structure's 'type Int24 is mod 2**24;' to 'type Int24 is mod 2**32;', thus making the field we access a normal 32-bit integer, the field no longer is a DECL_BIT_FIELD and thus my lowering does nothing. However, you will still get the failure because the field before it is a packed 4-bit field, making the offset to the field we are accessing less than BITS_PER_UNIT. > ? I think we should amend this check and I guess that > checking multiple_p on DECL_FIELD_BIT_OFFSET should be enough? That won't work either, unless we do the same walk-through the full access as we do in get_inner_reference. Let me elaborate, the 'offending' stmt here is: _ifc__23 = (*x_7(D))[_1].b.D.3707; And the struct in question is: package Loop_Optimization23_Pkg is type Nibble is mod 2**4; type Int24 is mod 2**24; type StructA is record a : Nibble; b : Int24; end record; pragma Pack(StructA); type StructB is record a : Nibble; b : StructA; end record; pragma Pack(StructB); type ArrayOfStructB is array(0..100) of StructB; procedure Foo (X : in out ArrayOfStructB); end Loop_Optimization23_Pkg; That D.3707 is the 'container'm i.e. the DECL_BIT_FIELD_REPRESENTATIVE of the original bitfield of type Int24. So in vect_find_stmt_data_reference , the dr is: (*x_7(D))[_1].b.D.3707 and TREE_OPERAND (DR_REF (dr), 1): D.3707, which has DECL_FIELD_BIT_OFFSET: 0 So that check would also pass. However, get_inner_reference, walks the full access and comes across '.b', the member access for StructA inside StructB, that has DECL_FIELD_BIT_OFFSET: 4 Which is where we get into trouble. So to catch that here, we would need to do the same type of walking through all the member accesses, like get_inner_reference does.
On Mon, 24 Oct 2022, Andre Vieira (lists) wrote: > > On 24/10/2022 08:17, Richard Biener wrote: > > > > Can you check why vect_find_stmt_data_reference doesn't trip on the > > > > if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF > > && DECL_BIT_FIELD (TREE_OPERAND (DR_REF (dr), 1))) > > { > > free_data_ref (dr); > > return opt_result::failure_at (stmt, > > "not vectorized:" > > " statement is an unsupported" > > " bitfield access %G", stmt); > > } > > It used to, which is why this test didn't trigger the error before my patch, > but we lower it to BIT_FIELD_REFs in ifcvt now so it is no longer a > DECL_BIT_FIELD. > > But that is a red-herring, if you change the test structure's 'type Int24 is > mod 2**24;' to 'type Int24 is mod 2**32;', thus making the field we access a > normal 32-bit integer, the field no longer is a DECL_BIT_FIELD and thus my > lowering does nothing. However, you will still get the failure because the > field before it is a packed 4-bit field, making the offset to the field we are > accessing less than BITS_PER_UNIT. Hmm, so the _intent_ of DECL_BIT_FIELD_REPRESENTATIVE is to definitely _not_ be a DECL_BIT_FIELD (well, that's the whole point!). So this shows an issue with setting up DECL_BIT_FIELD_REPRESENTATIVE? Of course for a type with an alignment less than BITS_PER_UNIT (is StructB actually such a type?) there cannot be a representative that isn't, so maybe we should then set DECL_BIT_FIELD on it with a condition like Eric mentions? > > ? I think we should amend this check and I guess that > > checking multiple_p on DECL_FIELD_BIT_OFFSET should be enough? > That won't work either, unless we do the same walk-through the full access as > we do in get_inner_reference. I suppose we should not "if-convert" bit field accesses with a DECL_BIT_FIELD representative. There isn't any benefit doing that (not for general bitfield lowering either). Richard. > Let me elaborate, the 'offending' stmt here is: > _ifc__23 = (*x_7(D))[_1].b.D.3707; > > And the struct in question is: > package Loop_Optimization23_Pkg is > type Nibble is mod 2**4; > type Int24 is mod 2**24; > type StructA is record > a : Nibble; > b : Int24; > end record; > pragma Pack(StructA); > type StructB is record > a : Nibble; > b : StructA; > end record; > pragma Pack(StructB); > type ArrayOfStructB is array(0..100) of StructB; > procedure Foo (X : in out ArrayOfStructB); > end Loop_Optimization23_Pkg; > > That D.3707 is the 'container'm i.e. the DECL_BIT_FIELD_REPRESENTATIVE of the > original bitfield of type Int24. > So in vect_find_stmt_data_reference , the dr is: (*x_7(D))[_1].b.D.3707 and > TREE_OPERAND (DR_REF (dr), 1): D.3707, > which has DECL_FIELD_BIT_OFFSET: 0 > > So that check would also pass. However, get_inner_reference, walks the full > access and comes across '.b', the member access for StructA inside StructB, > that has DECL_FIELD_BIT_OFFSET: 4 > Which is where we get into trouble. So to catch that here, we would need to do > the same type of walking through all the member accesses, like > get_inner_reference does. > > >
On 24/10/2022 13:46, Richard Biener wrote: > On Mon, 24 Oct 2022, Andre Vieira (lists) wrote: > >> On 24/10/2022 08:17, Richard Biener wrote: >>> Can you check why vect_find_stmt_data_reference doesn't trip on the >>> >>> if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF >>> && DECL_BIT_FIELD (TREE_OPERAND (DR_REF (dr), 1))) >>> { >>> free_data_ref (dr); >>> return opt_result::failure_at (stmt, >>> "not vectorized:" >>> " statement is an unsupported" >>> " bitfield access %G", stmt); >>> } >> It used to, which is why this test didn't trigger the error before my patch, >> but we lower it to BIT_FIELD_REFs in ifcvt now so it is no longer a >> DECL_BIT_FIELD. >> >> But that is a red-herring, if you change the test structure's 'type Int24 is >> mod 2**24;' to 'type Int24 is mod 2**32;', thus making the field we access a >> normal 32-bit integer, the field no longer is a DECL_BIT_FIELD and thus my >> lowering does nothing. However, you will still get the failure because the >> field before it is a packed 4-bit field, making the offset to the field we are >> accessing less than BITS_PER_UNIT. > Hmm, so the _intent_ of DECL_BIT_FIELD_REPRESENTATIVE is to definitely > _not_ be a DECL_BIT_FIELD (well, that's the whole point!). So this > shows an issue with setting up DECL_BIT_FIELD_REPRESENTATIVE? Of course > for a type with an alignment less than BITS_PER_UNIT (is StructB actually > such a type?) there cannot be a representative that isn't, so maybe > we should then set DECL_BIT_FIELD on it with a condition like Eric > mentions? I could do this, but it would not resolve the latent issue as I could still reproduce it without using any of the bitfield lowering code, see below. > >>> ? I think we should amend this check and I guess that >>> checking multiple_p on DECL_FIELD_BIT_OFFSET should be enough? >> That won't work either, unless we do the same walk-through the full access as >> we do in get_inner_reference. > I suppose we should not "if-convert" bit field accesses with a > DECL_BIT_FIELD representative. There isn't any benefit doing that > (not for general bitfield lowering either). Changing if-convert would merely change this testcase but we could still trigger using a different structure type, changing the size of Int24 to 32 bits rather than 24: package Loop_Optimization23_Pkg is type Nibble is mod 2**4; type Int24 is mod 2**32; -- Changed this from 24->32 type StructA is record a : Nibble; b : Int24; end record; pragma Pack(StructA); type StructB is record a : Nibble; b : StructA; end record; pragma Pack(StructB); type ArrayOfStructB is array(0..100) of StructB; procedure Foo (X : in out ArrayOfStructB); end Loop_Optimization23_Pkg; This would yield a DR_REF (dr): (*x_7(D))[_1].b.b where the last 'b' isn't a DECL_BIT_FIELD anymore, but the first one still is and still has the non-multiple of BITS_PER_UNIT offset. Thus passing the vect_find_stmt_data_reference check and triggering the vect_check_gather_scatter failure. So unless we go and make sure we always set the DECL_BIT_FIELD on all subsequent accesses of a DECL_BIT_FIELD 'struct' (which is odd enough on its own) then we are better off catching the issue in vect_check_gather_scatter ?
On Mon, 24 Oct 2022, Andre Vieira (lists) wrote: > > On 24/10/2022 13:46, Richard Biener wrote: > > On Mon, 24 Oct 2022, Andre Vieira (lists) wrote: > > > >> On 24/10/2022 08:17, Richard Biener wrote: > >>> Can you check why vect_find_stmt_data_reference doesn't trip on the > >>> > >>> if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF > >>> && DECL_BIT_FIELD (TREE_OPERAND (DR_REF (dr), 1))) > >>> { > >>> free_data_ref (dr); > >>> return opt_result::failure_at (stmt, > >>> "not vectorized:" > >>> " statement is an unsupported" > >>> " bitfield access %G", stmt); > >>> } > >> It used to, which is why this test didn't trigger the error before my > >> patch, > >> but we lower it to BIT_FIELD_REFs in ifcvt now so it is no longer a > >> DECL_BIT_FIELD. > >> > >> But that is a red-herring, if you change the test structure's 'type Int24 > >> is > >> mod 2**24;' to 'type Int24 is mod 2**32;', thus making the field we access > >> a > >> normal 32-bit integer, the field no longer is a DECL_BIT_FIELD and thus my > >> lowering does nothing. However, you will still get the failure because the > >> field before it is a packed 4-bit field, making the offset to the field we > >> are > >> accessing less than BITS_PER_UNIT. > > Hmm, so the _intent_ of DECL_BIT_FIELD_REPRESENTATIVE is to definitely > > _not_ be a DECL_BIT_FIELD (well, that's the whole point!). So this > > shows an issue with setting up DECL_BIT_FIELD_REPRESENTATIVE? Of course > > for a type with an alignment less than BITS_PER_UNIT (is StructB actually > > such a type?) there cannot be a representative that isn't, so maybe > > we should then set DECL_BIT_FIELD on it with a condition like Eric > > mentions? > I could do this, but it would not resolve the latent issue as I could still > reproduce it without using any of the bitfield lowering code, see below. > > > >>> ? I think we should amend this check and I guess that > >>> checking multiple_p on DECL_FIELD_BIT_OFFSET should be enough? > >> That won't work either, unless we do the same walk-through the full access > >> as > >> we do in get_inner_reference. > > I suppose we should not "if-convert" bit field accesses with a > > DECL_BIT_FIELD representative. There isn't any benefit doing that > > (not for general bitfield lowering either). > Changing if-convert would merely change this testcase but we could still > trigger using a different structure type, changing the size of Int24 to 32 > bits rather than 24: > package Loop_Optimization23_Pkg is > type Nibble is mod 2**4; > type Int24 is mod 2**32; -- Changed this from 24->32 > type StructA is record > a : Nibble; > b : Int24; > end record; > pragma Pack(StructA); > type StructB is record > a : Nibble; > b : StructA; > end record; > pragma Pack(StructB); > type ArrayOfStructB is array(0..100) of StructB; > procedure Foo (X : in out ArrayOfStructB); > end Loop_Optimization23_Pkg; > > This would yield a DR_REF (dr): (*x_7(D))[_1].b.b where the last 'b' isn't a > DECL_BIT_FIELD anymore, but the first one still is and still has the > non-multiple of BITS_PER_UNIT offset. Thus passing the > vect_find_stmt_data_reference check and triggering the > vect_check_gather_scatter failure. So unless we go and make sure we always set > the DECL_BIT_FIELD on all subsequent accesses of a DECL_BIT_FIELD 'struct' > (which is odd enough on its own) then we are better off catching the issue in > vect_check_gather_scatter ? But it's not only an issue with scatter-gather, other load/store handling assumes it can create a pointer to the start of the access and thus requires BITS_PER_UNIT alignment for each of them. So we need to fail at data-ref analysis somehow. Richard.
On 24/10/2022 14:29, Richard Biener wrote: > On Mon, 24 Oct 2022, Andre Vieira (lists) wrote: > >> Changing if-convert would merely change this testcase but we could still >> trigger using a different structure type, changing the size of Int24 to 32 >> bits rather than 24: >> package Loop_Optimization23_Pkg is >> type Nibble is mod 2**4; >> type Int24 is mod 2**32; -- Changed this from 24->32 >> type StructA is record >> a : Nibble; >> b : Int24; >> end record; >> pragma Pack(StructA); >> type StructB is record >> a : Nibble; >> b : StructA; >> end record; >> pragma Pack(StructB); >> type ArrayOfStructB is array(0..100) of StructB; >> procedure Foo (X : in out ArrayOfStructB); >> end Loop_Optimization23_Pkg; >> >> This would yield a DR_REF (dr): (*x_7(D))[_1].b.b where the last 'b' isn't a >> DECL_BIT_FIELD anymore, but the first one still is and still has the >> non-multiple of BITS_PER_UNIT offset. Thus passing the >> vect_find_stmt_data_reference check and triggering the >> vect_check_gather_scatter failure. So unless we go and make sure we always set >> the DECL_BIT_FIELD on all subsequent accesses of a DECL_BIT_FIELD 'struct' >> (which is odd enough on its own) then we are better off catching the issue in >> vect_check_gather_scatter ? > But it's not only an issue with scatter-gather, other load/store handling > assumes it can create a pointer to the start of the access and thus > requires BITS_PER_UNIT alignment for each of them. So we need to fail > at data-ref analysis somehow. > > Richard. Sorry for the delay on this, had some other things come in between. After our IRC discussion I believe we agreed that it would be neater to check this in vect_check_gather_scatter as I did in the original patch in https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604139.html The main reasons being that to check earlier we'd need to walk the DR_REF to look for any FIELD_DECL that has DECL_BIT_FIELD set and we decided against that. Can you confirm the original patch is OK for trunk? Kind regards, Andre
On Fri, 28 Oct 2022, Andre Vieira (lists) wrote: > > On 24/10/2022 14:29, Richard Biener wrote: > > On Mon, 24 Oct 2022, Andre Vieira (lists) wrote: > > > >> Changing if-convert would merely change this testcase but we could still > >> trigger using a different structure type, changing the size of Int24 to 32 > >> bits rather than 24: > >> package Loop_Optimization23_Pkg is > >> type Nibble is mod 2**4; > >> type Int24 is mod 2**32; -- Changed this from 24->32 > >> type StructA is record > >> a : Nibble; > >> b : Int24; > >> end record; > >> pragma Pack(StructA); > >> type StructB is record > >> a : Nibble; > >> b : StructA; > >> end record; > >> pragma Pack(StructB); > >> type ArrayOfStructB is array(0..100) of StructB; > >> procedure Foo (X : in out ArrayOfStructB); > >> end Loop_Optimization23_Pkg; > >> > >> This would yield a DR_REF (dr): (*x_7(D))[_1].b.b where the last 'b' isn't > >> a > >> DECL_BIT_FIELD anymore, but the first one still is and still has the > >> non-multiple of BITS_PER_UNIT offset. Thus passing the > >> vect_find_stmt_data_reference check and triggering the > >> vect_check_gather_scatter failure. So unless we go and make sure we always > >> set > >> the DECL_BIT_FIELD on all subsequent accesses of a DECL_BIT_FIELD 'struct' > >> (which is odd enough on its own) then we are better off catching the issue > >> in > >> vect_check_gather_scatter ? > > But it's not only an issue with scatter-gather, other load/store handling > > assumes it can create a pointer to the start of the access and thus > > requires BITS_PER_UNIT alignment for each of them. So we need to fail > > at data-ref analysis somehow. > > > > Richard. > > Sorry for the delay on this, had some other things come in between. After our > IRC discussion I believe we agreed that it would be neater to check this in > vect_check_gather_scatter as I did in the original patch in > https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604139.html > The main reasons being that to check earlier we'd need to walk the DR_REF to > look for any FIELD_DECL that has DECL_BIT_FIELD set and we decided against > that. > > Can you confirm the original patch is OK for trunk? Yes. Thanks, Richard. > Kind regards, > Andre
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc index 4a23d6172aaa12ad7049dc626e5c4afbd5ca3f74..6c892791bd4c39f672add4e4c22a9d7835e292d6 100644 --- a/gcc/tree-vect-data-refs.cc +++ b/gcc/tree-vect-data-refs.cc @@ -4016,6 +4016,11 @@ vect_check_gather_scatter (stmt_vec_info stmt_info, loop_vec_info loop_vinfo, if (reversep) return false; + /* PR 107346. Packed structs can have fields at offsets that are not + multiples of BITS_PER_UNIT. Do not use gather/scatters in such cases. */ + if (!multiple_p (pbitpos, BITS_PER_UNIT)) + return false; + poly_int64 pbytepos = exact_div (pbitpos, BITS_PER_UNIT); if (TREE_CODE (base) == MEM_REF)