mbox series

[V2,0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

Message ID 20230804194431.993958-1-qing.zhao@oracle.com
Headers show
Series New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) | expand

Message

Qing Zhao Aug. 4, 2023, 7:44 p.m. UTC
Hi,

This is the 2nd version of the patch, per our discussion based on the
review comments for the 1st version, the major changes in this version
are:

1. change the name "element_count" to "counted_by";
2. change the parameter for the attribute from a STRING to an
Identifier;
3. Add logic and testing cases to handle anonymous structure/unions;
4. Clarify documentation to permit the situation when the allocation
size is larger than what's specified by "counted_by", at the same time,
it's user's error if allocation size is smaller than what's specified by
"counted_by";
5. Add a complete testing case for using counted_by attribute in
__builtin_dynamic_object_size when there is mismatch between the
allocation size and the value of "counted_by", the expecting behavior
for each case and the explanation on why in the comments. 

As discussed, I plan to add two more separate patch sets after this initial
patch set is approved and committed.

set 1. A new warning option and a new sanitizer option for the user error
       when the allocation size is smaller than the value of "counted_by".
set 2. An improvement to __builtin_dynamic_object_size  for the following
       case:

struct A
{
size_t foo;
int array[] __attribute__((counted_by (foo)));
};

extern struct fix * alloc_buf ();

int main ()
{
struct fix *p = alloc_buf ();
__builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * sizeof(int);
  /* with the current algorithm, it’s UNKNOWN */ 
__builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * sizeof(int);
  /* with the current algorithm, it’s UNKNOWN */
}

Bootstrapped and regression tested on both aarch64 and X86, no issue.

Please see more details on the description of this work on:

https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619708.html

Okay for committing?

thanks.

Qing

Qing Zhao (3):
  Provide counted_by attribute to flexible array member field (PR108896)
  Use the counted_by atribute info in builtin object size [PR108896]
  Use the counted_by attribute information in bound sanitizer[PR108896]

 gcc/c-family/c-attribs.cc                     |  54 ++++-
 gcc/c-family/c-common.cc                      |  13 ++
 gcc/c-family/c-common.h                       |   1 +
 gcc/c-family/c-ubsan.cc                       |  16 ++
 gcc/c/c-decl.cc                               |  79 +++++--
 gcc/doc/extend.texi                           |  73 +++++++
 .../gcc.dg/flex-array-counted-by-2.c          |  74 +++++++
 .../gcc.dg/flex-array-counted-by-3.c          | 197 ++++++++++++++++++
 gcc/testsuite/gcc.dg/flex-array-counted-by.c  |  40 ++++
 .../ubsan/flex-array-counted-by-bounds-2.c    |  27 +++
 .../ubsan/flex-array-counted-by-bounds.c      |  46 ++++
 gcc/tree-object-size.cc                       |  37 +++-
 gcc/tree.cc                                   | 133 ++++++++++++
 gcc/tree.h                                    |  15 ++
 14 files changed, 780 insertions(+), 25 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-2.c
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by.c
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c

Comments

Kees Cook Aug. 7, 2023, 4:16 p.m. UTC | #1
On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote:
> This is the 2nd version of the patch, per our discussion based on the
> review comments for the 1st version, the major changes in this version
> are:

Thanks for the update!

> 
> 1. change the name "element_count" to "counted_by";
> 2. change the parameter for the attribute from a STRING to an
> Identifier;
> 3. Add logic and testing cases to handle anonymous structure/unions;
> 4. Clarify documentation to permit the situation when the allocation
> size is larger than what's specified by "counted_by", at the same time,
> it's user's error if allocation size is smaller than what's specified by
> "counted_by";
> 5. Add a complete testing case for using counted_by attribute in
> __builtin_dynamic_object_size when there is mismatch between the
> allocation size and the value of "counted_by", the expecting behavior
> for each case and the explanation on why in the comments. 

All the "normal" test cases I have are passing; this is wonderful! :)

I'm still seeing unexpected situations when I've intentionally set
counted_by to be smaller than alloc_size, but I assume it's due to not
yet having the patch you mention below.

> As discussed, I plan to add two more separate patch sets after this initial
> patch set is approved and committed.
> 
> set 1. A new warning option and a new sanitizer option for the user error
>        when the allocation size is smaller than the value of "counted_by".
> set 2. An improvement to __builtin_dynamic_object_size  for the following
>        case:
> 
> struct A
> {
> size_t foo;
> int array[] __attribute__((counted_by (foo)));
> };
> 
> extern struct fix * alloc_buf ();
> 
> int main ()
> {
> struct fix *p = alloc_buf ();
> __builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * sizeof(int);
>   /* with the current algorithm, it’s UNKNOWN */ 
> __builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * sizeof(int);
>   /* with the current algorithm, it’s UNKNOWN */
> }

Should the above be bdos instead of bos?

> Bootstrapped and regression tested on both aarch64 and X86, no issue.

I've updated the Linux kernel's macros for the name change and done
build tests with my first pass at "easy" cases for adding counted_by:
https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/counted_by&id=adc5b3cb48a049563dc673f348eab7b6beba8a9b

Everything is working as expected. :)

-Kees
Qing Zhao Aug. 7, 2023, 4:33 p.m. UTC | #2
> On Aug 7, 2023, at 12:16 PM, Kees Cook <keescook@chromium.org> wrote:
> 
> On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote:
>> This is the 2nd version of the patch, per our discussion based on the
>> review comments for the 1st version, the major changes in this version
>> are:
> 
> Thanks for the update!
> 
>> 
>> 1. change the name "element_count" to "counted_by";
>> 2. change the parameter for the attribute from a STRING to an
>> Identifier;
>> 3. Add logic and testing cases to handle anonymous structure/unions;
>> 4. Clarify documentation to permit the situation when the allocation
>> size is larger than what's specified by "counted_by", at the same time,
>> it's user's error if allocation size is smaller than what's specified by
>> "counted_by";
>> 5. Add a complete testing case for using counted_by attribute in
>> __builtin_dynamic_object_size when there is mismatch between the
>> allocation size and the value of "counted_by", the expecting behavior
>> for each case and the explanation on why in the comments. 
> 
> All the "normal" test cases I have are passing; this is wonderful! :)
> 
> I'm still seeing unexpected situations when I've intentionally set
> counted_by to be smaller than alloc_size, but I assume it's due to not
> yet having the patch you mention below.

What’s the testing case for the one that failed? 
If it’s 

__builtin_dynamic_object_size(p->array, 0/2) without the allocation information in the routine, 
then with the current algorithm, gcc cannot deduce the size for the whole object.

If not such case, let me know.

> 
>> As discussed, I plan to add two more separate patch sets after this initial
>> patch set is approved and committed.
>> 
>> set 1. A new warning option and a new sanitizer option for the user error
>>       when the allocation size is smaller than the value of "counted_by".
>> set 2. An improvement to __builtin_dynamic_object_size  for the following
>>       case:
>> 
>> struct A
>> {
>> size_t foo;
>> int array[] __attribute__((counted_by (foo)));
>> };
>> 
>> extern struct fix * alloc_buf ();
>> 
>> int main ()
>> {
>> struct fix *p = alloc_buf ();
>> __builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * sizeof(int);
>>  /* with the current algorithm, it’s UNKNOWN */ 
>> __builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * sizeof(int);
>>  /* with the current algorithm, it’s UNKNOWN */
>> }
> 
> Should the above be bdos instead of bos?

Yes, sorry for the typo.  -:)
> 
>> Bootstrapped and regression tested on both aarch64 and X86, no issue.
> 
> I've updated the Linux kernel's macros for the name change and done
> build tests with my first pass at "easy" cases for adding counted_by:
> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/counted_by&id=adc5b3cb48a049563dc673f348eab7b6beba8a9b

> 
> Everything is working as expected. :)

Thanks a lot for the testing!

Qing
> 
> -Kees
> 
> -- 
> Kees Cook
> 
>
Martin Uecker Aug. 8, 2023, 2:54 p.m. UTC | #3
I am sure this has been discussed before, but seeing that you
test for a specific formula, let me point out the following:

There at least three different size expression which could
make sense. Consider

short foo { int a; short b; char t[]; }; 

Most people seem to use

sizeof(struct foo) + N * sizeof(foo->t);

which for N == 3 yields 11 bytes on x86-64 because the formula
adds the padding of the original struct. There is an example
in the  C standard that uses this formula.


But he minimum size of an object which stores N elements is

max(sizeof (struct s), offsetof(struct s, t[n]))

which is 9 bytes. 

This is what clang uses for statically allocated objects with
initialization, while GCC uses the rule above (11 bytes). But 
bdos / bos  then returns the smaller size of 9 which is a bit
confusing.


https://godbolt.org/z/K1hvaK1ns

https://github.com/llvm/llvm-project/issues/62929
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956


Then there is also the size of a similar array where the FAM
is replaced with an array of static size:

struct foo { int a; short b; char t[3]; }; 

This would make the most sense to me, but it has 12 bytes
because the padding is according to the usual alignment
rules.


Martin



Am Montag, dem 07.08.2023 um 09:16 -0700 schrieb Kees Cook:
> On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote:
> > This is the 2nd version of the patch, per our discussion based on the
> > review comments for the 1st version, the major changes in this version
> > are:
> 
> Thanks for the update!
> 
> > 
> > 1. change the name "element_count" to "counted_by";
> > 2. change the parameter for the attribute from a STRING to an
> > Identifier;
> > 3. Add logic and testing cases to handle anonymous structure/unions;
> > 4. Clarify documentation to permit the situation when the allocation
> > size is larger than what's specified by "counted_by", at the same time,
> > it's user's error if allocation size is smaller than what's specified by
> > "counted_by";
> > 5. Add a complete testing case for using counted_by attribute in
> > __builtin_dynamic_object_size when there is mismatch between the
> > allocation size and the value of "counted_by", the expecting behavior
> > for each case and the explanation on why in the comments. 
> 
> All the "normal" test cases I have are passing; this is wonderful! :)
> 
> I'm still seeing unexpected situations when I've intentionally set
> counted_by to be smaller than alloc_size, but I assume it's due to not
> yet having the patch you mention below.
> 
> > As discussed, I plan to add two more separate patch sets after this initial
> > patch set is approved and committed.
> > 
> > set 1. A new warning option and a new sanitizer option for the user error
> >        when the allocation size is smaller than the value of "counted_by".
> > set 2. An improvement to __builtin_dynamic_object_size  for the following
> >        case:
> > 
> > struct A
> > {
> > size_t foo;
> > int array[] __attribute__((counted_by (foo)));
> > };
> > 
> > extern struct fix * alloc_buf ();
> > 
> > int main ()
> > {
> > struct fix *p = alloc_buf ();
> > __builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * sizeof(int);
> >   /* with the current algorithm, it’s UNKNOWN */ 
> > __builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * sizeof(int);
> >   /* with the current algorithm, it’s UNKNOWN */
> > }
> 
> Should the above be bdos instead of bos?
> 
> > Bootstrapped and regression tested on both aarch64 and X86, no issue.
> 
> I've updated the Linux kernel's macros for the name change and done
> build tests with my first pass at "easy" cases for adding counted_by:
> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/counted_by&id=adc5b3cb48a049563dc673f348eab7b6beba8a9b
> 
> Everything is working as expected. :)
> 
> -Kees
>
Michael Matz Aug. 8, 2023, 4:18 p.m. UTC | #4
Hello,

On Tue, 8 Aug 2023, Martin Uecker via Gcc-patches wrote:

> There at least three different size expression which could
> make sense. Consider
> 
> short foo { int a; short b; char t[]; }; 
> 
> Most people seem to use
> 
> sizeof(struct foo) + N * sizeof(foo->t);
> 
> which for N == 3 yields 11 bytes on x86-64 because the formula
> adds the padding of the original struct. There is an example
> in the  C standard that uses this formula.
> 
> 
> But he minimum size of an object which stores N elements is
> 
> max(sizeof (struct s), offsetof(struct s, t[n]))
> 
> which is 9 bytes. 

But should it really?  struct sizes should usually be a multiple of a 
structs alignment, and if int is 4-aligned only size 12 would be correct. 
I don't see why one should deviate from that general rule for sizes for 
FAM structs.  (I realize that the above is not about sizeof(), but rather 
bdos/bos, but I think it's fairly useful that both agree when possbible).

Say, if you were to allocate an array of such struct foos, each having 3 
elements in foo.t[].  You need to add 12 bytes to go to the next array 
element, not 9.  (Of course arrays of FAM-structs are somewhat meh, but 
still).  It's true that you would be allowed to rely on only 9 bytes of 
those 12 bytes (the rest being padding), so perhaps it's really the right 
answer for __bos, but, hmm, 12 seems to be "more correct" to my guts :-)


Ciao,
Michael.
Kees Cook Aug. 8, 2023, 7:58 p.m. UTC | #5
On Tue, Aug 08, 2023 at 04:54:46PM +0200, Martin Uecker wrote:
> 
> 
> I am sure this has been discussed before, but seeing that you
> test for a specific formula, let me point out the following:
> 
> There at least three different size expression which could
> make sense. Consider
> 
> short foo { int a; short b; char t[]; }; 
> 
> Most people seem to use
> 
> sizeof(struct foo) + N * sizeof(foo->t);

I think this contains a typo. The above should be:

	sizeof(struct foo) + N * sizeof(*foo->t);

> which for N == 3 yields 11 bytes on x86-64 because the formula
> adds the padding of the original struct. There is an example
> in the  C standard that uses this formula.
> 
> 
> But he minimum size of an object which stores N elements is
> 
> max(sizeof (struct s), offsetof(struct s, t[n]))
> 
> which is 9 bytes. 
> 
> This is what clang uses for statically allocated objects with
> initialization, while GCC uses the rule above (11 bytes). But 
> bdos / bos  then returns the smaller size of 9 which is a bit
> confusing.
> 
> https://godbolt.org/z/K1hvaK1ns

And to add to the mess here, Clang used to get this wrong for statically
initialized flexible array members (returning 8):

https://godbolt.org/z/Tee96dazs

But that was fixed recently when I noticed it a few weeks ago.

> https://github.com/llvm/llvm-project/issues/62929
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956
> 
> 
> Then there is also the size of a similar array where the FAM
> is replaced with an array of static size:
> 
> struct foo { int a; short b; char t[3]; }; 
> 
> This would make the most sense to me, but it has 12 bytes
> because the padding is according to the usual alignment
> rules.

Hm, in my thinking, FAMs are individually padded, so since they don't
"count" to the struct size, I think this is "as expected".

Note that the Linux kernel explicitly chooses to potentially over-estimate
for FAM allocations since there has been inconsistent sizes used depend
on the style of the prior fake FAM usage, the use of unions, etc.

i.e. our macro is, effectively:

#define struct_size(ptr, member, count)	\
	(sizeof(*ptr) + (count) * sizeof(*ptr->member))

since the other case can lead to truncated sizes:

#define struct_size(ptr, member, count) \
	(offsetof(typeof(*ptr), member) + (count) * sizeof(*ptr->member))


struct foo {
	int a, b, c;
	int count;
	union {
		struct {
			char small;
			char char_fam[];
		};
		struct {
			int large;
			int int_fam[];
		};
	};
};

https://godbolt.org/z/b1v74Mzhd

i.e.:
struct_size(x, char_fam, 1) < sizeof(*x)

so accessing "large" fails, etc. Yes, it's all horrible, but we have to
deal with this kind of thing in the kernel. :(
Qing Zhao Aug. 9, 2023, 4:05 p.m. UTC | #6
Hi, Martin,

Thanks for raising this issue.

Although this is an old FAM related issue that does not relate to my current patch 
(and might need to be resolved in a separate patch).  I think that it’s necessary to have
more discussion on this old issue and resolve it. 

The first thing that I’d like to confirm is:

What the exact memory layout for the following structure x?

struct foo { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } };

And the key that is confusing me is, where should the field “t” start? 

A.  Starting at offset 8 as the following:

a          	4-bytes
b          	2-bytes
padding   2-bytes
t           	3-bytes

B. Starting at offset 6 as the following:

a          	4-bytes
b          	2-bytes
t           	3-bytes

From my understanding, A should be correct. However, when I debugged into gcc, I found that the following

tree
byte_position (const_tree field)
{
  return byte_from_pos (DECL_FIELD_OFFSET (field),
                        DECL_FIELD_BIT_OFFSET (field));
}

Returned 6 for the field “t”:

498	  tree pos = byte_position (last);
(gdb) n
499	  size = fold_build2 (PLUS_EXPR, TREE_TYPE (size), pos, compsize);
(gdb) call debug_generic_expr(pos)
6

So, I suspect that there is a bug in GCC which incorrectly represent the offset of the FAM field in the IR.

Thanks.

Qing
> On Aug 8, 2023, at 10:54 AM, Martin Uecker <uecker@tugraz.at> wrote:
> 
> 
> 
> I am sure this has been discussed before, but seeing that you
> test for a specific formula, let me point out the following:
> 
> There at least three different size expression which could
> make sense. Consider
> 
> short foo { int a; short b; char t[]; }; 
> 
> Most people seem to use
> 
> sizeof(struct foo) + N * sizeof(foo->t);
> 
> which for N == 3 yields 11 bytes on x86-64 because the formula
> adds the padding of the original struct. There is an example
> in the  C standard that uses this formula.
> 
> 
> But he minimum size of an object which stores N elements is
> 
> max(sizeof (struct s), offsetof(struct s, t[n]))
> 
> which is 9 bytes. 
> 
> This is what clang uses for statically allocated objects with
> initialization, while GCC uses the rule above (11 bytes). But 
> bdos / bos  then returns the smaller size of 9 which is a bit
> confusing.
> 
> 
> https://godbolt.org/z/K1hvaK1ns
> 
> https://github.com/llvm/llvm-project/issues/62929
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956
> 
> 
> Then there is also the size of a similar array where the FAM
> is replaced with an array of static size:
> 
> struct foo { int a; short b; char t[3]; }; 
> 
> This would make the most sense to me, but it has 12 bytes
> because the padding is according to the usual alignment
> rules.
> 
> 
> Martin
> 
> 
> 
> Am Montag, dem 07.08.2023 um 09:16 -0700 schrieb Kees Cook:
>> On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote:
>>> This is the 2nd version of the patch, per our discussion based on the
>>> review comments for the 1st version, the major changes in this version
>>> are:
>> 
>> Thanks for the update!
>> 
>>> 
>>> 1. change the name "element_count" to "counted_by";
>>> 2. change the parameter for the attribute from a STRING to an
>>> Identifier;
>>> 3. Add logic and testing cases to handle anonymous structure/unions;
>>> 4. Clarify documentation to permit the situation when the allocation
>>> size is larger than what's specified by "counted_by", at the same time,
>>> it's user's error if allocation size is smaller than what's specified by
>>> "counted_by";
>>> 5. Add a complete testing case for using counted_by attribute in
>>> __builtin_dynamic_object_size when there is mismatch between the
>>> allocation size and the value of "counted_by", the expecting behavior
>>> for each case and the explanation on why in the comments. 
>> 
>> All the "normal" test cases I have are passing; this is wonderful! :)
>> 
>> I'm still seeing unexpected situations when I've intentionally set
>> counted_by to be smaller than alloc_size, but I assume it's due to not
>> yet having the patch you mention below.
>> 
>>> As discussed, I plan to add two more separate patch sets after this initial
>>> patch set is approved and committed.
>>> 
>>> set 1. A new warning option and a new sanitizer option for the user error
>>>        when the allocation size is smaller than the value of "counted_by".
>>> set 2. An improvement to __builtin_dynamic_object_size  for the following
>>>        case:
>>> 
>>> struct A
>>> {
>>> size_t foo;
>>> int array[] __attribute__((counted_by (foo)));
>>> };
>>> 
>>> extern struct fix * alloc_buf ();
>>> 
>>> int main ()
>>> {
>>> struct fix *p = alloc_buf ();
>>> __builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * sizeof(int);
>>>   /* with the current algorithm, it’s UNKNOWN */ 
>>> __builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * sizeof(int);
>>>   /* with the current algorithm, it’s UNKNOWN */
>>> }
>> 
>> Should the above be bdos instead of bos?
>> 
>>> Bootstrapped and regression tested on both aarch64 and X86, no issue.
>> 
>> I've updated the Linux kernel's macros for the name change and done
>> build tests with my first pass at "easy" cases for adding counted_by:
>> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/counted_by&id=adc5b3cb48a049563dc673f348eab7b6beba8a9b
>> 
>> Everything is working as expected. :)
>> 
>> -Kees
>> 
> 
> -- 
> Univ.-Prof. Dr. rer. nat. Martin Uecker
> Graz University of Technology
> Institute of Biomedical Imaging
Michael Matz Aug. 9, 2023, 4:21 p.m. UTC | #7
Hello,

On Wed, 9 Aug 2023, Qing Zhao wrote:

> Although this is an old FAM related issue that does not relate to my current patch 
> (and might need to be resolved in a separate patch).  I think that it’s necessary to have
> more discussion on this old issue and resolve it. 
> 
> The first thing that I’d like to confirm is:
> 
> What the exact memory layout for the following structure x?
> 
> struct foo { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } };
> 
> And the key that is confusing me is, where should the field “t” start? 
> 
> A.  Starting at offset 8 as the following:
> 
> a          	4-bytes
> b          	2-bytes
> padding   2-bytes
> t           	3-bytes

Why should there be padding before 't'?  It's a char array (FAM or not), 
so it can be (and should be) placed directly after 'b'.  So ...

> B. Starting at offset 6 as the following:
> 
> a          	4-bytes
> b          	2-bytes
> t           	3-bytes

... this is the correct layout, when seen in isolation.  The discussion 
revolves around what should come after 't': if it's a non-FAM struct (with 
t[3]), then it's clear that there needs to be padding after it, so to pad 
out the whole struct to be 12 bytes long (for sizeof() purpose), as 
required by its alignment (due to the int field 'a').

So, should the equivalent FAM struct also have this sizeof()?  If no: 
there should be a good argument why it shouldn't be similar to the non-FAM 
one.

Then there is an argument that the compiler would be fine, when allocating 
a single object of such type (not as part of an array!), to only reserve 9 
bytes of space for the FAM-struct.  Then the question is: should it also 
do that for a non-FAM struct (based on the argument that the padding 
behind 't' isn't accessible, and hence doesn't need to be alloced).  I 
think it would be quite unexpected for the compiler to actually reserve 
less space than sizeof() implies, so I personally don't buy that argument.  
For FAM or non-FAM structs.

Note that if one choses to allocate less space than sizeof implies that 
this will have quite some consequences for code generation, in that 
sometimes the instruction sequences (e.g. for copying) need to be careful 
to never access tail padding that should be there in array context, but 
isn't there in single-object context.  I think this alone should make it 
clear that it's advisable that sizeof() and allocated size agree.

As in: I think sizeof for both structs should return 12, and 12 bytes 
should be reserved for objects of such types.

And then the next question is what __builtin_object_size should do with 
these: should it return the size with or without padding at end (i.e. 
could/should it return 9 even if sizeof is 12).  I can see arguments for 
both.


Ciao,
Michael.
Kees Cook Aug. 9, 2023, 7:17 p.m. UTC | #8
On Mon, Aug 07, 2023 at 04:33:13PM +0000, Qing Zhao wrote:
> What’s the testing case for the one that failed? 
> If it’s 
> 
> __builtin_dynamic_object_size(p->array, 0/2) without the allocation information in the routine, 
> then with the current algorithm, gcc cannot deduce the size for the whole object.
> 
> If not such case, let me know.

I found some more bugs in my tests (now fixed), but I'm left with a single
failure case, which is think it going to boil down to pointer/pointee
issue we discussed earlier.

Using my existing testing tool:
https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c

I see this error with the "counted_by_seen_by_bdos" case:

Expected __builtin_dynamic_object_size(p, 1) (18446744073709551615) == sizeof(*p) + p->count * sizeof(*p->array) (80)

A reduced view of the code is:

        struct annotated *p;
        int index = MAX_INDEX + unconst;

        p = alloc_annotated(index);

        EXPECT_EQ(__builtin_dynamic_object_size(p, 1), sizeof(*p) + p->count * sizeof(*p->array));

It looks like __bdos will not use the __counted_by information from the
pointee if the argument is only the pointer. i.e. this test works:

        EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1), p->count * sizeof(*p->array));

However, I thought if any part of the pointee was used (e.g. p->count,
p->array), GCC would be happy to start using the pointee details?

And, again, for the initial version at this feature, I'm fine with this
failing test being declared not a valid test. :) But I'll need some
kind of builtin that can correctly interrogate a pointer to find the
full runtime size with the assumption that pointer is valid, but that
can come later.

And as a side note, I am excited to see the very correct warnings for
bad (too late) assignment of the __counted_by member:

	p->array[0] = 0;
	p->count = 1;

array-bounds.c: In function 'invalid_assignment_order':
array-bounds.c:366:17: warning: '*p.count' is used uninitialized [-Wuninitialized]
  366 |         p->array[0] = 0;
      |         ~~~~~~~~^~~

Yay! :)

-Kees
Qing Zhao Aug. 9, 2023, 8:10 p.m. UTC | #9
> On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote:
> 
> Hello,
> 
> On Wed, 9 Aug 2023, Qing Zhao wrote:
> 
>> Although this is an old FAM related issue that does not relate to my current patch 
>> (and might need to be resolved in a separate patch).  I think that it’s necessary to have
>> more discussion on this old issue and resolve it. 
>> 
>> The first thing that I’d like to confirm is:
>> 
>> What the exact memory layout for the following structure x?
>> 
>> struct foo { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } };
>> 
>> And the key that is confusing me is, where should the field “t” start? 
>> 
>> A.  Starting at offset 8 as the following:
>> 
>> a          	4-bytes
>> b          	2-bytes
>> padding   2-bytes
>> t           	3-bytes
> 
> Why should there be padding before 't'?  It's a char array (FAM or not), 
> so it can be (and should be) placed directly after 'b'.  So ...

Yes, you are right. My mistake -:)
> 
>> B. Starting at offset 6 as the following:
>> 
>> a          	4-bytes
>> b          	2-bytes
>> t           	3-bytes
> 
> ... this is the correct layout, when seen in isolation.  The discussion 
> revolves around what should come after 't': if it's a non-FAM struct (with 
> t[3]), then it's clear that there needs to be padding after it, so to pad 
> out the whole struct to be 12 bytes long (for sizeof() purpose), as 
> required by its alignment (due to the int field 'a').

right.
> 
> So, should the equivalent FAM struct also have this sizeof()?  If no: 
> there should be a good argument why it shouldn't be similar to the non-FAM 
> one.

The sizeof() of a structure with FAM is defined as: (after I searched online,
 I think that the one in Wikipedia is the most reasonable one):
https://en.wikipedia.org/wiki/Flexible_array_member

======
Effect on struct size and padding
The sizeof operator on such a struct gives the size of the structure as if the flexible 
array member were empty. This may include padding added to accommodate the
 flexible member; the compiler is also free to re-use such padding as part of the array itself.[2]

It is common to allocate sizeof(struct) + array_len*sizeof(array element) bytes. 

This is not wrong, but it may allocate a few more bytes than necessary: the compiler
 may be re-purposing some of the padding that is included in sizeof(struct). Should 
this be a concern, macros are available[3] to compute the minimum size while 
ensuring that the compiler's padding is not disrupted. 

As the array may start in the padding before the end of the structure, its content
 should always be accessed via indexing (arr[i]) or  offsetof, not sizeof.
======

By definition, the sizeof() of a struct with FAM might not be the same as the non-FAM one. 
i.e, for the following two structures, one with FAM, the other with fixed array:

struct foo_flex { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } };
struct foo_fix {int a; short b; char t[3]; } 

With current GCC:
sizeof(foo_flex) == 8
sizeof(foo_fix) == 12

I think that the current behavior of sizeof for structure with FAM in GCC is correct. 

The major issue is what was pointed out by Martin in the previous email:

Whether using the following formula is correct to compute the allocation?

sizeof(struct foo_flex) + N * sizeof(foo->t);

As pointed out  in the wikipedia, the value computed by this formula might
 be bigger than the actual size since “sizeof(struct foo_flex)” might include 
paddings that are used as part of the array.

So the more accurate formula should be

offset(struct foo_flex, t[0]) + N * sizeof(foo->t);

With GCC, offset(struct foo_flex,t[0]) == 6, which is also correct. 


> Then there is an argument that the compiler would be fine, when allocating 
> a single object of such type (not as part of an array!), to only reserve 9 
> bytes of space for the FAM-struct.  Then the question is: should it also 
> do that for a non-FAM struct (based on the argument that the padding 
> behind 't' isn't accessible, and hence doesn't need to be alloced).  I 
> think it would be quite unexpected for the compiler to actually reserve 
> less space than sizeof() implies, so I personally don't buy that argument.  
> For FAM or non-FAM structs.

For the question, whether the compiler needs to allocate paddings after the FAM field,
 I don’t know the answer, and it’s not specified in the standard either. 
Does it matter?

> 
> Note that if one choses to allocate less space than sizeof implies that 
> this will have quite some consequences for code generation, in that 
> sometimes the instruction sequences (e.g. for copying) need to be careful 
> to never access tail padding that should be there in array context, but 
> isn't there in single-object context.  I think this alone should make it 
> clear that it's advisable that sizeof() and allocated size agree.

Sizeof by definition is return the size of the TYPE, not the size of the allocated object.

Another thing I need to point out is,  a structure with FAM cannot be an element of an array.
> 
> As in: I think sizeof for both structs should return 12, and 12 bytes 
> should be reserved for objects of such types.
> 
> And then the next question is what __builtin_object_size should do with 
> these: should it return the size with or without padding at end (i.e. 
> could/should it return 9 even if sizeof is 12).  I can see arguments for 
> both.

Currently, GCC’s __builtin_object_size use the following formula to compute the object size for
The structure with FAM:

offset(struct foo_flex, t[0]) + N * sizeof(foo->t);

I think it’s correct. 

I think that the users might need to use this formula to compute the allocation size for a structure with FAM too.

Qing

> 
> 
> Ciao,
> Michael.
Qing Zhao Aug. 9, 2023, 8:34 p.m. UTC | #10
> On Aug 8, 2023, at 10:54 AM, Martin Uecker <uecker@tugraz.at> wrote:
> 
> 
> 
> I am sure this has been discussed before, but seeing that you
> test for a specific formula, let me point out the following:
> 
> There at least three different size expression which could
> make sense. Consider
> 
> short foo { int a; short b; char t[]; }; 
> 
> Most people seem to use
> 
> sizeof(struct foo) + N * sizeof(foo->t);
> 
> which for N == 3 yields 11 bytes on x86-64 because the formula
> adds the padding of the original struct. There is an example
> in the  C standard that uses this formula.
> 
> 
> But he minimum size of an object which stores N elements is
> 
> max(sizeof (struct s), offsetof(struct s, t[n]))
> 
> which is 9 bytes. 
> 
> This is what clang uses for statically allocated objects with
> initialization, while GCC uses the rule above (11 bytes). But 
> bdos / bos  then returns the smaller size of 9 which is a bit
> confusing.

As I checked the algorithm for bos in GCC,  it uses a similar formula as the following to compute the object size:

offset(struct foo, t[0]) + N * sizeof(*foo->t);

Which seems correct to me.  (Therefore bos returns 9 for this example).
> 
> 
> https://godbolt.org/z/K1hvaK1ns
> 
> https://github.com/llvm/llvm-project/issues/62929
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956
> 
> 
> Then there is also the size of a similar array where the FAM
> is replaced with an array of static size:
> 
> struct foo { int a; short b; char t[3]; }; 
> 
> This would make the most sense to me, but it has 12 bytes
> because the padding is according to the usual alignment
> rules.

My understanding is, since a structure with FAM cannot be an element of an array, 
 the tailing padding might not be necessary?

Qing

> 
> 
> Martin
> 
> 
> 
> Am Montag, dem 07.08.2023 um 09:16 -0700 schrieb Kees Cook:
>> On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote:
>>> This is the 2nd version of the patch, per our discussion based on the
>>> review comments for the 1st version, the major changes in this version
>>> are:
>> 
>> Thanks for the update!
>> 
>>> 
>>> 1. change the name "element_count" to "counted_by";
>>> 2. change the parameter for the attribute from a STRING to an
>>> Identifier;
>>> 3. Add logic and testing cases to handle anonymous structure/unions;
>>> 4. Clarify documentation to permit the situation when the allocation
>>> size is larger than what's specified by "counted_by", at the same time,
>>> it's user's error if allocation size is smaller than what's specified by
>>> "counted_by";
>>> 5. Add a complete testing case for using counted_by attribute in
>>> __builtin_dynamic_object_size when there is mismatch between the
>>> allocation size and the value of "counted_by", the expecting behavior
>>> for each case and the explanation on why in the comments. 
>> 
>> All the "normal" test cases I have are passing; this is wonderful! :)
>> 
>> I'm still seeing unexpected situations when I've intentionally set
>> counted_by to be smaller than alloc_size, but I assume it's due to not
>> yet having the patch you mention below.
>> 
>>> As discussed, I plan to add two more separate patch sets after this initial
>>> patch set is approved and committed.
>>> 
>>> set 1. A new warning option and a new sanitizer option for the user error
>>>        when the allocation size is smaller than the value of "counted_by".
>>> set 2. An improvement to __builtin_dynamic_object_size  for the following
>>>        case:
>>> 
>>> struct A
>>> {
>>> size_t foo;
>>> int array[] __attribute__((counted_by (foo)));
>>> };
>>> 
>>> extern struct fix * alloc_buf ();
>>> 
>>> int main ()
>>> {
>>> struct fix *p = alloc_buf ();
>>> __builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * sizeof(int);
>>>   /* with the current algorithm, it’s UNKNOWN */ 
>>> __builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * sizeof(int);
>>>   /* with the current algorithm, it’s UNKNOWN */
>>> }
>> 
>> Should the above be bdos instead of bos?
>> 
>>> Bootstrapped and regression tested on both aarch64 and X86, no issue.
>> 
>> I've updated the Linux kernel's macros for the name change and done
>> build tests with my first pass at "easy" cases for adding counted_by:
>> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/counted_by&id=adc5b3cb48a049563dc673f348eab7b6beba8a9b
>> 
>> Everything is working as expected. :)
>> 
>> -Kees
>> 
> 
> -- 
> Univ.-Prof. Dr. rer. nat. Martin Uecker
> Graz University of Technology
> Institute of Biomedical Imaging
Martin Uecker Aug. 10, 2023, 6:58 a.m. UTC | #11
Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao:
> 
> > On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote:

...
> 
> By definition, the sizeof() of a struct with FAM might not be the same as the non-FAM one. 
> i.e, for the following two structures, one with FAM, the other with fixed array:
> 
> struct foo_flex { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } };
> struct foo_fix {int a; short b; char t[3]; } 
> 
> With current GCC:
> sizeof(foo_flex) == 8
> sizeof(foo_fix) == 12
> 
> I think that the current behavior of sizeof for structure with FAM in GCC is correct. 

Yes, sadly the sizeof has to be like this as required by ISO C.

> 
> The major issue is what was pointed out by Martin in the previous email:
> 
> Whether using the following formula is correct to compute the allocation?
> 
> sizeof(struct foo_flex) + N * sizeof(foo->t);

That formula is safe for allocation, but might allocate more padding
than the minimum amount and it might allocate less storage than a
similar array with fixed array.

> As pointed out  in the wikipedia, the value computed by this formula might
>  be bigger than the actual size since “sizeof(struct foo_flex)” might include 
> paddings that are used as part of the array.
> 
> So the more accurate formula should be
> 
> offset(struct foo_flex, t[0]) + N * sizeof(foo->t);
> 
> With GCC, offset(struct foo_flex,t[0]) == 6, which is also correct. 

This formula might be considered incorrect / dangerous because
it might allocate less storage than sizeof(struct foo_flex). 


https://godbolt.org/z/8accq75f3
> 
...
> > As in: I think sizeof for both structs should return 12, and 12 bytes 
> > should be reserved for objects of such types.
> > 
> > And then the next question is what __builtin_object_size should do with 
> > these: should it return the size with or without padding at end (i.e. 
> > could/should it return 9 even if sizeof is 12).  I can see arguments for 
> > both.
> 
> Currently, GCC’s __builtin_object_size use the following formula to compute the object size for
> The structure with FAM:
> 
> offset(struct foo_flex, t[0]) + N * sizeof(foo->t);
> 
> I think it’s correct. 
> 
> I think that the users might need to use this formula to compute the allocation size for a structure with FAM too.

I am not sure for the reason given above. The following
code would not work:

struct foo_flex { int a; short b; char t[]; } x;
x.a = 1;
struct foo_flex *p = malloc(sizeof(x) + x.a);
if (!p) abort();
memcpy(p, &x, sizeof(x)); // initialize struct


Martin
Michael Matz Aug. 10, 2023, 1:54 p.m. UTC | #12
Hello,

On Wed, 9 Aug 2023, Qing Zhao wrote:

> > So, should the equivalent FAM struct also have this sizeof()?  If no: 
> > there should be a good argument why it shouldn't be similar to the non-FAM 
> > one.
> 
> The sizeof() of a structure with FAM is defined as: (after I searched online,
>  I think that the one in Wikipedia is the most reasonable one):
> https://en.wikipedia.org/wiki/Flexible_array_member

Well, wikipedia has it's uses.  Here we are in language lawyering land 
together with a discussion what makes most sense in many circumstances.  
FWIW, in this case I think the cited text from wikipedia is correct in the 
sense of "not wrong" but not helpful in the sense of "good advise".

> By definition, the sizeof() of a struct with FAM might not be the same 
> as the non-FAM one. i.e, for the following two structures, one with FAM, 
> the other with fixed array:
> 
> struct foo_flex { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } };
> struct foo_fix {int a; short b; char t[3]; } 
> 
> With current GCC:
> sizeof(foo_flex) == 8
> sizeof(foo_fix) == 12
> 
> I think that the current behavior of sizeof for structure with FAM in 
> GCC is correct.

It is, yes.

> The major issue is what was pointed out by Martin in the previous email:
> 
> Whether using the following formula is correct to compute the 
> allocation?
> 
> sizeof(struct foo_flex) + N * sizeof(foo->t);
> 
> As pointed out  in the wikipedia, the value computed by this formula might
> be bigger than the actual size since “sizeof(struct foo_flex)” might include 
> paddings that are used as part of the array.

That doesn't make the formula incorrect, but rather conservatively 
correct.  If you don't want to be conservative, then yes, you can use a 
different formula if you happen to know the layout rules your compiler at 
hand uses (or the ones prescribed by an ABI, if it does that).  I think 
it would be bad advise to the general population do advertise this scheme 
as better.

> So the more accurate formula should be
> 
> offset(struct foo_flex, t[0]) + N * sizeof(foo->t);

"* sizeof(foo->t[0])", but yes.

> For the question, whether the compiler needs to allocate paddings after 
> the FAM field, I don’t know the answer, and it’s not specified in the 
> standard either. Does it matter?

It matters for two things:

1) Abstract reasons: is there as reason to deviate 
from the normal rules?  If not: it shouldn't deviate.  Future 
extensibility: while it right now is not possible to form an array 
of FMA-structs (in C!), who's to say that it may not be eventually added?  
It seems a natural enough extension of an extension, and while it has 
certain implementation problems (the "real" size of the elements needs to 
be computable, and hence be part of the array type) those could be 
overcome.  At that point you _really_ want to have the elements aligned 
naturally, be compatible with sizeof, and be the same as an individual 
object.

2) Practical reasons: codegeneration works better if the accessible sizes 
of objects are a multiple of their alignment, as often you have 
instructions that can move around alignment-sized blobs (say, words).  If 
you don't pad out objects you have to be careful to use only byte accesses 
when you get to the end of an object.

Let me ask the question in the opposite way: what would you _gain_ by not 
allocating padding?  And is that enough to deviate from the most obvious 
choices?  (Do note that e.g. global or local FMA-typed objects don't exist 
in isolation, and their surrounding objects have their own alignment 
rules, which often will lead to tail padding anyway, even if you would use 
the non-conservative size calculation; the same applies for malloc'ed 
objects).

> > Note that if one choses to allocate less space than sizeof implies that 
> > this will have quite some consequences for code generation, in that 
> > sometimes the instruction sequences (e.g. for copying) need to be careful 
> > to never access tail padding that should be there in array context, but 
> > isn't there in single-object context.  I think this alone should make it 
> > clear that it's advisable that sizeof() and allocated size agree.
> 
> Sizeof by definition is return the size of the TYPE, not the size of the 
> allocated object.

Sure.  Outside special cases like FMA it's the same, though.  And there 
sizeof does include tail padding.

> > And then the next question is what __builtin_object_size should do with 
> > these: should it return the size with or without padding at end (i.e. 
> > could/should it return 9 even if sizeof is 12).  I can see arguments for 
> > both.
> 
> Currently, GCC’s __builtin_object_size use the following formula to 
> compute the object size for The structure with FAM:
> 
> offset(struct foo_flex, t[0]) + N * sizeof(foo->t);
> 
> I think it’s correct.

See above.  It's non-conservatively correct.  And that may be the right 
choice for this builtin, considering its intended use-cases (strict 
checking of allowed access ranges) ...

> I think that the users might need to use this formula to compute the 
> allocation size for a structure with FAM too.

... but that doesn't automatically follows.  There's a difference between 
accessible memory range (for language semantic purposes) and allocated 
memory range.  I could imagine that somewhen __builtin_object_size doesn't 
include tail padding for normal non-FMA types either.  After all you can't 
rely on the contents there, though you won't get segfaults when you access 
it.  sizeof will continue to have to include it, though.  So that's a 
demonstration of why _bos is not the right thing to use for allocation 
purposes.


Ciao,
Michael.
Qing Zhao Aug. 10, 2023, 1:59 p.m. UTC | #13
> On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote:
> 
> Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao:
>> 
>>> On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote:
> 
> ...
>> 
>> By definition, the sizeof() of a struct with FAM might not be the same as the non-FAM one. 
>> i.e, for the following two structures, one with FAM, the other with fixed array:
>> 
>> struct foo_flex { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } };
>> struct foo_fix {int a; short b; char t[3]; } 
>> 
>> With current GCC:
>> sizeof(foo_flex) == 8
>> sizeof(foo_fix) == 12
>> 
>> I think that the current behavior of sizeof for structure with FAM in GCC is correct. 
> 
> Yes, sadly the sizeof has to be like this as required by ISO C.
Agreed. Yes, if the size information of the FAM can be integrated into the 
TYPE system in C standard, we will not have such issue anymore. 

> 
>> 
>> The major issue is what was pointed out by Martin in the previous email:
>> 
>> Whether using the following formula is correct to compute the allocation?
>> 
>> sizeof(struct foo_flex) + N * sizeof(foo->t);
> 
> That formula is safe for allocation, but might allocate more padding
> than the minimum amount and
Yes. 
> it might allocate less storage than a
> similar array with fixed array.
Yes. 
> 
>> As pointed out  in the wikipedia, the value computed by this formula might
>>  be bigger than the actual size since “sizeof(struct foo_flex)” might include 
>> paddings that are used as part of the array.
>> 
>> So the more accurate formula should be
>> 
>> offset(struct foo_flex, t[0]) + N * sizeof(foo->t);
>> 
>> With GCC, offset(struct foo_flex,t[0]) == 6, which is also correct. 
> 
> This formula might be considered incorrect / dangerous because
> it might allocate less storage than sizeof(struct foo_flex). 
> 
> 
> https://godbolt.org/z/8accq75f3

I see, thanks.
>> 
> ...
>>> As in: I think sizeof for both structs should return 12, and 12 bytes 
>>> should be reserved for objects of such types.
>>> 
>>> And then the next question is what __builtin_object_size should do with 
>>> these: should it return the size with or without padding at end (i.e. 
>>> could/should it return 9 even if sizeof is 12).  I can see arguments for 
>>> both.
>> 
>> Currently, GCC’s __builtin_object_size use the following formula to compute the object size for
>> The structure with FAM:
>> 
>> offset(struct foo_flex, t[0]) + N * sizeof(foo->t);
>> 
>> I think it’s correct. 
>> 
>> I think that the users might need to use this formula to compute the allocation size for a structure with FAM too.
> 
> I am not sure for the reason given above. The following
> code would not work:
> 
> struct foo_flex { int a; short b; char t[]; } x;
> x.a = 1;
> struct foo_flex *p = malloc(sizeof(x) + x.a);
> if (!p) abort();
> memcpy(p, &x, sizeof(x)); // initialize struct
> 
Okay. 
Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe)

Let me know if I still miss anything.

Thanks a lot for the explanation.

Qing
> Martin
> 
> 
>
Michael Matz Aug. 10, 2023, 2:02 p.m. UTC | #14
Hey,

On Thu, 10 Aug 2023, Martin Uecker wrote:

> > offset(struct foo_flex, t[0]) + N * sizeof(foo->t);
> > 
> > With GCC, offset(struct foo_flex,t[0]) == 6, which is also correct. 
> 
> This formula might be considered incorrect / dangerous because
> it might allocate less storage than sizeof(struct foo_flex). 

Oh indeed.  I hadn't even considered that.  That could be "fixed" with 
another max(theabove, sizeof(struct foo_flex)), but that starts to become 
silly when the obvious choice works fine.


Ciao,
Michael.
Martin Uecker Aug. 10, 2023, 2:38 p.m. UTC | #15
Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao:
> 
> > On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote:
> > 
> > Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao:
> > > 
> > > > On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote:
> > 

> > I am not sure for the reason given above. The following
> > code would not work:
> > 
> > struct foo_flex { int a; short b; char t[]; } x;
> > x.a = 1;
> > struct foo_flex *p = malloc(sizeof(x) + x.a);
> > if (!p) abort();
> > memcpy(p, &x, sizeof(x)); // initialize struct
> > 
> Okay. 
> Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe)
> 
> Let me know if I still miss anything.

The question is not only what the user should use to
allocate, but also what BDOS should return.  In my
example the user uses the sizeof() + N * sizeof
formula and the memcpy is safe, but it would be flagged
as a buffer overrun if BDOS uses the offsetof formula.

Martin
Jakub Jelinek Aug. 10, 2023, 2:42 p.m. UTC | #16
On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote:
> Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao:
> > 
> > > On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote:
> > > 
> > > Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao:
> > > > 
> > > > > On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote:
> > > 
> 
> > > I am not sure for the reason given above. The following
> > > code would not work:
> > > 
> > > struct foo_flex { int a; short b; char t[]; } x;
> > > x.a = 1;
> > > struct foo_flex *p = malloc(sizeof(x) + x.a);
> > > if (!p) abort();
> > > memcpy(p, &x, sizeof(x)); // initialize struct
> > > 
> > Okay. 
> > Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe)
> > 
> > Let me know if I still miss anything.
> 
> The question is not only what the user should use to
> allocate, but also what BDOS should return.  In my
> example the user uses the sizeof() + N * sizeof
> formula and the memcpy is safe, but it would be flagged
> as a buffer overrun if BDOS uses the offsetof formula.

BDOS/BOS (at least the 0 level) should return what is actually
allocated for the var, what size was passed to malloc and if it
is a var with flex array member with initialization what is actually the
size on the stack or in .data/.rodata etc.
And for 1 level the same unless it is just access to some element, then
it should be capped by the size of that element.

	Jakub
Martin Uecker Aug. 10, 2023, 2:47 p.m. UTC | #17
Am Donnerstag, dem 10.08.2023 um 16:42 +0200 schrieb Jakub Jelinek:
> On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote:
> > Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao:
> > > 
> > > > On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote:
> > > > 
> > > > Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao:
> > > > > 
> > > > > > On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote:
> > > > 
> > 
> > > > I am not sure for the reason given above. The following
> > > > code would not work:
> > > > 
> > > > struct foo_flex { int a; short b; char t[]; } x;
> > > > x.a = 1;
> > > > struct foo_flex *p = malloc(sizeof(x) + x.a);
> > > > if (!p) abort();
> > > > memcpy(p, &x, sizeof(x)); // initialize struct
> > > > 
> > > Okay. 
> > > Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe)
> > > 
> > > Let me know if I still miss anything.
> > 
> > The question is not only what the user should use to
> > allocate, but also what BDOS should return.  In my
> > example the user uses the sizeof() + N * sizeof
> > formula and the memcpy is safe, but it would be flagged
> > as a buffer overrun if BDOS uses the offsetof formula.
> 
> BDOS/BOS (at least the 0 level) should return what is actually
> allocated for the var, what size was passed to malloc and if it
> is a var with flex array member with initialization what is actually the
> size on the stack or in .data/.rodata etc.

Agreed.

But what about a struct with FAM with the new "counted_by" attribute
if the original allocation is not visible?

Martin

> And for 1 level the same unless it is just access to some element, then
> it should be capped by the size of that element.
>
Siddhesh Poyarekar Aug. 10, 2023, 2:58 p.m. UTC | #18
On 2023-08-10 10:47, Martin Uecker wrote:
> Am Donnerstag, dem 10.08.2023 um 16:42 +0200 schrieb Jakub Jelinek:
>> On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote:
>>> Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao:
>>>>
>>>>> On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote:
>>>>>
>>>>> Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao:
>>>>>>
>>>>>>> On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote:
>>>>>
>>>
>>>>> I am not sure for the reason given above. The following
>>>>> code would not work:
>>>>>
>>>>> struct foo_flex { int a; short b; char t[]; } x;
>>>>> x.a = 1;
>>>>> struct foo_flex *p = malloc(sizeof(x) + x.a);
>>>>> if (!p) abort();
>>>>> memcpy(p, &x, sizeof(x)); // initialize struct
>>>>>
>>>> Okay.
>>>> Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe)
>>>>
>>>> Let me know if I still miss anything.
>>>
>>> The question is not only what the user should use to
>>> allocate, but also what BDOS should return.  In my
>>> example the user uses the sizeof() + N * sizeof
>>> formula and the memcpy is safe, but it would be flagged
>>> as a buffer overrun if BDOS uses the offsetof formula.
>>
>> BDOS/BOS (at least the 0 level) should return what is actually
>> allocated for the var, what size was passed to malloc and if it
>> is a var with flex array member with initialization what is actually the
>> size on the stack or in .data/.rodata etc.
> 
> Agreed.
> 
> But what about a struct with FAM with the new "counted_by" attribute
> if the original allocation is not visible?

There's precedent for this through the __access__ attribute; __bos 
trusts what the attribute says about the allocation.

Sid
Martin Uecker Aug. 10, 2023, 3:18 p.m. UTC | #19
Am Donnerstag, dem 10.08.2023 um 10:58 -0400 schrieb Siddhesh Poyarekar:
> On 2023-08-10 10:47, Martin Uecker wrote:
> > Am Donnerstag, dem 10.08.2023 um 16:42 +0200 schrieb Jakub Jelinek:
> > > On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote:
> > > > Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao:
> > > > > 
> > > > > > On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote:
> > > > > > 
> > > > > > Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao:
> > > > > > > 
> > > > > > > > On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote:
> > > > > > 
> > > > 
> > > > > > I am not sure for the reason given above. The following
> > > > > > code would not work:
> > > > > > 
> > > > > > struct foo_flex { int a; short b; char t[]; } x;
> > > > > > x.a = 1;
> > > > > > struct foo_flex *p = malloc(sizeof(x) + x.a);
> > > > > > if (!p) abort();
> > > > > > memcpy(p, &x, sizeof(x)); // initialize struct
> > > > > > 
> > > > > Okay.
> > > > > Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe)
> > > > > 
> > > > > Let me know if I still miss anything.
> > > > 
> > > > The question is not only what the user should use to
> > > > allocate, but also what BDOS should return.  In my
> > > > example the user uses the sizeof() + N * sizeof
> > > > formula and the memcpy is safe, but it would be flagged
> > > > as a buffer overrun if BDOS uses the offsetof formula.
> > > 
> > > BDOS/BOS (at least the 0 level) should return what is actually
> > > allocated for the var, what size was passed to malloc and if it
> > > is a var with flex array member with initialization what is actually the
> > > size on the stack or in .data/.rodata etc.
> > 
> > Agreed.
> > 
> > But what about a struct with FAM with the new "counted_by" attribute
> > if the original allocation is not visible?
> 
> There's precedent for this through the __access__ attribute; __bos 
> trusts what the attribute says about the allocation.

The access attribute gives the size directly. The counted_by gives
a length for the array which needs to be translated into a size
via a formula. There are different formulas in use. The question 
is which formula should bdos trust?

Whatever you pick, if this is not consistent with the actual
allocation or use, then it will cause problems either by
breaking code or not detecting buffer overruns.

So it needs to be consistent with what GCC allocates for a
var with FAM and initialization and also the user needs to 
be told what the right choice is so that he can use the right
size for allocation and argument to memcpy / memset etc.

Martin
Qing Zhao Aug. 10, 2023, 4:28 p.m. UTC | #20
Hi, Martin,

> On Aug 10, 2023, at 11:18 AM, Martin Uecker <muecker@gwdg.de> wrote:
> 
> Am Donnerstag, dem 10.08.2023 um 10:58 -0400 schrieb Siddhesh Poyarekar:
>> On 2023-08-10 10:47, Martin Uecker wrote:
>>> Am Donnerstag, dem 10.08.2023 um 16:42 +0200 schrieb Jakub Jelinek:
>>>> On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote:
>>>>> Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao:
>>>>>> 
>>>>>>> On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote:
>>>>>>> 
>>>>>>> Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao:
>>>>>>>> 
>>>>>>>>> On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote:
>>>>>>> 
>>>>> 
>>>>>>> I am not sure for the reason given above. The following
>>>>>>> code would not work:
>>>>>>> 
>>>>>>> struct foo_flex { int a; short b; char t[]; } x;
>>>>>>> x.a = 1;
>>>>>>> struct foo_flex *p = malloc(sizeof(x) + x.a);
>>>>>>> if (!p) abort();
>>>>>>> memcpy(p, &x, sizeof(x)); // initialize struct
>>>>>>> 
>>>>>> Okay.
>>>>>> Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe)
>>>>>> 
>>>>>> Let me know if I still miss anything.
>>>>> 
>>>>> The question is not only what the user should use to
>>>>> allocate, but also what BDOS should return.  In my
>>>>> example the user uses the sizeof() + N * sizeof
>>>>> formula and the memcpy is safe, but it would be flagged
>>>>> as a buffer overrun if BDOS uses the offsetof formula.
>>>> 
>>>> BDOS/BOS (at least the 0 level) should return what is actually
>>>> allocated for the var, what size was passed to malloc and if it
>>>> is a var with flex array member with initialization what is actually the
>>>> size on the stack or in .data/.rodata etc.
>>> 
>>> Agreed.
>>> 
>>> But what about a struct with FAM with the new "counted_by" attribute
>>> if the original allocation is not visible?
>> 
>> There's precedent for this through the __access__ attribute; __bos 
>> trusts what the attribute says about the allocation.
> 
> The access attribute gives the size directly. The counted_by gives
> a length for the array which needs to be translated into a size
> via a formula. There are different formulas in use. The question 
> is which formula should bdos trust?
> 
> Whatever you pick, if this is not consistent with the actual
> allocation or use, then it will cause problems either by
> breaking code or not detecting buffer overruns.
> 
> So it needs to be consistent with what GCC allocates for a
> var with FAM and initialization and also the user needs to 
> be told what the right choice is so that he can use the right
> size for allocation and argument to memcpy / memset etc.

All agreed. Thanks a lot for raising these issues and providing helpful suggestions. 
I will double check on these and make sure __bos behaves correctly for structure with FAM.

Might come back with more questions for discussion…-:).

Thanks.

Qing
> 
> Martin
Siddhesh Poyarekar Aug. 10, 2023, 4:30 p.m. UTC | #21
On 2023-08-10 11:18, Martin Uecker wrote:
> Am Donnerstag, dem 10.08.2023 um 10:58 -0400 schrieb Siddhesh Poyarekar:
>> On 2023-08-10 10:47, Martin Uecker wrote:
>>> Am Donnerstag, dem 10.08.2023 um 16:42 +0200 schrieb Jakub Jelinek:
>>>> On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote:
>>>>> Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao:
>>>>>>
>>>>>>> On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote:
>>>>>>>
>>>>>>> Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao:
>>>>>>>>
>>>>>>>>> On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote:
>>>>>>>
>>>>>
>>>>>>> I am not sure for the reason given above. The following
>>>>>>> code would not work:
>>>>>>>
>>>>>>> struct foo_flex { int a; short b; char t[]; } x;
>>>>>>> x.a = 1;
>>>>>>> struct foo_flex *p = malloc(sizeof(x) + x.a);
>>>>>>> if (!p) abort();
>>>>>>> memcpy(p, &x, sizeof(x)); // initialize struct
>>>>>>>
>>>>>> Okay.
>>>>>> Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe)
>>>>>>
>>>>>> Let me know if I still miss anything.
>>>>>
>>>>> The question is not only what the user should use to
>>>>> allocate, but also what BDOS should return.  In my
>>>>> example the user uses the sizeof() + N * sizeof
>>>>> formula and the memcpy is safe, but it would be flagged
>>>>> as a buffer overrun if BDOS uses the offsetof formula.
>>>>
>>>> BDOS/BOS (at least the 0 level) should return what is actually
>>>> allocated for the var, what size was passed to malloc and if it
>>>> is a var with flex array member with initialization what is actually the
>>>> size on the stack or in .data/.rodata etc.
>>>
>>> Agreed.
>>>
>>> But what about a struct with FAM with the new "counted_by" attribute
>>> if the original allocation is not visible?
>>
>> There's precedent for this through the __access__ attribute; __bos
>> trusts what the attribute says about the allocation.
> 
> The access attribute gives the size directly. The counted_by gives
> a length for the array which needs to be translated into a size
> via a formula. There are different formulas in use. The question
> is which formula should bdos trust?
> 
> Whatever you pick, if this is not consistent with the actual
> allocation or use, then it will cause problems either by
> breaking code or not detecting buffer overruns.
> 
> So it needs to be consistent with what GCC allocates for a
> var with FAM and initialization and also the user needs to
> be told what the right choice is so that he can use the right
> size for allocation and argument to memcpy / memset etc.

We'd rather miss overflow to the extent of padding than to try and be 
overly aggressive; I doubt if we're missing much protection in practice 
by trying to account for the padding.

The definition of __bos/__bdos allows us the freedom to *estimate* 
rather than be precise, so I'd go for sizeof(x) + N * sizeof(*x.a) since 
it's bound to give the more conservative answer of the two.

Thanks,
Sid
Jakub Jelinek Aug. 10, 2023, 4:39 p.m. UTC | #22
On Thu, Aug 10, 2023 at 12:30:06PM -0400, Siddhesh Poyarekar wrote:
> The definition of __bos/__bdos allows us the freedom to *estimate* rather
> than be precise, so I'd go for sizeof(x) + N * sizeof(*x.a) since it's bound
> to give the more conservative answer of the two.

To be precise, we have the 0/1 modes vs. 2/3.  So, when not determining
__bos/__bdos from actual allocation size or size of an stack object or
size of data section object but something else (say counted_by), perhaps
0/1 modes should give the upper estimate of sizeof (x) + N * sizeof(elt)
and 2/3 modes should give a lower estimate, so offsetof + N * sizeof(elt),
then user code can continue testing if both modes are equal to have
exact number.

	Jakub
Siddhesh Poyarekar Aug. 10, 2023, 5:06 p.m. UTC | #23
On 2023-08-10 12:39, Jakub Jelinek wrote:
> On Thu, Aug 10, 2023 at 12:30:06PM -0400, Siddhesh Poyarekar wrote:
>> The definition of __bos/__bdos allows us the freedom to *estimate* rather
>> than be precise, so I'd go for sizeof(x) + N * sizeof(*x.a) since it's bound
>> to give the more conservative answer of the two.
> 
> To be precise, we have the 0/1 modes vs. 2/3.  So, when not determining
> __bos/__bdos from actual allocation size or size of an stack object or
> size of data section object but something else (say counted_by), perhaps
> 0/1 modes should give the upper estimate of sizeof (x) + N * sizeof(elt)
> and 2/3 modes should give a lower estimate, so offsetof + N * sizeof(elt),
> then user code can continue testing if both modes are equal to have
> exact number.

Ack, that's fair.

Thanks,
Sid
Qing Zhao Aug. 10, 2023, 6:18 p.m. UTC | #24
> On Aug 10, 2023, at 12:39 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> On Thu, Aug 10, 2023 at 12:30:06PM -0400, Siddhesh Poyarekar wrote:
>> The definition of __bos/__bdos allows us the freedom to *estimate* rather
>> than be precise, so I'd go for sizeof(x) + N * sizeof(*x.a) since it's bound
>> to give the more conservative answer of the two.
> 
> To be precise, we have the 0/1 modes vs. 2/3.  So, when not determining
> __bos/__bdos from actual allocation size or size of an stack object or
> size of data section object but something else (say counted_by), perhaps
> 0/1 modes should give the upper estimate of sizeof (x) + N * sizeof(elt)
> and 2/3 modes should give a lower estimate, so offsetof + N * sizeof(elt),
> then user code can continue testing if both modes are equal to have
> exact number.

Yes, this sounds reasonable to me.

Qing
> 
> 	Jakub
>
Qing Zhao Aug. 16, 2023, 9:42 p.m. UTC | #25
Hi, 

After some more studying and consideration, the following is my thoughts:

For a structure with FMA annotated with counted_by attribute: (the following small example)

====
struct annotated {
        size_t foo;
        char b;
        char array[] __attribute__((counted_by (foo)));
};

#define noinline __attribute__((__noinline__))
#define MAX(a, b)  ((a) > (b) ? (a) :  (b))

static struct annotated * noinline alloc_buf (size_t length)
{
  struct annotated *p;
  p = (struct annotated *) malloc (MAX (sizeof (struct annotated),
                                        			(offsetof (struct annotated, array[0])
                                         			+ (length) * sizeof (char)))); 
  p->foo = length;
  return p;
}

int main ()
{
  struct annotated *p = alloc_buf (10);
  printf("the__bdos of max p->array whole is %d \n", __builtin_dynamic_object_size(p->array, 0)); 
  printf("the__bdos of max p->array sub is %d \n", __builtin_dynamic_object_size(p->array, 1));
  printf("the__bdos of min p->array whole is %d \n", __builtin_dynamic_object_size(p->array, 2));
  printf("the__bdos of min p->array sub is %d \n", __builtin_dynamic_object_size(p->array, 3));   
}

=====

 The actual allocation of the structure and the layout of the structure p is fixed at compilation time,
    A. We know the offsetof (p->array) during compilation time, (it’s 9)
    B. We also know the size of the p->array though the counted_by attribute, it’s p->foo * sizeof (char).

  1.  for subobject size (1/3 modes), Both A and B are know at compilation time, whatever it’s MAX or MIN, we 
    can determine  the size of the subobject p->array is:  p->foo * sizeof(char) without estimation. 

  2.  for whole object size (0/2 modes), since we don’t have any info on the actual allocation or structure 
    Initialization, we don’t know the size for the whole object whatever it’s MAX or MIN. 

So, the problem to decide which formula to use ((sizeof (x) + N * sizeof(elt), or offsetof + N * sizeof(elt)) is actually
the programmer’s job when allocating memory for the structure with FMA. (It’s not compiler’s job).  

Since this size computation is really confusing for the structure with FMA, I think that adding some clarification in
the documentation might be necessary to provide more details and guidance to the end-users.

Let me know if I miss anything here.

Thanks a lot.

Qing




> On Aug 10, 2023, at 11:18 AM, Martin Uecker <muecker@gwdg.de> wrote:
> The access attribute gives the size directly. The counted_by gives
> a length for the array which needs to be translated into a size
> via a formula. There are different formulas in use. The question 
> is which formula should bdos trust?
> 
> Whatever you pick, if this is not consistent with the actual
> allocation or use, then it will cause problems either by
> breaking code or not detecting buffer overruns.
> 
> So it needs to be consistent with what GCC allocates for a
> var with FAM and initialization and also the user needs to 
> be told what the right choice is so that he can use the right
> size for allocation and argument to memcpy / memset etc.

> On Aug 10, 2023, at 1:06 PM, Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
> 
> On 2023-08-10 12:39, Jakub Jelinek wrote:
>> On Thu, Aug 10, 2023 at 12:30:06PM -0400, Siddhesh Poyarekar wrote:
>>> The definition of __bos/__bdos allows us the freedom to *estimate* rather
>>> than be precise, so I'd go for sizeof(x) + N * sizeof(*x.a) since it's bound
>>> to give the more conservative answer of the two.
>> To be precise, we have the 0/1 modes vs. 2/3.  So, when not determining
>> __bos/__bdos from actual allocation size or size of an stack object or
>> size of data section object but something else (say counted_by), perhaps
>> 0/1 modes should give the upper estimate of sizeof (x) + N * sizeof(elt)
>> and 2/3 modes should give a lower estimate, so offsetof + N * sizeof(elt),
>> then user code can continue testing if both modes are equal to have
>> exact number.
> 
> Ack, that's fair.
> 
> Thanks,
> Sid
Kees Cook Aug. 17, 2023, 5:31 a.m. UTC | #26
On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote:
> This is the 2nd version of the patch, per our discussion based on the
> review comments for the 1st version, the major changes in this version

I've been using Coccinelle to find and annotate[1] structures (193 so
far...), and I've encountered 2 cases of GCC internal errors. I'm working
on a minimized test case, but just in case these details are immediately
helpful, here's what I'm seeing:

../drivers/net/wireless/ath/wcn36xx/smd.c: In function 'wcn36xx_smd_rsp_process':
../drivers/net/wireless/ath/wcn36xx/smd.c:3299:5: error: incorrect sharing of tree nodes
 3299 | int wcn36xx_smd_rsp_process(struct rpmsg_device *rpdev,
      |     ^~~~~~~~~~~~~~~~~~~~~~~
MEM[(struct wcn36xx_hal_ind_msg *)_96]
_15 = &MEM[(struct wcn36xx_hal_ind_msg *)_96].msg;
during GIMPLE pass: objsz
../drivers/net/wireless/ath/wcn36xx/smd.c:3299:5: internal compiler error: verify_gimple failed
0xfe97fd verify_gimple_in_cfg(function*, bool, bool)
        ../../../../gcc/gcc/tree-cfg.cc:5646
0xe84894 execute_function_todo
        ../../../../gcc/gcc/passes.cc:2088
0xe84dee execute_todo
        ../../../../gcc/gcc/passes.cc:2142

The associated struct is:

struct wcn36xx_hal_ind_msg {
        struct list_head list;
        size_t msg_len;
        u8 msg[] __counted_by(msg_len);
};



And:

../drivers/usb/gadget/function/f_fs.c: In function '__ffs_epfile_read_data':
../drivers/usb/gadget/function/f_fs.c:900:16: error: incorrect sharing of tree nodes
  900 | static ssize_t __ffs_epfile_read_data(struct ffs_epfile *epfile,
      |                ^~~~~~~~~~~~~~~~~~~~~~
MEM[(struct ffs_buffer *)_67]
_5 = &MEM[(struct ffs_buffer *)_67].storage;
during GIMPLE pass: objsz
../drivers/usb/gadget/function/f_fs.c:900:16: internal compiler error: verify_gimple failed
0xfe97fd verify_gimple_in_cfg(function*, bool, bool)
        ../../../../gcc/gcc/tree-cfg.cc:5646
0xe84894 execute_function_todo
        ../../../../gcc/gcc/passes.cc:2088
0xe84dee execute_todo
        ../../../../gcc/gcc/passes.cc:2142

with:

struct ffs_buffer {
        size_t length;
        char *data;
        char storage[] __counted_by(length);
};


[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Kees Cook Aug. 17, 2023, 6:38 a.m. UTC | #27
On Wed, Aug 16, 2023 at 10:31:30PM -0700, Kees Cook wrote:
> On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote:
> > This is the 2nd version of the patch, per our discussion based on the
> > review comments for the 1st version, the major changes in this version
> 
> I've been using Coccinelle to find and annotate[1] structures (193 so
> far...), and I've encountered 2 cases of GCC internal errors. I'm working
> on a minimized test case, but just in case these details are immediately
> helpful, here's what I'm seeing:

Okay, I got it minimized:

$ cat poc.c
struct a {
  unsigned long c;
  char d[] __attribute__((__counted_by__(c)));
} *b;

void f(long);

void e(void) {
  long g = __builtin_dynamic_object_size(b->d, 1);
  f(g);
}
$ gcc -O2 -c -o /dev/null poc.c
poc.c: In function 'e':
poc.c:8:6: error: incorrect sharing of tree nodes
    8 | void e(void) {
      |      ^
*b.0_1
_2 = &b.0_1->d;
during GIMPLE pass: objsz
poc.c:8:6: internal compiler error: verify_gimple failed
0xfe97fd verify_gimple_in_cfg(function*, bool, bool)
        ../../../../gcc/gcc/tree-cfg.cc:5646
0xe84894 execute_function_todo
        ../../../../gcc/gcc/passes.cc:2088
0xe84dee execute_todo
        ../../../../gcc/gcc/passes.cc:2142
Qing Zhao Aug. 17, 2023, 1:44 p.m. UTC | #28
Hi, Kees,

Thanks for the testing case. 
Yes, I noticed this issue too, and already fixed it in my private branch. 

With the latest patch, the compilation has no issue:
[opc@qinzhao-ol8u3-x86 108896]$ sh t
/home/opc/Install/latest-d/bin/gcc -O2 -c -o /dev/null bug.c
[opc@qinzhao-ol8u3-x86 108896]$ 

Qing

> On Aug 17, 2023, at 2:38 AM, Kees Cook <keescook@chromium.org> wrote:
> 
> On Wed, Aug 16, 2023 at 10:31:30PM -0700, Kees Cook wrote:
>> On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote:
>>> This is the 2nd version of the patch, per our discussion based on the
>>> review comments for the 1st version, the major changes in this version
>> 
>> I've been using Coccinelle to find and annotate[1] structures (193 so
>> far...), and I've encountered 2 cases of GCC internal errors. I'm working
>> on a minimized test case, but just in case these details are immediately
>> helpful, here's what I'm seeing:
> 
> Okay, I got it minimized:
> 
> $ cat poc.c
> struct a {
>  unsigned long c;
>  char d[] __attribute__((__counted_by__(c)));
> } *b;
> 
> void f(long);
> 
> void e(void) {
>  long g = __builtin_dynamic_object_size(b->d, 1);
>  f(g);
> }
> $ gcc -O2 -c -o /dev/null poc.c
> poc.c: In function 'e':
> poc.c:8:6: error: incorrect sharing of tree nodes
>    8 | void e(void) {
>      |      ^
> *b.0_1
> _2 = &b.0_1->d;
> during GIMPLE pass: objsz
> poc.c:8:6: internal compiler error: verify_gimple failed
> 0xfe97fd verify_gimple_in_cfg(function*, bool, bool)
>        ../../../../gcc/gcc/tree-cfg.cc:5646
> 0xe84894 execute_function_todo
>        ../../../../gcc/gcc/passes.cc:2088
> 0xe84dee execute_todo
>        ../../../../gcc/gcc/passes.cc:2142
> 
> -- 
> Kees Cook
Kees Cook Aug. 17, 2023, 4:54 p.m. UTC | #29
On Thu, Aug 17, 2023 at 01:44:42PM +0000, Qing Zhao wrote:
> Thanks for the testing case. 
> Yes, I noticed this issue too, and already fixed it in my private branch. 
> 
> With the latest patch, the compilation has no issue:
> [opc@qinzhao-ol8u3-x86 108896]$ sh t
> /home/opc/Install/latest-d/bin/gcc -O2 -c -o /dev/null bug.c
> [opc@qinzhao-ol8u3-x86 108896]$ 

Great! Thanks. I look forward to v3. For now I'll leave off these 2
annotations in my kernel builds. :)

-Kees