Message ID | 20230804194431.993958-1-qing.zhao@oracle.com |
---|---|
Headers | show |
Series | New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) | expand |
On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote: > This is the 2nd version of the patch, per our discussion based on the > review comments for the 1st version, the major changes in this version > are: Thanks for the update! > > 1. change the name "element_count" to "counted_by"; > 2. change the parameter for the attribute from a STRING to an > Identifier; > 3. Add logic and testing cases to handle anonymous structure/unions; > 4. Clarify documentation to permit the situation when the allocation > size is larger than what's specified by "counted_by", at the same time, > it's user's error if allocation size is smaller than what's specified by > "counted_by"; > 5. Add a complete testing case for using counted_by attribute in > __builtin_dynamic_object_size when there is mismatch between the > allocation size and the value of "counted_by", the expecting behavior > for each case and the explanation on why in the comments. All the "normal" test cases I have are passing; this is wonderful! :) I'm still seeing unexpected situations when I've intentionally set counted_by to be smaller than alloc_size, but I assume it's due to not yet having the patch you mention below. > As discussed, I plan to add two more separate patch sets after this initial > patch set is approved and committed. > > set 1. A new warning option and a new sanitizer option for the user error > when the allocation size is smaller than the value of "counted_by". > set 2. An improvement to __builtin_dynamic_object_size for the following > case: > > struct A > { > size_t foo; > int array[] __attribute__((counted_by (foo))); > }; > > extern struct fix * alloc_buf (); > > int main () > { > struct fix *p = alloc_buf (); > __builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * sizeof(int); > /* with the current algorithm, it’s UNKNOWN */ > __builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * sizeof(int); > /* with the current algorithm, it’s UNKNOWN */ > } Should the above be bdos instead of bos? > Bootstrapped and regression tested on both aarch64 and X86, no issue. I've updated the Linux kernel's macros for the name change and done build tests with my first pass at "easy" cases for adding counted_by: https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/counted_by&id=adc5b3cb48a049563dc673f348eab7b6beba8a9b Everything is working as expected. :) -Kees
> On Aug 7, 2023, at 12:16 PM, Kees Cook <keescook@chromium.org> wrote: > > On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote: >> This is the 2nd version of the patch, per our discussion based on the >> review comments for the 1st version, the major changes in this version >> are: > > Thanks for the update! > >> >> 1. change the name "element_count" to "counted_by"; >> 2. change the parameter for the attribute from a STRING to an >> Identifier; >> 3. Add logic and testing cases to handle anonymous structure/unions; >> 4. Clarify documentation to permit the situation when the allocation >> size is larger than what's specified by "counted_by", at the same time, >> it's user's error if allocation size is smaller than what's specified by >> "counted_by"; >> 5. Add a complete testing case for using counted_by attribute in >> __builtin_dynamic_object_size when there is mismatch between the >> allocation size and the value of "counted_by", the expecting behavior >> for each case and the explanation on why in the comments. > > All the "normal" test cases I have are passing; this is wonderful! :) > > I'm still seeing unexpected situations when I've intentionally set > counted_by to be smaller than alloc_size, but I assume it's due to not > yet having the patch you mention below. What’s the testing case for the one that failed? If it’s __builtin_dynamic_object_size(p->array, 0/2) without the allocation information in the routine, then with the current algorithm, gcc cannot deduce the size for the whole object. If not such case, let me know. > >> As discussed, I plan to add two more separate patch sets after this initial >> patch set is approved and committed. >> >> set 1. A new warning option and a new sanitizer option for the user error >> when the allocation size is smaller than the value of "counted_by". >> set 2. An improvement to __builtin_dynamic_object_size for the following >> case: >> >> struct A >> { >> size_t foo; >> int array[] __attribute__((counted_by (foo))); >> }; >> >> extern struct fix * alloc_buf (); >> >> int main () >> { >> struct fix *p = alloc_buf (); >> __builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * sizeof(int); >> /* with the current algorithm, it’s UNKNOWN */ >> __builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * sizeof(int); >> /* with the current algorithm, it’s UNKNOWN */ >> } > > Should the above be bdos instead of bos? Yes, sorry for the typo. -:) > >> Bootstrapped and regression tested on both aarch64 and X86, no issue. > > I've updated the Linux kernel's macros for the name change and done > build tests with my first pass at "easy" cases for adding counted_by: > https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/counted_by&id=adc5b3cb48a049563dc673f348eab7b6beba8a9b > > Everything is working as expected. :) Thanks a lot for the testing! Qing > > -Kees > > -- > Kees Cook > >
I am sure this has been discussed before, but seeing that you test for a specific formula, let me point out the following: There at least three different size expression which could make sense. Consider short foo { int a; short b; char t[]; }; Most people seem to use sizeof(struct foo) + N * sizeof(foo->t); which for N == 3 yields 11 bytes on x86-64 because the formula adds the padding of the original struct. There is an example in the C standard that uses this formula. But he minimum size of an object which stores N elements is max(sizeof (struct s), offsetof(struct s, t[n])) which is 9 bytes. This is what clang uses for statically allocated objects with initialization, while GCC uses the rule above (11 bytes). But bdos / bos then returns the smaller size of 9 which is a bit confusing. https://godbolt.org/z/K1hvaK1ns https://github.com/llvm/llvm-project/issues/62929 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956 Then there is also the size of a similar array where the FAM is replaced with an array of static size: struct foo { int a; short b; char t[3]; }; This would make the most sense to me, but it has 12 bytes because the padding is according to the usual alignment rules. Martin Am Montag, dem 07.08.2023 um 09:16 -0700 schrieb Kees Cook: > On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote: > > This is the 2nd version of the patch, per our discussion based on the > > review comments for the 1st version, the major changes in this version > > are: > > Thanks for the update! > > > > > 1. change the name "element_count" to "counted_by"; > > 2. change the parameter for the attribute from a STRING to an > > Identifier; > > 3. Add logic and testing cases to handle anonymous structure/unions; > > 4. Clarify documentation to permit the situation when the allocation > > size is larger than what's specified by "counted_by", at the same time, > > it's user's error if allocation size is smaller than what's specified by > > "counted_by"; > > 5. Add a complete testing case for using counted_by attribute in > > __builtin_dynamic_object_size when there is mismatch between the > > allocation size and the value of "counted_by", the expecting behavior > > for each case and the explanation on why in the comments. > > All the "normal" test cases I have are passing; this is wonderful! :) > > I'm still seeing unexpected situations when I've intentionally set > counted_by to be smaller than alloc_size, but I assume it's due to not > yet having the patch you mention below. > > > As discussed, I plan to add two more separate patch sets after this initial > > patch set is approved and committed. > > > > set 1. A new warning option and a new sanitizer option for the user error > > when the allocation size is smaller than the value of "counted_by". > > set 2. An improvement to __builtin_dynamic_object_size for the following > > case: > > > > struct A > > { > > size_t foo; > > int array[] __attribute__((counted_by (foo))); > > }; > > > > extern struct fix * alloc_buf (); > > > > int main () > > { > > struct fix *p = alloc_buf (); > > __builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * sizeof(int); > > /* with the current algorithm, it’s UNKNOWN */ > > __builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * sizeof(int); > > /* with the current algorithm, it’s UNKNOWN */ > > } > > Should the above be bdos instead of bos? > > > Bootstrapped and regression tested on both aarch64 and X86, no issue. > > I've updated the Linux kernel's macros for the name change and done > build tests with my first pass at "easy" cases for adding counted_by: > https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/counted_by&id=adc5b3cb48a049563dc673f348eab7b6beba8a9b > > Everything is working as expected. :) > > -Kees >
Hello, On Tue, 8 Aug 2023, Martin Uecker via Gcc-patches wrote: > There at least three different size expression which could > make sense. Consider > > short foo { int a; short b; char t[]; }; > > Most people seem to use > > sizeof(struct foo) + N * sizeof(foo->t); > > which for N == 3 yields 11 bytes on x86-64 because the formula > adds the padding of the original struct. There is an example > in the C standard that uses this formula. > > > But he minimum size of an object which stores N elements is > > max(sizeof (struct s), offsetof(struct s, t[n])) > > which is 9 bytes. But should it really? struct sizes should usually be a multiple of a structs alignment, and if int is 4-aligned only size 12 would be correct. I don't see why one should deviate from that general rule for sizes for FAM structs. (I realize that the above is not about sizeof(), but rather bdos/bos, but I think it's fairly useful that both agree when possbible). Say, if you were to allocate an array of such struct foos, each having 3 elements in foo.t[]. You need to add 12 bytes to go to the next array element, not 9. (Of course arrays of FAM-structs are somewhat meh, but still). It's true that you would be allowed to rely on only 9 bytes of those 12 bytes (the rest being padding), so perhaps it's really the right answer for __bos, but, hmm, 12 seems to be "more correct" to my guts :-) Ciao, Michael.
On Tue, Aug 08, 2023 at 04:54:46PM +0200, Martin Uecker wrote: > > > I am sure this has been discussed before, but seeing that you > test for a specific formula, let me point out the following: > > There at least three different size expression which could > make sense. Consider > > short foo { int a; short b; char t[]; }; > > Most people seem to use > > sizeof(struct foo) + N * sizeof(foo->t); I think this contains a typo. The above should be: sizeof(struct foo) + N * sizeof(*foo->t); > which for N == 3 yields 11 bytes on x86-64 because the formula > adds the padding of the original struct. There is an example > in the C standard that uses this formula. > > > But he minimum size of an object which stores N elements is > > max(sizeof (struct s), offsetof(struct s, t[n])) > > which is 9 bytes. > > This is what clang uses for statically allocated objects with > initialization, while GCC uses the rule above (11 bytes). But > bdos / bos then returns the smaller size of 9 which is a bit > confusing. > > https://godbolt.org/z/K1hvaK1ns And to add to the mess here, Clang used to get this wrong for statically initialized flexible array members (returning 8): https://godbolt.org/z/Tee96dazs But that was fixed recently when I noticed it a few weeks ago. > https://github.com/llvm/llvm-project/issues/62929 > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956 > > > Then there is also the size of a similar array where the FAM > is replaced with an array of static size: > > struct foo { int a; short b; char t[3]; }; > > This would make the most sense to me, but it has 12 bytes > because the padding is according to the usual alignment > rules. Hm, in my thinking, FAMs are individually padded, so since they don't "count" to the struct size, I think this is "as expected". Note that the Linux kernel explicitly chooses to potentially over-estimate for FAM allocations since there has been inconsistent sizes used depend on the style of the prior fake FAM usage, the use of unions, etc. i.e. our macro is, effectively: #define struct_size(ptr, member, count) \ (sizeof(*ptr) + (count) * sizeof(*ptr->member)) since the other case can lead to truncated sizes: #define struct_size(ptr, member, count) \ (offsetof(typeof(*ptr), member) + (count) * sizeof(*ptr->member)) struct foo { int a, b, c; int count; union { struct { char small; char char_fam[]; }; struct { int large; int int_fam[]; }; }; }; https://godbolt.org/z/b1v74Mzhd i.e.: struct_size(x, char_fam, 1) < sizeof(*x) so accessing "large" fails, etc. Yes, it's all horrible, but we have to deal with this kind of thing in the kernel. :(
Hi, Martin, Thanks for raising this issue. Although this is an old FAM related issue that does not relate to my current patch (and might need to be resolved in a separate patch). I think that it’s necessary to have more discussion on this old issue and resolve it. The first thing that I’d like to confirm is: What the exact memory layout for the following structure x? struct foo { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } }; And the key that is confusing me is, where should the field “t” start? A. Starting at offset 8 as the following: a 4-bytes b 2-bytes padding 2-bytes t 3-bytes B. Starting at offset 6 as the following: a 4-bytes b 2-bytes t 3-bytes From my understanding, A should be correct. However, when I debugged into gcc, I found that the following tree byte_position (const_tree field) { return byte_from_pos (DECL_FIELD_OFFSET (field), DECL_FIELD_BIT_OFFSET (field)); } Returned 6 for the field “t”: 498 tree pos = byte_position (last); (gdb) n 499 size = fold_build2 (PLUS_EXPR, TREE_TYPE (size), pos, compsize); (gdb) call debug_generic_expr(pos) 6 So, I suspect that there is a bug in GCC which incorrectly represent the offset of the FAM field in the IR. Thanks. Qing > On Aug 8, 2023, at 10:54 AM, Martin Uecker <uecker@tugraz.at> wrote: > > > > I am sure this has been discussed before, but seeing that you > test for a specific formula, let me point out the following: > > There at least three different size expression which could > make sense. Consider > > short foo { int a; short b; char t[]; }; > > Most people seem to use > > sizeof(struct foo) + N * sizeof(foo->t); > > which for N == 3 yields 11 bytes on x86-64 because the formula > adds the padding of the original struct. There is an example > in the C standard that uses this formula. > > > But he minimum size of an object which stores N elements is > > max(sizeof (struct s), offsetof(struct s, t[n])) > > which is 9 bytes. > > This is what clang uses for statically allocated objects with > initialization, while GCC uses the rule above (11 bytes). But > bdos / bos then returns the smaller size of 9 which is a bit > confusing. > > > https://godbolt.org/z/K1hvaK1ns > > https://github.com/llvm/llvm-project/issues/62929 > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956 > > > Then there is also the size of a similar array where the FAM > is replaced with an array of static size: > > struct foo { int a; short b; char t[3]; }; > > This would make the most sense to me, but it has 12 bytes > because the padding is according to the usual alignment > rules. > > > Martin > > > > Am Montag, dem 07.08.2023 um 09:16 -0700 schrieb Kees Cook: >> On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote: >>> This is the 2nd version of the patch, per our discussion based on the >>> review comments for the 1st version, the major changes in this version >>> are: >> >> Thanks for the update! >> >>> >>> 1. change the name "element_count" to "counted_by"; >>> 2. change the parameter for the attribute from a STRING to an >>> Identifier; >>> 3. Add logic and testing cases to handle anonymous structure/unions; >>> 4. Clarify documentation to permit the situation when the allocation >>> size is larger than what's specified by "counted_by", at the same time, >>> it's user's error if allocation size is smaller than what's specified by >>> "counted_by"; >>> 5. Add a complete testing case for using counted_by attribute in >>> __builtin_dynamic_object_size when there is mismatch between the >>> allocation size and the value of "counted_by", the expecting behavior >>> for each case and the explanation on why in the comments. >> >> All the "normal" test cases I have are passing; this is wonderful! :) >> >> I'm still seeing unexpected situations when I've intentionally set >> counted_by to be smaller than alloc_size, but I assume it's due to not >> yet having the patch you mention below. >> >>> As discussed, I plan to add two more separate patch sets after this initial >>> patch set is approved and committed. >>> >>> set 1. A new warning option and a new sanitizer option for the user error >>> when the allocation size is smaller than the value of "counted_by". >>> set 2. An improvement to __builtin_dynamic_object_size for the following >>> case: >>> >>> struct A >>> { >>> size_t foo; >>> int array[] __attribute__((counted_by (foo))); >>> }; >>> >>> extern struct fix * alloc_buf (); >>> >>> int main () >>> { >>> struct fix *p = alloc_buf (); >>> __builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * sizeof(int); >>> /* with the current algorithm, it’s UNKNOWN */ >>> __builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * sizeof(int); >>> /* with the current algorithm, it’s UNKNOWN */ >>> } >> >> Should the above be bdos instead of bos? >> >>> Bootstrapped and regression tested on both aarch64 and X86, no issue. >> >> I've updated the Linux kernel's macros for the name change and done >> build tests with my first pass at "easy" cases for adding counted_by: >> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/counted_by&id=adc5b3cb48a049563dc673f348eab7b6beba8a9b >> >> Everything is working as expected. :) >> >> -Kees >> > > -- > Univ.-Prof. Dr. rer. nat. Martin Uecker > Graz University of Technology > Institute of Biomedical Imaging
Hello, On Wed, 9 Aug 2023, Qing Zhao wrote: > Although this is an old FAM related issue that does not relate to my current patch > (and might need to be resolved in a separate patch). I think that it’s necessary to have > more discussion on this old issue and resolve it. > > The first thing that I’d like to confirm is: > > What the exact memory layout for the following structure x? > > struct foo { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } }; > > And the key that is confusing me is, where should the field “t” start? > > A. Starting at offset 8 as the following: > > a 4-bytes > b 2-bytes > padding 2-bytes > t 3-bytes Why should there be padding before 't'? It's a char array (FAM or not), so it can be (and should be) placed directly after 'b'. So ... > B. Starting at offset 6 as the following: > > a 4-bytes > b 2-bytes > t 3-bytes ... this is the correct layout, when seen in isolation. The discussion revolves around what should come after 't': if it's a non-FAM struct (with t[3]), then it's clear that there needs to be padding after it, so to pad out the whole struct to be 12 bytes long (for sizeof() purpose), as required by its alignment (due to the int field 'a'). So, should the equivalent FAM struct also have this sizeof()? If no: there should be a good argument why it shouldn't be similar to the non-FAM one. Then there is an argument that the compiler would be fine, when allocating a single object of such type (not as part of an array!), to only reserve 9 bytes of space for the FAM-struct. Then the question is: should it also do that for a non-FAM struct (based on the argument that the padding behind 't' isn't accessible, and hence doesn't need to be alloced). I think it would be quite unexpected for the compiler to actually reserve less space than sizeof() implies, so I personally don't buy that argument. For FAM or non-FAM structs. Note that if one choses to allocate less space than sizeof implies that this will have quite some consequences for code generation, in that sometimes the instruction sequences (e.g. for copying) need to be careful to never access tail padding that should be there in array context, but isn't there in single-object context. I think this alone should make it clear that it's advisable that sizeof() and allocated size agree. As in: I think sizeof for both structs should return 12, and 12 bytes should be reserved for objects of such types. And then the next question is what __builtin_object_size should do with these: should it return the size with or without padding at end (i.e. could/should it return 9 even if sizeof is 12). I can see arguments for both. Ciao, Michael.
On Mon, Aug 07, 2023 at 04:33:13PM +0000, Qing Zhao wrote: > What’s the testing case for the one that failed? > If it’s > > __builtin_dynamic_object_size(p->array, 0/2) without the allocation information in the routine, > then with the current algorithm, gcc cannot deduce the size for the whole object. > > If not such case, let me know. I found some more bugs in my tests (now fixed), but I'm left with a single failure case, which is think it going to boil down to pointer/pointee issue we discussed earlier. Using my existing testing tool: https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c I see this error with the "counted_by_seen_by_bdos" case: Expected __builtin_dynamic_object_size(p, 1) (18446744073709551615) == sizeof(*p) + p->count * sizeof(*p->array) (80) A reduced view of the code is: struct annotated *p; int index = MAX_INDEX + unconst; p = alloc_annotated(index); EXPECT_EQ(__builtin_dynamic_object_size(p, 1), sizeof(*p) + p->count * sizeof(*p->array)); It looks like __bdos will not use the __counted_by information from the pointee if the argument is only the pointer. i.e. this test works: EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1), p->count * sizeof(*p->array)); However, I thought if any part of the pointee was used (e.g. p->count, p->array), GCC would be happy to start using the pointee details? And, again, for the initial version at this feature, I'm fine with this failing test being declared not a valid test. :) But I'll need some kind of builtin that can correctly interrogate a pointer to find the full runtime size with the assumption that pointer is valid, but that can come later. And as a side note, I am excited to see the very correct warnings for bad (too late) assignment of the __counted_by member: p->array[0] = 0; p->count = 1; array-bounds.c: In function 'invalid_assignment_order': array-bounds.c:366:17: warning: '*p.count' is used uninitialized [-Wuninitialized] 366 | p->array[0] = 0; | ~~~~~~~~^~~ Yay! :) -Kees
> On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote: > > Hello, > > On Wed, 9 Aug 2023, Qing Zhao wrote: > >> Although this is an old FAM related issue that does not relate to my current patch >> (and might need to be resolved in a separate patch). I think that it’s necessary to have >> more discussion on this old issue and resolve it. >> >> The first thing that I’d like to confirm is: >> >> What the exact memory layout for the following structure x? >> >> struct foo { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } }; >> >> And the key that is confusing me is, where should the field “t” start? >> >> A. Starting at offset 8 as the following: >> >> a 4-bytes >> b 2-bytes >> padding 2-bytes >> t 3-bytes > > Why should there be padding before 't'? It's a char array (FAM or not), > so it can be (and should be) placed directly after 'b'. So ... Yes, you are right. My mistake -:) > >> B. Starting at offset 6 as the following: >> >> a 4-bytes >> b 2-bytes >> t 3-bytes > > ... this is the correct layout, when seen in isolation. The discussion > revolves around what should come after 't': if it's a non-FAM struct (with > t[3]), then it's clear that there needs to be padding after it, so to pad > out the whole struct to be 12 bytes long (for sizeof() purpose), as > required by its alignment (due to the int field 'a'). right. > > So, should the equivalent FAM struct also have this sizeof()? If no: > there should be a good argument why it shouldn't be similar to the non-FAM > one. The sizeof() of a structure with FAM is defined as: (after I searched online, I think that the one in Wikipedia is the most reasonable one): https://en.wikipedia.org/wiki/Flexible_array_member ====== Effect on struct size and padding The sizeof operator on such a struct gives the size of the structure as if the flexible array member were empty. This may include padding added to accommodate the flexible member; the compiler is also free to re-use such padding as part of the array itself.[2] It is common to allocate sizeof(struct) + array_len*sizeof(array element) bytes. This is not wrong, but it may allocate a few more bytes than necessary: the compiler may be re-purposing some of the padding that is included in sizeof(struct). Should this be a concern, macros are available[3] to compute the minimum size while ensuring that the compiler's padding is not disrupted. As the array may start in the padding before the end of the structure, its content should always be accessed via indexing (arr[i]) or offsetof, not sizeof. ====== By definition, the sizeof() of a struct with FAM might not be the same as the non-FAM one. i.e, for the following two structures, one with FAM, the other with fixed array: struct foo_flex { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } }; struct foo_fix {int a; short b; char t[3]; } With current GCC: sizeof(foo_flex) == 8 sizeof(foo_fix) == 12 I think that the current behavior of sizeof for structure with FAM in GCC is correct. The major issue is what was pointed out by Martin in the previous email: Whether using the following formula is correct to compute the allocation? sizeof(struct foo_flex) + N * sizeof(foo->t); As pointed out in the wikipedia, the value computed by this formula might be bigger than the actual size since “sizeof(struct foo_flex)” might include paddings that are used as part of the array. So the more accurate formula should be offset(struct foo_flex, t[0]) + N * sizeof(foo->t); With GCC, offset(struct foo_flex,t[0]) == 6, which is also correct. > Then there is an argument that the compiler would be fine, when allocating > a single object of such type (not as part of an array!), to only reserve 9 > bytes of space for the FAM-struct. Then the question is: should it also > do that for a non-FAM struct (based on the argument that the padding > behind 't' isn't accessible, and hence doesn't need to be alloced). I > think it would be quite unexpected for the compiler to actually reserve > less space than sizeof() implies, so I personally don't buy that argument. > For FAM or non-FAM structs. For the question, whether the compiler needs to allocate paddings after the FAM field, I don’t know the answer, and it’s not specified in the standard either. Does it matter? > > Note that if one choses to allocate less space than sizeof implies that > this will have quite some consequences for code generation, in that > sometimes the instruction sequences (e.g. for copying) need to be careful > to never access tail padding that should be there in array context, but > isn't there in single-object context. I think this alone should make it > clear that it's advisable that sizeof() and allocated size agree. Sizeof by definition is return the size of the TYPE, not the size of the allocated object. Another thing I need to point out is, a structure with FAM cannot be an element of an array. > > As in: I think sizeof for both structs should return 12, and 12 bytes > should be reserved for objects of such types. > > And then the next question is what __builtin_object_size should do with > these: should it return the size with or without padding at end (i.e. > could/should it return 9 even if sizeof is 12). I can see arguments for > both. Currently, GCC’s __builtin_object_size use the following formula to compute the object size for The structure with FAM: offset(struct foo_flex, t[0]) + N * sizeof(foo->t); I think it’s correct. I think that the users might need to use this formula to compute the allocation size for a structure with FAM too. Qing > > > Ciao, > Michael.
> On Aug 8, 2023, at 10:54 AM, Martin Uecker <uecker@tugraz.at> wrote: > > > > I am sure this has been discussed before, but seeing that you > test for a specific formula, let me point out the following: > > There at least three different size expression which could > make sense. Consider > > short foo { int a; short b; char t[]; }; > > Most people seem to use > > sizeof(struct foo) + N * sizeof(foo->t); > > which for N == 3 yields 11 bytes on x86-64 because the formula > adds the padding of the original struct. There is an example > in the C standard that uses this formula. > > > But he minimum size of an object which stores N elements is > > max(sizeof (struct s), offsetof(struct s, t[n])) > > which is 9 bytes. > > This is what clang uses for statically allocated objects with > initialization, while GCC uses the rule above (11 bytes). But > bdos / bos then returns the smaller size of 9 which is a bit > confusing. As I checked the algorithm for bos in GCC, it uses a similar formula as the following to compute the object size: offset(struct foo, t[0]) + N * sizeof(*foo->t); Which seems correct to me. (Therefore bos returns 9 for this example). > > > https://godbolt.org/z/K1hvaK1ns > > https://github.com/llvm/llvm-project/issues/62929 > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956 > > > Then there is also the size of a similar array where the FAM > is replaced with an array of static size: > > struct foo { int a; short b; char t[3]; }; > > This would make the most sense to me, but it has 12 bytes > because the padding is according to the usual alignment > rules. My understanding is, since a structure with FAM cannot be an element of an array, the tailing padding might not be necessary? Qing > > > Martin > > > > Am Montag, dem 07.08.2023 um 09:16 -0700 schrieb Kees Cook: >> On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote: >>> This is the 2nd version of the patch, per our discussion based on the >>> review comments for the 1st version, the major changes in this version >>> are: >> >> Thanks for the update! >> >>> >>> 1. change the name "element_count" to "counted_by"; >>> 2. change the parameter for the attribute from a STRING to an >>> Identifier; >>> 3. Add logic and testing cases to handle anonymous structure/unions; >>> 4. Clarify documentation to permit the situation when the allocation >>> size is larger than what's specified by "counted_by", at the same time, >>> it's user's error if allocation size is smaller than what's specified by >>> "counted_by"; >>> 5. Add a complete testing case for using counted_by attribute in >>> __builtin_dynamic_object_size when there is mismatch between the >>> allocation size and the value of "counted_by", the expecting behavior >>> for each case and the explanation on why in the comments. >> >> All the "normal" test cases I have are passing; this is wonderful! :) >> >> I'm still seeing unexpected situations when I've intentionally set >> counted_by to be smaller than alloc_size, but I assume it's due to not >> yet having the patch you mention below. >> >>> As discussed, I plan to add two more separate patch sets after this initial >>> patch set is approved and committed. >>> >>> set 1. A new warning option and a new sanitizer option for the user error >>> when the allocation size is smaller than the value of "counted_by". >>> set 2. An improvement to __builtin_dynamic_object_size for the following >>> case: >>> >>> struct A >>> { >>> size_t foo; >>> int array[] __attribute__((counted_by (foo))); >>> }; >>> >>> extern struct fix * alloc_buf (); >>> >>> int main () >>> { >>> struct fix *p = alloc_buf (); >>> __builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * sizeof(int); >>> /* with the current algorithm, it’s UNKNOWN */ >>> __builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * sizeof(int); >>> /* with the current algorithm, it’s UNKNOWN */ >>> } >> >> Should the above be bdos instead of bos? >> >>> Bootstrapped and regression tested on both aarch64 and X86, no issue. >> >> I've updated the Linux kernel's macros for the name change and done >> build tests with my first pass at "easy" cases for adding counted_by: >> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/counted_by&id=adc5b3cb48a049563dc673f348eab7b6beba8a9b >> >> Everything is working as expected. :) >> >> -Kees >> > > -- > Univ.-Prof. Dr. rer. nat. Martin Uecker > Graz University of Technology > Institute of Biomedical Imaging
Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao: > > > On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote: ... > > By definition, the sizeof() of a struct with FAM might not be the same as the non-FAM one. > i.e, for the following two structures, one with FAM, the other with fixed array: > > struct foo_flex { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } }; > struct foo_fix {int a; short b; char t[3]; } > > With current GCC: > sizeof(foo_flex) == 8 > sizeof(foo_fix) == 12 > > I think that the current behavior of sizeof for structure with FAM in GCC is correct. Yes, sadly the sizeof has to be like this as required by ISO C. > > The major issue is what was pointed out by Martin in the previous email: > > Whether using the following formula is correct to compute the allocation? > > sizeof(struct foo_flex) + N * sizeof(foo->t); That formula is safe for allocation, but might allocate more padding than the minimum amount and it might allocate less storage than a similar array with fixed array. > As pointed out in the wikipedia, the value computed by this formula might > be bigger than the actual size since “sizeof(struct foo_flex)” might include > paddings that are used as part of the array. > > So the more accurate formula should be > > offset(struct foo_flex, t[0]) + N * sizeof(foo->t); > > With GCC, offset(struct foo_flex,t[0]) == 6, which is also correct. This formula might be considered incorrect / dangerous because it might allocate less storage than sizeof(struct foo_flex). https://godbolt.org/z/8accq75f3 > ... > > As in: I think sizeof for both structs should return 12, and 12 bytes > > should be reserved for objects of such types. > > > > And then the next question is what __builtin_object_size should do with > > these: should it return the size with or without padding at end (i.e. > > could/should it return 9 even if sizeof is 12). I can see arguments for > > both. > > Currently, GCC’s __builtin_object_size use the following formula to compute the object size for > The structure with FAM: > > offset(struct foo_flex, t[0]) + N * sizeof(foo->t); > > I think it’s correct. > > I think that the users might need to use this formula to compute the allocation size for a structure with FAM too. I am not sure for the reason given above. The following code would not work: struct foo_flex { int a; short b; char t[]; } x; x.a = 1; struct foo_flex *p = malloc(sizeof(x) + x.a); if (!p) abort(); memcpy(p, &x, sizeof(x)); // initialize struct Martin
Hello, On Wed, 9 Aug 2023, Qing Zhao wrote: > > So, should the equivalent FAM struct also have this sizeof()? If no: > > there should be a good argument why it shouldn't be similar to the non-FAM > > one. > > The sizeof() of a structure with FAM is defined as: (after I searched online, > I think that the one in Wikipedia is the most reasonable one): > https://en.wikipedia.org/wiki/Flexible_array_member Well, wikipedia has it's uses. Here we are in language lawyering land together with a discussion what makes most sense in many circumstances. FWIW, in this case I think the cited text from wikipedia is correct in the sense of "not wrong" but not helpful in the sense of "good advise". > By definition, the sizeof() of a struct with FAM might not be the same > as the non-FAM one. i.e, for the following two structures, one with FAM, > the other with fixed array: > > struct foo_flex { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } }; > struct foo_fix {int a; short b; char t[3]; } > > With current GCC: > sizeof(foo_flex) == 8 > sizeof(foo_fix) == 12 > > I think that the current behavior of sizeof for structure with FAM in > GCC is correct. It is, yes. > The major issue is what was pointed out by Martin in the previous email: > > Whether using the following formula is correct to compute the > allocation? > > sizeof(struct foo_flex) + N * sizeof(foo->t); > > As pointed out in the wikipedia, the value computed by this formula might > be bigger than the actual size since “sizeof(struct foo_flex)” might include > paddings that are used as part of the array. That doesn't make the formula incorrect, but rather conservatively correct. If you don't want to be conservative, then yes, you can use a different formula if you happen to know the layout rules your compiler at hand uses (or the ones prescribed by an ABI, if it does that). I think it would be bad advise to the general population do advertise this scheme as better. > So the more accurate formula should be > > offset(struct foo_flex, t[0]) + N * sizeof(foo->t); "* sizeof(foo->t[0])", but yes. > For the question, whether the compiler needs to allocate paddings after > the FAM field, I don’t know the answer, and it’s not specified in the > standard either. Does it matter? It matters for two things: 1) Abstract reasons: is there as reason to deviate from the normal rules? If not: it shouldn't deviate. Future extensibility: while it right now is not possible to form an array of FMA-structs (in C!), who's to say that it may not be eventually added? It seems a natural enough extension of an extension, and while it has certain implementation problems (the "real" size of the elements needs to be computable, and hence be part of the array type) those could be overcome. At that point you _really_ want to have the elements aligned naturally, be compatible with sizeof, and be the same as an individual object. 2) Practical reasons: codegeneration works better if the accessible sizes of objects are a multiple of their alignment, as often you have instructions that can move around alignment-sized blobs (say, words). If you don't pad out objects you have to be careful to use only byte accesses when you get to the end of an object. Let me ask the question in the opposite way: what would you _gain_ by not allocating padding? And is that enough to deviate from the most obvious choices? (Do note that e.g. global or local FMA-typed objects don't exist in isolation, and their surrounding objects have their own alignment rules, which often will lead to tail padding anyway, even if you would use the non-conservative size calculation; the same applies for malloc'ed objects). > > Note that if one choses to allocate less space than sizeof implies that > > this will have quite some consequences for code generation, in that > > sometimes the instruction sequences (e.g. for copying) need to be careful > > to never access tail padding that should be there in array context, but > > isn't there in single-object context. I think this alone should make it > > clear that it's advisable that sizeof() and allocated size agree. > > Sizeof by definition is return the size of the TYPE, not the size of the > allocated object. Sure. Outside special cases like FMA it's the same, though. And there sizeof does include tail padding. > > And then the next question is what __builtin_object_size should do with > > these: should it return the size with or without padding at end (i.e. > > could/should it return 9 even if sizeof is 12). I can see arguments for > > both. > > Currently, GCC’s __builtin_object_size use the following formula to > compute the object size for The structure with FAM: > > offset(struct foo_flex, t[0]) + N * sizeof(foo->t); > > I think it’s correct. See above. It's non-conservatively correct. And that may be the right choice for this builtin, considering its intended use-cases (strict checking of allowed access ranges) ... > I think that the users might need to use this formula to compute the > allocation size for a structure with FAM too. ... but that doesn't automatically follows. There's a difference between accessible memory range (for language semantic purposes) and allocated memory range. I could imagine that somewhen __builtin_object_size doesn't include tail padding for normal non-FMA types either. After all you can't rely on the contents there, though you won't get segfaults when you access it. sizeof will continue to have to include it, though. So that's a demonstration of why _bos is not the right thing to use for allocation purposes. Ciao, Michael.
> On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote: > > Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao: >> >>> On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote: > > ... >> >> By definition, the sizeof() of a struct with FAM might not be the same as the non-FAM one. >> i.e, for the following two structures, one with FAM, the other with fixed array: >> >> struct foo_flex { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } }; >> struct foo_fix {int a; short b; char t[3]; } >> >> With current GCC: >> sizeof(foo_flex) == 8 >> sizeof(foo_fix) == 12 >> >> I think that the current behavior of sizeof for structure with FAM in GCC is correct. > > Yes, sadly the sizeof has to be like this as required by ISO C. Agreed. Yes, if the size information of the FAM can be integrated into the TYPE system in C standard, we will not have such issue anymore. > >> >> The major issue is what was pointed out by Martin in the previous email: >> >> Whether using the following formula is correct to compute the allocation? >> >> sizeof(struct foo_flex) + N * sizeof(foo->t); > > That formula is safe for allocation, but might allocate more padding > than the minimum amount and Yes. > it might allocate less storage than a > similar array with fixed array. Yes. > >> As pointed out in the wikipedia, the value computed by this formula might >> be bigger than the actual size since “sizeof(struct foo_flex)” might include >> paddings that are used as part of the array. >> >> So the more accurate formula should be >> >> offset(struct foo_flex, t[0]) + N * sizeof(foo->t); >> >> With GCC, offset(struct foo_flex,t[0]) == 6, which is also correct. > > This formula might be considered incorrect / dangerous because > it might allocate less storage than sizeof(struct foo_flex). > > > https://godbolt.org/z/8accq75f3 I see, thanks. >> > ... >>> As in: I think sizeof for both structs should return 12, and 12 bytes >>> should be reserved for objects of such types. >>> >>> And then the next question is what __builtin_object_size should do with >>> these: should it return the size with or without padding at end (i.e. >>> could/should it return 9 even if sizeof is 12). I can see arguments for >>> both. >> >> Currently, GCC’s __builtin_object_size use the following formula to compute the object size for >> The structure with FAM: >> >> offset(struct foo_flex, t[0]) + N * sizeof(foo->t); >> >> I think it’s correct. >> >> I think that the users might need to use this formula to compute the allocation size for a structure with FAM too. > > I am not sure for the reason given above. The following > code would not work: > > struct foo_flex { int a; short b; char t[]; } x; > x.a = 1; > struct foo_flex *p = malloc(sizeof(x) + x.a); > if (!p) abort(); > memcpy(p, &x, sizeof(x)); // initialize struct > Okay. Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe) Let me know if I still miss anything. Thanks a lot for the explanation. Qing > Martin > > >
Hey, On Thu, 10 Aug 2023, Martin Uecker wrote: > > offset(struct foo_flex, t[0]) + N * sizeof(foo->t); > > > > With GCC, offset(struct foo_flex,t[0]) == 6, which is also correct. > > This formula might be considered incorrect / dangerous because > it might allocate less storage than sizeof(struct foo_flex). Oh indeed. I hadn't even considered that. That could be "fixed" with another max(theabove, sizeof(struct foo_flex)), but that starts to become silly when the obvious choice works fine. Ciao, Michael.
Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao: > > > On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote: > > > > Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao: > > > > > > > On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote: > > > > I am not sure for the reason given above. The following > > code would not work: > > > > struct foo_flex { int a; short b; char t[]; } x; > > x.a = 1; > > struct foo_flex *p = malloc(sizeof(x) + x.a); > > if (!p) abort(); > > memcpy(p, &x, sizeof(x)); // initialize struct > > > Okay. > Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe) > > Let me know if I still miss anything. The question is not only what the user should use to allocate, but also what BDOS should return. In my example the user uses the sizeof() + N * sizeof formula and the memcpy is safe, but it would be flagged as a buffer overrun if BDOS uses the offsetof formula. Martin
On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote: > Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao: > > > > > On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote: > > > > > > Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao: > > > > > > > > > On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote: > > > > > > > I am not sure for the reason given above. The following > > > code would not work: > > > > > > struct foo_flex { int a; short b; char t[]; } x; > > > x.a = 1; > > > struct foo_flex *p = malloc(sizeof(x) + x.a); > > > if (!p) abort(); > > > memcpy(p, &x, sizeof(x)); // initialize struct > > > > > Okay. > > Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe) > > > > Let me know if I still miss anything. > > The question is not only what the user should use to > allocate, but also what BDOS should return. In my > example the user uses the sizeof() + N * sizeof > formula and the memcpy is safe, but it would be flagged > as a buffer overrun if BDOS uses the offsetof formula. BDOS/BOS (at least the 0 level) should return what is actually allocated for the var, what size was passed to malloc and if it is a var with flex array member with initialization what is actually the size on the stack or in .data/.rodata etc. And for 1 level the same unless it is just access to some element, then it should be capped by the size of that element. Jakub
Am Donnerstag, dem 10.08.2023 um 16:42 +0200 schrieb Jakub Jelinek: > On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote: > > Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao: > > > > > > > On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote: > > > > > > > > Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao: > > > > > > > > > > > On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote: > > > > > > > > > > I am not sure for the reason given above. The following > > > > code would not work: > > > > > > > > struct foo_flex { int a; short b; char t[]; } x; > > > > x.a = 1; > > > > struct foo_flex *p = malloc(sizeof(x) + x.a); > > > > if (!p) abort(); > > > > memcpy(p, &x, sizeof(x)); // initialize struct > > > > > > > Okay. > > > Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe) > > > > > > Let me know if I still miss anything. > > > > The question is not only what the user should use to > > allocate, but also what BDOS should return. In my > > example the user uses the sizeof() + N * sizeof > > formula and the memcpy is safe, but it would be flagged > > as a buffer overrun if BDOS uses the offsetof formula. > > BDOS/BOS (at least the 0 level) should return what is actually > allocated for the var, what size was passed to malloc and if it > is a var with flex array member with initialization what is actually the > size on the stack or in .data/.rodata etc. Agreed. But what about a struct with FAM with the new "counted_by" attribute if the original allocation is not visible? Martin > And for 1 level the same unless it is just access to some element, then > it should be capped by the size of that element. >
On 2023-08-10 10:47, Martin Uecker wrote: > Am Donnerstag, dem 10.08.2023 um 16:42 +0200 schrieb Jakub Jelinek: >> On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote: >>> Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao: >>>> >>>>> On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote: >>>>> >>>>> Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao: >>>>>> >>>>>>> On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote: >>>>> >>> >>>>> I am not sure for the reason given above. The following >>>>> code would not work: >>>>> >>>>> struct foo_flex { int a; short b; char t[]; } x; >>>>> x.a = 1; >>>>> struct foo_flex *p = malloc(sizeof(x) + x.a); >>>>> if (!p) abort(); >>>>> memcpy(p, &x, sizeof(x)); // initialize struct >>>>> >>>> Okay. >>>> Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe) >>>> >>>> Let me know if I still miss anything. >>> >>> The question is not only what the user should use to >>> allocate, but also what BDOS should return. In my >>> example the user uses the sizeof() + N * sizeof >>> formula and the memcpy is safe, but it would be flagged >>> as a buffer overrun if BDOS uses the offsetof formula. >> >> BDOS/BOS (at least the 0 level) should return what is actually >> allocated for the var, what size was passed to malloc and if it >> is a var with flex array member with initialization what is actually the >> size on the stack or in .data/.rodata etc. > > Agreed. > > But what about a struct with FAM with the new "counted_by" attribute > if the original allocation is not visible? There's precedent for this through the __access__ attribute; __bos trusts what the attribute says about the allocation. Sid
Am Donnerstag, dem 10.08.2023 um 10:58 -0400 schrieb Siddhesh Poyarekar: > On 2023-08-10 10:47, Martin Uecker wrote: > > Am Donnerstag, dem 10.08.2023 um 16:42 +0200 schrieb Jakub Jelinek: > > > On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote: > > > > Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao: > > > > > > > > > > > On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote: > > > > > > > > > > > > Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao: > > > > > > > > > > > > > > > On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote: > > > > > > > > > > > > > > > > I am not sure for the reason given above. The following > > > > > > code would not work: > > > > > > > > > > > > struct foo_flex { int a; short b; char t[]; } x; > > > > > > x.a = 1; > > > > > > struct foo_flex *p = malloc(sizeof(x) + x.a); > > > > > > if (!p) abort(); > > > > > > memcpy(p, &x, sizeof(x)); // initialize struct > > > > > > > > > > > Okay. > > > > > Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe) > > > > > > > > > > Let me know if I still miss anything. > > > > > > > > The question is not only what the user should use to > > > > allocate, but also what BDOS should return. In my > > > > example the user uses the sizeof() + N * sizeof > > > > formula and the memcpy is safe, but it would be flagged > > > > as a buffer overrun if BDOS uses the offsetof formula. > > > > > > BDOS/BOS (at least the 0 level) should return what is actually > > > allocated for the var, what size was passed to malloc and if it > > > is a var with flex array member with initialization what is actually the > > > size on the stack or in .data/.rodata etc. > > > > Agreed. > > > > But what about a struct with FAM with the new "counted_by" attribute > > if the original allocation is not visible? > > There's precedent for this through the __access__ attribute; __bos > trusts what the attribute says about the allocation. The access attribute gives the size directly. The counted_by gives a length for the array which needs to be translated into a size via a formula. There are different formulas in use. The question is which formula should bdos trust? Whatever you pick, if this is not consistent with the actual allocation or use, then it will cause problems either by breaking code or not detecting buffer overruns. So it needs to be consistent with what GCC allocates for a var with FAM and initialization and also the user needs to be told what the right choice is so that he can use the right size for allocation and argument to memcpy / memset etc. Martin
Hi, Martin, > On Aug 10, 2023, at 11:18 AM, Martin Uecker <muecker@gwdg.de> wrote: > > Am Donnerstag, dem 10.08.2023 um 10:58 -0400 schrieb Siddhesh Poyarekar: >> On 2023-08-10 10:47, Martin Uecker wrote: >>> Am Donnerstag, dem 10.08.2023 um 16:42 +0200 schrieb Jakub Jelinek: >>>> On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote: >>>>> Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao: >>>>>> >>>>>>> On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote: >>>>>>> >>>>>>> Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao: >>>>>>>> >>>>>>>>> On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote: >>>>>>> >>>>> >>>>>>> I am not sure for the reason given above. The following >>>>>>> code would not work: >>>>>>> >>>>>>> struct foo_flex { int a; short b; char t[]; } x; >>>>>>> x.a = 1; >>>>>>> struct foo_flex *p = malloc(sizeof(x) + x.a); >>>>>>> if (!p) abort(); >>>>>>> memcpy(p, &x, sizeof(x)); // initialize struct >>>>>>> >>>>>> Okay. >>>>>> Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe) >>>>>> >>>>>> Let me know if I still miss anything. >>>>> >>>>> The question is not only what the user should use to >>>>> allocate, but also what BDOS should return. In my >>>>> example the user uses the sizeof() + N * sizeof >>>>> formula and the memcpy is safe, but it would be flagged >>>>> as a buffer overrun if BDOS uses the offsetof formula. >>>> >>>> BDOS/BOS (at least the 0 level) should return what is actually >>>> allocated for the var, what size was passed to malloc and if it >>>> is a var with flex array member with initialization what is actually the >>>> size on the stack or in .data/.rodata etc. >>> >>> Agreed. >>> >>> But what about a struct with FAM with the new "counted_by" attribute >>> if the original allocation is not visible? >> >> There's precedent for this through the __access__ attribute; __bos >> trusts what the attribute says about the allocation. > > The access attribute gives the size directly. The counted_by gives > a length for the array which needs to be translated into a size > via a formula. There are different formulas in use. The question > is which formula should bdos trust? > > Whatever you pick, if this is not consistent with the actual > allocation or use, then it will cause problems either by > breaking code or not detecting buffer overruns. > > So it needs to be consistent with what GCC allocates for a > var with FAM and initialization and also the user needs to > be told what the right choice is so that he can use the right > size for allocation and argument to memcpy / memset etc. All agreed. Thanks a lot for raising these issues and providing helpful suggestions. I will double check on these and make sure __bos behaves correctly for structure with FAM. Might come back with more questions for discussion…-:). Thanks. Qing > > Martin
On 2023-08-10 11:18, Martin Uecker wrote: > Am Donnerstag, dem 10.08.2023 um 10:58 -0400 schrieb Siddhesh Poyarekar: >> On 2023-08-10 10:47, Martin Uecker wrote: >>> Am Donnerstag, dem 10.08.2023 um 16:42 +0200 schrieb Jakub Jelinek: >>>> On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote: >>>>> Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao: >>>>>> >>>>>>> On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote: >>>>>>> >>>>>>> Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao: >>>>>>>> >>>>>>>>> On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote: >>>>>>> >>>>> >>>>>>> I am not sure for the reason given above. The following >>>>>>> code would not work: >>>>>>> >>>>>>> struct foo_flex { int a; short b; char t[]; } x; >>>>>>> x.a = 1; >>>>>>> struct foo_flex *p = malloc(sizeof(x) + x.a); >>>>>>> if (!p) abort(); >>>>>>> memcpy(p, &x, sizeof(x)); // initialize struct >>>>>>> >>>>>> Okay. >>>>>> Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe) >>>>>> >>>>>> Let me know if I still miss anything. >>>>> >>>>> The question is not only what the user should use to >>>>> allocate, but also what BDOS should return. In my >>>>> example the user uses the sizeof() + N * sizeof >>>>> formula and the memcpy is safe, but it would be flagged >>>>> as a buffer overrun if BDOS uses the offsetof formula. >>>> >>>> BDOS/BOS (at least the 0 level) should return what is actually >>>> allocated for the var, what size was passed to malloc and if it >>>> is a var with flex array member with initialization what is actually the >>>> size on the stack or in .data/.rodata etc. >>> >>> Agreed. >>> >>> But what about a struct with FAM with the new "counted_by" attribute >>> if the original allocation is not visible? >> >> There's precedent for this through the __access__ attribute; __bos >> trusts what the attribute says about the allocation. > > The access attribute gives the size directly. The counted_by gives > a length for the array which needs to be translated into a size > via a formula. There are different formulas in use. The question > is which formula should bdos trust? > > Whatever you pick, if this is not consistent with the actual > allocation or use, then it will cause problems either by > breaking code or not detecting buffer overruns. > > So it needs to be consistent with what GCC allocates for a > var with FAM and initialization and also the user needs to > be told what the right choice is so that he can use the right > size for allocation and argument to memcpy / memset etc. We'd rather miss overflow to the extent of padding than to try and be overly aggressive; I doubt if we're missing much protection in practice by trying to account for the padding. The definition of __bos/__bdos allows us the freedom to *estimate* rather than be precise, so I'd go for sizeof(x) + N * sizeof(*x.a) since it's bound to give the more conservative answer of the two. Thanks, Sid
On Thu, Aug 10, 2023 at 12:30:06PM -0400, Siddhesh Poyarekar wrote: > The definition of __bos/__bdos allows us the freedom to *estimate* rather > than be precise, so I'd go for sizeof(x) + N * sizeof(*x.a) since it's bound > to give the more conservative answer of the two. To be precise, we have the 0/1 modes vs. 2/3. So, when not determining __bos/__bdos from actual allocation size or size of an stack object or size of data section object but something else (say counted_by), perhaps 0/1 modes should give the upper estimate of sizeof (x) + N * sizeof(elt) and 2/3 modes should give a lower estimate, so offsetof + N * sizeof(elt), then user code can continue testing if both modes are equal to have exact number. Jakub
On 2023-08-10 12:39, Jakub Jelinek wrote: > On Thu, Aug 10, 2023 at 12:30:06PM -0400, Siddhesh Poyarekar wrote: >> The definition of __bos/__bdos allows us the freedom to *estimate* rather >> than be precise, so I'd go for sizeof(x) + N * sizeof(*x.a) since it's bound >> to give the more conservative answer of the two. > > To be precise, we have the 0/1 modes vs. 2/3. So, when not determining > __bos/__bdos from actual allocation size or size of an stack object or > size of data section object but something else (say counted_by), perhaps > 0/1 modes should give the upper estimate of sizeof (x) + N * sizeof(elt) > and 2/3 modes should give a lower estimate, so offsetof + N * sizeof(elt), > then user code can continue testing if both modes are equal to have > exact number. Ack, that's fair. Thanks, Sid
> On Aug 10, 2023, at 12:39 PM, Jakub Jelinek <jakub@redhat.com> wrote: > > On Thu, Aug 10, 2023 at 12:30:06PM -0400, Siddhesh Poyarekar wrote: >> The definition of __bos/__bdos allows us the freedom to *estimate* rather >> than be precise, so I'd go for sizeof(x) + N * sizeof(*x.a) since it's bound >> to give the more conservative answer of the two. > > To be precise, we have the 0/1 modes vs. 2/3. So, when not determining > __bos/__bdos from actual allocation size or size of an stack object or > size of data section object but something else (say counted_by), perhaps > 0/1 modes should give the upper estimate of sizeof (x) + N * sizeof(elt) > and 2/3 modes should give a lower estimate, so offsetof + N * sizeof(elt), > then user code can continue testing if both modes are equal to have > exact number. Yes, this sounds reasonable to me. Qing > > Jakub >
Hi, After some more studying and consideration, the following is my thoughts: For a structure with FMA annotated with counted_by attribute: (the following small example) ==== struct annotated { size_t foo; char b; char array[] __attribute__((counted_by (foo))); }; #define noinline __attribute__((__noinline__)) #define MAX(a, b) ((a) > (b) ? (a) : (b)) static struct annotated * noinline alloc_buf (size_t length) { struct annotated *p; p = (struct annotated *) malloc (MAX (sizeof (struct annotated), (offsetof (struct annotated, array[0]) + (length) * sizeof (char)))); p->foo = length; return p; } int main () { struct annotated *p = alloc_buf (10); printf("the__bdos of max p->array whole is %d \n", __builtin_dynamic_object_size(p->array, 0)); printf("the__bdos of max p->array sub is %d \n", __builtin_dynamic_object_size(p->array, 1)); printf("the__bdos of min p->array whole is %d \n", __builtin_dynamic_object_size(p->array, 2)); printf("the__bdos of min p->array sub is %d \n", __builtin_dynamic_object_size(p->array, 3)); } ===== The actual allocation of the structure and the layout of the structure p is fixed at compilation time, A. We know the offsetof (p->array) during compilation time, (it’s 9) B. We also know the size of the p->array though the counted_by attribute, it’s p->foo * sizeof (char). 1. for subobject size (1/3 modes), Both A and B are know at compilation time, whatever it’s MAX or MIN, we can determine the size of the subobject p->array is: p->foo * sizeof(char) without estimation. 2. for whole object size (0/2 modes), since we don’t have any info on the actual allocation or structure Initialization, we don’t know the size for the whole object whatever it’s MAX or MIN. So, the problem to decide which formula to use ((sizeof (x) + N * sizeof(elt), or offsetof + N * sizeof(elt)) is actually the programmer’s job when allocating memory for the structure with FMA. (It’s not compiler’s job). Since this size computation is really confusing for the structure with FMA, I think that adding some clarification in the documentation might be necessary to provide more details and guidance to the end-users. Let me know if I miss anything here. Thanks a lot. Qing > On Aug 10, 2023, at 11:18 AM, Martin Uecker <muecker@gwdg.de> wrote: > The access attribute gives the size directly. The counted_by gives > a length for the array which needs to be translated into a size > via a formula. There are different formulas in use. The question > is which formula should bdos trust? > > Whatever you pick, if this is not consistent with the actual > allocation or use, then it will cause problems either by > breaking code or not detecting buffer overruns. > > So it needs to be consistent with what GCC allocates for a > var with FAM and initialization and also the user needs to > be told what the right choice is so that he can use the right > size for allocation and argument to memcpy / memset etc. > On Aug 10, 2023, at 1:06 PM, Siddhesh Poyarekar <siddhesh@gotplt.org> wrote: > > On 2023-08-10 12:39, Jakub Jelinek wrote: >> On Thu, Aug 10, 2023 at 12:30:06PM -0400, Siddhesh Poyarekar wrote: >>> The definition of __bos/__bdos allows us the freedom to *estimate* rather >>> than be precise, so I'd go for sizeof(x) + N * sizeof(*x.a) since it's bound >>> to give the more conservative answer of the two. >> To be precise, we have the 0/1 modes vs. 2/3. So, when not determining >> __bos/__bdos from actual allocation size or size of an stack object or >> size of data section object but something else (say counted_by), perhaps >> 0/1 modes should give the upper estimate of sizeof (x) + N * sizeof(elt) >> and 2/3 modes should give a lower estimate, so offsetof + N * sizeof(elt), >> then user code can continue testing if both modes are equal to have >> exact number. > > Ack, that's fair. > > Thanks, > Sid
On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote: > This is the 2nd version of the patch, per our discussion based on the > review comments for the 1st version, the major changes in this version I've been using Coccinelle to find and annotate[1] structures (193 so far...), and I've encountered 2 cases of GCC internal errors. I'm working on a minimized test case, but just in case these details are immediately helpful, here's what I'm seeing: ../drivers/net/wireless/ath/wcn36xx/smd.c: In function 'wcn36xx_smd_rsp_process': ../drivers/net/wireless/ath/wcn36xx/smd.c:3299:5: error: incorrect sharing of tree nodes 3299 | int wcn36xx_smd_rsp_process(struct rpmsg_device *rpdev, | ^~~~~~~~~~~~~~~~~~~~~~~ MEM[(struct wcn36xx_hal_ind_msg *)_96] _15 = &MEM[(struct wcn36xx_hal_ind_msg *)_96].msg; during GIMPLE pass: objsz ../drivers/net/wireless/ath/wcn36xx/smd.c:3299:5: internal compiler error: verify_gimple failed 0xfe97fd verify_gimple_in_cfg(function*, bool, bool) ../../../../gcc/gcc/tree-cfg.cc:5646 0xe84894 execute_function_todo ../../../../gcc/gcc/passes.cc:2088 0xe84dee execute_todo ../../../../gcc/gcc/passes.cc:2142 The associated struct is: struct wcn36xx_hal_ind_msg { struct list_head list; size_t msg_len; u8 msg[] __counted_by(msg_len); }; And: ../drivers/usb/gadget/function/f_fs.c: In function '__ffs_epfile_read_data': ../drivers/usb/gadget/function/f_fs.c:900:16: error: incorrect sharing of tree nodes 900 | static ssize_t __ffs_epfile_read_data(struct ffs_epfile *epfile, | ^~~~~~~~~~~~~~~~~~~~~~ MEM[(struct ffs_buffer *)_67] _5 = &MEM[(struct ffs_buffer *)_67].storage; during GIMPLE pass: objsz ../drivers/usb/gadget/function/f_fs.c:900:16: internal compiler error: verify_gimple failed 0xfe97fd verify_gimple_in_cfg(function*, bool, bool) ../../../../gcc/gcc/tree-cfg.cc:5646 0xe84894 execute_function_todo ../../../../gcc/gcc/passes.cc:2088 0xe84dee execute_todo ../../../../gcc/gcc/passes.cc:2142 with: struct ffs_buffer { size_t length; char *data; char storage[] __counted_by(length); }; [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
On Wed, Aug 16, 2023 at 10:31:30PM -0700, Kees Cook wrote: > On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote: > > This is the 2nd version of the patch, per our discussion based on the > > review comments for the 1st version, the major changes in this version > > I've been using Coccinelle to find and annotate[1] structures (193 so > far...), and I've encountered 2 cases of GCC internal errors. I'm working > on a minimized test case, but just in case these details are immediately > helpful, here's what I'm seeing: Okay, I got it minimized: $ cat poc.c struct a { unsigned long c; char d[] __attribute__((__counted_by__(c))); } *b; void f(long); void e(void) { long g = __builtin_dynamic_object_size(b->d, 1); f(g); } $ gcc -O2 -c -o /dev/null poc.c poc.c: In function 'e': poc.c:8:6: error: incorrect sharing of tree nodes 8 | void e(void) { | ^ *b.0_1 _2 = &b.0_1->d; during GIMPLE pass: objsz poc.c:8:6: internal compiler error: verify_gimple failed 0xfe97fd verify_gimple_in_cfg(function*, bool, bool) ../../../../gcc/gcc/tree-cfg.cc:5646 0xe84894 execute_function_todo ../../../../gcc/gcc/passes.cc:2088 0xe84dee execute_todo ../../../../gcc/gcc/passes.cc:2142
Hi, Kees, Thanks for the testing case. Yes, I noticed this issue too, and already fixed it in my private branch. With the latest patch, the compilation has no issue: [opc@qinzhao-ol8u3-x86 108896]$ sh t /home/opc/Install/latest-d/bin/gcc -O2 -c -o /dev/null bug.c [opc@qinzhao-ol8u3-x86 108896]$ Qing > On Aug 17, 2023, at 2:38 AM, Kees Cook <keescook@chromium.org> wrote: > > On Wed, Aug 16, 2023 at 10:31:30PM -0700, Kees Cook wrote: >> On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote: >>> This is the 2nd version of the patch, per our discussion based on the >>> review comments for the 1st version, the major changes in this version >> >> I've been using Coccinelle to find and annotate[1] structures (193 so >> far...), and I've encountered 2 cases of GCC internal errors. I'm working >> on a minimized test case, but just in case these details are immediately >> helpful, here's what I'm seeing: > > Okay, I got it minimized: > > $ cat poc.c > struct a { > unsigned long c; > char d[] __attribute__((__counted_by__(c))); > } *b; > > void f(long); > > void e(void) { > long g = __builtin_dynamic_object_size(b->d, 1); > f(g); > } > $ gcc -O2 -c -o /dev/null poc.c > poc.c: In function 'e': > poc.c:8:6: error: incorrect sharing of tree nodes > 8 | void e(void) { > | ^ > *b.0_1 > _2 = &b.0_1->d; > during GIMPLE pass: objsz > poc.c:8:6: internal compiler error: verify_gimple failed > 0xfe97fd verify_gimple_in_cfg(function*, bool, bool) > ../../../../gcc/gcc/tree-cfg.cc:5646 > 0xe84894 execute_function_todo > ../../../../gcc/gcc/passes.cc:2088 > 0xe84dee execute_todo > ../../../../gcc/gcc/passes.cc:2142 > > -- > Kees Cook
On Thu, Aug 17, 2023 at 01:44:42PM +0000, Qing Zhao wrote: > Thanks for the testing case. > Yes, I noticed this issue too, and already fixed it in my private branch. > > With the latest patch, the compilation has no issue: > [opc@qinzhao-ol8u3-x86 108896]$ sh t > /home/opc/Install/latest-d/bin/gcc -O2 -c -o /dev/null bug.c > [opc@qinzhao-ol8u3-x86 108896]$ Great! Thanks. I look forward to v3. For now I'll leave off these 2 annotations in my kernel builds. :) -Kees