Message ID | cover.1723419712.git.alx@kernel.org |
---|---|
Series | c: Add __lengthof__ operator |
Hi David,

I want to send an updated version of n2529. The original author didn't respond to my mail, so I'll take over. I've been preparing a GCC patch set for adding the feature to GCC, and have informed Clang developers about it too.

The title would be

	_Lengthof - New pointer-proof keyword to determine array length (v2)

Can you please assign me a number for it? Thanks.

Cheers,
Alex
On Tue, Aug 13, 2024 at 01:34:58AM GMT, Alejandro Colomar wrote:
> Hi David,

I obviously meant Daniel. :-)

> I want to send an updated version of n2529. The original author didn't
> respond to my mail, so I'll take over. I've been preparing a GCC patch
> set for adding the feature to GCC, and have informed Clang developers
> about it too.
>
> The title would be
>
> 	_Lengthof - New pointer-proof keyword to determine array length (v2)
>
> Can you please assign me a number for it? Thanks.
>
> Cheers,
> Alex
Hi,

On Tue, Aug 13, 2024 at 01:34:58AM GMT, Alejandro Colomar wrote:
> I want to send an updated version of n2529. The original author didn't
> respond to my mail, so I'll take over. I've been preparing a GCC patch
> set for adding the feature to GCC, and have informed Clang developers
> about it too.
>
> The title would be
>
> 	_Lengthof - New pointer-proof keyword to determine array length (v2)
>
> Can you please assign me a number for it? Thanks.

Attached is a draft for a paper (both the man(7) source and the generated PDF).

I have only added lengthof for now, not _Lengthof, as suggested by Jens. Depending on feedback, I'll propose the uglified version.

Cheers,
Alex
I have been overseeing these last emails - thank you very much for your efforts, Alex! I did not reply until now because I do not have prior experience with gcc internals, so my feedback would probably have not been that useful.

Those emails from 2020 were in fact discussing two completely different proposals at once:

1. Add _Lengthof + #include <stdlengthof.h>
2. Allow static qualifier on compound literals

Whereas proposal #2 made it into C23 (kudos to Jens Gustedt!), and as you already know by now, proposal #1 received some negative feedback, suggesting _Typeof/typeof + some macro magic as a pragmatic workaround instead.

Since the proposal did not get much traction and I would have been unable to contribute to gcc myself, I just gave up on it. IIRC the deadline for new proposals closed soon after, anyway.

But I am glad that someone with proper experience took the initiative. I still think the proposal is relevant and has interesting use cases.

> I have only added lengthof for now, not _Lengthof, as suggested by Jens.
> Depending on feedback, I'll propose the uglified version.

Probably, all of us know why the uglified version is the usual approach preferred by the C standard: we do not know how many applications would break otherwise.

However, we see that this trend is now changing with C23, so probably it makes sense to define lengthof directly.

As for the parentheses, I personally think lengthof should follow similar rules compared to sizeof.

Best regards,

--
Xavier Del Campo Romero
Hi Xavier,

On Wed, Aug 14, 2024 at 12:38:53AM GMT, Xavier Del Campo Romero wrote:
> I have been overseeing these last emails -

Ahhh, good to know; thanks! :)

> thank you very much for your
> efforts, Alex!

:-)

> I did not reply until now because I do not have prior
> experience with gcc internals, so my feedback would probably have not
> been that useful.

Ok.

> Those emails from 2020 were in fact discussing two completely different
> proposals at once:
>
> 1. Add _Lengthof + #include <stdlengthof.h>
> 2. Allow static qualifier on compound literals

Yup.

> Whereas proposal #2 made it into C23 (kudos to Jens Gustedt!), and as
> you already know by now, proposal #1 received some negative feedback,
> suggesting _Typeof/typeof + some macro magic as a pragmatic workaround
> instead.

The original author of that negative feedback talked to me in private a week ago, and said he likes my proposal. We have no negative feedback anymore. :)

> Since the proposal did not get much traction and I would have been
> unable to contribute to gcc myself, I just gave up on it. IIRC the
> deadline for new proposals closed soon after, anyway.

Ok.

> But I am glad that someone with proper experience took the initiative.

Fun fact: this is my second non-trivial patch to GCC. I wouldn't say I had the proper experience with GCC internals when I started this patch set. But I'm unemployed at the moment, which gives me all the time I need for learning those. :)

> I still think the proposal is relevant and has interesting use cases.
>
> > I have only added lengthof for now, not _Lengthof, as suggested by Jens.
> > Depending on feedback, I'll propose the uglified version.
>
> Probably, all of us know why the uglified version is the usual approach
> preferred by the C standard: we do not know how many applications would
> break otherwise.

Yup.

> However, we see that this trend is now changing with C23, so probably
> it makes sense to define lengthof directly.

Yeah, since Jens is in WG14 and he suggested following this trend, maybe we can. If not, it's trivial to change the proposal to use the uglified name plus a macro.

Checking <https://codesearch.debian.net>, I see that while several projects have a lengthof() macro, all of them use it with semantics compatible with this keyword, so it shouldn't break too much. Maybe those projects will start receiving diagnostics that they're redefining a standard keyword, but that's not too bad.

> As for the parentheses, I personally think lengthof should follow
> similar rules compared to sizeof.

I think most people agree with this.

> Best regards,

Have a lovely night!
Alex
Hi,

On 14 August 2024 00:38:53 CEST, Xavier Del Campo Romero <xavi.dcr@tutanota.com> wrote:
> I have been overseeing these last emails - thank you very much for your
> efforts, Alex! I did not reply until now because I do not have prior
> experience with gcc internals, so my feedback would probably have not
> been that useful.
>
> Those emails from 2020 were in fact discussing two completely different
> proposals at once:
>
> 1. Add _Lengthof + #include <stdlengthof.h>
> 2. Allow static qualifier on compound literals
>
> Whereas proposal #2 made it into C23 (kudos to Jens Gustedt!),

This was together with Alex.

> and as
> you already know by now, proposal #1 received some negative feedback,
> suggesting _Typeof/typeof + some macro magic as a pragmatic workaround
> instead.
>
> Since the proposal did not get much traction and I would have been
> unable to contribute to gcc myself, I just gave up on it. IIRC the
> deadline for new proposals closed soon after, anyway.
>
> But I am glad that someone with proper experience took the initiative.
> I still think the proposal is relevant and has interesting use cases.
>
> > I have only added lengthof for now, not _Lengthof, as suggested by Jens.
> > Depending on feedback, I'll propose the uglified version.
>
> Probably, all of us know why the uglified version is the usual approach
> preferred by the C standard: we do not know how many applications would
> break otherwise.
>
> However, we see that this trend is now changing with C23, so probably
> it makes sense to define lengthof directly.

When I suggested that the double-underscore version is sufficient, I was not thinking that there would be a paper to WG14 so quickly. For integration into gcc and clang the double underscore is certainly enough. Standardization is another question.

> As for the parentheses, I personally think lengthof should follow
> similar rules compared to sizeof.
>
> Best regards,
>
> --
> Xavier Del Campo Romero

Jens
On 14 August 2024 01:27:33 CEST, Alejandro Colomar <alx@kernel.org> wrote:
> [...]
> Checking <https://codesearch.debian.net>, I see that while several
> projects have a lengthof() macro, all of them use it with semantics
> compatible with this keyword, so it shouldn't break too much. Maybe
> those projects will start receiving diagnostics that they're redefining
> a standard keyword, but that's not too bad.

For a WG14 paper you should add these findings to support that choice. Another option would be for WG14 to standardize the then existing implementation with the double underscores.

> > As for the parentheses, I personally think lengthof should follow
> > similar rules compared to sizeof.
>
> I think most people agree with this.

I still don't, in particular not for standardisation.

We have to remember that there are many small C compilers out there. I would not want to put an unnecessary burden on them. So my preferred choice would be a standardisation as a macro, similar to offsetof. gcc (and clang) could then just map that to their builtin; other compilers could use whatever they have at the moment, even just the macros that you have in the paper as a starting point.

The rest would be "quality of implementation".

What time horizon do you see to add the feature for array parameters?

Thanks
Jens
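As a minimal sketch of the macro approach described above (a header that maps `lengthof` to a compiler operator where one exists, and to the plain sizeof division elsewhere), something like the following could work. `HAVE_LENGTHOF_KEYWORD`, `primes` and `count_primes()` are hypothetical names used only for illustration; no compiler is assumed to define the configuration macro, and the fallback branch silently accepts pointers.

```c
#include <stddef.h>

/*
 * HAVE_LENGTHOF_KEYWORD is a hypothetical configuration macro.  Define it
 * only for a compiler known to carry the __lengthof__ operator proposed
 * in this patch series.
 */
#if defined(HAVE_LENGTHOF_KEYWORD)
# define lengthof(a)  __lengthof__(a)
#else
/* Fallback: the classic sizeof division.  It does not reject pointers. */
# define lengthof(a)  (sizeof(a) / sizeof((a)[0]))
#endif

/* Hypothetical example table. */
static const int primes[] = {2, 3, 5, 7, 11};

/* Returns the number of elements in primes[]. */
static size_t count_primes(void)
{
	return lengthof(primes);
}
```

Code that already has its own `lengthof()` macro could guard its definition with `#ifndef lengthof`, which is the feature-test property that a macro-based design offers.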
Hi Jens, Martin,

On Wed, Aug 14, 2024 at 08:11:20AM GMT, Jens Gustedt wrote:
> > Checking <https://codesearch.debian.net>, I see that while several
> > projects have a lengthof() macro, all of them use it with semantics
> > compatible with this keyword, so it shouldn't break too much. Maybe
> > those projects will start receiving diagnostics that they're redefining
> > a standard keyword, but that's not too bad.
>
> For a WG14 paper you should add these findings to support that choice.
> Another option would be for WG14 to standardize the then existing implementation with the double underscores.

Makes sense; I'll add that into new "Prior art" and "Backwards compatibility" sections within the paper.

> > > As for the parentheses, I personally think lengthof should follow
> > > similar rules compared to sizeof.
> >
> > I think most people agree with this.
>
> I still don't, in particular not for standardisation.
>
> We have to remember that there are many small C compilers out there.
> I would not want unnecessary burden on them. So my preferred choice would be
> a standardisation as a macro, similar to offsetof.
> gcc (and clang) could then just map that to their builtin, other compilers could use
> whatever they have at the moment, even just the macros that you have in the paper as a starting point.
>
> The rest would be "quality of implementation"

Hmmm, sounds reasonable.

Some doubts: If we allow a compiler to implement it as a predefined macro that expands to the usual sizeof division, it might produce double evaluation in some VLA cases. That would be surprising to some programs, which may expect either 0 or 1 evaluations, but not 2.

Maybe we can leave it as unspecified behavior, and an implementation may document that double evaluation may happen if the input is a VLA?

> What time horizon do you see to add the feature for array parameters?

Martin, what do you think? I think the only blocking thing for me is what you mentioned about turning function parameters into arrays that decay almost everywhere. Once that's set up, my code will probably work with them without modification, or maybe with just a little tweak. Do you have an idea of how much time that can take you? I expect it to be well before C2y. Maybe a year or two?

Have a lovely day!
Alex
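The double-evaluation concern can be illustrated with a small sketch. It assumes the usual sizeof-division macro (called `NELEMS` here; the name, `idx()`, `calls` and `outer_length()` are all hypothetical) and relies on the C rule that the operand of sizeof is evaluated when its type is a variable length array type. With a two-dimensional VLA, both `p[idx()]` (type `int[n][m]`) and `(p[idx()])[0]` (type `int[m]`) have VLA types, so both sizeof operands would be expected to be evaluated.

```c
#include <stddef.h>

#define NELEMS(a)  (sizeof(a) / sizeof((a)[0]))

static int calls;  /* counts how many times idx() has run */

/* Index expression with a visible side effect. */
static size_t idx(void)
{
	calls++;
	return 0;
}

/*
 * n and m are run-time values, so int[n][m] and int[m] are VLA types.
 * Returns the length of the outer dimension; 'calls' records how many
 * times the macro argument was evaluated.
 */
static size_t outer_length(size_t n, size_t m)
{
	int arr[n][m];
	int (*p)[n][m] = &arr;

	calls = 0;
	/* Expands to sizeof(p[idx()]) / sizeof((p[idx()])[0]). */
	return NELEMS(p[idx()]);
}
```

After `outer_length(3, 4)`, the result is 3 and `calls` is expected to be 2 on a conforming compiler, whereas an operator implemented in the compiler could guarantee at most one evaluation.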
Sorry for top-posting, my work account is stuck on Outlook. :-/

> For a WG14 paper you should add these findings to support that choice.
> Another option would be for WG14 to standardize the then existing implementation with the double underscores.

+1, it's always good to explain prior art and existing uses as part of the paper. However, please also point out that C++ has prior art as well, which is slightly different and very much worth considering: it has one API for getting the array's rank, and another for getting a specific rank's extent. This is a general solution that doesn't require the programmer to have deep knowledge of C's declarator syntax and how it relates to multidimensional arrays.

That said, I suspect WG14 would not be keen on standardizing `lengthof` without an ugly keyword given that there are plenty of other uses of it that would break:

https://sourcegraph.com/github.com/illumos/illumos-gate/-/blob/usr/src/cmd/mailx/names.c?L53-55
https://sourcegraph.com/github.com/Rockbox/rockbox/-/blob/tools/ipod_fw.c?L292-294
https://sourcegraph.com/github.com/OpenSmalltalk/opensmalltalk-vm/-/blob/src/spur64.stack/validImage.c?L7014-7018
(and many, many others)

>> > As for the parentheses, I personally think lengthof should follow
>> > similar rules compared to sizeof.
>>
>> I think most people agree with this.
>
> I still don't, in particular not for standardisation.
>
> We have to remember that there are many small C compilers out there.

Those compilers already have to handle parsing this for sizeof, so that's not particularly compelling (even if we wanted to design C for the lowest common denominator of implementation effort, which I'm not convinced is a good approach these days).

That said, if we went with a rank/extent design, I think we'd *have* to use parens because the extent interface would take two operands (the array and the rank you're interested in getting the extent of) and it would be inconsistent for the rank interface to then not require parens.

~Aaron
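For reference, the C++ prior art mentioned above is `std::rank` and `std::extent` from `<type_traits>`. A rough C sketch of the per-dimension "extent" idea, built only from the sizeof division, might look as follows. The macro names `EXTENT0`, `EXTENT1`, `EXTENT2`, the array `grid` and `total_elements()` are hypothetical, and the dimension is baked into the macro name rather than passed as a second operand as a real extent interface would do.

```c
#include <stddef.h>

#define NELEMS(a)   (sizeof(a) / sizeof((a)[0]))

/* Hypothetical per-dimension helpers: extent of dimension 0, 1 and 2. */
#define EXTENT0(a)  NELEMS(a)
#define EXTENT1(a)  NELEMS((a)[0])
#define EXTENT2(a)  NELEMS((a)[0][0])

/* Example three-dimensional array. */
static int grid[2][3][4];

/* Total element count, computed as the product of all extents. */
static size_t total_elements(void)
{
	return EXTENT0(grid) * EXTENT1(grid) * EXTENT2(grid);
}
```

Here `EXTENT0(grid)` corresponds to what a single-operand `lengthof(grid)` would return, which is why the two designs differ mainly in how the inner dimensions are reached.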
Hi Aaron,

On 14 August 2024 13:31:19 CEST, "Ballman, Aaron" <aaron.ballman@intel.com> wrote:
> [...]
> > We have to remember that there are many small C compilers out there.
>
> Those compilers already have to handle parsing this for sizeof, so that's not particularly compelling (even if we wanted to design C for the lowest common denominator of implementation effort, which I'm not convinced is a good approach these days). That said, if we went with a rank/extent design, I think we'd *have* to use parens because the extent interface would take two operands (the array and the rank you're interested in getting the extent of) and it would be inconsistent for the rank interface to then not require parens.

I think that this argument falls short. E.g., implementations that already have compound expressions (or lambdas ;-) may provide a quality implementation using `static_assert` and `typeof` alone, and don't have to touch their compiler at all.

We should not impose an implementation in the language where doing it in a header can be completely sufficient.

Plus, implementing it as a macro in a header (probably <stddef.h>) also provides a feature test, for those applications that already have something similar. This was basically what we did for `unreachable` and I think it worked out fine.

Jens
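A header-only implementation along those lines could be sketched as below. This is not the macro from the paper; it is an illustration that assumes the GNU statement-expression extension plus `__typeof__` and `__builtin_types_compatible_p` as provided by GCC and Clang, and the names `LENGTHOF`, `weekdays` and `weekday_count()` are hypothetical. The static assertion fires when the argument has the same type as a pointer to its own first element, which is the case for pointers but not for arrays.

```c
#include <stddef.h>

/*
 * Pointer-rejecting length macro (GNU C extensions, see the lead-in).
 * For an array T[N], __typeof__(a) is T[N] and __typeof__(&(a)[0]) is T *,
 * which are not compatible, so the assertion passes.  For a pointer both
 * types are T *, so compilation fails.
 */
#define LENGTHOF(a)                                                     \
({                                                                      \
	_Static_assert(!__builtin_types_compatible_p(__typeof__(a),     \
	                                             __typeof__(&(a)[0])), \
	               "LENGTHOF() argument must be an array");         \
	sizeof(a) / sizeof((a)[0]);                                     \
})

/* Hypothetical example table. */
static const char *const weekdays[] = {
	"Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"
};

/* Returns the number of entries in weekdays[]. */
static size_t weekday_count(void)
{
	return LENGTHOF(weekdays);
}
```

One limitation of this form is that a statement expression cannot appear at file scope and does not yield an integer constant expression, so it cannot size another array at file scope the way a built-in operator could.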
> I think that this argument goes too short. E. g. implementation that already have compound expressions (or lambdas ;-) may provide a
> quality implementation using `static_assert` and `typeof` alone, and don't have to touch their compiler at all.
>
> We should not impose an implementation in the language where doing it in a header can be completely sufficient.

But can doing this in a header be completely sufficient in practice? E.g., a user who passes a pointer rather than an array is in for quite a surprise, as is one who passes a struct, a FAM, etc. If we want to put constraints on the interface, that may be more challenging to do from a header file than from the compiler. offsetof is a cautionary tale in that compilers that want a reasonable QoI basically all implement it as a builtin rather than the header-only version.

> Plus, implementing as a macro in a header (probably <stddef.h>) makes also a feature test, for those applications that already have something similar.
> this was basically what we did for `unreachable` and I think it worked out fine.

True!

I'm still thinking on how important rank + extent is vs overall array length. If C had constexpr functions, then I'd almost certainly want array rank and extent to be the building blocks and then lengthof can be a constexpr function looping over rank and summing extents. But we don't have that yet, and "bird in hand" vs "bird in bush"...
:-D ~Aaron -----Original Message----- From: Jens Gustedt <jens.gustedt@inria.fr> Sent: Wednesday, August 14, 2024 8:18 AM To: Ballman, Aaron <aaron.ballman@intel.com>; Alejandro Colomar <alx@kernel.org>; Xavier Del Campo Romero <xavi.dcr@tutanota.com> Cc: Gcc Patches <gcc-patches@gcc.gnu.org>; Daniel Plakosh <dplakosh@cert.org>; Martin Uecker <uecker@tugraz.at>; Joseph Myers <josmyers@redhat.com>; Gabriel Ravier <gabravier@gmail.com>; Jakub Jelinek <jakub@redhat.com>; Kees Cook <keescook@chromium.org>; Qing Zhao <qing.zhao@oracle.com>; David Brown <david.brown@hesbynett.no>; Florian Weimer <fweimer@redhat.com>; Andreas Schwab <schwab@linux-m68k.org>; Timm Baeder <tbaeder@redhat.com>; A. Jiang <de34@live.cn>; Eugene Zelenko <eugene.zelenko@gmail.com> Subject: RE: v2.1 Draft for a lengthof paper Hi Aaron, Am 14. August 2024 13:31:19 MESZ schrieb "Ballman, Aaron" <aaron.ballman@intel.com>: > Sorry for top-posting, my work account is stuck on Outlook. :-/ > > > For a WG14 paper you should add these findings to support that choice. > > Another option would be for WG14 to standardize the then existing implementation with the double underscores. > > +1, it's always good to explain prior art and existing uses as part of the paper. However, please also point out that C++ has a prior art as well which is slightly different and very much worth considering: they have one API for getting the array's rank, and another for getting a specific rank's extent. This is a general solution that doesn't require the programmer to have deep knowledge of C's declarator syntax and how it relates to multidimensional arrays. 
> > That said, I suspect WG14 would not be keen on standardizing `lengthof` without an ugly keyword given that there are plenty of other uses of it that would break: > > https://sourcegraph.com/github.com/illumos/illumos-gate/-/blob/usr/src > /cmd/mailx/names.c?L53-55 > https://sourcegraph.com/github.com/Rockbox/rockbox/-/blob/tools/ipod_f > w.c?L292-294 > https://sourcegraph.com/github.com/OpenSmalltalk/opensmalltalk-vm/-/bl > ob/src/spur64.stack/validImage.c?L7014-7018 > (and many, many others) > > >> > As for the parentheses, I personally think lengthof should follow > >> > similar rules compared to sizeof. > >> > >> I think most people agree with this. > > > > I still don't, in particular not for standardisation. > > > > We have to remember that there are many small C compilers out there. > > Those compilers already have to handle parsing this for sizeof, so that's not particularly compelling (even if we wanted to design C for the lowest common denominator of implementation effort, which I'm not convinced is a good approach these days). That said, if we went with a rank/extent design, I think we'd *have* to use parens because the extent interface would take two operands (the array and the rank you're interested in getting the extent of) and it would be inconsistent for the rank interface to then not require parens. I think that this argument goes too short. E. g. implementation that already have compound expressions (or lambdas ;-) may provide a quality implementation using `static_assert` and `typeof` alone, and don't have to touch their compiler at all. We should not impose an implementation in the language where doing it in a header can be completely sufficient. Plus, implementing as a macro in a header (probably <stddef.h>) makes also a feature test, for those applications that already have something similar. this was basically what we did for `unreachable` and I think it worked out fine. 
Jens > ~Aaron > > -----Original Message----- > From: Jens Gustedt <jens.gustedt@inria.fr> > Sent: Wednesday, August 14, 2024 2:11 AM > To: Alejandro Colomar <alx@kernel.org>; Xavier Del Campo Romero > <xavi.dcr@tutanota.com> > Cc: Gcc Patches <gcc-patches@gcc.gnu.org>; Daniel Plakosh > <dplakosh@cert.org>; Martin Uecker <uecker@tugraz.at>; Joseph Myers > <josmyers@redhat.com>; Gabriel Ravier <gabravier@gmail.com>; Jakub > Jelinek <jakub@redhat.com>; Kees Cook <keescook@chromium.org>; Qing > Zhao <qing.zhao@oracle.com>; David Brown <david.brown@hesbynett.no>; > Florian Weimer <fweimer@redhat.com>; Andreas Schwab > <schwab@linux-m68k.org>; Timm Baeder <tbaeder@redhat.com>; A. Jiang > <de34@live.cn>; Eugene Zelenko <eugene.zelenko@gmail.com>; Ballman, > Aaron <aaron.ballman@intel.com> > Subject: Re: v2.1 Draft for a lengthof paper > > Am 14. August 2024 01:27:33 MESZ schrieb Alejandro Colomar <alx@kernel.org>: > > Hi Xavier, > > > > On Wed, Aug 14, 2024 at 12:38:53AM GMT, Xavier Del Campo Romero wrote: > > > I have been overseeing these last emails - > > > > Ahhh, good to know; thanks! :) > > > > > thank you very much for your > > > efforts, Alex! > > > > :-) > > > > > I did not reply until now because I do not have prior experience > > > with gcc internals, so my feedback would probably have not been > > > that useful. > > > > Ok. > > > > > Those emails from 2020 were in fact discussing two completely > > > different proposals at once: > > > > > > 1. Add _Lengthof + #include <stdlengthof.h> 2. Allow static > > > qualifier on compound literals > > > > Yup. > > > > > Whereas proposal #2 made it into C23 (kudos to Jens Gustedt!), and > > > as you already know by now, proposal #1 received some negative > > > feedback, suggesting _Typeof/typeof + some macro magic as a > > > pragmatic workaround instead. > > > > The original author of that negative feedback talked to me in > > private a week ago, and said he likes my proposal. We have no > > negative feedback anymore. 
:) > > > > > Since the proposal did not get much traction and I would had been > > > unable to contribute to gcc myself, I just gave up on it. IIRC the > > > deadline for new proposals closed soon after, anyway. > > > > Ok. > > > > > But I am glad that someone with proper experience took the initiative. > > > > Fun fact: this is my second non-trivial patch to GCC. I wouldn't > > say I had the proper experience with GCC internals when I started > > this patch set. But I'm unemployed at the moment, which gives me > > all the time I need for learning those. :) > > > > > I still think the proposal is relevant and has interesting use cases. > > > > > > > I have only added lengthof for now, not _Lengthof, as suggested by Jens. > > > > Depending on feedback, I'll propose the uglified version. > > > > > > Probably, all of us know why the uglified version is the usual > > > approach preferred by the C standard: we do not know how many > > > applications would break otherwise. > > > > Yup. > > > > > However, we see that this trend is now changing with C23, so > > > probably it makes sense to define lengthof directly. > > > > Yeah, since Jens is in WG14 and he suggested to follow this trend, > > maybe we can. If not, it's trivial to change the proposal to use > > the uglified name plus a macro. > > > > Checking <https://codesearch.debian.net>, I see that while several > > projects have a lengthof() macro, all of them use it with semantics > > compatible with this keyword, so it shouldn't break too much. Maybe > > those projects will start receiving diagnostics that they're > > redefining a standard keyword, but that's not too bad. > > For a WG14 paper you should add these findings to support that choice. > Another option would be for WG14 to standardize the then existing implementation with the double underscores. > > > > As for the parentheses, I personally think lengthof should follow > > > similar rules compared to sizeof. > > > > I think most people agree with this. 
> > I still don't, in particular not for standardisation. > > We have to remember that there are many small C compilers out there. > I would not want unnecessary burden on them. So my preferred choice would be a standardisation as a macro, similar to offsetof. > gcc (and clang) could then just map that to their builtin, other compilers could use whatever they have at the moment, even just the macros that you have in the paper as a starting point. > > The rest would be "quality of implementation" > > What time horizon do you see to add the feature for array parameters? > > Thanks > Jens > > > > > Best regards, > > > > Have a lovely night! > > Alex > > > > > -- > Jens Gustedt - INRIA & ICube, Strasbourg, France
Hi Aaron, Jens,

On Wed, Aug 14, 2024 at 02:17:52PM GMT, Jens Gustedt wrote:
> On 14 August 2024 13:31:19 CEST, "Ballman, Aaron" <aaron.ballman@intel.com> wrote:
> > Sorry for top-posting, my work account is stuck on Outlook. :-/
> >
> > > For a WG14 paper you should add these findings to support that choice.
> > > Another option would be for WG14 to standardize the then existing implementation with the double underscores.
> >
> > +1, it's always good to explain prior art and existing uses as part
> > of the paper. However, please also point out that C++ has a prior
> > art as well which is slightly different and very much worth
> > considering: they have one API for getting the array's rank,
> > and another for getting a specific rank's extent. This is a general
> > solution that doesn't require the programmer to have deep knowledge
> > of C's declarator syntax and how it relates to multidimensional
> > arrays.

I have added that to my draft. I'll publish it soon as a reply to the
GCC mailing list. See below for details of what I have added for now.

> > That said, I suspect WG14 would not be keen on standardizing
> > `lengthof` without an ugly keyword given that there are plenty of other uses of it that would break:
> >
> > https://sourcegraph.com/github.com/illumos/illumos-gate/-/blob/usr/src/cmd/mailx/names.c?L53-55
> > https://sourcegraph.com/github.com/Rockbox/rockbox/-/blob/tools/ipod_fw.c?L292-294
> > https://sourcegraph.com/github.com/OpenSmalltalk/opensmalltalk-vm/-/blob/src/spur64.stack/validImage.c?L7014-7018
> > (and many, many others)

What regex did you use for searching?

I was thinking of renaming the proposal to elementsof(), to avoid
confusion between length of an array and length of a string. Would you
mind checking if elementsof() is ok?

> > >> > As for the parentheses, I personally think lengthof should follow
> > >> > similar rules compared to sizeof.
> > >>
> > >> I think most people agree with this.
> > >
> > > I still don't, in particular not for standardisation.
> > >
> > > We have to remember that there are many small C compilers out there.
> >
> > Those compilers already have to handle parsing this for sizeof, so
> > that's not particularly compelling

Agree. I suspect it will be simpler for existing compilers to follow
sizeof than to have new syntax. However, it's easy to keep it as a QoI
detail, so I've temporarily changed the wording to require parentheses,
and let implementations lift that restriction.

> > (even if we wanted to design C
> > for the lowest common denominator of implementation effort, which
> > I'm not convinced is a good approach these days).

Off-topic, but I wish that had been the approach when a few
implementations (I suspect proprietary vendors; this was never
disclosed) rejected redefining NULL as the right thing: (void *) 0.
I fixed one of the last free-software implementations of NULL that
expanded to 0, and nullptr would probably never have been added if WG14
had not accepted the pressure from such horrible implementations.

<https://github.com/cc65/cc65/issues/1823>

> > That said, if we went with a rank/extent design, I think we'd *have*
> > to use parens because the extent interface would take two operands
> > (the array and the rank you're interested in getting the extent of)
> > and it would be inconsistent for the rank interface to then not
> > require parens.

Prior art
   C
      It is common in C programs to get the number of elements of an
      array via the usual sizeof division and wrap it in a macro.
      Common names include:

      •  ARRAY_SIZE()
      •  NELEM()
      •  NELEMS()
      •  NITEMS()
      •  NELTS()
      •  elementsof()
      •  lengthof()

   C++
      In C++, there are several standard features to determine the
      number of elements of an array:

      std::size()   (since C++17)
      std::ssize()  (since C++20)
            The syntax of these is identical to the usual C macros
            named above.
            It’s a bit different, since it’s a general purpose sizing
            template, which works on non‐array types too, with
            different semantics. But when applied to an array, it has
            the same semantics as the macros above.

      std::extent  (since C++11)
            The syntax of this is quite different. It uses a numeric
            index as a second parameter to determine the dimension in
            which the number of elements should be counted.

            C arrays are much simpler than C++’s many array‐like types,
            and I don’t see a reason why we would need something as
            complex as std::extent in C. Certainly, existing projects
            have not developed such a macro, even if it is technically
            possible:

                #define DEREFERENCE(a, n)  DEREFERENCE_ ## n (a)
                #define DEREFERENCE_9(a)   (*********(a))
                #define DEREFERENCE_8(a)   (********(a))
                #define DEREFERENCE_7(a)   (*******(a))
                #define DEREFERENCE_6(a)   (******(a))
                #define DEREFERENCE_5(a)   (*****(a))
                #define DEREFERENCE_4(a)   (****(a))
                #define DEREFERENCE_3(a)   (***(a))
                #define DEREFERENCE_2(a)   (**(a))
                #define DEREFERENCE_1(a)   (*(a))
                #define DEREFERENCE_0(a)   ((a))
                #define extent(a, n)       nitems(DEREFERENCE(a, n))

            If any project needs that syntax, they can implement their
            own trivial wrapper macro, as demonstrated above.

      Existing prior art in C seems to favour a design that follows the
      syntax of other operators like sizeof.

> I think that this argument goes too short. E. g. implementation that
> already have compound expressions (or lambdas ;-) may provide a
> quality implementation using `static_assert` and `typeof` alone, and
> don't have to touch their compiler at all.
>
> We should not impose an implementation in the language where doing it
> in a header can be completely sufficient.

I have concerns about a libc (or a predefined macro) implementation:
the sizeof division causes double evaluation with any VLAs, while my
implementation for GCC has fewer cases of evaluation, and when it needs
to evaluate, it only does it once.
It would be hard to find a good wording that would allow an
implementation to implement this as a macro.

   constexpr
      The usual sizeof division evaluates the operand and results in a
      run‐time value in cases where it wouldn’t be necessary. If the
      number of elements of the top‐level array is determined by an
      integer constant expression, but an internal array is a VLA,
      sizeof must evaluate:

          int  a[7][n];
          int  (*p)[7][n];

          p = &a;
          nitems(*p++);

      With an elementsof operator, this would result in an integer
      constant expression of value 7.

   Double evaluation
      With the sizeof‐based implementation from above, the example from
      above causes double evaluation of *p++.

> Plus, implementing as a macro in a header (probably <stddef.h>) makes
> also a feature test, for those applications that already have
> something similar.

This is interesting. But I think an implementation could just

    #define lengthof lengthof

to provide a feature-test macro.

> this was basically what we did for `unreachable` and I think it worked
> out fine.
>
> Jens

Have a lovely day!
Alex
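As an illustration of the constexpr point above, here is a minimal sketch of what the sizeof division does with an array whose outer dimension is constant and whose inner dimension is variable. The helper name rows_of_mixed_array() is hypothetical, nitems() is the plain sizeof division rather than the proposed operator, and the sketch assumes a compiler with VLA support and n greater than zero.

```c
#include <stddef.h>

/* The usual sizeof division (the common idiom, not the proposed operator). */
#define nitems(a)  (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: number of rows of a 7-by-n array (n must be > 0). */
size_t rows_of_mixed_array(size_t n)
{
    int a[7][n];  /* outer dimension constant, inner dimension variable */

    /*
     * sizeof(a) and sizeof(a[0]) both depend on n, so the division is
     * computed at run time even though the answer is always 7.  For the
     * same reason, nitems(a) is not an integer constant expression here
     * and could not be used in a static_assert.
     */
    return nitems(a);
}
```

The operand in this sketch has no side effects, so the run-time evaluation is harmless; the concern raised in the message is about operands such as *p++, where each evaluation is observable.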
Hi Aaron,

On Wed, Aug 14, 2024 at 12:40:41PM GMT, Ballman, Aaron wrote:
> > We should not impose an implementation in the language where doing
> > it in a header can be completely sufficient.
>
> But can doing this in a header be completely sufficient in practice?
> e.g., the user who passes a pointer rather than an array is in for
> quite a surprise, or passing a struct, or passing a FAM, etc. If we
> want to put constraints on the interface, that may be more challenging
> to do from a header file than from the compiler.

I've provided a C23-portable and safe implementation of lengthof() as a
macro:

   Portability
      Prior to C23 it was impossible to do this portably, but since C23
      it is possible to portably write a macro that determines the
      number of elements of an array.

          #define must_be(e) \
          ( \
              0 * (int) sizeof( \
                  struct { \
                      static_assert(e); \
                      int  ISO_C_forbids_a_struct_with_no_members; \
                  } \
              ) \
          )

          #define is_array(a) \
          ( \
              _Generic(&(a), \
                  typeof((a)[0]) **:  0, \
                  default:            1 \
              ) \
          )

          #define sizeof_array(a)  (sizeof(a) + must_be(is_array(a)))
          #define nitems(a)        (sizeof_array(a) / sizeof((a)[0]))

      While diagnostics could be better, with good helper‐macro names,
      they are decent.

The issues with this implementation are also listed in the paper.
Here's a TL;DR:

-  It doesn't accept type names.

-  It results unnecessarily in run-time values where a keyword could
   result in an integer constant expression:

       int  a[7][n];
       int  (*p)[7][n];

       p = &a;
       nitems(*p++);

-  Double evaluation: not only does the macro evaluate in more cases
   than a keyword, it evaluates twice (due to the two sizeof calls).

-  Fewer diagnostics. Since there are fewer constant expressions, there
   are fewer opportunities to catch UB.

So far, we've lived with all of those issues (plus the lack of
portability, since this could only be implemented via compiler
extensions until C23).
But ideally, I'd like to avoid the wording juggling that would
be required to allow such an implementation. Here's an example of the
difference in wording that would be required:

     The elementsof operator yields the number of elements of its operand.
     The number of elements is determined from the type of the operand.
     The result is an integer.
     If the number of elements of the array type is variable,
     the operand is evaluated;
    +otherwise,
    +if the operand is a variable-length array,
    +it is unspecified whether the operand is evaluated;
     otherwise,
     the operand is not evaluated and the result is an integer constant.
    +If the operand is evaluated,
    +it is unspecified the number of times it is evaluated.

Which sounds very suspicious.

> I'm still thinking on how important rank + extent is vs overall array
> length. If C had constexpr functions, then I'd almost certainly want
> array rank and extent to be the building blocks and then lengthof can
> be a constexpr function looping over rank and summing extents. But we
> don't have that yet, and "bird hand" vs "bird in bush"... :-D

Or you can build it the other way around: define extent() as a macro
that wraps lengthof(). About rank, I suspect you could also develop
something with _Generic(3), but I didn't try.

Cheers,
Alex
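Building extent() on top of a number-of-elements macro can be sketched as follows. This is only an illustration under assumptions: nitems() stands in for the proposed operator using the plain sizeof division, only dimensions 0 to 2 are covered, the dimension must be written as a literal digit, and grid_extent() is a hypothetical helper added for demonstration.

```c
#include <assert.h>  /* static_assert macro before C23 */
#include <stddef.h>

/* Stand-in for the proposed operator: the usual sizeof division. */
#define nitems(a)  (sizeof(a) / sizeof((a)[0]))

/* Dereference n times, where n is a literal digit (0 to 2 in this sketch). */
#define DEREFERENCE_0(a)   ((a))
#define DEREFERENCE_1(a)   (*(a))
#define DEREFERENCE_2(a)   (**(a))
#define DEREFERENCE(a, n)  DEREFERENCE_ ## n (a)

/* Number of elements along dimension n of array a. */
#define extent(a, n)  nitems(DEREFERENCE(a, n))

/* Hypothetical helper: extent of a 3x5x2 array along dimension dim. */
size_t grid_extent(int dim)
{
    int grid[3][5][2];

    /* With no VLAs involved, each extent is an integer constant expression. */
    static_assert(extent(grid, 0) == 3, "outer dimension");

    switch (dim) {
    case 0:  return extent(grid, 0);
    case 1:  return extent(grid, 1);
    case 2:  return extent(grid, 2);
    default: return 0;  /* dimensions beyond 2 are not covered here */
    }
}
```

Because the dimension is token-pasted into a macro name, it cannot be a run-time value or an arbitrary expression, which is one way this differs from a two-operand language feature.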
> What regex did you use for searching?

I went cheap and easy rather than trying to narrow down: https://sourcegraph.com/search?q=context:global+lang:C+lengthof&patternType=regexp&sm=0

> I was thinking of renaming the proposal to elementsof(), to avoid confusion between length of an array and length of a string. Would you mind checking if elementsof() is ok?

From what I was seeing, it looks to be used more uniformly as a function-like macro accepting a single argument.

~Aaron
On 14 August 2024 14:40:41 CEST, "Ballman, Aaron" <aaron.ballman@intel.com> wrote:
> > I think that this argument goes too short. E. g. implementation that already have compound expressions (or lambdas ;-) may provide a quality implementation using `static_assert` and `typeof` alone, and don't have to touch their compiler at all.
> >
> > We should not impose an implementation in the language where doing it in a header can be completely sufficient.
>
> But can doing this in a header be completely sufficient in practice?

I think so.

> e.g., the user who passes a pointer rather than an array is in for quite a surprise, or passing a struct, or passing a FAM, etc. If we want to put constraints on the interface, that may be more challenging to do from a header file than from the compiler. offsetof is a cautionary tale in that compilers that want a reasonable QoI basically all implement this as a builtin rather than the header-only version.

Yes, with the tools that I listed and the ideas that are already in the paper you can basically do all that, including giving valuable feedback in case of failure.

I am currently on a summer bike trip, so I am not able to provide a full reference implementation. But I could do so once I am back.

> > Plus, implementing as a macro in a header (probably <stddef.h>) makes also a feature test, for those applications that already have something similar.
> > this was basically what we did for `unreachable` and I think it worked out fine.
>
> True!
>
> I'm still thinking on how important rank + extent is vs overall array length. If C had constexpr functions, then I'd almost certainly want array rank and extent to be the building blocks and then lengthof can be a constexpr function looping over rank and summing extents. But we don't have that yet, and "bird hand" vs "bird in bush"... :-D

Why would you be looping? lengthof only addresses the outer dimension; sizeof would need a loop, no?
Generally I would be opposed to imposing a complicated solution for a simple feature.

Jens
On Wednesday, 14.08.2024 at 12:40 +0000, Ballman, Aaron wrote:
> > I think that this argument goes too short. E. g. implementation that already have compound expressions (or lambdas
> > ;-) may provide a quality implementation using `static_assert` and `typeof` alone, and don't have to touch their
> > compiler at all.
> >
> > We should not impose an implementation in the language where doing it in a header can be completely sufficient.
>
> But can doing this in a header be completely sufficient in practice? e.g., the user who passes a pointer rather than
> an array is in for quite a surprise, or passing a struct, or passing a FAM, etc. If we want to put constraints on the
> interface, that may be more challenging to do from a header file than from the compiler. offsetof is a cautionary tale
> in that compilers that want a reasonable QoI basically all implement this as a builtin rather than the header-only
> version.
>
> > Plus, implementing as a macro in a header (probably <stddef.h>) makes also a feature test, for those applications
> > that already have something similar.
> > this was basically what we did for `unreachable` and I think it worked out fine.
>
> True!
>
> I'm still thinking on how important rank + extent is vs overall array length. If C had constexpr functions, then I'd
> almost certainly want array rank and extent to be the building blocks and then lengthof can be a constexpr function
> looping over rank and summing extents. But we don't have that yet, and "bird hand" vs "bird in bush"... :-D

An operator that returns an array with all dimensions of a multi-dimensional array would make a lot of sense to me.

  double array[4][3][2];

  // array_dims(array) = (constexpr size_t[3]){ 4, 3, 2 }

  int dim1 = (array_dims(array))[0];
  int dim2 = (array_dims(array))[1];
  int dim3 = (array_dims(array))[2];

You can then implement lengthof in terms of this operator:

  #define lengthof(x) (array_dims(x)[0])

and you can obtain the rank by applying lengthof to the result of the operator:

  #define rank(x) lengthof(array_dims(x))

If the array is constexpr for regular arrays and array indexing returns a constant again for constexpr arrays, this would all work out.

Martin

>
> ~Aaron
>
> -----Original Message-----
> From: Jens Gustedt <jens.gustedt@inria.fr>
> Sent: Wednesday, August 14, 2024 8:18 AM
> To: Ballman, Aaron <aaron.ballman@intel.com>; Alejandro Colomar <alx@kernel.org>; Xavier Del Campo Romero
> <xavi.dcr@tutanota.com>
> Cc: Gcc Patches <gcc-patches@gcc.gnu.org>; Daniel Plakosh <dplakosh@cert.org>; Martin Uecker <uecker@tugraz.at>;
> Joseph Myers <josmyers@redhat.com>; Gabriel Ravier <gabravier@gmail.com>; Jakub Jelinek <jakub@redhat.com>; Kees Cook
> <keescook@chromium.org>; Qing Zhao <qing.zhao@oracle.com>; David Brown <david.brown@hesbynett.no>; Florian Weimer
> <fweimer@redhat.com>; Andreas Schwab <schwab@linux-m68k.org>; Timm Baeder <tbaeder@redhat.com>; A. Jiang
> <de34@live.cn>; Eugene Zelenko <eugene.zelenko@gmail.com>
> Subject: RE: v2.1 Draft for a lengthof paper
>
> Hi Aaron,
>
> On 14 August 2024 13:31:19 CEST, "Ballman, Aaron" <aaron.ballman@intel.com> wrote:
> > Sorry for top-posting, my work account is stuck on Outlook. :-/
> >
> > > For a WG14 paper you should add these findings to support that choice.
> > > Another option would be for WG14 to standardize the then existing implementation with the double underscores.
> >
> > +1, it's always good to explain prior art and existing uses as part of the paper.
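As a rough illustration of the `array_dims` idea in Martin's message, the sketch below emulates it with ordinary macros for the rank-3 case only. `NITEMS`, `ARRAY_DIMS3`, `LENGTHOF3` and `RANK3` are hypothetical names invented for this example; no compiler provides `array_dims`, and a real operator would presumably handle any rank and reject pointers, which this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>

/* Classic sizeof division; not pointer-proof. */
#define NITEMS(a)  (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for the proposed operator, hard-wired to rank 3:
 * builds a compound literal holding the three dimensions of `a`. */
#define ARRAY_DIMS3(a)                                                \
        ((const size_t[3]){ NITEMS(a), NITEMS((a)[0]), NITEMS((a)[0][0]) })

/* lengthof and rank expressed in terms of the dimensions array. */
#define LENGTHOF3(a)  (ARRAY_DIMS3(a)[0])
#define RANK3(a)      NITEMS(ARRAY_DIMS3(a))

/* Packs the three dimensions of a double[4][3][2] into one number. */
static size_t demo_dims(void)
{
        double array[4][3][2];
        const size_t *d = ARRAY_DIMS3(array);

        return d[0] * 100 + d[1] * 10 + d[2];
}

/* Outer length and rank of the same array type. */
static size_t demo_length(void)
{
        double array[4][3][2];

        return LENGTHOF3(array);
}

static size_t demo_rank(void)
{
        double array[4][3][2];

        return RANK3(array);
}
```

The point of the design is that one primitive (the dimensions array) yields both the outer length and the rank, so no second operand is needed as in a rank/extent pair.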
On 14 August 2024 14:58:16 CEST, Alejandro Colomar <alx@kernel.org> wrote:
> Hi Aaron, Jens,
>
> On Wed, Aug 14, 2024 at 02:17:52PM GMT, Jens Gustedt wrote:
> > On 14 August 2024 13:31:19 CEST, "Ballman, Aaron" <aaron.ballman@intel.com> wrote:
> > > Sorry for top-posting, my work account is stuck on Outlook. :-/
> > >
> > > > For a WG14 paper you should add these findings to support that choice.
> > > > Another option would be for WG14 to standardize the then existing implementation with the double underscores.
> > >
> > > +1, it's always good to explain prior art and existing uses as part
> > > of the paper.  However, please also point out that C++ has a prior
> > > art as well which is slightly different and very much worth
> > > considering: they have one API for getting the array's rank,
> > > and another for getting a specific rank's extent.  This is a general
> > > solution that doesn't require the programmer to have deep knowledge
> > > of C's declarator syntax and how it relates to multidimensional
> > > arrays.
>
> I have added that to my draft.  I'll publish it soon as a reply to the
> GCC mailing list.  See below for details of what I have added for now.
>
> > >
> > > That said, I suspect WG14 would not be keen on standardizing
> > > `lengthof` without an ugly keyword given that there are plenty of other uses of it that would break:
> > >
> > > https://sourcegraph.com/github.com/illumos/illumos-gate/-/blob/usr/src/cmd/mailx/names.c?L53-55
> > > https://sourcegraph.com/github.com/Rockbox/rockbox/-/blob/tools/ipod_fw.c?L292-294
> > > https://sourcegraph.com/github.com/OpenSmalltalk/opensmalltalk-vm/-/blob/src/spur64.stack/validImage.c?L7014-7018
> > > (and many, many others)
>
> What regex did you use for searching?
>
> I was thinking of renaming the proposal to elementsof(), to avoid
> confusion between length of an array and length of a string.  Would you
> mind checking if elementsof() is ok?

No, not for me. I really want us to talk consistently about array length for this. Consistent terminology is important.

> > > >> > As for the parentheses, I personally think lengthof should follow
> > > >> > similar rules compared to sizeof.
> > > >> >
> > > >> I think most people agree with this.
> > > >
> > > > I still don't, in particular not for standardisation.
> > > >
> > > > We have to remember that there are many small C compilers out there.
> > >
> > > Those compilers already have to handle parsing this for sizeof, so
> > > that's not particularly compelling
>
> Agree.  I suspect it will be simpler for existing compilers to follow
> sizeof than to have new syntax.  However, it's easy to keep it as a QoI
> detail, so I've temporarily changed the wording to require parentheses,
> and let implementations lift that restriction.

Great! That is a reasonable approach, I think.

> > > (even if we wanted to design C
> > > for the lowest common denominator of implementation effort, which
> > > I'm not convinced is a good approach these days).
>
> Off-topic, but I wish that had been the approach when a few
> implementations (I suspect proprietary vendors; this was never
> disclosed) rejected redefining NULL as the right thing: (void *) 0.
>
> I fixed one of the last free-software implementations of NULL that
> expanded to 0, and nullptr would probably never have been added if WG14
> had not accepted the pressure from such horrible implementations.
>
> <https://github.com/cc65/cc65/issues/1823>
>
> > > That said, if we went with a rank/extent design, I think we'd *have*
> > > to use parens because the extent interface would take two operands
> > > (the array and the rank you're interested in getting the extent of)
> > > and it would be inconsistent for the rank interface to then not
> > > require parens.
>
> Prior art
>     C
>         It is common in C programs to get the number of elements of
>         an array via the usual sizeof division and wrap it in a
>         macro.  Common names include:
>
>         •  ARRAY_SIZE()
>         •  NELEM()
>         •  NELEMS()
>         •  NITEMS()
>         •  NELTS()
>         •  elementsof()
>         •  lengthof()
>
>     C++
>         In C++, there are several standard features to determine
>         the number of elements of an array:
>
>         std::size() (since C++17)
>         std::ssize() (since C++20)
>             The syntax of these is identical to the usual C
>             macros named above.
>
>             It’s a bit different, since it’s a general purpose
>             sizing template, which works on non‐array types too,
>             with different semantics.
>
>             But when applied to an array, it has the same
>             semantics as the macros above.
>
>         std::extent (since C++11)
>             The syntax of this is quite different.  It uses a
>             numeric index as a second parameter to determine the
>             dimension in which the number of elements should be
>             counted.
>
>             C arrays are much simpler than C++’s many array‐like
>             types, and I don’t see a reason why we would need
>             something as complex as std::extent in C.  Certainly,
>             existing projects have not developed such a
>             macro, even if it is technically possible:
>
>                 #define DEREFERENCE(a, n)  DEREFERENCE_ ## n (a)
>                 #define DEREFERENCE_9(a)   (*********(a))
>                 #define DEREFERENCE_8(a)   (********(a))
>                 #define DEREFERENCE_7(a)   (*******(a))
>                 #define DEREFERENCE_6(a)   (******(a))
>                 #define DEREFERENCE_5(a)   (*****(a))
>                 #define DEREFERENCE_4(a)   (****(a))
>                 #define DEREFERENCE_3(a)   (***(a))
>                 #define DEREFERENCE_2(a)   (**(a))
>                 #define DEREFERENCE_1(a)   (*(a))
>                 #define DEREFERENCE_0(a)   ((a))
>                 #define extent(a, n)       nitems(DEREFERENCE(a, n))
>
>             If any project needs that syntax, they can implement
>             their own trivial wrapper macro, as demonstrated
>             above.
>
>     Existing prior art in C seems to favour a design that follows
>     the syntax of other operators like sizeof.
>
> > I think that this argument goes too short. E. g. implementation that
> > already have compound expressions (or lambdas ;-) may provide a
> > quality implementation using `static_assert` and `typeof` alone, and
> > don't have to touch their compiler at all.
> >
> > We should not impose an implementation in the language where doing it
> > in a header can be completely sufficient.
>
> I have concerns about a libc (or a predefined macro) implementation:
> the sizeof division causes double evaluation with any VLAs, while my
> implementation for GCC has fewer cases of evaluation, and when it needs
> to evaluate, it only does it once.  It would be hard to find a good
> wording that would allow an implementation to implement this as a macro.

No, we should not allow double evaluation. Putting this in a `({ })` and doing a `typedef typeof(X) _my_type;` with the macro parameter `X` at the beginning completely avoids double evaluation.

So quality implementations are possible, but perhaps differently and with other builtins than we are imagining. Don't impose the view of one particular implementation onto others.

Somewhere an argument about `offsetof` was brought in. This is exactly what we need: implementations being able to start with a simple solution (as everybody did in the beginning of `offsetof`), and improve that implementation at their own pace when they are ready for it.

> constexpr
>     The usual sizeof division evaluates the operand and results in a
>     run‐time value in cases where it wouldn’t be necessary.  If the
>     number of elements of the top‐level array is determined by an
>     integer constant expression, but an internal array is a VLA,
>     sizeof must evaluate:
>
>         int  a[7][n];
>         int  (*p)[7][n];
>
>         p = &a;
>         nitems(*p++);
>
>     With an elementsof operator, this would result in an integer
>     constant expression of value 7.
>
> Double evaluation
>     With the sizeof‐based implementation from above, the example from
>     above causes double evaluation of *p++.
>
> > Plus, implementing as a macro in a header (probably <stddef.h>) makes
> > also a feature test, for those applications that already have
> > something similar.
>
> This is interesting.  But I think an implementation could just
>
>     #define lengthof lengthof
>
> to provide a feature-test macro.

Sure, but leave some slack to implementations to do this in a way that's best for them.

> > this was basically what we did for `unreachable` and I think it worked
> > out fine.

I still think that the different options that we had there can be used to ask the right questions for WG14.

Jens
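The single-evaluation point in the exchange above can be sketched in GNU C. This is only an illustration under stated assumptions: it needs statement expressions, and it swaps in `__auto_type` where Jens mentions `typedef typeof(X)`, because GCC documents `__auto_type` as evaluating a variably modified macro argument only once. `NITEMS_ONCE`, `counted` and `outer_len_once` are hypothetical names, and the macro does not reject pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro; GNU C only (statement expression + __auto_type).
 * The operand `a` appears exactly once, so it is evaluated at most once. */
#define NITEMS_ONCE(a)                                                \
({                                                                    \
        __auto_type nitems_ptr_ = &(a);  /* the only use of `a` */    \
        sizeof(*nitems_ptr_) / sizeof((*nitems_ptr_)[0]);             \
})

static int calls;  /* counts how often the operand is evaluated */

/* Passes the pointer through and records that it was called. */
static void *counted(void *p)
{
        calls++;
        return p;
}

/* Outer length of an int[7][n] reached through an expression with a
 * side effect, mirroring the nitems(*p++) example in the draft. */
static size_t outer_len_once(int n)
{
        int a[7][n];

        calls = 0;
        return NITEMS_ONCE(*(int (*)[7][n]) counted(a));
}
```

After calling `outer_len_once(3)`, the result is 7 and `calls` is 1. The result is still a run-time value here, so this addresses the double-evaluation concern but not the constant-expression one.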
> I am currently on a summer bike trip, so not able to provide a full reference implementation. But could do so, once I am back.

No need (after thinking on this a bit more, I believe you're right that this can be done in a macro-only implementation; we might not go that route in Clang because of AST matching needs and whatnot, but that's not an issue), but thank you for the offer. Please enjoy your summer bike trip! 😊

> Why would you be looping? lengthof only addresses the outer dimension; sizeof would need a loop, no?

Due to poor reading comprehension, I missed in the paper that lengthof works on the outer dimension. 😉 I think having a way to get the flattened size of a multidimensional array is a useful feature.

~Aaron

-----Original Message-----
From: Jens Gustedt <jens.gustedt@inria.fr>
Sent: Wednesday, August 14, 2024 9:25 AM
To: Ballman, Aaron <aaron.ballman@intel.com>; Alejandro Colomar <alx@kernel.org>; Xavier Del Campo Romero <xavi.dcr@tutanota.com>
Cc: Gcc Patches <gcc-patches@gcc.gnu.org>; Daniel Plakosh <dplakosh@cert.org>; Martin Uecker <uecker@tugraz.at>; Joseph Myers <josmyers@redhat.com>; Gabriel Ravier <gabravier@gmail.com>; Jakub Jelinek <jakub@redhat.com>; Kees Cook <keescook@chromium.org>; Qing Zhao <qing.zhao@oracle.com>; David Brown <david.brown@hesbynett.no>; Florian Weimer <fweimer@redhat.com>; Andreas Schwab <schwab@linux-m68k.org>; Timm Baeder <tbaeder@redhat.com>; A. Jiang <de34@live.cn>; Eugene Zelenko <eugene.zelenko@gmail.com>
Subject: RE: v2.1 Draft for a lengthof paper

On 14 August 2024 14:40:41 CEST, "Ballman, Aaron" <aaron.ballman@intel.com> wrote:
> > I think that this argument goes too short. E. g. implementation that already have compound expressions (or lambdas ;-) may provide a
> > quality implementation using `static_assert` and `typeof` alone, and don't have to touch their compiler at all.
> >
> > We should not impose an implementation in the language where doing it in a header can be completely sufficient.
>
> But can doing this in a header be completely sufficient in practice?

I think so.

> e.g., the user who passes a pointer rather than an array is in for quite a surprise, or passing a struct, or passing a FAM, etc. If we want to put constraints on the interface, that may be more challenging to do from a header file than from the compiler. offsetof is a cautionary tale in that compilers that want a reasonable QoI basically all implement this as a builtin rather than the header-only version.

Yes, with the tools that I listed and the ideas that are already in the paper you can basically do all that, including giving valuable feedback in case of failure.

I am currently on a summer bike trip, so not able to provide a full reference implementation. But could do so, once I am back.

> > Plus, implementing as a macro in a header (probably <stddef.h>) makes also a feature test, for those applications that already have something similar.
> > this was basically what we did for `unreachable` and I think it worked out fine.
>
> True!
>
> I'm still thinking on how important rank + extent is vs overall array
> length. If C had constexpr functions, then I'd almost certainly want
> array rank and extent to be the building blocks and then lengthof can
> be a constexpr function looping over rank and summing extents. But we
> don't have that yet, and "bird hand" vs "bird in bush"... :-D

Why would you be looping? lengthof only addresses the outer dimension; sizeof would need a loop, no?
Generally I would be opposed to imposing a complicated solution for a simple feature Jens > > ~Aaron > > -----Original Message----- > From: Jens Gustedt <jens.gustedt@inria.fr> > Sent: Wednesday, August 14, 2024 8:18 AM > To: Ballman, Aaron <aaron.ballman@intel.com>; Alejandro Colomar > <alx@kernel.org>; Xavier Del Campo Romero <xavi.dcr@tutanota.com> > Cc: Gcc Patches <gcc-patches@gcc.gnu.org>; Daniel Plakosh > <dplakosh@cert.org>; Martin Uecker <uecker@tugraz.at>; Joseph Myers > <josmyers@redhat.com>; Gabriel Ravier <gabravier@gmail.com>; Jakub > Jelinek <jakub@redhat.com>; Kees Cook <keescook@chromium.org>; Qing > Zhao <qing.zhao@oracle.com>; David Brown <david.brown@hesbynett.no>; > Florian Weimer <fweimer@redhat.com>; Andreas Schwab > <schwab@linux-m68k.org>; Timm Baeder <tbaeder@redhat.com>; A. Jiang > <de34@live.cn>; Eugene Zelenko <eugene.zelenko@gmail.com> > Subject: RE: v2.1 Draft for a lengthof paper > > Hi Aaron, > > Am 14. August 2024 13:31:19 MESZ schrieb "Ballman, Aaron" <aaron.ballman@intel.com>: > > Sorry for top-posting, my work account is stuck on Outlook. :-/ > > > > > For a WG14 paper you should add these findings to support that choice. > > > Another option would be for WG14 to standardize the then existing implementation with the double underscores. > > > > +1, it's always good to explain prior art and existing uses as part of the paper. However, please also point out that C++ has a prior art as well which is slightly different and very much worth considering: they have one API for getting the array's rank, and another for getting a specific rank's extent. This is a general solution that doesn't require the programmer to have deep knowledge of C's declarator syntax and how it relates to multidimensional arrays. 
Hi Aaron,

On Wed, Aug 14, 2024 at 01:21:18PM GMT, Ballman, Aaron wrote:
> > What regex did you use for searching?
>
> I went cheap and easy rather than trying to narrow down:
> https://sourcegraph.com/search?q=context:global+lang:C+lengthof&patternType=regexp&sm=0

Ahh, context:global seems to be what I wanted. Where is that documented?

> > I was thinking of renaming the proposal to elementsof(), to avoid
> > confusion between length of an array and length of a string. Would
> > you mind checking if elementsof() is ok?
>
> From what I was seeing, it looks to be used more uniformly as a
> function-like macro accepting a single argument.

Thanks! I'll rename it to elementsof().

Cheers,
Alex
> Ahh, context:global seems to be what I wanted. Where is that documented?

For me it is the default when I go to https://sourcegraph.com/search but there's documentation at https://sourcegraph.com/docs/code-search/working/search_contexts

> Thanks! I'll rename it to elementsof().

Rather than renaming it, I'd say that the name chosen in the proposed text is a placeholder, and have a section in the prose that describes different naming choices, pros and cons, suggests a name from you as the author, but asks WG14 to pick the final name. I know Jens mentioned he doesn’t like the name `elementsof` and I suspect if we ask five more people we'll get about seven more opinions on what the name could/should be. 😝

~Aaron
Hi Martin,

On Wed, Aug 14, 2024 at 03:50:00PM GMT, Martin Uecker wrote:
> An operator that returns an array with all dimensions of a multi-dimensional
> array would make a lot of sense to me.
>
>     double array[4][3][2];
>
>     // array_dims(array) = (constexpr size_t[3]){ 4, 3, 2 }

And what if array[4][n][2]? No constexpr anymore, which is bad.

>     int dim1 = (array_dims(array))[0];
>     int dim2 = (array_dims(array))[1];
>     int dim3 = (array_dims(array))[2];
>
> You can then implement lengthof in terms of this operator:
>
>     #define lengthof(x) (array_dims(x)[0])

Not really. This implementation would result in fewer constant expressions than my proposal. That's detrimental for diagnostics and usability.

And the fundamental operator would be very complex, just to let users implement simpler wrappers. I think the fundamental operators should be as simple as possible, in the spirit of C, and let users build on top of those basic tools.

This reminds me of the 'static' specifier for array parameters, which conflates two meanings: nonnull and length. I'd rather have a way to specify nullness, and another one to specify length, and let users compose them.

At first glance I oppose this array_dims operator.

> and you can obtain the rank by applying lengthof to the array:
>
>     #define rank(x) lengthof(array_dims(x))

I'm curious to see what kind of code would be enabled by a rank() operator in C that we can't write at the moment.

> If the array is constexpr for regular arrays and array
> indexing returns a constant again for constexpr arrays, this
> would all work out.
>
> Martin

Have a lovely day!
Alex
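[Editorial sketch, not part of the thread.] For Martin's `double array[4][3][2]` example, the values that `array_dims` would yield can already be obtained one dimension at a time with the classic sizeof division, as long as the rank is known in advance. The macro names `DIM0`, `DIM1`, `DIM2` and the helper `get_dims` are hypothetical, chosen only for illustration; nothing here is the proposed operator, and none of it is pointer-proof.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers: number of elements of the outermost array,
   and of the arrays nested one and two levels below it. */
#define DIM0(a)  (sizeof(a) / sizeof((a)[0]))
#define DIM1(a)  DIM0((a)[0])
#define DIM2(a)  DIM0((a)[0][0])

static double array[4][3][2];

/* With constant bounds, all three are integer constant expressions. */
static_assert(DIM0(array) == 4, "outer dimension");
static_assert(DIM1(array) == 3, "middle dimension");
static_assert(DIM2(array) == 2, "inner dimension");

/* Collects the three dimensions, mimicking what array_dims(array)
   is described as returning for this particular array. */
static void get_dims(size_t dims[3])
{
    dims[0] = DIM0(array);
    dims[1] = DIM1(array);
    dims[2] = DIM2(array);
}
```

One macro per dimension is needed, so the sketch does not scale to arbitrary rank.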
Hi Aaron,

On Wed, Aug 14, 2024 at 01:59:58PM GMT, Ballman, Aaron wrote:
> > Why would you be looping? lengthof only addresses the outer dimension sizeof would need a loop, no?
>
> Due to poor reading comprehension, I missed in the paper that lengthof
> works on the outer dimension. 😉 I think having a way to get the
> flattened size of a multidimensional array is a useful feature.

As long as you know the type of the innermost element, you can do it. This excludes auto, but I think you usually know this.

    double x[4][5][6][7];
    size_t n = sizeof(x) / sizeof(double);

This hard-codes 'double', but should be good enough usually.

Cheers,
Alex
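[Editorial sketch, not part of the thread.] The flattened count Alex describes can be wrapped in a small macro that takes the innermost element type explicitly. `FLAT_NELEMS`, `flat_nelems` and `outer_nelems` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: total number of innermost elements of arr,
   given the innermost element type T (which the caller must know). */
#define FLAT_NELEMS(arr, T)  (sizeof(arr) / sizeof(T))

static double x[4][5][6][7];

/* For arrays with constant bounds this is a constant expression. */
static_assert(FLAT_NELEMS(x, double) == 4 * 5 * 6 * 7, "flattened count");

/* Flattened number of elements across all four dimensions. */
static size_t flat_nelems(void)
{
    return FLAT_NELEMS(x, double);
}

/* For contrast: the outer dimension only, via sizeof division. */
static size_t outer_nelems(void)
{
    return sizeof(x) / sizeof(x[0]);
}
```

Passing the wrong type for `T` silently gives a wrong count, which is the cost of hard-coding the element type.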
On Wednesday, 14.08.2024 at 16:12 +0200, Alejandro Colomar wrote:
> Hi Martin,
>
> On Wed, Aug 14, 2024 at 03:50:00PM GMT, Martin Uecker wrote:
> > An operator that returns an array with all dimensions of a multi-dimensional
> > array would make a lot of sense to me.
> >
> >     double array[4][3][2];
> >
> >     // array_dims(array) = (constexpr size_t[3]){ 4, 3, 2 }
>
> And what if array[4][n][2]? No constexpr anymore, which is bad.
>
> >     int dim1 = (array_dims(array))[0];
> >     int dim2 = (array_dims(array))[1];
> >     int dim3 = (array_dims(array))[2];
> >
> > You can then implement lengthof in terms of this operator:
> >
> >     #define lengthof(x) (array_dims(x)[0])
>
> Not really. This implementation would result in fewer constant
> expressions than my proposal. That's detrimental for diagnostics and
> usability.

Yes, this would be a downside when implementing lengthof in this way.

> And the fundamental operator would be very complex, just to let users
> implement simpler wrappers. I think the fundamental operators should
> be as simple as possible, in the spirit of C, and let users build on top
> of those basic tools.
>
> This reminds me of the 'static' specifier for array parameters, which
> conflates two meanings: nonnull and length. I'd rather have a way
> to specify nullness, and another one to specify length, and let users
> compose them.
>
> At first glance I oppose this array_dims operator.

Opinionated as usual ;-)

> > and you can obtain the rank by applying lengthof to the array:
> >
> >     #define rank(x) lengthof(array_dims(x))
>
> I'm curious to see what kind of code would be enabled by a rank()
> operator in C that we can't write at the moment.

There seems to be no generic way to get all dimensions from a multi-dimensional array of arbitrary rank.

Martin
On Wed, Aug 14, 2024 at 03:50:21PM GMT, Jens Gustedt wrote:
> > > > That said, I suspect WG14 would not be keen on standardizing
> > > > `lengthof` without an ugly keyword given that there are plenty of other uses of it that would break:
> > > >
> > > > https://sourcegraph.com/github.com/illumos/illumos-gate/-/blob/usr/src/cmd/mailx/names.c?L53-55
> > > > https://sourcegraph.com/github.com/Rockbox/rockbox/-/blob/tools/ipod_fw.c?L292-294
> > > > https://sourcegraph.com/github.com/OpenSmalltalk/opensmalltalk-vm/-/blob/src/spur64.stack/validImage.c?L7014-7018
> > > > (and many, many others)
> >
> > What regex did you use for searching?
> >
> > I was thinking of renaming the proposal to elementsof(), to avoid
> > confusion between length of an array and length of a string. Would you
> > mind checking if elementsof() is ok?
>
> No, not for me. I really want us to consistently talk about
> array length for this. Consistent terminology is important.

I understand your desire for consistency. I think your paper is a net improvement over the status quo (which is a mix of length, size, and number of elements). After your proposal, there will be only length and number of elements. That's great.

However, strlen(3) came first, and we must respect it.

Since you haven't proposed eliminating "number of elements" from the standard, and it would still be used alongside length, I think elementsof() would be consistent with your view (consistent with "number of elements").

Alternatively, you could use a new term, for example extent, for referring to the number of elements of an array. That would be more respectful to strlen(3), keeping a strong distinction between string length and array ******.

Or how about always referring to it as "number of elements"? It's longer to type, but would be the most consistent approach.

Also, elementsof() is free to use, while lengthof() has several existing incompatible uses (as Aaron has shown), so we can't use that name so freely.

> > I have concerns about a libc (or a predefined macro) implementation:
> > the sizeof division causes double evaluation with any VLAs, while my
> > implementation for GCC has fewer cases of evaluation, and when it needs
> > to evaluate, it only does it once. It would be hard to find a good
> > wording that would allow an implementation to implement this as a macro.
>
> No, we should not allow double evaluation.
>
> putting this in a `({ })`

I would love to see a proposal for adding this GNU extension to ISO C. Did nobody do it yet? I could try to, if I find some time. (But I'll take a longish time for that; if anyone else does it, it would be great.)

> and doing a `typedef typeof(X) _my_type;` with the macro parameter `X`
> at the beginning completely avoids double evaluation. So quality
> implementations are possible, but perhaps differently and with other
> builtins than we are imagining. Don't impose the view of one particular
> implementation onto others.

Ahhh, good. I hadn't thought of that possibility. Sure, that makes sense now. It gives more strength to your proposal of allowing libc implementations, and thus requiring parens in the standard.

> Somewhere was brought in an argument with `offsetof`.
> This is exactly what we need. Implementations being able to start
> with a simple solution (as everybody did in the beginning of
> `offsetof`), and improve that implementation at their pace when they
> are ready for it.

Agree.

> > > this was basically what we did for `unreachable` and I think it worked
> > > out fine.
>
> I still think that the different options that we had there can be used
> to ask the right questions for WG14.

I'm looking at it. I've already taken some parts of it. :)

Cheers,
Alex
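[Editorial sketch, not part of the thread.] The single-evaluation idea discussed above can be illustrated with GNU C. This is a variant of Jens's `({ })` suggestion, not his exact `typedef typeof(X)` formulation: it captures a pointer to the argument with `__auto_type`, which GCC documents as evaluating a variably modified argument only once. The names `NELEMS_ONCE`, `next_index` and `demo` are hypothetical, and the macro is not pointer-proof; it only shows that the argument's side effects happen once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro. Relies on GNU C extensions (statement expressions
   and __auto_type), so it is not ISO C. X must be an lvalue. */
#define NELEMS_ONCE(X)                                  \
({                                                      \
    __auto_type nelems_p_ = &(X);   /* X evaluated here, once */ \
    sizeof(*nelems_p_) / sizeof((*nelems_p_)[0]);       \
})

/* Helper with a visible side effect: counts how often it is called. */
static int next_index(int *calls)
{
    return (*calls)++;
}

/* Returns the number of elements of a[k], which has type int[n][m],
   and stores in *calls how many times the index expression ran. */
static size_t demo(int n, int m, int *calls)
{
    assert(n > 0 && m > 0);

    int a[3][n][m];

    *calls = 0;
    return NELEMS_ONCE(a[next_index(calls)]);
}
```

A plain `sizeof(X) / sizeof((X)[0])` macro names `X` twice, which is where the double evaluation Alex is worried about comes from when both operands have VLA type.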
> I would love to see a proposal for adding this GNU extension to ISO C.
> Did nobody do it yet? I could try to, if I find some time. (But I'll take a longish time for that; if anyone else does it, it would be great.)

It's been discussed but hasn't moved forward because there are design issues with it (the odd way in which it produces a resulting value, sometimes surprising behavior with how it interacts with flow control, the fact that it can't be used in all contexts, etc). The committee was leaning more towards lambdas despite those being a bit orthogonal.

~Aaron
Hi Aaron,

On Wed, Aug 14, 2024 at 02:07:16PM GMT, Ballman, Aaron wrote:
> > Ahh, context:global seems to be what I wanted. Where is that documented?
>
> For me it is the default when I go to https://sourcegraph.com/search but there's documentation at https://sourcegraph.com/docs/code-search/working/search_contexts

Ahh, no, it was a red herring. I thought that was restricting the search to global definitions. There's no way to restrict to definitions, right? I'd like a way to discard uses, since those don't give much info. But for lengthof() it seems to quickly find incompatible cases, so we were lucky that we don't need to restrict it.

> > Thanks! I'll rename it to elementsof().
>
> Rather than renaming it, I'd say that the name chosen in the proposed
> text is a placeholder, and have a section in the prose that describes
> different naming choices, pros and cons, suggests a name from you as
> the author, but asks WG14 to pick the final name.
> I know Jens mentioned he doesn’t like the name `elementsof` and I
> suspect if we ask five more people we'll get about seven more opinions
> on what the name could/should be. 😝

Yup, but I want the placeholder to be a name that I would like, and a defensible one. :-)

I'll add questions at the bottom, proposing alternatives.

Cheers,
Alex
On Wednesday, 14.08.2024 at 14:52 +0000, Ballman, Aaron wrote:
> > I would love to see a proposal for adding this GNU extension to ISO C.
> > Did nobody do it yet? I could try to, if I find some time. (But I'll take a longish time for that; if anyone else
> > does it, it would be great.)
>
> It's been discussed but hasn't moved forward because there are design issues with it (the odd way in which it produces
> a resulting value, sometimes surprising behavior with how it interacts with flow control, the fact that it can't be
> used in all contexts, etc). The committee was leaning more towards lambdas despite those being a bit orthogonal.

I do not think this is a fair characterization. We did not see any proposal for ({ }), so it is not clear which way the committee is leaning.

Lambdas ultimately failed because they were too complex for a feature without any implementation or user experience in C. I agree though that lambdas could be nicer, but I still have issues with the last type-generic version and I do not have similar objections against ({ }).

Martin
On 14 August 2024 at 16:47:32 CEST, Alejandro Colomar <alx@kernel.org> wrote:
> On Wed, Aug 14, 2024 at 03:50:21PM GMT, Jens Gustedt wrote:
> > > I was thinking of renaming the proposal to elementsof(), to avoid
> > > confusion between length of an array and length of a string. Would you
> > > mind checking if elementsof() is ok?
> >
> > No, not for me. I really want us to consistently talk about
> > array length for this. Consistent terminology is important.
>
> I understand your desire for consistency. I think your paper is a net
> improvement over the status quo (which is a mix of length, size, and
> number of elements). After your proposal, there will be only length and
> number of elements. That's great.
>
> However, strlen(3) came first, and we must respect it.

Sure, string length, a dynamic feature, and array length are two different features.

But we also have VLA and not VNEA in the standard, so we should respect this ;-)

> Since you haven't proposed eliminating "number of elements" from the
> standard, and it would still be used alongside length, I think
> elementsof() would be consistent with your view (consistent with "number
> of elements").

Didn't we? Then it is actually a good idea to do so, thanks for the idea!

"elements of" is a stretch, linguistically, because you don't mean the elements themselves, you are referring to their number. "elementsof" for me would refer to a list of these elements.

> Alternatively, you could use a new term, for example extent, for
> referring to the number of elements of an array. That would be more
> respectful to strlen(3), keeping a strong distinction between string
> length and array ******.

Only that this separation doesn't exist, even now; as said, it is called "variable length array".
Hi Jens, Martin,

On Wed, Aug 14, 2024 at 05:44:57PM GMT, Jens Gustedt wrote:
> On 14 August 2024 at 16:47:32 CEST, Alejandro Colomar <alx@kernel.org> wrote:
> > > > I was thinking of renaming the proposal to elementsof(), to avoid
> > > > confusion between length of an array and length of a string. Would you
> > > > mind checking if elementsof() is ok?
> > >
> > > No, not for me. I really want us to consistently talk about
> > > array length for this. Consistent terminology is important.
> >
> > I understand your desire for consistency. I think your paper is a net
> > improvement over the status quo (which is a mix of length, size, and
> > number of elements). After your proposal, there will be only length and
> > number of elements. That's great.
> >
> > However, strlen(3) came first, and we must respect it.
>
> Sure, string length, a dynamic feature, and array length are two features.
>
> But we also have VLA and not VNEA in the standard, So we should respect this ;-)

I hadn't thought about it until yesterday, after Martin insisted on preferring lengthof over nelementsof or a contraction of it, and worried about nelementsof possibly causing ambiguity with multi-dimensional arrays. But:

VLA is a misnomer.
~~~~~~~~~~~~~~~~~~

First, let's assume length refers to the number of elements, as we all agree that length should not refer to the size in bytes of an array, since we already have the term "size" for it, which is consistent with sizeof.

    int vla[3][n];

The array above is a so-called variable length array, according to the standard. But it does not have a variable length, according to the presumed meaning of length. It does indeed have a variable size. The element of vla is itself an array, which is the one that really has a variable length (or number of elements, as is the more technical term).

So, if n3187 develops, and really intends to uniquely and unambiguously use one term for the number of elements and another one for the size of an array, it should also rename "variable length array" to "variable size array".

It is indeed due to this problematic misuse of the colloquial term length that "length", and not "number of elements", is misleading in multi-dimensional arrays. The standard is very strict in using NoE for the first dimension of an array (so its true dimension), and not for the dimensions of arrays that are elements of it.

And now you could say that this is only a problem of multi-dimensional arrays. It's not. They're just the composition of arrays with elements of type array. The same problem arises with single-dimensional arrays in complex situations (although, admittedly, this is non-standard):

    $ cat vla.c
    int
    main(void)
    {
    	int n = 5;

    	struct s {
    		int v[n];
    	};

    	struct s a[3];

    	return sizeof(a);
    }
    $ gcc -Wall -Wextra -Wpedantic vla.c
    vla.c: In function ‘main’:
    vla.c:7:22: warning: a member of a structure or union cannot have a variably modified type [-Wpedantic]
        7 |                 int v[n];
          |                      ^
    $ ./a.out; echo $?
    60

a is a VLA even if it is a single-dimension array with a known constant number of elements. Huh? :)

Terminology
~~~~~~~~~~~

Once we've determined that "length" in VLA does refer to the size and not the number of elements, it's hard to justify a reformation of terminology that would be based on length meaning number of elements.

Indeed, whether we base the justification for the origins of length on strlen(3) or on VLA, we must conclude that "variable length array" must be renamed to "variable size array". I'm preparing a paper for that.

If that paper is eventually accepted, I'd prepare a second paper that would reform every use of size and length with arrays so that size always refers to the size in bytes, length is completely removed, and number of elements stands as the only term to refer to the number of elements.

Have a lovely day!
Alex
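[Editorial sketch, not part of the thread.] Alex's `int vla[3][n]` point above can be checked with three small functions: the number of elements of the outer array stays at 3, while the number of elements of each inner array and the total size in bytes follow `n`. The function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements of the outer array: 3 for every n. */
static size_t outer_nelems(int n)
{
    assert(n > 0);
    int vla[3][n];
    return sizeof(vla) / sizeof(vla[0]);
}

/* Number of elements of each inner array: this is what varies. */
static size_t inner_nelems(int n)
{
    assert(n > 0);
    int vla[3][n];
    return sizeof(vla[0]) / sizeof(vla[0][0]);
}

/* Size in bytes of the whole object: varies with n as well. */
static size_t total_size(int n)
{
    assert(n > 0);
    int vla[3][n];
    return sizeof(vla);
}
```

The sizeof divisions here are evaluated at run time because the operands have variably modified type.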
Alex,

I am all for making things more consistent, but there is also a cost to changing stuff too much. Length is the established term in most programming languages and I would recommend sticking to it.

Note that it is not true that the standard consistently refers to char a[3][n] as a VLA. It does so in the description of sizeof but not in the type compatibility rules, at least as understood by most compilers. This is an inconsistency we *should* fix, but I do not think that changing away from "length" is a good idea.

Note that "number of elements" is inherently an ambiguous term for multi-dimensional arrays, and I am not sure how you want to avoid this without making the wording more complex (e.g. "number of elements of the outermost array").

So I would recommend not to go this way. You would need a really good argument to convince me to vote for this, and I haven't seen any such argument.

Martin