mbox series

[RFC,v1,0/2] c: Add _Lengthof operator

Message ID 20240728141547.302478-1-alx@kernel.org
Headers show
Series c: Add _Lengthof operator | expand

Message

Alejandro Colomar July 28, 2024, 2:15 p.m. UTC
Hi!

I've got something working:

	$ cat len.c 
	#include <stdio.h>

	int
	main(int argc, char *argv[argc + 1])
	{
		int     a[42];
		size_t  n;

		(void) argv;

		//n = _Lengthof(argv);
		//printf("_Lengthof(argv) == %zu\n", n);

		n = _Lengthof(a);
		printf("%zu\n", n);

		n = _Lengthof(long [99]);
		printf("%zu\n", n);

		n = _Lengthof(short [n - 10]);
		printf("%zu\n", n);

		int  b[n / 2];
		n = _Lengthof(b);
		printf("%zu\n", n);
	}
	$ /opt/local/gnu/gcc/lengthof/bin/gcc -Wall -Wextra len.c 
	$ ./a.out 
	42
	99
	89
	44

I'd like to be able to uncomment those two lines in the future, but we
probably want to do that in a separate commit.

Does anyone know if we have the information available for getting that
value from the 'tree' object?  Or do we need some refactor first in
order to keep that information?

Also, I've reused much of sizeof's code, and maybe I should change some
of that.  It's my first contribution to gcc of this size, and I don't
yet know what some of those internals mean.  While my implementation
seems to be working (for the test above), I guess it's not ideal in
terms of readability, and may also not work well with C++ or some corner
cases.  Can somebody with more experience have a look at the code?

I suspect this is not yet ready, and thus doesn't have a ChangeLog.

Have a lovely day!
Alex


Alejandro Colomar (2):
  Merge definitions of array_type_nelts_top()
  c: Add _Lengthof() operator

 gcc/Makefile.in               |  1 +
 gcc/c-family/c-common.cc      | 20 +++++++++
 gcc/c-family/c-common.def     |  4 ++
 gcc/c-family/c-common.h       |  2 +
 gcc/c/c-parser.cc             | 35 +++++++++++----
 gcc/c/c-tree.h                |  4 ++
 gcc/c/c-typeck.cc             | 84 +++++++++++++++++++++++++++++++++++
 gcc/cp/cp-tree.h              |  1 -
 gcc/cp/operators.def          |  1 +
 gcc/cp/tree.cc                | 13 ------
 gcc/ginclude/stdlength.h      | 35 +++++++++++++++
 gcc/rust/backend/rust-tree.cc | 13 ------
 gcc/rust/backend/rust-tree.h  |  2 -
 gcc/target.h                  |  3 ++
 gcc/tree.cc                   | 13 ++++++
 gcc/tree.h                    |  1 +
 16 files changed, 195 insertions(+), 37 deletions(-)
 create mode 100644 gcc/ginclude/stdlength.h

Range-diff against v0:
-:  ----------- > 1:  507f5a51e17 Merge definitions of array_type_nelts_top()
-:  ----------- > 2:  e5835b982af c: Add _Lengthof() operator

Comments

Martin Uecker July 28, 2024, 2:42 p.m. UTC | #1
Am Sonntag, dem 28.07.2024 um 16:15 +0200 schrieb Alejandro Colomar:

...
> 
> Does anyone know if we have the information available for getting that
> value from the 'tree' object?  Or do we need some refactor first in
> order to keep that information?

What I wanted to try is to not immediately adjust the type to a
pointer and keep it represented as an array, so that the size is
preserved.  We currently already have a flag that tells us that
the type came from a parameter declared as an array, so we would
do it the other way round and have a flag that tells us that it
is really a pointer.  

In most cases the array would then decay and nothing needs to be
changed, but in some cases we would special case it to get the 
correct result (addressof / sizeof / typeof and maybe others).  
Those are also cases which should have a warning anyway. So this
*should* be simple, but I haven't tried.

I will look at you code later, but I would recommend to avoid
refactoring that touches different parts of the compiler at this
point.

I also wonder whether it would make sense to propose a GNU
extension first instead of implementing a full prototype for
the standard feature? I do not think we could merge the
former without having an accepted proposal.

Martin

> 
> Also, I've reused much of sizeof's code, and maybe I should change some
> of that.  It's my first contribution to gcc of this size, and I don't
> yet know what some of those internals mean.  While my implementation
> seems to be working (for the test above), I guess it's not ideal in
> terms of readability, and may also not work well with C++ or some corner
> cases.  Can somebody with more experience have a look at the code?
> 
> I suspect this is not yet ready, and thus doesn't have a ChangeLog.
> 
> Have a lovely day!
> Alex
> 
> 
> Alejandro Colomar (2):
>   Merge definitions of array_type_nelts_top()
>   c: Add _Lengthof() operator
> 
>  gcc/Makefile.in               |  1 +
>  gcc/c-family/c-common.cc      | 20 +++++++++
>  gcc/c-family/c-common.def     |  4 ++
>  gcc/c-family/c-common.h       |  2 +
>  gcc/c/c-parser.cc             | 35 +++++++++++----
>  gcc/c/c-tree.h                |  4 ++
>  gcc/c/c-typeck.cc             | 84 +++++++++++++++++++++++++++++++++++
>  gcc/cp/cp-tree.h              |  1 -
>  gcc/cp/operators.def          |  1 +
>  gcc/cp/tree.cc                | 13 ------
>  gcc/ginclude/stdlength.h      | 35 +++++++++++++++
>  gcc/rust/backend/rust-tree.cc | 13 ------
>  gcc/rust/backend/rust-tree.h  |  2 -
>  gcc/target.h                  |  3 ++
>  gcc/tree.cc                   | 13 ++++++
>  gcc/tree.h                    |  1 +
>  16 files changed, 195 insertions(+), 37 deletions(-)
>  create mode 100644 gcc/ginclude/stdlength.h
> 
> Range-diff against v0:
> -:  ----------- > 1:  507f5a51e17 Merge definitions of array_type_nelts_top()
> -:  ----------- > 2:  e5835b982af c: Add _Lengthof() operator
Alejandro Colomar July 28, 2024, 3:19 p.m. UTC | #2
Hi Martin,

On Sun, Jul 28, 2024 at 04:42:26PM GMT, Martin Uecker wrote:
> Am Sonntag, dem 28.07.2024 um 16:15 +0200 schrieb Alejandro Colomar:
> 
> ...
> > 
> > Does anyone know if we have the information available for getting that
> > value from the 'tree' object?  Or do we need some refactor first in
> > order to keep that information?
> 
> What I wanted to try is to not immediately adjust the type to a
> pointer and keep it represented as an array, so that the size is
> preserved.  We currently already have a flag that tells us that
> the type came from a parameter declared as an array, so we would
> do it the other way round and have a flag that tells us that it
> is really a pointer.  

Thanks!  Thanks makes sense.  I suspected you had something in mind.  :)

Do you remember that flag's name?  I can have a look at inverting that.

> In most cases the array would then decay and nothing needs to be
> changed, but in some cases we would special case it to get the 
> correct result (addressof / sizeof / typeof and maybe others).  
> Those are also cases which should have a warning anyway. So this
> *should* be simple, but I haven't tried.
> 
> I will look at you code later, but I would recommend to avoid
> refactoring that touches different parts of the compiler at this
> point.

I had to do patch 1/2 for having the function that gives me the nelts
value; other than that, I've tried to keep it minimal.

> I also wonder whether it would make sense to propose a GNU
> extension first instead of implementing a full prototype for
> the standard feature? I do not think we could merge the
> former without having an accepted proposal.

Do you mean a __lengthof__ operator first, without using the _L name
or <stdlength.h>?

If so, I agree.

> Martin

Cheers,
Alex

> > Alejandro Colomar (2):
> >   Merge definitions of array_type_nelts_top()
> >   c: Add _Lengthof() operator
> > 
> >  gcc/Makefile.in               |  1 +
> >  gcc/c-family/c-common.cc      | 20 +++++++++
> >  gcc/c-family/c-common.def     |  4 ++
> >  gcc/c-family/c-common.h       |  2 +
> >  gcc/c/c-parser.cc             | 35 +++++++++++----
> >  gcc/c/c-tree.h                |  4 ++
> >  gcc/c/c-typeck.cc             | 84 +++++++++++++++++++++++++++++++++++
> >  gcc/cp/cp-tree.h              |  1 -
> >  gcc/cp/operators.def          |  1 +
> >  gcc/cp/tree.cc                | 13 ------
> >  gcc/ginclude/stdlength.h      | 35 +++++++++++++++
> >  gcc/rust/backend/rust-tree.cc | 13 ------
> >  gcc/rust/backend/rust-tree.h  |  2 -
> >  gcc/target.h                  |  3 ++
> >  gcc/tree.cc                   | 13 ++++++
> >  gcc/tree.h                    |  1 +
> >  16 files changed, 195 insertions(+), 37 deletions(-)
> >  create mode 100644 gcc/ginclude/stdlength.h
> > 
> > Range-diff against v0:
> > -:  ----------- > 1:  507f5a51e17 Merge definitions of array_type_nelts_top()
> > -:  ----------- > 2:  e5835b982af c: Add _Lengthof() operator
>
Joseph Myers July 29, 2024, 11:13 a.m. UTC | #3
On Sun, 28 Jul 2024, Alejandro Colomar wrote:

>  gcc/Makefile.in               |  1 +
>  gcc/c-family/c-common.cc      | 20 +++++++++
>  gcc/c-family/c-common.def     |  4 ++
>  gcc/c-family/c-common.h       |  2 +
>  gcc/c/c-parser.cc             | 35 +++++++++++----
>  gcc/c/c-tree.h                |  4 ++
>  gcc/c/c-typeck.cc             | 84 +++++++++++++++++++++++++++++++++++
>  gcc/cp/cp-tree.h              |  1 -
>  gcc/cp/operators.def          |  1 +
>  gcc/cp/tree.cc                | 13 ------
>  gcc/ginclude/stdlength.h      | 35 +++++++++++++++
>  gcc/rust/backend/rust-tree.cc | 13 ------
>  gcc/rust/backend/rust-tree.h  |  2 -
>  gcc/target.h                  |  3 ++
>  gcc/tree.cc                   | 13 ++++++
>  gcc/tree.h                    |  1 +

Please start with documentation and testcases, neither of which are 
included here - making sure that both documentation and testcases cover 
all the error cases and questions of e.g. evaluation of VLA operands.  
Documentation and testcases are the most important pieces for reviewing a 
proposed addition of a new language feature, before the actual 
implementation.

A relevant semantic question to answer here: sizeof evaluates all VLA 
operands, should this operator do likewise, or should it only evaluate 
when the toplevel array is of variable length (but not for a 
constant-length array of variable-size elements)?
Alejandro Colomar Aug. 3, 2024, 10:59 p.m. UTC | #4
Hi Joseph,

On Mon, Jul 29, 2024 at 11:13:08AM GMT, Joseph Myers wrote:
> On Sun, 28 Jul 2024, Alejandro Colomar wrote:
> 
> >  gcc/Makefile.in               |  1 +
> >  gcc/c-family/c-common.cc      | 20 +++++++++
> >  gcc/c-family/c-common.def     |  4 ++
> >  gcc/c-family/c-common.h       |  2 +
> >  gcc/c/c-parser.cc             | 35 +++++++++++----
> >  gcc/c/c-tree.h                |  4 ++
> >  gcc/c/c-typeck.cc             | 84 +++++++++++++++++++++++++++++++++++
> >  gcc/cp/cp-tree.h              |  1 -
> >  gcc/cp/operators.def          |  1 +
> >  gcc/cp/tree.cc                | 13 ------
> >  gcc/ginclude/stdlength.h      | 35 +++++++++++++++
> >  gcc/rust/backend/rust-tree.cc | 13 ------
> >  gcc/rust/backend/rust-tree.h  |  2 -
> >  gcc/target.h                  |  3 ++
> >  gcc/tree.cc                   | 13 ++++++
> >  gcc/tree.h                    |  1 +
> 
> Please start with documentation and testcases, neither of which are 
> included here

While I haven't started yet with test cases within the test suite, I
have tests.  There's a test program in the cover letter of the patch set
is the draft of a test suite for the feature.

Running the test suite is much more uncomfortable (to me) than compiling
a program manually, for iterating on the feature.  I need to get used to
this test suite.  :)

> - making sure that both documentation and testcases cover 
> all the error cases and questions of e.g. evaluation of VLA operands.  
> Documentation and testcases are the most important pieces for reviewing a 
> proposed addition of a new language feature, before the actual 
> implementation.

I was just having a look at what can be done.  Now that it's working for
the cases I wanted it to work, I've started documenting it.  I'll also
have a look at adding the tests to the test suite, although that will
take a few more iterations probably.

> A relevant semantic question to answer here: sizeof evaluates all VLA 
> operands, should this operator do likewise, or should it only evaluate 
> when the toplevel array is of variable length (but not for a 
> constant-length array of variable-size elements)?

I see benefits of both approaches.  The former is trivial to implement.
I'm not sure how much work would be needed for the latter, but probably
a bit more than that.  As a programmer, I think I would ask for the
latter; I guess that's what we'd want, ideally.  Thanks for the
feedback!  :)

Have a lovely night!
Alex

> -- 
> Joseph S. Myers
> josmyers@redhat.com
>