Message ID | 20240205171819.474283-1-stefanha@redhat.com |
---|---|
State | New |
Headers | show |
Series | docs/style: allow C99 mixed declarations | expand |
On Mon, Feb 05, 2024 at 12:18:19PM -0500, Stefan Hajnoczi wrote: > C99 mixed declarations support interleaving of local variable > declarations and code. > > The coding style "generally" forbids C99 mixed declarations with some > exceptions to the rule. This rule is not checked by checkpatch.pl and > naturally there are violations in the source tree. > > While contemplating adding another exception, I came to the conclusion > that the best location for declarations depends on context. Let the > programmer declare variables where it is best for legibility. Don't try > to define all possible scenarios/exceptions. IIRC, we had a discussion on this topic sometime last year, but can't remember what the $SUBJECT was, so I'll just repeat what I said then. Combining C99 mixed declarations with 'goto' is dangerous. Consider this program: $ cat jump.c #include <stdlib.h> int main(int argc, char**argv) { if (getenv("SKIP")) goto target; char *foo = malloc(30); target: free(foo); } $ gcc -Wall -Wuninitialized -o jump jump.c $ SKIP=1 ./jump free(): invalid pointer Aborted (core dumped) -> The programmer thinks they have initialized 'foo' -> GCC thinks the programmer has initialized 'foo' -> Yet 'foo' is not guaranteed to be initialized at 'target:' Given that QEMU makes heavy use of 'goto', allowing C99 mixed declarations exposes us to significant danger. Full disclosure, GCC fails to diagnmose this mistake, even with a decl at start of 'main', but at least the mistake is now more visible to the programmer. Fortunately with -fanalyzer GCC can diagnose this: $ gcc -fanalyzer -Wall -o jump jump.c jump.c: In function ‘main’: jump.c:12:3: warning: use of uninitialized value ‘foo’ [CWE-457] [-Wanalyzer-use-of-uninitialized-value] 12 | free(foo); | ^~~~~~~~~ ‘main’: events 1-5 | | 6 | if (getenv("SKIP")) | | ~ | | | | | (3) following ‘true’ branch... | 7 | goto target; | | ~~~~ | | | | | (4) ...to here | 8 | | 9 | char *foo = malloc(30); | | ^~~ | | | | | (1) region created on stack here | | (2) capacity: 8 bytes |...... | 12 | free(foo); | | ~~~~~~~~~ | | | | | (5) use of uninitialized value ‘foo’ here ...but -fanalyzer isn't something we have enabled by default, it is opt-in. I'm also not sure how comprehensive the flow control analysis of -fanalyzer is ? Can we be sure it'll catch these mistakes in large complex functions with many code paths ? Even if the compiler does reliably warn, I think the code pattern remains misleading to contributors, as the flow control flaw is very non-obvious. Rather than accept the status quo and remove the coding guideline, I think we should strengthen the guidelines, such that it is explicitly forbidden in any method that uses 'goto'. Personally I'd go all the way to -Werror=declaration-after-statement, as while C99 mixed decl is appealing, it isn't exactly a game changer in improving code maintainability. > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> > --- > docs/devel/style.rst | 20 -------------------- > 1 file changed, 20 deletions(-) > > diff --git a/docs/devel/style.rst b/docs/devel/style.rst > index 2f68b50079..80c4e4df52 100644 > --- a/docs/devel/style.rst > +++ b/docs/devel/style.rst > @@ -199,26 +199,6 @@ Rationale: a consistent (except for functions...) bracing style reduces > ambiguity and avoids needless churn when lines are added or removed. > Furthermore, it is the QEMU coding style. > > -Declarations > -============ > - > -Mixed declarations (interleaving statements and declarations within > -blocks) are generally not allowed; declarations should be at the beginning > -of blocks. To avoid accidental re-use it is permissible to declare > -loop variables inside for loops: > - > -.. code-block:: c > - > - for (int i = 0; i < ARRAY_SIZE(thing); i++) { > - /* do something loopy */ > - } > - > -Every now and then, an exception is made for declarations inside a > -#ifdef or #ifndef block: if the code looks nicer, such declarations can > -be placed at the top of the block even if there are statements above. > -On the other hand, however, it's often best to move that #ifdef/#ifndef > -block to a separate function altogether. > - > Conditional statements > ====================== > > -- > 2.43.0 > With regards, Daniel
On Mon, 5 Feb 2024 at 17:41, Daniel P. Berrangé <berrange@redhat.com> wrote: > Rather than accept the status quo and remove the coding guideline, > I think we should strengthen the guidelines, such that it is > explicitly forbidden in any method that uses 'goto'. Personally > I'd go all the way to -Werror=declaration-after-statement, as > while C99 mixed decl is appealing, it isn't exactly a game > changer in improving code maintainability. I definitely think we should either (a) say we're OK with mixed declarations or (b) enforce it via the warning option. (gcc -Wdeclaration-after-statement does not appear to warn for the "declaration inside for()" syntax, so we could enable it if we wanted to fix the existing style lapses.) Personally I prefer declaration-at-top-of-block, but more as an aesthetic choice than anything else. thanks -- PMM
Daniel P. Berrangé <berrange@redhat.com> writes:
> $ gcc -Wall -Wuninitialized -o jump jump.c
Note that many GCC warnings don't trigger if you don't enable
optimizations. In the case you exhibit, adding -O is enough to get
a sensible warning:
$ gcc -Wall -O -o jump jump.c
jump.c: In function ‘main’:
jump.c:11:3: warning: ‘foo’ may be used uninitialized
[-Wmaybe-uninitialized]
11 | free(foo);
| ^~~~~~~~~
jump.c:8:9: note: ‘foo’ was declared here
8 | char *foo = malloc(30);
| ^~~
Best.
Sam
On Mon, 5 Feb 2024 at 13:16, Samuel Tardieu <sam@rfc1149.net> wrote: > > > Daniel P. Berrangé <berrange@redhat.com> writes: > > > $ gcc -Wall -Wuninitialized -o jump jump.c > > Note that many GCC warnings don't trigger if you don't enable > optimizations. In the case you exhibit, adding -O is enough to get > a sensible warning: > > $ gcc -Wall -O -o jump jump.c > jump.c: In function ‘main’: > jump.c:11:3: warning: ‘foo’ may be used uninitialized > [-Wmaybe-uninitialized] > 11 | free(foo); > | ^~~~~~~~~ > jump.c:8:9: note: ‘foo’ was declared here > 8 | char *foo = malloc(30); > | ^~~ llvm also prints a warning: jump.c:5:7: warning: variable 'foo' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized] I confirmed that QEMU's current compiler flags enable these warnings so both gcc and llvm detect the issue that Daniel pointed out in QEMU code. Daniel: Does this address your concern about compiler warnings? Stefan
On Mon, Feb 05, 2024 at 02:06:39PM -0500, Stefan Hajnoczi wrote: > On Mon, 5 Feb 2024 at 13:16, Samuel Tardieu <sam@rfc1149.net> wrote: > > > > > > Daniel P. Berrangé <berrange@redhat.com> writes: > > > > > $ gcc -Wall -Wuninitialized -o jump jump.c > > > > Note that many GCC warnings don't trigger if you don't enable > > optimizations. In the case you exhibit, adding -O is enough to get > > a sensible warning: > > > > $ gcc -Wall -O -o jump jump.c > > jump.c: In function ‘main’: > > jump.c:11:3: warning: ‘foo’ may be used uninitialized > > [-Wmaybe-uninitialized] > > 11 | free(foo); > > | ^~~~~~~~~ > > jump.c:8:9: note: ‘foo’ was declared here > > 8 | char *foo = malloc(30); > > | ^~~ > > llvm also prints a warning: > > jump.c:5:7: warning: variable 'foo' is used uninitialized whenever > 'if' condition is true [-Wsometimes-uninitialized] > > I confirmed that QEMU's current compiler flags enable these warnings > so both gcc and llvm detect the issue that Daniel pointed out in QEMU > code. > > Daniel: Does this address your concern about compiler warnings? While it is good that its included when optimization is enabled, IME GCC is not entirely reliable at detecting unintialized variables as it has limits on extent of its flow control analysis. Also consider the maint impact if a method includes C99 decls, and a contributor later needs to add a 'goto' error cleanup path. They might be forced to make a bunch of unrelated refactoring changes to eliminate use of the C99 decls. This makes me not a fan of having a situation where we tell contributors they can use C99 mixed decls, except when they can't use them. With regards, Daniel
Stefan Hajnoczi <stefanha@redhat.com> writes: > C99 mixed declarations support interleaving of local variable > declarations and code. > > The coding style "generally" forbids C99 mixed declarations with some > exceptions to the rule. This rule is not checked by checkpatch.pl and > naturally there are violations in the source tree. > > While contemplating adding another exception, I came to the conclusion > that the best location for declarations depends on context. Let the > programmer declare variables where it is best for legibility. Don't try > to define all possible scenarios/exceptions. > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> > --- > docs/devel/style.rst | 20 -------------------- > 1 file changed, 20 deletions(-) While I may be convinced there may be some cases where mixed declarations could make things simpler/clearer I don't support removing all the guidance and leaving it as a free for all as long as the compiler is happy.
Daniel P. Berrangé <berrange@redhat.com> writes: > On Mon, Feb 05, 2024 at 12:18:19PM -0500, Stefan Hajnoczi wrote: >> C99 mixed declarations support interleaving of local variable >> declarations and code. >> >> The coding style "generally" forbids C99 mixed declarations with some >> exceptions to the rule. This rule is not checked by checkpatch.pl and >> naturally there are violations in the source tree. >> >> While contemplating adding another exception, I came to the conclusion >> that the best location for declarations depends on context. Let the >> programmer declare variables where it is best for legibility. Don't try >> to define all possible scenarios/exceptions. > > IIRC, we had a discussion on this topic sometime last year, but can't > remember what the $SUBJECT was, so I'll just repeat what I said then. From: Juan Quintela <quintela@redhat.com> Subject: [PATCH] Change the default for Mixed declarations. Date: Tue, 14 Feb 2023 17:07:38 +0100 Message-Id: <20230214160738.88614-1-quintela@redhat.com> https://lore.kernel.org/qemu-devel/20230214160738.88614-1-quintela@redhat.com/ > Combining C99 mixed declarations with 'goto' is dangerous. > > Consider this program: > > $ cat jump.c > #include <stdlib.h> > > int main(int argc, char**argv) { > > if (getenv("SKIP")) > goto target; > > char *foo = malloc(30); > > target: > free(foo); > } > > $ gcc -Wall -Wuninitialized -o jump jump.c > > $ SKIP=1 ./jump > free(): invalid pointer > Aborted (core dumped) > > > -> The programmer thinks they have initialized 'foo' > -> GCC thinks the programmer has initialized 'foo' > -> Yet 'foo' is not guaranteed to be initialized at 'target:' > > Given that QEMU makes heavy use of 'goto', allowing C99 mixed > declarations exposes us to significant danger. > > Full disclosure, GCC fails to diagnmose this mistake, even > with a decl at start of 'main', but at least the mistake is > now more visible to the programmer. > > Fortunately with -fanalyzer GCC can diagnose this: > > $ gcc -fanalyzer -Wall -o jump jump.c > jump.c: In function ‘main’: > jump.c:12:3: warning: use of uninitialized value ‘foo’ [CWE-457] [-Wanalyzer-use-of-uninitialized-value] > 12 | free(foo); > | ^~~~~~~~~ > ‘main’: events 1-5 > | > | 6 | if (getenv("SKIP")) > | | ~ > | | | > | | (3) following ‘true’ branch... > | 7 | goto target; > | | ~~~~ > | | | > | | (4) ...to here > | 8 | > | 9 | char *foo = malloc(30); > | | ^~~ > | | | > | | (1) region created on stack here > | | (2) capacity: 8 bytes > |...... > | 12 | free(foo); > | | ~~~~~~~~~ > | | | > | | (5) use of uninitialized value ‘foo’ here > > > ...but -fanalyzer isn't something we have enabled by default, it > is opt-in. I'm also not sure how comprehensive the flow control > analysis of -fanalyzer is ? Can we be sure it'll catch these > mistakes in large complex functions with many code paths ? > > Even if the compiler does reliably warn, I think the code pattern > remains misleading to contributors, as the flow control flaw is > very non-obvious. Yup. Strong dislike. > Rather than accept the status quo and remove the coding guideline, > I think we should strengthen the guidelines, such that it is > explicitly forbidden in any method that uses 'goto'. Personally > I'd go all the way to -Werror=declaration-after-statement, as I support this. > while C99 mixed decl is appealing, Not to me. I much prefer declarations and statements to be visually distinct. Putting declarations first and separating from statements them with a blank line accomplishes that. Less necessary in languages where declarations are syntactically obvious. > it isn't exactly a game > changer in improving code maintainability. [...]
On Mon, 5 Feb 2024 at 12:19, Stefan Hajnoczi <stefanha@redhat.com> wrote: > > C99 mixed declarations support interleaving of local variable > declarations and code. > > The coding style "generally" forbids C99 mixed declarations with some > exceptions to the rule. This rule is not checked by checkpatch.pl and > naturally there are violations in the source tree. > > While contemplating adding another exception, I came to the conclusion > that the best location for declarations depends on context. Let the > programmer declare variables where it is best for legibility. Don't try > to define all possible scenarios/exceptions. > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> > --- > docs/devel/style.rst | 20 -------------------- > 1 file changed, 20 deletions(-) I'm withdrawing this patch because it seems many of the other QEMU maintainers prefer not to have C99 mixed declarations. Stefan > diff --git a/docs/devel/style.rst b/docs/devel/style.rst > index 2f68b50079..80c4e4df52 100644 > --- a/docs/devel/style.rst > +++ b/docs/devel/style.rst > @@ -199,26 +199,6 @@ Rationale: a consistent (except for functions...) bracing style reduces > ambiguity and avoids needless churn when lines are added or removed. > Furthermore, it is the QEMU coding style. > > -Declarations > -============ > - > -Mixed declarations (interleaving statements and declarations within > -blocks) are generally not allowed; declarations should be at the beginning > -of blocks. To avoid accidental re-use it is permissible to declare > -loop variables inside for loops: > - > -.. code-block:: c > - > - for (int i = 0; i < ARRAY_SIZE(thing); i++) { > - /* do something loopy */ > - } > - > -Every now and then, an exception is made for declarations inside a > -#ifdef or #ifndef block: if the code looks nicer, such declarations can > -be placed at the top of the block even if there are statements above. > -On the other hand, however, it's often best to move that #ifdef/#ifndef > -block to a separate function altogether. > - > Conditional statements > ====================== > > -- > 2.43.0 > >
On 6/2/24 06:53, Markus Armbruster wrote: > Daniel P. Berrangé <berrange@redhat.com> writes: > >> On Mon, Feb 05, 2024 at 12:18:19PM -0500, Stefan Hajnoczi wrote: >>> C99 mixed declarations support interleaving of local variable >>> declarations and code. >>> >>> The coding style "generally" forbids C99 mixed declarations with some >>> exceptions to the rule. This rule is not checked by checkpatch.pl and >>> naturally there are violations in the source tree. >>> >>> While contemplating adding another exception, I came to the conclusion >>> that the best location for declarations depends on context. Let the >>> programmer declare variables where it is best for legibility. Don't try >>> to define all possible scenarios/exceptions. ... >> Even if the compiler does reliably warn, I think the code pattern >> remains misleading to contributors, as the flow control flaw is >> very non-obvious. > > Yup. Strong dislike. > >> Rather than accept the status quo and remove the coding guideline, >> I think we should strengthen the guidelines, such that it is >> explicitly forbidden in any method that uses 'goto'. Personally >> I'd go all the way to -Werror=declaration-after-statement, as > > I support this. > >> while C99 mixed decl is appealing, > > Not to me. > > I much prefer declarations and statements to be visually distinct. > Putting declarations first and separating from statements them with a > blank line accomplishes that. Less necessary in languages where > declarations are syntactically obvious. But we already implicitly suggest C99, see commit ae7c80a7bd ("error: New macro ERRP_GUARD()"): * To use ERRP_GUARD(), add it right at the beginning of the function. * @errp can then be used without worrying about the argument being * NULL or &error_fatal. #define ERRP_GUARD() \ g_auto(ErrorPropagator) _auto_errp_prop = {.errp = errp}; \ do { \ if (!errp || errp == &error_fatal) { \ errp = &_auto_errp_prop.local_err; \ } \ } while (0) Or commit 5626f8c6d4 ("rcu: Add automatically released rcu_read_lock variants") with WITH_RCU_READ*: util/aio-posix.c:540:5: error: mixing declarations and code is incompatible with standards before C99 [-Werror,-Wdeclaration-after-statement] RCU_READ_LOCK_GUARD(); ^ include/qemu/rcu.h:189:28: note: expanded from macro 'RCU_READ_LOCK_GUARD' g_autoptr(RCUReadAuto) _rcu_read_auto __attribute__((unused)) = rcu_read_auto_lock() ^
Philippe Mathieu-Daudé <philmd@linaro.org> writes: > On 6/2/24 06:53, Markus Armbruster wrote: >> Daniel P. Berrangé <berrange@redhat.com> writes: >> >>> On Mon, Feb 05, 2024 at 12:18:19PM -0500, Stefan Hajnoczi wrote: >>>> C99 mixed declarations support interleaving of local variable >>>> declarations and code. >>>> >>>> The coding style "generally" forbids C99 mixed declarations with some >>>> exceptions to the rule. This rule is not checked by checkpatch.pl and >>>> naturally there are violations in the source tree. >>>> >>>> While contemplating adding another exception, I came to the conclusion >>>> that the best location for declarations depends on context. Let the >>>> programmer declare variables where it is best for legibility. Don't try >>>> to define all possible scenarios/exceptions. > ... > >>> Even if the compiler does reliably warn, I think the code pattern >>> remains misleading to contributors, as the flow control flaw is >>> very non-obvious. >> >> Yup. Strong dislike. >> >>> Rather than accept the status quo and remove the coding guideline, >>> I think we should strengthen the guidelines, such that it is >>> explicitly forbidden in any method that uses 'goto'. Personally >>> I'd go all the way to -Werror=declaration-after-statement, as >> >> I support this. >> >>> while C99 mixed decl is appealing, >> >> Not to me. >> >> I much prefer declarations and statements to be visually distinct. >> Putting declarations first and separating from statements them with a >> blank line accomplishes that. Less necessary in languages where >> declarations are syntactically obvious. > > But we already implicitly suggest C99, see commit ae7c80a7bd > ("error: New macro ERRP_GUARD()"): > > * To use ERRP_GUARD(), add it right at the beginning of the function. > * @errp can then be used without worrying about the argument being > * NULL or &error_fatal. > > #define ERRP_GUARD() \ > g_auto(ErrorPropagator) _auto_errp_prop = {.errp = errp}; \ > do { \ > if (!errp || errp == &error_fatal) { \ > errp = &_auto_errp_prop.local_err; \ > } \ > } while (0) We can make ERRP_GUARD() expand into just a declaration: #define ERRP_GUARD() \ g_auto(ErrorPropagator) _auto_errp_prop = { \ .errp = errp, \ .local_err = ((!errp || errp == &error_fatal \ ? errp = &_auto_errp_prop.local_err \ : NULL), \ NULL) } > Or commit 5626f8c6d4 ("rcu: Add automatically released rcu_read_lock > variants") with WITH_RCU_READ*: > > util/aio-posix.c:540:5: error: mixing declarations and code is > incompatible with standards before C99 > [-Werror,-Wdeclaration-after-statement] > RCU_READ_LOCK_GUARD(); > ^ > include/qemu/rcu.h:189:28: note: expanded from macro 'RCU_READ_LOCK_GUARD' > g_autoptr(RCUReadAuto) _rcu_read_auto __attribute__((unused)) = > rcu_read_auto_lock() > ^ Valid example; RCU_READ_LOCK_GUARD() expands into a declaration. To enable -Wdeclaration-after-statement, we'd have to futz around with _Pragma.
diff --git a/docs/devel/style.rst b/docs/devel/style.rst index 2f68b50079..80c4e4df52 100644 --- a/docs/devel/style.rst +++ b/docs/devel/style.rst @@ -199,26 +199,6 @@ Rationale: a consistent (except for functions...) bracing style reduces ambiguity and avoids needless churn when lines are added or removed. Furthermore, it is the QEMU coding style. -Declarations -============ - -Mixed declarations (interleaving statements and declarations within -blocks) are generally not allowed; declarations should be at the beginning -of blocks. To avoid accidental re-use it is permissible to declare -loop variables inside for loops: - -.. code-block:: c - - for (int i = 0; i < ARRAY_SIZE(thing); i++) { - /* do something loopy */ - } - -Every now and then, an exception is made for declarations inside a -#ifdef or #ifndef block: if the code looks nicer, such declarations can -be placed at the top of the block even if there are statements above. -On the other hand, however, it's often best to move that #ifdef/#ifndef -block to a separate function altogether. - Conditional statements ======================
C99 mixed declarations support interleaving of local variable declarations and code. The coding style "generally" forbids C99 mixed declarations with some exceptions to the rule. This rule is not checked by checkpatch.pl and naturally there are violations in the source tree. While contemplating adding another exception, I came to the conclusion that the best location for declarations depends on context. Let the programmer declare variables where it is best for legibility. Don't try to define all possible scenarios/exceptions. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> --- docs/devel/style.rst | 20 -------------------- 1 file changed, 20 deletions(-)