diff mbox

Implement -Wimplicit-fallthrough (version 9)

Message ID yddintiyebz.fsf@CeBiTec.Uni-Bielefeld.DE
State New
Headers show

Commit Message

Rainer Orth Sept. 26, 2016, 1:34 p.m. UTC
Hi Marek,

> All right.  I'll commit the patch on Monday.

this patch caused quite some breakage: Ada, Solaris/x86 and SPARC don't
bootstrap any longer. 

The following patch allows i386-pc-solaris2.12 and sparc-sun-solaris2.12
bootstraps continue.

Strangely, I needed to use gcc_fallthrough () in i386.c; a mere /* FALLTHRU */
had no effect.

Ok for mainline if the bootstraps pass (with appropriate changelog
entries, of course)?

	Rainer

Comments

Jakub Jelinek Sept. 26, 2016, 1:41 p.m. UTC | #1
On Mon, Sep 26, 2016 at 03:34:40PM +0200, Rainer Orth wrote:
> Ok for mainline if the bootstraps pass (with appropriate changelog
> entries, of course)?

Yes.

> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -17917,6 +17917,7 @@ ix86_print_operand (FILE *file, rtx x, i
>  #ifdef HAVE_AS_IX86_CMOV_SUN_SYNTAX
>  	  if (ASSEMBLER_DIALECT == ASM_ATT)
>  	    putc ('.', file);
> +	  gcc_fallthrough ();
>  #endif

Where have you been adding the /*FALLTHROUGH*/ ?  Before #endif or after it?

	Jakub
Marek Polacek Sept. 26, 2016, 1:42 p.m. UTC | #2
On Mon, Sep 26, 2016 at 03:34:40PM +0200, Rainer Orth wrote:
> Hi Marek,
> 
> > All right.  I'll commit the patch on Monday.
> 
> this patch caused quite some breakage: Ada, Solaris/x86 and SPARC don't
> bootstrap any longer. 
 
Sorry about that.  I had tested Ada + x86_64/ppc64/aarch64, but couldn't
test neither Solaris nor SPARC.

> The following patch allows i386-pc-solaris2.12 and sparc-sun-solaris2.12
> bootstraps continue.
> 
> Strangely, I needed to use gcc_fallthrough () in i386.c; a mere /* FALLTHRU */
> had no effect.
 
Yes, this is expected.  /* FALLTHRU */ comments only work if the next token
is a case label or default label.

> Ok for mainline if the bootstraps pass (with appropriate changelog
> entries, of course)?

LGTM, though I can't approve.

Thanks a lot.

	Marek
Arnaud Charlet Sept. 26, 2016, 3:06 p.m. UTC | #3
> this patch caused quite some breakage: Ada, Solaris/x86 and SPARC don't
> bootstrap any longer. 
> 
> The following patch allows i386-pc-solaris2.12 and
> sparc-sun-solaris2.12
> bootstraps continue.
> 
> Strangely, I needed to use gcc_fallthrough () in i386.c; a mere /* FALLTHRU
> */
> had no effect.
> 
> Ok for mainline if the bootstraps pass (with appropriate changelog
> entries, of course)?

Ada part is OK
Jason Merrill Sept. 26, 2016, 3:46 p.m. UTC | #4
It seems unfortunate that the warning doesn't accept /* ... fall
through ... */ as a fallthrough comment.

Jason
Rainer Orth Sept. 26, 2016, 8:17 p.m. UTC | #5
Hi Jakub,

> On Mon, Sep 26, 2016 at 03:34:40PM +0200, Rainer Orth wrote:
>> Ok for mainline if the bootstraps pass (with appropriate changelog
>> entries, of course)?
>
> Yes.

testing completed successfully, so I've installed the patch with this
ChangeLog entry:

2016-09-26  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	gcc:
	* config/i386/i386.c (ix86_print_operand)
	[HAVE_AS_IX86_CMOV_SUN_SYNTAX]: Add gcc_fallthrough.
	* config/sparc/sparc.c (check_pic): Add fallthrough comment.
	(epilogue_renumber): Likewise.

	gcc/ada:
	* gcc-interface/decl.c: Fix fall through comment formatting.
	* gcc-interface/misc.c: Likewise.
	* gcc-interface/trans.c: Likewise.
	* gcc-interface/utils.c: Likewise.
	* gcc-interface/utils2.c: Likewise.

>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -17917,6 +17917,7 @@ ix86_print_operand (FILE *file, rtx x, i
>>  #ifdef HAVE_AS_IX86_CMOV_SUN_SYNTAX
>>  	  if (ASSEMBLER_DIALECT == ASM_ATT)
>>  	    putc ('.', file);
>> +	  gcc_fallthrough ();
>>  #endif
>
> Where have you been adding the /*FALLTHROUGH*/ ?  Before #endif or after it?

Before: seemed natural to me.

	Rainer
Rainer Orth Sept. 26, 2016, 8:21 p.m. UTC | #6
Hi Marek,

> On Mon, Sep 26, 2016 at 03:34:40PM +0200, Rainer Orth wrote:
>> Hi Marek,
>> 
>> > All right.  I'll commit the patch on Monday.
>> 
>> this patch caused quite some breakage: Ada, Solaris/x86 and SPARC don't
>> bootstrap any longer. 
>  
> Sorry about that.  I had tested Ada + x86_64/ppc64/aarch64, but couldn't
> test neither Solaris nor SPARC.

the Solaris/x86 and SPARC issues were minor; I was just astonished to
see that so much didn't work on the Ada side.

>> The following patch allows i386-pc-solaris2.12 and sparc-sun-solaris2.12
>> bootstraps continue.
>> 
>> Strangely, I needed to use gcc_fallthrough () in i386.c; a mere /* FALLTHRU */
>> had no effect.
>  
> Yes, this is expected.  /* FALLTHRU */ comments only work if the next token
> is a case label or default label.

Ok, that explains what's going on.  I guess it would be good to make
this explicit in invoke.texi since someone else is guaranteed to stumble
across this as well.

Thanks.
        Rainer
Eric Botcazou Sept. 27, 2016, 6:49 a.m. UTC | #7
> It seems unfortunate that the warning doesn't accept /* ... fall
> through ... */ as a fallthrough comment.

Seconded.  The warning should take into account existing practices instead of 
forcing the user to make completely bogus changes to the code (and Ada should 
have been tested before the patch was approved).
Eric Botcazou Sept. 27, 2016, 7:55 a.m. UTC | #8
> Seconded.  The warning should take into account existing practices instead
> of forcing the user to make completely bogus changes to the code (and Ada
> should have been tested before the patch was approved).

I have a bootstrap failure on x86-64/Linux:

/home/eric/svn/gcc/gcc/combine.c: In function 'rtx_code 
simplify_comparison(rtx_code, rtx_def**, rtx_def**)':
/home/eric/svn/gcc/gcc/combine.c:11928:11: error: this statement may fall 
through [-Werror=implicit-fallthrough]
      break;
           ^
/home/eric/svn/gcc/gcc/combine.c:11932:2: note: here
  case ZERO_EXTEND:
  ^~~~
/home/eric/svn/gcc/gcc/combine.c:12340:6: error: this statement may fall 
through [-Werror=implicit-fallthrough]
      }
      ^
/home/eric/svn/gcc/gcc/combine.c:12343:2: note: here
  case LSHIFTRT:
  ^~~~

	  /* ... fall through ...  */

	  /* ... fall through ...  */
Jakub Jelinek Sept. 27, 2016, 8:17 a.m. UTC | #9
On Tue, Sep 27, 2016 at 08:49:00AM +0200, Eric Botcazou wrote:
> > It seems unfortunate that the warning doesn't accept /* ... fall
> > through ... */ as a fallthrough comment.
> 
> Seconded.  The warning should take into account existing practices instead of 
> forcing the user to make completely bogus changes to the code (and Ada should 
> have been tested before the patch was approved).

The intent has been that we catch the most common forms, but still require it not
to be complete free form.  Because, as experience shows, people are
extremely creative in these comments, and it is not very good idea to
support everything.  For ... fall through ... , what is the purpose of those
...s?

	Jakub
Eric Botcazou Sept. 27, 2016, 8:46 a.m. UTC | #10
> The intent has been that we catch the most common forms, but still require
> it not to be complete free form.  Because, as experience shows, people are
> extremely creative in these comments, and it is not very good idea to
> support everything.  For ... fall through ... , what is the purpose of
> those ...s?

No idea, but it has been there for a while and seems perfectly reasonable.
IMO any sentence containing "fall" and "through/thru/etc" on the same line 
should be accepted, otherwise it's just misplaced pickiness.
Markus Trippelsdorf Sept. 27, 2016, 10:39 a.m. UTC | #11
On 2016.09.27 at 10:46 +0200, Eric Botcazou wrote:
> > The intent has been that we catch the most common forms, but still require
> > it not to be complete free form.  Because, as experience shows, people are
> > extremely creative in these comments, and it is not very good idea to
> > support everything.  For ... fall through ... , what is the purpose of
> > those ...s?
> 
> No idea, but it has been there for a while and seems perfectly reasonable.
> IMO any sentence containing "fall" and "through/thru/etc" on the same line 
> should be accepted, otherwise it's just misplaced pickiness.

+1. Folks will just disable the warning if gcc is not very permissive
when paring existing comments. You cannot expect anyone to change
perfectly fine fall-through comments just to accommodate an arbitrary
gcc style.
Jakub Jelinek Sept. 27, 2016, 10:47 a.m. UTC | #12
On Tue, Sep 27, 2016 at 12:39:41PM +0200, Markus Trippelsdorf wrote:
> On 2016.09.27 at 10:46 +0200, Eric Botcazou wrote:
> > > The intent has been that we catch the most common forms, but still require
> > > it not to be complete free form.  Because, as experience shows, people are
> > > extremely creative in these comments, and it is not very good idea to
> > > support everything.  For ... fall through ... , what is the purpose of
> > > those ...s?
> > 
> > No idea, but it has been there for a while and seems perfectly reasonable.
> > IMO any sentence containing "fall" and "through/thru/etc" on the same line 
> > should be accepted, otherwise it's just misplaced pickiness.
> 
> +1. Folks will just disable the warning if gcc is not very permissive
> when paring existing comments. You cannot expect anyone to change
> perfectly fine fall-through comments just to accommodate an arbitrary
> gcc style.

The accepted style is already very permissive, we don't allow just one
spelling as various lint tools.  I'm afraid looking for various cases of
fall and through/thru possibly separated by anything and surrounded by
anything is IMHO already too much, the compiler shouldn't try to try to
grammar analyze the comments on what they actually talk about and whether it
might be related to the switch fall through or something completely
different.  Users should start using [[fallthrough]]; anyway.

	Jakub
Jakub Jelinek Sept. 27, 2016, 10:52 a.m. UTC | #13
On Tue, Sep 27, 2016 at 12:47:50PM +0200, Jakub Jelinek wrote:
> On Tue, Sep 27, 2016 at 12:39:41PM +0200, Markus Trippelsdorf wrote:
> > On 2016.09.27 at 10:46 +0200, Eric Botcazou wrote:
> > > > The intent has been that we catch the most common forms, but still require
> > > > it not to be complete free form.  Because, as experience shows, people are
> > > > extremely creative in these comments, and it is not very good idea to
> > > > support everything.  For ... fall through ... , what is the purpose of
> > > > those ...s?
> > > 
> > > No idea, but it has been there for a while and seems perfectly reasonable.
> > > IMO any sentence containing "fall" and "through/thru/etc" on the same line 
> > > should be accepted, otherwise it's just misplaced pickiness.
> > 
> > +1. Folks will just disable the warning if gcc is not very permissive
> > when paring existing comments. You cannot expect anyone to change
> > perfectly fine fall-through comments just to accommodate an arbitrary
> > gcc style.
> 
> The accepted style is already very permissive, we don't allow just one
> spelling as various lint tools.  I'm afraid looking for various cases of
> fall and through/thru possibly separated by anything and surrounded by
> anything is IMHO already too much, the compiler shouldn't try to try to
> grammar analyze the comments on what they actually talk about and whether it
> might be related to the switch fall through or something completely
> different.  Users should start using [[fallthrough]]; anyway.

Oh, forgot, I think allowing
  /* Fallthrough */
  /* arbitrary comment */
  case ...
might be something we could be also supporting, especially because
sometimes users might want to comment on what the following case handle and
fallthrough would be just something in between.  But IMHO forcing users to
use some clear markup style if they don't want to/can't switch to
[[fallthrough]];
__attribute__((fallthrough));
or some macro that does that is a good idea.  That will certainly increase
the chance other compilers could do the same thing, parsing arbitrary stuff
is hard to agree on.

	Jakub
Marek Polacek Sept. 27, 2016, 10:56 a.m. UTC | #14
On Tue, Sep 27, 2016 at 12:47:50PM +0200, Jakub Jelinek wrote:
> On Tue, Sep 27, 2016 at 12:39:41PM +0200, Markus Trippelsdorf wrote:
> > On 2016.09.27 at 10:46 +0200, Eric Botcazou wrote:
> > > > The intent has been that we catch the most common forms, but still require
> > > > it not to be complete free form.  Because, as experience shows, people are
> > > > extremely creative in these comments, and it is not very good idea to
> > > > support everything.  For ... fall through ... , what is the purpose of
> > > > those ...s?
> > > 
> > > No idea, but it has been there for a while and seems perfectly reasonable.
> > > IMO any sentence containing "fall" and "through/thru/etc" on the same line 
> > > should be accepted, otherwise it's just misplaced pickiness.
> > 
> > +1. Folks will just disable the warning if gcc is not very permissive
> > when paring existing comments. You cannot expect anyone to change
> > perfectly fine fall-through comments just to accommodate an arbitrary
> > gcc style.
> 
> The accepted style is already very permissive, we don't allow just one
> spelling as various lint tools.  I'm afraid looking for various cases of
> fall and through/thru possibly separated by anything and surrounded by
> anything is IMHO already too much, the compiler shouldn't try to try to
> grammar analyze the comments on what they actually talk about and whether it
> might be related to the switch fall through or something completely
> different.  Users should start using [[fallthrough]]; anyway.

I'm thinking perhaps we should also accept /* ... fall through ... */
and /* else fall through */, but accepting any sentence containing "fall" and
"through/thru/etc" on the same line would mean that we also accept
/* Don't fall through here.  */ and that is clearly not desirable.

	Marek
Eric Botcazou Sept. 27, 2016, 11:06 a.m. UTC | #15
> The accepted style is already very permissive, we don't allow just one
> spelling as various lint tools.

Well, it cannot even handle the variations of a single codebase, GCC itself, 
so I'm afraid very permissive is not exactly the appropriate wording here.
Anyway, we'll see whether Bugzilla is flooded or not in the upcoming days.
Richard Biener Sept. 27, 2016, 11:09 a.m. UTC | #16
On Tue, Sep 27, 2016 at 1:06 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> The accepted style is already very permissive, we don't allow just one
>> spelling as various lint tools.
>
> Well, it cannot even handle the variations of a single codebase, GCC itself,
> so I'm afraid very permissive is not exactly the appropriate wording here.
> Anyway, we'll see whether Bugzilla is flooded or not in the upcoming days.

During discussion I already pointed out that people may use non-english
variants as well.  I've seen a lot of french variable/function names in my
academic life for example.

Richard.

> --
> Eric Botcazou
Jakub Jelinek Sept. 27, 2016, 11:31 a.m. UTC | #17
On Tue, Sep 27, 2016 at 12:56:29PM +0200, Marek Polacek wrote:
> On Tue, Sep 27, 2016 at 12:47:50PM +0200, Jakub Jelinek wrote:
> > On Tue, Sep 27, 2016 at 12:39:41PM +0200, Markus Trippelsdorf wrote:
> > > On 2016.09.27 at 10:46 +0200, Eric Botcazou wrote:
> > > > > The intent has been that we catch the most common forms, but still require
> > > > > it not to be complete free form.  Because, as experience shows, people are
> > > > > extremely creative in these comments, and it is not very good idea to
> > > > > support everything.  For ... fall through ... , what is the purpose of
> > > > > those ...s?
> > > > 
> > > > No idea, but it has been there for a while and seems perfectly reasonable.
> > > > IMO any sentence containing "fall" and "through/thru/etc" on the same line 
> > > > should be accepted, otherwise it's just misplaced pickiness.
> > > 
> > > +1. Folks will just disable the warning if gcc is not very permissive
> > > when paring existing comments. You cannot expect anyone to change
> > > perfectly fine fall-through comments just to accommodate an arbitrary
> > > gcc style.
> > 
> > The accepted style is already very permissive, we don't allow just one
> > spelling as various lint tools.  I'm afraid looking for various cases of
> > fall and through/thru possibly separated by anything and surrounded by
> > anything is IMHO already too much, the compiler shouldn't try to try to
> > grammar analyze the comments on what they actually talk about and whether it
> > might be related to the switch fall through or something completely
> > different.  Users should start using [[fallthrough]]; anyway.
> 
> I'm thinking perhaps we should also accept /* ... fall through ... */
> and /* else fall through */, but accepting any sentence containing "fall" and
> "through/thru/etc" on the same line would mean that we also accept
> /* Don't fall through here.  */ and that is clearly not desirable.

I think it is important to think in terms of what regexps we still want to
match, even when the matching is actually implemented in C, not using
regexps.  And yes, you list one reason why arbitrary text with fall and
through somewhere in it is not a good idea.  Another:
/* XXX Really fallthru?  */
(what we have in pch.c).
So, if you want to allow ... fall through ... and else fall through, and
perhaps // fall through - some explanation
then it might be e.g.
//-fallthrough$
//@fallthrough@$
/\*-fallthrough\*/
/\*@fallthrough@\*/
//[ \t.]*(ELSE )?FALL(S | |-)?THR(OUGH|U)[ \t.]*(-.*)?$
//[ \t.]*(Else )?Fall(s | |-)?[Tt]hr(ough|u)[ \t.]*(-.*)?$
//[ \t.]*(else )?fall(s | |-)?thr(ough|u)[ \t.]*(-.*)?$
/\*[ \t.]*(ELSE )?FALL(S | |-)?THR(OUGH|U)[ \t.]*(-.*)?\*/
/\*[ \t.]*(Else )?Fall(s | |-)?[Tt]hr(ough|u)[ \t.]*(-.*)?\*/
/\*[ \t.]*(else )?fall(s | |-)?thr(ough|u)[ \t.]*(-.*)?\*/
where . would match even newlines in the last 3,
but $ would always match just end of line?

	Jakub
Markus Trippelsdorf Sept. 27, 2016, 11:46 a.m. UTC | #18
On 2016.09.27 at 12:56 +0200, Marek Polacek wrote:
> On Tue, Sep 27, 2016 at 12:47:50PM +0200, Jakub Jelinek wrote:
> > On Tue, Sep 27, 2016 at 12:39:41PM +0200, Markus Trippelsdorf wrote:
> > > On 2016.09.27 at 10:46 +0200, Eric Botcazou wrote:
> > > > > The intent has been that we catch the most common forms, but still require
> > > > > it not to be complete free form.  Because, as experience shows, people are
> > > > > extremely creative in these comments, and it is not very good idea to
> > > > > support everything.  For ... fall through ... , what is the purpose of
> > > > > those ...s?
> > > >
> > > > No idea, but it has been there for a while and seems perfectly reasonable.
> > > > IMO any sentence containing "fall" and "through/thru/etc" on the same line
> > > > should be accepted, otherwise it's just misplaced pickiness.
> > >
> > > +1. Folks will just disable the warning if gcc is not very permissive
> > > when paring existing comments. You cannot expect anyone to change
> > > perfectly fine fall-through comments just to accommodate an arbitrary
> > > gcc style.
> >
> > The accepted style is already very permissive, we don't allow just one
> > spelling as various lint tools.  I'm afraid looking for various cases of
> > fall and through/thru possibly separated by anything and surrounded by
> > anything is IMHO already too much, the compiler shouldn't try to try to
> > grammar analyze the comments on what they actually talk about and whether it
> > might be related to the switch fall through or something completely
> > different.  Users should start using [[fallthrough]]; anyway.
>
> I'm thinking perhaps we should also accept /* ... fall through ... */
> and /* else fall through */, but accepting any sentence containing "fall" and
> "through/thru/etc" on the same line would mean that we also accept
> /* Don't fall through here.  */ and that is clearly not desirable.
>

I'm also wondering about the situation where not a single break is used
in all of the cases. It would be best not to warn here.

An example from ffmpeg:

#define LPC1(x) {           \
    int c = coefs[(x)-1];   \
    p0   += MUL(c, s);      \
    s     = smp[i-(x)+1];   \
    p1   += MUL(c, s);      \
}

static av_always_inline void FUNC(lpc_encode_unrolled)(int32_t *res,
                                  const int32_t *smp, int len, int order,
                                  const int32_t *coefs, int shift, int big)
{
    int i;
    for (i = order; i < len; i += 2) {
        int s  = smp[i-order];
        sum_type p0 = 0, p1 = 0;
        if (big) {
            switch (order) {
            case 32: LPC1(32)
            case 31: LPC1(31)
            case 30: LPC1(30)
            case 29: LPC1(29)
            case 28: LPC1(28)
            case 27: LPC1(27)
            case 26: LPC1(26)
            case 25: LPC1(25)
            case 24: LPC1(24)
            case 23: LPC1(23)
            case 22: LPC1(22)
            case 21: LPC1(21)
            case 20: LPC1(20)
            case 19: LPC1(19)
            case 18: LPC1(18)
            case 17: LPC1(17)
            case 16: LPC1(16)
            case 15: LPC1(15)
            case 14: LPC1(14)
            case 13: LPC1(13)
            case 12: LPC1(12)
            case 11: LPC1(11)
            case 10: LPC1(10)
            case  9: LPC1( 9)
                     LPC1( 8)
                     LPC1( 7)
                     LPC1( 6)
                     LPC1( 5)
                     LPC1( 4)
                     LPC1( 3)
                     LPC1( 2)
                     LPC1( 1)
            }
        } else {
            switch (order) {
            case  8: LPC1( 8)
            case  7: LPC1( 7)
            case  6: LPC1( 6)
            case  5: LPC1( 5)
            case  4: LPC1( 4)
            case  3: LPC1( 3)
            case  2: LPC1( 2)
            case  1: LPC1( 1)
            }
        }
        res[i  ] = smp[i  ] - CLIP(p0 >> shift);
        res[i+1] = smp[i+1] - CLIP(p1 >> shift);
    }
}


--
Markus
Bernd Schmidt Sept. 27, 2016, 11:47 a.m. UTC | #19
On 09/27/2016 01:09 PM, Richard Biener wrote:
> On Tue, Sep 27, 2016 at 1:06 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>>> The accepted style is already very permissive, we don't allow just one
>>> spelling as various lint tools.
>>
>> Well, it cannot even handle the variations of a single codebase, GCC itself,
>> so I'm afraid very permissive is not exactly the appropriate wording here.
>> Anyway, we'll see whether Bugzilla is flooded or not in the upcoming days.
>
> During discussion I already pointed out that people may use non-english
> variants as well.  I've seen a lot of french variable/function names in my
> academic life for example.

Yes, I pointed out the same thing a few weeks ago.

The warning does seem to be useful and discover errors, but I worry 
about the large amount of false positives it produces.


Bernd
Segher Boessenkool Sept. 27, 2016, 11:51 a.m. UTC | #20
On Tue, Sep 27, 2016 at 01:31:15PM +0200, Jakub Jelinek wrote:
> I think it is important to think in terms of what regexps we still want to
> match, even when the matching is actually implemented in C, not using
> regexps.  And yes, you list one reason why arbitrary text with fall and
> through somewhere in it is not a good idea.  Another:
> /* XXX Really fallthru?  */
> (what we have in pch.c).
> So, if you want to allow ... fall through ... and else fall through, and
> perhaps // fall through - some explanation
> then it might be e.g.
> //-fallthrough$
> //@fallthrough@$
> /\*-fallthrough\*/
> /\*@fallthrough@\*/
> //[ \t.]*(ELSE )?FALL(S | |-)?THR(OUGH|U)[ \t.]*(-.*)?$
> //[ \t.]*(Else )?Fall(s | |-)?[Tt]hr(ough|u)[ \t.]*(-.*)?$
> //[ \t.]*(else )?fall(s | |-)?thr(ough|u)[ \t.]*(-.*)?$
> /\*[ \t.]*(ELSE )?FALL(S | |-)?THR(OUGH|U)[ \t.]*(-.*)?\*/
> /\*[ \t.]*(Else )?Fall(s | |-)?[Tt]hr(ough|u)[ \t.]*(-.*)?\*/
> /\*[ \t.]*(else )?fall(s | |-)?thr(ough|u)[ \t.]*(-.*)?\*/
> where . would match even newlines in the last 3,
> but $ would always match just end of line?

Any comment with text

^[^_[:alnum:]]*(else )?fall(s | |-)?thr(ough|u)[^_[:alnum:]]*$

perhaps?  Case-insensitive.  Or allow any amount of space, or even any
interpunction.  Just don't allow any alphanumerics except for those
exact words, and there won't be many false hits at all.


Segher
Marek Polacek Sept. 27, 2016, 11:51 a.m. UTC | #21
On Tue, Sep 27, 2016 at 01:47:07PM +0200, Bernd Schmidt wrote:
> On 09/27/2016 01:09 PM, Richard Biener wrote:
> > On Tue, Sep 27, 2016 at 1:06 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
> > > > The accepted style is already very permissive, we don't allow just one
> > > > spelling as various lint tools.
> > > 
> > > Well, it cannot even handle the variations of a single codebase, GCC itself,
> > > so I'm afraid very permissive is not exactly the appropriate wording here.
> > > Anyway, we'll see whether Bugzilla is flooded or not in the upcoming days.
> > 
> > During discussion I already pointed out that people may use non-english
> > variants as well.  I've seen a lot of french variable/function names in my
> > academic life for example.
> 
> Yes, I pointed out the same thing a few weeks ago.

But the C/C++ keywords are all English, too; lint tools only accept English,
and so it wouldn't seem unreasonable to only accept English keywords in the
comments.  And in any case, I don't see how a compiler can be expected to
be able to parse non-English languages.

	Marek
Bernd Schmidt Sept. 27, 2016, 11:55 a.m. UTC | #22
On 09/27/2016 01:51 PM, Marek Polacek wrote:
> But the C/C++ keywords are all English, too; lint tools only accept English,
> and so it wouldn't seem unreasonable to only accept English keywords in the
> comments.  And in any case, I don't see how a compiler can be expected to
> be able to parse non-English languages.

It isn't. But it can also be reasonably by expected not to warn about 
things that are valid according to the language specification and are 
frequently used.


Bernd
Jakub Jelinek Sept. 27, 2016, 11:58 a.m. UTC | #23
On Tue, Sep 27, 2016 at 06:51:31AM -0500, Segher Boessenkool wrote:
> On Tue, Sep 27, 2016 at 01:31:15PM +0200, Jakub Jelinek wrote:
> > I think it is important to think in terms of what regexps we still want to
> > match, even when the matching is actually implemented in C, not using
> > regexps.  And yes, you list one reason why arbitrary text with fall and
> > through somewhere in it is not a good idea.  Another:
> > /* XXX Really fallthru?  */
> > (what we have in pch.c).
> > So, if you want to allow ... fall through ... and else fall through, and
> > perhaps // fall through - some explanation
> > then it might be e.g.
> > //-fallthrough$
> > //@fallthrough@$
> > /\*-fallthrough\*/
> > /\*@fallthrough@\*/
> > //[ \t.]*(ELSE )?FALL(S | |-)?THR(OUGH|U)[ \t.]*(-.*)?$
> > //[ \t.]*(Else )?Fall(s | |-)?[Tt]hr(ough|u)[ \t.]*(-.*)?$
> > //[ \t.]*(else )?fall(s | |-)?thr(ough|u)[ \t.]*(-.*)?$
> > /\*[ \t.]*(ELSE )?FALL(S | |-)?THR(OUGH|U)[ \t.]*(-.*)?\*/
> > /\*[ \t.]*(Else )?Fall(s | |-)?[Tt]hr(ough|u)[ \t.]*(-.*)?\*/
> > /\*[ \t.]*(else )?fall(s | |-)?thr(ough|u)[ \t.]*(-.*)?\*/
> > where . would match even newlines in the last 3,
> > but $ would always match just end of line?
> 
> Any comment with text
> 
> ^[^_[:alnum:]]*(else )?fall(s | |-)?thr(ough|u)[^_[:alnum:]]*$
> 
> perhaps?  Case-insensitive.  Or allow any amount of space, or even any
> interpunction.  Just don't allow any alphanumerics except for those
> exact words, and there won't be many false hits at all.

Not sure we want to match FaLlS THrouGH, and [^_[:alnum:]]* isn't without a
problem either, what if there is hebrew, or chinese, etc. text in there?
The matching shouldn't depend on the current locale IMHO, and figuring out what
unicode entry points are letters and which are not really isn't easy without that.

IMO before changing anything further, we want to gather some statistics what
styles are actually used in the wild together with how often they are used,
and then for the more common ones decide what is really supportable.

	Jakub
Marek Polacek Sept. 27, 2016, 12:01 p.m. UTC | #24
On Tue, Sep 27, 2016 at 01:55:22PM +0200, Bernd Schmidt wrote:
> On 09/27/2016 01:51 PM, Marek Polacek wrote:
> > But the C/C++ keywords are all English, too; lint tools only accept English,
> > and so it wouldn't seem unreasonable to only accept English keywords in the
> > comments.  And in any case, I don't see how a compiler can be expected to
> > be able to parse non-English languages.
> 
> It isn't. But it can also be reasonably by expected not to warn about things
> that are valid according to the language specification and are frequently
> used.

Ok, but note that the warning is in -Wextra, not enabled by default/-Wall.
I'm all for reducing false positives whenever possible and we can improve
out comment-parsing heuristics, but I just can't see us handling anything
other than English.

	Marek
Marek Polacek Sept. 27, 2016, 12:03 p.m. UTC | #25
On Tue, Sep 27, 2016 at 01:46:08PM +0200, Markus Trippelsdorf wrote:
> I'm also wondering about the situation where not a single break is used
> in all of the cases. It would be best not to warn here.

This is tricky and I'm afraid all I can offer here is to use the diagnostics
pragma to suppress the warning for Duff's device-like constructs.

	Marek
Florian Weimer Sept. 27, 2016, 12:27 p.m. UTC | #26
* Marek Polacek:

> On Tue, Sep 27, 2016 at 01:46:08PM +0200, Markus Trippelsdorf wrote:
>> I'm also wondering about the situation where not a single break is used
>> in all of the cases. It would be best not to warn here.
>
> This is tricky and I'm afraid all I can offer here is to use the diagnostics
> pragma to suppress the warning for Duff's device-like constructs.

Would it make sense to apply the fallthrough attribute to the entire
switch statement to address such scenarios?  Currently, that does not
seem supported.
Jakub Jelinek Sept. 27, 2016, 12:31 p.m. UTC | #27
On Tue, Sep 27, 2016 at 02:27:12PM +0200, Florian Weimer wrote:
> * Marek Polacek:
> 
> > On Tue, Sep 27, 2016 at 01:46:08PM +0200, Markus Trippelsdorf wrote:
> >> I'm also wondering about the situation where not a single break is used
> >> in all of the cases. It would be best not to warn here.
> >
> > This is tricky and I'm afraid all I can offer here is to use the diagnostics
> > pragma to suppress the warning for Duff's device-like constructs.
> 
> Would it make sense to apply the fallthrough attribute to the entire
> switch statement to address such scenarios?  Currently, that does not
> seem supported.

Where the attribute is allowed or not allowed is currently intentionally
derived from where C++17 allows it.

	Jakub
Marek Polacek Sept. 27, 2016, 12:32 p.m. UTC | #28
On Tue, Sep 27, 2016 at 02:27:12PM +0200, Florian Weimer wrote:
> * Marek Polacek:
> 
> > On Tue, Sep 27, 2016 at 01:46:08PM +0200, Markus Trippelsdorf wrote:
> >> I'm also wondering about the situation where not a single break is used
> >> in all of the cases. It would be best not to warn here.
> >
> > This is tricky and I'm afraid all I can offer here is to use the diagnostics
> > pragma to suppress the warning for Duff's device-like constructs.
> 
> Would it make sense to apply the fallthrough attribute to the entire
> switch statement to address such scenarios?  Currently, that does not
> seem supported.

I've been thinking about this, too.  But I think we'd have to invent
a new attribute, e.g. no_warn_fallthrough or so.

	Marek
Segher Boessenkool Sept. 27, 2016, 12:48 p.m. UTC | #29
On Tue, Sep 27, 2016 at 01:58:54PM +0200, Jakub Jelinek wrote:
> > Any comment with text
> > 
> > ^[^_[:alnum:]]*(else )?fall(s | |-)?thr(ough|u)[^_[:alnum:]]*$
> > 
> > perhaps?  Case-insensitive.  Or allow any amount of space, or even any
> > interpunction.  Just don't allow any alphanumerics except for those
> > exact words, and there won't be many false hits at all.
> 
> Not sure we want to match FaLlS THrouGH,

Yes it's silly, but would it ever match the wrong thing?

> and [^_[:alnum:]]* isn't without a
> problem either, what if there is hebrew, or chinese, etc. text in there?

I meant in LANG=C, but it would work otherwise, too.  Nasty, of course.

> The matching shouldn't depend on the current locale IMHO, and figuring out what
> unicode entry points are letters and which are not really isn't easy without that.

Right.

> IMO before changing anything further, we want to gather some statistics what
> styles are actually used in the wild together with how often they are used,
> and then for the more common ones decide what is really supportable.

If you do not allow a lot then there will be many false negatives.


Segher
Bernd Schmidt Sept. 27, 2016, 1:48 p.m. UTC | #30
On 09/27/2016 02:01 PM, Marek Polacek wrote:
> On Tue, Sep 27, 2016 at 01:55:22PM +0200, Bernd Schmidt wrote:
>> On 09/27/2016 01:51 PM, Marek Polacek wrote:
>>> But the C/C++ keywords are all English, too; lint tools only accept English,
>>> and so it wouldn't seem unreasonable to only accept English keywords in the
>>> comments.  And in any case, I don't see how a compiler can be expected to
>>> be able to parse non-English languages.
>>
>> It isn't. But it can also be reasonably by expected not to warn about things
>> that are valid according to the language specification and are frequently
>> used.
>
> Ok, but note that the warning is in -Wextra, not enabled by default/-Wall.

I think it's problematic enough that it needs to be removed from -Wextra 
as well. The latest ia64 backend patch shows that clearly IMO.


Bernd
Jakub Jelinek Sept. 27, 2016, 1:49 p.m. UTC | #31
On Tue, Sep 27, 2016 at 03:48:07PM +0200, Bernd Schmidt wrote:
> On 09/27/2016 02:01 PM, Marek Polacek wrote:
> >On Tue, Sep 27, 2016 at 01:55:22PM +0200, Bernd Schmidt wrote:
> >>On 09/27/2016 01:51 PM, Marek Polacek wrote:
> >>>But the C/C++ keywords are all English, too; lint tools only accept English,
> >>>and so it wouldn't seem unreasonable to only accept English keywords in the
> >>>comments.  And in any case, I don't see how a compiler can be expected to
> >>>be able to parse non-English languages.
> >>
> >>It isn't. But it can also be reasonably by expected not to warn about things
> >>that are valid according to the language specification and are frequently
> >>used.
> >
> >Ok, but note that the warning is in -Wextra, not enabled by default/-Wall.
> 
> I think it's problematic enough that it needs to be removed from -Wextra as
> well. The latest ia64 backend patch shows that clearly IMO.

Just compare that to the number of real bugs the warning found in gcc
codebase.  It is really worth it for -Wextra.

	Jakub
Bernd Schmidt Sept. 27, 2016, 1:53 p.m. UTC | #32
On 09/27/2016 03:49 PM, Jakub Jelinek wrote:
> On Tue, Sep 27, 2016 at 03:48:07PM +0200, Bernd Schmidt wrote:
>> On 09/27/2016 02:01 PM, Marek Polacek wrote:
>>> On Tue, Sep 27, 2016 at 01:55:22PM +0200, Bernd Schmidt wrote:
>>>> On 09/27/2016 01:51 PM, Marek Polacek wrote:
>>>>> But the C/C++ keywords are all English, too; lint tools only accept English,
>>>>> and so it wouldn't seem unreasonable to only accept English keywords in the
>>>>> comments.  And in any case, I don't see how a compiler can be expected to
>>>>> be able to parse non-English languages.
>>>>
>>>> It isn't. But it can also be reasonably by expected not to warn about things
>>>> that are valid according to the language specification and are frequently
>>>> used.
>>>
>>> Ok, but note that the warning is in -Wextra, not enabled by default/-Wall.
>>
>> I think it's problematic enough that it needs to be removed from -Wextra as
>> well. The latest ia64 backend patch shows that clearly IMO.
>
> Just compare that to the number of real bugs the warning found in gcc
> codebase.  It is really worth it for -Wextra.

What's the ratio of comments "fixed" to actual bugs found? IMO this is 
not something we should inflict on users unasked.


Bernd
Michael Matz Sept. 27, 2016, 1:56 p.m. UTC | #33
Hi,

On Tue, 27 Sep 2016, Jakub Jelinek wrote:

> Just compare that to the number of real bugs the warning found in gcc 
> codebase.  It is really worth it for -Wextra.

All those bugs would also have been found as well when it had simply 
accepted
  /fall.*thr/i
anywhere in the preceding comment on one line.  But all the recent 
spelling changes of comments to cater for the strictness exactly shows how 
misguided that is.  The above would accept "Don't fall through" as well.  
I say: so what?


Ciao,
Michael.
Jakub Jelinek Sept. 27, 2016, 1:58 p.m. UTC | #34
On Tue, Sep 27, 2016 at 03:53:25PM +0200, Bernd Schmidt wrote:
> On 09/27/2016 03:49 PM, Jakub Jelinek wrote:
> >On Tue, Sep 27, 2016 at 03:48:07PM +0200, Bernd Schmidt wrote:
> >>On 09/27/2016 02:01 PM, Marek Polacek wrote:
> >>>On Tue, Sep 27, 2016 at 01:55:22PM +0200, Bernd Schmidt wrote:
> >>>>On 09/27/2016 01:51 PM, Marek Polacek wrote:
> >>>>>But the C/C++ keywords are all English, too; lint tools only accept English,
> >>>>>and so it wouldn't seem unreasonable to only accept English keywords in the
> >>>>>comments.  And in any case, I don't see how a compiler can be expected to
> >>>>>be able to parse non-English languages.
> >>>>
> >>>>It isn't. But it can also be reasonably by expected not to warn about things
> >>>>that are valid according to the language specification and are frequently
> >>>>used.
> >>>
> >>>Ok, but note that the warning is in -Wextra, not enabled by default/-Wall.
> >>
> >>I think it's problematic enough that it needs to be removed from -Wextra as
> >>well. The latest ia64 backend patch shows that clearly IMO.
> >
> >Just compare that to the number of real bugs the warning found in gcc
> >codebase.  It is really worth it for -Wextra.
> 
> What's the ratio of comments "fixed" to actual bugs found? IMO this is not
> something we should inflict on users unasked.

We've inflicted on users many other coding style warnings that have a
reasonably high chance to reveal real bugs, e.g. -Wmisleading-indentation which is
even enabled in -Wall, not just -Wextra.  In reality, -Wextra isn't used
that often, and the people that use it will most likely benefit from the
warning IMNSHO.

	Jakub
Marek Polacek Sept. 27, 2016, 2:33 p.m. UTC | #35
On Tue, Sep 27, 2016 at 03:53:25PM +0200, Bernd Schmidt wrote:
> What's the ratio of comments "fixed" to actual bugs found? IMO this is not
> something we should inflict on users unasked.

One might argue that users actually *asked* for this by turning on -Wextra.

	Marek
Jason Merrill Sept. 27, 2016, 2:48 p.m. UTC | #36
On Tue, Sep 27, 2016 at 9:56 AM, Michael Matz <matz@suse.de> wrote:
> On Tue, 27 Sep 2016, Jakub Jelinek wrote:
>
>> Just compare that to the number of real bugs the warning found in gcc
>> codebase.  It is really worth it for -Wextra.
>
> All those bugs would also have been found as well when it had simply
> accepted
>   /fall.*thr/i
> anywhere in the preceding comment on one line.  But all the recent
> spelling changes of comments to cater for the strictness exactly shows how
> misguided that is.  The above would accept "Don't fall through" as well.
> I say: so what?

I agree.

Jason
Marek Polacek Sept. 27, 2016, 2:54 p.m. UTC | #37
On Tue, Sep 27, 2016 at 10:48:50AM -0400, Jason Merrill wrote:
> On Tue, Sep 27, 2016 at 9:56 AM, Michael Matz <matz@suse.de> wrote:
> > On Tue, 27 Sep 2016, Jakub Jelinek wrote:
> >
> >> Just compare that to the number of real bugs the warning found in gcc
> >> codebase.  It is really worth it for -Wextra.
> >
> > All those bugs would also have been found as well when it had simply
> > accepted
> >   /fall.*thr/i
> > anywhere in the preceding comment on one line.  But all the recent
> > spelling changes of comments to cater for the strictness exactly shows how
> > misguided that is.  The above would accept "Don't fall through" as well.
> > I say: so what?
> 
> I agree.

All right, I'm not opposed to making the comment parsing more benevolent.
We still should have enough time to fine-tune it.

	Marek
Jakub Jelinek Sept. 27, 2016, 3:04 p.m. UTC | #38
On Tue, Sep 27, 2016 at 04:54:28PM +0200, Marek Polacek wrote:
> On Tue, Sep 27, 2016 at 10:48:50AM -0400, Jason Merrill wrote:
> > On Tue, Sep 27, 2016 at 9:56 AM, Michael Matz <matz@suse.de> wrote:
> > > On Tue, 27 Sep 2016, Jakub Jelinek wrote:
> > >
> > >> Just compare that to the number of real bugs the warning found in gcc
> > >> codebase.  It is really worth it for -Wextra.
> > >
> > > All those bugs would also have been found as well when it had simply
> > > accepted
> > >   /fall.*thr/i
> > > anywhere in the preceding comment on one line.  But all the recent
> > > spelling changes of comments to cater for the strictness exactly shows how
> > > misguided that is.  The above would accept "Don't fall through" as well.
> > > I say: so what?
> > 
> > I agree.
> 
> All right, I'm not opposed to making the comment parsing more benevolent.
> We still should have enough time to fine-tune it.

Perhaps we want -Wimplicit-fallthrough{,=1,=2,=3,=4}, where
=1 would match indeed /fall.*thr/i (note, it will be really costly in this
case, one will have to parse all comments in detail in the preprocessor,
so I'd be against making it the default), =2 would allow
what we do right now, perhaps with the optional else and dots (perhaps
selected other interpunction chars), =3 would only allow the standardized
lint comments and =4 would not allow any comments, just the attributes?
Then each project can choose what they want.

	Jakub
Bernd Schmidt Sept. 27, 2016, 3:19 p.m. UTC | #39
On 09/27/2016 05:04 PM, Jakub Jelinek wrote:
>>> On Tue, Sep 27, 2016 at 9:56 AM, Michael Matz <matz@suse.de> wrote:
>>>> All those bugs would also have been found as well when it had simply
>>>> accepted
>>>>   /fall.*thr/i
>>>> anywhere in the preceding comment on one line.  But all the recent
>>>> spelling changes of comments to cater for the strictness exactly shows how
>>>> misguided that is.  The above would accept "Don't fall through" as well.
>>>> I say: so what?

> Perhaps we want -Wimplicit-fallthrough{,=1,=2,=3,=4}, where
> =1 would match indeed /fall.*thr/i (note, it will be really costly in this
> case, one will have to parse all comments in detail in the preprocessor,
> so I'd be against making it the default), =2 would allow
> what we do right now, perhaps with the optional else and dots (perhaps
> selected other interpunction chars), =3 would only allow the standardized
> lint comments and =4 would not allow any comments, just the attributes?
> Then each project can choose what they want.

I feel that's overthinking it. I believe Michael has identified the 
correct way to think about the issue.


Bernd
Marek Polacek Sept. 27, 2016, 3:24 p.m. UTC | #40
On Tue, Sep 27, 2016 at 05:04:23PM +0200, Jakub Jelinek wrote:
> On Tue, Sep 27, 2016 at 04:54:28PM +0200, Marek Polacek wrote:
> > On Tue, Sep 27, 2016 at 10:48:50AM -0400, Jason Merrill wrote:
> > > On Tue, Sep 27, 2016 at 9:56 AM, Michael Matz <matz@suse.de> wrote:
> > > > On Tue, 27 Sep 2016, Jakub Jelinek wrote:
> > > >
> > > >> Just compare that to the number of real bugs the warning found in gcc
> > > >> codebase.  It is really worth it for -Wextra.
> > > >
> > > > All those bugs would also have been found as well when it had simply
> > > > accepted
> > > >   /fall.*thr/i
> > > > anywhere in the preceding comment on one line.  But all the recent
> > > > spelling changes of comments to cater for the strictness exactly shows how
> > > > misguided that is.  The above would accept "Don't fall through" as well.
> > > > I say: so what?
> > > 
> > > I agree.
> > 
> > All right, I'm not opposed to making the comment parsing more benevolent.
> > We still should have enough time to fine-tune it.
> 
> Perhaps we want -Wimplicit-fallthrough{,=1,=2,=3,=4}, where
> =1 would match indeed /fall.*thr/i (note, it will be really costly in this
> case, one will have to parse all comments in detail in the preprocessor,
> so I'd be against making it the default),

Perhaps we could use POSIX regcomp/regex functions; do you (or anyone else)
have an idea how expensive they are and if it's feasible to use them in the
preprocessor?

	Marek
Jakub Jelinek Sept. 27, 2016, 3:28 p.m. UTC | #41
On Tue, Sep 27, 2016 at 05:19:10PM +0200, Bernd Schmidt wrote:
> On 09/27/2016 05:04 PM, Jakub Jelinek wrote:
> >>>On Tue, Sep 27, 2016 at 9:56 AM, Michael Matz <matz@suse.de> wrote:
> >>>>All those bugs would also have been found as well when it had simply
> >>>>accepted
> >>>>  /fall.*thr/i
> >>>>anywhere in the preceding comment on one line.  But all the recent
> >>>>spelling changes of comments to cater for the strictness exactly shows how
> >>>>misguided that is.  The above would accept "Don't fall through" as well.
> >>>>I say: so what?
> 
> >Perhaps we want -Wimplicit-fallthrough{,=1,=2,=3,=4}, where
> >=1 would match indeed /fall.*thr/i (note, it will be really costly in this
> >case, one will have to parse all comments in detail in the preprocessor,
> >so I'd be against making it the default), =2 would allow
> >what we do right now, perhaps with the optional else and dots (perhaps
> >selected other interpunction chars), =3 would only allow the standardized
> >lint comments and =4 would not allow any comments, just the attributes?
> >Then each project can choose what they want.
> 
> I feel that's overthinking it. I believe Michael has identified the correct
> way to think about the issue.

See above, it is very expensive at preprocessing time (look at how the
preprocessor optimizes skipping over comments, with that it is all gone),
and not everybody will want /* Don't fall through here.  */ or
/* This is fallible.  Threats are high.  */ (pick any of the hundreds+
english words with fall in them and thousands+ of words with thr in them)
to disable the warning.

	Jakub
Michael Matz Sept. 27, 2016, 3:41 p.m. UTC | #42
Hi,

On Tue, 27 Sep 2016, Marek Polacek wrote:

> > Perhaps we want -Wimplicit-fallthrough{,=1,=2,=3,=4}, where =1 would 
> > match indeed /fall.*thr/i (note, it will be really costly in this 
> > case, one will have to parse all comments in detail in the 
> > preprocessor, so I'd be against making it the default),
> 
> Perhaps we could use POSIX regcomp/regex functions; do you (or anyone 
> else) have an idea how expensive they are and if it's feasible to use 
> them in the preprocessor?

Why?  The regexp I gave was for demonstration.  Matching /fall.*thr/i 
would be done by something similar to:

  a = strcasestr(comment, "fall");
  if (!a) return;                  // no fall
  b = strcasestr(a+4, "thr");
  if (!b) return;                  // no thr
  if (memchr(a+4, '\n', b-a-4)) return; // on different lines
  foundit();

(With appropriate massaging that the comment to parse ends with 0.  
strcasestr would need addition to libiberty for where it's not available 
(or falling back to strstr there); obviously the above can be sped up by 
various tricks for ASCII and UTF-8 because of the relation of upper and 
lower case characters.  During tokenizing the comment (i.e. while 
searching for the end) one could already search for "fall" for instance to 
quickly early-out.)


Ciao,
Michael.
Tom Tromey Sept. 27, 2016, 4:19 p.m. UTC | #43
>>>>> "Michael" == Michael Matz <matz@suse.de> writes:

Michael> All those bugs would also have been found as well when it had simply 
Michael> accepted
Michael>   /fall.*thr/i
Michael> anywhere in the preceding comment on one line.  But all the recent 
Michael> spelling changes of comments to cater for the strictness exactly shows how 
Michael> misguided that is.  The above would accept "Don't fall through" as well.  
Michael> I say: so what?

The point of the warning is to make code more robust.  But accepting any
comment like "Don't fall through" is not more robust, but rather an
error waiting to happen; as IIUC the user has no way to detect this
case.

I think it's better for the comment-scanning feature to be very picky
(or even just not exist at all) -- that way you know exactly what you
are getting.  Lint was traditionally picky IIRC.  And, this is a warning
that isn't default and can also be disabled, so it's not as if users
have no recourse.

Tom
Marek Polacek Sept. 27, 2016, 4:55 p.m. UTC | #44
On Tue, Sep 27, 2016 at 01:31:15PM +0200, Jakub Jelinek wrote:
> On Tue, Sep 27, 2016 at 12:56:29PM +0200, Marek Polacek wrote:
> > On Tue, Sep 27, 2016 at 12:47:50PM +0200, Jakub Jelinek wrote:
> > > On Tue, Sep 27, 2016 at 12:39:41PM +0200, Markus Trippelsdorf wrote:
> > > > On 2016.09.27 at 10:46 +0200, Eric Botcazou wrote:
> > > > > > The intent has been that we catch the most common forms, but still require
> > > > > > it not to be complete free form.  Because, as experience shows, people are
> > > > > > extremely creative in these comments, and it is not very good idea to
> > > > > > support everything.  For ... fall through ... , what is the purpose of
> > > > > > those ...s?
> > > > > 
> > > > > No idea, but it has been there for a while and seems perfectly reasonable.
> > > > > IMO any sentence containing "fall" and "through/thru/etc" on the same line 
> > > > > should be accepted, otherwise it's just misplaced pickiness.
> > > > 
> > > > +1. Folks will just disable the warning if gcc is not very permissive
> > > > when paring existing comments. You cannot expect anyone to change
> > > > perfectly fine fall-through comments just to accommodate an arbitrary
> > > > gcc style.
> > > 
> > > The accepted style is already very permissive, we don't allow just one
> > > spelling as various lint tools.  I'm afraid looking for various cases of
> > > fall and through/thru possibly separated by anything and surrounded by
> > > anything is IMHO already too much, the compiler shouldn't try to try to
> > > grammar analyze the comments on what they actually talk about and whether it
> > > might be related to the switch fall through or something completely
> > > different.  Users should start using [[fallthrough]]; anyway.
> > 
> > I'm thinking perhaps we should also accept /* ... fall through ... */
> > and /* else fall through */, but accepting any sentence containing "fall" and
> > "through/thru/etc" on the same line would mean that we also accept
> > /* Don't fall through here.  */ and that is clearly not desirable.
> 
> I think it is important to think in terms of what regexps we still want to
> match, even when the matching is actually implemented in C, not using
> regexps.  And yes, you list one reason why arbitrary text with fall and
> through somewhere in it is not a good idea.  Another:
> /* XXX Really fallthru?  */
> (what we have in pch.c).
> So, if you want to allow ... fall through ... and else fall through, and
> perhaps // fall through - some explanation
> then it might be e.g.
> //-fallthrough$
> //@fallthrough@$
> /\*-fallthrough\*/
> /\*@fallthrough@\*/
> //[ \t.]*(ELSE )?FALL(S | |-)?THR(OUGH|U)[ \t.]*(-.*)?$
> //[ \t.]*(Else )?Fall(s | |-)?[Tt]hr(ough|u)[ \t.]*(-.*)?$
> //[ \t.]*(else )?fall(s | |-)?thr(ough|u)[ \t.]*(-.*)?$
> /\*[ \t.]*(ELSE )?FALL(S | |-)?THR(OUGH|U)[ \t.]*(-.*)?\*/
> /\*[ \t.]*(Else )?Fall(s | |-)?[Tt]hr(ough|u)[ \t.]*(-.*)?\*/
> /\*[ \t.]*(else )?fall(s | |-)?thr(ough|u)[ \t.]*(-.*)?\*/
> where . would match even newlines in the last 3,
> but $ would always match just end of line?

This looks like a step in the right direction.  Apparently, it's hard
to come up with something that will make everyone happy; this might be
partly because GCC is probably the only compiler that attempts to
parse comments like this.  While clang has the implicit fallthrough
warning, they aren't even trying to parse the comments.

Jakub, do you want me to look into this (make GCC accept more), or do
you want to adjust that by yourself?

	Marek
Michael Matz Sept. 28, 2016, 12:15 p.m. UTC | #45
Hi,

On Tue, 27 Sep 2016, Tom Tromey wrote:

> The point of the warning is to make code more robust.  But accepting any 
> comment like "Don't fall through" is not more robust, but rather an 
> error waiting to happen; as IIUC the user has no way to detect this 
> case.
> 
> I think it's better for the comment-scanning feature to be very picky 
> (or even just not exist at all) -- that way you know exactly what you 
> are getting.  Lint was traditionally picky IIRC.  And, this is a warning 
> that isn't default and can also be disabled, so it's not as if users 
> have no recourse.

Not accepting
  /* And here we intentionally fall through because ... */
and forcing users to replace this by:
  /* fallthrough */
is not robust either.  It's actually actively lowering robustness of code, 
it creates work for programmers that will be regarded as pointless (and 
rightly so) and will merely lead to everybody disabling the warning (see 
our generated files) at which point we could just as well not have 
implemented it (which would be a shame because I think it's genuinely 
useful).

The point of warnings is to make code robust under the condition of not 
being a pain by giving zillions of false positives.  In this specific case 
the chance of giving false positives by being picky in how to disable the 
warning is very high.  On the other hand the chance of unintentionally 
disabling the warning by a negative comment like "Don't fall through here" 
is low, because presumably the one adding that comment (and hence thinking 
about that part of the code) also in fact put in the "break;" afterwards.

The argument with lint being picky would apply only if GCC would have 
added this warning maybe 20 years ago, not now where nearly nobody even 
knows what lint is, which lead to a large existing code base not having 
comments that would be accepted by lint but comments that do specify the 
intent of falling through.


Ciao,
Michael.
P.S.: Initially I even wanted to argue that the mere existence of _any_ 
comment before a case label would disable the warning.  I don't have the 
numbers but I bet even that version would have found the very same bugs 
that the picky version has.
Tom Tromey Sept. 28, 2016, 3:29 p.m. UTC | #46
>>>>> "Michael" == Michael Matz <matz@suse.de> writes:

Michael> Not accepting
Michael>   /* And here we intentionally fall through because ... */
Michael> and forcing users to replace this by:
Michael>   /* fallthrough */
Michael> is not robust either.  It's actually actively lowering robustness of code, 
Michael> it creates work for programmers that will be regarded as pointless (and 
Michael> rightly so) and will merely lead to everybody disabling the warning (see 
Michael> our generated files)

We can't control what programmers might do.  My point is that accepting
too much is actively bad -- it hides errors.  If this somehow makes some
programmer fall down a slippery slope, well, that's their error, not
gcc's.

TBH I think it would be better not to parse comments at all.  Heuristics
are generally bad and this case and ensuing discussion is a great
demonstration of that.

The other day I built gdb with -Wimplicit-fallthrough.  I was surprised
to find that gcc rejected this:

	default:
	  {
	    complaint (&symfile_complaints,
		       _("Storage class %d not recognized during scan"),
		       sclass);
	  }
	  /* FALLTHROUGH */

	  /* C_FCN is .bf and .ef symbols.  I think it is sufficient
	     to handle only the C_FUN and C_EXT.  */
	case C_FCN:

Presumably without the comment heuristic, this would be accepted.

Michael> The point of warnings is to make code robust under the condition of not 
Michael> being a pain by giving zillions of false positives.

My experience so far is that it's not so bad.  gdb actually had comments
in most spots, they just required a quick pass to clean them up:

    https://sourceware.org/ml/gdb-patches/2016-09/msg00375.html

And, code bases in more dire straights can just disable the warning after all.

Tom
Jakub Jelinek Sept. 28, 2016, 7:10 p.m. UTC | #47
On Wed, Sep 28, 2016 at 09:29:01AM -0600, Tom Tromey wrote:
> >>>>> "Michael" == Michael Matz <matz@suse.de> writes:
> 
> Michael> Not accepting
> Michael>   /* And here we intentionally fall through because ... */
> Michael> and forcing users to replace this by:
> Michael>   /* fallthrough */
> Michael> is not robust either.  It's actually actively lowering robustness of code, 
> Michael> it creates work for programmers that will be regarded as pointless (and 
> Michael> rightly so) and will merely lead to everybody disabling the warning (see 
> Michael> our generated files)
> 
> We can't control what programmers might do.  My point is that accepting
> too much is actively bad -- it hides errors.  If this somehow makes some
> programmer fall down a slippery slope, well, that's their error, not
> gcc's.
> 
> TBH I think it would be better not to parse comments at all.  Heuristics
> are generally bad and this case and ensuing discussion is a great
> demonstration of that.
> 
> The other day I built gdb with -Wimplicit-fallthrough.  I was surprised
> to find that gcc rejected this:
> 
> 	default:
> 	  {
> 	    complaint (&symfile_complaints,
> 		       _("Storage class %d not recognized during scan"),
> 		       sclass);
> 	  }
> 	  /* FALLTHROUGH */
> 
> 	  /* C_FCN is .bf and .ef symbols.  I think it is sufficient
> 	     to handle only the C_FUN and C_EXT.  */
> 	case C_FCN:
> 
> Presumably without the comment heuristic, this would be accepted.

Is complaint a noreturn call?  If not, then it would certainly warn, unless
there is [[fallthrough]]; or __attribute__((fallthrough)); etc. (or the
comment).  For the comment, /* FALLTHROUGH */ is the recognized spelling of
the comment, but right now we only look for such comments immediately before
a case/default keyword or user label; if there is another comment in
between, it is ignored.  This is something we are considering to change,
exactly because often the /* FALLTHRU */ comment appears after some case and
then there is unrelated comment before the next case about what that case
handles.

	Jakub
Bernd Schmidt Sept. 28, 2016, 7:13 p.m. UTC | #48
On 09/28/2016 02:15 PM, Michael Matz wrote:
> P.S.: Initially I even wanted to argue that the mere existence of _any_
> comment before a case label would disable the warning.  I don't have the
> numbers but I bet even that version would have found the very same bugs
> that the picky version has.

Sounds like a pretty good idea to me for a default setting. If we really 
want to have multiple levels of the warning. I agree that it's likely to 
find the majority of problems, and it no longer depends on language and 
spelling of the comment.


Bernd
Tom Tromey Sept. 28, 2016, 8:13 p.m. UTC | #49
>>>>> "Jakub" == Jakub Jelinek <jakub@redhat.com> writes:

>> default:
>> {
>> complaint (&symfile_complaints,
>> _("Storage class %d not recognized during scan"),
>> sclass);
>> }
>> /* FALLTHROUGH */
>> 
>> /* C_FCN is .bf and .ef symbols.  I think it is sufficient
>> to handle only the C_FUN and C_EXT.  */
>> case C_FCN:

Jakub> Is complaint a noreturn call?

Nope.

Jakub> but right now we only look for such comments immediately before a
Jakub> case/default keyword or user label; if there is another comment
Jakub> in between, it is ignored.  This is something we are considering
Jakub> to change, exactly because often the /* FALLTHRU */ comment
Jakub> appears after some case and then there is unrelated comment
Jakub> before the next case about what that case handles.

Make sense.  Thanks.

Tom
diff mbox

Patch

diff --git a/gcc/ada/gcc-interface/decl.c b/gcc/ada/gcc-interface/decl.c
--- a/gcc/ada/gcc-interface/decl.c
+++ b/gcc/ada/gcc-interface/decl.c
@@ -596,7 +596,7 @@  gnat_to_gnu_entity (Entity_Id gnat_entit
 	gnu_expr
 	  = gnat_to_gnu_external (Expression (Declaration_Node (gnat_entity)));
 
-      /* ... fall through ... */
+      /* fall through */
 
     case E_Exception:
     case E_Loop_Parameter:
@@ -3369,7 +3369,7 @@  gnat_to_gnu_entity (Entity_Id gnat_entit
 	  break;
 	}
 
-      /* ... fall through ... */
+      /* fall through */
 
     case E_Record_Subtype:
       /* If Cloned_Subtype is Present it means this record subtype has
@@ -3804,7 +3804,7 @@  gnat_to_gnu_entity (Entity_Id gnat_entit
 	    break;
 	}
 
-      /* ... fall through ... */
+      /* fall through */
 
     case E_Allocator_Type:
     case E_Access_Type:
@@ -6882,7 +6882,7 @@  choices_to_gnu (tree operand, Node_Id ch
 	      break;
 	    }
 
-	  /* ... fall through ... */
+	  /* fall through */
 
 	case N_Character_Literal:
 	case N_Integer_Literal:
@@ -8089,7 +8089,7 @@  annotate_value (tree gnu_size)
       else
 	return Uint_Minus_1;
 
-      /* Fall through... */
+      /* fall through */
 
     default:
       return No_Uint;
diff --git a/gcc/ada/gcc-interface/misc.c b/gcc/ada/gcc-interface/misc.c
--- a/gcc/ada/gcc-interface/misc.c
+++ b/gcc/ada/gcc-interface/misc.c
@@ -157,7 +157,7 @@  gnat_handle_option (size_t scode, const 
     case OPT_gant:
       warning (0, "%<-gnat%> misspelled as %<-gant%>");
 
-      /* ... fall through ... */
+      /* fall through */
 
     case OPT_gnat:
     case OPT_gnatO:
@@ -486,13 +486,13 @@  gnat_print_type (FILE *file, tree node, 
       else
 	print_node (file, "index type", TYPE_INDEX_TYPE (node), indent + 4);
 
-      /* ... fall through ... */
+      /* fall through */
 
     case ENUMERAL_TYPE:
     case BOOLEAN_TYPE:
       print_node_brief (file, "RM size", TYPE_RM_SIZE (node), indent + 4);
 
-      /* ... fall through ... */
+      /* fall through */
 
     case REAL_TYPE:
       print_node_brief (file, "RM min", TYPE_RM_MIN_VALUE (node), indent + 4);
diff --git a/gcc/ada/gcc-interface/trans.c b/gcc/ada/gcc-interface/trans.c
--- a/gcc/ada/gcc-interface/trans.c
+++ b/gcc/ada/gcc-interface/trans.c
@@ -844,7 +844,7 @@  lvalue_required_p (Node_Id gnat_node, tr
 		 && Ekind (Entity (gnat_temp)) == E_Enumeration_Literal))
 	  return 1;
 
-      /* ... fall through ... */
+      /* fall through */
 
     case N_Slice:
       /* Only the array expression can require an lvalue.  */
@@ -890,7 +890,7 @@  lvalue_required_p (Node_Id gnat_node, tr
 	if (!constant)
 	  return 1;
 
-      /* ... fall through ... */
+      /* fall through */
 
     case N_Type_Conversion:
     case N_Qualified_Expression:
@@ -914,7 +914,7 @@  lvalue_required_p (Node_Id gnat_node, tr
 				  get_unpadded_type (Etype (gnat_parent)),
 				  true, false, true);
 
-      /* ... fall through ... */
+      /* fall through */
 
     default:
       return 0;
@@ -1681,7 +1681,7 @@  Attribute_to_gnu (Node_Id gnat_node, tre
 	  break;
 	}
 
-      /* ... fall through ... */
+      /* fall through */
 
     case Attr_Access:
     case Attr_Unchecked_Access:
@@ -1938,7 +1938,7 @@  Attribute_to_gnu (Node_Id gnat_node, tre
 	  break;
 	}
 
-      /* ... fall through ... */
+      /* fall through */
 
     case Attr_Length:
       {
@@ -2393,7 +2393,7 @@  Attribute_to_gnu (Node_Id gnat_node, tre
       /* We treat Model as identical to Machine.  This is true for at least
 	 IEEE and some other nice floating-point systems.  */
 
-      /* ... fall through ... */
+      /* fall through */
 
     case Attr_Machine:
       /* The trick is to force the compiler to store the result in memory so
@@ -2537,7 +2537,7 @@  Case_Statement_to_gnu (Node_Id gnat_node
 		  break;
 		}
 
-	      /* ... fall through ... */
+	      /* fall through */
 
 	    case N_Character_Literal:
 	    case N_Integer_Literal:
@@ -4007,7 +4007,7 @@  node_is_atomic (Node_Id gnat_node)
 	  && Has_Atomic_Components (Entity (Prefix (gnat_node))))
 	return true;
 
-      /* ... fall through ... */
+      /* fall through */
 
     case N_Explicit_Dereference:
       return Is_Atomic (Etype (gnat_node));
@@ -4123,7 +4123,7 @@  atomic_access_required_p (Node_Id gnat_n
       /* Nothing to do if we are the prefix of an attribute, since we do not
 	 want an atomic access for things like 'Size.  */
 
-      /* ... fall through ... */
+      /* fall through */
 
     case N_Reference:
       /* The N_Reference node is like an attribute.  */
@@ -6580,7 +6580,7 @@  gnat_to_gnu (Node_Id gnat_node)
 	  break;
 	}
 
-      /* ... fall through ... */
+      /* fall through */
 
     case N_Op_Eq:
     case N_Op_Ne:
@@ -6747,7 +6747,7 @@  gnat_to_gnu (Node_Id gnat_node)
 	  break;
 	}
 
-      /* ... fall through ... */
+      /* fall through */
 
     case N_Op_Minus:
     case N_Op_Abs:
@@ -8344,7 +8344,7 @@  gnat_gimplify_expr (tree *expr_p, gimple
 	    break;
 	  }
 
-      /* ... fall through ... */
+      /* fall through */
 
     default:
       return GS_UNHANDLED;
@@ -9867,7 +9867,7 @@  set_gnu_expr_location_from_node (tree no
       if (EXPR_P (TREE_OPERAND (node, 1)))
 	set_gnu_expr_location_from_node (TREE_OPERAND (node, 1), gnat_node);
 
-      /* ... fall through ... */
+      /* fall through */
 
     default:
       if (!REFERENCE_CLASS_P (node) && !EXPR_HAS_LOCATION (node))
diff --git a/gcc/ada/gcc-interface/utils.c b/gcc/ada/gcc-interface/utils.c
--- a/gcc/ada/gcc-interface/utils.c
+++ b/gcc/ada/gcc-interface/utils.c
@@ -3166,7 +3166,7 @@  create_subprog_decl (tree name, tree asm
 				    NULL_TREE, NULL_TREE),
 			 ATTR_FLAG_TYPE_IN_PLACE);
 
-      /* ... fall through ... */
+      /* fall through */
 
     case is_enabled:
       DECL_DECLARED_INLINE_P (subprog_decl) = 1;
@@ -4271,6 +4271,8 @@  convert (tree type, tree expr)
 	  return expr;
 	}
 
+      /* fall through */
+
     case CONSTRUCTOR:
       /* If we are converting a CONSTRUCTOR to a mere type variant, or to
 	 another padding type around the same type, just make a new one in
@@ -4508,7 +4510,7 @@  convert (tree type, tree expr)
 					  convert (TREE_TYPE (type),
 						   TYPE_MIN_VALUE (type))));
 
-      /* ... fall through ... */
+      /* fall through */
 
     case ENUMERAL_TYPE:
     case BOOLEAN_TYPE:
@@ -4585,7 +4587,7 @@  convert (tree type, tree expr)
 	  return gnat_build_constructor (type, v);
 	}
 
-      /* ... fall through ... */
+      /* fall through */
 
     case ARRAY_TYPE:
       /* In these cases, assume the front-end has validated the conversion.
@@ -4701,7 +4703,7 @@  convert_to_index_type (tree expr)
 	  break;
       }
 
-      /* ... fall through ... */
+      /* fall through */
 
     case NON_LVALUE_EXPR:
       return fold_build1 (code, sizetype,
diff --git a/gcc/ada/gcc-interface/utils2.c b/gcc/ada/gcc-interface/utils2.c
--- a/gcc/ada/gcc-interface/utils2.c
+++ b/gcc/ada/gcc-interface/utils2.c
@@ -180,7 +180,7 @@  known_alignment (tree exp)
 	  return known_alignment (t);
       }
 
-      /* ... fall through ... */
+      /* fall through */
 
     default:
       /* For other pointer expressions, we assume that the pointed-to object
@@ -1011,7 +1011,7 @@  build_binary_op (enum tree_code op_code,
       if (!operation_type)
 	operation_type = TREE_TYPE (left_type);
 
-      /* ... fall through ... */
+      /* fall through */
 
     case ARRAY_RANGE_REF:
       /* First look through conversion between type variants.  Note that
@@ -1230,7 +1230,7 @@  build_binary_op (enum tree_code op_code,
 	op_code = MINUS_EXPR;
       modulus = NULL_TREE;
 
-      /* ... fall through ... */
+      /* fall through */
 
     case PLUS_EXPR:
     case MINUS_EXPR:
@@ -1244,7 +1244,7 @@  build_binary_op (enum tree_code op_code,
 	  = gnat_type_for_mode (TYPE_MODE (operation_type),
 				TYPE_UNSIGNED (operation_type));
 
-      /* ... fall through ... */
+      /* fall through */
 
     default:
     common:
@@ -1466,7 +1466,7 @@  build_unary_op (enum tree_code op_code, 
 	    return build_unary_op (ADDR_EXPR, result_type,
 				   TREE_OPERAND (operand, 0));
 
-	  /* ... fallthru ... */
+	  /* fallthru */
 
 	case VIEW_CONVERT_EXPR:
 	  /* If this just a variant conversion or if the conversion doesn't
@@ -1487,7 +1487,7 @@  build_unary_op (enum tree_code op_code, 
 	case CONST_DECL:
 	  operand = DECL_CONST_CORRESPONDING_VAR (operand);
 
-	  /* ... fall through ... */
+	  /* fall through */
 
 	default:
 	common:
@@ -1648,7 +1648,7 @@  build_unary_op (enum tree_code op_code, 
 	  }
       }
 
-      /* ... fall through ... */
+      /* fall through */
 
     default:
       gcc_assert (operation_type == base_type);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -17917,6 +17917,7 @@  ix86_print_operand (FILE *file, rtx x, i
 #ifdef HAVE_AS_IX86_CMOV_SUN_SYNTAX
 	  if (ASSEMBLER_DIALECT == ASM_ATT)
 	    putc ('.', file);
+	  gcc_fallthrough ();
 #endif
 
 	case 'C':
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -3832,6 +3832,7 @@  check_pic (int i)
 		      || (GET_CODE (XEXP (op, 0)) == MINUS
 			  && XEXP (XEXP (op, 0), 0) == sparc_got ()
 			  && GET_CODE (XEXP (XEXP (op, 0), 1)) == CONST)));
+      /* fallthrough */
     case 2:
     default:
       return 1;
@@ -8371,6 +8372,7 @@  epilogue_renumber (register rtx *where, 
 	return 1;
       if (! test && REGNO (*where) >= 24 && REGNO (*where) < 32)
 	*where = gen_rtx_REG (GET_MODE (*where), OUTGOING_REGNO (REGNO(*where)));
+      /* fallthrough */
     case SCRATCH:
     case CC0:
     case PC: