Message ID | 20241018132520.817073-1-lhyatt@gmail.com |
---|---|
State | New |
Headers | show |
Series | diagnostics: libcpp: Improve locations for _Pragma lexing diagnostics [PR114423] | expand |
On Fri, 2024-10-18 at 09:25 -0400, Lewis Hyatt wrote: > Hello- > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114423 > > The diagnostics we issue while lexing tokens from a _Pragma string > have > always come out at invalid locations. I had tried a couple years ago > to > fix this in a general way, but I think that ended up being too > invasive a > change to fix a problem that's pretty minor in practice, and it never > got > over the finish line. Here is a simple patch that improves the > situation > and addresses the recently filed PR on that topic, hopefully this > incremental improvement is a better way to make some progress on > this? > > It just adds a way for libcpp to override the location for all > diagnostics > temporarily, so that the diagnostics issued while lexing from a > _Pragma > string are issued at a real location (the location of the _Pragma > token) > and not a bogus one. That's a lot simpler than trying to arrange to > produce valid locations when lexing tokens from an internal > buffer. Bootstrap + regtest all languages on x86-64 Linux, tweaked a > few > existing tests to adjust to the new locations. OK for trunk? Thanks! [...snip...] > diff --git a/libcpp/errors.cc b/libcpp/errors.cc > index ad45f61913c..96fc165c12a 100644 > --- a/libcpp/errors.cc > +++ b/libcpp/errors.cc > @@ -60,13 +60,14 @@ cpp_diagnostic_at (cpp_reader * pfile, enum > cpp_diagnostic_level level, > enum cpp_warning_reason reason, rich_location > *richloc, > const char *msgid, va_list *ap) > { > - bool ret; > - > if (!pfile->cb.diagnostic) > abort (); > - ret = pfile->cb.diagnostic (pfile, level, reason, richloc, > _(msgid), ap); > - > - return ret; > + if (pfile->diagnostic_override_loc && level != CPP_DL_NOTE) > + { > + rich_location rc2{pfile->line_table, pfile- > >diagnostic_override_loc}; > + return pfile->cb.diagnostic (pfile, level, reason, &rc2, > _(msgid), ap); > + } This will effectively override the primary location in the rich_location, but by using a second rich_location instance it will also ignore any secondary locations and fix-it hints. This might will be what we want here, but did you consider richloc.set_range (0, pfile->diagnostic_override_loc, SHOW_RANGE_WITH_CARET); to reset the primary location? Otherwise, looks good to me. [...snip...] Thanks Dave
On Fri, Oct 18, 2024 at 11:25 AM David Malcolm <dmalcolm@redhat.com> wrote: > > if (!pfile->cb.diagnostic) > > abort (); > > - ret = pfile->cb.diagnostic (pfile, level, reason, richloc, > > _(msgid), ap); > > - > > - return ret; > > + if (pfile->diagnostic_override_loc && level != CPP_DL_NOTE) > > + { > > + rich_location rc2{pfile->line_table, pfile- > > >diagnostic_override_loc}; > > + return pfile->cb.diagnostic (pfile, level, reason, &rc2, > > _(msgid), ap); > > + } > > This will effectively override the primary location in the > rich_location, but by using a second rich_location instance it will > also ignore any secondary locations and fix-it hints. > > This might will be what we want here, but did you consider > richloc.set_range (0, pfile->diagnostic_override_loc, > SHOW_RANGE_WITH_CARET); > to reset the primary location? > > Otherwise, looks good to me. > > [...snip...] > > Thanks > Dave > Thanks for taking a look! My thinking was, when libcpp produces tokens from a _Pragma string, basically every location_t that it generates is wrong and shouldn't be used. Because it doesn't actually set up the line_map, it gets something random that's just sorta close to reasonable. So I think it makes sense to discard fixits and secondary locations too. libcpp does use rich_location pretty sparingly, but I suppose the goal is to use it more over time. We use one fixit hint for invalid directive names currently, that one can't show up in a _Pragma though. Right now we do define an encoding_rich_location subclass for escaping unprintable bytes, which inherits rich_location and just adds a new constructor to set the escape flag when it is created. You are definitely right that this patch as of now loses that information. Here's a source that uses an improperly normalized character: _Pragma("ோ") Currently we output: t.cpp:1:1: warning: ‘\U00000bc7\U00000bbe’ is not in NFC [-Wnormalized=] 1 | _Pragma("<U+0BC7><U+0BBE>") | ^~~~~~ With the patch we output: t.cpp:1:1: warning: ‘\U00000bc7\U00000bbe’ is not in NFC [-Wnormalized=] 1 | _Pragma("ோ") | ^~~~~~~ So the main location range is improved (it underlines all of _Pragma instead of most of it), but we have indeed lost the intended feature that the incorrect bytes are escaped on the source line. For this particular case I could improve it with a one line addition like rc2.set_escape_on_output (richloc->escape_on_output_p ()); and that would actually handle all currently needed cases since we don't use a lot of rich_locations in libcpp. It would just become stale if some other option gets added to rich_location in the future that we also should preserve. I think to fix it in a fully general way, it would be necessary to add a new interface to class rich_location. It already has a method to delete all the fixit hints, it would also need a method to delete all the ranges. Then we could make a copy of the richloc and just delete everything we don't want to preserve. Do you have a preference one way or the other? By the way, your suggestion was to directly modify richloc... these functions do take the richloc by non-const pointer, but is it OK to modify it or is a copy needed? The functions like cpp_warning_at() are exposed in the public header file, although at the moment, all call sites are within libcpp and don't look like they would notice if the argument was modified. I wasn't sure what is the expected interface here. Thanks again... -Lewis
diff --git a/libcpp/directives.cc b/libcpp/directives.cc index 9d235fa1b05..5706c28b835 100644 --- a/libcpp/directives.cc +++ b/libcpp/directives.cc @@ -2430,6 +2430,12 @@ destringize_and_run (cpp_reader *pfile, const cpp_string *in, pfile->buffer->file = pfile->buffer->prev->file; pfile->buffer->sysp = pfile->buffer->prev->sysp; + /* See comment below regarding the use of expansion_loc as the location + for all tokens; arrange here that diagnostics issued during lexing + get the same treatment. */ + const auto prev_loc_override = pfile->diagnostic_override_loc; + pfile->diagnostic_override_loc = expansion_loc; + start_directive (pfile); _cpp_clean_line (pfile); save_directive = pfile->directive; @@ -2497,6 +2503,7 @@ destringize_and_run (cpp_reader *pfile, const cpp_string *in, make that applicable to the real buffer too. */ pfile->buffer->prev->sysp = pfile->buffer->sysp; _cpp_pop_buffer (pfile); + pfile->diagnostic_override_loc = prev_loc_override; /* Reset the old macro state before ... */ XDELETE (pfile->context); diff --git a/libcpp/errors.cc b/libcpp/errors.cc index ad45f61913c..96fc165c12a 100644 --- a/libcpp/errors.cc +++ b/libcpp/errors.cc @@ -60,13 +60,14 @@ cpp_diagnostic_at (cpp_reader * pfile, enum cpp_diagnostic_level level, enum cpp_warning_reason reason, rich_location *richloc, const char *msgid, va_list *ap) { - bool ret; - if (!pfile->cb.diagnostic) abort (); - ret = pfile->cb.diagnostic (pfile, level, reason, richloc, _(msgid), ap); - - return ret; + if (pfile->diagnostic_override_loc && level != CPP_DL_NOTE) + { + rich_location rc2{pfile->line_table, pfile->diagnostic_override_loc}; + return pfile->cb.diagnostic (pfile, level, reason, &rc2, _(msgid), ap); + } + return pfile->cb.diagnostic (pfile, level, reason, richloc, _(msgid), ap); } /* Print a diagnostic at the location of the previously lexed token. */ @@ -201,8 +202,14 @@ cpp_diagnostic_with_line (cpp_reader * pfile, enum cpp_diagnostic_level level, if (!pfile->cb.diagnostic) abort (); + /* Don't override note locations, which will likely make the note + more confusing. */ + const bool do_loc_override + = pfile->diagnostic_override_loc && level != CPP_DL_NOTE; + if (do_loc_override) + src_loc = pfile->diagnostic_override_loc; rich_location richloc (pfile->line_table, src_loc); - if (column) + if (column && !do_loc_override) richloc.override_column (column); ret = pfile->cb.diagnostic (pfile, level, reason, &richloc, _(msgid), ap); diff --git a/libcpp/internal.h b/libcpp/internal.h index a658a8c5739..13186c5a5b0 100644 --- a/libcpp/internal.h +++ b/libcpp/internal.h @@ -616,6 +616,10 @@ struct cpp_reader zero of said file. */ location_t main_loc; + /* If non-zero, override diagnostic locations (other than DK_NOTE + diagnostics) to this one. */ + location_t diagnostic_override_loc; + /* Returns true iff we should warn about UTF-8 bidirectional control characters. */ bool warn_bidi_p () const diff --git a/gcc/testsuite/c-c++-common/cpp/diagnostic-pragma-1.c b/gcc/testsuite/c-c++-common/cpp/diagnostic-pragma-1.c index 9867c94a8dd..6e37294fd2b 100644 --- a/gcc/testsuite/c-c++-common/cpp/diagnostic-pragma-1.c +++ b/gcc/testsuite/c-c++-common/cpp/diagnostic-pragma-1.c @@ -3,9 +3,8 @@ #pragma GCC warning "warn-a" // { dg-warning warn-a } #pragma GCC error "err-b" // { dg-error err-b } -#define CONST1 _Pragma("GCC warning \"warn-c\"") 1 -#define CONST2 _Pragma("GCC error \"err-d\"") 2 - -char a[CONST1]; // { dg-warning warn-c } -char b[CONST2]; // { dg-error err-d } +#define CONST1 _Pragma("GCC warning \"warn-c\"") 1 // { dg-warning warn-c } +#define CONST2 _Pragma("GCC error \"err-d\"") 2 // { dg-error err-d } +char a[CONST1]; // { dg-note "in expansion of macro 'CONST1'" } +char b[CONST2]; // { dg-note "in expansion of macro 'CONST2'" } diff --git a/gcc/testsuite/c-c++-common/cpp/pragma-diagnostic-loc.c b/gcc/testsuite/c-c++-common/cpp/pragma-diagnostic-loc.c new file mode 100644 index 00000000000..4ef40420cdd --- /dev/null +++ b/gcc/testsuite/c-c++-common/cpp/pragma-diagnostic-loc.c @@ -0,0 +1,17 @@ +/* { dg-do preprocess } */ +/* PR preprocessor/114423 */ +/* Check that we now issue diagnostics at the location of the _Pragma + instead of an invalid location. If we someday manage to issue + diagnostics at better locations in the future, this will need + updating. */ +_Pragma("GCC warning \"warning1\"") /* { dg-warning "1:warning1" } */ +#define P _Pragma("GCC warning \"warning2\"") /* { dg-warning "11:warning2" } */ +P /* { dg-note "in expansion of macro" } */ +#define S "GCC warning \"warning3\"" +/**/ _Pragma(S) /* { dg-warning "6:warning3" } */ + +/* This diagnostic uses a different code path (cpp_diagnostic_at() rather + than cpp_error_with_line()). Also make sure that the dg-note location + does not get overridden to the _Pragma location. */ +#pragma GCC poison xyz /* { dg-note "poisoned here" } */ +/* */ _Pragma("xyz") /* { dg-error "7:attempt to use poisoned" } */ diff --git a/gcc/testsuite/g++.dg/pch/operator-1.C b/gcc/testsuite/g++.dg/pch/operator-1.C index 290b5f7ab21..1138a969536 100644 --- a/gcc/testsuite/g++.dg/pch/operator-1.C +++ b/gcc/testsuite/g++.dg/pch/operator-1.C @@ -1,2 +1,8 @@ #include "operator-1.H" -int main(void){ major(0);} /* { dg-warning "Did not Work" } */ +int main(void){ major(0);} /* { dg-note "in expansion of macro 'major'" } */ +/* Line numbers below pertain to the header file. */ +/* { dg-warning "Did not Work" "" { target *-*-* } 1 } */ +/* { dg-note "in expansion of macro '__glibc_macro_warning1'" "" { target *-*-* } 3 } */ +/* { dg-note "in expansion of macro '__glibc_macro_warning'" "" { target *-*-* } 4 } */ +/* { dg-note "in expansion of macro '__SYSMACROS_DM1'" "" { target *-*-* } 6 } */ +/* { dg-note "in expansion of macro '__SYSMACROS_DM'" "" { target *-*-* } 9 } */