Message ID | 20240113221251.2180315-1-lhyatt@gmail.com |
---|---|
State | New |
Series | libcpp: Support extended characters for #pragma {push, pop}_macro [PR109704] |
Hello-

https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642926.html

May I please ping this one? Thanks!

On Sat, Jan 13, 2024 at 5:12 PM Lewis Hyatt <lhyatt@gmail.com> wrote:
>
> Hello-
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109704
>
> The below patch fixes the issue noted in the PR that extended characters
> cannot appear in the identifier passed to a #pragma push_macro or #pragma
> pop_macro. Bootstrap + regtest all languages on x86-64 Linux. Is it OK for
> GCC 13 please?
>
> I know we just entered stage 4, however I feel this is kinda like an old
> regression, given that the issue was not apparent until support for UCNs and
> UTF-8 in identifiers got added. FWIW, it would be nice if it makes it into
> GCC 13, because AFAIK all other UTF-8-related bugs are fixed in this
> release. (The other major one was for extended characters in a user-defined
> literal, that was fixed by r14-2629).
>
> Speaking of just entering stage 4. I do have 4 really short patches sent
> over the past several months that never got any response. Is there any
> chance someone may have a few minutes to look at them please? They are
> really just like 1-3 line fixes for PRs.
>
> libcpp (pinged once recently):
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641247.html
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640386.html
>
> diagnostics (pinged for 3rd time last week):
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html
> -- >8 --
>
> The implementation of #pragma push_macro and #pragma pop_macro has to date
> made use of an ad-hoc function, _cpp_lex_identifier(), which lexes an
> identifier out of a string. When support was added for extended characters
> in identifiers ($, UCNs, or UTF-8), that support was added only for the
> "normal" way of lexing identifiers out of a cpp_buffer (_cpp_lex_direct) and
> not for the ad-hoc way. Consequently, extended identifiers are not usable
> with these pragmas.
>
> The logic for lexing identifiers has become more complicated than it was
> when _cpp_lex_identifier() was written -- it now handles things like \N{}
> escapes in C++, for instance -- and it no longer seems practical to maintain
> a redundant code path for lexing identifiers. Address the issue by changing
> the implementation of #pragma {push,pop}_macro to lex identifiers in the
> expected way, i.e. by pushing a cpp_buffer and lexing the identifier from
> there.
>
> The existing implementation has some quirks because of the ad-hoc parsing
> logic. For example:
>
> #pragma push_macro("X ")
> ...
> #pragma pop_macro("X")
>
> will not restore macro X (note the extra space in the first string). However:
>
> #pragma push_macro("X ")
> ...
> #pragma pop_macro("X ")
>
> actually does successfully restore "X". This is because the key for looking
> up the saved macro on the push stack is the original string passed, so the
> string passed to pop_macro needs to match it exactly. It is not that easy to
> reproduce this logic in the world of extended characters, given that for
> example it should be valid to pass a UCN to push_macro, and the
> corresponding UTF-8 to pop_macro. Given that this aspect of the existing
> behavior seems unintentional and has no tests (and does not match other
> implementations), I opted to make the new logic more straightforward. The
> string passed needs to lex to one token, which must be a valid identifier,
> or else no action is taken and no error is generated. Any diagnostics
> encountered during lexing (e.g., due to a UTF-8 character not permitted to
> appear in an identifier) are also suppressed.
>
> It could be nice (for GCC 15) to also add a warning if a pop_macro does not
> match a previous push_macro.
>
> libcpp/ChangeLog:
>
> PR preprocessor/109704
> * include/cpplib.h (class cpp_auto_suppress_diagnostics): New class.
> * errors.cc
> (cpp_auto_suppress_diagnostics::cpp_auto_suppress_diagnostics): New
> function.
> (cpp_auto_suppress_diagnostics::~cpp_auto_suppress_diagnostics): New
> function.
> * charset.cc (noop_diagnostic_cb): Remove.
> (cpp_interpret_string_ranges): Refactor diagnostic suppression logic
> into new class cpp_auto_suppress_diagnostics.
> (count_source_chars): Likewise.
> * directives.cc (cpp_pop_definition): Add cpp_hashnode argument.
> (lex_identifier_from_string): New static helper function.
> (push_pop_macro_common): Refactor common logic from
> do_pragma_push_macro and do_pragma_pop_macro; use
> lex_identifier_from_string instead of _cpp_lex_identifier.
> (do_pragma_push_macro): Reimplement using push_pop_macro_common.
> (do_pragma_pop_macro): Likewise.
> * internal.h (_cpp_lex_identifier): Remove.
> * lex.cc (lex_identifier_intern): Remove.
> (_cpp_lex_identifier): Remove.
>
> gcc/testsuite/ChangeLog:
>
> PR preprocessor/109704
> * c-c++-common/cpp/pragma-push-pop-utf8.c: New test.
> * g++.dg/pch/pushpop-2.C: New test.
> * g++.dg/pch/pushpop-2.Hs: New test.
> * gcc.dg/pch/pushpop-2.c: New test.
> * gcc.dg/pch/pushpop-2.hs: New test.
> --- > libcpp/charset.cc | 33 +-- > libcpp/directives.cc | 175 +++++++-------- > libcpp/errors.cc | 16 ++ > libcpp/include/cpplib.h | 13 ++ > libcpp/internal.h | 1 - > libcpp/lex.cc | 33 --- > .../c-c++-common/cpp/pragma-push-pop-utf8.c | 203 ++++++++++++++++++ > gcc/testsuite/g++.dg/pch/pushpop-2.C | 18 ++ > gcc/testsuite/g++.dg/pch/pushpop-2.Hs | 9 + > gcc/testsuite/gcc.dg/pch/pushpop-2.c | 18 ++ > gcc/testsuite/gcc.dg/pch/pushpop-2.hs | 9 + > 11 files changed, 378 insertions(+), 150 deletions(-) > create mode 100644 gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c > create mode 100644 gcc/testsuite/g++.dg/pch/pushpop-2.C > create mode 100644 gcc/testsuite/g++.dg/pch/pushpop-2.Hs > create mode 100644 gcc/testsuite/gcc.dg/pch/pushpop-2.c > create mode 100644 gcc/testsuite/gcc.dg/pch/pushpop-2.hs > > diff --git a/libcpp/charset.cc b/libcpp/charset.cc > index 54d7b9e0932..7937df7d78c 100644 > --- a/libcpp/charset.cc > +++ b/libcpp/charset.cc > @@ -2590,19 +2590,6 @@ cpp_interpret_string (cpp_reader *pfile, const cpp_string *from, size_t count, > return cpp_interpret_string_1 (pfile, from, count, to, type, NULL, NULL); > } > > -/* A "do nothing" diagnostic-handling callback for use by > - cpp_interpret_string_ranges, so that it can temporarily suppress > - diagnostic-handling. */ > - > -static bool > -noop_diagnostic_cb (cpp_reader *, enum cpp_diagnostic_level, > - enum cpp_warning_reason, rich_location *, > - const char *, va_list *) > -{ > - /* no-op. */ > - return true; > -} > - > /* This function mimics the behavior of cpp_interpret_string, but > rather than generating a string in the execution character set, > *OUT is written to with the source code ranges of the characters > @@ -2642,20 +2629,10 @@ cpp_interpret_string_ranges (cpp_reader *pfile, const cpp_string *from, > failing, rather than being emitted as a user-visible diagnostic. > If an diagnostic does occur, we should see it via the return value of > cpp_interpret_string_1. 
*/ > - bool (*saved_diagnostic_handler) (cpp_reader *, enum cpp_diagnostic_level, > - enum cpp_warning_reason, rich_location *, > - const char *, va_list *) > - ATTRIBUTE_FPTR_PRINTF(5,0); > - > - saved_diagnostic_handler = pfile->cb.diagnostic; > - pfile->cb.diagnostic = noop_diagnostic_cb; > - > + cpp_auto_suppress_diagnostics suppress {pfile}; > bool result = cpp_interpret_string_1 (pfile, from, count, NULL, type, > loc_readers, out); > > - /* Restore the saved diagnostic-handler. */ > - pfile->cb.diagnostic = saved_diagnostic_handler; > - > if (!result) > return "cpp_interpret_string_1 failed"; > > @@ -2691,17 +2668,11 @@ static unsigned > count_source_chars (cpp_reader *pfile, cpp_string str, cpp_ttype type) > { > cpp_string str2 = { 0, 0 }; > - bool (*saved_diagnostic_handler) (cpp_reader *, enum cpp_diagnostic_level, > - enum cpp_warning_reason, rich_location *, > - const char *, va_list *) > - ATTRIBUTE_FPTR_PRINTF(5,0); > - saved_diagnostic_handler = pfile->cb.diagnostic; > - pfile->cb.diagnostic = noop_diagnostic_cb; > + cpp_auto_suppress_diagnostics suppress {pfile}; > convert_f save_func = pfile->narrow_cset_desc.func; > pfile->narrow_cset_desc.func = convert_count_chars; > bool ret = cpp_interpret_string (pfile, &str, 1, &str2, type); > pfile->narrow_cset_desc.func = save_func; > - pfile->cb.diagnostic = saved_diagnostic_handler; > if (ret) > { > if (str2.text != str.text) > diff --git a/libcpp/directives.cc b/libcpp/directives.cc > index 479f8c716e8..019e4009dc9 100644 > --- a/libcpp/directives.cc > +++ b/libcpp/directives.cc > @@ -137,7 +137,8 @@ static cpp_macro **find_answer (cpp_hashnode *, const cpp_macro *); > static void handle_assertion (cpp_reader *, const char *, int); > static void do_pragma_push_macro (cpp_reader *); > static void do_pragma_pop_macro (cpp_reader *); > -static void cpp_pop_definition (cpp_reader *, struct def_pragma_macro *); > +static void cpp_pop_definition (cpp_reader *, def_pragma_macro *, > + cpp_hashnode *); > > /* 
This is the table of directive handlers. All extensions other than > #warning, #include_next, and #import are deprecated. The name is > @@ -1595,55 +1596,95 @@ do_pragma_once (cpp_reader *pfile) > _cpp_mark_file_once_only (pfile, pfile->buffer->file); > } > > -/* Handle #pragma push_macro(STRING). */ > -static void > -do_pragma_push_macro (cpp_reader *pfile) > +/* Helper for #pragma {push,pop}_macro. Destringize STR and > + lex it into an identifier, returning the hash node for it. */ > + > +static cpp_hashnode * > +lex_identifier_from_string (cpp_reader *pfile, cpp_string str) > { > + auto src = (const uchar *) memchr (str.text, '"', str.len); > + gcc_checking_assert (src); > + ++src; > + const auto limit = str.text + str.len - 1; > + gcc_checking_assert (*limit == '"' && limit >= src); > + const auto ident = XALLOCAVEC (uchar, limit - src + 1); > + auto dest = ident; > + while (src != limit) > + { > + /* We know there is a character following the backslash. */ > + if (*src == '\\' && (src[1] == '\\' || src[1] == '"')) > + src++; > + *dest++ = *src++; > + } > + > + /* We reserved a spot for the newline with the + 1 when allocating IDENT. > + Push a buffer containing the identifier to lex. */ > + *dest = '\n'; > + cpp_push_buffer (pfile, ident, dest - ident, true); > + _cpp_clean_line (pfile); > + pfile->cur_token = _cpp_temp_token (pfile); > + cpp_token *tok; > + { > + /* Suppress diagnostics during lexing so that we silently ignore invalid > + input, as seems to be the common practice for this pragma. 
*/ > + cpp_auto_suppress_diagnostics suppress {pfile}; > + tok = _cpp_lex_direct (pfile); > + } > + > cpp_hashnode *node; > - size_t defnlen; > - const uchar *defn = NULL; > - char *macroname, *dest; > - const char *limit, *src; > - const cpp_token *txt; > - struct def_pragma_macro *c; > + if (tok->type != CPP_NAME || pfile->buffer->cur != pfile->buffer->rlimit) > + node = nullptr; > + else > + node = tok->val.node.node; > > - txt = get__Pragma_string (pfile); > - if (!txt) > + _cpp_pop_buffer (pfile); > + return node; > +} > + > +/* Common processing for #pragma {push,pop}_macro. */ > + > +static cpp_hashnode * > +push_pop_macro_common (cpp_reader *pfile, const char *type) > +{ > + const cpp_token *const txt = get__Pragma_string (pfile); > + ++pfile->keep_tokens; > + cpp_hashnode *node; > + if (txt) > { > - location_t src_loc = pfile->cur_token[-1].src_loc; > - cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0, > - "invalid #pragma push_macro directive"); > check_eol (pfile, false); > skip_rest_of_line (pfile); > - return; > + node = lex_identifier_from_string (pfile, txt->val.str); > } > - dest = macroname = (char *) alloca (txt->val.str.len + 2); > - src = (const char *) (txt->val.str.text + 1 + (txt->val.str.text[0] == 'L')); > - limit = (const char *) (txt->val.str.text + txt->val.str.len - 1); > - while (src < limit) > + else > { > - /* We know there is a character following the backslash. 
*/ > - if (*src == '\\' && (src[1] == '\\' || src[1] == '"')) > - src++; > - *dest++ = *src++; > + node = nullptr; > + location_t src_loc = pfile->cur_token[-1].src_loc; > + cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0, > + "invalid #pragma %s_macro directive", type); > + skip_rest_of_line (pfile); > } > - *dest = 0; > - check_eol (pfile, false); > - skip_rest_of_line (pfile); > - c = XNEW (struct def_pragma_macro); > - memset (c, 0, sizeof (struct def_pragma_macro)); > - c->name = XNEWVAR (char, strlen (macroname) + 1); > - strcpy (c->name, macroname); > + --pfile->keep_tokens; > + return node; > +} > + > +/* Handle #pragma push_macro(STRING). */ > +static void > +do_pragma_push_macro (cpp_reader *pfile) > +{ > + const auto node = push_pop_macro_common (pfile, "push"); > + if (!node) > + return; > + const auto c = XCNEW (def_pragma_macro); > + c->name = xstrdup ((const char *) NODE_NAME (node)); > c->next = pfile->pushed_macros; > - node = _cpp_lex_identifier (pfile, c->name); > if (node->type == NT_VOID) > c->is_undef = 1; > else if (node->type == NT_BUILTIN_MACRO) > c->is_builtin = 1; > else > { > - defn = cpp_macro_definition (pfile, node); > - defnlen = ustrlen (defn); > + const auto defn = cpp_macro_definition (pfile, node); > + const size_t defnlen = ustrlen (defn); > c->definition = XNEWVEC (uchar, defnlen + 2); > c->definition[defnlen] = '\n'; > c->definition[defnlen + 1] = 0; > @@ -1660,50 +1701,24 @@ do_pragma_push_macro (cpp_reader *pfile) > static void > do_pragma_pop_macro (cpp_reader *pfile) > { > - char *macroname, *dest; > - const char *limit, *src; > - const cpp_token *txt; > - struct def_pragma_macro *l = NULL, *c = pfile->pushed_macros; > - txt = get__Pragma_string (pfile); > - if (!txt) > - { > - location_t src_loc = pfile->cur_token[-1].src_loc; > - cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0, > - "invalid #pragma pop_macro directive"); > - check_eol (pfile, false); > - skip_rest_of_line (pfile); > - return; > - } > - dest = 
macroname = (char *) alloca (txt->val.str.len + 2); > - src = (const char *) (txt->val.str.text + 1 + (txt->val.str.text[0] == 'L')); > - limit = (const char *) (txt->val.str.text + txt->val.str.len - 1); > - while (src < limit) > - { > - /* We know there is a character following the backslash. */ > - if (*src == '\\' && (src[1] == '\\' || src[1] == '"')) > - src++; > - *dest++ = *src++; > - } > - *dest = 0; > - check_eol (pfile, false); > - skip_rest_of_line (pfile); > - > - while (c != NULL) > + const auto node = push_pop_macro_common (pfile, "pop"); > + if (!node) > + return; > + for (def_pragma_macro *c = pfile->pushed_macros, *l = nullptr; c; c = c->next) > { > - if (!strcmp (c->name, macroname)) > + if (!strcmp (c->name, (const char *) NODE_NAME (node))) > { > if (!l) > pfile->pushed_macros = c->next; > else > l->next = c->next; > - cpp_pop_definition (pfile, c); > + cpp_pop_definition (pfile, c, node); > free (c->definition); > free (c->name); > free (c); > break; > } > l = c; > - c = c->next; > } > } > > @@ -2607,12 +2622,8 @@ cpp_undef (cpp_reader *pfile, const char *macro) > /* Replace a previous definition DEF of the macro STR. If DEF is NULL, > or first element is zero, then the macro should be undefined. 
*/ > static void > -cpp_pop_definition (cpp_reader *pfile, struct def_pragma_macro *c) > +cpp_pop_definition (cpp_reader *pfile, def_pragma_macro *c, cpp_hashnode *node) > { > - cpp_hashnode *node = _cpp_lex_identifier (pfile, c->name); > - if (node == NULL) > - return; > - > if (pfile->cb.before_define) > pfile->cb.before_define (pfile); > > @@ -2634,29 +2645,23 @@ cpp_pop_definition (cpp_reader *pfile, struct def_pragma_macro *c) > } > > { > - size_t namelen; > - const uchar *dn; > - cpp_hashnode *h = NULL; > - cpp_buffer *nbuf; > - > - namelen = ustrcspn (c->definition, "( \n"); > - h = cpp_lookup (pfile, c->definition, namelen); > - dn = c->definition + namelen; > - > - nbuf = cpp_push_buffer (pfile, dn, ustrchr (dn, '\n') - dn, true); > + const auto namelen = ustrcspn (c->definition, "( \n"); > + const auto dn = c->definition + namelen; > + const auto nbuf = cpp_push_buffer (pfile, dn, ustrchr (dn, '\n') - dn, > + true); > if (nbuf != NULL) > { > _cpp_clean_line (pfile); > nbuf->sysp = 1; > - if (!_cpp_create_definition (pfile, h, 0)) > + if (!_cpp_create_definition (pfile, node, 0)) > abort (); > _cpp_pop_buffer (pfile); > } > else > abort (); > - h->value.macro->line = c->line; > - h->value.macro->syshdr = c->syshdr; > - h->value.macro->used = c->used; > + node->value.macro->line = c->line; > + node->value.macro->syshdr = c->syshdr; > + node->value.macro->used = c->used; > } > } > > diff --git a/libcpp/errors.cc b/libcpp/errors.cc > index 295496df7ed..3228dcbe7f6 100644 > --- a/libcpp/errors.cc > +++ b/libcpp/errors.cc > @@ -350,3 +350,19 @@ cpp_errno_filename (cpp_reader *pfile, enum cpp_diagnostic_level level, > return cpp_error_at (pfile, level, loc, "%s: %s", filename, > xstrerror (errno)); > } > + > +cpp_auto_suppress_diagnostics::cpp_auto_suppress_diagnostics (cpp_reader *pfile) > + : m_pfile (pfile), m_cb (pfile->cb.diagnostic) > +{ > + m_pfile->cb.diagnostic > + = [] (cpp_reader *, cpp_diagnostic_level, cpp_warning_reason, > + rich_location *, const 
char *, va_list *) > + { > + return true; > + }; > +} > + > +cpp_auto_suppress_diagnostics::~cpp_auto_suppress_diagnostics () > +{ > + m_pfile->cb.diagnostic = m_cb; > +} > diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h > index 5746aac9ea4..50705e3377a 100644 > --- a/libcpp/include/cpplib.h > +++ b/libcpp/include/cpplib.h > @@ -1638,4 +1638,17 @@ enum cpp_xid_property { > > unsigned int cpp_check_xid_property (cppchar_t c); > > +/* In errors.cc */ > + > +/* RAII class to suppress CPP diagnostics in the current scope. */ > +class cpp_auto_suppress_diagnostics > +{ > + public: > + explicit cpp_auto_suppress_diagnostics (cpp_reader *pfile); > + ~cpp_auto_suppress_diagnostics (); > + private: > + cpp_reader *const m_pfile; > + const decltype (cpp_callbacks::diagnostic) m_cb; > +}; > + > #endif /* ! LIBCPP_CPPLIB_H */ > diff --git a/libcpp/internal.h b/libcpp/internal.h > index a20215c5709..6221ef0d1e7 100644 > --- a/libcpp/internal.h > +++ b/libcpp/internal.h > @@ -753,7 +753,6 @@ extern cpp_token *_cpp_lex_direct (cpp_reader *); > extern unsigned char *_cpp_spell_ident_ucns (unsigned char *, cpp_hashnode *); > extern int _cpp_equiv_tokens (const cpp_token *, const cpp_token *); > extern void _cpp_init_tokenrun (tokenrun *, unsigned int); > -extern cpp_hashnode *_cpp_lex_identifier (cpp_reader *, const char *); > extern int _cpp_remaining_tokens_num_in_context (cpp_context *); > extern void _cpp_init_lexer (void); > static inline void *_cpp_reserve_room (cpp_reader *pfile, size_t have, > diff --git a/libcpp/lex.cc b/libcpp/lex.cc > index 5aa379980cf..ba97377417b 100644 > --- a/libcpp/lex.cc > +++ b/libcpp/lex.cc > @@ -2204,39 +2204,6 @@ identifier_diagnostics_on_lex (cpp_reader *pfile, cpp_hashnode *node) > NODE_NAME (node)); > } > > -/* Helper function to get the cpp_hashnode of the identifier BASE. 
*/ > -static cpp_hashnode * > -lex_identifier_intern (cpp_reader *pfile, const uchar *base) > -{ > - cpp_hashnode *result; > - const uchar *cur; > - unsigned int len; > - unsigned int hash = HT_HASHSTEP (0, *base); > - > - cur = base + 1; > - while (ISIDNUM (*cur)) > - { > - hash = HT_HASHSTEP (hash, *cur); > - cur++; > - } > - len = cur - base; > - hash = HT_HASHFINISH (hash, len); > - result = CPP_HASHNODE (ht_lookup_with_hash (pfile->hash_table, > - base, len, hash, HT_ALLOC)); > - identifier_diagnostics_on_lex (pfile, result); > - return result; > -} > - > -/* Get the cpp_hashnode of an identifier specified by NAME in > - the current cpp_reader object. If none is found, NULL is returned. */ > -cpp_hashnode * > -_cpp_lex_identifier (cpp_reader *pfile, const char *name) > -{ > - cpp_hashnode *result; > - result = lex_identifier_intern (pfile, (uchar *) name); > - return result; > -} > - > /* Lex an identifier starting at BASE. BUFFER->CUR is expected to point > one past the first character at BASE, which may be a (possibly multi-byte) > character if STARTS_UCN is true. */ > diff --git a/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c b/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c > new file mode 100644 > index 00000000000..c8665960e30 > --- /dev/null > +++ b/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c > @@ -0,0 +1,203 @@ > +/* { dg-do preprocess } */ > +/* { dg-options "-std=c11 -pedantic" { target c } } */ > +/* { dg-options "-std=c++11 -pedantic" { target c++ } } */ > +/* { dg-additional-options "-Wall" } */ > + > +/* PR preprocessor/109704 */ > + > +/* Verify basic operations for different extended identifiers... */ > + > +/* ...dollar sign. 
*/ > +#define $x 1 > +#pragma push_macro("$x") > +#undef $x > +#define $x 0 > +#pragma pop_macro("$x") > +#if !$x > +#error $x > +#endif > +#define $x 1 > +_Pragma("push_macro(\"$x\")") > +#undef $x > +#define $x 0 > +_Pragma("pop_macro(\"$x\")") > +#if !$x > +#error $x > +#endif > +#define x$ 1 > +#pragma push_macro("x$") > +#undef x$ > +#define x$ 0 > +#pragma pop_macro("x$") > +#if !x$ > +#error x$ > +#endif > +#define x$ 1 > +_Pragma("push_macro(\"x$\")") > +#undef x$ > +#define x$ 0 > +_Pragma("pop_macro(\"x$\")") > +#if !x$ > +#error x$ > +#endif > + > +/* ...UCN. */ > +#define \u03B1x 1 > +#pragma push_macro("\u03B1x") > +#undef \u03B1x > +#define \u03B1x 0 > +#pragma pop_macro("\u03B1x") > +#if !\u03B1x > +#error \u03B1x > +#endif > +#define \u03B1x 1 > +_Pragma("push_macro(\"\\u03B1x\")") > +#undef \u03B1x > +#define \u03B1x 0 > +_Pragma("pop_macro(\"\\u03B1x\")") > +#if !\u03B1x > +#error \u03B1x > +#endif > +#define x\u03B1 1 > +#pragma push_macro("x\u03B1") > +#undef x\u03B1 > +#define x\u03B1 0 > +#pragma pop_macro("x\u03B1") > +#if !x\u03B1 > +#error x\u03B1 > +#endif > +#define x\u03B1 1 > +_Pragma("push_macro(\"x\\u03B1\")") > +#undef x\u03B1 > +#define x\u03B1 0 > +_Pragma("pop_macro(\"x\\u03B1\")") > +#if !x\u03B1 > +#error x\u03B1 > +#endif > + > +/* ...UTF-8. */ > +#define πx 1 > +#pragma push_macro("πx") > +#undef πx > +#define πx 0 > +#pragma pop_macro("πx") > +#if !πx > +#error πx > +#endif > +#define πx 1 > +_Pragma("push_macro(\"πx\")") > +#undef πx > +#define πx 0 > +_Pragma("pop_macro(\"πx\")") > +#if !πx > +#error πx > +#endif > +#define xπ 1 > +#pragma push_macro("xπ") > +#undef xπ > +#define xπ 0 > +#pragma pop_macro("xπ") > +#if !xπ > +#error xπ > +#endif > +#define xπ 1 > +_Pragma("push_macro(\"xπ\")") > +#undef xπ > +#define xπ 0 > +_Pragma("pop_macro(\"xπ\")") > +#if !xπ > +#error xπ > +#endif > + > +/* Verify UCN and UTF-8 can be intermixed. 
*/ > +#define ħ_0 1 > +#pragma push_macro("ħ_0") > +#undef ħ_0 > +#define ħ_0 0 > +#if ħ_0 > +#error ħ_0 ħ_0 \U00000127_0 > +#endif > +#pragma pop_macro("\U00000127_0") > +#if !ħ_0 > +#error ħ_0 ħ_0 \U00000127_0 > +#endif > +#define ħ_1 1 > +#pragma push_macro("\U00000127_1") > +#undef ħ_1 > +#define ħ_1 0 > +#if ħ_1 > +#error ħ_1 \U00000127_1 ħ_1 > +#endif > +#pragma pop_macro("ħ_1") > +#if !ħ_1 > +#error ħ_1 \U00000127_1 ħ_1 > +#endif > +#define ħ_2 1 > +#pragma push_macro("\U00000127_2") > +#undef ħ_2 > +#define ħ_2 0 > +#if ħ_2 > +#error ħ_2 \U00000127_2 \U00000127_2 > +#endif > +#pragma pop_macro("\U00000127_2") > +#if !ħ_2 > +#error ħ_2 \U00000127_2 \U00000127_2 > +#endif > +#define \U00000127_3 1 > +#pragma push_macro("ħ_3") > +#undef \U00000127_3 > +#define \U00000127_3 0 > +#if \U00000127_3 > +#error \U00000127_3 ħ_3 ħ_3 > +#endif > +#pragma pop_macro("ħ_3") > +#if !\U00000127_3 > +#error \U00000127_3 ħ_3 ħ_3 > +#endif > +#define \U00000127_4 1 > +#pragma push_macro("ħ_4") > +#undef \U00000127_4 > +#define \U00000127_4 0 > +#if \U00000127_4 > +#error \U00000127_4 ħ_4 \U00000127_4 > +#endif > +#pragma pop_macro("\U00000127_4") > +#if !\U00000127_4 > +#error \U00000127_4 ħ_4 \U00000127_4 > +#endif > +#define \U00000127_5 1 > +#pragma push_macro("\U00000127_5") > +#undef \U00000127_5 > +#define \U00000127_5 0 > +#if \U00000127_5 > +#error \U00000127_5 \U00000127_5 ħ_5 > +#endif > +#pragma pop_macro("ħ_5") > +#if !\U00000127_5 > +#error \U00000127_5 \U00000127_5 ħ_5 > +#endif > + > +/* Verify invalid input produces no diagnostics. */ > +#pragma push_macro("") /* { dg-bogus "." } */ > +#pragma push_macro("\u") /* { dg-bogus "." } */ > +#pragma push_macro("\u0000") /* { dg-bogus "." } */ > +#pragma push_macro("not a single identifier") /* { dg-bogus "." } */ > +#pragma push_macro("invalid╬character") /* { dg-bogus "." } */ > +#pragma push_macro("\u0300invalid_start") /* { dg-bogus "." } */ > +#pragma push_macro("#include <cstdlib>") /* { dg-bogus "." 
} */ > + > +/* Verify end-of-line diagnostics for valid and invalid input. */ > +#pragma push_macro("ö") oops /* { dg-warning "extra tokens" } */ > +#pragma push_macro("") oops /* { dg-warning "extra tokens" } */ > +#pragma push_macro("\u") oops /* { dg-warning "extra tokens" } */ > +#pragma push_macro("\u0000") oops /* { dg-warning "extra tokens" } */ > +#pragma push_macro("not a single identifier") oops /* { dg-warning "extra tokens" } */ > +#pragma push_macro("invalid╬character") oops /* { dg-warning "extra tokens" } */ > +#pragma push_macro("\u0300invalid_start") oops /* { dg-warning "extra tokens" } */ > +#pragma push_macro("#include <cstdlib>") oops /* { dg-warning "extra tokens" } */ > + > +/* Verify expected diagnostics. */ > +#pragma push_macro() /* { dg-error {invalid #pragma push_macro} } */ > +#pragma pop_macro() /* { dg-error {invalid #pragma pop_macro} } */ > +_Pragma("push_macro(0)") /* { dg-error {invalid #pragma push_macro} } */ > +_Pragma("pop_macro(\"oops\"") /* { dg-error {invalid #pragma pop_macro} } */ > diff --git a/gcc/testsuite/g++.dg/pch/pushpop-2.C b/gcc/testsuite/g++.dg/pch/pushpop-2.C > new file mode 100644 > index 00000000000..84886aea985 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/pch/pushpop-2.C > @@ -0,0 +1,18 @@ > +/* { dg-options -std=c++11 } */ > +#include "pushpop-2.Hs" > + > +#if π != 4 > +#error π != 4 > +#endif > +#pragma pop_macro("\u03C0") > +#if π != 3 > +#error π != 3 > +#endif > + > +#if \u03B1 != 6 > +#error α != 6 > +#endif > +_Pragma("pop_macro(\"\\u03B1\")") > +#if α != 5 > +#error α != 5 > +#endif > diff --git a/gcc/testsuite/g++.dg/pch/pushpop-2.Hs b/gcc/testsuite/g++.dg/pch/pushpop-2.Hs > new file mode 100644 > index 00000000000..797139a3196 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/pch/pushpop-2.Hs > @@ -0,0 +1,9 @@ > +#define π 3 > +#pragma push_macro ("π") > +#undef π > +#define π 4 > + > +#define \u03B1 5 > +#pragma push_macro ("α") > +#undef α > +#define α 6 > diff --git 
a/gcc/testsuite/gcc.dg/pch/pushpop-2.c b/gcc/testsuite/gcc.dg/pch/pushpop-2.c > new file mode 100644 > index 00000000000..61b8430c6d2 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/pch/pushpop-2.c > @@ -0,0 +1,18 @@ > +/* { dg-options -std=c11 } */ > +#include "pushpop-2.hs" > + > +#if π != 4 > +#error π != 4 > +#endif > +#pragma pop_macro("\u03C0") > +#if π != 3 > +#error π != 3 > +#endif > + > +#if \u03B1 != 6 > +#error α != 6 > +#endif > +_Pragma("pop_macro(\"\\u03B1\")") > +#if α != 5 > +#error α != 5 > +#endif > diff --git a/gcc/testsuite/gcc.dg/pch/pushpop-2.hs b/gcc/testsuite/gcc.dg/pch/pushpop-2.hs > new file mode 100644 > index 00000000000..797139a3196 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/pch/pushpop-2.hs > @@ -0,0 +1,9 @@ > +#define π 3 > +#pragma push_macro ("π") > +#undef π > +#define π 4 > + > +#define \u03B1 5 > +#pragma push_macro ("α") > +#undef α > +#define α 6
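The diagnostic suppression that the patch factors out of charset.cc into cpp_auto_suppress_diagnostics is a save/replace/restore guard around a callback pointer. Below is a minimal standalone sketch of that pattern, not the libcpp code itself: `reader`, `diag_fn`, `auto_suppress`, `count_diag`, and `demo` are hypothetical names invented for illustration, and the callback signature is simplified compared to `cpp_callbacks::diagnostic`.

```cpp
// Sketch only: `reader`, `diag_fn`, `auto_suppress`, `count_diag`, and `demo`
// are hypothetical stand-ins, not libcpp types or functions.

struct reader;
using diag_fn = bool (*) (reader *, const char *);

struct reader
{
  diag_fn diagnostic;   // currently installed diagnostic callback
  int emitted;          // number of diagnostics actually reported
};

// Ordinary callback: count the diagnostic as reported.
bool
count_diag (reader *r, const char *)
{
  ++r->emitted;
  return true;
}

// RAII guard: install a no-op callback for the current scope and restore
// the saved callback when the guard is destroyed.
class auto_suppress
{
public:
  explicit auto_suppress (reader *r)
    : m_reader (r), m_saved (r->diagnostic)
  {
    // A captureless lambda converts implicitly to a plain function pointer.
    m_reader->diagnostic = [] (reader *, const char *) { return true; };
  }
  ~auto_suppress () { m_reader->diagnostic = m_saved; }

  auto_suppress (const auto_suppress &) = delete;
  auto_suppress &operator= (const auto_suppress &) = delete;

private:
  reader *const m_reader;
  const diag_fn m_saved;
};

// Issue one diagnostic normally, one while suppressed, and one afterwards;
// return how many were actually counted.
int
demo ()
{
  reader r { count_diag, 0 };
  r.diagnostic (&r, "first");
  {
    auto_suppress guard { &r };
    r.diagnostic (&r, "ignored");   // swallowed by the no-op callback
  }
  r.diagnostic (&r, "second");      // original callback is back in place
  return r.emitted;
}
```

Tying the restore to scope exit means an early return inside the guarded block cannot leave the no-op callback installed, which is the risk with a hand-written save and restore pair.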
Hello-

May I please ping this one (now for GCC 15)? Thanks!
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642926.html

-Lewis

On Sat, Feb 10, 2024 at 9:02 AM Lewis Hyatt <lhyatt@gmail.com> wrote:
>
> Hello-
>
> https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642926.html
>
> May I please ping this one? Thanks!
>
> On Sat, Jan 13, 2024 at 5:12 PM Lewis Hyatt <lhyatt@gmail.com> wrote:
> >
> > Hello-
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109704
> >
> > The below patch fixes the issue noted in the PR that extended characters
> > cannot appear in the identifier passed to a #pragma push_macro or #pragma
> > pop_macro. Bootstrap + regtest all languages on x86-64 Linux. Is it OK for
> > GCC 13 please?
> >
> > I know we just entered stage 4, however I feel this is kinda like an old
> > regression, given that the issue was not apparent until support for UCNs and
> > UTF-8 in identifiers got added. FWIW, it would be nice if it makes it into
> > GCC 13, because AFAIK all other UTF-8-related bugs are fixed in this
> > release. (The other major one was for extended characters in a user-defined
> > literal, that was fixed by r14-2629).
> >
> > Speaking of just entering stage 4. I do have 4 really short patches sent
> > over the past several months that never got any response. Is there any
> > chance someone may have a few minutes to look at them please? They are
> > really just like 1-3 line fixes for PRs.
> >
> > libcpp (pinged once recently):
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641247.html
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640386.html
> >
> > diagnostics (pinged for 3rd time last week):
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html
> >
> > -- >8 --
> >
> > The implementation of #pragma push_macro and #pragma pop_macro has to date
> > made use of an ad-hoc function, _cpp_lex_identifier(), which lexes an
> > identifier out of a string. When support was added for extended characters
> > in identifiers ($, UCNs, or UTF-8), that support was added only for the
> > "normal" way of lexing identifiers out of a cpp_buffer (_cpp_lex_direct) and
> > not for the ad-hoc way. Consequently, extended identifiers are not usable
> > with these pragmas.
> >
> > The logic for lexing identifiers has become more complicated than it was
> > when _cpp_lex_identifier() was written -- it now handles things like \N{}
> > escapes in C++, for instance -- and it no longer seems practical to maintain
> > a redundant code path for lexing identifiers. Address the issue by changing
> > the implementation of #pragma {push,pop}_macro to lex identifiers in the
> > expected way, i.e. by pushing a cpp_buffer and lexing the identifier from
> > there.
> >
> > The existing implementation has some quirks because of the ad-hoc parsing
> > logic. For example:
> >
> > #pragma push_macro("X ")
> > ...
> > #pragma pop_macro("X")
> >
> > will not restore macro X (note the extra space in the first string). However:
> >
> > #pragma push_macro("X ")
> > ...
> > #pragma pop_macro("X ")
> >
> > actually does successfully restore "X". This is because the key for looking
> > up the saved macro on the push stack is the original string passed, so the
> > string passed to pop_macro needs to match it exactly. It is not that easy to
> > reproduce this logic in the world of extended characters, given that for
> > example it should be valid to pass a UCN to push_macro, and the
> > corresponding UTF-8 to pop_macro. Given that this aspect of the existing
> > behavior seems unintentional and has no tests (and does not match other
> > implementations), I opted to make the new logic more straightforward. The
> > string passed needs to lex to one token, which must be a valid identifier,
> > or else no action is taken and no error is generated. Any diagnostics
> > encountered during lexing (e.g., due to a UTF-8 character not permitted to
> > appear in an identifier) are also suppressed.
> >
> > It could be nice (for GCC 15) to also add a warning if a pop_macro does not
> > match a previous push_macro.
> >
> > libcpp/ChangeLog:
> >
> > PR preprocessor/109704
> > * include/cpplib.h (class cpp_auto_suppress_diagnostics): New class.
> > * errors.cc
> > (cpp_auto_suppress_diagnostics::cpp_auto_suppress_diagnostics): New
> > function.
> > (cpp_auto_suppress_diagnostics::~cpp_auto_suppress_diagnostics): New
> > function.
> > * charset.cc (noop_diagnostic_cb): Remove.
> > (cpp_interpret_string_ranges): Refactor diagnostic suppression logic
> > into new class cpp_auto_suppress_diagnostics.
> > (count_source_chars): Likewise.
> > * directives.cc (cpp_pop_definition): Add cpp_hashnode argument.
> > (lex_identifier_from_string): New static helper function.
> > (push_pop_macro_common): Refactor common logic from
> > do_pragma_push_macro and do_pragma_pop_macro; use
> > lex_identifier_from_string instead of _cpp_lex_identifier.
> > (do_pragma_push_macro): Reimplement using push_pop_macro_common.
> > (do_pragma_pop_macro): Likewise.
> > * internal.h (_cpp_lex_identifier): Remove.
> > * lex.cc (lex_identifier_intern): Remove.
> > (_cpp_lex_identifier): Remove.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR preprocessor/109704
> > * c-c++-common/cpp/pragma-push-pop-utf8.c: New test.
> > * g++.dg/pch/pushpop-2.C: New test.
> > * g++.dg/pch/pushpop-2.Hs: New test.
> > * gcc.dg/pch/pushpop-2.c: New test.
> > * gcc.dg/pch/pushpop-2.hs: New test.
> > --- > > libcpp/charset.cc | 33 +-- > > libcpp/directives.cc | 175 +++++++-------- > > libcpp/errors.cc | 16 ++ > > libcpp/include/cpplib.h | 13 ++ > > libcpp/internal.h | 1 - > > libcpp/lex.cc | 33 --- > > .../c-c++-common/cpp/pragma-push-pop-utf8.c | 203 ++++++++++++++++++ > > gcc/testsuite/g++.dg/pch/pushpop-2.C | 18 ++ > > gcc/testsuite/g++.dg/pch/pushpop-2.Hs | 9 + > > gcc/testsuite/gcc.dg/pch/pushpop-2.c | 18 ++ > > gcc/testsuite/gcc.dg/pch/pushpop-2.hs | 9 + > > 11 files changed, 378 insertions(+), 150 deletions(-) > > create mode 100644 gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c > > create mode 100644 gcc/testsuite/g++.dg/pch/pushpop-2.C > > create mode 100644 gcc/testsuite/g++.dg/pch/pushpop-2.Hs > > create mode 100644 gcc/testsuite/gcc.dg/pch/pushpop-2.c > > create mode 100644 gcc/testsuite/gcc.dg/pch/pushpop-2.hs > > > > diff --git a/libcpp/charset.cc b/libcpp/charset.cc > > index 54d7b9e0932..7937df7d78c 100644 > > --- a/libcpp/charset.cc > > +++ b/libcpp/charset.cc > > @@ -2590,19 +2590,6 @@ cpp_interpret_string (cpp_reader *pfile, const cpp_string *from, size_t count, > > return cpp_interpret_string_1 (pfile, from, count, to, type, NULL, NULL); > > } > > > > -/* A "do nothing" diagnostic-handling callback for use by > > - cpp_interpret_string_ranges, so that it can temporarily suppress > > - diagnostic-handling. */ > > - > > -static bool > > -noop_diagnostic_cb (cpp_reader *, enum cpp_diagnostic_level, > > - enum cpp_warning_reason, rich_location *, > > - const char *, va_list *) > > -{ > > - /* no-op. */ > > - return true; > > -} > > - > > /* This function mimics the behavior of cpp_interpret_string, but > > rather than generating a string in the execution character set, > > *OUT is written to with the source code ranges of the characters > > @@ -2642,20 +2629,10 @@ cpp_interpret_string_ranges (cpp_reader *pfile, const cpp_string *from, > > failing, rather than being emitted as a user-visible diagnostic. 
> > If an diagnostic does occur, we should see it via the return value of > > cpp_interpret_string_1. */ > > - bool (*saved_diagnostic_handler) (cpp_reader *, enum cpp_diagnostic_level, > > - enum cpp_warning_reason, rich_location *, > > - const char *, va_list *) > > - ATTRIBUTE_FPTR_PRINTF(5,0); > > - > > - saved_diagnostic_handler = pfile->cb.diagnostic; > > - pfile->cb.diagnostic = noop_diagnostic_cb; > > - > > + cpp_auto_suppress_diagnostics suppress {pfile}; > > bool result = cpp_interpret_string_1 (pfile, from, count, NULL, type, > > loc_readers, out); > > > > - /* Restore the saved diagnostic-handler. */ > > - pfile->cb.diagnostic = saved_diagnostic_handler; > > - > > if (!result) > > return "cpp_interpret_string_1 failed"; > > > > @@ -2691,17 +2668,11 @@ static unsigned > > count_source_chars (cpp_reader *pfile, cpp_string str, cpp_ttype type) > > { > > cpp_string str2 = { 0, 0 }; > > - bool (*saved_diagnostic_handler) (cpp_reader *, enum cpp_diagnostic_level, > > - enum cpp_warning_reason, rich_location *, > > - const char *, va_list *) > > - ATTRIBUTE_FPTR_PRINTF(5,0); > > - saved_diagnostic_handler = pfile->cb.diagnostic; > > - pfile->cb.diagnostic = noop_diagnostic_cb; > > + cpp_auto_suppress_diagnostics suppress {pfile}; > > convert_f save_func = pfile->narrow_cset_desc.func; > > pfile->narrow_cset_desc.func = convert_count_chars; > > bool ret = cpp_interpret_string (pfile, &str, 1, &str2, type); > > pfile->narrow_cset_desc.func = save_func; > > - pfile->cb.diagnostic = saved_diagnostic_handler; > > if (ret) > > { > > if (str2.text != str.text) > > diff --git a/libcpp/directives.cc b/libcpp/directives.cc > > index 479f8c716e8..019e4009dc9 100644 > > --- a/libcpp/directives.cc > > +++ b/libcpp/directives.cc > > @@ -137,7 +137,8 @@ static cpp_macro **find_answer (cpp_hashnode *, const cpp_macro *); > > static void handle_assertion (cpp_reader *, const char *, int); > > static void do_pragma_push_macro (cpp_reader *); > > static void do_pragma_pop_macro 
(cpp_reader *); > > -static void cpp_pop_definition (cpp_reader *, struct def_pragma_macro *); > > +static void cpp_pop_definition (cpp_reader *, def_pragma_macro *, > > + cpp_hashnode *); > > > > /* This is the table of directive handlers. All extensions other than > > #warning, #include_next, and #import are deprecated. The name is > > @@ -1595,55 +1596,95 @@ do_pragma_once (cpp_reader *pfile) > > _cpp_mark_file_once_only (pfile, pfile->buffer->file); > > } > > > > -/* Handle #pragma push_macro(STRING). */ > > -static void > > -do_pragma_push_macro (cpp_reader *pfile) > > +/* Helper for #pragma {push,pop}_macro. Destringize STR and > > + lex it into an identifier, returning the hash node for it. */ > > + > > +static cpp_hashnode * > > +lex_identifier_from_string (cpp_reader *pfile, cpp_string str) > > { > > + auto src = (const uchar *) memchr (str.text, '"', str.len); > > + gcc_checking_assert (src); > > + ++src; > > + const auto limit = str.text + str.len - 1; > > + gcc_checking_assert (*limit == '"' && limit >= src); > > + const auto ident = XALLOCAVEC (uchar, limit - src + 1); > > + auto dest = ident; > > + while (src != limit) > > + { > > + /* We know there is a character following the backslash. */ > > + if (*src == '\\' && (src[1] == '\\' || src[1] == '"')) > > + src++; > > + *dest++ = *src++; > > + } > > + > > + /* We reserved a spot for the newline with the + 1 when allocating IDENT. > > + Push a buffer containing the identifier to lex. */ > > + *dest = '\n'; > > + cpp_push_buffer (pfile, ident, dest - ident, true); > > + _cpp_clean_line (pfile); > > + pfile->cur_token = _cpp_temp_token (pfile); > > + cpp_token *tok; > > + { > > + /* Suppress diagnostics during lexing so that we silently ignore invalid > > + input, as seems to be the common practice for this pragma. 
*/ > > + cpp_auto_suppress_diagnostics suppress {pfile}; > > + tok = _cpp_lex_direct (pfile); > > + } > > + > > cpp_hashnode *node; > > - size_t defnlen; > > - const uchar *defn = NULL; > > - char *macroname, *dest; > > - const char *limit, *src; > > - const cpp_token *txt; > > - struct def_pragma_macro *c; > > + if (tok->type != CPP_NAME || pfile->buffer->cur != pfile->buffer->rlimit) > > + node = nullptr; > > + else > > + node = tok->val.node.node; > > > > - txt = get__Pragma_string (pfile); > > - if (!txt) > > + _cpp_pop_buffer (pfile); > > + return node; > > +} > > + > > +/* Common processing for #pragma {push,pop}_macro. */ > > + > > +static cpp_hashnode * > > +push_pop_macro_common (cpp_reader *pfile, const char *type) > > +{ > > + const cpp_token *const txt = get__Pragma_string (pfile); > > + ++pfile->keep_tokens; > > + cpp_hashnode *node; > > + if (txt) > > { > > - location_t src_loc = pfile->cur_token[-1].src_loc; > > - cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0, > > - "invalid #pragma push_macro directive"); > > check_eol (pfile, false); > > skip_rest_of_line (pfile); > > - return; > > + node = lex_identifier_from_string (pfile, txt->val.str); > > } > > - dest = macroname = (char *) alloca (txt->val.str.len + 2); > > - src = (const char *) (txt->val.str.text + 1 + (txt->val.str.text[0] == 'L')); > > - limit = (const char *) (txt->val.str.text + txt->val.str.len - 1); > > - while (src < limit) > > + else > > { > > - /* We know there is a character following the backslash. 
*/ > > - if (*src == '\\' && (src[1] == '\\' || src[1] == '"')) > > - src++; > > - *dest++ = *src++; > > + node = nullptr; > > + location_t src_loc = pfile->cur_token[-1].src_loc; > > + cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0, > > + "invalid #pragma %s_macro directive", type); > > + skip_rest_of_line (pfile); > > } > > - *dest = 0; > > - check_eol (pfile, false); > > - skip_rest_of_line (pfile); > > - c = XNEW (struct def_pragma_macro); > > - memset (c, 0, sizeof (struct def_pragma_macro)); > > - c->name = XNEWVAR (char, strlen (macroname) + 1); > > - strcpy (c->name, macroname); > > + --pfile->keep_tokens; > > + return node; > > +} > > + > > +/* Handle #pragma push_macro(STRING). */ > > +static void > > +do_pragma_push_macro (cpp_reader *pfile) > > +{ > > + const auto node = push_pop_macro_common (pfile, "push"); > > + if (!node) > > + return; > > + const auto c = XCNEW (def_pragma_macro); > > + c->name = xstrdup ((const char *) NODE_NAME (node)); > > c->next = pfile->pushed_macros; > > - node = _cpp_lex_identifier (pfile, c->name); > > if (node->type == NT_VOID) > > c->is_undef = 1; > > else if (node->type == NT_BUILTIN_MACRO) > > c->is_builtin = 1; > > else > > { > > - defn = cpp_macro_definition (pfile, node); > > - defnlen = ustrlen (defn); > > + const auto defn = cpp_macro_definition (pfile, node); > > + const size_t defnlen = ustrlen (defn); > > c->definition = XNEWVEC (uchar, defnlen + 2); > > c->definition[defnlen] = '\n'; > > c->definition[defnlen + 1] = 0; > > @@ -1660,50 +1701,24 @@ do_pragma_push_macro (cpp_reader *pfile) > > static void > > do_pragma_pop_macro (cpp_reader *pfile) > > { > > - char *macroname, *dest; > > - const char *limit, *src; > > - const cpp_token *txt; > > - struct def_pragma_macro *l = NULL, *c = pfile->pushed_macros; > > - txt = get__Pragma_string (pfile); > > - if (!txt) > > - { > > - location_t src_loc = pfile->cur_token[-1].src_loc; > > - cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0, > > - "invalid 
#pragma pop_macro directive"); > > - check_eol (pfile, false); > > - skip_rest_of_line (pfile); > > - return; > > - } > > - dest = macroname = (char *) alloca (txt->val.str.len + 2); > > - src = (const char *) (txt->val.str.text + 1 + (txt->val.str.text[0] == 'L')); > > - limit = (const char *) (txt->val.str.text + txt->val.str.len - 1); > > - while (src < limit) > > - { > > - /* We know there is a character following the backslash. */ > > - if (*src == '\\' && (src[1] == '\\' || src[1] == '"')) > > - src++; > > - *dest++ = *src++; > > - } > > - *dest = 0; > > - check_eol (pfile, false); > > - skip_rest_of_line (pfile); > > - > > - while (c != NULL) > > + const auto node = push_pop_macro_common (pfile, "pop"); > > + if (!node) > > + return; > > + for (def_pragma_macro *c = pfile->pushed_macros, *l = nullptr; c; c = c->next) > > { > > - if (!strcmp (c->name, macroname)) > > + if (!strcmp (c->name, (const char *) NODE_NAME (node))) > > { > > if (!l) > > pfile->pushed_macros = c->next; > > else > > l->next = c->next; > > - cpp_pop_definition (pfile, c); > > + cpp_pop_definition (pfile, c, node); > > free (c->definition); > > free (c->name); > > free (c); > > break; > > } > > l = c; > > - c = c->next; > > } > > } > > > > @@ -2607,12 +2622,8 @@ cpp_undef (cpp_reader *pfile, const char *macro) > > /* Replace a previous definition DEF of the macro STR. If DEF is NULL, > > or first element is zero, then the macro should be undefined. 
*/ > > static void > > -cpp_pop_definition (cpp_reader *pfile, struct def_pragma_macro *c) > > +cpp_pop_definition (cpp_reader *pfile, def_pragma_macro *c, cpp_hashnode *node) > > { > > - cpp_hashnode *node = _cpp_lex_identifier (pfile, c->name); > > - if (node == NULL) > > - return; > > - > > if (pfile->cb.before_define) > > pfile->cb.before_define (pfile); > > > > @@ -2634,29 +2645,23 @@ cpp_pop_definition (cpp_reader *pfile, struct def_pragma_macro *c) > > } > > > > { > > - size_t namelen; > > - const uchar *dn; > > - cpp_hashnode *h = NULL; > > - cpp_buffer *nbuf; > > - > > - namelen = ustrcspn (c->definition, "( \n"); > > - h = cpp_lookup (pfile, c->definition, namelen); > > - dn = c->definition + namelen; > > - > > - nbuf = cpp_push_buffer (pfile, dn, ustrchr (dn, '\n') - dn, true); > > + const auto namelen = ustrcspn (c->definition, "( \n"); > > + const auto dn = c->definition + namelen; > > + const auto nbuf = cpp_push_buffer (pfile, dn, ustrchr (dn, '\n') - dn, > > + true); > > if (nbuf != NULL) > > { > > _cpp_clean_line (pfile); > > nbuf->sysp = 1; > > - if (!_cpp_create_definition (pfile, h, 0)) > > + if (!_cpp_create_definition (pfile, node, 0)) > > abort (); > > _cpp_pop_buffer (pfile); > > } > > else > > abort (); > > - h->value.macro->line = c->line; > > - h->value.macro->syshdr = c->syshdr; > > - h->value.macro->used = c->used; > > + node->value.macro->line = c->line; > > + node->value.macro->syshdr = c->syshdr; > > + node->value.macro->used = c->used; > > } > > } > > > > diff --git a/libcpp/errors.cc b/libcpp/errors.cc > > index 295496df7ed..3228dcbe7f6 100644 > > --- a/libcpp/errors.cc > > +++ b/libcpp/errors.cc > > @@ -350,3 +350,19 @@ cpp_errno_filename (cpp_reader *pfile, enum cpp_diagnostic_level level, > > return cpp_error_at (pfile, level, loc, "%s: %s", filename, > > xstrerror (errno)); > > } > > + > > +cpp_auto_suppress_diagnostics::cpp_auto_suppress_diagnostics (cpp_reader *pfile) > > + : m_pfile (pfile), m_cb (pfile->cb.diagnostic) > > 
+{ > > + m_pfile->cb.diagnostic > > + = [] (cpp_reader *, cpp_diagnostic_level, cpp_warning_reason, > > + rich_location *, const char *, va_list *) > > + { > > + return true; > > + }; > > +} > > + > > +cpp_auto_suppress_diagnostics::~cpp_auto_suppress_diagnostics () > > +{ > > + m_pfile->cb.diagnostic = m_cb; > > +} > > diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h > > index 5746aac9ea4..50705e3377a 100644 > > --- a/libcpp/include/cpplib.h > > +++ b/libcpp/include/cpplib.h > > @@ -1638,4 +1638,17 @@ enum cpp_xid_property { > > > > unsigned int cpp_check_xid_property (cppchar_t c); > > > > +/* In errors.cc */ > > + > > +/* RAII class to suppress CPP diagnostics in the current scope. */ > > +class cpp_auto_suppress_diagnostics > > +{ > > + public: > > + explicit cpp_auto_suppress_diagnostics (cpp_reader *pfile); > > + ~cpp_auto_suppress_diagnostics (); > > + private: > > + cpp_reader *const m_pfile; > > + const decltype (cpp_callbacks::diagnostic) m_cb; > > +}; > > + > > #endif /* ! 
LIBCPP_CPPLIB_H */ > > diff --git a/libcpp/internal.h b/libcpp/internal.h > > index a20215c5709..6221ef0d1e7 100644 > > --- a/libcpp/internal.h > > +++ b/libcpp/internal.h > > @@ -753,7 +753,6 @@ extern cpp_token *_cpp_lex_direct (cpp_reader *); > > extern unsigned char *_cpp_spell_ident_ucns (unsigned char *, cpp_hashnode *); > > extern int _cpp_equiv_tokens (const cpp_token *, const cpp_token *); > > extern void _cpp_init_tokenrun (tokenrun *, unsigned int); > > -extern cpp_hashnode *_cpp_lex_identifier (cpp_reader *, const char *); > > extern int _cpp_remaining_tokens_num_in_context (cpp_context *); > > extern void _cpp_init_lexer (void); > > static inline void *_cpp_reserve_room (cpp_reader *pfile, size_t have, > > diff --git a/libcpp/lex.cc b/libcpp/lex.cc > > index 5aa379980cf..ba97377417b 100644 > > --- a/libcpp/lex.cc > > +++ b/libcpp/lex.cc > > @@ -2204,39 +2204,6 @@ identifier_diagnostics_on_lex (cpp_reader *pfile, cpp_hashnode *node) > > NODE_NAME (node)); > > } > > > > -/* Helper function to get the cpp_hashnode of the identifier BASE. */ > > -static cpp_hashnode * > > -lex_identifier_intern (cpp_reader *pfile, const uchar *base) > > -{ > > - cpp_hashnode *result; > > - const uchar *cur; > > - unsigned int len; > > - unsigned int hash = HT_HASHSTEP (0, *base); > > - > > - cur = base + 1; > > - while (ISIDNUM (*cur)) > > - { > > - hash = HT_HASHSTEP (hash, *cur); > > - cur++; > > - } > > - len = cur - base; > > - hash = HT_HASHFINISH (hash, len); > > - result = CPP_HASHNODE (ht_lookup_with_hash (pfile->hash_table, > > - base, len, hash, HT_ALLOC)); > > - identifier_diagnostics_on_lex (pfile, result); > > - return result; > > -} > > - > > -/* Get the cpp_hashnode of an identifier specified by NAME in > > - the current cpp_reader object. If none is found, NULL is returned. 
*/ > > -cpp_hashnode * > > -_cpp_lex_identifier (cpp_reader *pfile, const char *name) > > -{ > > - cpp_hashnode *result; > > - result = lex_identifier_intern (pfile, (uchar *) name); > > - return result; > > -} > > - > > /* Lex an identifier starting at BASE. BUFFER->CUR is expected to point > > one past the first character at BASE, which may be a (possibly multi-byte) > > character if STARTS_UCN is true. */ > > diff --git a/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c b/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c > > new file mode 100644 > > index 00000000000..c8665960e30 > > --- /dev/null > > +++ b/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c > > @@ -0,0 +1,203 @@ > > +/* { dg-do preprocess } */ > > +/* { dg-options "-std=c11 -pedantic" { target c } } */ > > +/* { dg-options "-std=c++11 -pedantic" { target c++ } } */ > > +/* { dg-additional-options "-Wall" } */ > > + > > +/* PR preprocessor/109704 */ > > + > > +/* Verify basic operations for different extended identifiers... */ > > + > > +/* ...dollar sign. */ > > +#define $x 1 > > +#pragma push_macro("$x") > > +#undef $x > > +#define $x 0 > > +#pragma pop_macro("$x") > > +#if !$x > > +#error $x > > +#endif > > +#define $x 1 > > +_Pragma("push_macro(\"$x\")") > > +#undef $x > > +#define $x 0 > > +_Pragma("pop_macro(\"$x\")") > > +#if !$x > > +#error $x > > +#endif > > +#define x$ 1 > > +#pragma push_macro("x$") > > +#undef x$ > > +#define x$ 0 > > +#pragma pop_macro("x$") > > +#if !x$ > > +#error x$ > > +#endif > > +#define x$ 1 > > +_Pragma("push_macro(\"x$\")") > > +#undef x$ > > +#define x$ 0 > > +_Pragma("pop_macro(\"x$\")") > > +#if !x$ > > +#error x$ > > +#endif > > + > > +/* ...UCN. 
*/ > > +#define \u03B1x 1 > > +#pragma push_macro("\u03B1x") > > +#undef \u03B1x > > +#define \u03B1x 0 > > +#pragma pop_macro("\u03B1x") > > +#if !\u03B1x > > +#error \u03B1x > > +#endif > > +#define \u03B1x 1 > > +_Pragma("push_macro(\"\\u03B1x\")") > > +#undef \u03B1x > > +#define \u03B1x 0 > > +_Pragma("pop_macro(\"\\u03B1x\")") > > +#if !\u03B1x > > +#error \u03B1x > > +#endif > > +#define x\u03B1 1 > > +#pragma push_macro("x\u03B1") > > +#undef x\u03B1 > > +#define x\u03B1 0 > > +#pragma pop_macro("x\u03B1") > > +#if !x\u03B1 > > +#error x\u03B1 > > +#endif > > +#define x\u03B1 1 > > +_Pragma("push_macro(\"x\\u03B1\")") > > +#undef x\u03B1 > > +#define x\u03B1 0 > > +_Pragma("pop_macro(\"x\\u03B1\")") > > +#if !x\u03B1 > > +#error x\u03B1 > > +#endif > > + > > +/* ...UTF-8. */ > > +#define πx 1 > > +#pragma push_macro("πx") > > +#undef πx > > +#define πx 0 > > +#pragma pop_macro("πx") > > +#if !πx > > +#error πx > > +#endif > > +#define πx 1 > > +_Pragma("push_macro(\"πx\")") > > +#undef πx > > +#define πx 0 > > +_Pragma("pop_macro(\"πx\")") > > +#if !πx > > +#error πx > > +#endif > > +#define xπ 1 > > +#pragma push_macro("xπ") > > +#undef xπ > > +#define xπ 0 > > +#pragma pop_macro("xπ") > > +#if !xπ > > +#error xπ > > +#endif > > +#define xπ 1 > > +_Pragma("push_macro(\"xπ\")") > > +#undef xπ > > +#define xπ 0 > > +_Pragma("pop_macro(\"xπ\")") > > +#if !xπ > > +#error xπ > > +#endif > > + > > +/* Verify UCN and UTF-8 can be intermixed. 
*/ > > +#define ħ_0 1 > > +#pragma push_macro("ħ_0") > > +#undef ħ_0 > > +#define ħ_0 0 > > +#if ħ_0 > > +#error ħ_0 ħ_0 \U00000127_0 > > +#endif > > +#pragma pop_macro("\U00000127_0") > > +#if !ħ_0 > > +#error ħ_0 ħ_0 \U00000127_0 > > +#endif > > +#define ħ_1 1 > > +#pragma push_macro("\U00000127_1") > > +#undef ħ_1 > > +#define ħ_1 0 > > +#if ħ_1 > > +#error ħ_1 \U00000127_1 ħ_1 > > +#endif > > +#pragma pop_macro("ħ_1") > > +#if !ħ_1 > > +#error ħ_1 \U00000127_1 ħ_1 > > +#endif > > +#define ħ_2 1 > > +#pragma push_macro("\U00000127_2") > > +#undef ħ_2 > > +#define ħ_2 0 > > +#if ħ_2 > > +#error ħ_2 \U00000127_2 \U00000127_2 > > +#endif > > +#pragma pop_macro("\U00000127_2") > > +#if !ħ_2 > > +#error ħ_2 \U00000127_2 \U00000127_2 > > +#endif > > +#define \U00000127_3 1 > > +#pragma push_macro("ħ_3") > > +#undef \U00000127_3 > > +#define \U00000127_3 0 > > +#if \U00000127_3 > > +#error \U00000127_3 ħ_3 ħ_3 > > +#endif > > +#pragma pop_macro("ħ_3") > > +#if !\U00000127_3 > > +#error \U00000127_3 ħ_3 ħ_3 > > +#endif > > +#define \U00000127_4 1 > > +#pragma push_macro("ħ_4") > > +#undef \U00000127_4 > > +#define \U00000127_4 0 > > +#if \U00000127_4 > > +#error \U00000127_4 ħ_4 \U00000127_4 > > +#endif > > +#pragma pop_macro("\U00000127_4") > > +#if !\U00000127_4 > > +#error \U00000127_4 ħ_4 \U00000127_4 > > +#endif > > +#define \U00000127_5 1 > > +#pragma push_macro("\U00000127_5") > > +#undef \U00000127_5 > > +#define \U00000127_5 0 > > +#if \U00000127_5 > > +#error \U00000127_5 \U00000127_5 ħ_5 > > +#endif > > +#pragma pop_macro("ħ_5") > > +#if !\U00000127_5 > > +#error \U00000127_5 \U00000127_5 ħ_5 > > +#endif > > + > > +/* Verify invalid input produces no diagnostics. */ > > +#pragma push_macro("") /* { dg-bogus "." } */ > > +#pragma push_macro("\u") /* { dg-bogus "." } */ > > +#pragma push_macro("\u0000") /* { dg-bogus "." } */ > > +#pragma push_macro("not a single identifier") /* { dg-bogus "." } */ > > +#pragma push_macro("invalid╬character") /* { dg-bogus "." 
} */ > > +#pragma push_macro("\u0300invalid_start") /* { dg-bogus "." } */ > > +#pragma push_macro("#include <cstdlib>") /* { dg-bogus "." } */ > > + > > +/* Verify end-of-line diagnostics for valid and invalid input. */ > > +#pragma push_macro("ö") oops /* { dg-warning "extra tokens" } */ > > +#pragma push_macro("") oops /* { dg-warning "extra tokens" } */ > > +#pragma push_macro("\u") oops /* { dg-warning "extra tokens" } */ > > +#pragma push_macro("\u0000") oops /* { dg-warning "extra tokens" } */ > > +#pragma push_macro("not a single identifier") oops /* { dg-warning "extra tokens" } */ > > +#pragma push_macro("invalid╬character") oops /* { dg-warning "extra tokens" } */ > > +#pragma push_macro("\u0300invalid_start") oops /* { dg-warning "extra tokens" } */ > > +#pragma push_macro("#include <cstdlib>") oops /* { dg-warning "extra tokens" } */ > > + > > +/* Verify expected diagnostics. */ > > +#pragma push_macro() /* { dg-error {invalid #pragma push_macro} } */ > > +#pragma pop_macro() /* { dg-error {invalid #pragma pop_macro} } */ > > +_Pragma("push_macro(0)") /* { dg-error {invalid #pragma push_macro} } */ > > +_Pragma("pop_macro(\"oops\"") /* { dg-error {invalid #pragma pop_macro} } */ > > diff --git a/gcc/testsuite/g++.dg/pch/pushpop-2.C b/gcc/testsuite/g++.dg/pch/pushpop-2.C > > new file mode 100644 > > index 00000000000..84886aea985 > > --- /dev/null > > +++ b/gcc/testsuite/g++.dg/pch/pushpop-2.C > > @@ -0,0 +1,18 @@ > > +/* { dg-options -std=c++11 } */ > > +#include "pushpop-2.Hs" > > + > > +#if π != 4 > > +#error π != 4 > > +#endif > > +#pragma pop_macro("\u03C0") > > +#if π != 3 > > +#error π != 3 > > +#endif > > + > > +#if \u03B1 != 6 > > +#error α != 6 > > +#endif > > +_Pragma("pop_macro(\"\\u03B1\")") > > +#if α != 5 > > +#error α != 5 > > +#endif > > diff --git a/gcc/testsuite/g++.dg/pch/pushpop-2.Hs b/gcc/testsuite/g++.dg/pch/pushpop-2.Hs > > new file mode 100644 > > index 00000000000..797139a3196 > > --- /dev/null > > +++ 
b/gcc/testsuite/g++.dg/pch/pushpop-2.Hs > > @@ -0,0 +1,9 @@ > > +#define π 3 > > +#pragma push_macro ("π") > > +#undef π > > +#define π 4 > > + > > +#define \u03B1 5 > > +#pragma push_macro ("α") > > +#undef α > > +#define α 6 > > diff --git a/gcc/testsuite/gcc.dg/pch/pushpop-2.c b/gcc/testsuite/gcc.dg/pch/pushpop-2.c > > new file mode 100644 > > index 00000000000..61b8430c6d2 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/pch/pushpop-2.c > > @@ -0,0 +1,18 @@ > > +/* { dg-options -std=c11 } */ > > +#include "pushpop-2.hs" > > + > > +#if π != 4 > > +#error π != 4 > > +#endif > > +#pragma pop_macro("\u03C0") > > +#if π != 3 > > +#error π != 3 > > +#endif > > + > > +#if \u03B1 != 6 > > +#error α != 6 > > +#endif > > +_Pragma("pop_macro(\"\\u03B1\")") > > +#if α != 5 > > +#error α != 5 > > +#endif > > diff --git a/gcc/testsuite/gcc.dg/pch/pushpop-2.hs b/gcc/testsuite/gcc.dg/pch/pushpop-2.hs > > new file mode 100644 > > index 00000000000..797139a3196 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/pch/pushpop-2.hs > > @@ -0,0 +1,9 @@ > > +#define π 3 > > +#pragma push_macro ("π") > > +#undef π > > +#define π 4 > > + > > +#define \u03B1 5 > > +#pragma push_macro ("α") > > +#undef α > > +#define α 6
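For anyone skimming the errors.cc part of the patch: cpp_auto_suppress_diagnostics is a save/restore guard around pfile->cb.diagnostic. Below is a minimal standalone sketch of that pattern, not the libcpp code itself. The names reader, counting_handler, auto_suppress, report and run_demo are all hypothetical stand-ins invented for illustration, and the callback signature is simplified compared to the real cpp_callbacks::diagnostic.

```cpp
#include <cstdio>

// Hypothetical stand-in for cpp_reader: it only holds a diagnostic callback
// and a counter of diagnostics that reached the real handler.
struct reader
{
  using diagnostic_fn = bool (*) (reader *, const char *);
  diagnostic_fn diagnostic = nullptr;
  int emitted = 0;
};

// A "real" handler (assumed for this sketch): counts and prints the message.
bool
counting_handler (reader *r, const char *msg)
{
  ++r->emitted;
  std::fprintf (stderr, "diagnostic: %s\n", msg);
  return true;
}

// RAII guard: installs a do-nothing callback on construction and restores
// the saved callback on destruction, however the scope is left.
class auto_suppress
{
public:
  explicit auto_suppress (reader *r)
    : m_reader (r), m_saved (r->diagnostic)
  {
    // A captureless lambda converts implicitly to the function pointer type.
    m_reader->diagnostic = [] (reader *, const char *) { return true; };
  }
  ~auto_suppress () { m_reader->diagnostic = m_saved; }

  auto_suppress (const auto_suppress &) = delete;
  auto_suppress &operator= (const auto_suppress &) = delete;

private:
  reader *const m_reader;
  const reader::diagnostic_fn m_saved;
};

// Report a message through whichever handler is currently installed.
void
report (reader *r, const char *msg)
{
  if (r->diagnostic)
    r->diagnostic (r, msg);
}

// Returns how many diagnostics reached counting_handler.
int
run_demo ()
{
  reader r;
  r.diagnostic = counting_handler;
  report (&r, "before the guard");      // reaches counting_handler
  {
    auto_suppress guard {&r};
    report (&r, "inside the guard");    // swallowed by the no-op callback
  }
  report (&r, "after the guard");       // handler has been restored
  return r.emitted;
}
```

In this sketch only the two calls outside the guarded scope reach the handler. The inner braces play the same role as the block around the _cpp_lex_direct call in lex_identifier_from_string: they keep the suppression window as small as possible.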
Hello- https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642926.html May I please ping this one again? It's the largest remaining gap in UTF-8 support for libcpp that I know of. Thanks! -Lewis
Hello- https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642926.html Ping please? Jakub + Jason, hope you don't mind that I CCed you, I saw you had your attention on extended character identifiers a bit now :). Thanks! -Lewis
Hello- https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642926.html Monthly ping for this one please :). Thanks... -Lewis On Sat, Jul 27, 2024 at 3:09 PM Lewis Hyatt <lhyatt@gmail.com> wrote: > > Hello- > > https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642926.html > > Ping please? Jakub + Jason, hope you don't mind that I CCed you, I saw > you had your attention on extended character identifiers a bit now :). > Thanks! > > -Lewis > > On Fri, Jul 5, 2024 at 4:23 PM Lewis Hyatt <lhyatt@gmail.com> wrote: > > > > Hello- > > > > https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642926.html > > > > May I please ping this one again? It's the largest remaining gap in > > UTF-8 support for libcpp that I know of. Thanks! > > > > -Lewis > > > > On Tue, May 28, 2024 at 7:46 PM Lewis Hyatt <lhyatt@gmail.com> wrote: > > > > > > Hello- > > > > > > May I please ping this one (now for GCC 15)? Thanks! > > > https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642926.html > > > > > > -Lewis > > > > > > On Sat, Feb 10, 2024 at 9:02 AM Lewis Hyatt <lhyatt@gmail.com> wrote: > > > > > > > > Hello- > > > > > > > > https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642926.html > > > > > > > > May I please ping this one? Thanks! > > > > > > > > On Sat, Jan 13, 2024 at 5:12 PM Lewis Hyatt <lhyatt@gmail.com> wrote: > > > > > > > > > > Hello- > > > > > > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109704 > > > > > > > > > > The below patch fixes the issue noted in the PR that extended characters > > > > > cannot appear in the identifier passed to a #pragma push_macro or #pragma > > > > > pop_macro. Bootstrap + regtest all languages on x86-64 Linux. Is it OK for > > > > > GCC 13 please?
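For readers who have not used these pragmas, here is a small sketch of the baseline save/restore behavior that the patch extends. It uses a plain ASCII macro name so it should build on compilers without the fix. The names LIMIT, during_override and after_pop are made up for this example.

```cpp
// Original definition.
#define LIMIT 1

// Save the current definition of LIMIT on the push stack.
#pragma push_macro("LIMIT")

// Temporarily override it.
#undef LIMIT
#define LIMIT 2
constexpr int during_override = LIMIT;

// Restore the definition that was saved by push_macro.
#pragma pop_macro("LIMIT")
constexpr int after_pop = LIMIT;

// Checked at compile time: the override was visible only between the pragmas.
static_assert (during_override == 2, "override is active before pop_macro");
static_assert (after_pop == 1, "pop_macro restores the saved definition");
```

With the patch applied, the same sequence is intended to work when the name in the string contains `$`, a UCN, or UTF-8, and a UCN in one pragma is meant to match the corresponding UTF-8 in the other, as exercised by the new pragma-push-pop-utf8.c test.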
diff --git a/libcpp/charset.cc b/libcpp/charset.cc index 54d7b9e0932..7937df7d78c 100644 --- a/libcpp/charset.cc +++ b/libcpp/charset.cc @@ -2590,19 +2590,6 @@ cpp_interpret_string (cpp_reader *pfile, const cpp_string *from, size_t count, return cpp_interpret_string_1 (pfile, from, count, to, type, NULL, NULL); } -/* A "do nothing" diagnostic-handling callback for use by - cpp_interpret_string_ranges, so that it can temporarily suppress - diagnostic-handling. */ - -static bool -noop_diagnostic_cb (cpp_reader *, enum cpp_diagnostic_level, - enum cpp_warning_reason, rich_location *, - const char *, va_list *) -{ - /* no-op. */ - return true; -} - /* This function mimics the behavior of cpp_interpret_string, but rather than generating a string in the execution character set, *OUT is written to with the source code ranges of the characters @@ -2642,20 +2629,10 @@ cpp_interpret_string_ranges (cpp_reader *pfile, const cpp_string *from, failing, rather than being emitted as a user-visible diagnostic. If an diagnostic does occur, we should see it via the return value of cpp_interpret_string_1. */ - bool (*saved_diagnostic_handler) (cpp_reader *, enum cpp_diagnostic_level, - enum cpp_warning_reason, rich_location *, - const char *, va_list *) - ATTRIBUTE_FPTR_PRINTF(5,0); - - saved_diagnostic_handler = pfile->cb.diagnostic; - pfile->cb.diagnostic = noop_diagnostic_cb; - + cpp_auto_suppress_diagnostics suppress {pfile}; bool result = cpp_interpret_string_1 (pfile, from, count, NULL, type, loc_readers, out); - /* Restore the saved diagnostic-handler. 
*/
-  pfile->cb.diagnostic = saved_diagnostic_handler;
-
   if (!result)
     return "cpp_interpret_string_1 failed";
@@ -2691,17 +2668,11 @@ static unsigned
 count_source_chars (cpp_reader *pfile, cpp_string str, cpp_ttype type)
 {
   cpp_string str2 = { 0, 0 };
-  bool (*saved_diagnostic_handler) (cpp_reader *, enum cpp_diagnostic_level,
-                                    enum cpp_warning_reason, rich_location *,
-                                    const char *, va_list *)
-    ATTRIBUTE_FPTR_PRINTF(5,0);
-  saved_diagnostic_handler = pfile->cb.diagnostic;
-  pfile->cb.diagnostic = noop_diagnostic_cb;
+  cpp_auto_suppress_diagnostics suppress {pfile};
   convert_f save_func = pfile->narrow_cset_desc.func;
   pfile->narrow_cset_desc.func = convert_count_chars;
   bool ret = cpp_interpret_string (pfile, &str, 1, &str2, type);
   pfile->narrow_cset_desc.func = save_func;
-  pfile->cb.diagnostic = saved_diagnostic_handler;
   if (ret)
     {
       if (str2.text != str.text)
diff --git a/libcpp/directives.cc b/libcpp/directives.cc
index 479f8c716e8..019e4009dc9 100644
--- a/libcpp/directives.cc
+++ b/libcpp/directives.cc
@@ -137,7 +137,8 @@ static cpp_macro **find_answer (cpp_hashnode *, const cpp_macro *);
 static void handle_assertion (cpp_reader *, const char *, int);
 static void do_pragma_push_macro (cpp_reader *);
 static void do_pragma_pop_macro (cpp_reader *);
-static void cpp_pop_definition (cpp_reader *, struct def_pragma_macro *);
+static void cpp_pop_definition (cpp_reader *, def_pragma_macro *,
+                                cpp_hashnode *);
 
 /* This is the table of directive handlers.  All extensions other than
    #warning, #include_next, and #import are deprecated.  The name is
@@ -1595,55 +1596,95 @@ do_pragma_once (cpp_reader *pfile)
   _cpp_mark_file_once_only (pfile, pfile->buffer->file);
 }
 
-/* Handle #pragma push_macro(STRING).  */
-static void
-do_pragma_push_macro (cpp_reader *pfile)
+/* Helper for #pragma {push,pop}_macro.  Destringize STR and
+   lex it into an identifier, returning the hash node for it.  */
+
+static cpp_hashnode *
+lex_identifier_from_string (cpp_reader *pfile, cpp_string str)
 {
+  auto src = (const uchar *) memchr (str.text, '"', str.len);
+  gcc_checking_assert (src);
+  ++src;
+  const auto limit = str.text + str.len - 1;
+  gcc_checking_assert (*limit == '"' && limit >= src);
+  const auto ident = XALLOCAVEC (uchar, limit - src + 1);
+  auto dest = ident;
+  while (src != limit)
+    {
+      /* We know there is a character following the backslash.  */
+      if (*src == '\\' && (src[1] == '\\' || src[1] == '"'))
+        src++;
+      *dest++ = *src++;
+    }
+
+  /* We reserved a spot for the newline with the + 1 when allocating IDENT.
+     Push a buffer containing the identifier to lex.  */
+  *dest = '\n';
+  cpp_push_buffer (pfile, ident, dest - ident, true);
+  _cpp_clean_line (pfile);
+  pfile->cur_token = _cpp_temp_token (pfile);
+  cpp_token *tok;
+  {
+    /* Suppress diagnostics during lexing so that we silently ignore invalid
+       input, as seems to be the common practice for this pragma.  */
+    cpp_auto_suppress_diagnostics suppress {pfile};
+    tok = _cpp_lex_direct (pfile);
+  }
+
   cpp_hashnode *node;
-  size_t defnlen;
-  const uchar *defn = NULL;
-  char *macroname, *dest;
-  const char *limit, *src;
-  const cpp_token *txt;
-  struct def_pragma_macro *c;
+  if (tok->type != CPP_NAME || pfile->buffer->cur != pfile->buffer->rlimit)
+    node = nullptr;
+  else
+    node = tok->val.node.node;
 
-  txt = get__Pragma_string (pfile);
-  if (!txt)
+  _cpp_pop_buffer (pfile);
+  return node;
+}
+
+/* Common processing for #pragma {push,pop}_macro.  */
+
+static cpp_hashnode *
+push_pop_macro_common (cpp_reader *pfile, const char *type)
+{
+  const cpp_token *const txt = get__Pragma_string (pfile);
+  ++pfile->keep_tokens;
+  cpp_hashnode *node;
+  if (txt)
     {
-      location_t src_loc = pfile->cur_token[-1].src_loc;
-      cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0,
-                           "invalid #pragma push_macro directive");
       check_eol (pfile, false);
       skip_rest_of_line (pfile);
-      return;
+      node = lex_identifier_from_string (pfile, txt->val.str);
     }
-  dest = macroname = (char *) alloca (txt->val.str.len + 2);
-  src = (const char *) (txt->val.str.text + 1 + (txt->val.str.text[0] == 'L'));
-  limit = (const char *) (txt->val.str.text + txt->val.str.len - 1);
-  while (src < limit)
+  else
     {
-      /* We know there is a character following the backslash.  */
-      if (*src == '\\' && (src[1] == '\\' || src[1] == '"'))
-        src++;
-      *dest++ = *src++;
+      node = nullptr;
+      location_t src_loc = pfile->cur_token[-1].src_loc;
+      cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0,
+                           "invalid #pragma %s_macro directive", type);
+      skip_rest_of_line (pfile);
     }
-  *dest = 0;
-  check_eol (pfile, false);
-  skip_rest_of_line (pfile);
-  c = XNEW (struct def_pragma_macro);
-  memset (c, 0, sizeof (struct def_pragma_macro));
-  c->name = XNEWVAR (char, strlen (macroname) + 1);
-  strcpy (c->name, macroname);
+  --pfile->keep_tokens;
+  return node;
+}
+
+/* Handle #pragma push_macro(STRING).  */
+static void
+do_pragma_push_macro (cpp_reader *pfile)
+{
+  const auto node = push_pop_macro_common (pfile, "push");
+  if (!node)
+    return;
+  const auto c = XCNEW (def_pragma_macro);
+  c->name = xstrdup ((const char *) NODE_NAME (node));
   c->next = pfile->pushed_macros;
-  node = _cpp_lex_identifier (pfile, c->name);
   if (node->type == NT_VOID)
     c->is_undef = 1;
   else if (node->type == NT_BUILTIN_MACRO)
     c->is_builtin = 1;
   else
     {
-      defn = cpp_macro_definition (pfile, node);
-      defnlen = ustrlen (defn);
+      const auto defn = cpp_macro_definition (pfile, node);
+      const size_t defnlen = ustrlen (defn);
       c->definition = XNEWVEC (uchar, defnlen + 2);
       c->definition[defnlen] = '\n';
       c->definition[defnlen + 1] = 0;
@@ -1660,50 +1701,24 @@ do_pragma_push_macro (cpp_reader *pfile)
 static void
 do_pragma_pop_macro (cpp_reader *pfile)
 {
-  char *macroname, *dest;
-  const char *limit, *src;
-  const cpp_token *txt;
-  struct def_pragma_macro *l = NULL, *c = pfile->pushed_macros;
-  txt = get__Pragma_string (pfile);
-  if (!txt)
-    {
-      location_t src_loc = pfile->cur_token[-1].src_loc;
-      cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0,
-                           "invalid #pragma pop_macro directive");
-      check_eol (pfile, false);
-      skip_rest_of_line (pfile);
-      return;
-    }
-  dest = macroname = (char *) alloca (txt->val.str.len + 2);
-  src = (const char *) (txt->val.str.text + 1 + (txt->val.str.text[0] == 'L'));
-  limit = (const char *) (txt->val.str.text + txt->val.str.len - 1);
-  while (src < limit)
-    {
-      /* We know there is a character following the backslash.  */
-      if (*src == '\\' && (src[1] == '\\' || src[1] == '"'))
-        src++;
-      *dest++ = *src++;
-    }
-  *dest = 0;
-  check_eol (pfile, false);
-  skip_rest_of_line (pfile);
-
-  while (c != NULL)
+  const auto node = push_pop_macro_common (pfile, "pop");
+  if (!node)
+    return;
+  for (def_pragma_macro *c = pfile->pushed_macros, *l = nullptr; c; c = c->next)
     {
-      if (!strcmp (c->name, macroname))
+      if (!strcmp (c->name, (const char *) NODE_NAME (node)))
         {
           if (!l)
             pfile->pushed_macros = c->next;
           else
             l->next = c->next;
-          cpp_pop_definition (pfile, c);
+          cpp_pop_definition (pfile, c, node);
           free (c->definition);
           free (c->name);
           free (c);
           break;
         }
       l = c;
-      c = c->next;
     }
 }
 
@@ -2607,12 +2622,8 @@ cpp_undef (cpp_reader *pfile, const char *macro)
 /* Replace a previous definition DEF of the macro STR.  If DEF is NULL,
    or first element is zero, then the macro should be undefined.  */
 static void
-cpp_pop_definition (cpp_reader *pfile, struct def_pragma_macro *c)
+cpp_pop_definition (cpp_reader *pfile, def_pragma_macro *c, cpp_hashnode *node)
 {
-  cpp_hashnode *node = _cpp_lex_identifier (pfile, c->name);
-  if (node == NULL)
-    return;
-
   if (pfile->cb.before_define)
     pfile->cb.before_define (pfile);
 
@@ -2634,29 +2645,23 @@ cpp_pop_definition (cpp_reader *pfile, struct def_pragma_macro *c)
     }
 
   {
-    size_t namelen;
-    const uchar *dn;
-    cpp_hashnode *h = NULL;
-    cpp_buffer *nbuf;
-
-    namelen = ustrcspn (c->definition, "( \n");
-    h = cpp_lookup (pfile, c->definition, namelen);
-    dn = c->definition + namelen;
-
-    nbuf = cpp_push_buffer (pfile, dn, ustrchr (dn, '\n') - dn, true);
+    const auto namelen = ustrcspn (c->definition, "( \n");
+    const auto dn = c->definition + namelen;
+    const auto nbuf = cpp_push_buffer (pfile, dn, ustrchr (dn, '\n') - dn,
+                                       true);
     if (nbuf != NULL)
       {
         _cpp_clean_line (pfile);
         nbuf->sysp = 1;
-        if (!_cpp_create_definition (pfile, h, 0))
+        if (!_cpp_create_definition (pfile, node, 0))
           abort ();
         _cpp_pop_buffer (pfile);
       }
     else
       abort ();
-    h->value.macro->line = c->line;
-    h->value.macro->syshdr = c->syshdr;
-    h->value.macro->used = c->used;
+    node->value.macro->line = c->line;
+    node->value.macro->syshdr = c->syshdr;
+    node->value.macro->used = c->used;
   }
 }
diff --git a/libcpp/errors.cc b/libcpp/errors.cc
index 295496df7ed..3228dcbe7f6 100644
--- a/libcpp/errors.cc
+++ b/libcpp/errors.cc
@@ -350,3 +350,19 @@ cpp_errno_filename (cpp_reader *pfile, enum cpp_diagnostic_level level,
   return cpp_error_at (pfile, level, loc, "%s: %s", filename,
                        xstrerror (errno));
 }
+
+cpp_auto_suppress_diagnostics::cpp_auto_suppress_diagnostics (cpp_reader *pfile)
+  : m_pfile (pfile), m_cb (pfile->cb.diagnostic)
+{
+  m_pfile->cb.diagnostic
+    = [] (cpp_reader *, cpp_diagnostic_level, cpp_warning_reason,
+          rich_location *, const char *, va_list *)
+    {
+      return true;
+    };
+}
+
+cpp_auto_suppress_diagnostics::~cpp_auto_suppress_diagnostics ()
+{
+  m_pfile->cb.diagnostic = m_cb;
+}
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 5746aac9ea4..50705e3377a 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -1638,4 +1638,17 @@ enum cpp_xid_property {
 
 unsigned int cpp_check_xid_property (cppchar_t c);
 
+/* In errors.cc */
+
+/* RAII class to suppress CPP diagnostics in the current scope.  */
+class cpp_auto_suppress_diagnostics
+{
+ public:
+  explicit cpp_auto_suppress_diagnostics (cpp_reader *pfile);
+  ~cpp_auto_suppress_diagnostics ();
+ private:
+  cpp_reader *const m_pfile;
+  const decltype (cpp_callbacks::diagnostic) m_cb;
+};
+
 #endif /* ! LIBCPP_CPPLIB_H */
diff --git a/libcpp/internal.h b/libcpp/internal.h
index a20215c5709..6221ef0d1e7 100644
--- a/libcpp/internal.h
+++ b/libcpp/internal.h
@@ -753,7 +753,6 @@ extern cpp_token *_cpp_lex_direct (cpp_reader *);
 extern unsigned char *_cpp_spell_ident_ucns (unsigned char *, cpp_hashnode *);
 extern int _cpp_equiv_tokens (const cpp_token *, const cpp_token *);
 extern void _cpp_init_tokenrun (tokenrun *, unsigned int);
-extern cpp_hashnode *_cpp_lex_identifier (cpp_reader *, const char *);
 extern int _cpp_remaining_tokens_num_in_context (cpp_context *);
 extern void _cpp_init_lexer (void);
 static inline void *_cpp_reserve_room (cpp_reader *pfile, size_t have,
diff --git a/libcpp/lex.cc b/libcpp/lex.cc
index 5aa379980cf..ba97377417b 100644
--- a/libcpp/lex.cc
+++ b/libcpp/lex.cc
@@ -2204,39 +2204,6 @@ identifier_diagnostics_on_lex (cpp_reader *pfile, cpp_hashnode *node)
                NODE_NAME (node));
 }
 
-/* Helper function to get the cpp_hashnode of the identifier BASE.  */
-static cpp_hashnode *
-lex_identifier_intern (cpp_reader *pfile, const uchar *base)
-{
-  cpp_hashnode *result;
-  const uchar *cur;
-  unsigned int len;
-  unsigned int hash = HT_HASHSTEP (0, *base);
-
-  cur = base + 1;
-  while (ISIDNUM (*cur))
-    {
-      hash = HT_HASHSTEP (hash, *cur);
-      cur++;
-    }
-  len = cur - base;
-  hash = HT_HASHFINISH (hash, len);
-  result = CPP_HASHNODE (ht_lookup_with_hash (pfile->hash_table,
-                                              base, len, hash, HT_ALLOC));
-  identifier_diagnostics_on_lex (pfile, result);
-  return result;
-}
-
-/* Get the cpp_hashnode of an identifier specified by NAME in
-   the current cpp_reader object.  If none is found, NULL is returned.  */
-cpp_hashnode *
-_cpp_lex_identifier (cpp_reader *pfile, const char *name)
-{
-  cpp_hashnode *result;
-  result = lex_identifier_intern (pfile, (uchar *) name);
-  return result;
-}
-
 /* Lex an identifier starting at BASE.  BUFFER->CUR is expected to point
    one past the first character at BASE, which may be a (possibly multi-byte)
    character if STARTS_UCN is true.  */
diff --git a/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c b/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c
new file mode 100644
index 00000000000..c8665960e30
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c
@@ -0,0 +1,203 @@
+/* { dg-do preprocess } */
+/* { dg-options "-std=c11 -pedantic" { target c } } */
+/* { dg-options "-std=c++11 -pedantic" { target c++ } } */
+/* { dg-additional-options "-Wall" } */
+
+/* PR preprocessor/109704 */
+
+/* Verify basic operations for different extended identifiers... */
+
+/* ...dollar sign. */
+#define $x 1
+#pragma push_macro("$x")
+#undef $x
+#define $x 0
+#pragma pop_macro("$x")
+#if !$x
+#error $x
+#endif
+#define $x 1
+_Pragma("push_macro(\"$x\")")
+#undef $x
+#define $x 0
+_Pragma("pop_macro(\"$x\")")
+#if !$x
+#error $x
+#endif
+#define x$ 1
+#pragma push_macro("x$")
+#undef x$
+#define x$ 0
+#pragma pop_macro("x$")
+#if !x$
+#error x$
+#endif
+#define x$ 1
+_Pragma("push_macro(\"x$\")")
+#undef x$
+#define x$ 0
+_Pragma("pop_macro(\"x$\")")
+#if !x$
+#error x$
+#endif
+
+/* ...UCN. */
+#define \u03B1x 1
+#pragma push_macro("\u03B1x")
+#undef \u03B1x
+#define \u03B1x 0
+#pragma pop_macro("\u03B1x")
+#if !\u03B1x
+#error \u03B1x
+#endif
+#define \u03B1x 1
+_Pragma("push_macro(\"\\u03B1x\")")
+#undef \u03B1x
+#define \u03B1x 0
+_Pragma("pop_macro(\"\\u03B1x\")")
+#if !\u03B1x
+#error \u03B1x
+#endif
+#define x\u03B1 1
+#pragma push_macro("x\u03B1")
+#undef x\u03B1
+#define x\u03B1 0
+#pragma pop_macro("x\u03B1")
+#if !x\u03B1
+#error x\u03B1
+#endif
+#define x\u03B1 1
+_Pragma("push_macro(\"x\\u03B1\")")
+#undef x\u03B1
+#define x\u03B1 0
+_Pragma("pop_macro(\"x\\u03B1\")")
+#if !x\u03B1
+#error x\u03B1
+#endif
+
+/* ...UTF-8. */
+#define πx 1
+#pragma push_macro("πx")
+#undef πx
+#define πx 0
+#pragma pop_macro("πx")
+#if !πx
+#error πx
+#endif
+#define πx 1
+_Pragma("push_macro(\"πx\")")
+#undef πx
+#define πx 0
+_Pragma("pop_macro(\"πx\")")
+#if !πx
+#error πx
+#endif
+#define xπ 1
+#pragma push_macro("xπ")
+#undef xπ
+#define xπ 0
+#pragma pop_macro("xπ")
+#if !xπ
+#error xπ
+#endif
+#define xπ 1
+_Pragma("push_macro(\"xπ\")")
+#undef xπ
+#define xπ 0
+_Pragma("pop_macro(\"xπ\")")
+#if !xπ
+#error xπ
+#endif
+
+/* Verify UCN and UTF-8 can be intermixed. */
+#define ħ_0 1
+#pragma push_macro("ħ_0")
+#undef ħ_0
+#define ħ_0 0
+#if ħ_0
+#error ħ_0 ħ_0 \U00000127_0
+#endif
+#pragma pop_macro("\U00000127_0")
+#if !ħ_0
+#error ħ_0 ħ_0 \U00000127_0
+#endif
+#define ħ_1 1
+#pragma push_macro("\U00000127_1")
+#undef ħ_1
+#define ħ_1 0
+#if ħ_1
+#error ħ_1 \U00000127_1 ħ_1
+#endif
+#pragma pop_macro("ħ_1")
+#if !ħ_1
+#error ħ_1 \U00000127_1 ħ_1
+#endif
+#define ħ_2 1
+#pragma push_macro("\U00000127_2")
+#undef ħ_2
+#define ħ_2 0
+#if ħ_2
+#error ħ_2 \U00000127_2 \U00000127_2
+#endif
+#pragma pop_macro("\U00000127_2")
+#if !ħ_2
+#error ħ_2 \U00000127_2 \U00000127_2
+#endif
+#define \U00000127_3 1
+#pragma push_macro("ħ_3")
+#undef \U00000127_3
+#define \U00000127_3 0
+#if \U00000127_3
+#error \U00000127_3 ħ_3 ħ_3
+#endif
+#pragma pop_macro("ħ_3")
+#if !\U00000127_3
+#error \U00000127_3 ħ_3 ħ_3
+#endif
+#define \U00000127_4 1
+#pragma push_macro("ħ_4")
+#undef \U00000127_4
+#define \U00000127_4 0
+#if \U00000127_4
+#error \U00000127_4 ħ_4 \U00000127_4
+#endif
+#pragma pop_macro("\U00000127_4")
+#if !\U00000127_4
+#error \U00000127_4 ħ_4 \U00000127_4
+#endif
+#define \U00000127_5 1
+#pragma push_macro("\U00000127_5")
+#undef \U00000127_5
+#define \U00000127_5 0
+#if \U00000127_5
+#error \U00000127_5 \U00000127_5 ħ_5
+#endif
+#pragma pop_macro("ħ_5")
+#if !\U00000127_5
+#error \U00000127_5 \U00000127_5 ħ_5
+#endif
+
+/* Verify invalid input produces no diagnostics. */
+#pragma push_macro("") /* { dg-bogus "." } */
+#pragma push_macro("\u") /* { dg-bogus "." } */
+#pragma push_macro("\u0000") /* { dg-bogus "." } */
+#pragma push_macro("not a single identifier") /* { dg-bogus "." } */
+#pragma push_macro("invalid╬character") /* { dg-bogus "." } */
+#pragma push_macro("\u0300invalid_start") /* { dg-bogus "." } */
+#pragma push_macro("#include <cstdlib>") /* { dg-bogus "." } */
+
+/* Verify end-of-line diagnostics for valid and invalid input. */
+#pragma push_macro("ö") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("\u") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("\u0000") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("not a single identifier") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("invalid╬character") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("\u0300invalid_start") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("#include <cstdlib>") oops /* { dg-warning "extra tokens" } */
+
+/* Verify expected diagnostics. */
+#pragma push_macro() /* { dg-error {invalid #pragma push_macro} } */
+#pragma pop_macro() /* { dg-error {invalid #pragma pop_macro} } */
+_Pragma("push_macro(0)") /* { dg-error {invalid #pragma push_macro} } */
+_Pragma("pop_macro(\"oops\"") /* { dg-error {invalid #pragma pop_macro} } */
diff --git a/gcc/testsuite/g++.dg/pch/pushpop-2.C b/gcc/testsuite/g++.dg/pch/pushpop-2.C
new file mode 100644
index 00000000000..84886aea985
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pch/pushpop-2.C
@@ -0,0 +1,18 @@
+/* { dg-options -std=c++11 } */
+#include "pushpop-2.Hs"
+
+#if π != 4
+#error π != 4
+#endif
+#pragma pop_macro("\u03C0")
+#if π != 3
+#error π != 3
+#endif
+
+#if \u03B1 != 6
+#error α != 6
+#endif
+_Pragma("pop_macro(\"\\u03B1\")")
+#if α != 5
+#error α != 5
+#endif
diff --git a/gcc/testsuite/g++.dg/pch/pushpop-2.Hs b/gcc/testsuite/g++.dg/pch/pushpop-2.Hs
new file mode 100644
index 00000000000..797139a3196
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pch/pushpop-2.Hs
@@ -0,0 +1,9 @@
+#define π 3
+#pragma push_macro ("π")
+#undef π
+#define π 4
+
+#define \u03B1 5
+#pragma push_macro ("α")
+#undef α
+#define α 6
diff --git a/gcc/testsuite/gcc.dg/pch/pushpop-2.c b/gcc/testsuite/gcc.dg/pch/pushpop-2.c
new file mode 100644
index 00000000000..61b8430c6d2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pch/pushpop-2.c
@@ -0,0 +1,18 @@
+/* { dg-options -std=c11 } */
+#include "pushpop-2.hs"
+
+#if π != 4
+#error π != 4
+#endif
+#pragma pop_macro("\u03C0")
+#if π != 3
+#error π != 3
+#endif
+
+#if \u03B1 != 6
+#error α != 6
+#endif
+_Pragma("pop_macro(\"\\u03B1\")")
+#if α != 5
+#error α != 5
+#endif
diff --git a/gcc/testsuite/gcc.dg/pch/pushpop-2.hs b/gcc/testsuite/gcc.dg/pch/pushpop-2.hs
new file mode 100644
index 00000000000..797139a3196
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pch/pushpop-2.hs
@@ -0,0 +1,9 @@
+#define π 3
+#pragma push_macro ("π")
+#undef π
+#define π 4
+
+#define \u03B1 5
+#pragma push_macro ("α")
+#undef α
+#define α 6
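
For anyone reviewing the cpp_auto_suppress_diagnostics part of the patch, the scoped save/replace/restore idea can be tried out in isolation. The following is only a minimal stand-alone sketch, not libcpp code: toy_reader, report, toy_suppress_diagnostics and run_demo are hypothetical names made up for illustration, and the callback signature is simplified to a single message string.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for cpp_reader: it only holds a diagnostic callback
// and a counter of diagnostics that reached the real handler.
struct toy_reader
{
  bool (*diagnostic) (toy_reader *, const char *msg);
  int delivered;
};

// Hypothetical default handler: count the message and print it.
static bool
report (toy_reader *r, const char *msg)
{
  ++r->delivered;
  std::fprintf (stderr, "diagnostic: %s\n", msg);
  return true;
}

// RAII guard modelled loosely on cpp_auto_suppress_diagnostics: the
// constructor saves the current callback and installs a no-op, and the
// destructor puts the saved callback back.
class toy_suppress_diagnostics
{
public:
  explicit toy_suppress_diagnostics (toy_reader *r)
    : m_reader (r), m_saved (r->diagnostic)
  {
    // A captureless lambda converts implicitly to a plain function pointer.
    m_reader->diagnostic = [] (toy_reader *, const char *) { return true; };
  }
  ~toy_suppress_diagnostics () { m_reader->diagnostic = m_saved; }
  toy_suppress_diagnostics (const toy_suppress_diagnostics &) = delete;
  toy_suppress_diagnostics &operator= (const toy_suppress_diagnostics &) = delete;

private:
  toy_reader *const m_reader;
  bool (*const m_saved) (toy_reader *, const char *);
};

// Returns the number of diagnostics that were actually delivered.
static int
run_demo ()
{
  toy_reader r = { report, 0 };
  r.diagnostic (&r, "before the guard");      // delivered
  {
    toy_suppress_diagnostics suppress {&r};
    r.diagnostic (&r, "inside the guard");    // swallowed by the no-op
  }
  r.diagnostic (&r, "after the guard");       // delivered again
  assert (r.diagnostic == report);            // handler was restored
  return r.delivered;                         // 2
}
```

Because the restore happens in the destructor, it runs on every exit from the scope, which is what lets count_source_chars drop its explicit save and restore of pfile->cb.diagnostic in the hunk above.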