From patchwork Sat Jul 20 18:31:25 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Marek Polacek
X-Patchwork-Id: 1962799
From: Marek Polacek
To: Jason Merrill, GCC Patches
Subject: [PATCH] c++: fix wrong ambiguity resolution [PR29834]
Date: Sat, 20 Jul 2024 14:31:25 -0400
Message-ID: <20240720183125.26575-1-polacek@redhat.com>
List-Id: Gcc-patches mailing list

[ Entering the contest to fix the oldest PR in this cycle. ]

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
This 18-year-old PR reports that we parse certain comma expressions
as a declaration rather than statement when the statement begins with
a functional-style cast expression.  Consider

  int(x), 0;

which does not declare x--it only casts x to int--, whereas

  int(x), (y);

declares x and y.

We need some kind of look-ahead to decide how we should disambiguate
the construct, because cp_parser_init_declarator commits eagerly once
it has seen "int(x)", and then it's too late to recover.  This patch
makes us try to parse the code as a sequence of declarators; if that
fails, we are likely looking at a statement.
That's a simple idea, but it's complicated by code like

  void (*p)(void *)(fun);

which initializes a pointer-to-function, or

  int(x), (x) + 1;

which is an expression statement, but the second (x) is parsed as a valid
declarator, only the + after reveals that the whole thing is an expression.

You can have things like

  int(**p)

which by itself doesn't tell you much.  You can have

  int(*q)(void*)

which looks like it starts with a functional-style cast, but it is not a
cast.  The simple

  int(x) = 42;

has an initializer so it declares x; it is not an assignment.  But then,

  int(d) __attribute__(());

does not have an initializer, but the attribute makes it a declaration.

        PR c++/29834
        PR c++/54905

gcc/cp/ChangeLog:

        * parser.cc (cp_parser_lambda_introducer): Use
        cp_parser_next_token_starts_initializer_p.
        (cp_parser_simple_declaration): Add look-ahead to decide if we're
        looking at a declaration or statement.
        (cp_parser_next_token_starts_initializer_p): New.

gcc/testsuite/ChangeLog:

        * g++.dg/parse/ambig15.C: New test.
        * g++.dg/parse/ambig16.C: New test.
---
 gcc/cp/parser.cc                     | 73 ++++++++++++++++++++++--
 gcc/testsuite/g++.dg/parse/ambig15.C | 83 ++++++++++++++++++++++++++++
 gcc/testsuite/g++.dg/parse/ambig16.C | 18 ++++++
 3 files changed, 168 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/parse/ambig15.C
 create mode 100644 gcc/testsuite/g++.dg/parse/ambig16.C


base-commit: 493c55578fe00f5f4a7534b8f5cb5213f86f4d01

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 1fa0780944b..797cfc3204e 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -2947,6 +2947,8 @@ static bool cp_parser_next_token_ends_template_argument_p
   (cp_parser *);
 static bool cp_parser_nth_token_starts_template_argument_list_p
   (cp_parser *, size_t);
+static bool cp_parser_next_token_starts_initializer_p
+  (cp_parser *);
 static enum tag_types cp_parser_token_is_class_key
   (cp_token *);
 static enum tag_types cp_parser_token_is_type_parameter_key
@@ -11663,9 +11665,7 @@ cp_parser_lambda_introducer (cp_parser* parser, tree lambda_expr)
         }
 
       /* Find the initializer for this capture.  */
-      if (cp_lexer_next_token_is (parser->lexer, CPP_EQ)
-          || cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN)
-          || cp_lexer_next_token_is (parser->lexer, CPP_OPEN_BRACE))
+      if (cp_parser_next_token_starts_initializer_p (parser))
         {
           /* An explicit initializer exists.  */
           if (cxx_dialect < cxx14)
@@ -11747,9 +11747,7 @@ cp_parser_lambda_introducer (cp_parser* parser, tree lambda_expr)
               /* If what follows is an initializer, the second '...' is
                  invalid.  But for cases like [...xs...], the first one
                  is invalid.  */
-              if (cp_lexer_next_token_is (parser->lexer, CPP_EQ)
-                  || cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN)
-                  || cp_lexer_next_token_is (parser->lexer, CPP_OPEN_BRACE))
+              if (cp_parser_next_token_starts_initializer_p (parser))
                 ellipsis_loc = loc;
               error_at (ellipsis_loc, "too many %<...%> in lambda capture");
               continue;
@@ -16047,6 +16045,58 @@ cp_parser_simple_declaration (cp_parser* parser,
       else
         break;
 
+  /* If we are still uncommitted, we're probably looking at something like
+     T(x), which can be a declaration but does not have to be, depending
+     on what comes after.  Consider
+       int(x), 0;
+     which is _not_ a declaration of x, it's a functional cast, and
+       int(x), (y);
+     which declares x and y.  We need some kind of look-ahead to decide,
+     cp_parser_init_declarator below will commit eagerly once it has seen
+     "int(x)".  So we try to parse this as a sequence of declarators; if
+     that fails, we are likely looking at a statement.  (We could avoid
+     all of this if there is no non-nested comma.)  */
+  if (cp_parser_uncommitted_to_tentative_parse_p (parser)
+      && cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN))
+    {
+      bool all_ok = true;
+      cp_lexer_save_tokens (parser->lexer);
+      /* Avoid committing to outer tentative parse.  This is here to parse
+         "void (*p)(void *);" correctly.  */
+      tentative_firewall firewall (parser);
+      while (cp_lexer_next_token_is_not (parser->lexer, CPP_SEMICOLON))
+        {
+          /* Try to parse what follows as a declarator.  */
+          cp_declarator *d
+            = cp_parser_declarator (parser, CP_PARSER_DECLARATOR_NAMED,
+                                    CP_PARSER_FLAGS_NONE,
+                                    /*ctor_dtor_or_conv_p=*/nullptr,
+                                    /*parenthesized_p=*/nullptr,
+                                    /*member_p=*/false,
+                                    /*friend_p=*/false,
+                                    /*static_p=*/false);
+          if (cp_parser_error_occurred (parser) || d == cp_error_declarator)
+            {
+              all_ok = false;
+              break;
+            }
+          /* If this was not a function-style cast, go ahead with a
+             declaration.  */
+          if (d->kind == cdk_function
+              /* A declarator followed by an initializer makes this an
+                 init-declarator.  */
+              || cp_parser_next_token_starts_initializer_p (parser)
+              || cp_next_tokens_can_be_attribute_p (parser))
+            break;
+          if (cp_lexer_next_token_is (parser->lexer, CPP_COMMA))
+            cp_lexer_consume_token (parser->lexer);
+        }
+      cp_lexer_rollback_tokens (parser->lexer);
+      if (!all_ok)
+        /* Not a declaration.  Bail and parse as a statement instead.  */
+        goto done;
+    }
+
   tree last_type;
   bool auto_specifier_p;
   /* NULL_TREE if both variable and function declaration are allowed,
@@ -34936,6 +34986,17 @@ cp_parser_nth_token_starts_template_argument_list_p (cp_parser * parser,
   return false;
 }
 
+/* Returns true if the next token can start an initializer; that is, it is
+   '=', '(', or '{'.  */
+
+static bool
+cp_parser_next_token_starts_initializer_p (cp_parser *parser)
+{
+  return (cp_lexer_next_token_is (parser->lexer, CPP_EQ)
+          || cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN)
+          || cp_lexer_next_token_is (parser->lexer, CPP_OPEN_BRACE));
+}
+
 /* Returns the kind of tag indicated by TOKEN, if it is a class-key,
    or none_type otherwise.  */
 
diff --git a/gcc/testsuite/g++.dg/parse/ambig15.C b/gcc/testsuite/g++.dg/parse/ambig15.C
new file mode 100644
index 00000000000..d086b2b6bac
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/ambig15.C
@@ -0,0 +1,83 @@
+// PR c++/29834
+// { dg-do compile { target c++11 } }
+
+void fun (void *);
+using T = void(void*);
+
+struct A {
+  int foo ();
+};
+
+void
+f1 ()
+{
+  int(x), a, b, c, d, e, f, g, h, etc;
+  int(x), a, b, c, d, e, f, g, h, etc, (new int);
+}
+
+int
+f2 (int x, int *p, int **pp)
+{
+  // Statements.
+  int(x), 0;
+  (int(x), 0);
+  (int(x)), 0;
+  int(*p), 0;
+  int(**pp), !**pp;
+  int(x), x + 1;
+  int(x), (x) + 1;
+  int(x), x++;
+  int(x), --x;
+  int(p[1]), 0;
+
+  // Declarations.
+  int(a), (b), (c);
+  a = b = c = 42;
+  int(g) = 0, (h) = 0;
+  int(k), l __attribute__((unused));
+  int(&r) = x;
+  void (*p1)(void*) = fun;
+  void (*p2)(void*);
+  void (*p3)(void*) __attribute__((unused)) = fun;
+  void (**p4)(void*) __attribute__((unused));
+  void (*p5)(void*) __attribute__((unused));
+  void (p6)(void*) __attribute__((unused));
+  void (&p7)(void*) __attribute__((unused)) = fun;
+  void (*p8)(void*)(fun);
+  void (*p9)(void*){fun};
+  int (p10)(int), m;
+  int(d) __attribute__(());
+  int(e) __attribute__(()), f;
+  int (A::*foo)() = &A::foo;
+  int (A::*foo2)();
+  int (*A::*foo3)();
+  int(j[1]);
+  T(fun2);
+
+  return a + b + c + r;
+}
+
+struct Doh {
+  Doh(int) {}
+};
+
+
+int
+f3 (int x)
+{
+  Doh(x), ++x;
+  return Doh(x), x;
+}
+
+void
+bad (int x, int y, int z)
+{
+  int(x) = 1;                   // { dg-error "shadows a parameter" }
+  int(y)(1);                    // { dg-error "shadows a parameter" }
+  int(z){1};                    // { dg-error "shadows a parameter" }
+  void (*p)(void*), 0;          // { dg-error "" }
+  int(a),;                      // { dg-error "expected" }
+  int(x) __attribute__(()), 0;  // { dg-error "" }
+  int(i), i;                    // { dg-error "redeclaration" }
+  T(fun), ++x;                  // { dg-error "invalid cast" }
+}
diff --git a/gcc/testsuite/g++.dg/parse/ambig16.C b/gcc/testsuite/g++.dg/parse/ambig16.C
new file mode 100644
index 00000000000..51bc16dffcf
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/ambig16.C
@@ -0,0 +1,18 @@
+// PR c++/54905
+// { dg-do compile }
+
+struct F
+{
+};
+
+F f;
+struct A
+{
+  A(F& s);
+};
+
+void
+foo ()
+{
+  A(f), 1;
+}
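
A toy model of the look-ahead described in the commit message may help when
reading the cp_parser_simple_declaration hunk.  The sketch below is not GCC
code: classify, Parser, DeclKind, and the tiny declarator grammar are all
hypothetical names made up for illustration, and it covers only a subset of
the ambig15.C cases (no pointers to members, no typedef names such as T).
It takes the text that follows the type specifier, tries to parse a
comma-separated run of declarators, and reports a statement as soon as one
of them fails to parse.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy model only; none of these names exist in GCC.
enum class Kind { Declaration, Statement };
enum class DeclKind { Error, Plain, Function };

// Split the input into identifier/number tokens and one-character
// punctuation tokens ("++" simply becomes two "+" tokens).
static std::vector<std::string> tokenize(const std::string &s) {
  std::vector<std::string> toks;
  std::size_t i = 0;
  while (i < s.size()) {
    unsigned char c = static_cast<unsigned char>(s[i]);
    if (std::isspace(c)) { ++i; continue; }
    std::size_t j = i + 1;
    if (std::isalnum(c) || c == '_')
      while (j < s.size()
             && (std::isalnum(static_cast<unsigned char>(s[j])) || s[j] == '_'))
        ++j;
    toks.push_back(s.substr(i, j - i));
    i = j;
  }
  return toks;
}

static bool is_type_name(const std::string &t) {
  return t == "void" || t == "int" || t == "char";
}

static bool is_identifier(const std::string &t) {
  unsigned char c = static_cast<unsigned char>(t[0]);
  return (std::isalpha(c) || c == '_') && !is_type_name(t)
         && t != "__attribute__";
}

struct Parser {
  std::vector<std::string> toks;
  std::size_t pos = 0;
  explicit Parser(std::vector<std::string> t) : toks(std::move(t)) {}

  // Past the end we pretend to see ';' so that every loop terminates.
  std::string peek(std::size_t ahead = 0) const {
    return pos + ahead < toks.size() ? toks[pos + ahead] : ";";
  }

  // Skip a bracketed group; pos must be at the opening token.
  bool skip_balanced(const std::string &open, const std::string &close) {
    int depth = 0;
    for (; pos < toks.size(); ++pos) {
      if (toks[pos] == open) ++depth;
      else if (toks[pos] == close && --depth == 0) { ++pos; return true; }
    }
    return false;
  }

  // declarator := ('*' | '&')* (identifier | '(' declarator ')') suffix*
  DeclKind declarator() {
    while (peek() == "*" || peek() == "&") ++pos;
    DeclKind kind = DeclKind::Plain;
    if (peek() == "(") {
      ++pos;
      kind = declarator();
      if (kind == DeclKind::Error || peek() != ")") return DeclKind::Error;
      ++pos;
    } else if (is_identifier(peek())) {
      ++pos;
    } else {
      return DeclKind::Error;
    }
    for (;;) {
      if (peek() == "[") {
        if (!skip_balanced("[", "]")) return DeclKind::Error;
        kind = DeclKind::Plain;
      } else if (peek() == "(" && (is_type_name(peek(1)) || peek(1) == ")")) {
        // Looks like a parameter list, so this is a function declarator.
        if (!skip_balanced("(", ")")) return DeclKind::Error;
        kind = DeclKind::Function;
      } else {
        return kind;
      }
    }
  }
};

// Classify the text that follows the type specifier, e.g. "(x), 0;".
Kind classify(const std::string &after_type_specifier) {
  Parser p(tokenize(after_type_specifier));
  if (p.peek() != "(") return Kind::Declaration;  // no cast-like start
  while (p.peek() != ";") {
    DeclKind k = p.declarator();
    if (k == DeclKind::Error) return Kind::Statement;
    std::string next = p.peek();
    if (k == DeclKind::Function || next == "=" || next == "(" || next == "{"
        || next == "__attribute__")
      return Kind::Declaration;
    if (next == ",") ++p.pos;
  }
  return Kind::Declaration;
}
```

For example, classify("(x), 0;") yields Kind::Statement and
classify("(x), (y);") yields Kind::Declaration, mirroring the two cases from
the commit message.  The real parser also has to roll the token stream back
after the look-ahead; the toy skips that step because it works on a private
copy of the tokens.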