From patchwork Wed Jul 17 07:47:17 2024
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 1961454
Date: Wed, 17 Jul 2024 09:47:17 +0200
From: Jakub Jelinek
To: Jason Merrill
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] libcpp, c++: Optimize initializers using #embed in C++

Hi!

This patch sits on top of the
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655012.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655013.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657049.html
patches, which just introduce non-optimized support for the C23 feature
and two extensions to it, and on top of the
https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657053.html
patch, which actually optimizes it for C and the middle-end.  It adds
similar optimizations to the C++ FE.

The first hunk enables use of the CPP_EMBED token even for C++, not just C.
The preprocessor guarantees there is always a CPP_NUMBER CPP_COMMA before
CPP_EMBED and a CPP_COMMA CPP_NUMBER after it, which simplifies parsing
(unless #embed is more than 2GB, in which case it could be
CPP_NUMBER CPP_COMMA CPP_EMBED CPP_COMMA CPP_EMBED CPP_COMMA CPP_EMBED
CPP_COMMA CPP_NUMBER etc., with each CPP_EMBED covering at most INT_MAX
bytes).

Similarly to the C patch, this patch parses it into a RAW_DATA_CST tree in
braced initializers (and from there peels it into INTEGER_CSTs unless it
is an initializer of an std::byte array or an integral array with CHAR_BIT
element precision).  In cp_parser_expression it parses CPP_EMBED into just
the last INTEGER_CST in it, because I think users don't need millions of
-Wunused-value warnings because they did a useless
int a = (
#embed "megabyte.dat"
);
and so most of the inner INTEGER_CSTs would be there just for the warning.
In the rest of the contexts (template argument list, function argument
list, attribute argument list, ...) it parses it into a sequence of
INTEGER_CSTs (I wrote range/iterator classes to simplify that).

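To make the token guarantee described above concrete, here is a minimal
standalone sketch of the arithmetic the libcpp/files.cc hunk of this patch
uses to decide how many CPP_EMBED tokens an embed of a given size produces.
The name embed_token_count is hypothetical (it is not a libcpp function),
and the sketch assumes a non-assembler language, i.e. the CLK_ASM check in
finish_embed is left out.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical helper, not part of libcpp: mirrors the expression
//   ((limit - 2) / INT_MAX) + (((limit - 2) % INT_MAX) != 0)
// guarded by limit >= 64 in finish_embed.  The first and the last byte
// always stay ordinary CPP_NUMBER tokens, hence the "- 2".
constexpr std::size_t
embed_token_count (std::size_t limit)
{
  return limit < 64
         ? 0
         : (limit - 2) / INT_MAX + ((limit - 2) % INT_MAX != 0);
}

// Small self-check using sizes mentioned in this mail.
inline void
check_embed_token_count ()
{
  assert (embed_token_count (63) == 0);         // short embeds stay literals
  assert (embed_token_count (64) == 1);         // one CPP_EMBED between 2 numbers
  assert (embed_token_count (492329008) == 1);  // the cc1plus testcase
}
```

Only once the inner part exceeds INT_MAX bytes does the count go above one,
which is the multi-CPP_EMBED case mentioned in the parenthetical above.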
My dumb
cat embed-11.c
constexpr unsigned char a[] = {
  #embed "cc1plus"
};
const unsigned char *b = a;
testcase where cc1plus is 492329008 bytes long when configured
--enable-checking=yes,rtl,extra against recent binutils with .base64 gas
support results in:
time ./xg++ -B ./ -S -O2 embed-11.c

real	0m4.350s
user	0m2.427s
sys	0m0.830s
time ./xg++ -B ./ -c -O2 embed-11.c

real	0m6.932s
user	0m6.034s
sys	0m0.888s
(compared to running out of memory or very long compilation).
On a shorter inclusion,
cat embed-12.c
constexpr unsigned char a[] = {
  #embed "xg++"
};
const unsigned char *b = a;
where xg++ is 15225904 bytes long, this takes using GCC with the #embed
patchset except for this patch:
time ~/src/gcc/obj36/gcc/xg++ -B ~/src/gcc/obj36/gcc/ -S -O2 embed-12.c

real	0m33.190s
user	0m32.327s
sys	0m0.790s
and with this patch:
time ./xg++ -B ./ -S -O2 embed-12.c

real	0m0.118s
user	0m0.090s
sys	0m0.028s

The patch doesn't change anything on what the first patch in the series
introduces even for C++, namely that #embed is expanded (actually or as if)
into a sequence of literals like
127,69,76,70,2,1,1,3,0,0,0,0,0,0,0,0,2,0,62,0,1,0,0,0,80,211,64,0,0,0,0,0,64,0,0,0,0,0,0,0,8,253
and so each element has int type.
That is how I believe it is in C23, and the different versions of the
C++ P1967 paper specified there some casts, P1967R12 in particular
"Otherwise, the integral constant expression is the value of std::fgetc’s
return is cast to unsigned char."
but please see
https://github.com/llvm/llvm-project/pull/97274#issuecomment-2230929277
comment and whether we really want the preprocessor to preprocess it for
C++ as (or as-if)
static_cast<unsigned char>(127),static_cast<unsigned char>(69),static_cast<unsigned char>(76),static_cast<unsigned char>(70),static_cast<unsigned char>(2),...
i.e. 9 tokens per byte rather than 2, or
(unsigned char)127,(unsigned char)69,...
or
((unsigned char)127),((unsigned char)69),...
etc.
Without a literal suffix for unsigned char constant literals it is
horrible, plus the incompatibility between C and C++.

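A small standalone sketch of why the choice of expansion matters: a plain
literal such as 127 has type int, while either cast form has type
unsigned char, so the two expansions can select different overloads or
deduce different types.  The kind overloads below are hypothetical and
exist only to make the difference observable; nothing here is taken from
the patch or the testsuite.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical overload set, only used to observe the element type.
inline std::string kind (int) { return "int"; }
inline std::string kind (unsigned char) { return "unsigned char"; }

// What the series expands #embed into: plain literals of type int.
static_assert (std::is_same<decltype (127), int>::value,
               "plain literal is int");
// What the cast-based wording would give: unsigned char elements.
static_assert (std::is_same<decltype (static_cast<unsigned char> (127)),
                            unsigned char>::value,
               "static_cast form is unsigned char");
static_assert (std::is_same<decltype ((unsigned char) 127),
                            unsigned char>::value,
               "C-style cast form is unsigned char");

inline void
check_kinds ()
{
  assert (kind (127) == "int");
  assert (kind (static_cast<unsigned char> (127)) == "unsigned char");
}
```

With the int expansion C and C++ agree on the element type; with either
cast form they would not, which is the incompatibility mentioned above.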
Sure, we could use the magic form more often for C++ to save the size
and do the 9 or how many tokens form only for the boundary constants
and use
#embed "." __gnu__::__base64__("...")
for what is in between if there are at least 2 tokens inside of it.
E.g. (unsigned char)127 vs. static_cast<unsigned char>(127) behaves
differently if there is
constexpr long long p[] = { ... };
...
#embed __FILE__
[p]

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk if
the rest of the series is approved?

2024-07-17  Jakub Jelinek

libcpp/
	* files.cc (finish_embed): Use CPP_EMBED even for C++.
gcc/cp/ChangeLog:
	* cp-tree.h (class raw_data_iterator): New type.
	(class raw_data_range): New type.
	* parser.cc (cp_parser_postfix_open_square_expression): Handle
	parsing of CPP_EMBED.
	(cp_parser_parenthesized_expression_list): Likewise.  Use
	cp_lexer_next_token_is.
	(cp_parser_expression): Handle parsing of CPP_EMBED.
	(cp_parser_template_argument_list): Likewise.
	(cp_parser_initializer_list): Likewise.
	(cp_parser_oacc_clause_tile): Likewise.
	(cp_parser_omp_tile_sizes): Likewise.
	* pt.cc (tsubst_expr): Handle RAW_DATA_CST.
	* constexpr.cc (reduced_constant_expression_p): Likewise.
	(raw_data_cst_elt): New function.
	(find_array_ctor_elt): Handle RAW_DATA_CST.
	(cxx_eval_array_reference): Likewise.
	* typeck2.cc (digest_init_r): Emit -Wnarrowing and/or -Wconversion
	diagnostics.
	(process_init_constructor_array): Handle RAW_DATA_CST.
	* decl.cc (maybe_deduce_size_from_array_init): Likewise.
	(is_direct_enum_init): Fail for RAW_DATA_CST.
	(cp_maybe_split_raw_data): New function.
	(reshape_init_array_1): Add VECTOR_P argument.  Handle
	RAW_DATA_CST.
	(reshape_init_array): Adjust reshape_init_array_1 caller.
	(reshape_init_vector): Likewise.
	(reshape_init_class): Handle RAW_DATA_CST.
	(reshape_init_r): Likewise.
gcc/testsuite/
	* c-c++-common/cpp/embed-22.c: New test.
	* c-c++-common/cpp/embed-23.c: New test.
	* g++.dg/cpp/embed-4.C: New test.
	* g++.dg/cpp/embed-5.C: New test.
	* g++.dg/cpp/embed-6.C: New test.
	* g++.dg/cpp/embed-7.C: New test.
	* g++.dg/cpp/embed-8.C: New test.
	* g++.dg/cpp/embed-9.C: New test.
	* g++.dg/cpp/embed-10.C: New test.
	* g++.dg/cpp/embed-11.C: New test.
	* g++.dg/cpp/embed-12.C: New test.

	Jakub

--- libcpp/files.cc.jj	2024-07-12 14:13:34.854093279 +0200
+++ libcpp/files.cc	2024-07-12 14:17:58.677783797 +0200
@@ -1241,8 +1241,7 @@ finish_embed (cpp_reader *pfile, _cpp_fi
     limit = params->limit;
 
   size_t embed_tokens = 0;
-  if (!CPP_OPTION (pfile, cplusplus)
-      && CPP_OPTION (pfile, lang) != CLK_ASM
+  if (CPP_OPTION (pfile, lang) != CLK_ASM
       && limit >= 64)
     embed_tokens = ((limit - 2) / INT_MAX) + (((limit - 2) % INT_MAX) != 0);
 
--- gcc/cp/cp-tree.h.jj	2024-07-12 14:03:23.863727788 +0200
+++ gcc/cp/cp-tree.h	2024-07-16 09:53:24.260884437 +0200
@@ -1000,6 +1000,54 @@ public:
   lkp_iterator end() { return lkp_iterator (NULL_TREE); }
 };
 
+/* Iterator for a RAW_DATA_CST.  */
+
+class raw_data_iterator {
+  tree t;
+  unsigned int n;
+
+ public:
+  explicit raw_data_iterator (tree t, unsigned int n)
+    : t (t), n (n)
+  {
+  }
+
+  operator bool () const
+  {
+    return n < (unsigned) RAW_DATA_LENGTH (t);
+  }
+
+  raw_data_iterator &operator++ ()
+  {
+    ++n;
+    return *this;
+  }
+
+  tree operator* () const
+  {
+    return build_int_cst (TREE_TYPE (t),
+                          ((const unsigned char *) RAW_DATA_POINTER (t))[n]);
+  }
+
+  bool operator== (const raw_data_iterator &o) const
+  {
+    return t == o.t && n == o.n;
+  }
+};
+
+/* Treat a tree as a range of raw_data_iterator, e.g.
+   for (tree f : raw_data_range (d)) { ... }  */
+
+class raw_data_range
+{
+  tree t;
+public:
+  raw_data_range (tree t) : t (t) { }
+  raw_data_iterator begin () { return raw_data_iterator (t, 0); }
+  raw_data_iterator end ()
+  { return raw_data_iterator (t, RAW_DATA_LENGTH (t)); }
+};
+
 /* hash traits for declarations.  Hashes potential overload sets via
    DECL_NAME.  */
 
--- gcc/cp/parser.cc.jj	2024-07-12 14:03:23.898727351 +0200
+++ gcc/cp/parser.cc	2024-07-16 10:12:17.573343570 +0200
@@ -8366,6 +8366,19 @@ cp_parser_postfix_open_square_expression
         {
           while (true)
             {
+              /* Handle #embed in the expression-list.  */
+              if (cp_lexer_next_token_is (parser->lexer, CPP_EMBED))
+                {
+                  tree raw_data = cp_lexer_peek_token (parser->lexer)->u.value;
+                  cp_lexer_consume_token (parser->lexer);
+                  vec_safe_reserve (expression_list,
+                                    RAW_DATA_LENGTH (raw_data));
+                  for (tree argument : raw_data_range (raw_data))
+                    expression_list->quick_push (argument);
+                  cp_parser_require (parser, CPP_COMMA, RT_COMMA);
+                  continue;
+                }
+
               cp_expr expr
                 = cp_parser_parenthesized_expression_list_elt (parser,
                                                                /*cast_p=*/
@@ -8833,12 +8846,27 @@ cp_parser_parenthesized_expression_list
         /* At the beginning of attribute lists, check to see if the
            next token is an identifier.  */
         if (is_attribute_list == id_attr
-            && cp_lexer_peek_token (parser->lexer)->type == CPP_NAME)
+            && cp_lexer_next_token_is (parser->lexer, CPP_NAME))
           expr = cp_lexer_consume_token (parser->lexer)->u.value;
         else if (is_attribute_list == assume_attr)
           expr = cp_parser_conditional_expression (parser);
         else if (is_attribute_list == uneval_string_attr)
           expr = cp_parser_unevaluated_string_literal (parser);
+        else if (cp_lexer_next_token_is (parser->lexer, CPP_EMBED))
+          {
+            /* Handle #embed in the argument list.  */
+            tree raw_data = cp_lexer_peek_token (parser->lexer)->u.value;
+            location_t loc = cp_lexer_peek_token (parser->lexer)->location;
+            cp_lexer_consume_token (parser->lexer);
+            vec_safe_reserve (expression_list, RAW_DATA_LENGTH (raw_data));
+            for (tree arg : raw_data_range (raw_data))
+              if (wrap_locations_p)
+                expression_list->quick_push (maybe_wrap_with_location (arg,
+                                                                       loc));
+              else
+                expression_list->quick_push (arg);
+            goto get_comma;
+          }
         else
           expr
             = cp_parser_parenthesized_expression_list_elt (parser, cast_p,
@@ -10921,8 +10949,24 @@ cp_parser_expression (cp_parser* parser,
       cp_expr assignment_expression;
 
       /* Parse the next assignment-expression.  */
-      assignment_expression
-        = cp_parser_assignment_expression (parser, pidk, cast_p, decltype_p);
+      if (cp_lexer_next_token_is (parser->lexer, CPP_EMBED))
+        {
+          /* Users aren't interested in millions of -Wunused-value
+             warnings when using #embed inside of a comma expression,
+             and one CPP_NUMBER plus CPP_COMMA before it and one
+             CPP_COMMA plus CPP_NUMBER after it is guaranteed by
+             the preprocessor.  Thus, parse the whole CPP_EMBED just
+             as a single INTEGER_CST, the last byte in it.  */
+          tree raw_data = cp_lexer_peek_token (parser->lexer)->u.value;
+          location_t loc = cp_lexer_peek_token (parser->lexer)->location;
+          cp_lexer_consume_token (parser->lexer);
+          assignment_expression
+            = *raw_data_iterator (raw_data, RAW_DATA_LENGTH (raw_data) - 1);
+          assignment_expression.set_location (loc);
+        }
+      else
+        assignment_expression
+          = cp_parser_assignment_expression (parser, pidk, cast_p, decltype_p);
 
       /* We don't create a temporary for a call that is the immediate operand
          of decltype or on the RHS of a comma.  But when we see a comma, we
@@ -19575,6 +19619,17 @@ cp_parser_template_argument_list (cp_par
       /* Consume the comma.  */
       cp_lexer_consume_token (parser->lexer);
 
+      /* Handle #embed in the argument list.  */
+      if (cp_lexer_next_token_is (parser->lexer, CPP_EMBED))
+        {
+          tree raw_data = cp_lexer_peek_token (parser->lexer)->u.value;
+          cp_lexer_consume_token (parser->lexer);
+          args.reserve (RAW_DATA_LENGTH (raw_data), false);
+          for (tree argument : raw_data_range (raw_data))
+            args.quick_push (argument);
+          continue;
+        }
+
       /* Parse the template-argument.  */
       tree argument = cp_parser_template_argument (parser);
 
@@ -26598,10 +26653,17 @@ cp_parser_initializer_list (cp_parser* p
         first_designator = designator;
 
       /* Parse the initializer.  */
-      initializer = cp_parser_initializer_clause (parser,
-                                                  (non_constant_p != nullptr
-                                                   ? &clause_non_constant_p
-                                                   : nullptr));
+      if (cp_lexer_next_token_is (parser->lexer, CPP_EMBED))
+        {
+          initializer = cp_lexer_peek_token (parser->lexer)->u.value;
+          clause_non_constant_p = false;
+          cp_lexer_consume_token (parser->lexer);
+        }
+      else
+        initializer = cp_parser_initializer_clause (parser,
+                                                    (non_constant_p != nullptr
+                                                     ? &clause_non_constant_p
+                                                     : nullptr));
       /* If any clause is non-constant, so is the entire initializer.  */
       if (non_constant_p && clause_non_constant_p)
         *non_constant_p = true;
@@ -39017,6 +39079,15 @@ cp_parser_oacc_clause_tile (cp_parser *p
           cp_lexer_consume_token (parser->lexer);
           expr = integer_zero_node;
         }
+      else if (cp_lexer_next_token_is (parser->lexer, CPP_EMBED))
+        {
+          /* Handle #embed in the size-expr-list.  */
+          tree raw_data = cp_lexer_peek_token (parser->lexer)->u.value;
+          cp_lexer_consume_token (parser->lexer);
+          for (tree argument : raw_data_range (raw_data))
+            tile = tree_cons (NULL_TREE, argument, tile);
+          continue;
+        }
       else
         expr = cp_parser_constant_expression (parser);
 
@@ -47632,6 +47703,16 @@ cp_parser_omp_tile_sizes (cp_parser *par
       if (sizes && !cp_parser_require (parser, CPP_COMMA, RT_COMMA))
         return error_mark_node;
 
+      if (cp_lexer_next_token_is (parser->lexer, CPP_EMBED))
+        {
+          /* Handle #embed in the size-expr-list.  */
+          tree raw_data = cp_lexer_peek_token (parser->lexer)->u.value;
+          cp_lexer_consume_token (parser->lexer);
+          for (tree argument : raw_data_range (raw_data))
+            sizes = tree_cons (NULL_TREE, argument, sizes);
+          continue;
+        }
+
       tree expr = cp_parser_constant_expression (parser);
       if (expr == error_mark_node)
         {
--- gcc/cp/pt.cc.jj	2024-07-12 14:03:23.908727226 +0200
+++ gcc/cp/pt.cc	2024-07-15 18:36:16.075729634 +0200
@@ -21657,6 +21657,14 @@ tsubst_expr (tree t, tree args, tsubst_f
         RETURN (r);
       }
 
+    case RAW_DATA_CST:
+      {
+        tree type = tsubst (TREE_TYPE (t), args, complain, in_decl);
+        r = copy_node (t);
+        TREE_TYPE (r) = type;
+        RETURN (r);
+      }
+
     case PTRMEM_CST:
       /* These can sometimes show up in a partial instantiation, but never
          involve template parms.  */
--- gcc/cp/constexpr.cc.jj	2024-07-12 14:03:23.834728149 +0200
+++ gcc/cp/constexpr.cc	2024-07-15 17:33:16.111144086 +0200
@@ -3448,7 +3448,13 @@ reduced_constant_expression_p (tree t)
                 return false;
               if (TREE_CODE (e.index) == RANGE_EXPR)
                 cursor = TREE_OPERAND (e.index, 1);
-              cursor = int_const_binop (PLUS_EXPR, cursor, size_one_node);
+              if (TREE_CODE (e.value) == RAW_DATA_CST)
+                cursor
+                  = int_const_binop (PLUS_EXPR, cursor,
+                                     size_int (RAW_DATA_LENGTH (e.value)));
+              else
+                cursor = int_const_binop (PLUS_EXPR, cursor,
+                                          size_one_node);
             }
           if (find_array_ctor_elt (t, max) == -1)
             return false;
@@ -4057,6 +4063,22 @@ array_index_cmp (tree key, tree index)
     }
 }
 
+/* Extract a single INTEGER_CST from RAW_DATA_CST RAW_DATA at
+   relative index OFF.  */
+
+static tree
+raw_data_cst_elt (tree raw_data, unsigned int off)
+{
+  return build_int_cst (TREE_TYPE (raw_data),
+                        TYPE_UNSIGNED (TREE_TYPE (raw_data))
+                        ? (HOST_WIDE_INT)
+                          (((const unsigned char *)
+                            RAW_DATA_POINTER (raw_data))[off])
+                        : (HOST_WIDE_INT)
+                          (((const signed char *)
+                            RAW_DATA_POINTER (raw_data))[off]));
+}
+
 /* Returns the index of the constructor_elt of ARY which matches DINDEX, or -1
    if none.  If INSERT is true, insert a matching element rather than fail.  */
 
@@ -4081,10 +4103,11 @@ find_array_ctor_elt (tree ary, tree dind
       if (cindex == NULL_TREE)
         {
           /* Verify that if the last index is missing, all indexes
-             are missing.  */
+             are missing and there is no RAW_DATA_CST.  */
           if (flag_checking)
             for (unsigned int j = 0; j < len - 1; ++j)
-              gcc_assert ((*elts)[j].index == NULL_TREE);
+              gcc_assert ((*elts)[j].index == NULL_TREE
+                          && TREE_CODE ((*elts)[j].value) != RAW_DATA_CST);
           if (i < end)
             return i;
           else
@@ -4107,6 +4130,11 @@ find_array_ctor_elt (tree ary, tree dind
         {
           if (i < end)
             return i;
+          tree value = (*elts)[end - 1].value;
+          if (TREE_CODE (value) == RAW_DATA_CST
+              && wi::to_widest (dindex) < (wi::to_widest (cindex)
+                                           + RAW_DATA_LENGTH (value)))
+            begin = end - 1;
           else
             begin = end;
         }
@@ -4120,12 +4148,59 @@ find_array_ctor_elt (tree ary, tree dind
       tree idx = elt.index;
 
       int cmp = array_index_cmp (dindex, idx);
+      if (cmp > 0
+          && TREE_CODE (elt.value) == RAW_DATA_CST
+          && wi::to_widest (dindex) < (wi::to_widest (idx)
+                                       + RAW_DATA_LENGTH (elt.value)))
+        cmp = 0;
       if (cmp < 0)
         end = middle;
       else if (cmp > 0)
         begin = middle + 1;
       else
         {
+          if (insert && TREE_CODE (elt.value) == RAW_DATA_CST)
+            {
+              /* We need to split the RAW_DATA_CST elt.  */
+              constructor_elt e;
+              gcc_checking_assert (TREE_CODE (idx) != RANGE_EXPR);
+              unsigned int off = (wi::to_widest (dindex)
+                                  - wi::to_widest (idx)).to_uhwi ();
+              tree value = elt.value;
+              unsigned int len = RAW_DATA_LENGTH (value);
+              if (off > 1 && len >= off + 3)
+                value = copy_node (elt.value);
+              if (off)
+                {
+                  if (off > 1)
+                    RAW_DATA_LENGTH (elt.value) = off;
+                  else
+                    elt.value = raw_data_cst_elt (elt.value, 0);
+                  e.index = size_binop (PLUS_EXPR, elt.index,
+                                        build_int_cst (TREE_TYPE (elt.index),
+                                                       off));
+                  e.value = NULL_TREE;
+                  ++middle;
+                  vec_safe_insert (CONSTRUCTOR_ELTS (ary), middle, e);
+                }
+              (*elts)[middle].value = raw_data_cst_elt (value, off);
+              if (len >= off + 2)
+                {
+                  e.index = (*elts)[middle].index;
+                  e.index = size_binop (PLUS_EXPR, e.index,
+                                        build_one_cst (TREE_TYPE (e.index)));
+                  if (len >= off + 3)
+                    {
+                      RAW_DATA_LENGTH (value) -= off + 1;
+                      RAW_DATA_POINTER (value) += off + 1;
+                      e.value = value;
+                    }
+                  else
+                    e.value = raw_data_cst_elt (value, off + 1);
+                  vec_safe_insert (CONSTRUCTOR_ELTS (ary), middle + 1, e);
+                }
+              return middle;
+            }
           if (insert && TREE_CODE (idx) == RANGE_EXPR)
             {
               /* We need to split the range.  */
@@ -4481,7 +4556,17 @@ cxx_eval_array_reference (const constexp
     {
       tree r;
       if (TREE_CODE (ary) == CONSTRUCTOR)
-        r = (*CONSTRUCTOR_ELTS (ary))[i].value;
+        {
+          r = (*CONSTRUCTOR_ELTS (ary))[i].value;
+          if (TREE_CODE (r) == RAW_DATA_CST)
+            {
+              tree ridx = (*CONSTRUCTOR_ELTS (ary))[i].index;
+              gcc_checking_assert (ridx);
+              unsigned int off
+                = (wi::to_widest (index) - wi::to_widest (ridx)).to_uhwi ();
+              r = raw_data_cst_elt (r, off);
+            }
+        }
       else if (TREE_CODE (ary) == VECTOR_CST)
         r = VECTOR_CST_ELT (ary, i);
       else
--- gcc/cp/typeck2.cc.jj	2024-07-02 22:06:53.591463682 +0200
+++ gcc/cp/typeck2.cc	2024-07-15 15:37:41.565862537 +0200
@@ -1310,6 +1310,40 @@ digest_init_r (tree type, tree init, int
          a parenthesized list.  */
       if (nested && !(flags & LOOKUP_AGGREGATE_PAREN_INIT))
         flags |= LOOKUP_NO_NARROWING;
+      if (TREE_CODE (init) == RAW_DATA_CST && !TYPE_UNSIGNED (type))
+        {
+          tree ret = init;
+          if ((flags & LOOKUP_NO_NARROWING) || warn_conversion)
+            for (unsigned int i = 0;
+                 i < (unsigned) RAW_DATA_LENGTH (init); ++i)
+              if (((const signed char *)
+                   RAW_DATA_POINTER (init))[i] < 0)
+                {
+                  if ((flags & LOOKUP_NO_NARROWING))
+                    {
+                      tree elt
+                        = build_int_cst (integer_type_node,
+                                         ((const unsigned char *)
+                                          RAW_DATA_POINTER (init))[i]);
+                      if (!check_narrowing (type, elt, complain, false))
+                        {
+                          if (!(complain & tf_warning_or_error))
+                            ret = error_mark_node;
+                          continue;
+                        }
+                    }
+                  if (warn_conversion)
+                    warning (OPT_Wconversion,
+                             "conversion from %qT to %qT changes value from "
+                             "%qd to %qd",
+                             integer_type_node, type,
+                             ((const unsigned char *)
+                              RAW_DATA_POINTER (init))[i],
+                             ((const signed char *)
+                              RAW_DATA_POINTER (init))[i]);
+                }
+          return ret;
+        }
       init = convert_for_initialization (0, type, init, flags,
                                          ICR_INIT, NULL_TREE, 0,
                                          complain);
@@ -1558,7 +1592,7 @@ static int
 process_init_constructor_array (tree type, tree init, int nested, int flags,
                                 tsubst_flags_t complain)
 {
-  unsigned HOST_WIDE_INT i, len = 0;
+  unsigned HOST_WIDE_INT i, j, len = 0;
   int picflags = 0;
   bool unbounded = false;
   constructor_elt *ce;
@@ -1601,11 +1635,12 @@ process_init_constructor_array (tree typ
       return PICFLAG_ERRONEOUS;
     }
 
+  j = 0;
   FOR_EACH_VEC_SAFE_ELT (v, i, ce)
     {
       if (!ce->index)
-        ce->index = size_int (i);
-      else if (!check_array_designated_initializer (ce, i))
+        ce->index = size_int (j);
+      else if (!check_array_designated_initializer (ce, j))
         ce->index = error_mark_node;
       gcc_assert (ce->value);
       ce->value
@@ -1627,6 +1662,10 @@ process_init_constructor_array (tree typ
           CONSTRUCTOR_PLACEHOLDER_BOUNDARY (init) = 1;
           CONSTRUCTOR_PLACEHOLDER_BOUNDARY (ce->value) = 0;
         }
+      if (TREE_CODE (ce->value) == RAW_DATA_CST)
+        j += RAW_DATA_LENGTH (ce->value);
+      else
+        ++j;
     }
 
   /* No more initializers.  If the array is unbounded, we are done.  Otherwise,
--- gcc/cp/decl.cc.jj	2024-07-12 14:03:23.870727700 +0200
+++ gcc/cp/decl.cc	2024-07-16 22:44:25.156545691 +0200
@@ -6471,18 +6471,22 @@ maybe_deduce_size_from_array_init (tree
         {
           vec<constructor_elt, va_gc> *v = CONSTRUCTOR_ELTS (initializer);
           constructor_elt *ce;
-          HOST_WIDE_INT i;
+          HOST_WIDE_INT i, j = 0;
           FOR_EACH_VEC_SAFE_ELT (v, i, ce)
             {
               if (instantiation_dependent_expression_p (ce->index))
                 return;
-              if (!check_array_designated_initializer (ce, i))
+              if (!check_array_designated_initializer (ce, j))
                 failure = 1;
               /* If an un-designated initializer is type-dependent, we can't
                  check brace elision yet.  */
               if (ce->index == NULL_TREE
                   && type_dependent_expression_p (ce->value))
                 return;
+              if (TREE_CODE (ce->value) == RAW_DATA_CST)
+                j += RAW_DATA_LENGTH (ce->value);
+              else
+                ++j;
             }
         }
 
@@ -6836,6 +6840,7 @@ is_direct_enum_init (tree type, tree ini
       && TREE_CODE (init) == CONSTRUCTOR
       && CONSTRUCTOR_IS_DIRECT_INIT (init)
       && CONSTRUCTOR_NELTS (init) == 1
+      && TREE_CODE (CONSTRUCTOR_ELT (init, 0)->value) != RAW_DATA_CST
       /* DR 2374: The single element needs to be implicitly
          convertible to the underlying type of the enum.  */
       && !type_dependent_expression_p (CONSTRUCTOR_ELT (init, 0)->value)
@@ -6847,6 +6852,22 @@ is_direct_enum_init (tree type, tree ini
   return false;
 }
 
+/* Helper function for reshape_init*.  Split first element of
+   RAW_DATA_CST and save the rest to d->cur->value.  */
+
+static tree
+cp_maybe_split_raw_data (reshape_iter *d)
+{
+  if (TREE_CODE (d->cur->value) != RAW_DATA_CST)
+    return NULL_TREE;
+  tree ret = *raw_data_iterator (d->cur->value, 0);
+  ++RAW_DATA_POINTER (d->cur->value);
+  --RAW_DATA_LENGTH (d->cur->value);
+  if (RAW_DATA_LENGTH (d->cur->value) == 1)
+    d->cur->value = *raw_data_iterator (d->cur->value, 0);
+  return ret;
+}
+
 /* Subroutine of reshape_init_array and reshape_init_vector, which does
    the actual work.  ELT_TYPE is the element type of the array.  MAX_INDEX is an
    INTEGER_CST representing the size of the array minus one (the maximum index),
@@ -6855,7 +6876,8 @@ is_direct_enum_init (tree type, tree ini
 
 static tree
 reshape_init_array_1 (tree elt_type, tree max_index, reshape_iter *d,
-                      tree first_initializer_p, tsubst_flags_t complain)
+                      tree first_initializer_p, bool vector_p,
+                      tsubst_flags_t complain)
 {
   tree new_init;
   bool sized_array_p = (max_index && TREE_CONSTANT (max_index));
@@ -6888,6 +6910,7 @@ reshape_init_array_1 (tree elt_type, tre
       max_index_cst = tree_to_uhwi (fold_convert (size_type_node, max_index));
     }
 
+  constructor_elt *first_cur = d->cur;
   /* Loop until there are no more initializers.  */
   for (index = 0;
        d->cur != d->end && (!sized_array_p || index <= max_index_cst);
@@ -6895,16 +6918,68 @@ reshape_init_array_1 (tree elt_type, tre
     {
       tree elt_init;
       constructor_elt *old_cur = d->cur;
+      const char *old_ptr = NULL;
+
+      if (TREE_CODE (d->cur->value) == RAW_DATA_CST)
+        old_ptr = RAW_DATA_POINTER (d->cur->value);
 
       if (d->cur->index)
         CONSTRUCTOR_IS_DESIGNATED_INIT (new_init) = true;
       check_array_designated_initializer (d->cur, index);
-      elt_init = reshape_init_r (elt_type, d,
-                                 /*first_initializer_p=*/NULL_TREE,
-                                 complain);
+      if (TREE_CODE (d->cur->value) == RAW_DATA_CST
+          && (TREE_CODE (elt_type) == INTEGER_TYPE
+              || (TREE_CODE (elt_type) == ENUMERAL_TYPE
+                  && TYPE_CONTEXT (TYPE_MAIN_VARIANT (elt_type)) == std_node
+                  && strcmp (TYPE_NAME_STRING (TYPE_MAIN_VARIANT (elt_type)),
+                             "byte") == 0))
+          && TYPE_PRECISION (elt_type) == CHAR_BIT
+          && (!sized_array_p || index < max_index_cst)
+          && !vector_p)
+        {
+          elt_init = d->cur->value;
+          if (!sized_array_p
+              || ((unsigned) RAW_DATA_LENGTH (d->cur->value)
+                  <= max_index_cst - index + 1))
+            d->cur++;
+          else
+            {
+              unsigned int len = max_index_cst - index + 1;
+              if ((unsigned) RAW_DATA_LENGTH (d->cur->value) == len + 1)
+                d->cur->value
+                  = build_int_cst (integer_type_node,
+                                   *(const unsigned char *)
+                                   RAW_DATA_POINTER (d->cur->value) + len);
+              else
+                {
+                  d->cur->value = copy_node (elt_init);
+                  RAW_DATA_LENGTH (d->cur->value) -= len;
+                  RAW_DATA_POINTER (d->cur->value) += len;
+                }
+              RAW_DATA_LENGTH (elt_init) = len;
+            }
+          TREE_TYPE (elt_init) = elt_type;
+        }
+      else
+        elt_init = reshape_init_r (elt_type, d,
+                                   /*first_initializer_p=*/NULL_TREE,
+                                   complain);
       if (elt_init == error_mark_node)
         return error_mark_node;
       tree idx = size_int (index);
+      if (reuse && old_ptr && d->cur == old_cur)
+        {
+          /* We need to stop reusing as some RAW_DATA_CST in the original
+             ctor had to be split.  */
+          new_init = build_constructor (init_list_type_node, NULL);
+          if (index)
+            {
+              vec_safe_grow (CONSTRUCTOR_ELTS (new_init), index);
+              memcpy (CONSTRUCTOR_ELT (new_init, 0), first_cur,
+                      (d->cur - first_cur)
+                      * sizeof (*CONSTRUCTOR_ELT (new_init, 0)));
+            }
+          reuse = false;
+        }
       if (reuse)
         {
           old_cur->index = idx;
@@ -6917,8 +6992,15 @@ reshape_init_array_1 (tree elt_type, tre
         TREE_CONSTANT (new_init) = false;
 
       /* This can happen with an invalid initializer (c++/54501).  */
-      if (d->cur == old_cur && !sized_array_p)
+      if (d->cur == old_cur
+          && !sized_array_p
+          && (old_ptr == NULL
+              || (TREE_CODE (d->cur->value) == RAW_DATA_CST
+                  && RAW_DATA_POINTER (d->cur->value) == old_ptr)))
         break;
+
+      if (TREE_CODE (elt_init) == RAW_DATA_CST)
+        index += RAW_DATA_LENGTH (elt_init) - 1;
     }
 
   return new_init;
@@ -6939,7 +7021,7 @@ reshape_init_array (tree type, reshape_i
     max_index = array_type_nelts (type);
 
   return reshape_init_array_1 (TREE_TYPE (type), max_index, d,
-                               first_initializer_p, complain);
+                               first_initializer_p, false, complain);
 }
 
 /* Subroutine of reshape_init_r, processes the initializers for vectors.
@@ -6971,7 +7053,7 @@ reshape_init_vector (tree type, reshape_
     max_index = size_int (TYPE_VECTOR_SUBPARTS (type) - 1);
 
   return reshape_init_array_1 (TREE_TYPE (type), max_index, d,
-                               NULL_TREE, complain);
+                               NULL_TREE, true, complain);
 }
 
 /* Subroutine of reshape_init*: We're initializing an element with TYPE from
@@ -7044,8 +7126,12 @@ reshape_init_class (tree type, reshape_i
     {
       tree field_init;
       constructor_elt *old_cur = d->cur;
+      const char *old_ptr = NULL;
       bool direct_desig = false;
 
+      if (TREE_CODE (d->cur->value) == RAW_DATA_CST)
+        old_ptr = RAW_DATA_POINTER (d->cur->value);
+
       /* Handle C++20 designated initializers.  */
       if (d->cur->index)
         {
@@ -7158,6 +7244,7 @@ reshape_init_class (tree type, reshape_i
              is initialized by the designated-initializer-list { D }, where D
              is the designated- initializer-clause naming a member of the
              anonymous union member."  */
+          gcc_checking_assert (TREE_CODE (d->cur->value) != RAW_DATA_CST);
           field_init = reshape_single_init (TREE_TYPE (field),
                                             d->cur->value, complain);
           d->cur++;
@@ -7170,7 +7257,11 @@ reshape_init_class (tree type, reshape_i
       if (field_init == error_mark_node)
         return error_mark_node;
 
-      if (d->cur == old_cur && d->cur->index)
+      if (d->cur == old_cur
+          && d->cur->index
+          && (old_ptr == NULL
+              || (TREE_CODE (d->cur->value) == RAW_DATA_CST
+                  && RAW_DATA_POINTER (d->cur->value) == old_ptr)))
         {
           /* This can happen with an invalid initializer for a flexible
              array member (c++/54441).  */
@@ -7205,8 +7296,11 @@ reshape_init_class (tree type, reshape_i
      correspond to all remaining elements of the initializer list (if any).  */
   if (last_was_pack_expansion)
     {
+      tree init = d->cur->value;
+      if (tree raw_init = cp_maybe_split_raw_data (d))
+        init = raw_init;
       CONSTRUCTOR_APPEND_ELT (CONSTRUCTOR_ELTS (new_init),
-                              last_was_pack_expansion, d->cur->value);
+                              last_was_pack_expansion, init);
       while (d->cur != d->end)
         d->cur++;
     }
@@ -7258,7 +7352,10 @@ reshape_init_r (tree type, reshape_iter
     {
       /* A complex type can be initialized from one or two initializers,
          but braces are not elided.  */
-      d->cur++;
+      if (tree raw_init = cp_maybe_split_raw_data (d))
+        init = raw_init;
+      else
+        d->cur++;
       if (BRACE_ENCLOSED_INITIALIZER_P (stripped_init))
         {
           if (CONSTRUCTOR_NELTS (stripped_init) > 2)
@@ -7273,10 +7370,13 @@ reshape_init_r (tree type, reshape_iter
         {
           vec<constructor_elt, va_gc> *v = 0;
           CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, init);
-          CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, d->cur->value);
+          tree raw_init = cp_maybe_split_raw_data (d);
+          CONSTRUCTOR_APPEND_ELT (v, NULL_TREE,
+                                  raw_init ? raw_init : d->cur->value);
           if (has_designator_problem (d, complain))
             return error_mark_node;
-          d->cur++;
+          if (!raw_init)
+            d->cur++;
           init = build_constructor (init_list_type_node, v);
         }
       return init;
@@ -7324,6 +7424,8 @@ reshape_init_r (tree type, reshape_iter
           else
             maybe_warn_cpp0x (CPP0X_INITIALIZER_LISTS);
         }
+      else if (tree raw_init = cp_maybe_split_raw_data (d))
+        return raw_init;
 
       d->cur++;
       return init;
@@ -7337,6 +7439,7 @@ reshape_init_r (tree type, reshape_iter
       /* But not if it's a designated init.  */
       && !d->cur->index
       && d->end - d->cur == 1
+      && TREE_CODE (init) != RAW_DATA_CST
       && reference_related_p (type, TREE_TYPE (init)))
     {
       d->cur++;
@@ -7358,9 +7461,16 @@ reshape_init_r (tree type, reshape_iter
          valid aggregate initialization.  */
       && !first_initializer_p
       && (same_type_ignoring_top_level_qualifiers_p (type, TREE_TYPE (init))
-          || can_convert_arg (type, TREE_TYPE (init), init, LOOKUP_NORMAL,
-                              complain)))
+          || can_convert_arg (type, TREE_TYPE (init),
+                              TREE_CODE (init) == RAW_DATA_CST
+                              ? build_int_cst (integer_type_node,
+                                               *(const unsigned char *)
+                                               RAW_DATA_POINTER (init))
+                              : init,
+                              LOOKUP_NORMAL, complain)))
     {
+      if (tree raw_init = cp_maybe_split_raw_data (d))
+        return raw_init;
       d->cur++;
       return init;
     }
@@ -7463,7 +7573,7 @@ reshape_init_r (tree type, reshape_iter
       else if (VECTOR_TYPE_P (type))
         new_init = reshape_init_vector (type, d, complain);
       else
-        gcc_unreachable();
+        gcc_unreachable ();
 
       if (braces_elided_p
           && TREE_CODE (new_init) == CONSTRUCTOR)
--- gcc/testsuite/c-c++-common/cpp/embed-22.c.jj	2024-07-15 18:57:18.013860745 +0200
+++ gcc/testsuite/c-c++-common/cpp/embed-22.c	2024-07-15 19:01:49.146451109 +0200
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -Wno-psabi" } */
+/* { dg-additional-options "-std=c23" { target c } } */
+
+typedef unsigned char V __attribute__((vector_size (128)));
+
+V a;
+
+void
+foo (void)
+{
+  V b = {
+    #embed __FILE__ limit (128) gnu::offset (3)
+  };
+  a = b;
+}
+
+const unsigned char c[] = {
+  #embed __FILE__ limit (128) gnu::offset (3)
+};
+
+int
+main ()
+{
+  foo ();
+  if (__builtin_memcmp (&c[0], &a, sizeof (a)))
+    __builtin_abort ();
+}
--- gcc/testsuite/c-c++-common/cpp/embed-23.c.jj	2024-07-16 12:41:11.514073178 +0200
+++ gcc/testsuite/c-c++-common/cpp/embed-23.c	2024-07-16 13:09:16.730670474 +0200
@@ -0,0 +1,36 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+/* { dg-additional-options "-std=gnu23" { target c } } */
+
+typedef unsigned char V __attribute__((vector_size (16)));
+
+struct S { _Complex double a; V b; int c; };
+struct T { int a; struct S b; int c; struct S d; int e; unsigned char f[22]; _Complex long double g; };
+
+const unsigned char a[] = {
+  #embed __FILE__ limit (124)
+};
+const struct T b[2] = {
+  #embed __FILE__ limit (124)
+};
+
+int
+main ()
+{
+  for (int i = 0; i < 2; ++i)
+    if (b[i].a != a[i * 62]
+        || __real__ b[i].b.a != a[i * 62 + 1]
+        || __imag__ b[i].b.a
+        || __builtin_memcmp (&b[i].b.b, &a[i * 62 + 2], 16)
+        || b[i].b.c != a[i * 62 + 18]
+        || b[i].c != 
a[i * 62 + 19] + || __real__ b[i].d.a != a[i * 62 + 20] + || __imag__ b[i].d.a + || __builtin_memcmp (&b[i].d.b, &a[i * 62 + 21], 16) + || b[i].d.c != a[i * 62 + 37] + || b[i].e != a[i * 62 + 38] + || __builtin_memcmp (&b[i].f[0], &a[i * 62 + 39], 22) + || __real__ b[i].g != a[i * 62 + 61] + || __imag__ b[i].g) + __builtin_abort (); +} --- gcc/testsuite/g++.dg/cpp/embed-4.C.jj 2024-07-15 17:46:54.113865890 +0200 +++ gcc/testsuite/g++.dg/cpp/embed-4.C 2024-07-15 17:48:15.000000000 +0200 @@ -0,0 +1,66 @@ +// { dg-do run { target c++11 } } +// { dg-options "" } + +constexpr unsigned char a[] = { +#embed __FILE__ +}; + +constexpr unsigned char +foo (int x) +{ + return a[x]; +} +constexpr unsigned char b = a[32]; +constexpr unsigned char c = foo (42); + +#if __cplusplus >= 201402L +constexpr bool +bar () +{ + unsigned char d[] = { + #embed __FILE__ + }; + d[42] = ' '; + d[32] = 'X'; + d[0] = d[1] + 16; + d[sizeof (d) - 1] = d[42] - ' '; + for (int i = 0; i < sizeof (d); ++i) + switch (i) + { + case 0: + if (d[i] != a[1] + 16) + return false; + break; + case 32: + if (d[i] != 'X') + return false; + break; + case 42: + if (d[i] != ' ') + return false; + break; + case sizeof (d) - 1: + if (d[i] != 0) + return false; + break; + default: + if (d[i] != a[i]) + return false; + break; + } + return true; +} + +static_assert (bar (), ""); +#endif + +int +main () +{ + unsigned char e[] = { + #embed __FILE__ + }; + + if (b != e[32] || c != e[42]) + __builtin_abort (); +} --- gcc/testsuite/g++.dg/cpp/embed-5.C.jj 2024-07-15 18:06:56.460845067 +0200 +++ gcc/testsuite/g++.dg/cpp/embed-5.C 2024-07-15 18:38:41.170905555 +0200 @@ -0,0 +1,72 @@ +// { dg-do run { target c++14 } } +// { dg-options "" } + +template +constexpr T a[] = { +#embed __FILE__ +}; + +template +constexpr T +foo (int x) +{ + return a[x]; +} +constexpr unsigned char b = a[32]; +constexpr unsigned char c = foo (42); +constexpr int b2 = a[32]; +constexpr int c2 = foo (42); + +template +constexpr bool +bar () +{ + T d[] = 
{ + #embed __FILE__ + }; + d[42] = ' '; + d[32] = 'X'; + d[0] = d[1] + 16; + d[sizeof (d) / sizeof (T) - 1] = d[42] - ' '; + for (int i = 0; i < sizeof (d) / sizeof (T); ++i) + switch (i) + { + case 0: + if (d[i] != a[1] + 16) + return false; + break; + case 32: + if (d[i] != 'X') + return false; + break; + case 42: + if (d[i] != ' ') + return false; + break; + case sizeof (d) / sizeof (T) - 1: + if (d[i] != 0) + return false; + break; + default: + if (d[i] != a[i]) + return false; + break; + } + return true; +} + +static_assert (bar (), ""); +static_assert (bar (), ""); + +int +main () +{ + unsigned char e[] = { + #embed __FILE__ + }; + + if (b != e[32] || c != e[42]) + __builtin_abort (); + if (b2 != b || c2 != c) + __builtin_abort (); +} --- gcc/testsuite/g++.dg/cpp/embed-6.C.jj 2024-07-15 18:07:35.927349168 +0200 +++ gcc/testsuite/g++.dg/cpp/embed-6.C 2024-07-15 18:31:54.519017822 +0200 @@ -0,0 +1,72 @@ +// { dg-do run { target c++14 } } +// { dg-options "" } + +template +constexpr unsigned char a[] = { +#embed __FILE__ +}; + +template +constexpr unsigned char +foo (int x) +{ + return a[x]; +} +constexpr unsigned char b = a[32]; +constexpr unsigned char c = foo (42); +constexpr unsigned char b2 = a[32]; +constexpr unsigned char c2 = foo (42); + +template +constexpr bool +bar () +{ + unsigned char d[] = { + #embed __FILE__ + }; + d[42] = ' '; + d[32] = 'X'; + d[0] = d[1] + 16; + d[sizeof (d) - 1] = d[42] - ' '; + for (int i = 0; i < sizeof (d); ++i) + switch (i) + { + case 0: + if (d[i] != a[1] + 16) + return false; + break; + case 32: + if (d[i] != 'X') + return false; + break; + case 42: + if (d[i] != ' ') + return false; + break; + case sizeof (d) - 1: + if (d[i] != 0) + return false; + break; + default: + if (d[i] != a[i]) + return false; + break; + } + return true; +} + +static_assert (bar (), ""); +static_assert (bar (), ""); + +int +main () +{ + unsigned char e[] = { + #embed __FILE__ + }; + + if (b != e[32] || c != e[42]) + __builtin_abort (); + if (b2 
!= b || c2 != c) + __builtin_abort (); +} --- gcc/testsuite/g++.dg/cpp/embed-7.C.jj 2024-07-15 18:30:55.356761596 +0200 +++ gcc/testsuite/g++.dg/cpp/embed-7.C 2024-07-15 18:18:06.385427418 +0200 @@ -0,0 +1,7 @@ +// This is a comment with some UTF-8 non-ASCII characters: áéíóú. +// { dg-do compile { target c++11 } } +// { dg-options "" } */ + +const signed char a[] = { +#embed __FILE__ +}; // { dg-error "narrowing conversion of '\[12]\[0-9]\[0-9]' from 'int' to 'const signed char'" } --- gcc/testsuite/g++.dg/cpp/embed-8.C.jj 2024-07-15 18:30:58.879717302 +0200 +++ gcc/testsuite/g++.dg/cpp/embed-8.C 2024-07-15 18:34:17.199224101 +0200 @@ -0,0 +1,7 @@ +// This is a comment with some UTF-8 non-ASCII characters: áéíóú. +// { dg-do compile { target c++11 } } +// { dg-options "-Wno-narrowing -Wconversion" } + +const signed char a[] = { +#embed __FILE__ +}; // { dg-warning "conversion from 'int' to 'const signed char' changes value from '\[12]\[0-9]\[0-9]' to '-\[0-9]\[0-9]*'" } --- gcc/testsuite/g++.dg/cpp/embed-9.C.jj 2024-07-16 11:44:18.624617163 +0200 +++ gcc/testsuite/g++.dg/cpp/embed-9.C 2024-07-16 11:49:19.171768836 +0200 @@ -0,0 +1,57 @@ +// { dg-do run { target c++11 } } +// { dg-options "--embed-dir=${srcdir}/c-c++-common/cpp/embed-dir" } + +const unsigned char m[] = { + #embed limit (131) +}; + +template +int +foo () +{ + unsigned char a[] = { N... }; + for (int i = 0; i < sizeof (a); ++i) + if (a[i] != m[i]) + return -1; + return sizeof (a); +} + +template +int +bar (T... args) +{ + int a[] = { args... 
}; + for (int i = 0; i < sizeof (a) / sizeof (a[0]); ++i) + if (a[i] != m[i]) + return -1; + return sizeof (a) / sizeof (a[0]); +} + +int +main () +{ + if (foo < + #embed limit (1) + > () != 1) + __builtin_abort (); + if (foo < + #embed limit (6) + > () != 6) + __builtin_abort (); + if (foo < + #embed limit (131) + > () != 131) + __builtin_abort (); + if (bar ( + #embed limit (1) + ) != 1) + __builtin_abort (); + if (bar ( + #embed limit (6) + ) != 6) + __builtin_abort (); + if (bar ( + #embed limit (131) + ) != 131) + __builtin_abort (); +} --- gcc/testsuite/g++.dg/cpp/embed-10.C.jj 2024-07-16 11:50:28.571880216 +0200 +++ gcc/testsuite/g++.dg/cpp/embed-10.C 2024-07-16 11:57:46.462296213 +0200 @@ -0,0 +1,40 @@ +// { dg-do run { target c++23 } } +// { dg-options "--embed-dir=${srcdir}/c-c++-common/cpp/embed-dir" } + +const unsigned char m[] = { + #embed limit (136) +}; + +struct S +{ + S () : a {} {}; + template + int &operator[] (T... args) + { + int b[] = { args... }; + for (int i = 0; i < sizeof (b) / sizeof (b[0]); ++i) + if (b[i] != m[i]) + return a[137]; + return a[sizeof (b) / sizeof (b[0])]; + } + int a[138]; +}; + +S s; + +int +main () +{ + if (&s[ + #embed limit (1) + ] != &s.a[1]) + __builtin_abort (); + if (&s[ + #embed limit (6) + ] != &s.a[6]) + __builtin_abort (); + if (&s[ + #embed limit (135) + ] != &s.a[135]) + __builtin_abort (); +} --- gcc/testsuite/g++.dg/cpp/embed-11.C.jj 2024-07-16 12:05:19.170536951 +0200 +++ gcc/testsuite/g++.dg/cpp/embed-11.C 2024-07-16 12:16:01.948346872 +0200 @@ -0,0 +1,41 @@ +// { dg-do run } +// { dg-options "-Wunused-value" } + +#include + +const unsigned char a[] = { + #embed __FILE__ limit (128) +}; + +int +foo (int x, ...) 
+{ + if (x != 42) + return 2; + va_list ap; + va_start (ap, x); + for (int i = 0; i < 128; ++i) + if (va_arg (ap, int) != a[i]) + { + va_end (ap); + return 1; + } + va_end (ap); + return 0; +} + +int b, c; + +int +main () +{ + if (foo (42, +#embed __FILE__ limit (128) + )) + __builtin_abort (); + b = ( +#embed __FILE__ limit (128) prefix (c = 2 * ) suffix ( + 6) // { dg-warning "right operand of comma operator has no effect" } + ); + if (b != a[127] + 6 || c != 2 * a[0]) + __builtin_abort (); +} --- gcc/testsuite/g++.dg/cpp/embed-12.C.jj 2024-07-16 12:07:19.451006766 +0200 +++ gcc/testsuite/g++.dg/cpp/embed-12.C 2024-07-16 12:29:27.601065723 +0200 @@ -0,0 +1,34 @@ +// { dg-do compile } +// { dg-options "-Wnonnull" } + +#define A(n) int *p##n +#define B(n) A(n##0), A(n##1), A(n##2), A(n##3), A(n##4), A(n##5), A(n##6), A(n##7) +#define C(n) B(n##0), B(n##1), B(n##2), B(n##3), B(n##4), B(n##5), B(n##6), B(n##7) +#define D C(0), C(1), C(2), C(3) + +void foo (D) __attribute__((nonnull ( // { dg-message "in a call to function '\[^\n\r]*' declared 'nonnull'" } +#embed __FILE__ limit (128) +))); +#if __cplusplus >= 201103L +[[gnu::nonnull ( +#embed __FILE__ limit (128) +)]] void bar (D); // { dg-message "in a call to function '\[^\n\r]*' declared 'nonnull'" "" { target c++11 } } +#else +void bar (D) __attribute__((nonnull ( // { dg-message "in a call to function '\[^\n\r]*' declared 'nonnull'" "" { target c++98_only } } +#embed __FILE__ limit (128) +))); +#endif + +#undef A +#if __cplusplus >= 201103L +#define A(n) nullptr +#else +#define A(n) 0 +#endif + +void +baz () +{ + foo (D); // { dg-warning "argument \[0-9]\+ null where non-null expected" } + bar (D); // { dg-warning "argument \[0-9]\+ null where non-null expected" } +}