From patchwork Wed Jul 17 13:55:08 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 1961635
Date: Wed, 17 Jul 2024 15:55:08 +0200
From: Jakub Jelinek
To: Richard Biener
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] varasm: Shorten assembly of strings with larger zero regions
List-Id: Gcc-patches mailing list
Reply-To: Jakub Jelinek

Hi!

When not using .base64 directive, we emit for long sequences of zeros
        .string "foobarbaz"
        .string ""
        .string ""
        .string ""
        .string ""
        .string ""
        .string ""
        .string ""
        .string ""
        .string ""
        .string ""
        .string ""
        .string ""
The following patch changes that to
        .string "foobarbaz"
        .zero   12
It keeps emitting .string "" if there is just one zero or two zeros where
the first one is preceded by non-zeros, so we can have
        .string "foobarbaz"
        .string ""
or
        .base64 "VG8gYmUgb3Igbm90IHRvIGJlLCB0aGF0IGlzIHRoZSBxdWVzdGlvbg=="
        .string ""
but not 2 .string "" in a row.

On a testcase I have with around 310440 0-255 unsigned char character
constants mostly derived from cc1plus start but with too long sequences of
0s which broke transformation to STRING_CST adjusted to have at most 126
consecutive 0s, I see:
1504498 bytes long assembly without this patch on i686-linux (without .base64 support in binutils)
1155071 bytes long assembly with this patch on i686-linux (without .base64 support in binutils)
431390 bytes long assembly without this patch on x86_64-linux (with .base64 support in binutils)
427593 bytes long assembly with this patch on x86_64-linux (with .base64 support in binutils)
All 4 assemble to identical *.o file when using x86_64-linux .base64
supporting gas, and the former 2 when using older x86_64-linux gas
assemble to identical content as well.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-07-17  Jakub Jelinek

        * varasm.cc (default_elf_asm_output_ascii): Use ASM_OUTPUT_SKIP instead
        of 2 or more default_elf_asm_output_limited_string (f, "") calls and
        adjust base64 heuristics correspondingly.

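For illustration only, the emission rule described above can be modelled by a small standalone C function. This is a sketch under stated assumptions, not the code in varasm.cc: the name emit_model and its buffer interface are hypothetical, the input is assumed to hold only NUL bytes and printable characters other than '"' and '\', and escapes, ELF_STRING_LIMIT, the line chunking and .base64 are all left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of the rule from the description; not GCC's code.
   Writes the directives for s[0..len) into out and returns 0, or -1 if
   out is too small.  Assumes no bytes that would need escaping.  */
static int
emit_model (const unsigned char *s, size_t len, char *out, size_t outsz)
{
  const unsigned char *limit = s + len;
  size_t used = 0;

  assert (out != NULL && outsz > 0);
  out[0] = '\0';

  while (s < limit)
    {
      const unsigned char *p = s;
      int n;

      /* Find the next NUL byte at or after s.  */
      while (p < limit && *p != '\0')
        p++;

      if (p == limit)
        {
          /* No NUL left: emit the tail without a terminator.  */
          n = snprintf (out + used, outsz - used, "\t.ascii\t\"%.*s\"\n",
                        (int) (p - s), (const char *) s);
          s = p;
        }
      else if (p == s && p + 1 < limit && p[1] == '\0')
        {
          /* Two or more zeros in a row: one .zero covers the whole run.  */
          for (p = s + 2; p < limit && *p == '\0'; p++)
            continue;
          n = snprintf (out + used, outsz - used, "\t.zero\t%zu\n",
                        (size_t) (p - s));
          s = p;
        }
      else
        {
          /* Text plus its terminating NUL, or a lone NUL: .string.  */
          n = snprintf (out + used, outsz - used, "\t.string\t\"%.*s\"\n",
                        (int) (p - s), (const char *) s);
          s = p + 1;
        }

      if (n < 0 || (size_t) n >= outsz - used)
        return -1;
      used += (size_t) n;
    }
  return 0;
}
```

Fed "foobarbaz" followed by 13 zero bytes, the model produces the two-line .string/.zero 12 form shown earlier; with only two trailing zeros it falls back to .string "foobarbaz" followed by a single .string "".
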
        Jakub

--- gcc/varasm.cc.jj	2024-07-16 13:36:54.259748720 +0200
+++ gcc/varasm.cc	2024-07-16 14:08:19.211753867 +0200
@@ -8538,6 +8538,7 @@ default_elf_asm_output_ascii (FILE *f, c
       if (s >= last_base64)
         {
           unsigned cnt = 0;
+          unsigned char prev_c = ' ';
           const char *t;
           for (t = s; t < limit && (t - s) < (long) ELF_STRING_LIMIT - 1; t++)
             {
@@ -8560,7 +8561,13 @@ default_elf_asm_output_ascii (FILE *f, c
                   break;
                 case 1:
                   if (c == 0)
-                    cnt += 2 + strlen (STRING_ASM_OP) + 1;
+                    {
+                      if (prev_c == 0
+                          && t + 1 < limit
+                          && (t + 1 - s) < (long) ELF_STRING_LIMIT - 1)
+                        break;
+                      cnt += 2 + strlen (STRING_ASM_OP) + 1;
+                    }
                   else
                     cnt += 4;
                   break;
@@ -8568,6 +8575,7 @@ default_elf_asm_output_ascii (FILE *f, c
                   cnt += 2;
                   break;
                 }
+              prev_c = c;
             }
           if (cnt > ((unsigned) (t - s) + 2) / 3 * 4 && (t - s) >= 3)
             {
@@ -8633,8 +8641,18 @@ default_elf_asm_output_ascii (FILE *f, c
               bytes_in_chunk = 0;
             }
 
-          default_elf_asm_output_limited_string (f, s);
-          s = p;
+          if (p == s && p + 1 < limit && p[1] == '\0')
+            {
+              for (p = s + 2; p < limit && *p == '\0'; p++)
+                continue;
+              ASM_OUTPUT_SKIP (f, (unsigned HOST_WIDE_INT) (p - s));
+              s = p - 1;
+            }
+          else
+            {
+              default_elf_asm_output_limited_string (f, s);
+              s = p;
+            }
         }
       else
         {
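
As a rough standalone illustration of the adjusted counting in the hunks above, the sketch below estimates the cost of the .string/.zero form and compares it with the base64 length, using the same ((n + 2) / 3 * 4) bound as the patch. Everything named model_* is hypothetical: the escape classification is only a stand-in for ELF_ASCII_ESCAPES, MODEL_STRING_ASM_OP assumes the common "\t.string\t" spelling, and the ELF_STRING_LIMIT window and the other checks in the real loop are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: the usual ELF spelling of STRING_ASM_OP.  */
#define MODEL_STRING_ASM_OP "\t.string\t"

/* Hypothetical stand-in for ELF_ASCII_ESCAPES: 0 = emitted as is,
   1 = 4-character octal escape, 2 = two-character escape.  */
static int
model_escape_class (unsigned char c)
{
  if (c == '"' || c == '\\' || c == '\n' || c == '\t')
    return 2;
  if (c >= 0x20 && c < 0x7f)
    return 0;
  return 1;
}

/* Estimated number of assembly characters needed for data[0..len)
   without .base64, following the patched counting loop.  */
static unsigned
model_string_cost (const unsigned char *data, size_t len)
{
  unsigned cnt = 0;
  unsigned char prev_c = ' ';

  assert (data != NULL || len == 0);
  for (size_t i = 0; i < len; i++)
    {
      unsigned char c = data[i];
      switch (model_escape_class (c))
        {
        case 0:
          cnt++;
          break;
        case 1:
          if (c == 0)
            {
              /* A zero that follows a zero and is not the last byte is
                 folded into a .zero run, so it is counted as free.  */
              if (prev_c == 0 && i + 1 < len)
                break;
              cnt += 2 + (unsigned) strlen (MODEL_STRING_ASM_OP) + 1;
            }
          else
            cnt += 4;
          break;
        default:
          cnt += 2;
          break;
        }
      prev_c = c;
    }
  return cnt;
}

/* Nonzero if the model would pick .base64 for data[0..len).  */
static int
model_prefers_base64 (const unsigned char *data, size_t len)
{
  unsigned base64_len = ((unsigned) len + 2) / 3 * 4;
  return len >= 3 && model_string_cost (data, len) > base64_len;
}
```

In this model only the first zero of a run (and a zero that ends the input) is charged the price of a separate .string "" line, which is what keeps long zero regions from tipping the comparison towards .base64 as strongly as before.
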