From patchwork Tue Aug 13 17:48:23 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xianmiao Qu X-Patchwork-Id: 1972048 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.a=rsa-sha256 header.s=default header.b=HTCkrI+M; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WjzQk3Yzqz1yXl for ; Wed, 14 Aug 2024 03:49:02 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id AAD48385B82F for ; Tue, 13 Aug 2024 17:49:00 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from out30-119.freemail.mail.aliyun.com (out30-119.freemail.mail.aliyun.com [115.124.30.119]) by sourceware.org (Postfix) with ESMTPS id 6F5443858404 for ; Tue, 13 Aug 2024 17:48:38 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6F5443858404 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linux.alibaba.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 6F5443858404 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=115.124.30.119 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1723571321; cv=none; b=SLVvE4WmhI2C08/gPGqtZ0Mtpp0eQdpKSLgGzS+XjWy5/NytTfoxp6YLEITL4x1wWVDUSxsVXmbv3e8OuZ+LBafWwksukF9uRlu3fBU721iGz/IPPEhEH08XxG+nS6IXeJ2briBNRXGVcO3KMzlQ7CdtiWZnRY4QQiw4mzFRrXA= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1723571321; c=relaxed/simple; bh=xy1DZTuflOo5Zp5qMlmQPITZbCSqzUjdSgoNyw0Qo/0=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=Vr15/w/SY9i8i4FF7Yyv4u+OMbewee1bxFdnEHkEQeYegynpLw8Fe/8aH1EHk61JOi+2f43ECXmFrMwzg+/Q4qH+FbmuraIXyvqLPR2KFnsdc/FGVBZi2yEkO8xR8pgYTtFVojrcndZX0EwnOCWYtagtqnvv8ku+k9ih8aw0VIw= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1723571316; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=ccAoQ2JqSGjao5qn5h6n99+30SRSNUufRuWEotOuZDM=; b=HTCkrI+MIIV966//c7MjdGl4fuVXUlhR/cAaoqI0U8+RbBki8tZQ0lDZJxNWutBxtLQlr5F8k08983WaVPi9Hq+gi2ZPhq/XVEbUgJppmm+mAiZNKtNVsuJI4bxIPzdKQcAwFfE8dlPSq4OU/8smUz8HXb6+q9CtT+K+Ny8+McY= Received: from localhost(mailfrom:cooper.qu@linux.alibaba.com fp:SMTPD_---0WCpgLUw_1723571312) by smtp.aliyun-inc.com; Wed, 14 Aug 2024 01:48:34 +0800 From: Xianmiao Qu To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com Cc: Xianmiao Qu Subject: [PATCH v4] genoutput: Accelerate the place_operands function. Date: Wed, 14 Aug 2024 01:48:23 +0800 Message-Id: <20240813174823.14950-1-cooper.qu@linux.alibaba.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) MIME-Version: 1.0 X-Spam-Status: No, score=-27.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, ENV_AND_HDR_SPF_MATCH, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE, UNPARSEABLE_RELAY, USER_IN_DEF_DKIM_WL, USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org With the increase in the number of modes and patterns for some backend architectures, the place_operands function becomes a bottleneck int the speed of genoutput, and may even become a bottleneck int the overall speed of building the GCC project. This patch aims to accelerate the place_operands function, the optimizations it includes are: 1. Use a hash table to store operand information, improving the lookup time for the first operand. 2. Move mode comparison to the beginning to avoid the scenarios of most strcmp. I tested the speed improvements for the following backends, Improvement Ratio x86_64 197.9% aarch64 954.5% riscv 2578.6% If the build machine is slow, then this improvement can save a lot of time. I tested the genoutput output for x86_64/aarch64/riscv backends, and there was no difference compared to before the optimization, so this shouldn't introduce any functional issues. gcc/ * genoutput.cc (struct operand_data): Add member 'eq_next' to point to the next member with the same hash value in the hash table. (compare_operands): Move the comparison of the mode to the very beginning to accelerate the comparison of the two operands. (struct operand_data_hasher): New, a class that takes into account the necessary elements for comparing the equality of two operands in its hash value. (operand_data_hasher::hash): New. (operand_data_hasher::equal): New. (operand_datas): New, hash table of konwn pattern operands. (place_operands): Use a hash table instead of traversing the array to find the same operand. (main): Add initialization of the hash table 'operand_datas'. --- gcc/genoutput.cc | 105 ++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 85 insertions(+), 20 deletions(-) diff --git a/gcc/genoutput.cc b/gcc/genoutput.cc index efd81766bb5b..cca1b6622d47 100644 --- a/gcc/genoutput.cc +++ b/gcc/genoutput.cc @@ -91,6 +91,7 @@ along with GCC; see the file COPYING3. If not see #include "errors.h" #include "read-md.h" #include "gensupport.h" +#include "hash-table.h" /* No instruction can have more operands than this. Sorry for this arbitrary limit, but what machine will have an instruction with @@ -112,6 +113,8 @@ static int next_operand_number = 1; struct operand_data { struct operand_data *next; + /* Point to the next member with the same hash value in the hash table. */ + struct operand_data *eq_next; int index; const char *predicate; const char *constraint; @@ -127,7 +130,7 @@ struct operand_data static struct operand_data null_operand = { - 0, 0, "", "", E_VOIDmode, 0, 0, 0, 0, 0 + 0, 0, 0, "", "", E_VOIDmode, 0, 0, 0, 0, 0 }; static struct operand_data *odata = &null_operand; @@ -532,6 +535,13 @@ compare_operands (struct operand_data *d0, struct operand_data *d1) { const char *p0, *p1; + /* On one hand, comparing strings for predicate and constraint + is time-consuming, and on the other hand, the probability of + different modes is relatively high. Therefore, checking the mode + first can speed up the execution of the program. */ + if (d0->mode != d1->mode) + return 0; + p0 = d0->predicate; if (!p0) p0 = ""; @@ -550,9 +560,6 @@ compare_operands (struct operand_data *d0, struct operand_data *d1) if (strcmp (p0, p1) != 0) return 0; - if (d0->mode != d1->mode) - return 0; - if (d0->strict_low != d1->strict_low) return 0; @@ -562,6 +569,47 @@ compare_operands (struct operand_data *d0, struct operand_data *d1) return 1; } +/* This is a class that takes into account the necessary elements for + comparing the equality of two operands in its hash value. */ +struct operand_data_hasher : nofree_ptr_hash +{ + static inline hashval_t hash (const operand_data *); + static inline bool equal (const operand_data *, const operand_data *); +}; + +hashval_t +operand_data_hasher::hash (const operand_data * op_info) +{ + inchash::hash h; + const char *pred, *cons; + + pred = op_info->predicate; + if (!pred) + pred = ""; + h.add (pred, strlen (pred) + 1); + + cons = op_info->constraint; + if (!cons) + cons = ""; + h.add (cons, strlen (cons) + 1); + + h.add_object (op_info->mode); + h.add_object (op_info->strict_low); + h.add_object (op_info->eliminable); + return h.end (); +} + +bool +operand_data_hasher::equal (const operand_data * op_info1, + const operand_data * op_info2) +{ + return compare_operands (const_cast(op_info1), + const_cast(op_info2)); +} + +/* Hashtable of konwn pattern operands. */ +static hash_table *operand_datas; + /* Scan the list of operands we've already committed to output and either find a subsequence that is the same, or allocate a new one at the end. */ @@ -569,6 +617,7 @@ static void place_operands (class data *d) { struct operand_data *od, *od2; + struct operand_data **slot; int i; if (d->n_operands == 0) @@ -577,23 +626,24 @@ place_operands (class data *d) return; } + od = operand_datas->find (&d->operand[0]); /* Brute force substring search. */ - for (od = odata, i = 0; od; od = od->next, i = 0) - if (compare_operands (od, &d->operand[0])) - { - od2 = od->next; - i = 1; - while (1) - { - if (i == d->n_operands) - goto full_match; - if (od2 == NULL) - goto partial_match; - if (! compare_operands (od2, &d->operand[i])) - break; - ++i, od2 = od2->next; - } - } + for (; od; od = od->eq_next) + { + od2 = od->next; + i = 1; + while (1) + { + if (i == d->n_operands) + goto full_match; + if (od2 == NULL) + goto partial_match; + if (! compare_operands (od2, &d->operand[i])) + break; + ++i, od2 = od2->next; + } + } + i = 0; /* Either partial match at the end of the list, or no match. In either case, we tack on what operands are remaining to the end of the list. */ @@ -605,6 +655,20 @@ place_operands (class data *d) *odata_end = od2; odata_end = &od2->next; od2->index = next_operand_number++; + /* Insert the operand_data variable OD2 into the hash table. + If a variable with the same hash value already exists in + the hash table, insert the element at the end of the + linked list connected through the eq_next member. */ + slot = operand_datas->find_slot (od2, INSERT); + if (*slot) + { + struct operand_data *last = (struct operand_data *) *slot; + while (last->eq_next) + last = last->eq_next; + last->eq_next = od2; + } + else + *slot = od2; } *odata_end = NULL; return; @@ -1049,6 +1113,7 @@ main (int argc, const char **argv) progname = "genoutput"; init_insn_for_nothing (); + operand_datas = new hash_table (1024); if (!init_rtx_reader_args (argc, argv)) return (FATAL_EXIT_CODE);