From patchwork Mon May 15 10:35:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aldy Hernandez X-Patchwork-Id: 1781308 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=eKcoyYkt; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4QKbPV6X64z20dX for ; Mon, 15 May 2023 20:35:58 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C7067385701E for ; Mon, 15 May 2023 10:35:56 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org C7067385701E DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1684146956; bh=QN/yeq+n7qL0tIaEIAoaIuLTLKg3iadmRuw/F86uCrc=; h=To:Cc:Subject:Date:List-Id:List-Unsubscribe:List-Archive: List-Post:List-Help:List-Subscribe:From:Reply-To:From; b=eKcoyYktWaiwM1ZqinJokTioylQ8qoCZq9RJyIDwDJBvFjvjjMlpYBgsO7tKyAngA YDZ/2/7P4PG0DoJwh7YVOPjbA05nsW7WbHW0kdZEgs24Ow71h4SvQkN5tPZ0mG8wkj YnnX6vJlWIvTpMCDjvCCzVi/f1ef+wfCnLSiKNEg= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id D44643857001 for ; Mon, 15 May 2023 10:35:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D44643857001 Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-78-PWx44alAMDqbolgvcktwWQ-1; Mon, 15 May 2023 06:35:33 -0400 X-MC-Unique: PWx44alAMDqbolgvcktwWQ-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id D8504868A04; Mon, 15 May 2023 10:35:32 +0000 (UTC) Received: from abulafia.quesejoda.com (unknown [10.39.193.231]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 6E2B31121314; Mon, 15 May 2023 10:35:32 +0000 (UTC) Received: from abulafia.quesejoda.com (localhost [127.0.0.1]) by abulafia.quesejoda.com (8.17.1/8.17.1) with ESMTPS id 34FAZUUF100428 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Mon, 15 May 2023 12:35:30 +0200 Received: (from aldyh@localhost) by abulafia.quesejoda.com (8.17.1/8.17.1/Submit) id 34FAZUtC100427; Mon, 15 May 2023 12:35:30 +0200 To: GCC patches Cc: Andrew MacLeod , Richard Biener , Jakub Jelinek , mjambor@suse.cz, Aldy Hernandez Subject: [PATCH] Add auto-resizing capability to irange's [PR109695] Date: Mon, 15 May 2023 12:35:23 +0200 Message-Id: <20230515103523.100412-1-aldyh@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-11.3 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Aldy Hernandez via Gcc-patches From: Aldy Hernandez Reply-To: Aldy Hernandez Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" We can now have int_range for automatically resizable ranges. int_range_max is now int_range<3, true> for a 69X reduction in size from current trunk, and 6.9X reduction from GCC12. This incurs a 5% performance penalty for VRP that is more than covered by our > 13% improvements recently. int_range_max is the temporary range object we use in the ranger for integers. With the conversion to wide_int, this structure bloated up significantly because wide_ints are huge (80 bytes a piece) and are about 10 times as big as a plain tree. Since the temporary object requires 255 sub-ranges, that's 255 * 80 * 2, plus the control word. This means the structure grew from 4112 bytes to 40912 bytes. This patch adds the ability to resize ranges as needed, defaulting to no resizing, while int_range_max now defaults to 3 sub-ranges (instead of 255) and grows to 255 when the range being calculated does not fit. For example: int_range<1> foo; // 1 sub-range with no resizing. int_range<5> foo; // 5 sub-ranges with no resizing. int_range<5, true> foo; // 5 sub-ranges with resizing. I ran some tests and found that 3 sub-ranges cover 99% of cases, so I've set the int_range_max default to that: typedef int_range<3, /*RESIZABLE=*/true> int_range_max; We don't bother growing incrementally, since the default covers most cases and we have a 255 hard-limit. This hard limit could be reduced to 128, since my tests never saw a range needing more than 124, but we could do that as a follow-up if needed. With 3-subranges, int_range_max is now 592 bytes versus 40912 for trunk, and versus 4112 bytes for GCC12! The penalty is 5.04% for VRP and 3.02% for threading, with no noticeable change in overall compilation (0.27%). This is more than covered by our 13.26% improvements for the legacy removal + wide_int conversion. I think this approach is a good alternative, while providing us with flexibility going forward. For example, we could try defaulting to a 8 sub-ranges for a noticeable improvement in VRP. We could also use large sub-ranges for switch analysis to avoid resizing. Another approach I tried was always resizing. With this, we could drop the whole int_range nonsense, and have irange just hold a resizable range. This simplified things, but incurred a 7% penalty on ipa_cp. This was hard to pinpoint, and I'm not entirely convinced this wasn't some artifact of valgrind. However, until we're sure, let's avoid massive changes, especially since IPA changes are coming up. For the curious, a particular hot spot for IPA in this area was: ipcp_vr_lattice::meet_with_1 (const value_range *other_vr) { ... ... value_range save (m_vr); m_vr.union_ (*other_vr); return m_vr != save; } The problem isn't the resizing (since we do that at most once) but the fact that for some functions with lots of callers we end up a huge range that gets copied and compared for every meet operation. Maybe the IPA algorithm could be adjusted somehow??. Anywhooo... for now there is nothing to worry about, since value_range still has 2 subranges and is not resizable. But we should probably think what if anything we want to do here, as I envision IPA using infinite ranges here (well, int_range_max) and handling frange's, etc. I'll hold off a day or two, as I'd appreciate feedback here. gcc/ChangeLog: PR tree-optimization/109695 * value-range.cc (irange::operator=): Resize range. (irange::union_): Same. (irange::intersect): Same. (irange::invert): Same. (int_range_max): Default to 3 sub-ranges and resize as needed. * value-range.h (irange::maybe_resize): New. (~int_range): New. (int_range::int_range): Adjust for resizing. (int_range::operator=): Same. --- gcc/value-range.cc | 14 +++++++ gcc/value-range.h | 98 ++++++++++++++++++++++++++++++++-------------- 2 files changed, 82 insertions(+), 30 deletions(-) diff --git a/gcc/value-range.cc b/gcc/value-range.cc index def9299dc0e..cea4ff59254 100644 --- a/gcc/value-range.cc +++ b/gcc/value-range.cc @@ -901,6 +901,9 @@ frange::set_nonnegative (tree type) irange & irange::operator= (const irange &src) { + int needed = src.num_pairs (); + maybe_resize (needed); + unsigned x; unsigned lim = src.m_num_ranges; if (lim > m_max_ranges) @@ -1340,6 +1343,7 @@ irange::union_ (const vrange &v) // Now it simply needs to be copied, and if there are too many // ranges, merge some. We wont do any analysis as to what the // "best" merges are, simply combine the final ranges into one. + maybe_resize (i / 2); if (i > m_max_ranges * 2) { res[m_max_ranges * 2 - 1] = res[i - 1]; @@ -1439,6 +1443,11 @@ irange::intersect (const vrange &v) if (r.irange_contains_p (*this)) return intersect_nonzero_bits (r); + // ?? We could probably come up with something smarter than the + // worst case scenario here. + int needed = num_pairs () + r.num_pairs (); + maybe_resize (needed); + signop sign = TYPE_SIGN (m_type); unsigned bld_pair = 0; unsigned bld_lim = m_max_ranges; @@ -1646,6 +1655,11 @@ irange::invert () m_num_ranges = 1; return; } + + // At this point, we need one extra sub-range to represent the + // inverse. + maybe_resize (m_num_ranges + 1); + // The algorithm is as follows. To calculate INVERT ([a,b][c,d]), we // generate [-MIN, a-1][b+1, c-1][d+1, MAX]. // diff --git a/gcc/value-range.h b/gcc/value-range.h index 22b0250b11b..0da2a42764a 100644 --- a/gcc/value-range.h +++ b/gcc/value-range.h @@ -167,9 +167,10 @@ public: void set_nonzero_bits (const wide_int &bits); protected: + void maybe_resize (int needed); virtual void set (tree, tree, value_range_kind = VR_RANGE) override; virtual bool contains_p (tree cst) const override; - irange (wide_int *, unsigned); + irange (wide_int *, unsigned nranges, bool resizable); // In-place operators. bool irange_contains_p (const irange &) const; @@ -179,6 +180,8 @@ protected: void verify_range (); + // Hard limit on max ranges allowed. + static const int HARD_MAX_RANGES = 255; private: friend void gt_ggc_mx (irange *); friend void gt_pch_nx (irange *); @@ -192,16 +195,22 @@ private: bool intersect (const wide_int& lb, const wide_int& ub); unsigned char m_num_ranges; - const unsigned char m_max_ranges; + bool m_resizable; + unsigned char m_max_ranges; tree m_type; wide_int m_nonzero_mask; +protected: wide_int *m_base; }; // Here we describe an irange with N pairs of ranges. The storage for // the pairs is embedded in the class as an array. +// +// If RESIZABLE is true, the storage will be resized on the heap when +// the number of ranges needed goes past N up to a max of +// HARD_MAX_RANGES. This new storage is freed upon destruction. -template +template class GTY((user)) int_range : public irange { public: @@ -211,7 +220,7 @@ public: int_range (tree type); int_range (const int_range &); int_range (const irange &); - virtual ~int_range () = default; + virtual ~int_range (); int_range& operator= (const int_range &); protected: int_range (tree, tree, value_range_kind = VR_RANGE); @@ -451,6 +460,38 @@ is_a (vrange &v) return v.m_discriminator == VR_FRANGE; } +// For resizable ranges, resize the range up to HARD_MAX_RANGES if the +// NEEDED pairs is greater than the current capacity of the range. + +inline void +irange::maybe_resize (int needed) +{ + if (!m_resizable || m_max_ranges == HARD_MAX_RANGES) + return; + + if (needed > m_max_ranges) + { + m_max_ranges = HARD_MAX_RANGES; + wide_int *newmem = new wide_int[m_max_ranges * 2]; + memcpy (newmem, m_base, sizeof (wide_int) * num_pairs () * 2); + m_base = newmem; + } +} + +template +inline +int_range::~int_range () +{ + if (RESIZABLE && m_base != m_ranges) + delete m_base; +} + +// This is an "infinite" precision irange for use in temporary +// calculations. It starts with a sensible default covering 99% of +// uses, and goes up to HARD_MAX_RANGES when needed. Any allocated +// storage is freed upon destruction. +typedef int_range<3, /*RESIZABLE=*/true> int_range_max; + class vrange_visitor { public: @@ -461,10 +502,6 @@ public: typedef int_range<2> value_range; -// This is an "infinite" precision irange for use in temporary -// calculations. -typedef int_range<255> int_range_max; - // This is an "infinite" precision range object for use in temporary // calculations for any of the handled types. The object can be // transparently used as a vrange. @@ -757,8 +794,9 @@ gt_pch_nx (int_range *x, gt_pointer_operator op, void *cookie) // Constructors for irange inline -irange::irange (wide_int *base, unsigned nranges) +irange::irange (wide_int *base, unsigned nranges, bool resizable) : vrange (VR_IRANGE), + m_resizable (resizable), m_max_ranges (nranges) { m_base = base; @@ -767,52 +805,52 @@ irange::irange (wide_int *base, unsigned nranges) // Constructors for int_range<>. -template +template inline -int_range::int_range () - : irange (m_ranges, N) +int_range::int_range () + : irange (m_ranges, N, RESIZABLE) { } -template -int_range::int_range (const int_range &other) - : irange (m_ranges, N) +template +int_range::int_range (const int_range &other) + : irange (m_ranges, N, RESIZABLE) { irange::operator= (other); } -template -int_range::int_range (tree min, tree max, value_range_kind kind) - : irange (m_ranges, N) +template +int_range::int_range (tree min, tree max, value_range_kind kind) + : irange (m_ranges, N, RESIZABLE) { irange::set (min, max, kind); } -template -int_range::int_range (tree type) - : irange (m_ranges, N) +template +int_range::int_range (tree type) + : irange (m_ranges, N, RESIZABLE) { set_varying (type); } -template -int_range::int_range (tree type, const wide_int &wmin, const wide_int &wmax, +template +int_range::int_range (tree type, const wide_int &wmin, const wide_int &wmax, value_range_kind kind) - : irange (m_ranges, N) + : irange (m_ranges, N, RESIZABLE) { set (type, wmin, wmax, kind); } -template -int_range::int_range (const irange &other) - : irange (m_ranges, N) +template +int_range::int_range (const irange &other) + : irange (m_ranges, N, RESIZABLE) { irange::operator= (other); } -template -int_range& -int_range::operator= (const int_range &src) +template +int_range& +int_range::operator= (const int_range &src) { irange::operator= (src); return *this;