From patchwork Mon Oct 5 07:08:14 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tom de Vries
X-Patchwork-Id: 1376621
Subject: [PATCH][ftracer] Add caching of can_duplicate_bb_p
From: Tom de Vries
To: Richard Biener
Cc: Jakub Jelinek, Tobias Burnus, Alexander Monakov, gcc-patches
References: <515e156e-c917-fcd9-7402-4feb991e894c@suse.de>
 <8626a710-09e3-d882-1cbf-6c2228aa9656@suse.de>
Message-ID: <7b2e8361-990f-7aaf-e3d1-7df6247b4db7@suse.de>
Date: Mon, 5 Oct 2020 09:08:14 +0200

[ was: Re: [PATCH][omp, ftracer] Don't duplicate blocks in SIMT region ]

On 10/5/20 9:05 AM, Tom de Vries wrote:
> Ack, updated the patch accordingly, and split it up in two bits, one
> that does refactoring, and one that adds the actual caching:
> - [ftracer] Factor out can_duplicate_bb_p
> - [ftracer] Add caching of can_duplicate_bb_p
>
> I'll post these in reply to this email.

OK?

Thanks,
- Tom

[ftracer] Add caching of can_duplicate_bb_p

The fix "[omp, ftracer] Don't duplicate blocks in SIMT region" adds iteration
over insns in ignore_bb_p, which makes it more expensive.

Counteract this by piggybacking the computation of can_duplicate_bb_p onto
count_insns, which is called at the start of ftracer.

Bootstrapped and reg-tested on x86_64-linux.

gcc/ChangeLog:

2020-10-05  Tom de Vries

        * tracer.c (count_insns): Rename to ...
        (analyze_bb): ... this.
        (cache_can_duplicate_bb_p, cached_can_duplicate_bb_p): New function.
        (ignore_bb_p): Use cached_can_duplicate_bb_p.
        (tail_duplicate): Call cache_can_duplicate_bb_p.

---
 gcc/tracer.c | 47 +++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 41 insertions(+), 6 deletions(-)

diff --git a/gcc/tracer.c b/gcc/tracer.c
index c0e888f6b03..0f69b335b8c 100644
--- a/gcc/tracer.c
+++ b/gcc/tracer.c
@@ -53,7 +53,7 @@
 #include "fibonacci_heap.h"
 #include "tracer.h"
 
-static int count_insns (basic_block);
+static void analyze_bb (basic_block, int *);
 static bool better_p (const_edge, const_edge);
 static edge find_best_successor (basic_block);
 static edge find_best_predecessor (basic_block);
@@ -143,6 +143,33 @@ can_duplicate_bb_p (const_basic_block bb)
   return true;
 }
 
+static sbitmap can_duplicate_bb;
+
+/* Cache VAL as value of can_duplicate_bb_p for BB. */
+static inline void
+cache_can_duplicate_bb_p (const_basic_block bb, bool val)
+{
+  if (val)
+    bitmap_set_bit (can_duplicate_bb, bb->index);
+}
+
+/* Return cached value of can_duplicate_bb_p for BB. */
+static bool
+cached_can_duplicate_bb_p (const_basic_block bb)
+{
+  if (can_duplicate_bb)
+    {
+      unsigned int size = SBITMAP_SIZE (can_duplicate_bb);
+      if ((unsigned int)bb->index < size)
+        return bitmap_bit_p (can_duplicate_bb, bb->index);
+
+      /* Assume added bb's should not be duplicated. */
+      return false;
+    }
+
+  return can_duplicate_bb_p (bb);
+}
+
 /* Return true if we should ignore the basic block for purposes of tracing. */
 bool
 ignore_bb_p (const_basic_block bb)
@@ -152,24 +179,27 @@ ignore_bb_p (const_basic_block bb)
   if (optimize_bb_for_size_p (bb))
     return true;
 
-  return !can_duplicate_bb_p (bb);
+  return !cached_can_duplicate_bb_p (bb);
 }
 
 /* Return number of instructions in the block. */
 
-static int
-count_insns (basic_block bb)
+static void
+analyze_bb (basic_block bb, int *count)
 {
   gimple_stmt_iterator gsi;
   gimple *stmt;
   int n = 0;
+  bool can_duplicate = can_duplicate_bb_no_insn_iter_p (bb);
 
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
     {
       stmt = gsi_stmt (gsi);
       n += estimate_num_insns (stmt, &eni_size_weights);
+      can_duplicate = can_duplicate && can_duplicate_insn_p (stmt);
     }
-  return n;
+  *count = n;
+  cache_can_duplicate_bb_p (bb, can_duplicate);
 }
 
 /* Return true if E1 is more frequent than E2. */
@@ -317,6 +347,8 @@ tail_duplicate (void)
      resize it. */
   bb_seen = sbitmap_alloc (last_basic_block_for_fn (cfun) * 2);
   bitmap_clear (bb_seen);
+  can_duplicate_bb = sbitmap_alloc (last_basic_block_for_fn (cfun));
+  bitmap_clear (can_duplicate_bb);
   initialize_original_copy_tables ();
 
   if (profile_info && profile_status_for_fn (cfun) == PROFILE_READ)
@@ -330,7 +362,8 @@ tail_duplicate (void)
 
   FOR_EACH_BB_FN (bb, cfun)
     {
-      int n = count_insns (bb);
+      int n;
+      analyze_bb (bb, &n);
       if (!ignore_bb_p (bb))
         blocks[bb->index] = heap.insert (-bb->count.to_frequency (cfun), bb);
 
@@ -420,6 +453,8 @@ tail_duplicate (void)
 
   free_original_copy_tables ();
   sbitmap_free (bb_seen);
+  sbitmap_free (can_duplicate_bb);
+  can_duplicate_bb = NULL;
   free (trace);
   free (counts);
 
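
For readers without a GCC tree at hand, the caching scheme in the patch can be sketched as a standalone C program. This is only an illustration under stated assumptions, not the tracer.c code: struct block, cache_begin, cache_end, analyze_block, can_duplicate_slow and can_duplicate_cached are hypothetical names, and a plain byte array stands in for GCC's sbitmap. It shows the three ideas from the patch: the predicate is computed in the same statement walk that sums the cost, lookups for indices beyond the cache size conservatively answer "no", and lookups fall back to the uncached predicate when no cache is active.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a basic block: an index plus per-statement data.  */
struct block
{
  int index;
  int n_stmts;
  const int *stmt_cost;     /* Estimated cost of each statement.  */
  const bool *stmt_dup_ok;  /* Whether each statement may be duplicated.  */
};

/* One byte per block; NULL means no cache is active.  */
static unsigned char *dup_cache;
static size_t dup_cache_size;

/* Allocate a zeroed cache for N_BLOCKS blocks.  Return false on failure.  */
bool
cache_begin (size_t n_blocks)
{
  dup_cache = calloc (n_blocks ? n_blocks : 1, 1);
  dup_cache_size = dup_cache ? n_blocks : 0;
  return dup_cache != NULL;
}

/* Release the cache and mark it inactive.  */
void
cache_end (void)
{
  free (dup_cache);
  dup_cache = NULL;
  dup_cache_size = 0;
}

/* Uncached predicate: walks every statement of B.  */
bool
can_duplicate_slow (const struct block *b)
{
  for (int i = 0; i < b->n_stmts; i++)
    if (!b->stmt_dup_ok[i])
      return false;
  return true;
}

/* Single walk over B that returns its cost and, as a side effect,
   records the duplication predicate in the cache when one is active.  */
int
analyze_block (const struct block *b)
{
  int cost = 0;
  bool ok = true;
  for (int i = 0; i < b->n_stmts; i++)
    {
      cost += b->stmt_cost[i];
      ok = ok && b->stmt_dup_ok[i];
    }
  if (dup_cache && b->index >= 0 && (size_t) b->index < dup_cache_size)
    dup_cache[b->index] = ok;
  return cost;
}

/* Cached lookup.  Blocks outside the cache range are assumed to have
   been created after the cache was filled and are not duplicated.  */
bool
can_duplicate_cached (const struct block *b)
{
  if (dup_cache)
    {
      if (b->index >= 0 && (size_t) b->index < dup_cache_size)
        return dup_cache[b->index] != 0;
      return false;
    }
  return can_duplicate_slow (b);
}
```

In this sketch a caller would run cache_begin once, call analyze_block on every block, query can_duplicate_cached as often as needed, and finish with cache_end, which mirrors the allocate / fill / query / free sequence the patch adds to tail_duplicate.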