From patchwork Thu Jul 25 18:37:15 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Law X-Patchwork-Id: 1964835 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20230601 header.b=QGLKkgkl; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WVKPf2H89z1yXp for ; Fri, 26 Jul 2024 04:37:42 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 8B4593858403 for ; Thu, 25 Jul 2024 18:37:39 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oa1-x35.google.com (mail-oa1-x35.google.com [IPv6:2001:4860:4864:20::35]) by sourceware.org (Postfix) with ESMTPS id AF4D83858C31 for ; Thu, 25 Jul 2024 18:37:18 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org AF4D83858C31 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org AF4D83858C31 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2001:4860:4864:20::35 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1721932642; cv=none; b=beLBbqE3UJ7J06xphxVL34ATPDkubrbrXmsAzIp/V2hGKpm46xZ3xmhCvdwsw/wpNldX5hLrc57u1bFhqLlxq2tp/58AEWJQWu1Dv+pcElj8QxWw7eXh2r970igURHO5mg4Qx6tDjQuZRn8XelMh33ye2FXXcZLJ6NH89ZIo9u0= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1721932642; c=relaxed/simple; bh=X79935dR9bjBFP4Mp3zE8v8TUFT4sss8pdfCEsrWsmw=; h=DKIM-Signature:Message-ID:Date:MIME-Version:From:Subject:To; b=ijuBZTgClvonOKFaO2c2I768ZVPh8OOKG31Iqeri6+dFcKyxRCAONKUNkzZhdrQSUSWSBVftimPb8OYZUBrqZyrMF3xmDGaFnTZnqbKK45W9k8NoXtxzmdqlbd0WGZ2KYbTXAun6biaqT9IF+1XXooF9wRPW6QOy9uIR9LRFyaw= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oa1-x35.google.com with SMTP id 586e51a60fabf-264a12e05b9so319772fac.1 for ; Thu, 25 Jul 2024 11:37:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1721932637; x=1722537437; darn=gcc.gnu.org; h=to:subject:from:content-language:user-agent:mime-version:date :message-id:from:to:cc:subject:date:message-id:reply-to; bh=yu5ybTA2dP5t0ikwfy7weNkfENcAzoeWqvLr6Kis2NY=; b=QGLKkgkl/6hzQLbIPV2we0De8Ukjp005Sfdt19rw91ATFqHOky1ivu4WBFV1RviboV E1cT2bBW8EAZNofgGEUjs/fUU0lOb08G2d4S67HKzsU3i1YxMY61Jmdm/3BuRMxtZMeH 3vwJUijAo7lRsJb+qCmqHrDWsirgP97Tm8rAS6hWBkr5BLEZlLZ1yHRmbyAMgjPAx9l7 N5I3D9NiqUlyHtbiDPSsduGzNMA9IRDExuys5upuDHtMIdukxIhaZly+Q/wCXMzU43De bCZh1qItHu8MURNxXlFggFZ0Il4WWfVvrSqqPv+29yFw4BHS8XANRXKjYjXvpjq/Ak1s 8Lvg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1721932637; x=1722537437; h=to:subject:from:content-language:user-agent:mime-version:date :message-id:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=yu5ybTA2dP5t0ikwfy7weNkfENcAzoeWqvLr6Kis2NY=; b=bEb8e1z4Fvu+mXKCzBE/PWrKSysxe3CXvmBrgOJN097Ltz0HVbQR5rvvRRW00ANAMi XyVswQ+himkTsuKGY6VTaUoGbVfJDn3VwVhsV494hZ9qNDoF9Ju9IKhZzV9q9Xt1nXPj cPoBAeuNayJIdwzpY39zUdgwVd7LFVNry6q3F/OmsAtf4bdB9aavRBYCTrjodH3dxB2k XPHRj901KnXGF5Zg4JIALVTg1D52G3avo4uaPmuRGEQmJhsLg8LNyofnYieHJ5L/hLZu a94tmBOg3Ow8f/tZ1kZ2L27Nq7O2Bli0r61gh1veqTxbUtW3liiNf5USv4RoELzp4z8f /ijQ== X-Gm-Message-State: AOJu0YzxMiMoVk4hi5bC3eNX4vufbZE8JyGFwGDoHF22u+20aw/pLMYd WmcIvZ1fjmeRm81zke3Ces+xqjTul9HgBiZfxkLMp7stOVxV/c9XEljCLw== X-Google-Smtp-Source: AGHT+IFTlHoleSkvQFEHUoHJ9tE9l4GSHUCQBryxYqzCZlgjt/Mp/PuO8Evczc0ccKsjn6InFWWudg== X-Received: by 2002:a05:6870:5590:b0:25d:ff4c:6772 with SMTP id 586e51a60fabf-266cc0efc5bmr3569123fac.3.1721932637128; Thu, 25 Jul 2024 11:37:17 -0700 (PDT) Received: from [172.31.0.109] ([136.36.72.243]) by smtp.gmail.com with ESMTPSA id 586e51a60fabf-265771e5d6bsm369687fac.44.2024.07.25.11.37.15 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 25 Jul 2024 11:37:16 -0700 (PDT) Message-ID: <5e90a119-181a-4444-a6bd-9603fff13dda@gmail.com> Date: Thu, 25 Jul 2024 12:37:15 -0600 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Beta Content-Language: en-US From: Jeff Law Subject: [committed][PR rtl-optimization/116039] Fix life computation for promoted subregs To: "gcc-patches@gcc.gnu.org" X-Spam-Status: No, score=-8.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org So this turned out to be a neat little test and while the fuzzer found it on RISC-V, I wouldn't be surprised if the underlying issue is also the root cause of the loongarch issue with ext-dce. The key issue is that if we have something like (set (dest) (any_extend (subreg (source)))) If the subreg object is marked with SUBREG_PROMOTED and the sign/unsigned state matches the any_extend opcode, then combine (and I guess anything using simplify-rtx) may simplify that to (set (dest) (source)) That implies that bits outside the mode of the subreg are actually live and valid. This needs to be accounted for during liveness computation. We have to be careful here though. If we're too conservative about setting additional bits live, then we'll inhibit the desired optimization in the coremark examples. To do a good job we need to know the extension opcode. I'm extremely unhappy with how the use handling works in ext-dce. It mixes different conceptual steps and has horribly complex control flow. It only handles a subset of the unary/binary opcodes, etc etc. It's just damn mess. It's going to need some more noodling around. In the mean time this is a bit hacky in that it depends on non-obvious behavior to know it can get the extension opcode, but I don't want to leave the trunk in a broken state while I figure out the refactoring problem. Bootstrapped and regression tested on x86 and tested on the crosses. Pushing to the trunk. Jeff commit 34fb0feca71f763b2fbe832548749666d34a4a76 Author: Jeff Law Date: Thu Jul 25 12:32:28 2024 -0600 [PR rtl-optimization/116039] Fix life computation for promoted subregs So this turned out to be a neat little test and while the fuzzer found it on RISC-V, I wouldn't be surprised if the underlying issue is also the root cause of the loongarch issue with ext-dce. The key issue is that if we have something like (set (dest) (any_extend (subreg (source)))) If the subreg object is marked with SUBREG_PROMOTED and the sign/unsigned state matches the any_extend opcode, then combine (and I guess anything using simplify-rtx) may simplify that to (set (dest) (source)) That implies that bits outside the mode of the subreg are actually live and valid. This needs to be accounted for during liveness computation. We have to be careful here though. If we're too conservative about setting additional bits live, then we'll inhibit the desired optimization in the coremark examples. To do a good job we need to know the extension opcode. I'm extremely unhappy with how the use handling works in ext-dce. It mixes different conceptual steps and has horribly complex control flow. It only handles a subset of the unary/binary opcodes, etc etc. It's just damn mess. It's going to need some more noodling around. In the mean time this is a bit hacky in that it depends on non-obvious behavior to know it can get the extension opcode, but I don't want to leave the trunk in a broken state while I figure out the refactoring problem. Bootstrapped and regression tested on x86 and tested on the crosses. Pushing to the trunk. PR rtl-optimization/116039 gcc/ * ext-dce.cc (ext_dce_process_uses): Add some comments about concerns with current code. Mark additional bit groups as live when we have an extension of a suitably promoted subreg. gcc/testsuite * gcc.dg/torture/pr116039.c: New test. diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc index c94d1fc3414..14f163a01d6 100644 --- a/gcc/ext-dce.cc +++ b/gcc/ext-dce.cc @@ -667,6 +667,12 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, if (modify && !skipped_dest && (dst_mask & ~src_mask) == 0) ext_dce_try_optimize_insn (insn, x); + /* Stripping the extension here just seems wrong on multiple + levels. It's source side handling, so it seems like it + belongs in the loop below. Stripping here also makes it + harder than necessary to properly handle live bit groups + for (ANY_EXTEND (SUBREG)) where the SUBREG has + SUBREG_PROMOTED state. */ dst_mask &= src_mask; src = XEXP (src, 0); code = GET_CODE (src); @@ -674,8 +680,8 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, /* Optimization is done at this point. We just want to make sure everything that should get marked as live is marked - from here onward. */ - + from here onward. Shouldn't the backpropagate step happen + before optimization? */ dst_mask = carry_backpropagate (dst_mask, code, src); /* We will handle the other operand of a binary operator @@ -688,7 +694,11 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, /* We're inside a SET and want to process the source operands making things live. Breaking from this loop will cause the iterator to work on sub-rtxs, so it is safe to break - if we see something we don't know how to handle. */ + if we see something we don't know how to handle. + + This code is just hokey as it really just handles trivial + unary and binary cases. Otherwise the loop exits and we + continue iterating on sub-rtxs, but outside the set context. */ unsigned HOST_WIDE_INT save_mask = dst_mask; for (;;) { @@ -704,10 +714,26 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, y = XEXP (y, 0); else if (SUBREG_P (y) && SUBREG_BYTE (y).is_constant ()) { - /* For anything but (subreg (reg)), break the inner loop - and process normally (conservatively). */ - if (!REG_P (SUBREG_REG (y))) + /* We really want to know the outer code here, ie do we + have (ANY_EXTEND (SUBREG ...)) as we need to know if + the extension matches the SUBREG_PROMOTED state. In + that case optimizers can turn the extension into a + simple copy. Which means that bits outside the + SUBREG's mode are actually live. + + We don't want to mark those bits live unnecessarily + as that inhibits extension elimination in important + cases such as those in Coremark. So we need that + outer code. */ + if (!REG_P (SUBREG_REG (y)) + || (SUBREG_PROMOTED_VAR_P (y) + && ((GET_CODE (SET_SRC (x)) == SIGN_EXTEND + && SUBREG_PROMOTED_SIGNED_P (y)) + || (GET_CODE (SET_SRC (x)) == ZERO_EXTEND + && SUBREG_PROMOTED_UNSIGNED_P (y))))) break; + + /* The SUBREG's mode determine the live width. */ bit = subreg_lsb (y).to_constant (); if (dst_mask) { @@ -785,6 +811,11 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, HOST_WIDE_INT size = GET_MODE_BITSIZE (GET_MODE (x)).to_constant (); HOST_WIDE_INT rn = 4 * REGNO (SUBREG_REG (x)); + /* If this is a promoted subreg, then more of it may be live than + is otherwise obvious. */ + if (SUBREG_PROMOTED_VAR_P (x)) + size = GET_MODE_BITSIZE (GET_MODE (SUBREG_REG (x))).to_constant (); + bitmap_set_bit (livenow, rn); if (size > 8) bitmap_set_bit (livenow, rn + 1); diff --git a/gcc/testsuite/gcc.dg/torture/pr116039.c b/gcc/testsuite/gcc.dg/torture/pr116039.c new file mode 100644 index 00000000000..d67b9326de7 --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/pr116039.c @@ -0,0 +1,20 @@ +/* { dg-do run } */ +/* { dg-additional-options "-fsigned-char -fno-strict-aliasing -fwrapv" } */ + +extern void abort (void); + +int c[12]; +char d[12]; +int *f = c; +int *z = (int *)1; +long long y; +int main() { + c[9] = 0xff; + for (int i = 0; i < 12; i += 3) + d[9] = z ? f[i] : 0; + for (long i = 0; i < 12; ++i) + y ^= d[i]; + if (y != -1) + abort (); + return 0; +}