From patchwork Tue Jun 18 05:53:54 2024
X-Patchwork-Submitter: Jeff Law
X-Patchwork-Id: 1948942
Date: Mon, 17 Jun 2024 23:53:54 -0600
From: Jeff Law
Subject: [to-be-committed][RISC-V] Improve bset generation when bit position is limited
To: "gcc-patches@gcc.gnu.org"
List-Id: Gcc-patches mailing list

So more work in the ongoing effort to make better use of the Zbs
extension.  This time we're trying to exploit knowledge of the shift
count/bit position to allow us to use a bset instruction.

Consider this expression in SImode

  (1 << (pos & 0xf))

The shift count is at most 15, so none of the resulting values will
have bit 31 set.
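To make the bit-31 claim concrete, here is a small C sketch. It is not part of the patch, and the helper names `bit_for` and `extensions_agree` are made up for illustration. It walks a range of `pos` values and checks that the 32-bit result never has its sign bit set, so sign-extending and zero-extending it to 64 bits give the same value.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the SImode expression (1 << (pos & 0xf)). */
static uint32_t bit_for(uint32_t pos)
{
    return UINT32_C(1) << (pos & 0xf);
}

/* Returns 1 if, for every tested pos, bit 31 is clear and the sign- and
   zero-extended 64-bit values are identical; returns 0 otherwise. */
static int extensions_agree(void)
{
    for (uint32_t pos = 0; pos < 256; pos++) {
        uint32_t v = bit_for(pos);
        uint64_t zext = (uint64_t)v;                    /* zero extension */
        uint64_t sext = (uint64_t)(int64_t)(int32_t)v;  /* sign extension */

        if ((v & UINT32_C(0x80000000)) != 0)
            return 0;
        if (zext != sext)
            return 0;
    }
    return 1;
}
```

Calling `extensions_agree()` from any test driver should return 1, since the masked count tops out at 15 and the largest value produced is 0x8000.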
So if there's an explicit zero or sign extension to DI we can drop
that explicit extension and generate a simple bset with x0 as the
input value.

Or another example (which I think came from spec at some point and
IIRC was the primary motivation for this patch):

  (1 << (7-(pos) % 8))

Before this change they'd generate something like this respectively:

        li      a5,1
        andi    a0,a0,15
        sllw    a0,a5,a0

        li      a5,7
        andn    a0,a5,a0
        li      a5,1
        sllw    a0,a5,a0

After this change they generate:

        andi    a0,a0,15        # 9     [c=4 l=4]  *anddi3/1
        bset    a0,x0,a0        # 17    [c=8 l=4]  *bsetdi_2

        li      a5,7            # 27    [c=4 l=4]  *movdi_64bit/1
        andn    a0,a5,a0        # 28    [c=4 l=4]  and_notdi3
        bset    a0,x0,a0        # 19    [c=8 l=4]  *bsetdi_2

We achieve this with simple define_splits which target the bsetdi_2
pattern I recently added.  Much better than the original
implementation I did a few months back :-)

I've got a bclr/binv variant from a few months back as well, but it
needs to be updated to the simpler implementation found here.

Just ran this through my tester.  Will wait for the precommit CI to
render its verdict before moving forward.

Jeff

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 094bc2acf1c..dc7a7e7fba7 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -609,6 +609,36 @@ (define_insn "*bsetdi_2"
   "bset\t%0,x0,%1"
   [(set_attr "type" "bitmanip")])
 
+;; These two splitters take advantage of the limited range of the
+;; shift constant.  With the limited range we know the SImode sign
+;; bit is never set, thus we can treat this as zero extending and
+;; generate the bsetdi_2 pattern.
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+        (any_extend:DI
+         (ashift:SI (const_int 1)
+                    (subreg:QI (and:DI (not:DI (match_operand:DI 1 "register_operand"))
+                                       (match_operand 2 "const_int_operand")) 0))))
+   (clobber (match_operand:DI 3 "register_operand"))]
+  "TARGET_64BIT
+   && TARGET_ZBS
+   && (TARGET_ZBB || TARGET_ZBKB)
+   && (INTVAL (operands[2]) & 0x1f) != 0x1f"
+   [(set (match_dup 0) (and:DI (not:DI (match_dup 1)) (match_dup 2)))
+    (set (match_dup 0) (zero_extend:DI (ashift:SI (const_int 1) (subreg:QI (match_dup 0) 0))))])
+
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+        (any_extend:DI
+         (ashift:SI (const_int 1)
+                    (subreg:QI (and:DI (match_operand:DI 1 "register_operand")
+                                       (match_operand 2 "const_int_operand")) 0))))]
+  "TARGET_64BIT
+   && TARGET_ZBS
+   && (INTVAL (operands[2]) & 0x1f) != 0x1f"
+   [(set (match_dup 0) (and:DI (match_dup 1) (match_dup 2)))
+    (set (match_dup 0) (zero_extend:DI (ashift:SI (const_int 1) (subreg:QI (match_dup 0) 0))))])
+
 (define_insn "*bset_1_mask"
   [(set (match_operand:X 0 "register_operand" "=r")
         (ashift:X (const_int 1)
diff --git a/gcc/testsuite/gcc.target/riscv/zbs-ext-2.c b/gcc/testsuite/gcc.target/riscv/zbs-ext-2.c
new file mode 100644
index 00000000000..301bc9d89c4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbs-ext-2.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zbb_zbs -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-Os" } } */
+
+
+typedef unsigned int uint32_t;
+uint32_t foo(uint32_t pos)
+{
+    return (1 << (7-(pos) % 8));
+}
+
+typedef unsigned int uint32_t;
+uint32_t foo2(uint32_t pos)
+{
+    return (1 << (pos & 0xf));
+}
+
+/* { dg-final { scan-assembler-not "sll\t" } } */
+/* { dg-final { scan-assembler-times "bset\t" 2 } } */
+/* { dg-final { scan-assembler-times "andi\t" 1 } } */
+/* { dg-final { scan-assembler-times "andn\t" 1 } } */
+/* { dg-final { scan-assembler-times "li\t" 1 } } */
+/* { dg-final { scan-assembler-times "ret" 2 } } */
+
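A side note on the `foo` test above, as a sketch rather than anything taken from the patch: the `li a5,7` / `andn a0,a5,a0` pair works because, for an unsigned `pos`, the shift count `7 - (pos % 8)` is the same value as `7 & ~pos`, which is the and-not form matched by the first splitter. The function names below are hypothetical and exist only to check that identity.

```c
#include <stdint.h>

/* The shift count as written in foo() from the new test. */
static uint32_t count_as_written(uint32_t pos)
{
    return 7 - (pos % 8);
}

/* The same count in and-not form, i.e. what andn computes with 7 in a5. */
static uint32_t count_as_andn(uint32_t pos)
{
    return UINT32_C(7) & ~pos;
}

/* Returns 1 if both forms agree on a range of low values and on a few
   values with high bits set; returns 0 on the first mismatch. */
static int counts_agree(void)
{
    static const uint32_t extras[] = {
        UINT32_C(0x7fffffff), UINT32_C(0x80000000), UINT32_C(0xffffffff)
    };

    for (uint32_t pos = 0; pos < 1024; pos++)
        if (count_as_written(pos) != count_as_andn(pos))
            return 0;

    for (unsigned i = 0; i < sizeof extras / sizeof extras[0]; i++)
        if (count_as_written(extras[i]) != count_as_andn(extras[i]))
            return 0;

    return 1;
}
```

Either form keeps the count in the range 0..7, so the resulting bit is always well below bit 31 and the same extension-dropping argument applies.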