From patchwork Tue Jul 9 18:05:50 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Law
X-Patchwork-Id: 1958569
Message-ID: <49edf2b0-ed30-4baf-87d0-a729164357e4@ventanamicro.com>
Date: Tue, 9 Jul 2024 12:05:50 -0600
From: Jeff Law
To: "gcc-patches@gcc.gnu.org"
Subject: [to-be-committed] [RISC-V] Improve bset generation for another case
List-Id: Gcc-patches mailing list

So another minor improvement for bitmanip code generation.

Essentially we have a pattern which matches a bset idiom for
x = zero_extend (1 << n).  That pattern only handles SI->DI extension.
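As a rough illustration of the idiom described above, the following C sketch models "bset rd,x0,rs" on rv64 and compares it with a 32-bit 1 << n that is zero extended to 64 bits.  The function names (bset_x0, zext_shift_si, check_si_to_di) are hypothetical and exist only for this example; the model assumes the Zbs behaviour that only the low 6 bits of the index register are used on rv64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "bset rd,x0,rs" on rv64: x0 supplies an
   all-zero source and only the low 6 bits of rs select the bit.  */
uint64_t bset_x0 (uint64_t rs)
{
  return (uint64_t) 1 << (rs & 63);
}

/* The idiom from the message: a 32-bit 1 << n, zero extended to
   64 bits (the SI->DI case).  */
uint64_t zext_shift_si (unsigned n)
{
  return (uint64_t) (uint32_t) (1u << n);
}

/* Compare both forms for every shift count that is defined for a
   32-bit shift.  */
void check_si_to_di (void)
{
  for (unsigned n = 0; n < 32; n++)
    assert (zext_shift_si (n) == bset_x0 (n));
}
```

Calling check_si_to_di () from a small driver exercises all 32 shift counts.  The sketch only covers counts that are in range for the source mode; it says nothing about larger counts.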
For the QI/HI case the 1<< [...]

> bset a5,x0,a5 # 24 [c=8 l=4] *bsetdi_3
> andn a3,a0,a5 # 52 [c=4 l=4] and_notdi3
> beq a4,zero,.L3 # 41 [c=12 l=4] *branchdi
> or a3,a0,a5 # 44 [c=4 l=4] *iordi3/0

The bset is what this patch generates instead of a li+sll sequence.  In
the form above it's easier to see that the andn can be replaced with a
bclr and the or can be replaced with a bset, which in turn would allow
the bset above to go away completely.

This has been tested in my tester for rv32 and rv64.  I'll wait for
pre-commit testing to complete before moving forward.

jeff

gcc/
	* config/riscv/bitmanip.md (bset_3): New pattern.

testsuite/
	* gcc.target/riscv/zbs-bset-2.c: New test.

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index f403ba8dbba..b5e052fe4f2 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -611,6 +611,18 @@ (define_insn "*bsetdi_2"
   "bset\t%0,x0,%1"
   [(set_attr "type" "bitmanip")])
 
+;; Similar, but we have a narrowing SUBREG.  We're still using x0 as
+;; a source, so the result is still zero extended.
+(define_insn "*bset_3"
+  [(set (match_operand:X 0 "register_operand" "=r")
+        (zero_extend:X
+          (subreg:SHORT
+            (ashift:X (const_int 1)
+                      (match_operand:QI 1 "register_operand" "r")) 0)))]
+  "TARGET_ZBS"
+  "bset\t%0,x0,%1"
+  [(set_attr "type" "bitmanip")])
+
 ;; These two splitters take advantage of the limited range of the
 ;; shift constant.  With the limited range we know the SImode sign
 ;; bit is never set, thus we can treat this as zero extending and
diff --git a/gcc/testsuite/gcc.target/riscv/zbs-bset-2.c b/gcc/testsuite/gcc.target/riscv/zbs-bset-2.c
new file mode 100644
index 00000000000..8555e3784ff
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbs-bset-2.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=rv64gc_zba_zbb_zbs -mabi=lp64d" { target { riscv64*-*-* } } } */
+/* { dg-options "-O2 -march=rv32gc_zba_zbb_zbs -mabi=ilp32" { target { riscv32*-*-* } } } */
+
+typedef struct SHA {
+  unsigned char block[128];
+  unsigned int blockcnt;
+
+} SHA;
+
+#define BITSET(s, pos) s[(pos) >> 3] & (char) (0x01 << (7 - (pos) % 8))
+#define SETBIT(s, pos) s[(pos) >> 3] |= (char) (0x01 << (7 - (pos) % 8))
+#define CLRBIT(s, pos) s[(pos) >> 3] &= (char) ~(0x01 << (7 - (pos) % 8))
+
+#define ULNG unsigned long
+
+void shabits(char *bitstr, long bitcnt, SHA *s, ULNG i)
+{
+  if (BITSET(bitstr, i))
+    SETBIT(s->block, s->blockcnt);
+  else
+    CLRBIT(s->block, s->blockcnt);
+}
+
+/* { dg-final { scan-assembler-times "bset\t\[a-x\]\[0-9\]+.x0" 2 { target { riscv64*-*-* } } } } */
+
+/* rv32 doesn't have the first bset for some reason, probably a missed
+   generalization of one of the other bitmanip combiner patterns.  The
+   bset we do generate corresponds to the one the related patch generates.  */
+/* { dg-final { scan-assembler-times "bset\t\[a-x\]\[0-9\]+.x0" 1 { target { riscv32*-*-* } } } } */
+
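The remark in the message that the andn could become a bclr and the or could become a bset can be checked with a small C model.  This is only a sketch under assumed rv64 Zbs semantics; bset_model, bclr_model, mask_then_or, mask_then_andn and check_bit_ops are hypothetical names for this example and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed rv64 semantics of the Zbs single-bit instructions: the bit
   index is taken from the low 6 bits of rs2.  */
uint64_t bset_model (uint64_t rs1, uint64_t rs2)
{
  return rs1 | ((uint64_t) 1 << (rs2 & 63));
}

uint64_t bclr_model (uint64_t rs1, uint64_t rs2)
{
  return rs1 & ~((uint64_t) 1 << (rs2 & 63));
}

/* Shape of the listing in the message: build the mask with a bset
   from x0, then apply it with or.  */
uint64_t mask_then_or (uint64_t x, uint64_t n)
{
  uint64_t mask = bset_model (0, n);
  return x | mask;
}

/* Same mask, applied with andn (x & ~mask).  */
uint64_t mask_then_andn (uint64_t x, uint64_t n)
{
  uint64_t mask = bset_model (0, n);
  return x & ~mask;
}

/* Check that each two-step form matches the single instruction.  */
void check_bit_ops (void)
{
  const uint64_t samples[] = { 0, 1, 0xff, 0x8000000000000000ull, UINT64_MAX };
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    for (uint64_t n = 0; n < 64; n++)
      {
        assert (mask_then_or (samples[i], n) == bset_model (samples[i], n));
        assert (mask_then_andn (samples[i], n) == bclr_model (samples[i], n));
      }
}
```

If the model holds, the mask-building bset in the listing is redundant once both users are rewritten, which matches the message's claim that it could go away completely.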