From patchwork Sat Jun 29 21:07:53 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Jeff Law
X-Patchwork-Id: 1954278
Date: Sat, 29 Jun 2024 15:07:53 -0600
From: Jeff Law
To: "gcc-patches@gcc.gnu.org"
Subject: [to-be-committed][v3][RISC-V] Handle bit manipulation of SImode values

Third time is a charm perhaps?  I'm not sure how I keep mucking this patch up, but clearly I do as I've sent the wrong patch twice!

---

Last patch in this round of bitmanip work...
At least I think I'm going to pause here and switch gears to other projects that need attention 🙂

This patch introduces the ability to generate bitmanip instructions for rv64 when operating on SI objects when we know something about the range of the bit position (due to masking of the position).

I've got to note that the (7-pos % 8) bit position form was discovered by RAU in 500.perl.  I took that and expanded it to the simple (pos & mask) form as well as covering bset, binv and bclr.

As far as the implementation is concerned....

First, this turns the recently added define_splits into define_insn_and_split constructs.  This allows combine to "see" enough RTL to realize a sign extension is unnecessary.  Otherwise we get undesirable sign extensions for the new testcases.

Second, it adds new patterns for the logical operations.  Two patterns for IOR/XOR and two patterns for AND.

I think a key concept to keep in mind is that once we determine a Zbs operation is safe to perform on a SI value, we can rewrite the RTL in 64bit form.  If we were ever to try and use range information at expand time for this stuff (and we probably should investigate that), that's the path I'd suggest.

This is notably cleaner than my original implementation which actually kept the more complex RTL form through final and emitted 2/3 instructions (mask the bit position, then the bset/bclr/binv).

Tested in my tester, but waiting for pre-commit CI to report back before taking further action.

Jeff

gcc/
	* config/riscv/bitmanip.md (bset splitters): Turn into define_insn_and_splits.  Don't depend on combine splitting the "andn with constant" form.
	(bset, binv, bclr with masked bit position): New patterns.

gcc/testsuite
	* gcc.target/riscv/binv-for-simode-1.c: New test.
	* gcc.target/riscv/bset-for-simode-1.c: New test.
	* gcc.target/riscv/bclr-for-simode-1.c: New test.
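As a rough illustration of the "rewrite the RTL in 64bit form" idea (not part of the patch): on rv64 a 32-bit value sits in a 64-bit register in sign-extended form, so a single-bit operation on the whole register should match the SImode result as long as the bit position can never be 31.  The sketch below models that in plain C.  The names sext32, op_si, op_di and forms_agree are hypothetical, made up for this example, and the model only covers positions 0..31.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an rv64 register holding a 32-bit value:
   bits 63..32 are copies of bit 31.  */
static uint64_t sext32(uint32_t v)
{
    return (v & UINT32_C(0x80000000)) ? (UINT64_C(0xFFFFFFFF00000000) | v) : v;
}

enum bitop { OP_BSET, OP_BINV, OP_BCLR };

/* SImode reference: operate in 32 bits, then sign-extend the result.  */
static uint64_t op_si(enum bitop op, uint32_t x, unsigned pos)
{
    uint32_t bit = UINT32_C(1) << pos;
    switch (op) {
    case OP_BSET: return sext32(x | bit);
    case OP_BINV: return sext32(x ^ bit);
    default:      return sext32(x & ~bit);
    }
}

/* 64-bit form: apply the single-bit operation to the whole register.  */
static uint64_t op_di(enum bitop op, uint32_t x, unsigned pos)
{
    uint64_t reg = sext32(x);
    uint64_t bit = UINT64_C(1) << pos;
    switch (op) {
    case OP_BSET: return reg | bit;
    case OP_BINV: return reg ^ bit;
    default:      return reg & ~bit;
    }
}

/* Return 1 if both forms agree for every position in [0, max_pos].  */
static int forms_agree(unsigned max_pos)
{
    static const uint32_t samples[] = {
        0u, 1u, 0x12345678u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu
    };
    assert(max_pos < 32);   /* keep the 32-bit shift well defined */
    for (unsigned pos = 0; pos <= max_pos; pos++)
        for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
            for (int op = OP_BSET; op <= OP_BCLR; op++)
                if (op_si((enum bitop)op, samples[i], pos)
                    != op_di((enum bitop)op, samples[i], pos))
                    return 0;
    return 1;
}
```

In this model forms_agree (30) returns 1 and forms_agree (31) returns 0, which is consistent with the (INTVAL (operands[2]) & 0x1f) != 0x1f condition used by the patterns: a mask that can never yield position 31 leaves the SImode sign bit alone.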
diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 3eedabffca0..f403ba8dbba 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -615,37 +615,140 @@ (define_insn "*bsetdi_2"
 ;; shift constant.  With the limited range we know the SImode sign
 ;; bit is never set, thus we can treat this as zero extending and
 ;; generate the bsetdi_2 pattern.
-(define_split
-  [(set (match_operand:DI 0 "register_operand")
+(define_insn_and_split ""
+  [(set (match_operand:DI 0 "register_operand" "=r")
         (any_extend:DI
          (ashift:SI (const_int 1)
                     (subreg:QI
-                      (and:DI (not:DI (match_operand:DI 1 "register_operand"))
+                      (and:DI (not:DI (match_operand:DI 1 "register_operand" "r"))
                               (match_operand 2 "const_int_operand")) 0))))
-   (clobber (match_operand:DI 3 "register_operand"))]
+   (clobber (match_scratch:X 3 "=&r"))]
   "TARGET_64BIT
    && TARGET_ZBS
    && (TARGET_ZBB || TARGET_ZBKB)
    && (INTVAL (operands[2]) & 0x1f) != 0x1f"
-  [(set (match_dup 0) (and:DI (not:DI (match_dup 1)) (match_dup 2)))
-    (set (match_dup 0) (zero_extend:DI (ashift:SI
-                                       (const_int 1)
-                                       (subreg:QI (match_dup 0) 0))))])
+  "#"
+  "&& reload_completed"
+   [(set (match_dup 3) (match_dup 2))
+    (set (match_dup 3) (and:DI (not:DI (match_dup 1)) (match_dup 3)))
+    (set (match_dup 0) (zero_extend:DI
+                         (ashift:SI (const_int 1) (match_dup 4))))]
+  { operands[4] = gen_lowpart (QImode, operands[3]); }
+  [(set_attr "type" "bitmanip")])
 
-(define_split
-  [(set (match_operand:DI 0 "register_operand")
-        (any_extend:DI
+(define_insn_and_split ""
+  [(set (match_operand:DI 0 "register_operand" "=r")
+    (any_extend:DI
          (ashift:SI (const_int 1)
                     (subreg:QI
-                      (and:DI (match_operand:DI 1 "register_operand")
+                      (and:DI (match_operand:DI 1 "register_operand" "r")
                               (match_operand 2 "const_int_operand")) 0))))]
   "TARGET_64BIT
    && TARGET_ZBS
    && (INTVAL (operands[2]) & 0x1f) != 0x1f"
-  [(set (match_dup 0) (and:DI (match_dup 1) (match_dup 2)))
-    (set (match_dup 0) (zero_extend:DI (ashift:SI
-                                       (const_int 1)
-                                       (subreg:QI (match_dup 0) 0))))])
+  "#"
+  "&& 1"
+  [(set (match_dup 0) (and:DI (match_dup 1) (match_dup 2)))
+    (set (match_dup 0) (zero_extend:DI (ashift:SI
+                                       (const_int 1)
+                                       (subreg:QI (match_dup 0) 0))))]
+  { }
+  [(set_attr "type" "bitmanip")])
+
+;; Similarly two patterns for IOR/XOR generating bset/binv to
+;; manipulate a bit in a register
+(define_insn_and_split ""
+  [(set (match_operand:DI 0 "register_operand" "=r")
+       (any_or:DI
+         (any_extend:DI
+           (ashift:SI
+             (const_int 1)
+             (subreg:QI
+               (and:DI (not:DI (match_operand:DI 1 "register_operand" "r"))
+                       (match_operand 2 "const_int_operand")) 0)))
+         (match_operand:DI 3 "register_operand" "r")))
+   (clobber (match_scratch:X 4 "=&r"))]
+  "TARGET_64BIT
+   && TARGET_ZBS
+   && (TARGET_ZBB || TARGET_ZBKB)
+   && (INTVAL (operands[2]) & 0x1f) != 0x1f"
+  "#"
+  "&& reload_completed"
+   [(set (match_dup 4) (match_dup 2))
+    (set (match_dup 4) (and:DI (not:DI (match_dup 4)) (match_dup 1)))
+    (set (match_dup 0) (any_or:DI (ashift:DI (const_int 1) (match_dup 5)) (match_dup 3)))]
+  { operands[5] = gen_lowpart (QImode, operands[4]); }
+  [(set_attr "type" "bitmanip")])
+
+(define_insn_and_split ""
+  [(set (match_operand:DI 0 "register_operand" "=r")
+       (any_or:DI
+         (any_extend:DI
+           (ashift:SI
+             (const_int 1)
+             (subreg:QI
+               (and:DI (match_operand:DI 1 "register_operand" "r")
+                       (match_operand 2 "const_int_operand")) 0)))
+         (match_operand:DI 3 "register_operand" "r")))
+   (clobber (match_scratch:X 4 "=&r"))]
+  "TARGET_64BIT
+   && TARGET_ZBS
+   && (INTVAL (operands[2]) & 0x1f) != 0x1f"
+  "#"
+  "&& reload_completed"
+   [(set (match_dup 4) (and:DI (match_dup 1) (match_dup 2)))
+    (set (match_dup 0) (any_or:DI (ashift:DI (const_int 1) (subreg:QI (match_dup 4) 0)) (match_dup 3)))]
+  { }
+  [(set_attr "type" "bitmanip")])
+
+;; Similarly two patterns for AND generating bclr to
+;; manipulate a bit in a register
+(define_insn_and_split ""
+  [(set (match_operand:DI 0 "register_operand" "=r")
+       (and:DI
+         (not:DI
+           (any_extend:DI
+             (ashift:SI
+               (const_int 1)
+               (subreg:QI
+                 (and:DI (not:DI (match_operand:DI 1 "register_operand" "r"))
+                         (match_operand 2 "const_int_operand")) 0))))
+         (match_operand:DI 3 "register_operand" "r")))
+   (clobber (match_scratch:X 4 "=&r"))]
+  "TARGET_64BIT
+   && TARGET_ZBS
+   && (TARGET_ZBB || TARGET_ZBKB)
+   && (INTVAL (operands[2]) & 0x1f) != 0x1f"
+  "#"
+  "&& reload_completed"
+   [(set (match_dup 4) (match_dup 2))
+    (set (match_dup 4) (and:DI (not:DI (match_dup 1)) (match_dup 4)))
+    (set (match_dup 0) (and:DI (rotate:DI (const_int -2) (match_dup 5)) (match_dup 3)))]
+  { operands[5] = gen_lowpart (QImode, operands[4]); }
+  [(set_attr "type" "bitmanip")])
+
+
+(define_insn_and_split ""
+  [(set (match_operand:DI 0 "register_operand" "=r")
+       (and:DI
+         (not:DI
+           (any_extend:DI
+             (ashift:SI
+               (const_int 1)
+               (subreg:QI
+                 (and:DI (match_operand:DI 1 "register_operand" "r")
+                         (match_operand 2 "const_int_operand")) 0))))
+         (match_operand:DI 3 "register_operand" "r")))
+   (clobber (match_scratch:X 4 "=&r"))]
+  "TARGET_64BIT
+   && TARGET_ZBS
+   && (INTVAL (operands[2]) & 0x1f) != 0x1f"
+  "#"
+  "&& reload_completed"
+   [(set (match_dup 4) (and:DI (match_dup 1) (match_dup 2)))
+    (set (match_dup 0) (and:DI (rotate:DI (const_int -2) (match_dup 5)) (match_dup 3)))]
+  { operands[5] = gen_lowpart (QImode, operands[4]); }
+  [(set_attr "type" "bitmanip")])
 
 (define_insn "*bset_1_mask"
   [(set (match_operand:X 0 "register_operand" "=r")
diff --git a/gcc/testsuite/gcc.target/riscv/bclr-for-simode-1.c b/gcc/testsuite/gcc.target/riscv/bclr-for-simode-1.c
new file mode 100644
index 00000000000..ae9fc33bb34
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/bclr-for-simode-1.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zbb_zbs -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+
+typedef unsigned int uint32_t;
+uint32_t foo(uint32_t pos, uint32_t x)
+{
+  return x & ~(1 <<( pos & 0xf));
+}
+
+typedef unsigned int uint32_t;
+uint32_t foo2(uint32_t pos, uint32_t x)
+{
+  return x & ~(1 <<(7-(pos) % 8));
+}
+
+
+
+/* { dg-final { scan-assembler-not "sll\t" } } */
+/* { dg-final { scan-assembler-times "bclr\t" 2 } } */
+/* { dg-final { scan-assembler-times "andi\t" 1 } } */
+/* { dg-final { scan-assembler-times "andn\t" 1 } } */
+/* { dg-final { scan-assembler-times "ret" 2 } } */
+
diff --git a/gcc/testsuite/gcc.target/riscv/binv-for-simode-1.c b/gcc/testsuite/gcc.target/riscv/binv-for-simode-1.c
new file mode 100644
index 00000000000..f96c2458ef1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/binv-for-simode-1.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zbb_zbs -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+
+typedef unsigned int uint32_t;
+uint32_t foo(uint32_t pos, uint32_t x)
+{
+  return x ^ (1 <<( pos & 0xf));
+}
+
+typedef unsigned int uint32_t;
+uint32_t foo2(uint32_t pos, uint32_t x)
+{
+  return x ^ (1 <<(7-(pos) % 8));
+
+}
+
+/* { dg-final { scan-assembler-not "sll\t" } } */
+/* { dg-final { scan-assembler-times "binv\t" 2 } } */
+/* { dg-final { scan-assembler-times "andi\t" 1 } } */
+/* { dg-final { scan-assembler-times "andn\t" 1 } } */
+/* { dg-final { scan-assembler-times "ret" 2 } } */
+
diff --git a/gcc/testsuite/gcc.target/riscv/bset-for-simode-1.c b/gcc/testsuite/gcc.target/riscv/bset-for-simode-1.c
new file mode 100644
index 00000000000..24663a1c856
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/bset-for-simode-1.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zbb_zbs -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+
+typedef unsigned int uint32_t;
+uint32_t foo(uint32_t pos, uint32_t x)
+{
+  return x | (1 <<( pos & 0xf));
+}
+
+typedef unsigned int uint32_t;
+uint32_t foo2(uint32_t pos, uint32_t x)
+{
+  return x | (1 <<(7-(pos) % 8));
+
+}
+
+/* { dg-final { scan-assembler-not "sll\t" } } */
+/* { dg-final { scan-assembler-times "bset\t" 2 } } */
+/* { dg-final { scan-assembler-times "andi\t" 1 } } */
+/* { dg-final { scan-assembler-times "andn\t" 1 } } */
+/* { dg-final { scan-assembler-times "ret" 2 } } */
+
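A side note on the foo2 testcases, offered as a sketch rather than a statement about GCC internals: for an unsigned pos, the position 7 - pos % 8 has the same value as ~pos & 7.  That identity is presumably why the position shows up in the (and (not ...) const) shape the new patterns match, and why each test expects one andn next to one andi, but that link is my assumption.  The helper names below (pos_sub_form, pos_andn_form, position_forms_agree) are hypothetical and exist only for this check.

```c
#include <assert.h>
#include <stdint.h>

/* Bit position as written in the foo2 testcases.  */
static uint32_t pos_sub_form(uint32_t pos)
{
    return 7 - pos % 8;
}

/* Same position written as "and with complement".  */
static uint32_t pos_andn_form(uint32_t pos)
{
    return ~pos & 7;
}

/* Return 1 if the two forms agree on a range of small values and on
   values near the top of the 32-bit range.  */
static int position_forms_agree(void)
{
    for (uint32_t pos = 0; pos < 4096; pos++)
        if (pos_sub_form(pos) != pos_andn_form(pos))
            return 0;
    for (uint32_t k = 0; k < 4096; k++)
        if (pos_sub_form(UINT32_MAX - k) != pos_andn_form(UINT32_MAX - k))
            return 0;
    return 1;
}
```

Both forms always land in the range 0..7, so the position can never reach 31 and the masked-position condition in the patterns is satisfied for this case.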