From patchwork Wed Aug 7 09:42:43 2024
X-Patchwork-Submitter: Feng Wang
X-Patchwork-Id: 1969945
From: Feng Wang
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, jeffreyalaw@gmail.com, zhusonghe@eswincomputing.com, Feng Wang
Subject: [PATCH] RISC-V: Add auto-vect pattern for vector rotate shift
Date: Wed, 7 Aug 2024 09:42:43 +0000
Message-Id: <20240807094243.17972-1-wangfeng@eswincomputing.com>

This patch adds the vector rotate shift pattern for auto-vectorization.
With this patch, a scalar rotate shift can be automatically vectorized
into a vector rotate shift (vrol.vv / vror.vv).

Signed-off-by: Feng Wang

gcc/ChangeLog:

	* config/riscv/autovec-opt.md (v<bitmanip_optab><mode>3): Add
	define_expand for vector rotate shift.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/binop/vrolr-1.c: New test.
	* gcc.target/riscv/rvv/autovec/binop/vrolr-run.c: New test.
	* gcc.target/riscv/rvv/autovec/binop/vrolr-template.h: New test.
---
 gcc/config/riscv/autovec-opt.md               | 16 ++++
 .../riscv/rvv/autovec/binop/vrolr-1.c         |  9 ++
 .../riscv/rvv/autovec/binop/vrolr-run.c       | 88 +++++++++++++++++++
 .../riscv/rvv/autovec/binop/vrolr-template.h  | 29 ++++++
 4 files changed, 142 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrolr-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrolr-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrolr-template.h

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index d7a3cfd4602..923122510ac 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -1607,3 +1607,19 @@
     DONE;
   }
   [(set_attr "type" "vandn")])
+
+;; -------------------------------------------------------------------------
+;; - vrol.vv vror.vv
+;; -------------------------------------------------------------------------
+(define_expand "v<bitmanip_optab><mode>3"
+  [(set (match_operand:VI 0 "register_operand")
+        (bitmanip_rotate:VI
+          (match_operand:VI 1 "register_operand")
+          (match_operand:VI 2 "register_operand")))]
+  "TARGET_ZVBB || TARGET_ZVKB"
+  {
+    riscv_vector::emit_vlmax_insn (code_for_pred_v (<CODE>, <MODE>mode),
+                                   riscv_vector::BINARY_OP, operands);
+    DONE;
+  }
+)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrolr-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrolr-1.c
new file mode 100644
index 00000000000..55dac27697c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrolr-1.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-add-options "riscv_v" } */
+/* { dg-add-options "riscv_zvbb" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model" } */
+
+#include "vrolr-template.h"
+
+/* { dg-final { scan-assembler-times {\tvrol\.vv} 4 } } */
+/* { dg-final { scan-assembler-times {\tvror\.vv} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrolr-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrolr-run.c
new file mode 100644
index 00000000000..221795ba871
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrolr-run.c
@@ -0,0 +1,88 @@
+/* { dg-do run } */
+/* { dg-require-effective-target "riscv_zvbb_ok" } */
+/* { dg-add-options "riscv_v" } */
+/* { dg-add-options "riscv_zvbb" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model" } */
+
+#include
+#include
+
+#include
+#include
+#include
+
+#define ARRAY_SIZE 512
+
+#define CIRCULAR_LEFT_SHIFT_ARRAY(arr, shifts, bit_size, size) \
+  for (int i = 0; i < size; i++) { \
+    (arr)[i] = (((arr)[i] << (shifts)[i % bit_size]) | ((arr)[i] >> (bit_size - (shifts)[i % bit_size]))); \
+  }
+
+#define CIRCULAR_RIGHT_SHIFT_ARRAY(arr, shifts, bit_size, size) \
+  for (int i = 0; i < size; i++) { \
+    (arr)[i] = (((arr)[i] >> (shifts)[i % bit_size]) | ((arr)[i] << (bit_size - (shifts)[i % bit_size]))); \
+  }
+
+void __attribute__((optimize("no-tree-vectorize"))) compare_results8(
+    uint8_t *result_left, uint8_t *result_right,
+    int bit_size, uint8_t *shift_values)
+{
+  for (int i = 0; i < ARRAY_SIZE; i++) {
+    assert(result_left[i] == (i << shift_values[i % bit_size]) | (i >> (bit_size - shift_values[i % bit_size])));
+    assert(result_right[i] == (i >> shift_values[i % bit_size]) | (i << (bit_size - shift_values[i % bit_size])));
+  }
+}
+
+void __attribute__((optimize("no-tree-vectorize"))) compare_results16(
+    uint16_t *result_left, uint16_t *result_right,
+    int bit_size, uint16_t *shift_values)
+{
+  for (int i = 0; i < ARRAY_SIZE; i++) {
+    assert(result_left[i] == (i << shift_values[i % bit_size]) | (i >> (bit_size - shift_values[i % bit_size])));
+    assert(result_right[i] == (i >> shift_values[i % bit_size]) | (i << (bit_size - shift_values[i % bit_size])));
+  }
+}
+
+void __attribute__((optimize("no-tree-vectorize"))) compare_results32(
+    uint32_t *result_left, uint32_t *result_right,
+    int bit_size, uint32_t *shift_values)
+{
+  for (int i = 0; i < ARRAY_SIZE; i++) {
+    assert(result_left[i] == (i << shift_values[i % bit_size]) | (i >> (bit_size - shift_values[i % bit_size])));
+    assert(result_right[i] == (i >> shift_values[i % bit_size]) | (i << (bit_size - shift_values[i % bit_size])));
+  }
+}
+
+void __attribute__((optimize("no-tree-vectorize"))) compare_results64(
+    uint64_t *result_left, uint64_t *result_right,
+    int bit_size, uint64_t *shift_values)
+{
+  for (int i = 0; i < ARRAY_SIZE; i++) {
+    assert(result_left[i] == ((uint64_t)i << shift_values[i % bit_size]) | ((uint64_t)i >> (bit_size - shift_values[i % bit_size])));
+    assert(result_right[i] == ((uint64_t)i >> shift_values[i % bit_size]) | ((uint64_t)i << (bit_size - shift_values[i % bit_size])));
+  }
+}
+
+#define TEST_SHIFT_OPERATIONS(TYPE, bit_size) \
+  TYPE shift_val##bit_size[ARRAY_SIZE];\
+  TYPE result_left##bit_size[ARRAY_SIZE];\
+  TYPE result_right##bit_size[ARRAY_SIZE];\
+  do { \
+    for (int i = 0; i < ARRAY_SIZE; i++) { \
+      result_left##bit_size[i] = i;\
+      result_right##bit_size[i] = i;\
+      shift_val##bit_size[i] = i % bit_size; \
+    } \
+    CIRCULAR_LEFT_SHIFT_ARRAY(result_left##bit_size, shift_val##bit_size, bit_size, ARRAY_SIZE)\
+    CIRCULAR_RIGHT_SHIFT_ARRAY(result_right##bit_size, shift_val##bit_size, bit_size, ARRAY_SIZE)\
+    compare_results##bit_size(result_left##bit_size, result_right##bit_size, bit_size, shift_val##bit_size); \
+  } while(0)
+
+
+int main() {
+  TEST_SHIFT_OPERATIONS(uint8_t, 8);
+  TEST_SHIFT_OPERATIONS(uint16_t, 16);
+  TEST_SHIFT_OPERATIONS(uint32_t, 32);
+  TEST_SHIFT_OPERATIONS(uint64_t, 64);
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrolr-template.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrolr-template.h
new file mode 100644
index 00000000000..3db0d8643a8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrolr-template.h
@@ -0,0 +1,29 @@
+#include
+
+#define VROL_VV(SEW, S, T) \
+__attribute__ ((noipa))\
+void autovect_vrol_vv_##S##SEW (T *out, T *op1, T *op2, int n){\
+  for(int i=0; i<n; i++){\
+    out[i] = (op1[i] << op2[i]) | (op1[i] >> (SEW - op2[i]));\
+  }\
+}
+
+#define VROR_VV(SEW, S, T) \
+__attribute__ ((noipa))\
+void autovect_vror_vv_##S##SEW (T *out, T *op1, T *op2, int n){\
+  for(int i=0; i<n; i++){\
+    out[i] = (op1[i] >> op2[i]) | (op1[i] << (SEW - op2[i]));\
+  }\
+}
+
+VROL_VV(8, u, uint8_t)
+VROL_VV(16, u, uint16_t)
+VROL_VV(32, u, uint32_t)
+VROL_VV(64, u, uint64_t)
+
+VROR_VV(8, u, uint8_t)
+VROR_VV(16, u, uint16_t)
+VROR_VV(32, u, uint32_t)
+VROR_VV(64, u, uint64_t)
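
For reference, the scalar idiom this patch targets is the element-wise "shift left, OR with shift right" loop. The sketch below is not part of the patch. It is a minimal standalone illustration, assuming 32-bit elements, and every name in it (rotl32, rotr32, rotl32_array, rotr32_array, rotate_selfcheck) is hypothetical. It masks both shift counts, because in plain C a 32-bit shift by 32 (which `x >> (32 - s)` becomes when s is 0) is undefined. Whether a given GCC version recognizes this exact masked form as a rotate and emits vrol.vv / vror.vv is not something the sketch verifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate left by s bits. Both shift counts are masked into 0..31 so the
   expression stays well defined even when s is 0 or >= 32. */
static inline uint32_t rotl32(uint32_t x, uint32_t s)
{
    s &= 31u;
    return (x << s) | (x >> ((32u - s) & 31u));
}

/* Rotate right by s bits, with the same masking. */
static inline uint32_t rotr32(uint32_t x, uint32_t s)
{
    s &= 31u;
    return (x >> s) | (x << ((32u - s) & 31u));
}

/* Element-wise loops of the shape an auto-vectorizer would look at:
   one rotate per element, with a per-element rotate amount. */
void rotl32_array(uint32_t *out, const uint32_t *op1,
                  const uint32_t *op2, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = rotl32(op1[i], op2[i]);
}

void rotr32_array(uint32_t *out, const uint32_t *op1,
                  const uint32_t *op2, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = rotr32(op1[i], op2[i]);
}

/* Returns 0 when rotating left and then right by the same amounts
   restores every input element. */
int rotate_selfcheck(void)
{
    const uint32_t in[4] = {0x80000001u, 0x12345678u, 0xFFFFFFFFu, 0u};
    const uint32_t sh[4] = {1u, 8u, 0u, 31u};
    uint32_t left[4], back[4];

    rotl32_array(left, in, sh, 4);
    rotr32_array(back, left, sh, 4);

    /* Spot checks on known rotate results. */
    assert(left[0] == 0x00000003u);
    assert(left[1] == 0x34567812u);
    assert(left[2] == 0xFFFFFFFFu);

    for (size_t i = 0; i < 4; i++)
        if (back[i] != in[i])
            return 1;
    return 0;
}
```

Calling rotate_selfcheck() from a small main and checking for a zero return is enough to exercise the sketch on any host; it does not need a RISC-V target.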