From patchwork Wed Jun 26 17:14:50 2024
X-Patchwork-Submitter: Jeff Law
X-Patchwork-Id: 1952732
Date: Wed, 26 Jun 2024 11:14:50 -0600
From: Jeff Law
To: gcc-patches@gcc.gnu.org
Subject: [to-be-committed] [RISC-V][V3] movmem for RISCV with V extension

And Sergei's movmem patch.  Just a trivial testsuite adjustment for an
option name change and a whitespace fix from me.

I've spun this in my tester for rv32 and rv64.  I'll wait for pre-commit
CI before taking further action.

Just a reminder: this patch is designed to handle the case where we can
issue a single vector load/store, which avoids all the complexities of
determining which direction to copy.
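As a concrete illustration (not part of the patch; it assumes
__riscv_v_min_vlen is 128, so a vector register holds at least 16 bytes),
the hypothetical functions below show the two sides of that cut-off.  A
move that fits in one vector operation reads every source byte before
writing any destination byte, so overlap in either direction is handled
without a runtime direction check; anything larger than one operation can
cover falls back to the memmove library call.

/* Illustrative sketch only; not from the patch or its testsuite.
   shift_left_one: with a 128-bit minimum VLEN this 16-byte overlapping
   move fits in a single vector load/store, so all reads happen before
   any writes and no direction check is needed.  */
char *
shift_left_one (char *buf)
{
  return __builtin_memmove (buf, buf + 1, 16);
}

/* too_large: 129 bytes exceeds what one operation can cover even at
   LMUL=8 (for a 128-bit minimum VLEN), so the expander FAILs and the
   call goes to the memmove library routine, as in the f4 test below.  */
char *
too_large (char *dst, const char *src)
{
  return __builtin_memmove (dst, src, 129);
}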
gcc/ChangeLog

	* config/riscv/riscv.md (movmem<mode>): New expander.

gcc/testsuite/ChangeLog

	PR target/112109
	* gcc.target/riscv/rvv/base/movmem-1.c: New test.
---
 gcc/config/riscv/riscv.md                    | 22 +++++++
 .../gcc.target/riscv/rvv/base/movmem-1.c     | 60 +++++++++++++++++++
 2 files changed, 82 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index ff37125e3f2..c0c960353eb 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2723,6 +2723,28 @@ (define_expand "setmem<mode>"
     FAIL;
 })
 
+;; Inlining general memmove is a pessimisation: we can't avoid having to decide
+;; which direction to go at runtime, which is costly in instruction count
+;; however for situations where the entire move fits in one vector operation
+;; we can do all reads before doing any writes so we don't have to worry
+;; so generate the inline vector code in such situations
+;; nb. prefer scalar path for tiny memmoves.
+(define_expand "movmem<mode>"
+  [(parallel [(set (match_operand:BLK 0 "general_operand")
+                   (match_operand:BLK 1 "general_operand"))
+              (use (match_operand:P 2 "const_int_operand"))
+              (use (match_operand:SI 3 "const_int_operand"))])]
+  "TARGET_VECTOR"
+{
+  if ((INTVAL (operands[2]) >= TARGET_MIN_VLEN / 8)
+      && (INTVAL (operands[2]) <= TARGET_MIN_VLEN)
+      && riscv_vector::expand_block_move (operands[0], operands[1],
+                                          operands[2]))
+    DONE;
+  else
+    FAIL;
+})
+
 ;; Expand in-line code to clear the instruction cache between operand[0] and
 ;; operand[1].
 (define_expand "clear_cache"
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c
new file mode 100644
index 00000000000..0ecc3f7e3b7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-add-options riscv_v } */
+/* { dg-additional-options "-O3 --param=riscv-autovec-lmul=dynamic" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#define MIN_VECTOR_BYTES (__riscv_v_min_vlen / 8)
+
+/* Tiny memmoves should not be vectorised.
+** f1:
+** li\s+a2,\d+
+** tail\s+memmove
+*/
+char *
+f1 (char *a, char const *b)
+{
+  return __builtin_memmove (a, b, MIN_VECTOR_BYTES - 1);
+}
+
+/* Vectorise+inline minimum vector register width with LMUL=1
+** f2:
+** (
+** vsetivli\s+zero,16,e8,m1,ta,ma
+** |
+** li\s+[ta][0-7],\d+
+** vsetvli\s+zero,[ta][0-7],e8,m1,ta,ma
+** )
+** vle8\.v\s+v\d+,0\(a1\)
+** vse8\.v\s+v\d+,0\(a0\)
+** ret
+*/
+char *
+f2 (char *a, char const *b)
+{
+  return __builtin_memmove (a, b, MIN_VECTOR_BYTES);
+}
+
+/* Vectorise+inline up to LMUL=8
+** f3:
+** li\s+[ta][0-7],\d+
+** vsetvli\s+zero,[ta][0-7],e8,m8,ta,ma
+** vle8\.v\s+v\d+,0\(a1\)
+** vse8\.v\s+v\d+,0\(a0\)
+** ret
+*/
+char *
+f3 (char *a, char const *b)
+{
+  return __builtin_memmove (a, b, MIN_VECTOR_BYTES * 8);
+}
+
+/* Don't vectorise if the move is too large for one operation
+** f4:
+** li\s+a2,\d+
+** tail\s+memmove
+*/
+char *
+f4 (char *a, char const *b)
+{
+  return __builtin_memmove (a, b, MIN_VECTOR_BYTES * 8 + 1);
+}