From patchwork Mon Feb 26 15:54:27 2024
From: Robin Dapp
Date: Mon, 26 Feb 2024 16:54:27 +0100
To: gcc-patches, palmer, Kito Cheng, "juzhe.zhong@rivai.ai"
Cc: rdapp.gcc@gmail.com, jeffreyalaw
Subject: [PATCH] RISC-V: Add initial cost handling for segment loads/stores.

Hi,

this has been sitting on my local tree - I've been wanting to post it
for a while but somehow forgot.
This patch makes segment loads and stores more expensive.  It adds
segment_load and segment_store cost fields to the common vector costs
and adds handling to adjust_stmt_cost.  In the future we could handle
this in a more fine-grained manner but let's start somehow.

Regtested on rv64.

Regards
 Robin

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (struct common_vector_cost): Add
	segment_[load/store]_cost.
	* config/riscv/riscv-vector-costs.cc (costs::adjust_stmt_cost):
	Handle segment loads/stores.
	* config/riscv/riscv.cc: Initialize segment_[load/store]_cost
	to 2.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/costmodel/riscv/rvv/pr113112-4.c: Expect m4
	instead of m2.
---
 gcc/config/riscv/riscv-protos.h               |   4 +
 gcc/config/riscv/riscv-vector-costs.cc        | 127 ++++++++++++------
 gcc/config/riscv/riscv.cc                     |   4 +
 .../vect/costmodel/riscv/rvv/pr113112-4.c     |   4 +-
 4 files changed, 95 insertions(+), 44 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 80efdf2b7e5..2e8ab9990a8 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -218,6 +218,10 @@ struct common_vector_cost
   const int gather_load_cost;
   const int scatter_store_cost;
 
+  /* Segment load/store cost.  */
+  const int segment_load_cost;
+  const int segment_store_cost;
+
   /* Cost of a vector-to-scalar operation.  */
   const int vec_to_scalar_cost;
 
diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc
index adf9c197df5..d3c12444773 100644
--- a/gcc/config/riscv/riscv-vector-costs.cc
+++ b/gcc/config/riscv/riscv-vector-costs.cc
@@ -1043,6 +1043,24 @@ costs::better_main_loop_than_p (const vector_costs *uncast_other) const
   return vector_costs::better_main_loop_than_p (other);
 }
 
+/* Return the group size, i.e. the number of vectors to be loaded by a
+   segmented load/store instruction.  Return 0 if it is not a segmented
+   load/store.  */
+static int
+segment_loadstore_group_size (enum vect_cost_for_stmt kind,
+			      stmt_vec_info stmt_info)
+{
+  if ((kind == vector_load || kind == vector_store)
+      && STMT_VINFO_DATA_REF (stmt_info))
+    {
+      stmt_info = DR_GROUP_FIRST_ELEMENT (stmt_info);
+      if (stmt_info
+	  && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_LOAD_STORE_LANES)
+	return DR_GROUP_SIZE (stmt_info);
+    }
+  return 0;
+}
+
 /* Adjust vectorization cost after calling riscv_builtin_vectorization_cost.
    For some statement, we would like to further fine-grain tweak the cost on
    top of riscv_builtin_vectorization_cost handling which doesn't have any
@@ -1067,55 +1085,80 @@ costs::adjust_stmt_cost (enum vect_cost_for_stmt kind, loop_vec_info loop,
       case vector_load:
       case vector_store:
 	{
-	  /* Unit-stride vector loads and stores do not have offset addressing
-	     as opposed to scalar loads and stores.
-	     If the address depends on a variable we need an additional
-	     add/sub for each load/store in the worst case.  */
-	  if (stmt_info && stmt_info->stmt)
+	  if (stmt_info && stmt_info->stmt && STMT_VINFO_DATA_REF (stmt_info))
 	    {
-	      data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
-	      class loop *father = stmt_info->stmt->bb->loop_father;
-	      if (!loop && father && !father->inner && father->superloops)
+	      int group_size;
+	      if ((group_size
+		   = segment_loadstore_group_size (kind, stmt_info)) > 1)
 		{
-		  tree ref;
-		  if (TREE_CODE (dr->ref) != MEM_REF
-		      || !(ref = TREE_OPERAND (dr->ref, 0))
-		      || TREE_CODE (ref) != SSA_NAME)
-		    break;
+		  /* Segment loads and stores.  When the group size is > 1
+		     the vectorizer will add a vector load/store statement for
+		     each vector in the group.  Note that STMT_COST is
+		     overwritten here rather than adjusted.  */
+		  if (riscv_v_ext_vector_mode_p (loop->vector_mode))
+		    stmt_cost
+		      = (DR_IS_READ (STMT_VINFO_DATA_REF (stmt_info))
+			 ? costs->vla->segment_load_cost
+			 : costs->vla->segment_store_cost);
+		  else
+		    stmt_cost
+		      = (DR_IS_READ (STMT_VINFO_DATA_REF (stmt_info))
+			 ? costs->vls->segment_load_cost
+			 : costs->vls->segment_store_cost);
+		  break;
+		  /* TODO: Indexed and ordered/unordered cost.  */
+		}
+	      else
+		{
+		  /* Unit-stride vector loads and stores do not have offset
+		     addressing as opposed to scalar loads and stores.
+		     If the address depends on a variable we need an additional
+		     add/sub for each load/store in the worst case.  */
+		  data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
+		  class loop *father = stmt_info->stmt->bb->loop_father;
+		  if (!loop && father && !father->inner && father->superloops)
+		    {
+		      tree ref;
+		      if (TREE_CODE (dr->ref) != MEM_REF
+			  || !(ref = TREE_OPERAND (dr->ref, 0))
+			  || TREE_CODE (ref) != SSA_NAME)
+			break;
 
-		  if (SSA_NAME_IS_DEFAULT_DEF (ref))
-		    break;
+		      if (SSA_NAME_IS_DEFAULT_DEF (ref))
+			break;
 
-		  if (memrefs.contains ({ref, cst0}))
-		    break;
+		      if (memrefs.contains ({ref, cst0}))
+			break;
 
-		  memrefs.add ({ref, cst0});
+		      memrefs.add ({ref, cst0});
 
-		  /* In case we have not seen REF before and the base address
-		     is a pointer operation try a bit harder.  */
-		  tree base = DR_BASE_ADDRESS (dr);
-		  if (TREE_CODE (base) == POINTER_PLUS_EXPR
-		      || TREE_CODE (base) == POINTER_DIFF_EXPR)
-		    {
-		      /* Deconstruct BASE's first operand.  If it is a binary
-			 operation, i.e. a base and an "offset" store this
-			 pair.  Only increase the stmt_cost if we haven't seen
-			 it before.  */
-		      tree argp = TREE_OPERAND (base, 1);
-		      typedef std::pair<tree, tree> addr_pair;
-		      addr_pair pair;
-		      if (TREE_CODE_CLASS (TREE_CODE (argp)) == tcc_binary)
+		      /* In case we have not seen REF before and the base
+			 address is a pointer operation try a bit harder.  */
+		      tree base = DR_BASE_ADDRESS (dr);
+		      if (TREE_CODE (base) == POINTER_PLUS_EXPR
+			  || TREE_CODE (base) == POINTER_DIFF_EXPR)
 			{
-			  tree argp0 = tree_strip_nop_conversions
-			    (TREE_OPERAND (argp, 0));
-			  tree argp1 = TREE_OPERAND (argp, 1);
-			  pair = addr_pair (argp0, argp1);
-			  if (memrefs.contains (pair))
-			    break;
-
-			  memrefs.add (pair);
-			  stmt_cost += builtin_vectorization_cost (scalar_stmt,
-								   NULL_TREE, 0);
+			  /* Deconstruct BASE's first operand.  If it is a
+			     binary operation, i.e. a base and an "offset"
+			     store this pair.  Only increase the stmt_cost if
+			     we haven't seen it before.  */
+			  tree argp = TREE_OPERAND (base, 1);
+			  typedef std::pair<tree, tree> addr_pair;
+			  addr_pair pair;
+			  if (TREE_CODE_CLASS (TREE_CODE (argp)) == tcc_binary)
+			    {
+			      tree argp0 = tree_strip_nop_conversions
+				(TREE_OPERAND (argp, 0));
+			      tree argp1 = TREE_OPERAND (argp, 1);
+			      pair = addr_pair (argp0, argp1);
+			      if (memrefs.contains (pair))
+				break;
+
+			      memrefs.add (pair);
+			      stmt_cost
+				+= builtin_vectorization_cost (scalar_stmt,
+							       NULL_TREE, 0);
+			    }
 			}
 		    }
 		}
 	    }
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 5e984ee2a55..6c2f0ec34f4 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -365,6 +365,8 @@ static const common_vector_cost rvv_vls_vector_cost = {
   1, /* fp_stmt_cost */
   1, /* gather_load_cost */
   1, /* scatter_store_cost */
+  2, /* segment_load_cost */
+  2, /* segment_store_cost */
   1, /* vec_to_scalar_cost */
   1, /* scalar_to_vec_cost */
   1, /* permute_cost */
@@ -381,6 +383,8 @@ static const scalable_vector_cost rvv_vla_vector_cost = {
   1, /* fp_stmt_cost */
   1, /* gather_load_cost */
   1, /* scatter_store_cost */
+  2, /* segment_load_cost */
+  2, /* segment_store_cost */
   1, /* vec_to_scalar_cost */
   1, /* scalar_to_vec_cost */
   1, /* permute_cost */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113112-4.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113112-4.c
index 5c55a66ed77..bdeedaf5224 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113112-4.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113112-4.c
@@ -21,8 +21,8 @@ void move_replacements (rtx *x, rtx *y, int n_replacements)
     }
 }
 
-/* { dg-final { scan-assembler {e64,m2} } } */
-/* { dg-final { scan-assembler-not {e64,m4} } } */
+/* { dg-final { scan-assembler-not {e64,m2} } } */
+/* { dg-final { scan-assembler {e64,m4} } } */
 /* { dg-final { scan-assembler-not {jr} } } */
 /* { dg-final { scan-assembler {ret} } } */
 /* { dg-final { scan-assembler-not {sp} } } */