From patchwork Fri May 31 03:25:25 2024
X-Patchwork-Submitter: HAO CHEN GUI
X-Patchwork-Id: 1941916
Message-ID: <8b8d0cdc-7725-4cb6-a31f-257392101b7a@linux.ibm.com>
Date: Fri, 31 May 2024 11:25:25 +0800
To: gcc-patches
Cc: Segher Boessenkool, David, "Kewen.Lin", Peter Bergner
From: HAO CHEN GUI
Subject: [PATCHv2, rs6000] Optimize vector construction with two vector doubleword loads [PR103568]
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

Hi,
  This patch optimizes vector construction with two vector doubleword
loads. It generates an optimal insn sequence because "xxlor" has lower
latency than "mtvsrdd" on Power10.

  Compared with the previous version, the main change is to use the
"isa" attribute to guard "lxsd" and "lxsdx".
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/653103.html

  Bootstrapped and tested on powerpc64-linux BE and LE with no
regressions. OK for the trunk?

Thanks
Gui Haochen

ChangeLog
rs6000: Optimize vector construction with two vector doubleword loads

When constructing a vector from two doublewords in memory, the compiler
currently generates

	ld 10,0(3)
	ld 9,0(4)
	mtvsrdd 34,9,10

An optimal sequence on Power10 should be

	lxsd 0,0(4)
	lxvrdx 1,0,3
	xxlor 34,1,32

This patch implements the optimization through insn combine and split.

gcc/
	PR target/103568
	* config/rs6000/vsx.md (vsx_ld_lowpart_zero_<mode>): New insn pattern.
	(vsx_ld_highpart_zero_<mode>): New insn pattern.
	(vsx_concat_mem_<mode>): New insn_and_split pattern.

gcc/testsuite/
	PR target/103568
	* gcc.target/powerpc/pr103568.c: New test.
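
For reviewers who want the idea behind the new sequence spelled out, here
is a minimal sketch in portable C. It is not part of the patch and does not
use any GCC internals. The type vreg128 and the helper names are
hypothetical, and the d0/d1 naming deliberately says nothing about which
instruction fills which doubleword on BE or LE. It only illustrates that
two loads which each zero the other doubleword can be merged with a plain
bitwise OR.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit vector register as two doublewords.
   The names d0/d1 are illustrative and do not imply an endianness.  */
typedef struct
{
  uint64_t d0;
  uint64_t d1;
} vreg128;

/* Model of a load that fills doubleword 0 and zeroes doubleword 1.  */
static vreg128
load_d0_zero_d1 (const uint64_t *p)
{
  vreg128 r = { *p, 0 };
  return r;
}

/* Model of a load that zeroes doubleword 0 and fills doubleword 1.  */
static vreg128
load_d1_zero_d0 (const uint64_t *p)
{
  vreg128 r = { 0, *p };
  return r;
}

/* Bitwise OR of two registers, the role "xxlor" plays in the sequence.  */
static vreg128
vreg_or (vreg128 x, vreg128 y)
{
  vreg128 r = { x.d0 | y.d0, x.d1 | y.d1 };
  return r;
}

/* Concatenate *a and *b via two zero-filling loads and one OR.  */
static vreg128
concat_via_or (const uint64_t *a, const uint64_t *b)
{
  return vreg_or (load_d0_zero_d1 (a), load_d1_zero_d0 (b));
}
```

Calling concat_via_or (&a, &b) yields a value whose d0 is a and whose d1 is
b, because OR with zero leaves the loaded doubleword unchanged.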
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index f135fa079bd..f9a2a260e89 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -1395,6 +1395,27 @@ (define_insn "vsx_ld_elemrev_v2di"
   "lxvd2x %x0,%y1"
   [(set_attr "type" "vecload")])
 
+(define_insn "vsx_ld_lowpart_zero_<mode>"
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=v,wa")
+	(vec_concat:VSX_D
+	  (match_operand:<VEC_base> 1 "memory_operand" "wY,Z")
+	  (match_operand:<VEC_base> 2 "zero_constant" "j,j")))]
+  ""
+  "@
+   lxsd %0,%1
+   lxsdx %x0,%y1"
+  [(set_attr "type" "vecload,vecload")
+   (set_attr "isa" "p9v,p7v")])
+
+(define_insn "vsx_ld_highpart_zero_<mode>"
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa")
+	(vec_concat:VSX_D
+	  (match_operand:<VEC_base> 1 "zero_constant" "j")
+	  (match_operand:<VEC_base> 2 "memory_operand" "Z")))]
+  "TARGET_POWER10"
+  "lxvrdx %x0,%y2"
+  [(set_attr "type" "vecload")])
+
 (define_insn "vsx_ld_elemrev_v1ti"
   [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
 	(vec_select:V1TI
@@ -3063,6 +3084,26 @@ (define_insn "vsx_concat_<mode>"
 }
   [(set_attr "type" "vecperm,vecmove")])
 
+(define_insn_and_split "vsx_concat_mem_<mode>"
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=v,wa")
+	(vec_concat:VSX_D
+	  (match_operand:<VEC_base> 1 "memory_operand" "wY,Z")
+	  (match_operand:<VEC_base> 2 "memory_operand" "Z,Z")))]
+  "TARGET_POWER10 && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  rtx tmp1 = gen_reg_rtx (<MODE>mode);
+  rtx tmp2 = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_vsx_ld_highpart_zero_<mode> (tmp1, CONST0_RTX (<VEC_base>mode),
+					       operands[1]));
+  emit_insn (gen_vsx_ld_lowpart_zero_<mode> (tmp2, operands[2],
+					      CONST0_RTX (<VEC_base>mode)));
+  emit_insn (gen_ior<mode>3 (operands[0], tmp1, tmp2));
+  DONE;
+})
+
 ;; Combiner patterns to allow creating XXPERMDI's to access either double
 ;; word element in a vector register.
 (define_insn "*vsx_concat_<mode>_1"
diff --git a/gcc/testsuite/gcc.target/powerpc/pr103568.c b/gcc/testsuite/gcc.target/powerpc/pr103568.c
new file mode 100644
index 00000000000..b2a06fb2162
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103568.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+vector double test (double *a, double *b)
+{
+  return (vector double) {*a, *b};
+}
+
+vector long long test1 (long long *a, long long *b)
+{
+  return (vector long long) {*a, *b};
+}
+
+/* { dg-final { scan-assembler-times {\mlxsd} 2 } } */
+/* { dg-final { scan-assembler-times {\mlxvrdx\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mxxlor\M} 2 } } */
+
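
The new test is compile-only and checks the emitted instructions. As a
complement, the following is a rough sketch of a value check for the same
two constructors. It is not part of the patch. It assumes GCC's generic
vector extension (vector_size) instead of the AltiVec "vector" keyword so
that it also builds on non-Power hosts, and the names v2df, v2di,
build_v2df and build_v2di are made up for this illustration.

```c
/* Generic 16-byte vector types; assumed stand-ins for "vector double"
   and "vector long long" in the test above.  */
typedef double v2df __attribute__ ((vector_size (16)));
typedef long long v2di __attribute__ ((vector_size (16)));

/* Same construction as test () in pr103568.c: two loads, one vector.  */
v2df
build_v2df (const double *a, const double *b)
{
  return (v2df) {*a, *b};
}

/* Same construction as test1 () in pr103568.c.  */
v2di
build_v2di (const long long *a, const long long *b)
{
  return (v2di) {*a, *b};
}
```

Element 0 of each result should equal *a and element 1 should equal *b,
whichever instruction sequence the compiler picks for the construction.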