From patchwork Mon Feb 27 12:32:39 2023
From: Tamar Christina
Date: Mon, 27 Feb 2023 12:32:39 +0000
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, rguenther@suse.de, jlaw@ventanamicro.com
Subject: [PATCH 1/4]middle-end: Revert can_special_div_by_const changes [PR108583]
List-Id: Gcc-patches mailing list

Hi All,

This reverts
the changes for the CAN_SPECIAL_DIV_BY_CONST hook.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	PR target/108583
	* doc/tm.texi (TARGET_VECTORIZE_CAN_SPECIAL_DIV_BY_CONST): Remove.
	* doc/tm.texi.in: Likewise.
	* explow.cc (round_push, align_dynamic_address): Revert previous patch.
	* expmed.cc (expand_divmod): Likewise.
	* expmed.h (expand_divmod): Likewise.
	* expr.cc (force_operand, expand_expr_divmod): Likewise.
	* optabs.cc (expand_doubleword_mod, expand_doubleword_divmod):
	Likewise.
	* target.def (can_special_div_by_const): Remove.
	* target.h: Remove tree-core.h include.
	* targhooks.cc (default_can_special_div_by_const): Remove.
	* targhooks.h (default_can_special_div_by_const): Remove.
	* tree-vect-generic.cc (expand_vector_operation): Remove hook.
	* tree-vect-patterns.cc (vect_recog_divmod_pattern): Remove hook.
	* tree-vect-stmts.cc (vectorizable_operation): Remove hook.

--- inline copy of patch --

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index c6c891972d1e58cd163b259ba96a599d62326865..50a8872a6695b18b9bed0d393bacf733833633db 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6137,20 +6137,6 @@ instruction pattern.  There is no need for the hook to handle these two
 implementation approaches itself.
 @end deftypefn
 
-@deftypefn {Target Hook} bool TARGET_VECTORIZE_CAN_SPECIAL_DIV_BY_CONST (enum @var{tree_code}, tree @var{vectype}, wide_int @var{constant}, rtx *@var{output}, rtx @var{in0}, rtx @var{in1})
-This hook is used to test whether the target has a special method of
-division of vectors of type @var{vectype} using the value @var{constant},
-and producing a vector of type @var{vectype}.  The division
-will then not be decomposed by the vectorizer and kept as a div.
-
-When the hook is being used to test whether the target supports a special
-divide, @var{in0}, @var{in1}, and @var{output} are all null.  When the hook
-is being used to emit a division, @var{in0} and @var{in1} are the source
-vectors of type @var{vecttype} and @var{output} is the destination vector of
-type @var{vectype}.
-
-Return true if the operation is possible, emitting instructions for it
-if rtxes are provided and updating @var{output}.
 @end deftypefn
 
 @deftypefn {Target Hook} tree TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION (unsigned @var{code}, tree @var{vec_type_out}, tree @var{vec_type_in})
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 613b2534149415f442163d599503efaf423b673b..3e07978a02f4e6077adae6cadc93ea4273295f1f 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4173,7 +4173,6 @@ address;  but often a machine-dependent strategy can generate better code.
 
 @hook TARGET_VECTORIZE_VEC_PERM_CONST
 
-@hook TARGET_VECTORIZE_CAN_SPECIAL_DIV_BY_CONST
 
 @hook TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION
diff --git a/gcc/explow.cc b/gcc/explow.cc
index 83439b32abe1b9aa4b7983eb629804f97486acbd..be9195b33323ee5597fc212f0befa016eea4573c 100644
--- a/gcc/explow.cc
+++ b/gcc/explow.cc
@@ -1037,7 +1037,7 @@ round_push (rtx size)
      TRUNC_DIV_EXPR.  */
   size = expand_binop (Pmode, add_optab, size, alignm1_rtx,
		       NULL_RTX, 1, OPTAB_LIB_WIDEN);
-  size = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, NULL, NULL, size, align_rtx,
+  size = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, size, align_rtx,
			NULL_RTX, 1);
   size = expand_mult (Pmode, size, align_rtx, NULL_RTX, 1);
@@ -1203,7 +1203,7 @@ align_dynamic_address (rtx target, unsigned required_align)
		       gen_int_mode (required_align / BITS_PER_UNIT - 1,
				     Pmode),
		       NULL_RTX, 1, OPTAB_LIB_WIDEN);
-  target = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, NULL, NULL, target,
+  target = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, target,
			  gen_int_mode (required_align / BITS_PER_UNIT,
					Pmode),
			  NULL_RTX, 1);
diff --git a/gcc/expmed.h b/gcc/expmed.h
index 0419e2dac85850889ce0bee59515e31a80c582de..4dfe635c22ee49f2dba4c53640941628068f3901 100644
--- a/gcc/expmed.h
+++ b/gcc/expmed.h
@@ -710,9 +710,8 @@ extern rtx expand_shift (enum tree_code, machine_mode, rtx, poly_int64, rtx,
			 int);
 extern rtx maybe_expand_shift (enum tree_code, machine_mode, rtx, int, rtx,
			       int);
 #ifdef GCC_OPTABS_H
-extern rtx expand_divmod (int, enum tree_code, machine_mode, tree, tree,
-			  rtx, rtx, rtx, int,
-			  enum optab_methods = OPTAB_LIB_WIDEN);
+extern rtx expand_divmod (int, enum tree_code, machine_mode, rtx, rtx,
+			  rtx, int, enum optab_methods = OPTAB_LIB_WIDEN);
 #endif
 
 #endif
diff --git a/gcc/expmed.cc b/gcc/expmed.cc
index 917360199ca56157cf3c3693b65e93cd9d8ed244..1553ea8e31eb6433025ab18a3a59c169d3b7692f 100644
--- a/gcc/expmed.cc
+++ b/gcc/expmed.cc
@@ -4222,8 +4222,8 @@ expand_sdiv_pow2 (scalar_int_mode mode, rtx op0, HOST_WIDE_INT d)
 
 rtx
 expand_divmod (int rem_flag, enum tree_code code, machine_mode mode,
-	       tree treeop0, tree treeop1, rtx op0, rtx op1, rtx target,
-	       int unsignedp, enum optab_methods methods)
+	       rtx op0, rtx op1, rtx target, int unsignedp,
+	       enum optab_methods methods)
 {
   machine_mode compute_mode;
   rtx tquotient;
@@ -4375,17 +4375,6 @@ expand_divmod (int rem_flag, enum tree_code code, machine_mode mode,
 
   last_div_const = ! rem_flag && op1_is_constant ? INTVAL (op1) : 0;
 
-  /* Check if the target has specific expansions for the division.  */
-  tree cst;
-  if (treeop0
-      && treeop1
-      && (cst = uniform_integer_cst_p (treeop1))
-      && targetm.vectorize.can_special_div_by_const (code, TREE_TYPE (treeop0),
-						     wi::to_wide (cst),
-						     &target, op0, op1))
-    return target;
-
-
   /* Now convert to the best mode to use.  */
   if (compute_mode != mode)
     {
@@ -4629,8 +4618,8 @@ expand_divmod (int rem_flag, enum tree_code code, machine_mode mode,
		      || (optab_handler (sdivmod_optab, int_mode)
			  != CODE_FOR_nothing)))
		quotient = expand_divmod (0, TRUNC_DIV_EXPR,
-					  int_mode, treeop0, treeop1,
-					  op0, gen_int_mode (abs_d,
+					  int_mode, op0,
+					  gen_int_mode (abs_d,
							     int_mode),
					  NULL_RTX, 0);
	      else
@@ -4819,8 +4808,8 @@ expand_divmod (int rem_flag, enum tree_code code, machine_mode mode,
				   size - 1, NULL_RTX, 0);
		t3 = force_operand (gen_rtx_MINUS (int_mode, t1, nsign),
				    NULL_RTX);
-		t4 = expand_divmod (0, TRUNC_DIV_EXPR, int_mode, treeop0,
-				    treeop1, t3, op1, NULL_RTX, 0);
+		t4 = expand_divmod (0, TRUNC_DIV_EXPR, int_mode, t3, op1,
+				    NULL_RTX, 0);
		if (t4)
		  {
		    rtx t5;
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 15be1c8db999103bb9e5fa33daa44ae06de5ace8..78d35297e755216339078d5b2280c6e277f26d72 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -8207,17 +8207,16 @@ force_operand (rtx value, rtx target)
       return expand_divmod (0, FLOAT_MODE_P (GET_MODE (value))
			    ? RDIV_EXPR : TRUNC_DIV_EXPR,
-			    GET_MODE (value), NULL, NULL, op1, op2,
-			    target, 0);
+			    GET_MODE (value), op1, op2, target, 0);
     case MOD:
-      return expand_divmod (1, TRUNC_MOD_EXPR, GET_MODE (value), NULL, NULL,
-			    op1, op2, target, 0);
+      return expand_divmod (1, TRUNC_MOD_EXPR, GET_MODE (value), op1, op2,
+			    target, 0);
     case UDIV:
-      return expand_divmod (0, TRUNC_DIV_EXPR, GET_MODE (value), NULL, NULL,
-			    op1, op2, target, 1);
+      return expand_divmod (0, TRUNC_DIV_EXPR, GET_MODE (value), op1, op2,
+			    target, 1);
     case UMOD:
-      return expand_divmod (1, TRUNC_MOD_EXPR, GET_MODE (value), NULL, NULL,
-			    op1, op2, target, 1);
+      return expand_divmod (1, TRUNC_MOD_EXPR, GET_MODE (value), op1, op2,
+			    target, 1);
     case ASHIFTRT:
       return expand_simple_binop (GET_MODE (value), code, op1, op2,
				  target, 0, OPTAB_LIB_WIDEN);
@@ -9170,13 +9169,11 @@ expand_expr_divmod (tree_code code, machine_mode mode, tree treeop0,
       bool speed_p = optimize_insn_for_speed_p ();
       do_pending_stack_adjust ();
       start_sequence ();
-      rtx uns_ret = expand_divmod (mod_p, code, mode, treeop0, treeop1,
-				   op0, op1, target, 1);
+      rtx uns_ret = expand_divmod (mod_p, code, mode, op0, op1, target, 1);
       rtx_insn *uns_insns = get_insns ();
       end_sequence ();
       start_sequence ();
-      rtx sgn_ret = expand_divmod (mod_p, code, mode, treeop0, treeop1,
-				   op0, op1, target, 0);
+      rtx sgn_ret = expand_divmod (mod_p, code, mode, op0, op1, target, 0);
       rtx_insn *sgn_insns = get_insns ();
       end_sequence ();
       unsigned uns_cost = seq_cost (uns_insns, speed_p);
@@ -9198,8 +9195,7 @@ expand_expr_divmod (tree_code code, machine_mode mode, tree treeop0,
	  emit_insn (sgn_insns);
	  return sgn_ret;
	}
-  return expand_divmod (mod_p, code, mode, treeop0, treeop1,
-			op0, op1, target, unsignedp);
+  return expand_divmod (mod_p, code, mode, op0, op1, target, unsignedp);
 }
 
 rtx
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index cf22bfec3f5513f56d22c866231edbf322ff6945..474ccbd7915b4f144cebe0369a6e77082c1e617b 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -1106,9 +1106,8 @@ expand_doubleword_mod (machine_mode mode, rtx op0, rtx op1, bool unsignedp)
	      return NULL_RTX;
	    }
	}
-  rtx remainder = expand_divmod (1, TRUNC_MOD_EXPR, word_mode, NULL, NULL,
-				 sum, gen_int_mode (INTVAL (op1),
-						    word_mode),
+  rtx remainder = expand_divmod (1, TRUNC_MOD_EXPR, word_mode, sum,
+				 gen_int_mode (INTVAL (op1), word_mode),
				 NULL_RTX, 1, OPTAB_DIRECT);
   if (remainder == NULL_RTX)
     return NULL_RTX;
@@ -1211,8 +1210,8 @@ expand_doubleword_divmod (machine_mode mode, rtx op0, rtx op1, rtx *rem,
 
   if (op11 != const1_rtx)
     {
-      rtx rem2 = expand_divmod (1, TRUNC_MOD_EXPR, mode, NULL, NULL, quot1,
-				op11, NULL_RTX, unsignedp, OPTAB_DIRECT);
+      rtx rem2 = expand_divmod (1, TRUNC_MOD_EXPR, mode, quot1, op11,
+				NULL_RTX, unsignedp, OPTAB_DIRECT);
       if (rem2 == NULL_RTX)
	return NULL_RTX;
@@ -1226,8 +1225,8 @@ expand_doubleword_divmod (machine_mode mode, rtx op0, rtx op1, rtx *rem,
       if (rem2 == NULL_RTX)
	return NULL_RTX;
 
-      rtx quot2 = expand_divmod (0, TRUNC_DIV_EXPR, mode, NULL, NULL, quot1,
-				 op11, NULL_RTX, unsignedp, OPTAB_DIRECT);
+      rtx quot2 = expand_divmod (0, TRUNC_DIV_EXPR, mode, quot1, op11,
+				 NULL_RTX, unsignedp, OPTAB_DIRECT);
       if (quot2 == NULL_RTX)
	return NULL_RTX;
diff --git a/gcc/target.def b/gcc/target.def
index db8af0cbe81624513f114fc9bbd8be61d855f409..e0a5c7adbd962f5d08ed08d1d81afa2c2baa64a5 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1905,25 +1905,6 @@ implementation approaches itself.",
	const vec_perm_indices &sel),
 NULL)
 
-DEFHOOK
-(can_special_div_by_const,
- "This hook is used to test whether the target has a special method of\n\
-division of vectors of type @var{vectype} using the value @var{constant},\n\
-and producing a vector of type @var{vectype}.  The division\n\
-will then not be decomposed by the vectorizer and kept as a div.\n\
-\n\
-When the hook is being used to test whether the target supports a special\n\
-divide, @var{in0}, @var{in1}, and @var{output} are all null.  When the hook\n\
-is being used to emit a division, @var{in0} and @var{in1} are the source\n\
-vectors of type @var{vecttype} and @var{output} is the destination vector of\n\
-type @var{vectype}.\n\
-\n\
-Return true if the operation is possible, emitting instructions for it\n\
-if rtxes are provided and updating @var{output}.",
- bool, (enum tree_code, tree vectype, wide_int constant, rtx *output,
-	rtx in0, rtx in1),
- default_can_special_div_by_const)
-
 /* Return true if the target supports misaligned store/load of a
    specific factor denoted in the third parameter.  The last parameter
    is true if the access is defined in a packed struct.  */
diff --git a/gcc/target.h b/gcc/target.h
index 03fd03a52075b4836159035ec14078c0aebdd7e9..93691882757232c514fca82b99f913158c2d47b1 100644
--- a/gcc/target.h
+++ b/gcc/target.h
@@ -51,7 +51,6 @@
 #include "insn-codes.h"
 #include "tm.h"
 #include "hard-reg-set.h"
-#include "tree-core.h"
 
 #if CHECKING_P
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index a1df260f5483dc84f18d8f12c5202484a32d5bb7..a6a4809ca91baa5d7fad2244549317a31390f0c2 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -209,8 +209,6 @@ extern void default_addr_space_diagnose_usage (addr_space_t, location_t);
 extern rtx default_addr_space_convert (rtx, tree, tree);
 extern unsigned int default_case_values_threshold (void);
 extern bool default_have_conditional_execution (void);
-extern bool default_can_special_div_by_const (enum tree_code, tree, wide_int,
-					      rtx *, rtx, rtx);
 extern bool default_libc_has_function (enum function_class, tree);
 extern bool default_libc_has_fast_function (int fcode);
diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
index fe0116521feaf32187e7bc113bf93b1805852c79..211525720a620d6f533e2da91e03877337a931e7 100644
--- a/gcc/targhooks.cc
+++ b/gcc/targhooks.cc
@@ -1840,14 +1840,6 @@ default_have_conditional_execution (void)
   return HAVE_conditional_execution;
 }
 
-/* Default that no division by constant operations are special.  */
-bool
-default_can_special_div_by_const (enum tree_code, tree, wide_int, rtx *, rtx,
-				  rtx)
-{
-  return false;
-}
-
 /* By default we assume that c99 functions are present at the
    runtime, but sincos is not.  */
 bool
diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc
index 166a248f4b9512d4c6fc8d760b458b7a467f7790..519a824ec727d4d4f28c14077dc3e970bed75ef6 100644
--- a/gcc/tree-vect-generic.cc
+++ b/gcc/tree-vect-generic.cc
@@ -1237,17 +1237,6 @@ expand_vector_operation (gimple_stmt_iterator *gsi, tree type, tree compute_type
	  tree rhs2 = gimple_assign_rhs2 (assign);
	  tree ret;
 
-	  /* Check if the target was going to handle it through the special
-	     division callback hook.  */
-	  tree cst = uniform_integer_cst_p (rhs2);
-	  if (cst &&
-	      targetm.vectorize.can_special_div_by_const (code, type,
-							  wi::to_wide (cst),
-							  NULL,
-							  NULL_RTX, NULL_RTX))
-	    return NULL_TREE;
-
-
	  if (!optimize
	      || !VECTOR_INTEGER_TYPE_P (type)
	      || TREE_CODE (rhs2) != VECTOR_CST
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 6934aebc69f231af24668f0a1c3d140e97f55487..1766ce277d6b88d8aa3be77e7c8abb504a10a735 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -3913,14 +3913,6 @@ vect_recog_divmod_pattern (vec_info *vinfo,
 
       return pattern_stmt;
     }
-  else if ((cst = uniform_integer_cst_p (oprnd1))
-	   && targetm.vectorize.can_special_div_by_const (rhs_code, vectype,
-							  wi::to_wide (cst),
-							  NULL, NULL_RTX,
-							  NULL_RTX))
-    {
-      return NULL;
-    }
 
   if (prec > HOST_BITS_PER_WIDE_INT
      || integer_zerop (oprnd1))
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index eb4ca1f184e374d177eb43d5eb93acf6e6a8fde9..3a0fb5ad898ad42c3867f0b9564fc4e066e50081 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -6263,15 +6263,6 @@ vectorizable_operation (vec_info *vinfo,
	}
       target_support_p = (optab_handler (optab, vec_mode)
			   != CODE_FOR_nothing);
-      tree cst;
-      if (!target_support_p
-	  && op1
-	  && (cst = uniform_integer_cst_p (op1)))
-	target_support_p
-	  = targetm.vectorize.can_special_div_by_const (code, vectype,
-							wi::to_wide (cst),
-							NULL, NULL_RTX,
-							NULL_RTX);
     }
 
   bool using_emulated_vectors_p = vect_emulated_vector_p (vectype);
*/ -bool -default_can_special_div_by_const (enum tree_code, tree, wide_int, rtx *, rtx, - rtx) -{ - return false; -} - /* By default we assume that c99 functions are present at the runtime, but sincos is not. */ bool diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc index 166a248f4b9512d4c6fc8d760b458b7a467f7790..519a824ec727d4d4f28c14077dc3e970bed75ef6 100644 --- a/gcc/tree-vect-generic.cc +++ b/gcc/tree-vect-generic.cc @@ -1237,17 +1237,6 @@ expand_vector_operation (gimple_stmt_iterator *gsi, tree type, tree compute_type tree rhs2 = gimple_assign_rhs2 (assign); tree ret; - /* Check if the target was going to handle it through the special - division callback hook. */ - tree cst = uniform_integer_cst_p (rhs2); - if (cst && - targetm.vectorize.can_special_div_by_const (code, type, - wi::to_wide (cst), - NULL, - NULL_RTX, NULL_RTX)) - return NULL_TREE; - - if (!optimize || !VECTOR_INTEGER_TYPE_P (type) || TREE_CODE (rhs2) != VECTOR_CST diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 6934aebc69f231af24668f0a1c3d140e97f55487..1766ce277d6b88d8aa3be77e7c8abb504a10a735 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -3913,14 +3913,6 @@ vect_recog_divmod_pattern (vec_info *vinfo, return pattern_stmt; } - else if ((cst = uniform_integer_cst_p (oprnd1)) - && targetm.vectorize.can_special_div_by_const (rhs_code, vectype, - wi::to_wide (cst), - NULL, NULL_RTX, - NULL_RTX)) - { - return NULL; - } if (prec > HOST_BITS_PER_WIDE_INT || integer_zerop (oprnd1)) diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index eb4ca1f184e374d177eb43d5eb93acf6e6a8fde9..3a0fb5ad898ad42c3867f0b9564fc4e066e50081 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -6263,15 +6263,6 @@ vectorizable_operation (vec_info *vinfo, } target_support_p = (optab_handler (optab, vec_mode) != CODE_FOR_nothing); - tree cst; - if (!target_support_p - && op1 - && (cst = uniform_integer_cst_p (op1))) - target_support_p - = 
targetm.vectorize.can_special_div_by_const (code, vectype, - wi::to_wide (cst), - NULL, NULL_RTX, - NULL_RTX); } bool using_emulated_vectors_p = vect_emulated_vector_p (vectype);

From patchwork Mon Feb 27 12:33:18 2023
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1748676
Date: Mon, 27 Feb 2023 12:33:18 +0000
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, amacleod@redhat.com, aldyh@redhat.com
Subject: [PATCH 2/4][ranger]: Add range-ops for widen addition and widen multiplication [PR108583]
From: Tamar Christina

Hi All, This adds range-ops
for widening addition and widening multiplication. I couldn't figure out how to write a test for this. It looks like there are self tests but not a way to write standalone ones? I did create testcases in patch 3/4 which test the end result. Bootstrapped and regtested on aarch64-none-linux-gnu with no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: PR target/108583 * gimple-range-op.h (gimple_range_op_handler): Add maybe_non_standard. * gimple-range-op.cc (gimple_range_op_handler::gimple_range_op_handler): Use it. (gimple_range_op_handler::maybe_non_standard): New. * range-op.cc (class operator_widen_plus_signed, operator_widen_plus_signed::wi_fold, class operator_widen_plus_unsigned, operator_widen_plus_unsigned::wi_fold, class operator_widen_mult_signed, operator_widen_mult_signed::wi_fold, class operator_widen_mult_unsigned, operator_widen_mult_unsigned::wi_fold, ptr_op_widen_mult_signed, ptr_op_widen_mult_unsigned, ptr_op_widen_plus_signed, ptr_op_widen_plus_unsigned): New. * range-op.h (ptr_op_widen_mult_signed, ptr_op_widen_mult_unsigned, ptr_op_widen_plus_signed, ptr_op_widen_plus_unsigned): New. Co-Authored-By: Andrew MacLeod --- inline copy of patch -- diff --git a/gcc/gimple-range-op.h b/gcc/gimple-range-op.h index 743b858126e333ea9590c0f175aacb476260c048..1bf63c5ce6f5db924a1f5907ab4539e376281bd0 100644 --- a/gcc/gimple-range-op.h +++ b/gcc/gimple-range-op.h @@ -41,6 +41,7 @@ public: relation_trio = TRIO_VARYING); private: void maybe_builtin_call (); + void maybe_non_standard (); gimple *m_stmt; tree m_op1, m_op2; }; diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc index d9dfdc56939bb62ade72726b15c3d5e87e4ddcd1..ad13c873c6303db5b68b74db1562c0db6763101f 100644 --- a/gcc/gimple-range-op.cc +++ b/gcc/gimple-range-op.cc @@ -179,6 +179,8 @@ gimple_range_op_handler::gimple_range_op_handler
(gimple *s) // statements. if (is_a (m_stmt)) maybe_builtin_call (); + else + maybe_non_standard (); } // Calculate what we can determine of the range of this unary @@ -764,6 +766,44 @@ public: } } op_cfn_parity; +// Set up a gimple_range_op_handler for any nonstandard function which can be +// supported via range-ops. + +void +gimple_range_op_handler::maybe_non_standard () +{ + range_operator *signed_op = ptr_op_widen_mult_signed; + range_operator *unsigned_op = ptr_op_widen_mult_unsigned; + if (gimple_code (m_stmt) == GIMPLE_ASSIGN) + switch (gimple_assign_rhs_code (m_stmt)) + { + case WIDEN_PLUS_EXPR: + { + signed_op = ptr_op_widen_plus_signed; + unsigned_op = ptr_op_widen_plus_unsigned; + } + gcc_fallthrough (); + case WIDEN_MULT_EXPR: + { + m_valid = true; + m_op1 = gimple_assign_rhs1 (m_stmt); + m_op2 = gimple_assign_rhs2 (m_stmt); + bool signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED; + bool signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED; + if (signed2 && !signed1) + std::swap (m_op1, m_op2); + + if (signed1 || signed2) + m_int = signed_op; + else + m_int = unsigned_op; + break; + } + default: + break; + } +} + // Set up a gimple_range_op_handler for any built in function which can be // supported via range-ops. diff --git a/gcc/range-op.h b/gcc/range-op.h index f00b747f08a1fa8404c63bfe5a931b4048008b03..b1eeac70df81f2bdf228af7adff5399e7ac5e5d6 100644 --- a/gcc/range-op.h +++ b/gcc/range-op.h @@ -311,4 +311,8 @@ private: // This holds the range op table for floating point operations. 
extern floating_op_table *floating_tree_table; +extern range_operator *ptr_op_widen_mult_signed; +extern range_operator *ptr_op_widen_mult_unsigned; +extern range_operator *ptr_op_widen_plus_signed; +extern range_operator *ptr_op_widen_plus_unsigned; #endif // GCC_RANGE_OP_H diff --git a/gcc/range-op.cc b/gcc/range-op.cc index 5c67bce6d3aab81ad3186b902e09d6a96878d9bb..718ccb6f074e1a2a9ef1b7a5d4e879898d4a7fc3 100644 --- a/gcc/range-op.cc +++ b/gcc/range-op.cc @@ -1556,6 +1556,73 @@ operator_plus::op2_range (irange &r, tree type, return op1_range (r, type, lhs, op1, rel.swap_op1_op2 ()); } +class operator_widen_plus_signed : public range_operator +{ +public: + virtual void wi_fold (irange &r, tree type, + const wide_int &lh_lb, + const wide_int &lh_ub, + const wide_int &rh_lb, + const wide_int &rh_ub) const; +} op_widen_plus_signed; +range_operator *ptr_op_widen_plus_signed = &op_widen_plus_signed; + +void +operator_widen_plus_signed::wi_fold (irange &r, tree type, + const wide_int &lh_lb, + const wide_int &lh_ub, + const wide_int &rh_lb, + const wide_int &rh_ub) const +{ + wi::overflow_type ov_lb, ov_ub; + signop s = TYPE_SIGN (type); + + wide_int lh_wlb + = wide_int::from (lh_lb, wi::get_precision (lh_lb) * 2, SIGNED); + wide_int lh_wub + = wide_int::from (lh_ub, wi::get_precision (lh_ub) * 2, SIGNED); + wide_int rh_wlb = wide_int::from (rh_lb, wi::get_precision (rh_lb) * 2, s); + wide_int rh_wub = wide_int::from (rh_ub, wi::get_precision (rh_ub) * 2, s); + + wide_int new_lb = wi::add (lh_wlb, rh_wlb, s, &ov_lb); + wide_int new_ub = wi::add (lh_wub, rh_wub, s, &ov_ub); + + r = int_range<2> (type, new_lb, new_ub); +} + +class operator_widen_plus_unsigned : public range_operator +{ +public: + virtual void wi_fold (irange &r, tree type, + const wide_int &lh_lb, + const wide_int &lh_ub, + const wide_int &rh_lb, + const wide_int &rh_ub) const; +} op_widen_plus_unsigned; +range_operator *ptr_op_widen_plus_unsigned = &op_widen_plus_unsigned; + +void 
+operator_widen_plus_unsigned::wi_fold (irange &r, tree type, + const wide_int &lh_lb, + const wide_int &lh_ub, + const wide_int &rh_lb, + const wide_int &rh_ub) const +{ + wi::overflow_type ov_lb, ov_ub; + signop s = TYPE_SIGN (type); + + wide_int lh_wlb + = wide_int::from (lh_lb, wi::get_precision (lh_lb) * 2, UNSIGNED); + wide_int lh_wub + = wide_int::from (lh_ub, wi::get_precision (lh_ub) * 2, UNSIGNED); + wide_int rh_wlb = wide_int::from (rh_lb, wi::get_precision (rh_lb) * 2, s); + wide_int rh_wub = wide_int::from (rh_ub, wi::get_precision (rh_ub) * 2, s); + + wide_int new_lb = wi::add (lh_wlb, rh_wlb, s, &ov_lb); + wide_int new_ub = wi::add (lh_wub, rh_wub, s, &ov_ub); + + r = int_range<2> (type, new_lb, new_ub); +} class operator_minus : public range_operator { @@ -2031,6 +2098,70 @@ operator_mult::wi_fold (irange &r, tree type, } } +class operator_widen_mult_signed : public range_operator +{ +public: + virtual void wi_fold (irange &r, tree type, + const wide_int &lh_lb, + const wide_int &lh_ub, + const wide_int &rh_lb, + const wide_int &rh_ub) + const; +} op_widen_mult_signed; +range_operator *ptr_op_widen_mult_signed = &op_widen_mult_signed; + +void +operator_widen_mult_signed::wi_fold (irange &r, tree type, + const wide_int &lh_lb, + const wide_int &lh_ub, + const wide_int &rh_lb, + const wide_int &rh_ub) const +{ + signop s = TYPE_SIGN (type); + + wide_int lh_wlb = wide_int::from (lh_lb, wi::get_precision (lh_lb) * 2, SIGNED); + wide_int lh_wub = wide_int::from (lh_ub, wi::get_precision (lh_ub) * 2, SIGNED); + wide_int rh_wlb = wide_int::from (rh_lb, wi::get_precision (rh_lb) * 2, s); + wide_int rh_wub = wide_int::from (rh_ub, wi::get_precision (rh_ub) * 2, s); + + /* We don't expect a widening multiplication to be able to overflow but range + calculations for multiplications are complicated. After widening the + operands lets call the base class. 
*/ + return op_mult.wi_fold (r, type, lh_wlb, lh_wub, rh_wlb, rh_wub); +} + + +class operator_widen_mult_unsigned : public range_operator +{ +public: + virtual void wi_fold (irange &r, tree type, + const wide_int &lh_lb, + const wide_int &lh_ub, + const wide_int &rh_lb, + const wide_int &rh_ub) + const; +} op_widen_mult_unsigned; +range_operator *ptr_op_widen_mult_unsigned = &op_widen_mult_unsigned; + +void +operator_widen_mult_unsigned::wi_fold (irange &r, tree type, + const wide_int &lh_lb, + const wide_int &lh_ub, + const wide_int &rh_lb, + const wide_int &rh_ub) const +{ + signop s = TYPE_SIGN (type); + + wide_int lh_wlb = wide_int::from (lh_lb, wi::get_precision (lh_lb) * 2, UNSIGNED); + wide_int lh_wub = wide_int::from (lh_ub, wi::get_precision (lh_ub) * 2, UNSIGNED); + wide_int rh_wlb = wide_int::from (rh_lb, wi::get_precision (rh_lb) * 2, s); + wide_int rh_wub = wide_int::from (rh_ub, wi::get_precision (rh_ub) * 2, s); + + /* We don't expect a widening multiplication to be able to overflow but range + calculations for multiplications are complicated. After widening the + operands lets call the base class. */ + return op_mult.wi_fold (r, type, lh_wlb, lh_wub, rh_wlb, rh_wub); +} class operator_div : public cross_product_operator { --- a/gcc/gimple-range-op.h +++ b/gcc/gimple-range-op.h @@ -41,6 +41,7 @@ public: relation_trio = TRIO_VARYING); private: void maybe_builtin_call (); + void maybe_non_standard (); gimple *m_stmt; tree m_op1, m_op2; }; diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc index d9dfdc56939bb62ade72726b15c3d5e87e4ddcd1..ad13c873c6303db5b68b74db1562c0db6763101f 100644 --- a/gcc/gimple-range-op.cc +++ b/gcc/gimple-range-op.cc @@ -179,6 +179,8 @@ gimple_range_op_handler::gimple_range_op_handler (gimple *s) // statements. 
if (is_a (m_stmt)) maybe_builtin_call (); + else + maybe_non_standard (); } // Calculate what we can determine of the range of this unary @@ -764,6 +766,44 @@ public: } } op_cfn_parity; +// Set up a gimple_range_op_handler for any nonstandard function which can be +// supported via range-ops. + +void +gimple_range_op_handler::maybe_non_standard () +{ + range_operator *signed_op = ptr_op_widen_mult_signed; + range_operator *unsigned_op = ptr_op_widen_mult_unsigned; + if (gimple_code (m_stmt) == GIMPLE_ASSIGN) + switch (gimple_assign_rhs_code (m_stmt)) + { + case WIDEN_PLUS_EXPR: + { + signed_op = ptr_op_widen_plus_signed; + unsigned_op = ptr_op_widen_plus_unsigned; + } + gcc_fallthrough (); + case WIDEN_MULT_EXPR: + { + m_valid = true; + m_op1 = gimple_assign_rhs1 (m_stmt); + m_op2 = gimple_assign_rhs2 (m_stmt); + bool signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED; + bool signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED; + if (signed2 && !signed1) + std::swap (m_op1, m_op2); + + if (signed1 || signed2) + m_int = signed_op; + else + m_int = unsigned_op; + break; + } + default: + break; + } +} + // Set up a gimple_range_op_handler for any built in function which can be // supported via range-ops. diff --git a/gcc/range-op.h b/gcc/range-op.h index f00b747f08a1fa8404c63bfe5a931b4048008b03..b1eeac70df81f2bdf228af7adff5399e7ac5e5d6 100644 --- a/gcc/range-op.h +++ b/gcc/range-op.h @@ -311,4 +311,8 @@ private: // This holds the range op table for floating point operations. 
extern floating_op_table *floating_tree_table; +extern range_operator *ptr_op_widen_mult_signed; +extern range_operator *ptr_op_widen_mult_unsigned; +extern range_operator *ptr_op_widen_plus_signed; +extern range_operator *ptr_op_widen_plus_unsigned; #endif // GCC_RANGE_OP_H diff --git a/gcc/range-op.cc b/gcc/range-op.cc index 5c67bce6d3aab81ad3186b902e09d6a96878d9bb..718ccb6f074e1a2a9ef1b7a5d4e879898d4a7fc3 100644 --- a/gcc/range-op.cc +++ b/gcc/range-op.cc @@ -1556,6 +1556,73 @@ operator_plus::op2_range (irange &r, tree type, return op1_range (r, type, lhs, op1, rel.swap_op1_op2 ()); } +class operator_widen_plus_signed : public range_operator +{ +public: + virtual void wi_fold (irange &r, tree type, + const wide_int &lh_lb, + const wide_int &lh_ub, + const wide_int &rh_lb, + const wide_int &rh_ub) const; +} op_widen_plus_signed; +range_operator *ptr_op_widen_plus_signed = &op_widen_plus_signed; + +void +operator_widen_plus_signed::wi_fold (irange &r, tree type, + const wide_int &lh_lb, + const wide_int &lh_ub, + const wide_int &rh_lb, + const wide_int &rh_ub) const +{ + wi::overflow_type ov_lb, ov_ub; + signop s = TYPE_SIGN (type); + + wide_int lh_wlb + = wide_int::from (lh_lb, wi::get_precision (lh_lb) * 2, SIGNED); + wide_int lh_wub + = wide_int::from (lh_ub, wi::get_precision (lh_ub) * 2, SIGNED); + wide_int rh_wlb = wide_int::from (rh_lb, wi::get_precision (rh_lb) * 2, s); + wide_int rh_wub = wide_int::from (rh_ub, wi::get_precision (rh_ub) * 2, s); + + wide_int new_lb = wi::add (lh_wlb, rh_wlb, s, &ov_lb); + wide_int new_ub = wi::add (lh_wub, rh_wub, s, &ov_ub); + + r = int_range<2> (type, new_lb, new_ub); +} + +class operator_widen_plus_unsigned : public range_operator +{ +public: + virtual void wi_fold (irange &r, tree type, + const wide_int &lh_lb, + const wide_int &lh_ub, + const wide_int &rh_lb, + const wide_int &rh_ub) const; +} op_widen_plus_unsigned; +range_operator *ptr_op_widen_plus_unsigned = &op_widen_plus_unsigned; + +void 
+operator_widen_plus_unsigned::wi_fold (irange &r, tree type, + const wide_int &lh_lb, + const wide_int &lh_ub, + const wide_int &rh_lb, + const wide_int &rh_ub) const +{ + wi::overflow_type ov_lb, ov_ub; + signop s = TYPE_SIGN (type); + + wide_int lh_wlb + = wide_int::from (lh_lb, wi::get_precision (lh_lb) * 2, UNSIGNED); + wide_int lh_wub + = wide_int::from (lh_ub, wi::get_precision (lh_ub) * 2, UNSIGNED); + wide_int rh_wlb = wide_int::from (rh_lb, wi::get_precision (rh_lb) * 2, s); + wide_int rh_wub = wide_int::from (rh_ub, wi::get_precision (rh_ub) * 2, s); + + wide_int new_lb = wi::add (lh_wlb, rh_wlb, s, &ov_lb); + wide_int new_ub = wi::add (lh_wub, rh_wub, s, &ov_ub); + + r = int_range<2> (type, new_lb, new_ub); +} class operator_minus : public range_operator { @@ -2031,6 +2098,70 @@ operator_mult::wi_fold (irange &r, tree type, } } +class operator_widen_mult_signed : public range_operator +{ +public: + virtual void wi_fold (irange &r, tree type, + const wide_int &lh_lb, + const wide_int &lh_ub, + const wide_int &rh_lb, + const wide_int &rh_ub) + const; +} op_widen_mult_signed; +range_operator *ptr_op_widen_mult_signed = &op_widen_mult_signed; + +void +operator_widen_mult_signed::wi_fold (irange &r, tree type, + const wide_int &lh_lb, + const wide_int &lh_ub, + const wide_int &rh_lb, + const wide_int &rh_ub) const +{ + signop s = TYPE_SIGN (type); + + wide_int lh_wlb = wide_int::from (lh_lb, wi::get_precision (lh_lb) * 2, SIGNED); + wide_int lh_wub = wide_int::from (lh_ub, wi::get_precision (lh_ub) * 2, SIGNED); + wide_int rh_wlb = wide_int::from (rh_lb, wi::get_precision (rh_lb) * 2, s); + wide_int rh_wub = wide_int::from (rh_ub, wi::get_precision (rh_ub) * 2, s); + + /* We don't expect a widening multiplication to be able to overflow but range + calculations for multiplications are complicated. After widening the + operands lets call the base class. 
+     */
+  return op_mult.wi_fold (r, type, lh_wlb, lh_wub, rh_wlb, rh_wub);
+}
+
+
+class operator_widen_mult_unsigned : public range_operator
+{
+public:
+  virtual void wi_fold (irange &r, tree type,
+                        const wide_int &lh_lb,
+                        const wide_int &lh_ub,
+                        const wide_int &rh_lb,
+                        const wide_int &rh_ub) const;
+} op_widen_mult_unsigned;
+range_operator *ptr_op_widen_mult_unsigned = &op_widen_mult_unsigned;
+
+void
+operator_widen_mult_unsigned::wi_fold (irange &r, tree type,
+                                       const wide_int &lh_lb,
+                                       const wide_int &lh_ub,
+                                       const wide_int &rh_lb,
+                                       const wide_int &rh_ub) const
+{
+  signop s = TYPE_SIGN (type);
+
+  wide_int lh_wlb = wide_int::from (lh_lb, wi::get_precision (lh_lb) * 2, UNSIGNED);
+  wide_int lh_wub = wide_int::from (lh_ub, wi::get_precision (lh_ub) * 2, UNSIGNED);
+  wide_int rh_wlb = wide_int::from (rh_lb, wi::get_precision (rh_lb) * 2, s);
+  wide_int rh_wub = wide_int::from (rh_ub, wi::get_precision (rh_ub) * 2, s);
+
+  /* We don't expect a widening multiplication to be able to overflow but
+     range calculations for multiplications are complicated.  After widening
+     the operands let's call the base class.
+     */
+  return op_mult.wi_fold (r, type, lh_wlb, lh_wub, rh_wlb, rh_wub);
+}

 class operator_div : public cross_product_operator
 {

From patchwork Mon Feb 27 12:33:55 2023
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1748681
Date: Mon, 27 Feb 2023 12:33:55 +0000
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, rguenther@suse.de, jlaw@ventanamicro.com,
 richard.sandiford@arm.com
Subject: [PATCH 3/4] middle-end: Implement preferred_div_as_shifts_over_mult [PR108583]
From: Tamar Christina

Hi All,

As Richard S wanted, this now implements a
hook preferred_div_as_shifts_over_mult that indicates whether a target prefers
that the vectorizer decomposes division as shifts rather than multiplication
when possible.

In order to be able to use this we need to check whether the current precision
has enough bits to do the operation without any of the additions overflowing.
We use range information to determine this and only do the operation if we're
sure an overflow won't occur.

This now uses ranger to do the range check.  This seems to work better than
vect_get_range_info, which uses range_query, but I have not switched the
interface of vect_get_range_info over in this PR fix.

As Andy said before, initializing a ranger instance is cheap but not free, and
if the intention is to call it often during a pass it should be instantiated at
pass startup and passed along to the places that need it.  That is a big
refactoring and doesn't seem right to do in this PR, but we should in GCC 14.
Currently we only instantiate it after a long series of much cheaper checks.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	PR target/108583
	* target.def (preferred_div_as_shifts_over_mult): New.
	* doc/tm.texi.in: Document it.
	* doc/tm.texi: Regenerate.
	* targhooks.cc (default_preferred_div_as_shifts_over_mult): New.
	* targhooks.h (default_preferred_div_as_shifts_over_mult): New.
	* tree-vect-patterns.cc (vect_recog_divmod_pattern): Use it.

gcc/testsuite/ChangeLog:

	PR target/108583
	* gcc.dg/vect/vect-div-bitmask-4.c: New test.
	* gcc.dg/vect/vect-div-bitmask-5.c: New test.

--- inline copy of patch --

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 50a8872a6695b18b9bed0d393bacf733833633db..c85196015e2e53047fcc65d32ef2d3203d2a6bab 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6137,6 +6137,9 @@ instruction pattern.
 There is no need for the hook to handle these two implementation approaches
 itself.
 @end deftypefn

+@deftypefn {Target Hook} bool TARGET_VECTORIZE_PREFERRED_DIV_AS_SHIFTS_OVER_MULT (void)
+When decomposing a division operation, if possible prefer to decompose the
+operation as shifts rather than multiplication by magic constants.
+@end deftypefn
+
 @deftypefn {Target Hook} tree TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION (unsigned @var{code}, tree @var{vec_type_out}, tree @var{vec_type_in})

diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 3e07978a02f4e6077adae6cadc93ea4273295f1f..0051017a7fd67691a343470f36ad4fc32c8e7e15 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4173,6 +4173,7 @@ address;  but often a machine-dependent strategy can generate better code.
 @hook TARGET_VECTORIZE_VEC_PERM_CONST

+@hook TARGET_VECTORIZE_PREFERRED_DIV_AS_SHIFTS_OVER_MULT
+
 @hook TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION

diff --git a/gcc/target.def b/gcc/target.def
index e0a5c7adbd962f5d08ed08d1d81afa2c2baa64a5..8cc18b1f3c5de24c21faf891b9d4d0b6fd5b59d7 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1868,6 +1868,15 @@ correct for most targets.",
 poly_uint64, (const_tree type),
 default_preferred_vector_alignment)

+/* Returns whether the target has a preference for decomposing divisions
+   using shifts rather than multiplies.  */
+DEFHOOK
+(preferred_div_as_shifts_over_mult,
+ "When decomposing a division operation, if possible prefer to decompose the\n\
+operation as shifts rather than multiplication by magic constants.",
+ bool, (void),
+ default_preferred_div_as_shifts_over_mult)
+
 /* Return true if vector alignment is reachable (by peeling N iterations)
    for the given scalar type.
    */
 DEFHOOK

diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index a6a4809ca91baa5d7fad2244549317a31390f0c2..dda011c59fbd5973ee648dfea195619cc41c71bc 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -53,6 +53,8 @@ extern scalar_int_mode default_unwind_word_mode (void);
 extern unsigned HOST_WIDE_INT default_shift_truncation_mask (machine_mode);
 extern unsigned int default_min_divisions_for_recip_mul (machine_mode);
+extern bool
+default_preferred_div_as_shifts_over_mult (void);
 extern int default_mode_rep_extended (scalar_int_mode, scalar_int_mode);
 extern tree default_stack_protect_guard (void);

diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
index 211525720a620d6f533e2da91e03877337a931e7..6396f344eef09dd61f358938846a1c02a70b31d8 100644
--- a/gcc/targhooks.cc
+++ b/gcc/targhooks.cc
@@ -1483,6 +1483,15 @@ default_preferred_vector_alignment (const_tree type)
   return TYPE_ALIGN (type);
 }

+/* The default implementation of
+   TARGET_VECTORIZE_PREFERRED_DIV_AS_SHIFTS_OVER_MULT.  */
+
+bool
+default_preferred_div_as_shifts_over_mult (void)
+{
+  return false;
+}
+
 /* By default assume vectors of element TYPE require a multiple of the natural
    alignment of TYPE.  TYPE is naturally aligned if IS_PACKED is false.
    */
 bool

diff --git a/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-4.c b/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-4.c
new file mode 100644
index 0000000000000000000000000000000000000000..c81f8946922250234bf759e0a0a04ea8c1f73e3c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-4.c
@@ -0,0 +1,25 @@
+/* { dg-require-effective-target vect_int } */
+
+#include
+#include "tree-vect.h"
+
+typedef unsigned __attribute__((__vector_size__ (16))) V;
+
+static __attribute__((__noinline__)) __attribute__((__noclone__)) V
+foo (V v, unsigned short i)
+{
+  v /= i;
+  return v;
+}
+
+int
+main (void)
+{
+  V v = foo ((V) { 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff }, 0xffff);
+  for (unsigned i = 0; i < sizeof (v) / sizeof (v[0]); i++)
+    if (v[i] != 0x00010001)
+      __builtin_abort ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-not "vect_recog_divmod_pattern: detected" "vect" { target aarch64*-*-* } } } */

diff --git a/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-5.c b/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-5.c
new file mode 100644
index 0000000000000000000000000000000000000000..b4eb1a4dacba481e6306b49914d2a29b933de625
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-5.c
@@ -0,0 +1,58 @@
+/* { dg-require-effective-target vect_int } */
+
+#include <stdint.h>
+#include <stdio.h>
+#include "tree-vect.h"
+
+#define N 50
+#define TYPE uint8_t
+
+#ifndef DEBUG
+#define DEBUG 0
+#endif
+
+#define BASE ((TYPE) -1 < 0 ? -126 : 4)
+
+
+__attribute__((noipa, noinline, optimize("O1")))
+void fun1(TYPE* restrict pixel, TYPE level, int n)
+{
+  for (int i = 0; i < n; i+=1)
+    pixel[i] = (pixel[i] + level) / 0xff;
+}
+
+__attribute__((noipa, noinline, optimize("O3")))
+void fun2(TYPE* restrict pixel, TYPE level, int n)
+{
+  for (int i = 0; i < n; i+=1)
+    pixel[i] = (pixel[i] + level) / 0xff;
+}
+
+int main ()
+{
+  TYPE a[N];
+  TYPE b[N];
+
+  for (int i = 0; i < N; ++i)
+    {
+      a[i] = BASE + i * 13;
+      b[i] = BASE + i * 13;
+      if (DEBUG)
+        printf ("%d: 0x%x\n", i, a[i]);
+    }
+
+  fun1 (a, N / 2, N);
+  fun2 (b, N / 2, N);
+
+  for (int i = 0; i < N; ++i)
+    {
+      if (DEBUG)
+        printf ("%d = 0x%x == 0x%x\n", i, a[i], b[i]);
+
+      if (a[i] != b[i])
+        __builtin_abort ();
+    }
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "divmod pattern recognized" "vect" { target aarch64*-*-* } } } */

diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 1766ce277d6b88d8aa3be77e7c8abb504a10a735..31f2a6753b4faccb77351c8c5afed9775888b60f 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -3913,6 +3913,84 @@ vect_recog_divmod_pattern (vec_info *vinfo,
       return pattern_stmt;
     }
+  else if ((cst = uniform_integer_cst_p (oprnd1))
+           && TYPE_UNSIGNED (itype)
+           && rhs_code == TRUNC_DIV_EXPR
+           && vectype
+           && targetm.vectorize.preferred_div_as_shifts_over_mult ())
+    {
+      /* div optimizations using narrowings
+         we can do the division e.g. shorts by 255 faster by calculating it as
+         (x + ((x + 257) >> 8)) >> 8 assuming the operation is done in
+         double the precision of x.
+
+         If we imagine a short as being composed of two blocks of bytes then
+         adding 257 or 0b0000_0001_0000_0001 to the number is equivalent to
+         adding 1 to each sub component:
+
+              short value of 16-bits
+         ┌──────────────┬────────────────┐
+         │              │                │
+         └──────────────┴────────────────┘
+           8-bit part1 ▲  8-bit part2   ▲
+                       │                │
+                       │                │
+                      +1               +1
+
+         after the first addition, we have to shift right by 8, and narrow the
+         results back to a byte.  Remember that the addition must be done in
+         double the precision of the input.  However if we know that the
+         addition `x + 257` does not overflow then we can do the operation in
+         the current precision.  In which case we don't need the pack and
+         unpacks.  */
+      auto wcst = wi::to_wide (cst);
+      int pow = wi::exact_log2 (wcst + 1);
+      if (pow == (int) (element_precision (vectype) / 2))
+        {
+          gimple *stmt = SSA_NAME_DEF_STMT (oprnd0);
+
+          gimple_ranger ranger;
+          int_range_max r;
+
+          /* Check that no overflow will occur.  If we don't have range
+             information we can't perform the optimization.
+             */
+          if (ranger.range_of_expr (r, oprnd0, stmt))
+            {
+              wide_int max = r.upper_bound ();
+              wide_int one = wi::to_wide (build_one_cst (itype));
+              wide_int adder = wi::add (one, wi::lshift (one, pow));
+              wi::overflow_type ovf;
+              wi::add (max, adder, UNSIGNED, &ovf);
+              if (ovf == wi::OVF_NONE)
+                {
+                  *type_out = vectype;
+                  tree tadder = wide_int_to_tree (itype, adder);
+                  tree rshift = wide_int_to_tree (itype, pow);
+
+                  tree new_lhs1 = vect_recog_temp_ssa_var (itype, NULL);
+                  gassign *patt1
+                    = gimple_build_assign (new_lhs1, PLUS_EXPR, oprnd0, tadder);
+                  append_pattern_def_seq (vinfo, stmt_vinfo, patt1, vectype);
+
+                  tree new_lhs2 = vect_recog_temp_ssa_var (itype, NULL);
+                  patt1 = gimple_build_assign (new_lhs2, RSHIFT_EXPR, new_lhs1,
+                                               rshift);
+                  append_pattern_def_seq (vinfo, stmt_vinfo, patt1, vectype);
+
+                  tree new_lhs3 = vect_recog_temp_ssa_var (itype, NULL);
+                  patt1 = gimple_build_assign (new_lhs3, PLUS_EXPR, new_lhs2,
+                                               oprnd0);
+                  append_pattern_def_seq (vinfo, stmt_vinfo, patt1, vectype);
+
+                  tree new_lhs4 = vect_recog_temp_ssa_var (itype, NULL);
+                  pattern_stmt = gimple_build_assign (new_lhs4, RSHIFT_EXPR,
+                                                      new_lhs3, rshift);
+
+                  return pattern_stmt;
+                }
+            }
+        }
+    }

   if (prec > HOST_BITS_PER_WIDE_INT
       || integer_zerop (oprnd1))
From patchwork Mon Feb 27 12:34:20 2023
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1748686
Date: Mon, 27 Feb 2023 12:34:20 +0000
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, Kyrylo.Tkachov@arm.com, richard.sandiford@arm.com
Subject: [PATCH 4/4]AArch64 Update div-bitmask to implement new optab instead of target hook [PR108583]
From: Tamar Christina

Hi All,

This replaces the custom division hook with just an implementation through add_highpart.  For NEON we implement the add highpart (addition + extraction of the upper highpart of the register in the same precision) as ADD + LSR.

This representation allows us to easily optimize the sequence using existing sequences.
This gets us a pretty decent sequence using SRA:

        umull   v1.8h, v0.8b, v3.8b
        umull2  v0.8h, v0.16b, v3.16b
        add     v5.8h, v1.8h, v2.8h
        add     v4.8h, v0.8h, v2.8h
        usra    v1.8h, v5.8h, 8
        usra    v0.8h, v4.8h, 8
        uzp2    v1.16b, v1.16b, v0.16b

To get the most optimal sequence, however, we match (a + ((b + c) >> n)) where n is half the precision of the mode of the operation into addhn + uaddw, which is a generally good optimization on its own and gets us back to:

.L4:
        ldr     q0, [x3]
        umull   v1.8h, v0.8b, v5.8b
        umull2  v0.8h, v0.16b, v5.16b
        addhn   v3.8b, v1.8h, v4.8h
        addhn   v2.8b, v0.8h, v4.8h
        uaddw   v1.8h, v1.8h, v3.8b
        uaddw   v0.8h, v0.8h, v2.8b
        uzp2    v1.16b, v1.16b, v0.16b
        str     q1, [x3], 16
        cmp     x3, x4
        bne     .L4

For SVE2 we optimize the initial sequence to the same ADD + LSR, which gets us:

.L3:
        ld1b    z0.h, p0/z, [x0, x3]
        mul     z0.h, p1/m, z0.h, z2.h
        add     z1.h, z0.h, z3.h
        usra    z0.h, z1.h, #8
        lsr     z0.h, z0.h, #8
        st1b    z0.h, p0, [x0, x3]
        inch    x3
        whilelo p0.h, w3, w2
        b.any   .L3
.L1:
        ret

and to get the most optimal sequence I match (a + b) >> n (same constraint on n) to addhnb, which gets us to:

.L3:
        ld1b    z0.h, p0/z, [x0, x3]
        mul     z0.h, p1/m, z0.h, z2.h
        addhnb  z1.b, z0.h, z3.h
        addhnb  z0.b, z0.h, z1.h
        st1b    z0.h, p0, [x0, x3]
        inch    x3
        whilelo p0.h, w3, w2
        b.any   .L3

There are multiple RTL representations possible for these optimizations; I did not represent them using a zero_extend because we seem very inconsistent about this in the backend.  Since they are unspecs, we won't match them from vector ops anyway.  I figured maintainers would prefer this, but my maintainer ouija board is still out for repairs :)

There are no new tests, as new correctness tests were added to the mid-end and the existing codegen tests for this already exist.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	PR target/108583
	* config/aarch64/aarch64-simd.md (@aarch64_bitmask_udiv3): Remove.
	(*bitmask_shift_plus): New.
	* config/aarch64/aarch64-sve2.md (*bitmask_shift_plus): New.
	(@aarch64_bitmask_udiv3): Remove.
	* config/aarch64/aarch64.cc
	(aarch64_vectorize_can_special_div_by_constant,
	TARGET_VECTORIZE_CAN_SPECIAL_DIV_BY_CONST): Removed.
	(TARGET_VECTORIZE_PREFERRED_DIV_AS_SHIFTS_OVER_MULT,
	aarch64_vectorize_preferred_div_as_shifts_over_mult): New.

--- inline copy of patch --

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 7f212bf37cd2c120dceb7efa733c9fa76226f029..e1ecb88634f93d380ef534093ea6599dc7278108 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4867,60 +4867,27 @@ (define_expand "aarch64_hn2"
   }
 )

-;; div optimizations using narrowings
-;; we can do the division e.g. shorts by 255 faster by calculating it as
-;; (x + ((x + 257) >> 8)) >> 8 assuming the operation is done in
-;; double the precision of x.
-;;
-;; If we imagine a short as being composed of two blocks of bytes then
-;; adding 257 or 0b0000_0001_0000_0001 to the number is equivalent to
-;; adding 1 to each sub component:
-;;
-;;      short value of 16-bits
-;; ┌──────────────┬────────────────┐
-;; │              │                │
-;; └──────────────┴────────────────┘
-;;   8-bit part1 ▲  8-bit part2   ▲
-;;               │                │
-;;               │                │
-;;              +1               +1
-;;
-;; after the first addition, we have to shift right by 8, and narrow the
-;; results back to a byte.  Remember that the addition must be done in
-;; double the precision of the input.  Since 8 is half the size of a short
-;; we can use a narrowing halfing instruction in AArch64, addhn which also
-;; does the addition in a wider precision and narrows back to a byte.  The
-;; shift itself is implicit in the operation as it writes back only the top
-;; half of the result.  i.e. bits 2*esize-1:esize.
-;;
-;; Since we have narrowed the result of the first part back to a byte, for
-;; the second addition we can use a widening addition, uaddw.
-;;
-;; For the final shift, since it's unsigned arithmetic we emit an ushr by 8.
-;;
-;; The shift is later optimized by combine to a uzp2 with movi #0.
-(define_expand "@aarch64_bitmask_udiv3"
-  [(match_operand:VQN 0 "register_operand")
-   (match_operand:VQN 1 "register_operand")
-   (match_operand:VQN 2 "immediate_operand")]
+;; Optimize ((a + b) >> n) + c where n is half the bitsize of the vector
+(define_insn_and_split "*bitmask_shift_plus"
+  [(set (match_operand:VQN 0 "register_operand" "=&w")
+	(plus:VQN
+	  (lshiftrt:VQN
+	    (plus:VQN (match_operand:VQN 1 "register_operand" "w")
+		      (match_operand:VQN 2 "register_operand" "w"))
+	    (match_operand:VQN 3 "aarch64_simd_shift_imm_vec_exact_top" "Dr"))
+	  (match_operand:VQN 4 "register_operand" "w")))]
   "TARGET_SIMD"
+  "#"
+  "&& true"
+  [(const_int 0)]
 {
-  unsigned HOST_WIDE_INT size
-    = (1ULL << GET_MODE_UNIT_BITSIZE (mode)) - 1;
-  rtx elt = unwrap_const_vec_duplicate (operands[2]);
-  if (!CONST_INT_P (elt) || UINTVAL (elt) != size)
-    FAIL;
-
-  rtx addend = gen_reg_rtx (mode);
-  rtx val = aarch64_simd_gen_const_vector_dup (mode, 1);
-  emit_move_insn (addend, lowpart_subreg (mode, val, mode));
-  rtx tmp1 = gen_reg_rtx (mode);
-  rtx tmp2 = gen_reg_rtx (mode);
-  emit_insn (gen_aarch64_addhn (tmp1, operands[1], addend));
-  unsigned bitsize = GET_MODE_UNIT_BITSIZE (mode);
-  rtx shift_vector = aarch64_simd_gen_const_vector_dup (mode, bitsize);
-  emit_insn (gen_aarch64_uaddw (tmp2, operands[1], tmp1));
-  emit_insn (gen_aarch64_simd_lshr (operands[0], tmp2, shift_vector));
+  rtx tmp;
+  if (can_create_pseudo_p ())
+    tmp = gen_reg_rtx (mode);
+  else
+    tmp = gen_rtx_REG (mode, REGNO (operands[0]));
+  emit_insn (gen_aarch64_addhn (tmp, operands[1], operands[2]));
+  emit_insn (gen_aarch64_uaddw (operands[0], operands[4], tmp));
   DONE;
 })

diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md
index 40c0728a7e6f00c395c360ce7625bc2e4a018809..bed44d7d6873877386222d56144cc115e3953a61 100644
--- a/gcc/config/aarch64/aarch64-sve2.md
+++ b/gcc/config/aarch64/aarch64-sve2.md
@@ -2317,41 +2317,24 @@ (define_insn "@aarch64_sve_"
 ;; ---- [INT] Misc optab implementations
 ;; -------------------------------------------------------------------------
 ;; Includes:
-;; - aarch64_bitmask_udiv
+;; - bitmask_shift_plus
 ;; -------------------------------------------------------------------------

-;; div optimizations using narrowings
-;; we can do the division e.g. shorts by 255 faster by calculating it as
-;; (x + ((x + 257) >> 8)) >> 8 assuming the operation is done in
-;; double the precision of x.
-;;
-;; See aarch64-simd.md for bigger explanation.
-(define_expand "@aarch64_bitmask_udiv3"
-  [(match_operand:SVE_FULL_HSDI 0 "register_operand")
-   (match_operand:SVE_FULL_HSDI 1 "register_operand")
-   (match_operand:SVE_FULL_HSDI 2 "immediate_operand")]
+;; Optimize ((a + b) >> n) where n is half the bitsize of the vector
+(define_insn "*bitmask_shift_plus"
+  [(set (match_operand:SVE_FULL_HSDI 0 "register_operand" "=w")
+	(unspec:SVE_FULL_HSDI
+	  [(match_operand: 1)
+	   (lshiftrt:SVE_FULL_HSDI
+	     (plus:SVE_FULL_HSDI
+	       (match_operand:SVE_FULL_HSDI 2 "register_operand" "w")
+	       (match_operand:SVE_FULL_HSDI 3 "register_operand" "w"))
+	     (match_operand:SVE_FULL_HSDI 4
+	       "aarch64_simd_shift_imm_vec_exact_top" "Dr"))]
+	  UNSPEC_PRED_X))]
   "TARGET_SVE2"
-{
-  unsigned HOST_WIDE_INT size
-    = (1ULL << GET_MODE_UNIT_BITSIZE (mode)) - 1;
-  rtx elt = unwrap_const_vec_duplicate (operands[2]);
-  if (!CONST_INT_P (elt) || UINTVAL (elt) != size)
-    FAIL;
-
-  rtx addend = gen_reg_rtx (mode);
-  rtx tmp1 = gen_reg_rtx (mode);
-  rtx tmp2 = gen_reg_rtx (mode);
-  rtx val = aarch64_simd_gen_const_vector_dup (mode, 1);
-  emit_move_insn (addend, lowpart_subreg (mode, val, mode));
-  emit_insn (gen_aarch64_sve (UNSPEC_ADDHNB, mode, tmp1, operands[1],
-			      addend));
-  emit_insn (gen_aarch64_sve (UNSPEC_ADDHNB, mode, tmp2, operands[1],
-			      lowpart_subreg (mode, tmp1,
-					      mode)));
-  emit_move_insn (operands[0],
-		  lowpart_subreg (mode, tmp2, mode));
-  DONE;
-})
+  "addhnb\t%0., %2., %3."
+)

 ;; =========================================================================
 ;; == Permutation

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index e6f47cbbb0d04a6f33b9a741ebb614cabd0204b9..2728fb347c0df1756b237f4d6268908eef6bdd2a 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -3849,6 +3849,13 @@ aarch64_vectorize_related_mode (machine_mode vector_mode,
   return default_vectorize_related_mode (vector_mode, element_mode, nunits);
 }

+/* Implement TARGET_VECTORIZE_PREFERRED_DIV_AS_SHIFTS_OVER_MULT.  */
+
+static bool
+aarch64_vectorize_preferred_div_as_shifts_over_mult (void)
+{
+  return true;
+}
+
 /* Implement TARGET_PREFERRED_ELSE_VALUE.  For binary operations,
    prefer to use the first arithmetic operand as the else value if
    the else value doesn't matter, since that exactly matches the SVE
@@ -24363,46 +24370,6 @@ aarch64_vectorize_vec_perm_const (machine_mode vmode, machine_mode op_mode,
   return ret;
 }

-
-/* Implement TARGET_VECTORIZE_CAN_SPECIAL_DIV_BY_CONST.  */
-
-bool
-aarch64_vectorize_can_special_div_by_constant (enum tree_code code,
-					       tree vectype, wide_int cst,
-					       rtx *output, rtx in0, rtx in1)
-{
-  if (code != TRUNC_DIV_EXPR
-      || !TYPE_UNSIGNED (vectype))
-    return false;
-
-  machine_mode mode = TYPE_MODE (vectype);
-  unsigned int flags = aarch64_classify_vector_mode (mode);
-  if ((flags & VEC_ANY_SVE) && !TARGET_SVE2)
-    return false;
-
-  int pow = wi::exact_log2 (cst + 1);
-  auto insn_code = maybe_code_for_aarch64_bitmask_udiv3 (TYPE_MODE (vectype));
-  /* SVE actually has a div operator, we may have gotten here through
-     that route.  */
-  if (pow != (int) (element_precision (vectype) / 2)
-      || insn_code == CODE_FOR_nothing)
-    return false;
-
-  /* We can use the optimized pattern.  */
-  if (in0 == NULL_RTX && in1 == NULL_RTX)
-    return true;
-
-  gcc_assert (output);
-
-  expand_operand ops[3];
-  create_output_operand (&ops[0], *output, mode);
-  create_input_operand (&ops[1], in0, mode);
-  create_fixed_operand (&ops[2], in1);
-  expand_insn (insn_code, 3, ops);
-  *output = ops[0].value;
-  return true;
-}
-
 /* Generate a byte permute mask for a register of mode MODE,
    which has NUNITS units.  */

@@ -27904,13 +27871,13 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_MAX_ANCHOR_OFFSET
 #define TARGET_MAX_ANCHOR_OFFSET 4095

+#undef TARGET_VECTORIZE_PREFERRED_DIV_AS_SHIFTS_OVER_MULT
+#define TARGET_VECTORIZE_PREFERRED_DIV_AS_SHIFTS_OVER_MULT \
+  aarch64_vectorize_preferred_div_as_shifts_over_mult
+
 #undef TARGET_VECTOR_ALIGNMENT
 #define TARGET_VECTOR_ALIGNMENT aarch64_simd_vector_alignment

-#undef TARGET_VECTORIZE_CAN_SPECIAL_DIV_BY_CONST
-#define TARGET_VECTORIZE_CAN_SPECIAL_DIV_BY_CONST \
-  aarch64_vectorize_can_special_div_by_constant
-
 #undef TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT
 #define TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT \
   aarch64_vectorize_preferred_vector_alignment