From patchwork Sun Dec 10 17:44:00 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Law
X-Patchwork-Id: 1874214
Message-ID: <03662208-651f-458c-93a9-7aebdbc02586@gmail.com>
Date: Sun, 10 Dec 2023 10:44:00 -0700
User-Agent: Mozilla Thunderbird
Content-Language: en-US
From: Jeff Law
Subject: [committed] Support uaddv and usubv on the H8
To: "gcc-patches@gcc.gnu.org"
List-Id: Gcc-patches mailing list

This patch adds uaddv/usubv support on the H8 port to speed up those pesky builtin-overflow tests.
It's a variant of something I'd been running for a while -- the major
change between the old approach I'd been using and this patch is this
version does not expose the CC register until after reload to be
consistent with the rest of the H8 port.

The general approach is to first clear the GPR that's going to hold the
overflow status, perform the arithmetic operation (add/sub), then use
addx to move the overflow indicator (in the C bit) into the GPR holding
the overflow status. That's a significant improvement over the mess of
logicals that's generated by the generic code.

Handling signed overflow is possible and something I'll probably port to
this scheme at some point. It's a bit more complex because we can't
trivially move the bit from CCR into the right position in a GPR and
other quirks of the H8.

This has been regression tested on the H8 without problems.

Pushing to the trunk.

Jeff

commit 7fb9454c748632d148a07c275ea1f77b290b0c2d
Author: Jeff Law
Date:   Sun Dec 10 10:41:05 2023 -0700

    [committed] Support uaddv and usubv on the H8

    This patch adds uaddv/usubv support on the H8 port to speed up those
    pesky builtin-overflow tests.

    It's a variant of something I'd been running for a while -- the major
    change between the old approach I'd been using and this patch is this
    version does not expose the CC register until after reload to be
    consistent with the rest of the H8 port.

    The general approach is to first clear the GPR that's going to hold
    the overflow status, perform the arithmetic operation (add/sub), then
    use addx to move the overflow indicator (in the C bit) into the GPR
    holding the overflow status. That's a significant improvement over the
    mess of logicals that's generated by the generic code.

    Handling signed overflow is possible and something I'll probably port
    to this scheme at some point. It's a bit more complex because we can't
    trivially move the bit from CCR into the right position in a GPR and
    other quirks of the H8.

    This has been regression tested on the H8 without problems. Pushing to
    the trunk.

    gcc/
            * config/h8300/addsub.md (uaddv<mode>4, usubv<mode>4): New expanders.
            (uaddv): New define_insn_and_split plus post-reload pattern.

diff --git a/gcc/config/h8300/addsub.md b/gcc/config/h8300/addsub.md
index b1eb0d20188..32eba9df67a 100644
--- a/gcc/config/h8300/addsub.md
+++ b/gcc/config/h8300/addsub.md
@@ -239,3 +239,80 @@ (define_insn "*negsf2_clobber_flags"
   "reload_completed"
   "xor.w\\t#32768,%e0"
   [(set_attr "length" "4")])
+
+(define_expand "uaddv<mode>4"
+  [(set (match_operand:QHSI 0 "register_operand" "")
+        (plus:QHSI (match_operand:QHSI 1 "register_operand" "")
+                   (match_operand:QHSI 2 "register_operand" "")))
+   (set (pc)
+        (if_then_else (ltu (match_dup 0) (match_dup 1))
+                      (label_ref (match_operand 3 ""))
+                      (pc)))]
+  "")
+
+(define_insn_and_split "*uaddv"
+  [(set (match_operand:QHSI2 3 "register_operand" "=&r")
+        (ltu:QHSI2 (plus:QHSI (match_operand:QHSI 1 "register_operand" "%0")
+                              (match_operand:QHSI 2 "register_operand" "r"))
+                   (match_dup 1)))
+   (set (match_operand:QHSI 0 "register_operand" "=r")
+        (plus:QHSI (match_dup 1) (match_dup 2)))]
+  ""
+  "#"
+  "&& reload_completed"
+  [(parallel [(set (match_dup 3) (ltu:QHSI2 (plus:QHSI (match_dup 1) (match_dup 2))
+                                            (match_dup 1)))
+              (set (match_dup 0) (plus:QHSI (match_dup 1) (match_dup 2)))
+              (clobber (reg:CC CC_REG))])])
+
+(define_insn "*uaddv"
+  [(set (match_operand:QHSI2 3 "register_operand" "=&r")
+        (ltu:QHSI2 (plus:QHSI (match_operand:QHSI 1 "register_operand" "%0")
+                              (match_operand:QHSI 2 "register_operand" "r"))
+                   (match_dup 1)))
+   (set (match_operand:QHSI 0 "register_operand" "=r")
+        (plus (match_dup 1) (match_dup 2)))
+   (clobber (reg:CC CC_REG))]
+  ""
+{
+  if (E_<QHSI2:MODE>mode == E_QImode)
+    {
+      if (E_<QHSI:MODE>mode == E_QImode)
+        return "sub.b\t%X3,%X3\;add.b\t%X2,%X0\;addx\t%X3,%X3";
+      else if (E_<QHSI:MODE>mode == E_HImode)
+        return "sub.b\t%X3,%X3\;add.w\t%T2,%T0\;addx\t%X3,%X3";
+      else if (E_<QHSI:MODE>mode == E_SImode)
+        return "sub.b\t%X3,%X3\;add.l\t%S2,%S0\;addx\t%X3,%X3";
+    }
+  else if (E_<QHSI2:MODE>mode == E_HImode)
+    {
+      if (E_<QHSI:MODE>mode == E_QImode)
+        return "sub.w\t%T3,%T3\;add.b\t%X2,%X0\;addx\t%X3,%X3";
+      else if (E_<QHSI:MODE>mode == E_HImode)
+        return "sub.w\t%T3,%T3\;add.w\t%T2,%T0\;addx\t%X3,%X3";
+      else if (E_<QHSI:MODE>mode == E_SImode)
+        return "sub.w\t%T3,%T3\;add.l\t%S2,%S0\;addx\t%X3,%X3";
+    }
+  else if (E_<QHSI2:MODE>mode == E_SImode)
+    {
+      if (E_<QHSI:MODE>mode == E_QImode)
+        return "sub.l\t%S3,%S3\;add.b\t%X2,%X0\;addx\t%X3,%X3";
+      else if (E_<QHSI:MODE>mode == E_HImode)
+        return "sub.l\t%S3,%S3\;add.w\t%T2,%T0\;addx\t%X3,%X3";
+      else if (E_<QHSI:MODE>mode == E_SImode)
+        return "sub.l\t%S3,%S3\;add.l\t%S2,%S0\;addx\t%X3,%X3";
+    }
+  else
+    gcc_unreachable ();
+}
+  [(set_attr "length" "6")])
+
+(define_expand "usubv<mode>4"
+  [(set (match_operand:QHSI 0 "register_operand" "")
+        (minus:QHSI (match_operand:QHSI 1 "register_operand" "")
+                    (match_operand:QHSI 2 "register_operand" "")))
+   (set (pc)
+        (if_then_else (ltu (match_dup 1) (match_dup 2))
+                      (label_ref (match_operand 3 ""))
+                      (pc)))]
+  "")