From patchwork Fri Apr 21 08:20:52 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kyrill Tkachov
X-Patchwork-Id: 753191
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Message-ID: <58F9C0E4.7070108@foss.arm.com>
Date: Fri, 21 Apr 2017 09:20:52 +0100
From: Kyrill Tkachov
To: GCC Patches
CC: Marcus Shawcroft, Richard Earnshaw, James Greenhalgh
Subject: [PATCH][AArch64] Peephole for SUBS

Hi all,

A pattern that sometimes occurs in the wild is to subtract two operands
and separately compare them. This can be implemented as a single SUBS
instruction and we actually have a define_insn for this:
sub<mode>3_compare1. However, I'm not sure if that's enough by itself to
match these constructs. Adding a peephole that will actually bring the
subtraction and comparison SETs together into a PARALLEL helps a lot in
matching these (note that there is no dependency between the subtract
and the comparison).

This patch adds such a peephole. It's really simple and straightforward.
The only thing to look out for is the case when the output of the
subtract is a register that is also one of the operands:

SUB W0, W0, W1
CMP W0, W1

should not be transformed into:

SUBS W0, W0, W1

because the CMP there compares the updated W0 against W1, whereas SUBS
sets the flags from the original operands.

The testcase in the patch provides a motivating example where we now
generate a single SUBS instead of a SUB followed by a CMP.
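
As a rough illustration of the overlap restriction above (not part of the patch), here is a small C model of the flag behaviour. The struct and function names are made up for this sketch, and it only models 32-bit operands and the signed "ge" condition; it is not a description of how GCC or the hardware implements anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model only: the four AArch64 condition flags. */
struct nzcv {
    bool n, z, c, v;
};

/* Flags set by "SUBS Wd, Wn, Wm" or "CMP Wn, Wm" for 32-bit operands. */
struct nzcv sub_flags(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;
    struct nzcv f;
    f.n = (r >> 31) != 0;                    /* result is negative */
    f.z = (r == 0);                          /* result is zero */
    f.c = (a >= b);                          /* carry set means no borrow */
    f.v = (((a ^ b) & (a ^ r)) >> 31) != 0;  /* signed overflow */
    return f;
}

/* Signed "ge" condition: N == V. */
bool cond_ge(struct nzcv f)
{
    return f.n == f.v;
}

/* SUB W0, W0, W1 ; CMP W0, W1: the CMP sees the updated W0. */
bool ge_after_separate_sub_cmp(uint32_t w0, uint32_t w1)
{
    w0 = w0 - w1;
    return cond_ge(sub_flags(w0, w1));
}

/* SUBS W0, W0, W1: the flags come from the original W0 and W1. */
bool ge_after_fused_subs(uint32_t w0, uint32_t w1)
{
    return cond_ge(sub_flags(w0, w1));
}

/* With W0 = 5 and W1 = 3 the two sequences disagree on "ge". */
void check_overlap_example(void)
{
    assert(!ge_after_separate_sub_cmp(5, 3));  /* compares 2 with 3 */
    assert(ge_after_fused_subs(5, 3));         /* compares 5 with 3 */
}
```

Calling check_overlap_example () exercises the W0 = 5, W1 = 3 case, where the separate sequence compares 2 with 3 and the fused form compares 5 with 3. That disagreement is what the reg_overlap_mentioned_p checks in the peephole condition are there to rule out.
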
This transformation triggers a few times in SPEC2006. Not enough to
actually move the needle, but it's the Right Thing to Do (tm). I've seen
it catch cases that compute an absolute difference, for example:

int
foo (int a, int b)
{
  if (a < b)
    return b - a;
  else
    return a - b;
}

will now generate:

foo:
        sub     w2, w1, w0
        subs    w3, w0, w1
        csel    w0, w3, w2, ge
        ret

instead of:

foo:
        sub     w2, w1, w0
        sub     w3, w0, w1
        cmp     w0, w1
        csel    w0, w3, w2, ge
        ret

Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for GCC 8?

Thanks,
Kyrill

2017-04-21  Kyrylo Tkachov

    * config/aarch64/aarch64.md (define_peephole2 above
    *sub_<shift>_<mode>): New peephole.

2017-04-21  Kyrylo Tkachov

    * gcc.target/aarch64/subs_compare_1.c: New test.

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index f046466f8af731db6752d69690ebfd071cd55d3e..2a0341e1a957ebd28bc9e29465803501be23cd72 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -2344,6 +2344,24 @@ (define_insn "sub<mode>3_compare1"
   [(set_attr "type" "alus_sreg")]
 )
 
+(define_peephole2
+  [(set (match_operand:GPI 0 "register_operand")
+        (minus:GPI (match_operand:GPI 1 "aarch64_reg_or_zero")
+                   (match_operand:GPI 2 "aarch64_reg_or_zero")))
+   (set (reg:CC CC_REGNUM)
+        (compare:CC
+          (match_dup 1)
+          (match_dup 2)))]
+  "!reg_overlap_mentioned_p (operands[0], operands[1])
+   && !reg_overlap_mentioned_p (operands[0], operands[2])"
+  [(const_int 0)]
+  {
+    emit_insn (gen_sub<mode>3_compare1 (operands[0], operands[1],
+                                        operands[2]));
+    DONE;
+  }
+)
+
 (define_insn "*sub_<shift>_<mode>"
   [(set (match_operand:GPI 0 "register_operand" "=r")
         (minus:GPI (match_operand:GPI 3 "register_operand" "r")
diff --git a/gcc/testsuite/gcc.target/aarch64/subs_compare_1.c b/gcc/testsuite/gcc.target/aarch64/subs_compare_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..95c8f696fee7e992c27625108850c02319426de5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/subs_compare_1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int
+foo (int a, int b)
+{
+  int x = a - b;
+  if (a <= b)
+    return x;
+  else
+    return 0;
+}
+
+/* { dg-final { scan-assembler-times "subs\\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" 1 } } */
+/* { dg-final { scan-assembler-not "cmp\\tw\[0-9\]+, w\[0-9\]+" } } */