From patchwork Mon Apr 27 10:01:13 2015
X-Patchwork-Submitter: Kyrylo Tkachov
X-Patchwork-Id: 464926
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Message-ID: <553E08E9.7060107@arm.com>
Date: Mon, 27 Apr 2015 11:01:13 +0100
From: Kyrill Tkachov
To: GCC Patches
CC: Marcus Shawcroft, Richard Earnshaw, James Greenhalgh
Subject: [PATCH][AArch64] Add alternative 'extr' pattern, calculate rtx cost properly

Hi all,

We currently have a pattern that will recognise a particular combination
of shifts and bitwise-or as an extr instruction.  However, the order of
the shifts inside the IOR doesn't have canonicalisation rules
(see the rev16 pattern for similar stuff).
This means that for code like:

unsigned long
foo (unsigned long a, unsigned long b)
{
  return (a << 16) | (b >> 48);
}

we will recognise the extr, but for the equivalent:

unsigned long
foo (unsigned long a, unsigned long b)
{
  return (b >> 48) | (a << 16);
}

we won't, and we'll emit three instructions.
This patch adds the pattern for the alternative order of shifts and
allows us to generate for the above the code:

foo:
	extr	x0, x0, x1, 48
	ret

The zero-extended version is added as well and the rtx costs function
is updated to handle all of these cases.
I've seen this pattern trigger in the gcc code itself in expmed.c where
it eliminated a sequence of orrs and shifts into an extr instruction!

Bootstrapped and tested on aarch64-linux.

Ok for trunk?

Thanks,
Kyrill

2015-04-27  Kyrylo Tkachov

    * config/aarch64/aarch64.md (*extr<mode>5_insn_alt): New pattern.
    (*extrsi5_insn_uxtw_alt): Likewise.
    * config/aarch64/aarch64.c (aarch64_extr_rtx_p): New function.
    (aarch64_rtx_costs, IOR case): Use above to properly cost extr
    operations.

commit d45e92b3b8c5837328b7b10682565cacfb566e5b
Author: Kyrylo Tkachov
Date:   Mon Mar 2 17:26:38 2015 +0000

    [AArch64] Add alternative 'extr' pattern, calculate rtx cost properly

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 860a1dd..ef5a1e4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5438,6 +5438,51 @@ aarch64_frint_unspec_p (unsigned int u)
     }
 }
 
+/* Return true iff X is an rtx that will match an extr instruction
+   i.e. as described in the *extr<mode>5_insn family of patterns.
+   OP0 and OP1 will be set to the operands of the shifts involved
+   on success and will be NULL_RTX otherwise.  */
+
+static bool
+aarch64_extr_rtx_p (rtx x, rtx *res_op0, rtx *res_op1)
+{
+  rtx op0, op1;
+  machine_mode mode = GET_MODE (x);
+
+  *res_op0 = NULL_RTX;
+  *res_op1 = NULL_RTX;
+
+  if (GET_CODE (x) != IOR)
+    return false;
+
+  op0 = XEXP (x, 0);
+  op1 = XEXP (x, 1);
+
+  if ((GET_CODE (op0) == ASHIFT && GET_CODE (op1) == LSHIFTRT)
+      || (GET_CODE (op1) == ASHIFT && GET_CODE (op0) == LSHIFTRT))
+    {
+      /* Canonicalise locally to ashift in op0, lshiftrt in op1.
+	 */
+      if (GET_CODE (op1) == ASHIFT)
+	std::swap (op0, op1);
+
+      if (!CONST_INT_P (XEXP (op0, 1)) || !CONST_INT_P (XEXP (op1, 1)))
+	return false;
+
+      unsigned HOST_WIDE_INT shft_amnt_0 = UINTVAL (XEXP (op0, 1));
+      unsigned HOST_WIDE_INT shft_amnt_1 = UINTVAL (XEXP (op1, 1));
+
+      if (shft_amnt_0 < GET_MODE_BITSIZE (mode)
+	  && shft_amnt_0 + shft_amnt_1 == GET_MODE_BITSIZE (mode))
+	{
+	  *res_op0 = XEXP (op0, 0);
+	  *res_op1 = XEXP (op1, 0);
+	  return true;
+	}
+    }
+
+  return false;
+}
+
 /* Calculate the cost of calculating (if_then_else (OP0) (OP1) (OP2)),
    storing it in *COST.  Result is true if the total cost of the
    operation has now been calculated.  */
@@ -5968,6 +6013,16 @@ cost_plus:
 	  return true;
 	}
+
+      if (aarch64_extr_rtx_p (x, &op0, &op1))
+	{
+	  *cost += rtx_cost (op0, IOR, 0, speed)
+		   + rtx_cost (op1, IOR, 1, speed);
+	  if (speed)
+	    *cost += extra_cost->alu.shift;
+
+	  return true;
+	}
 
       /* Fall through.  */
     case XOR:
     case AND:
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 1a7f888..17a8755 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3597,6 +3597,21 @@ (define_insn "*extr<mode>5_insn"
   [(set_attr "type" "shift_imm")]
 )
 
+;; There are no canonicalisation rules for ashift and lshiftrt inside an ior
+;; so we have to match both orderings.
+(define_insn "*extr<mode>5_insn_alt"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+	(ior:GPI (lshiftrt:GPI (match_operand:GPI 2 "register_operand" "r")
+			       (match_operand 4 "const_int_operand" "n"))
+		 (ashift:GPI (match_operand:GPI 1 "register_operand" "r")
+			     (match_operand 3 "const_int_operand" "n"))))]
+  "UINTVAL (operands[3]) < GET_MODE_BITSIZE (<MODE>mode)
+   && (UINTVAL (operands[3]) + UINTVAL (operands[4])
+       == GET_MODE_BITSIZE (<MODE>mode))"
+  "extr\\t%<w>0, %<w>1, %<w>2, %4"
+  [(set_attr "type" "shift_imm")]
+)
+
 ;; zero_extend version of the above
 (define_insn "*extrsi5_insn_uxtw"
   [(set (match_operand:DI 0 "register_operand" "=r")
@@ -3611,6 +3626,19 @@ (define_insn "*extrsi5_insn_uxtw"
   [(set_attr "type" "shift_imm")]
 )
 
+(define_insn "*extrsi5_insn_uxtw_alt"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(zero_extend:DI
+	 (ior:SI (lshiftrt:SI (match_operand:SI 2 "register_operand" "r")
+			      (match_operand 4 "const_int_operand" "n"))
+		 (ashift:SI (match_operand:SI 1 "register_operand" "r")
+			    (match_operand 3 "const_int_operand" "n")))))]
+  "UINTVAL (operands[3]) < 32
+   && (UINTVAL (operands[3]) + UINTVAL (operands[4]) == 32)"
+  "extr\\t%w0, %w1, %w2, %4"
+  [(set_attr "type" "shift_imm")]
+)
+
 (define_insn "*ror<mode>3_insn"
   [(set (match_operand:GPI 0 "register_operand" "=r")
	(rotate:GPI (match_operand:GPI 1 "register_operand" "r")