From patchwork Fri Jul 24 10:55:42 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kyrylo Tkachov X-Patchwork-Id: 499681 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 33F04140D4F for ; Fri, 24 Jul 2015 20:56:16 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=yPL6YIye; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:content-type; q=dns; s=default; b=O7hcO+/Cf6oBpvaRFLy2zHGy+2ceJStzB6zEUVnTHz/ LB72z3wYrbVVnEqUGygR9kJCzIMbsaq+2xIF0Q1rLAxDE8qU8MOLYVR+xzLCnLKs UgK3N+If3hAB7IHtIJncsA9DLcWbH4FjCow4EKbxodIMJKV7WBIFwQFH9mNZg9BU = DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:content-type; s=default; bh=g2kZJPrXaIAiB+PMQQVwfO3g6/E=; b=yPL6YIyeJay8ykhOY A5Exxc2GIUthGAoS1jc7hwwugH3JpXClNDOfpXcY+E8Dock/M2JastnjR9HaQA5i 8+l1lBAJ2/82ENyd46ChB7OHdJjPZSZNnjAMSj2m6wEbfSwZZGpLDHVy8Uz4RNvx WRHHuZX9dNt3GMvr6r7iV2ZIts= Received: (qmail 100540 invoked by alias); 24 Jul 2015 10:55:54 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 100468 invoked by uid 89); 24 Jul 2015 10:55:53 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.7 required=5.0 tests=AWL, BAYES_00, SPF_PASS autolearn=ham version=3.3.2 X-HELO: eu-smtp-delivery-143.mimecast.com Received: from eu-smtp-delivery-143.mimecast.com (HELO eu-smtp-delivery-143.mimecast.com) (207.82.80.143) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 24 Jul 2015 10:55:48 +0000 Received: from cam-owa2.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.140]) by eu-smtp-1.mimecast.com with ESMTP id uk-mta-12-IG-KRh2jQJyhdvel6mvS0w-1; Fri, 24 Jul 2015 11:55:42 +0100 Received: from [10.2.207.50] ([10.1.2.79]) by cam-owa2.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959); Fri, 24 Jul 2015 11:55:42 +0100 Message-ID: <55B219AE.6010102@arm.com> Date: Fri, 24 Jul 2015 11:55:42 +0100 From: Kyrill Tkachov User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0 MIME-Version: 1.0 To: GCC Patches CC: Ramana Radhakrishnan , Richard Earnshaw , Marcus Shawcroft , James Greenhalgh Subject: [PATCH][ARM][3/3] Expand mod by power of 2 X-MC-Unique: IG-KRh2jQJyhdvel6mvS0w-1 X-IsSubscribed: yes Hi all, This third patch implements the same algorithm as patch 1/3 but for arm. That is, for X % N where N is a power of 2 we do: rsbs r1, r0, #0 and r0, r0, #(N - 1) and r1, r1, #(N - 1) rsbpl r0, r1, #0 For the special case where N is 2 we do the shorter: cmp r0, #0 and r0, r0, #1 rsblt r0, r0, #0 Note that for the final conditional negate we expand to an IF_THEN_ELSE of a NEG rather than a cond_exec rtx because the lra dataflow analysis doesn't always deal with cond_execs correctly. The splitters fixed in patch 2/3 then break it into a cond_exec after reload, so it all works out. Bootstrapped and tested on arm, with both ARM and Thumb2 states. Tests are added and shared with aarch64. Ok for trunk? Thanks, Kyrill 2015-07-24 Kyrylo Tkachov * config/arm/arm.md (*subsi3_compare0): Rename to... (subsi3_compare0): ... This. (*arm_andsi3_insn): Rename to... (arm_andsi3_insn): ... This. (modsi3): New define_expand. * config/arm/arm.c (arm_new_rtx_costs, MOD case): Handle case operand is power of 2. 2015-07-24 Kyrylo Tkachov * gcc.target/aarch64/mod_2.x: New file. * gcc.target/aarch64/mod_256.x: Likewise. * gcc.target/arm/mod_2.c: New test. * gcc.target/arm/mod_256.c: Likewise. * gcc.target/aarch64/mod_2.c: Likewise. * gcc.target/aarch64/mod_256.c: Likewise. commit d562629e36ba013b8f77956a74139330d191bc30 Author: Kyrylo Tkachov Date: Fri Jul 17 16:30:01 2015 +0100 [ARM][3/3] Expand mod by power of 2 diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index e1bc727..6ade07c 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -9556,6 +9556,22 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code, case MOD: case UMOD: + /* MOD by a power of 2 can be expanded as: + rsbs r1, r0, #0 + and r0, r0, #(n - 1) + and r1, r1, #(n - 1) + rsbpl r0, r1, #0. */ + if (code == MOD + && CONST_INT_P (XEXP (x, 1)) + && exact_log2 (INTVAL (XEXP (x, 1))) > 0 + && mode == SImode) + { + *cost += COSTS_N_INSNS (3) + + 2 * extra_cost->alu.logical + + extra_cost->alu.arith; + return true; + } + *cost = LIBCALL_COST (2); return false; /* All arguments must be in registers. */ diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index f341109..8301648 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -1229,7 +1229,7 @@ (define_peephole2 "" ) -(define_insn "*subsi3_compare0" +(define_insn "subsi3_compare0" [(set (reg:CC_NOOV CC_REGNUM) (compare:CC_NOOV (minus:SI (match_operand:SI 1 "arm_rhs_operand" "r,r,I") @@ -2158,7 +2158,7 @@ (define_expand "andsi3" ) ; ??? Check split length for Thumb-2 -(define_insn_and_split "*arm_andsi3_insn" +(define_insn_and_split "arm_andsi3_insn" [(set (match_operand:SI 0 "s_register_operand" "=r,l,r,r,r") (and:SI (match_operand:SI 1 "s_register_operand" "%r,0,r,r,r") (match_operand:SI 2 "reg_or_int_operand" "I,l,K,r,?n")))] @@ -11105,6 +11105,78 @@ (define_expand "thumb_legacy_rev" "" ) +;; ARM-specific expansion of signed mod by power of 2 +;; using conditional negate. +;; For r0 % n where n is a power of 2 produce: +;; rsbs r1, r0, #0 +;; and r0, r0, #(n - 1) +;; and r1, r1, #(n - 1) +;; rsbpl r0, r1, #0 + +(define_expand "modsi3" + [(match_operand:SI 0 "register_operand" "") + (match_operand:SI 1 "register_operand" "") + (match_operand:SI 2 "const_int_operand" "")] + "TARGET_32BIT" + { + HOST_WIDE_INT val = INTVAL (operands[2]); + + if (val <= 0 + || exact_log2 (INTVAL (operands[2])) <= 0 + || !const_ok_for_arm (INTVAL (operands[2]) - 1)) + FAIL; + + rtx mask = GEN_INT (val - 1); + + /* In the special case of x0 % 2 we can do the even shorter: + cmp r0, #0 + and r0, r0, #1 + rsblt r0, r0, #0. */ + + if (val == 2) + { + rtx cc_reg = gen_rtx_REG (CCmode, CC_REGNUM); + rtx cond = gen_rtx_LT (SImode, cc_reg, const0_rtx); + + emit_insn (gen_rtx_SET (cc_reg, + gen_rtx_COMPARE (CCmode, operands[1], const0_rtx))); + + rtx masked = gen_reg_rtx (SImode); + emit_insn (gen_arm_andsi3_insn (masked, operands[1], mask)); + emit_move_insn (operands[0], + gen_rtx_IF_THEN_ELSE (SImode, cond, + gen_rtx_NEG (SImode, + masked), + masked)); + DONE; + } + + rtx neg_op = gen_reg_rtx (SImode); + rtx_insn *insn = emit_insn (gen_subsi3_compare0 (neg_op, const0_rtx, + operands[1])); + + /* Extract the condition register and mode. */ + rtx cmp = XVECEXP (PATTERN (insn), 0, 0); + rtx cc_reg = SET_DEST (cmp); + rtx cond = gen_rtx_GE (SImode, cc_reg, const0_rtx); + + emit_insn (gen_arm_andsi3_insn (operands[0], operands[1], mask)); + + rtx masked_neg = gen_reg_rtx (SImode); + emit_insn (gen_arm_andsi3_insn (masked_neg, neg_op, mask)); + + /* We want a conditional negate here, but emitting COND_EXEC rtxes + during expand does not always work. Do an IF_THEN_ELSE instead. */ + emit_move_insn (operands[0], + gen_rtx_IF_THEN_ELSE (SImode, cond, + gen_rtx_NEG (SImode, masked_neg), + operands[0])); + + + DONE; + } +) + (define_expand "bswapsi2" [(set (match_operand:SI 0 "s_register_operand" "=r") (bswap:SI (match_operand:SI 1 "s_register_operand" "r")))] diff --git a/gcc/testsuite/gcc.target/aarch64/mod_2.c b/gcc/testsuite/gcc.target/aarch64/mod_2.c new file mode 100644 index 0000000..2645c18 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/mod_2.c @@ -0,0 +1,7 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mcpu=cortex-a57 -save-temps" } */ + +#include "mod_2.x" + +/* { dg-final { scan-assembler "csneg\t\[wx\]\[0-9\]*" } } */ +/* { dg-final { scan-assembler-times "and\t\[wx\]\[0-9\]*" 1 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/mod_2.x b/gcc/testsuite/gcc.target/aarch64/mod_2.x new file mode 100644 index 0000000..2b079a4 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/mod_2.x @@ -0,0 +1,5 @@ +int +f (int x) +{ + return x % 2; +} diff --git a/gcc/testsuite/gcc.target/aarch64/mod_256.c b/gcc/testsuite/gcc.target/aarch64/mod_256.c new file mode 100644 index 0000000..567332c --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/mod_256.c @@ -0,0 +1,6 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mcpu=cortex-a57 -save-temps" } */ + +#include "mod_256.x" + +/* { dg-final { scan-assembler "csneg\t\[wx\]\[0-9\]*" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/mod_256.x b/gcc/testsuite/gcc.target/aarch64/mod_256.x new file mode 100644 index 0000000..c1de42c --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/mod_256.x @@ -0,0 +1,5 @@ +int +f (int x) +{ + return x % 256; +} diff --git a/gcc/testsuite/gcc.target/arm/mod_2.c b/gcc/testsuite/gcc.target/arm/mod_2.c new file mode 100644 index 0000000..93017a1 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mod_2.c @@ -0,0 +1,8 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm32 } */ +/* { dg-options "-O2 -mcpu=cortex-a57 -save-temps" } */ + +#include "../aarch64/mod_2.x" + +/* { dg-final { scan-assembler "rsblt\tr\[0-9\]*" } } */ +/* { dg-final { scan-assembler-times "and\tr\[0-9\].*1" 1 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mod_256.c b/gcc/testsuite/gcc.target/arm/mod_256.c new file mode 100644 index 0000000..92ab05a --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mod_256.c @@ -0,0 +1,8 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm32 } */ +/* { dg-options "-O2 -mcpu=cortex-a57 -save-temps" } */ + +#include "../aarch64/mod_256.x" + +/* { dg-final { scan-assembler "rsbpl\tr\[0-9\]*" } } */ +/* { dg-final { scan-assembler "and\tr\[0-9\].*255" } } */