From patchwork Fri Aug 20 23:29:21 2010 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Bernd Schmidt X-Patchwork-Id: 62331 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id 4BF67B70DE for ; Sat, 21 Aug 2010 09:29:43 +1000 (EST) Received: (qmail 22874 invoked by alias); 20 Aug 2010 23:29:32 -0000 Received: (qmail 22719 invoked by uid 22791); 20 Aug 2010 23:29:30 -0000 X-SWARE-Spam-Status: No, hits=-1.8 required=5.0 tests=AWL, BAYES_00, T_LOTS_OF_MONEY, T_RP_MATCHES_RCVD X-Spam-Check-By: sourceware.org Received: from mail.codesourcery.com (HELO mail.codesourcery.com) (38.113.113.100) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 20 Aug 2010 23:29:26 +0000 Received: (qmail 31759 invoked from network); 20 Aug 2010 23:29:24 -0000 Received: from unknown (HELO ?84.152.240.111?) (bernds@127.0.0.2) by mail.codesourcery.com with ESMTPA; 20 Aug 2010 23:29:24 -0000 Message-ID: <4C6F0FD1.7040207@codesourcery.com> Date: Sat, 21 Aug 2010 01:29:21 +0200 From: Bernd Schmidt User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.7) Gecko/20100724 Thunderbird/3.1.1 MIME-Version: 1.0 To: GCC Patches Subject: combine: Undo canonicalizations when splitting (minus x (mult)) Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org A week or two ago I posted a patch which added some new canonicalizations in the combiner, to change (plus (mult X -Y) Z) to (minus Z (mult X Y)) This helped on ARM, and to a lesser degree on x86. I mentioned at the time that I was aware of one counterexample, in the 20040709-2 testcase. Here, we have two successive pairs of multiply and add insns, which can be combined to a single multiply and add - however, that becomes (minus (const_int) (mult X (const_int)) which is not recognizable on i686 even when split. Since then I've also found a couple additional examples on x86_64 with vectorization enabled. The patch below corrects the problem by tweaking find_split_point. If we have (minus Z (mult X Y)) at the top level, and Y is not a constant power of 2, assume it is easier for the target to split the (plus (mult X -Y) Z) form. - imull $1103515245, %ecx, %r8d - addl $12345, %r8d - imull $1103515245, %r8d, %r8d - addl $12345, %r8d + imull $-1029531031, %ecx, %r8d + subl $740551042, %r8d Bootstrapped and tested on x86_64-linux. Looking at code generation on ARMv7 and x86_64, a few sequences were improved, and I did not discover additional regressions. Ok? Bernd Index: testsuite/gcc.target/i386/combine-mul.c =================================================================== --- testsuite/gcc.target/i386/combine-mul.c (revision 0) +++ testsuite/gcc.target/i386/combine-mul.c (revision 0) @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler-not "12345" } } */ + +static inline unsigned int myrnd (void) +{ + static unsigned int s = 1388815473; + s *= 1103515245; + s += 12345; +} + +struct __attribute__ ((packed)) A { + unsigned short i:1, l:1, j:3, k:11; +}; + +struct A sA; +void testA (void) +{ + char *p = (char *) &sA; + *p++ = myrnd (); + *p++ = myrnd (); + sA.k = -1; +} Index: combine.c =================================================================== --- combine.c (revision 163389) +++ combine.c (working copy) @@ -4771,6 +4771,23 @@ find_split_point (rtx *loc, rtx insn, bo case PLUS: case MINUS: + /* Canonicalization can produce (minus A (mult B C)), where C is a + constant. It may be better to try splitting (plus (mult B -C) A) + instead if this isn't a multiply by a power of two. */ + if (set_src && code == MINUS && GET_CODE (XEXP (x, 1)) == MULT + && GET_CODE (XEXP (XEXP (x, 1), 1)) == CONST_INT + && exact_log2 (INTVAL (XEXP (XEXP (x, 1), 1))) < 0) + { + enum machine_mode mode = GET_MODE (x); + unsigned HOST_WIDE_INT this_int = INTVAL (XEXP (XEXP (x, 1), 1)); + HOST_WIDE_INT other_int = trunc_int_for_mode (-this_int, mode); + SUBST (*loc, gen_rtx_PLUS (mode, gen_rtx_MULT (mode, + XEXP (XEXP (x, 1), 0), + GEN_INT (other_int)), + XEXP (x, 0))); + return find_split_point (loc, insn, set_src); + } + /* Split at a multiply-accumulate instruction. However if this is the SET_SRC, we likely do not have such an instruction and it's worthless to try this split. */