Simplify X / X, 0 / X and X % X

Message ID	alpine.DEB.2.02.1611042045070.23376@laptop-mg.saclay.inria.fr
State	New
Headers	show Return-Path: <gcc-patches-return-440496-incoming=patchwork.ozlabs.org@gcc.gnu.org> DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:subject:message-id:mime-version:content-type; q=dns; s= default; b=gA7H0AyHhqpxdGyll4NosJPMCMp1fRlr0HCzE+WL3kxzJkl4Gm0Q/ HOdlF262UsttKJu0wbzU+9DQsLetLwSR9Yb9hFXnuwHJ5Q9RBpJeRzmgxmTARQXi B6f7aWAM8d/yNkhRiScmoGcJlXo/feFC2NLbn3w3WLHHw1oHRMKXVg= Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk Sender: gcc-patches-owner@gcc.gnu.org Date: Fri, 4 Nov 2016 21:07:07 +0100 (CET) From: Marc Glisse <marc.glisse@inria.fr> To: gcc-patches@gcc.gnu.org Subject: Simplify X / X, 0 / X and X % X Message-ID: <alpine.DEB.2.02.1611042045070.23376@laptop-mg.saclay.inria.fr> User-Agent: Alpine 2.02 (DEB 1266 2009-07-14) MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="8323329-903730152-1478290034=:23376"

Message ID

alpine.DEB.2.02.1611042045070.23376@laptop-mg.saclay.inria.fr

State

New

Headers

DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender:date
	:from:to:subject:message-id:mime-version:content-type; q=dns; s=
	default; b=gA7H0AyHhqpxdGyll4NosJPMCMp1fRlr0HCzE+WL3kxzJkl4Gm0Q/
	HOdlF262UsttKJu0wbzU+9DQsLetLwSR9Yb9hFXnuwHJ5Q9RBpJeRzmgxmTARQXi
	B6f7aWAM8d/yNkhRiScmoGcJlXo/feFC2NLbn3w3WLHHw1oHRMKXVg=
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-patches-owner@gcc.gnu.org
Date: Fri, 4 Nov 2016 21:07:07 +0100 (CET)
From: Marc Glisse <marc.glisse@inria.fr>
To: gcc-patches@gcc.gnu.org
Subject: Simplify X / X, 0 / X and X % X
Message-ID: <alpine.DEB.2.02.1611042045070.23376@laptop-mg.saclay.inria.fr>
User-Agent: Alpine 2.02 (DEB 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY="8323329-903730152-1478290034=:23376"

Commit Message

Marc Glisse Nov. 4, 2016, 8:07 p.m. UTC

Hello,

since we were discussing this recently...

The condition is copied from the existing 0 % X case, visible in the 
context of the diff.

As far as I understand, the main case where we do not want to optimize is 
during constexpr evaluation in the C++ front-end (it wants to detect the 
undefined behavior), and with late folding I think this means we only need 
to care about an explicit 0/0, not about X/X where X would become 0 after 
the simplification.

And later, if we do have something like X/0, we could handle it the same 
way as we currently handle *(char*)0, insert a trap after that instruction 
and clear the following code, which likely gives better code than 
replacing 0/0 with 1.

Bootstrap+regtest on powerpc64le-unknown-linux-gnu.

2016-11-07  Marc Glisse  <marc.glisse@inria.fr>

gcc/
 	* match.pd (0 / X, X / X, X % X): New simplifications.

gcc/testsuite/
 	* gcc.dg/tree-ssa/divide-5.c: New file.

Comments

Jeff Law Nov. 5, 2016, 2:30 a.m. UTC | #1

On 11/04/2016 02:07 PM, Marc Glisse wrote:
> Hello,
>
> since we were discussing this recently...
>
> The condition is copied from the existing 0 % X case, visible in the
> context of the diff.
>
> As far as I understand, the main case where we do not want to optimize
> is during constexpr evaluation in the C++ front-end (it wants to detect
> the undefined behavior), and with late folding I think this means we
> only need to care about an explicit 0/0, not about X/X where X would
> become 0 after the simplification.
>
> And later, if we do have something like X/0, we could handle it the same
> way as we currently handle *(char*)0, insert a trap after that
> instruction and clear the following code, which likely gives better code
> than replacing 0/0 with 1.
Yup.  I'd prefer to insert a trap if we ultimately expose a division by 
zero -- including cases where that division occurs as a result of a PHI 
arg being zero and the PHI result being used as a denominator in a 
division expression.

It ought to be extremely easy to detect & transform that case (and 
probably warn for it too).


I'm leaving the actual review to Richi.
jeff

Richard Biener Nov. 7, 2016, 10:01 a.m. UTC | #2

On Fri, Nov 4, 2016 at 9:07 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> Hello,
>
> since we were discussing this recently...
>
> The condition is copied from the existing 0 % X case, visible in the context
> of the diff.
>
> As far as I understand, the main case where we do not want to optimize is
> during constexpr evaluation in the C++ front-end (it wants to detect the
> undefined behavior), and with late folding I think this means we only need
> to care about an explicit 0/0, not about X/X where X would become 0 after
> the simplification.
>
> And later, if we do have something like X/0, we could handle it the same way
> as we currently handle *(char*)0, insert a trap after that instruction and
> clear the following code, which likely gives better code than replacing 0/0
> with 1.
>
> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.

Ok.

Thanks,
Richard.

> 2016-11-07  Marc Glisse  <marc.glisse@inria.fr>
>
> gcc/
>         * match.pd (0 / X, X / X, X % X): New simplifications.
>
> gcc/testsuite/
>         * gcc.dg/tree-ssa/divide-5.c: New file.
>
> --
> Marc Glisse

Richard Biener Nov. 7, 2016, 10:02 a.m. UTC | #3

On Sat, Nov 5, 2016 at 3:30 AM, Jeff Law <law@redhat.com> wrote:
> On 11/04/2016 02:07 PM, Marc Glisse wrote:
>>
>> Hello,
>>
>> since we were discussing this recently...
>>
>> The condition is copied from the existing 0 % X case, visible in the
>> context of the diff.
>>
>> As far as I understand, the main case where we do not want to optimize
>> is during constexpr evaluation in the C++ front-end (it wants to detect
>> the undefined behavior), and with late folding I think this means we
>> only need to care about an explicit 0/0, not about X/X where X would
>> become 0 after the simplification.
>>
>> And later, if we do have something like X/0, we could handle it the same
>> way as we currently handle *(char*)0, insert a trap after that
>> instruction and clear the following code, which likely gives better code
>> than replacing 0/0 with 1.
>
> Yup.  I'd prefer to insert a trap if we ultimately expose a division by zero
> -- including cases where that division occurs as a result of a PHI arg being
> zero and the PHI result being used as a denominator in a division
> expression.
>
> It ought to be extremely easy to detect & transform that case (and probably
> warn for it too).

We have gimple-ssa-isolate-paths.c for that, right?

Richard.

>
>
> I'm leaving the actual review to Richi.
> jeff
>

Jeff Law Nov. 7, 2016, 3:19 p.m. UTC | #4

On 11/07/2016 03:02 AM, Richard Biener wrote:
> On Sat, Nov 5, 2016 at 3:30 AM, Jeff Law <law@redhat.com> wrote:
>> On 11/04/2016 02:07 PM, Marc Glisse wrote:
>>>
>>> Hello,
>>>
>>> since we were discussing this recently...
>>>
>>> The condition is copied from the existing 0 % X case, visible in the
>>> context of the diff.
>>>
>>> As far as I understand, the main case where we do not want to optimize
>>> is during constexpr evaluation in the C++ front-end (it wants to detect
>>> the undefined behavior), and with late folding I think this means we
>>> only need to care about an explicit 0/0, not about X/X where X would
>>> become 0 after the simplification.
>>>
>>> And later, if we do have something like X/0, we could handle it the same
>>> way as we currently handle *(char*)0, insert a trap after that
>>> instruction and clear the following code, which likely gives better code
>>> than replacing 0/0 with 1.
>>
>> Yup.  I'd prefer to insert a trap if we ultimately expose a division by zero
>> -- including cases where that division occurs as a result of a PHI arg being
>> zero and the PHI result being used as a denominator in a division
>> expression.
>>
>> It ought to be extremely easy to detect & transform that case (and probably
>> warn for it too).
>
> We have gimple-ssa-isolate-paths.c for that, right?
Right.   I was thinking about instrumenting for it today to see if it's 
worth any effort.  It shouldn't take more than a few minutes once I 
refamiliarize myself with isolate-paths.





jeff

Index: gcc/match.pd
===================================================================
--- gcc/match.pd	(revision 241856)
+++ gcc/match.pd	(working copy)
@@ -133,33 +133,47 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (non_lvalue @0)))
 
 /* Transform x * -1.0 into -x.  */
 (simplify
  (mult @0 real_minus_onep)
   (if (!HONOR_SNANS (type)
        && (!HONOR_SIGNED_ZEROS (type)
            || !COMPLEX_FLOAT_TYPE_P (type)))
    (negate @0)))
 
-/* Make sure to preserve divisions by zero.  This is the reason why
-   we don't simplify x / x to 1 or 0 / x to 0.  */
+/* X * 1, X / 1 -> X.  */
 (for op (mult trunc_div ceil_div floor_div round_div exact_div)
   (simplify
     (op @0 integer_onep)
     (non_lvalue @0)))
 
+/* Preserve explicit divisions by 0: the C++ front-end wants to detect
+   undefined behavior in constexpr evaluation, and assuming that the division
+   traps enables better optimizations than these anyway.  */
 (for div (trunc_div ceil_div floor_div round_div exact_div)
+ /* 0 / X is always zero.  */
+ (simplify
+  (div integer_zerop@0 @1)
+  /* But not for 0 / 0 so that we can get the proper warnings and errors.  */
+  (if (!integer_zerop (@1))
+   @0))
   /* X / -1 is -X.  */
  (simplify
    (div @0 integer_minus_onep@1)
    (if (!TYPE_UNSIGNED (type))
     (negate @0)))
+ /* X / X is one.  */
+ (simplify
+  (div @0 @0)
+  /* But not for 0 / 0 so that we can get the proper warnings and errors.  */
+  (if (!integer_zerop (@0))
+   { build_one_cst (type); }))
  /* X / abs (X) is X < 0 ? -1 : 1.  */ 
  (simplify
    (div:C @0 (abs @0))
    (if (INTEGRAL_TYPE_P (type)
 	&& TYPE_OVERFLOW_UNDEFINED (type))
     (cond (lt @0 { build_zero_cst (type); })
           { build_minus_one_cst (type); } { build_one_cst (type); })))
  /* X / -X is -1.  */
  (simplify
    (div:C @0 (negate @0))
@@ -269,38 +283,42 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 	&& !real_zerop (@1))
     (with
      { tree tem = const_binop (RDIV_EXPR, type, build_one_cst (type), @1); }
      (if (tem)
       (mult @0 { tem; } )))
     (if (cst != COMPLEX_CST)
      (with { tree inverse = exact_inverse (type, @1); }
       (if (inverse)
        (mult @0 { inverse; } ))))))))
 
-/* Same applies to modulo operations, but fold is inconsistent here
-   and simplifies 0 % x to 0, only preserving literal 0 % 0.  */
 (for mod (ceil_mod floor_mod round_mod trunc_mod)
  /* 0 % X is always zero.  */
  (simplify
   (mod integer_zerop@0 @1)
   /* But not for 0 % 0 so that we can get the proper warnings and errors.  */
   (if (!integer_zerop (@1))
    @0))
  /* X % 1 is always zero.  */
  (simplify
   (mod @0 integer_onep)
   { build_zero_cst (type); })
  /* X % -1 is zero.  */
  (simplify
   (mod @0 integer_minus_onep@1)
   (if (!TYPE_UNSIGNED (type))
    { build_zero_cst (type); }))
+ /* X % X is zero.  */
+ (simplify
+  (mod @0 @0)
+  /* But not for 0 % 0 so that we can get the proper warnings and errors.  */
+  (if (!integer_zerop (@0))
+   { build_zero_cst (type); }))
  /* (X % Y) % Y is just X % Y.  */
  (simplify
   (mod (mod@2 @0 @1) @1)
   @2)
  /* From extract_muldiv_1: (X * C1) % C2 is zero if C1 is a multiple of C2.  */
  (simplify
   (mod (mult @0 INTEGER_CST@1) INTEGER_CST@2)
   (if (ANY_INTEGRAL_TYPE_P (type)
        && TYPE_OVERFLOW_UNDEFINED (type)
        && wi::multiple_of_p (@1, @2, TYPE_SIGN (type)))
Index: gcc/testsuite/gcc.dg/tree-ssa/divide-5.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/divide-5.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/divide-5.c	(working copy)
@@ -0,0 +1,10 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+int f(int x){
+  int y = x;
+  int z = 0;
+  return x / y - x % y + z / y;
+}
+
+/* { dg-final { scan-tree-dump "return 1;" "optimized"} } */

Simplify X / X, 0 / X and X % X

Commit Message

Comments

Patch