Message ID | 4C582BD8.3080306@codesourcery.com |
---|---|
State | New |
On 08/03/10 08:46, Bernd Schmidt wrote:
> On 08/03/2010 09:24 AM, Paolo Bonzini wrote:
>> On 08/02/2010 10:37 PM, Bernd Schmidt wrote:
>>> +      if (GET_CODE (op0) == NEG && CONST_INT_P (trueop1))
>>> +        return simplify_gen_binary (MULT, mode, XEXP (op0, 0),
>>> +                                    simplify_gen_unary (NEG, mode, op1, mode));
>> Why not go one step further and try it on all operands:
>>
>> if (GET_CODE (op0) == NEG)
>>   {
>>     rtx temp = simplify_unary (NEG, mode, op1, mode);
>>     if (temp)
>>       return simplify_gen_binary (MULT, mode, XEXP (op0, 0), temp);
>>   }
>> if (GET_CODE (op1) == NEG)
>>   {
>>     rtx temp = simplify_unary (NEG, mode, op0, mode);
>>     if (temp)
>>       return simplify_gen_binary (MULT, mode, temp, XEXP (op1, 0));
>>   }
> Done (slight typo in the above, needs simplify_unary_operation), and
> also implemented the opposite transformation in combine:
>   (minus x (mult y -12345))
> becomes
>   (plus (mult y 12345) x)
>
> I've now also looked at code generation on i686, where it also seems to
> help occasionally:
> -       imull   $-12, 4(%ecx), %edx
> -       movl    $4, %eax
> -       subl    %edx, %eax
> +       imull   $12, 4(%ecx), %eax
> +       addl    $4, %eax
> =========
> -       sall    $5, %eax
> -       negl    %eax
> -       imull   $-2, %eax, %eax
> +       sall    $6, %eax
>
> There's a single counterexample I found, in 20040709-2.c:
> -       imull   $-1029531031, %ecx, %ebp
> -       subl    $740551042, %ebp
> +       imull   $1103515245, %ecx, %ebp
> +       addl    $12345, %ebp
> +       imull   $1103515245, %ebp, %ebp
> +       addl    $12345, %ebp
>
> where an intermediate (minus (const) (mult x const)) is not recognized
> as a valid pattern in combine, which then prevents later
> transformations.  I think it's one of these cases where combine could
> benefit from looking at 4 insns.
>
> Bootstrapped and regression tested on i686-linux.  In the ARM tests,
> with the previous patch I saw an intermittent segfault on one testcase,
> which wasn't reproducible when running the compiler manually, and has
> gone away with the new version (tests still running).  I think it's
> unrelated.

OK.

jeff
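The rewrites discussed in this message can be checked on plain wrapping integers. The following is a minimal C sketch, not code from the patch: it assumes unsigned 32-bit arithmetic as a stand-in for SImode RTL operations, and all function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned 32-bit arithmetic wraps modulo 2^32.  It is used here as a
   model (an assumption of this sketch) for SImode MULT/NEG/PLUS/MINUS.  */

/* (mult (neg a) b) */
uint32_t mult_neg_lhs (uint32_t a, uint32_t b) { return (0u - a) * b; }

/* (mult a (neg b)) */
uint32_t mult_neg_rhs (uint32_t a, uint32_t b) { return a * (0u - b); }

/* (minus x (mult y -12345)) */
uint32_t minus_mult_neg (uint32_t x, uint32_t y)
{
  return x - y * (uint32_t) -12345;
}

/* (plus (mult y 12345) x) */
uint32_t plus_mult_pos (uint32_t x, uint32_t y) { return y * 12345u + x; }

/* Mirrors the "sall $5; negl; imull $-2" sequence.  */
uint32_t shift_neg_mult (uint32_t x)
{
  return (0u - (x << 5)) * (uint32_t) -2;
}

/* Mirrors the single "sall $6".  */
uint32_t shift_only (uint32_t x) { return x << 6; }

/* Spot-check the identities over a few inputs, including edge cases.  */
void check_identities (void)
{
  static const uint32_t samples[] =
    { 0u, 1u, 12u, 0x7fffffffu, 0x80000000u, 0xffffffffu };
  const unsigned n = sizeof samples / sizeof samples[0];

  for (unsigned i = 0; i < n; i++)
    {
      assert (shift_neg_mult (samples[i]) == shift_only (samples[i]));
      for (unsigned j = 0; j < n; j++)
        {
          assert (mult_neg_lhs (samples[i], samples[j])
                  == mult_neg_rhs (samples[i], samples[j]));
          assert (minus_mult_neg (samples[i], samples[j])
                  == plus_mult_pos (samples[i], samples[j]));
        }
    }
}
```

Calling `check_identities ()` from a test driver exercises all three pairs. The sketch says nothing about which form is cheaper on a given target; that is what the assembly diffs above illustrate.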
On 08/03/2010 05:00 PM, Jeff Law wrote:
> On 08/03/10 08:46, Bernd Schmidt wrote:
>> On 08/03/2010 09:24 AM, Paolo Bonzini wrote:
>>> Why not go one step further and try it on all operands:
>>>
>>> if (GET_CODE (op0) == NEG)
>>>   {
>>>     rtx temp = simplify_unary (NEG, mode, op1, mode);
>>>     if (temp)
>>>       return simplify_gen_binary (MULT, mode, XEXP (op0, 0), temp);
>>>   }
> OK.

Actually, do I have to limit that to integer modes?

Bernd
On Tue, 2010-08-03 at 17:10 +0200, Bernd Schmidt wrote:
> On 08/03/2010 05:00 PM, Jeff Law wrote:
> > On 08/03/10 08:46, Bernd Schmidt wrote:
> >> On 08/03/2010 09:24 AM, Paolo Bonzini wrote:
> >>> Why not go one step further and try it on all operands:
> >>>
> >>> if (GET_CODE (op0) == NEG)
> >>>   {
> >>>     rtx temp = simplify_unary (NEG, mode, op1, mode);
> >>>     if (temp)
> >>>       return simplify_gen_binary (MULT, mode, XEXP (op0, 0), temp);
> >>>   }
>
> > OK.
>
> Actually, do I have to limit that to integer modes?
>

Probably.  But it should be fast with fast-math (or re-associate or
whatever it's called) enabled.

R.
On 08/03/10 09:10, Bernd Schmidt wrote:
> On 08/03/2010 05:00 PM, Jeff Law wrote:
>> On 08/03/10 08:46, Bernd Schmidt wrote:
>>> On 08/03/2010 09:24 AM, Paolo Bonzini wrote:
>>>> Why not go one step further and try it on all operands:
>>>>
>>>> if (GET_CODE (op0) == NEG)
>>>>   {
>>>>     rtx temp = simplify_unary (NEG, mode, op1, mode);
>>>>     if (temp)
>>>>       return simplify_gen_binary (MULT, mode, XEXP (op0, 0), temp);
>>>>   }
>> OK.
> Actually, do I have to limit that to integer modes?

Good point.  Probably since you're effectively reassociating which can be
bad for FP.

jeff
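The concern about reassociating floating-point operations can be made concrete with addition, where regrouping changes the result. This is a minimal C sketch with illustrative names, assuming IEEE 754 doubles in round-to-nearest and a compiler invoked without -ffast-math or -fassociative-math.

```c
#include <assert.h>

/* (a + b) + c: the large terms cancel first, so c survives.  */
double sum_left (double a, double b, double c) { return (a + b) + c; }

/* a + (b + c): c is absorbed into b before the cancellation.  */
double sum_right (double a, double b, double c) { return a + (b + c); }

/* With big = 1e30, adding 1.0 to -big rounds back to -big, because 1.0 is
   far below half a unit in the last place at that magnitude.  */
void check_not_associative (void)
{
  const double big = 1e30;

  assert (sum_left (big, -big, 1.0) == 1.0);
  assert (sum_right (big, -big, 1.0) == 0.0);
}
```

The two groupings differ by the entire value of `c`, which is why a compiler may only regroup floating-point operations when the user has explicitly allowed it.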
On 08/03/2010 05:16 PM, Jeff Law wrote:
> On 08/03/10 09:10, Bernd Schmidt wrote:
>> On 08/03/2010 05:00 PM, Jeff Law wrote:
>>> On 08/03/10 08:46, Bernd Schmidt wrote:
>>>> On 08/03/2010 09:24 AM, Paolo Bonzini wrote:
>>>>> Why not go one step further and try it on all operands:
>>>>>
>>>>> if (GET_CODE (op0) == NEG)
>>>>>   {
>>>>>     rtx temp = simplify_unary (NEG, mode, op1, mode);
>>>>>     if (temp)
>>>>>       return simplify_gen_binary (MULT, mode, XEXP (op0, 0), temp);
>>>>>   }
>>> OK.
>> Actually, do I have to limit that to integer modes?
>
> Good point.  Probably since you're effectively reassociating which can be
> bad for FP.

"-a * b" to "a * -b" is safe even for non-fast-math, try grepping -A.*B
in fold-const.c.  This of course assumes that simplify_unary_operation
itself doesn't do any invalid transformation, but that's a different
problem.

Paolo
On 08/03/2010 05:28 PM, Paolo Bonzini wrote:
> "-a * b" to "a * -b" is safe even for non-fast-math, try grepping -A.*B
> in fold-const.c.  This of course assumes that simplify_unary_operation
> itself doesn't do any invalid transformation, but that's a different
> problem.

That seems convincing, so I'll check it in as-is.  Thanks.

Bernd
On Tue, Aug 3, 2010 at 5:28 PM, Paolo Bonzini <bonzini@gnu.org> wrote:
> On 08/03/2010 05:16 PM, Jeff Law wrote:
>>
>> On 08/03/10 09:10, Bernd Schmidt wrote:
>>>
>>> On 08/03/2010 05:00 PM, Jeff Law wrote:
>>>>
>>>> On 08/03/10 08:46, Bernd Schmidt wrote:
>>>>>
>>>>> On 08/03/2010 09:24 AM, Paolo Bonzini wrote:
>>>>>>
>>>>>> Why not go one step further and try it on all operands:
>>>>>>
>>>>>> if (GET_CODE (op0) == NEG)
>>>>>>   {
>>>>>>     rtx temp = simplify_unary (NEG, mode, op1, mode);
>>>>>>     if (temp)
>>>>>>       return simplify_gen_binary (MULT, mode, XEXP (op0, 0), temp);
>>>>>>   }
>>>>
>>>> OK.
>>>
>>> Actually, do I have to limit that to integer modes?
>>
>> Good point. Probably since you're effectively reassociating which can be
>> bad for FP.
>
> "-a * b" to "a * -b" is safe even for non-fast-math, try grepping -A.*B in
> fold-const.c.  This of course assumes that simplify_unary_operation itself
> doesn't do any invalid transformation, but that's a different problem.

It's not safe on RTL, please do not add FP reassociation there.
(config/i386/i386.c:ix86_expand_{round,trunc,...} would start
to break).

Richard.
On 08/03/2010 05:44 PM, Richard Guenther wrote:
> It's not safe on RTL, please do not add FP reassociation there.
> (config/i386/i386.c:ix86_expand_{round,trunc,...} would start
> to break).

Not a problem (I guess a !FLOAT_MODE_P || flag_associative_math test is
needed), but I guess I don't understand how things would break - I see
no multiply operations in these i386 functions.  Can you elaborate?

Bernd
On Tue, Aug 3, 2010 at 6:01 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> On 08/03/2010 05:44 PM, Richard Guenther wrote:
>> It's not safe on RTL, please do not add FP reassociation there.
>> (config/i386/i386.c:ix86_expand_{round,trunc,...} would start
>> to break).
>
> Not a problem (I guess a !FLOAT_MODE_P || flag_associative_math test is
> needed), but I guess I don't understand how things would break - I see
> no multiply operations in these i386 functions.  Can you elaborate?

It was just a general comment to re-associations of FP on RTL,
but more important is that we not start doing constant folding,
re-associating should be fine as long as they are valid without
any fancy math flags.

Richard.

>
> Bernd
>
On Tue, Aug 3, 2010 at 6:06 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Tue, Aug 3, 2010 at 6:01 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
>> On 08/03/2010 05:44 PM, Richard Guenther wrote:
>>> It's not safe on RTL, please do not add FP reassociation there.
>>> (config/i386/i386.c:ix86_expand_{round,trunc,...} would start
>>> to break).
>>
>> Not a problem (I guess a !FLOAT_MODE_P || flag_associative_math test is
>> needed), but I guess I don't understand how things would break - I see
>> no multiply operations in these i386 functions.  Can you elaborate?
>
> It was just a general comment to re-associations of FP on RTL,
> but more important is that we not start doing constant folding,
> re-associating should be fine as long as they are valid without
> any fancy math flags.

Btw, doing more elaborate re-association on RTL would need
carrying PAREN_EXPR support down to RTL, which is an
explicit re-association barrier (dropped during expansion because
we do not re-associate FP on RTL - sofar).

Richard.
On 08/03/2010 06:06 PM, Richard Guenther wrote:
> On Tue, Aug 3, 2010 at 6:01 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
>> On 08/03/2010 05:44 PM, Richard Guenther wrote:
>>> It's not safe on RTL, please do not add FP reassociation there.
>>> (config/i386/i386.c:ix86_expand_{round,trunc,...} would start
>>> to break).
>>
>> Not a problem (I guess a !FLOAT_MODE_P || flag_associative_math test is
>> needed), but I guess I don't understand how things would break - I see
>> no multiply operations in these i386 functions.  Can you elaborate?
>
> It was just a general comment to re-associations of FP on RTL,
> but more important is that we not start doing constant folding,
> re-associating should be fine as long as they are valid without
> any fancy math flags.

Constant folding should be dealt with in simplify_unary_operation (NEG,
...) as Paolo said.

So, what is your opinion about this specific transformation?  Should it
be enabled with flag_associative_math only?

Bernd
On Tue, Aug 3, 2010 at 6:11 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> On 08/03/2010 06:06 PM, Richard Guenther wrote:
>> On Tue, Aug 3, 2010 at 6:01 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
>>> On 08/03/2010 05:44 PM, Richard Guenther wrote:
>>>> It's not safe on RTL, please do not add FP reassociation there.
>>>> (config/i386/i386.c:ix86_expand_{round,trunc,...} would start
>>>> to break).
>>>
>>> Not a problem (I guess a !FLOAT_MODE_P || flag_associative_math test is
>>> needed), but I guess I don't understand how things would break - I see
>>> no multiply operations in these i386 functions.  Can you elaborate?
>>
>> It was just a general comment to re-associations of FP on RTL,
>> but more important is that we not start doing constant folding,
>> re-associating should be fine as long as they are valid without
>> any fancy math flags.
>
> Constant folding should be dealt with in simplify_unary_operation (NEG,
> ...) as Paolo said.
>
> So, what is your opinion about this specific transformation?  Should it
> be enabled with flag_associative_math only?

The specific transformation is always valid as it doesn't change the
outcome of the operation.  It can be done unconditionally.

Richard.

>
> Bernd
>
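The claim that moving a negation from one multiplication operand to the other does not change the outcome can be spot-checked in C. This is a minimal sketch with illustrative names, assuming IEEE 754 doubles; it treats all NaNs as equivalent, because the sign bit of a NaN result is not something the comparison below can rely on.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* (mult (neg a) b) */
double neg_lhs (double a, double b) { return (-a) * b; }

/* (mult a (neg b)) */
double neg_rhs (double a, double b) { return a * (-b); }

/* True if x and y are the same value.  +0.0 and -0.0 count as different;
   any two NaNs count as the same.  */
bool same_value (double x, double y)
{
  if (isnan (x) || isnan (y))
    return isnan (x) && isnan (y);
  return x == y && !signbit (x) == !signbit (y);
}

/* Compare both forms over ordinary values, signed zeros, infinities and
   NaN.  The product's sign is the XOR of the operand signs and its
   magnitude depends only on the operand magnitudes, so both forms agree.  */
void check_neg_move (void)
{
  static const double samples[] =
    { 0.0, -0.0, 1.0, -3.0, 0.1, 0.3, 1e308, 1e-320, INFINITY, -INFINITY, NAN };
  const unsigned n = sizeof samples / sizeof samples[0];

  for (unsigned i = 0; i < n; i++)
    for (unsigned j = 0; j < n; j++)
      assert (same_value (neg_lhs (samples[i], samples[j]),
                          neg_rhs (samples[i], samples[j])));
}
```

Unlike regrouping additions, this rewrite only changes which operand carries the sign, so signed zeros and rounding are unaffected.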
On 08/03/2010 06:08 PM, Richard Guenther wrote:
> On Tue, Aug 3, 2010 at 6:06 PM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Tue, Aug 3, 2010 at 6:01 PM, Bernd Schmidt<bernds@codesourcery.com> wrote:
>>> On 08/03/2010 05:44 PM, Richard Guenther wrote:
>>>> It's not safe on RTL, please do not add FP reassociation there.
>>>> (config/i386/i386.c:ix86_expand_{round,trunc,...} would start
>>>> to break).
>>>
>>> Not a problem (I guess a !FLOAT_MODE_P || flag_associative_math test is
>>> needed), but I guess I don't understand how things would break - I see
>>> no multiply operations in these i386 functions.  Can you elaborate?
>>
>> It was just a general comment to re-associations of FP on RTL,
>> but more important is that we not start doing constant folding,
>> re-associating should be fine as long as they are valid without
>> any fancy math flags.
>
> Btw, doing more elaborate re-association on RTL would need
> carrying PAREN_EXPR support down to RTL, which is an
> explicit re-association barrier (dropped during expansion because
> we do not re-associate FP on RTL - sofar).

Looks like a bad idea...  If someone ever introduces some world-shaking
transform on RTL that requires reassociation he/she'll have to live with
requiring -ffast-math or whatnot.

Paolo
On Tue, Aug 3, 2010 at 6:35 PM, Paolo Bonzini <bonzini@gnu.org> wrote:
> On 08/03/2010 06:08 PM, Richard Guenther wrote:
>>
>> On Tue, Aug 3, 2010 at 6:06 PM, Richard Guenther
>> <richard.guenther@gmail.com> wrote:
>>>
>>> On Tue, Aug 3, 2010 at 6:01 PM, Bernd Schmidt<bernds@codesourcery.com>
>>> wrote:
>>>>
>>>> On 08/03/2010 05:44 PM, Richard Guenther wrote:
>>>>>
>>>>> It's not safe on RTL, please do not add FP reassociation there.
>>>>> (config/i386/i386.c:ix86_expand_{round,trunc,...} would start
>>>>> to break).
>>>>
>>>> Not a problem (I guess a !FLOAT_MODE_P || flag_associative_math test is
>>>> needed), but I guess I don't understand how things would break - I see
>>>> no multiply operations in these i386 functions.  Can you elaborate?
>>>
>>> It was just a general comment to re-associations of FP on RTL,
>>> but more important is that we not start doing constant folding,
>>> re-associating should be fine as long as they are valid without
>>> any fancy math flags.
>>
>> Btw, doing more elaborate re-association on RTL would need
>> carrying PAREN_EXPR support down to RTL, which is an
>> explicit re-association barrier (dropped during expansion because
>> we do not re-associate FP on RTL - sofar).
>
> Looks like a bad idea...  If someone ever introduces some world-shaking
> transform on RTL that requires reassociation he/she'll have to live with
> requiring -ffast-math or whatnot.

No, the point is that even with -ffast-math reassociating over
PAREN_EXPR is invalid.

Richard.

> Paolo
>
On Tue, Aug 3, 2010 at 7:46 AM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> On 08/03/2010 09:24 AM, Paolo Bonzini wrote:
>> On 08/02/2010 10:37 PM, Bernd Schmidt wrote:
>>> +      if (GET_CODE (op0) == NEG && CONST_INT_P (trueop1))
>>> +        return simplify_gen_binary (MULT, mode, XEXP (op0, 0),
>>> +                                    simplify_gen_unary (NEG, mode, op1, mode));
>>
>> Why not go one step further and try it on all operands:
>>
>> if (GET_CODE (op0) == NEG)
>>   {
>>     rtx temp = simplify_unary (NEG, mode, op1, mode);
>>     if (temp)
>>       return simplify_gen_binary (MULT, mode, XEXP (op0, 0), temp);
>>   }
>> if (GET_CODE (op1) == NEG)
>>   {
>>     rtx temp = simplify_unary (NEG, mode, op0, mode);
>>     if (temp)
>>       return simplify_gen_binary (MULT, mode, temp, XEXP (op1, 0));
>>   }
>
> Done (slight typo in the above, needs simplify_unary_operation), and
> also implemented the opposite transformation in combine:
>   (minus x (mult y -12345))
> becomes
>   (plus (mult y 12345) x)
>
> I've now also looked at code generation on i686, where it also seems to
> help occasionally:
> -       imull   $-12, 4(%ecx), %edx
> -       movl    $4, %eax
> -       subl    %edx, %eax
> +       imull   $12, 4(%ecx), %eax
> +       addl    $4, %eax
> =========
> -       sall    $5, %eax
> -       negl    %eax
> -       imull   $-2, %eax, %eax
> +       sall    $6, %eax
>
> There's a single counterexample I found, in 20040709-2.c:
> -       imull   $-1029531031, %ecx, %ebp
> -       subl    $740551042, %ebp
> +       imull   $1103515245, %ecx, %ebp
> +       addl    $12345, %ebp
> +       imull   $1103515245, %ebp, %ebp
> +       addl    $12345, %ebp
>
> where an intermediate (minus (const) (mult x const)) is not recognized
> as a valid pattern in combine, which then prevents later
> transformations.  I think it's one of these cases where combine could
> benefit from looking at 4 insns.
>
> Bootstrapped and regression tested on i686-linux.  In the ARM tests,
> with the previous patch I saw an intermittent segfault on one testcase,
> which wasn't reproducible when running the compiler manually, and has
> gone away with the new version (tests still running).  I think it's
> unrelated.
>
>

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45182
Index: config/arm/constraints.md
===================================================================
--- config/arm/constraints.md.orig
+++ config/arm/constraints.md
@@ -121,7 +121,7 @@
   "In Thumb-1 state a constant that is a multiple of 4 in the range 0-1020."
   (and (match_code "const_int")
        (match_test "TARGET_32BIT ? ((ival >= 0 && ival <= 32)
-                                 || ((ival & (ival - 1)) == 0))
+                                 || (((ival & (ival - 1)) & 0xFFFFFFFF) == 0))
                     : ival >= 0 && ival <= 1020 && (ival & 3) == 0")))
 
 (define_constraint "N"
Index: config/arm/predicates.md
===================================================================
--- config/arm/predicates.md.orig
+++ config/arm/predicates.md
@@ -289,7 +289,7 @@
 (define_predicate "power_of_two_operand"
   (match_code "const_int")
 {
-  HOST_WIDE_INT value = INTVAL (op);
+  unsigned HOST_WIDE_INT value = INTVAL (op) & 0xffffffff;
 
   return value != 0  &&  (value & (value - 1)) == 0;
 })
 
Index: simplify-rtx.c
===================================================================
--- simplify-rtx.c.orig
+++ simplify-rtx.c
@@ -2109,6 +2109,19 @@ simplify_binary_operation_1 (enum rtx_co
       if (trueop1 == constm1_rtx)
         return simplify_gen_unary (NEG, mode, op0, mode);
 
+      if (GET_CODE (op0) == NEG)
+        {
+          rtx temp = simplify_unary_operation (NEG, mode, op1, mode);
+          if (temp)
+            return simplify_gen_binary (MULT, mode, XEXP (op0, 0), temp);
+        }
+      if (GET_CODE (op1) == NEG)
+        {
+          rtx temp = simplify_unary_operation (NEG, mode, op0, mode);
+          if (temp)
+            return simplify_gen_binary (MULT, mode, temp, XEXP (op1, 0));
+        }
+
       /* Maybe simplify x * 0 to 0.  The reduction is not valid if
          x is NaN, since x * 0 is then also NaN.  Nor is it valid
          when the mode has signed zeros, since multiplying a negative
Index: combine.c
===================================================================
--- combine.c.orig
+++ combine.c
@@ -7110,13 +7110,79 @@ make_compound_operation (rtx x, enum rtx
           && INTVAL (XEXP (x, 1)) < HOST_BITS_PER_WIDE_INT
           && INTVAL (XEXP (x, 1)) >= 0)
         {
+          HOST_WIDE_INT count = INTVAL (XEXP (x, 1));
+          HOST_WIDE_INT multval = (HOST_WIDE_INT) 1 << count;
+
           new_rtx = make_compound_operation (XEXP (x, 0), next_code);
-          new_rtx = gen_rtx_MULT (mode, new_rtx,
-                                  GEN_INT ((HOST_WIDE_INT) 1
-                                           << INTVAL (XEXP (x, 1))));
+          if (GET_CODE (new_rtx) == NEG)
+            {
+              new_rtx = XEXP (new_rtx, 0);
+              multval = -multval;
+            }
+          multval = trunc_int_for_mode (multval, mode);
+          new_rtx = gen_rtx_MULT (mode, new_rtx, GEN_INT (multval));
         }
       break;
 
+    case PLUS:
+      lhs = XEXP (x, 0);
+      rhs = XEXP (x, 1);
+      lhs = make_compound_operation (lhs, MEM);
+      rhs = make_compound_operation (rhs, MEM);
+      if (GET_CODE (lhs) == MULT && GET_CODE (XEXP (lhs, 0)) == NEG
+          && SCALAR_INT_MODE_P (mode))
+        {
+          tem = simplify_gen_binary (MULT, mode, XEXP (XEXP (lhs, 0), 0),
+                                     XEXP (lhs, 1));
+          new_rtx = simplify_gen_binary (MINUS, mode, rhs, tem);
+        }
+      else if (GET_CODE (lhs) == MULT
+               && (CONST_INT_P (XEXP (lhs, 1)) && INTVAL (XEXP (lhs, 1)) < 0))
+        {
+          tem = simplify_gen_binary (MULT, mode, XEXP (lhs, 0),
+                                     simplify_gen_unary (NEG, mode,
+                                                         XEXP (lhs, 1),
+                                                         mode));
+          new_rtx = simplify_gen_binary (MINUS, mode, rhs, tem);
+        }
+      else
+        {
+          SUBST (XEXP (x, 0), lhs);
+          SUBST (XEXP (x, 1), rhs);
+          goto maybe_swap;
+        }
+      x = gen_lowpart (mode, new_rtx);
+      goto maybe_swap;
+
+    case MINUS:
+      lhs = XEXP (x, 0);
+      rhs = XEXP (x, 1);
+      lhs = make_compound_operation (lhs, MEM);
+      rhs = make_compound_operation (rhs, MEM);
+      if (GET_CODE (rhs) == MULT && GET_CODE (XEXP (rhs, 0)) == NEG
+          && SCALAR_INT_MODE_P (mode))
+        {
+          tem = simplify_gen_binary (MULT, mode, XEXP (XEXP (rhs, 0), 0),
+                                     XEXP (rhs, 1));
+          new_rtx = simplify_gen_binary (PLUS, mode, tem, lhs);
+        }
+      else if (GET_CODE (rhs) == MULT
+               && (CONST_INT_P (XEXP (rhs, 1)) && INTVAL (XEXP (rhs, 1)) < 0))
+        {
+          tem = simplify_gen_binary (MULT, mode, XEXP (rhs, 0),
+                                     simplify_gen_unary (NEG, mode,
+                                                         XEXP (rhs, 1),
+                                                         mode));
+          new_rtx = simplify_gen_binary (PLUS, mode, tem, lhs);
+        }
+      else
+        {
+          SUBST (XEXP (x, 0), lhs);
+          SUBST (XEXP (x, 1), rhs);
+          return x;
+        }
+      return gen_lowpart (mode, new_rtx);
+
     case AND:
       /* If the second operand is not a constant, we can't do anything
          with it.  */
@@ -7345,6 +7411,7 @@ make_compound_operation (rtx x, enum rtx
             SUBST (XVECEXP (x, i, j), new_rtx);
           }
 
+ maybe_swap:
   /* If this is a commutative operation, the changes to the operands
      may have made it noncanonical.  */
   if (COMMUTATIVE_ARITH_P (x)
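The ARM predicate change in the patch masks the constant to 32 bits before the power-of-two test. The following minimal C sketch shows why that matters, assuming a 64-bit HOST_WIDE_INT and the usual sign-extended storage of a 32-bit constant. The `hwi`/`uhwi` typedefs and all function names are hypothetical stand-ins, not GCC identifiers, and the link to negated multipliers produced by the combine change is presumed rather than stated in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for a 64-bit HOST_WIDE_INT and its unsigned form.  */
typedef int64_t hwi;
typedef uint64_t uhwi;

/* Sign-extend a 32-bit pattern to the host width, the way a 32-bit
   integer constant is assumed to be stored in this sketch.  */
hwi simode_const (uint32_t bits)
{
  return bits >= 0x80000000u ? (hwi) bits - ((hwi) 1 << 32) : (hwi) bits;
}

/* The same bit test as the pre-patch predicate, applied to all 64 bits.
   Done on the unsigned type here to avoid signed overflow in the sketch.  */
bool power_of_two_old (hwi op)
{
  uhwi value = (uhwi) op;
  return value != 0 && (value & (value - 1)) == 0;
}

/* The post-patch test: only the low 32 bits are considered.  */
bool power_of_two_new (hwi op)
{
  uhwi value = (uhwi) op & 0xffffffff;
  return value != 0 && (value & (value - 1)) == 0;
}

/* 0x80000000 is 2^31 as a 32-bit pattern, but its sign-extended host
   value has all upper bits set, so the 64-bit test rejects it.  */
void check_power_of_two (void)
{
  hwi top_bit = simode_const (0x80000000u);

  assert (top_bit < 0);
  assert (!power_of_two_old (top_bit));
  assert (power_of_two_new (top_bit));
  assert (power_of_two_old (64) && power_of_two_new (64));
}
```

A constant of this shape can arise when a shift by 31 is turned into a multiplication and then negated, which is one plausible reason the predicate and the constraint were adjusted together with the combine change.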