Message ID | 4DDBE035.8050901@codesourcery.com |
---|---|
State | New |
Headers | show |
On Tue, 24 May 2011, Andrew Stubbs wrote: > I've created this new, simpler patch that converts > > (extend (mult a b)) > > into > > (mult (extend a) (extend b)) > > regardless of what 'a' and 'b' might be. (These are then simplified and > superfluous extends removed, of course.) Are there some missing conditions here? The two aren't equivalent in general - (extend:SI (mult:HI a b)) multiplies the HImode values in HImode (with modulo arithmetic on overflow) before extending the possibly wrapped result to SImode. You'd need a and b themselves to be extended from narrower modes in such a way that if you interpret the extended values in the signedness of the outer extension, the result of the multiplication is exactly representable in the mode of the multiplication. (For example, if both values are extended from QImode, and all extensions have the same signedness, that would be OK. There are cases that are OK where not all extensions have the same signedness, e.g. (sign_extend:DI (mult:SI a b)) where a and b are zero-extended from HImode or QImode, at least one from QImode, though there the outer extension is equivalent to a zero-extension.)
On 24/05/11 20:35, Joseph S. Myers wrote: > On Tue, 24 May 2011, Andrew Stubbs wrote: > >> I've created this new, simpler patch that converts >> >> (extend (mult a b)) >> >> into >> >> (mult (extend a) (extend b)) >> >> regardless of what 'a' and 'b' might be. (These are then simplified and >> superfluous extends removed, of course.) > > Are there some missing conditions here? The two aren't equivalent in > general - (extend:SI (mult:HI a b)) multiplies the HImode values in HImode > (with modulo arithmetic on overflow) before extending the possibly wrapped > result to SImode. You'd need a and b themselves to be extended from > narrower modes in such a way that if you interpret the extended values in > the signedness of the outer extension, the result of the multiplication is > exactly representable in the mode of the multiplication. (For example, if > both values are extended from QImode, and all extensions have the same > signedness, that would be OK. There are cases that are OK where not all > extensions have the same signedness, e.g. (sign_extend:DI (mult:SI a b)) > where a and b are zero-extended from HImode or QImode, at least one from > QImode, though there the outer extension is equivalent to a > zero-extension.) So, you're saying that promoting a regular multiply to a widening multiply isn't a valid transformation anyway? I suppose that does make sense. I knew something was too easy. OK, I'll go try again. :) Andrew
On Wed, 25 May 2011, Andrew Stubbs wrote: > So, you're saying that promoting a regular multiply to a widening multiply > isn't a valid transformation anyway? I suppose that does make sense. I knew In general, yes. RTL always has modulo semantics (except for division and remainder by -1); all optimizations based on undefinedness of overflow (in the absence of -fwrapv) happen at tree/GIMPLE level, where signed and unsigned types are still distinct. (So you could promote a regular multiply of signed types at GIMPLE level in the absence of -fwrapv/-ftrapv, but not at RTL level and not for unsigned types at GIMPLE level.)
2011-05-24 Bernd Schmidt <bernds@codesourcery.com> Andrew Stubbs <ams@codesourcery.com> gcc/ * simplify-rtx.c (simplify_unary_operation_1): Create a new canonical form for widening multiplies. * doc/md.texi (Canonicalization of Instructions): Document widening multiply canonicalization. gcc/testsuite/ * gcc.target/arm/mla-2.c: New test. --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -5840,6 +5840,11 @@ Equality comparisons of a group of bits (usually a single bit) with zero will be written using @code{zero_extract} rather than the equivalent @code{and} or @code{sign_extract} operations. +@cindex @code{mult}, canonicalization of +@item +@code{(sign_extend:@var{m1} (mult:@var{m2} @var{x} @var{y}))} is converted +to @code{(mult:@var{m1} (sign_extend:@var{m1} @var{x}) (sign_extend:@var{m1} @var{y}))}, and likewise for @code{zero_extract}. + @end itemize Further canonicalization rules are defined in the function --- a/gcc/simplify-rtx.c +++ b/gcc/simplify-rtx.c @@ -1000,6 +1000,21 @@ simplify_unary_operation_1 (enum rtx_code code, enum machine_mode mode, rtx op) && GET_CODE (XEXP (XEXP (op, 0), 1)) == LABEL_REF) return XEXP (op, 0); + /* Convert (sign_extend (mult ..)) to a canonical widening + mulitplication (mult (sign_extend ..) (sign_extend ..)). */ + if (GET_CODE (op) == MULT && GET_MODE (op) < mode) + { + rtx lhs = XEXP (op, 0); + rtx rhs = XEXP (op, 1); + enum machine_mode lhs_mode = GET_MODE (lhs); + enum machine_mode rhs_mode = GET_MODE (rhs); + return simplify_gen_binary (MULT, mode, + simplify_gen_unary (SIGN_EXTEND, mode, + lhs, lhs_mode), + simplify_gen_unary (SIGN_EXTEND, mode, + rhs, rhs_mode)); + } + /* Check for a sign extension of a subreg of a promoted variable, where the promotion is sign-extended, and the target mode is the same as the variable's promotion. */ @@ -1071,6 +1086,21 @@ simplify_unary_operation_1 (enum rtx_code code, enum machine_mode mode, rtx op) && GET_MODE_SIZE (mode) <= GET_MODE_SIZE (GET_MODE (XEXP (op, 0)))) return rtl_hooks.gen_lowpart_no_emit (mode, op); + /* Convert (zero_extend (mult ..)) to a canonical widening + mulitplication (mult (zero_extend ..) (zero_extend ..)). */ + if (GET_CODE (op) == MULT && GET_MODE (op) < mode) + { + rtx lhs = XEXP (op, 0); + rtx rhs = XEXP (op, 1); + enum machine_mode lhs_mode = GET_MODE (lhs); + enum machine_mode rhs_mode = GET_MODE (rhs); + return simplify_gen_binary (MULT, mode, + simplify_gen_unary (ZERO_EXTEND, mode, + lhs, lhs_mode), + simplify_gen_unary (ZERO_EXTEND, mode, + rhs, rhs_mode)); + } + /* (zero_extend:M (zero_extend:N <X>)) is (zero_extend:M <X>). */ if (GET_CODE (op) == ZERO_EXTEND) return simplify_gen_unary (ZERO_EXTEND, mode, XEXP (op, 0), --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mla-2.c @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=armv7-a" } */ + +long long foolong (long long x, short *a, short *b) +{ + return x + *a * *b; +} + +/* { dg-final { scan-assembler "smlalbb" } } */