Message ID | 20160519111819.GA24787@linux.vnet.ibm.com |
---|---|
State | New |
Headers | show |
On 05/19/2016 05:18 AM, Dominik Vogt wrote: > On Mon, May 16, 2016 at 01:09:36PM -0600, Jeff Law wrote: >> > On 05/11/2016 02:52 AM, Dominik Vogt wrote: >>> > >On Wed, May 11, 2016 at 10:40:11AM +0200, Bernd Schmidt wrote: >>> > >That's what I mentioned somewhere during the discussion. The s390 >>> > >backend just uses COSTS_N_INSNS(1) for AND as well as ZERO_EXTEND, >>> > >so this won't ever trigger. I just left the rtx_cost call in the >>> > >patch for further discussion as Jeff said he liked the approach. >>> > >We don't need it to achieve the behaviour we want for s390. >> > I liked it, just based on the general theory that we should be >> > comparing costs of a transform to the original much more often than >> > we currently do. >> > >> > If Bernd prefers it gone and you don't need it to achieve your >> > goals, then I won't object to the costing stuff going away. > All right, third version attached, without the rtx_vost call; > bootstrapped and regression tested on s390, s390x, x86_64. > > On Wed, Apr 27, 2016 at 09:20:05AM +0100, Dominik Vogt wrote: >> > The attached patch is a result of discussing an S/390 issue with >> > "and with complement" in some cases. >> > >> > https://gcc.gnu.org/ml/gcc/2016-03/msg00163.html >> > https://gcc.gnu.org/ml/gcc-patches/2016-04/msg01586.html >> > >> > Combine would merge a ZERO_EXTEND and a SET taking the known zero >> > bits into account, resulting in an AND. Later on, >> > make_compound_operation() fails to replace that with a ZERO_EXTEND >> > which we get for free on S/390 but leaves the AND, eventually >> > resulting in two consecutive AND instructions. >> > >> > The current code in make_compound_operation() that detects >> > opportunities for ZERO_EXTEND does not work here because it does >> > not take the known zero bits into account: >> > >> > /* If the constant is one less than a power of two, this might be >> > representable by an extraction even if no shift is present. >> > If it doesn't end up being a ZERO_EXTEND, we will ignore it unless >> > we are in a COMPARE. */ >> > else if ((i = exact_log2 (UINTVAL (XEXP (x, 1)) + 1)) >= 0) >> > new_rtx = make_extraction (mode, >> > make_compound_operation (XEXP (x, 0), >> > next_code), >> > 0, NULL_RTX, i, 1, 0, in_code == COMPARE); >> > >> > An attempt to use the zero bits in the above conditions resulted >> > in many situations that generated worse code, so the patch tries >> > to fix this in a more conservative way. While the effect is >> > completely positive on S/390, this will very likely have >> > unforeseeable consequences on other targets. >> > >> > Bootstrapped and regression tested on s390 and s390x only at the >> > moment. > Ciao > > Dominik ^_^ ^_^ > > -- Dominik Vogt IBM Germany > > > 0001-v3-ChangeLog > > > gcc/ChangeLog > > * combine.c (make_compound_operation): Take known zero bits into > account when checking for possible zero_extend. > gcc/testsuite/ChangeLog > > * gcc.dg/zero_bits_compound-1.c: New test. > * gcc.dg/zero_bits_compound-2.c: New test. I'm a little worried about the tests. They check for lp64, but the tests actually need a stronger set of conditions to pass. I'm thinking that the tests ought to be opt-in as I don't think we have a set of effective-target tests we can use. So OK with the tests restricted to the targets where you've verified they work. Jeff
From dde191ad76255c0826a546b03f441af748edbd77 Mon Sep 17 00:00:00 2001 From: Dominik Vogt <vogt@linux.vnet.ibm.com> Date: Tue, 12 Apr 2016 09:53:46 +0100 Subject: [PATCH] Take known zero bits into account when checking extraction. Allows AND Insns with a const_int operand to be expressed as ZERO_EXTEND if the operand ist a power of 2 - 1 even with the known zero bits masked out. --- gcc/combine.c | 28 ++++++++++++++++++ gcc/testsuite/gcc.dg/zero_bits_compound-1.c | 44 +++++++++++++++++++++++++++++ gcc/testsuite/gcc.dg/zero_bits_compound-2.c | 39 +++++++++++++++++++++++++ 3 files changed, 111 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/zero_bits_compound-1.c create mode 100644 gcc/testsuite/gcc.dg/zero_bits_compound-2.c diff --git a/gcc/combine.c b/gcc/combine.c index 3554f51..97d59d7 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -7988,6 +7988,34 @@ make_compound_operation (rtx x, enum rtx_code in_code) next_code), i, NULL_RTX, 1, 1, 0, 1); + /* If the one operand is a paradoxical subreg of a register or memory and + the constant (limited to the smaller mode) has only zero bits where + the sub expression has known zero bits, this can be expressed as + a zero_extend. */ + else if (GET_CODE (XEXP (x, 0)) == SUBREG) + { + rtx sub; + + sub = XEXP (XEXP (x, 0), 0); + machine_mode sub_mode = GET_MODE (sub); + if ((REG_P (sub) || MEM_P (sub)) + && GET_MODE_PRECISION (sub_mode) < mode_width) + { + unsigned HOST_WIDE_INT mode_mask = GET_MODE_MASK (sub_mode); + unsigned HOST_WIDE_INT mask; + + /* original AND constant with all the known zero bits set */ + mask = UINTVAL (XEXP (x, 1)) | (~nonzero_bits (sub, sub_mode)); + if ((mask & mode_mask) == mode_mask) + { + new_rtx = make_compound_operation (sub, next_code); + new_rtx = make_extraction (mode, new_rtx, 0, 0, + GET_MODE_PRECISION (sub_mode), + 1, 0, in_code == COMPARE); + } + } + } + break; case LSHIFTRT: diff --git a/gcc/testsuite/gcc.dg/zero_bits_compound-1.c b/gcc/testsuite/gcc.dg/zero_bits_compound-1.c new file mode 100644 index 0000000..731907f --- /dev/null +++ b/gcc/testsuite/gcc.dg/zero_bits_compound-1.c @@ -0,0 +1,44 @@ +/* Test whether an AND mask or'ed with the know zero bits that equals a mode + mask is a candidate for zero extendion. */ + +/* Note: This test requires that char, int and long have different sizes and the + target has a way to do 32 -> 64 bit zero extension other than AND. Targets + that fail the test because they do not satisfy these preconditions can skip + it. */ + +/* { dg-do compile } */ +/* { dg-require-effective-target lp64 } */ +/* { dg-options "-O3 -dP" } */ + +unsigned long foo (unsigned char c) +{ + unsigned long l; + unsigned int i; + + i = ((unsigned int)c) << 8; + i |= ((unsigned int)c) << 20; + asm volatile ("":::); + i = i & 0x0ff0ff00; + asm volatile ("":::); + l = (unsigned long)i; + + return l; +} + +unsigned long bar (unsigned char c) +{ + unsigned long l; + unsigned int i; + + i = ((unsigned int)c) << 8; + i |= ((unsigned int)c) << 20; + asm volatile ("":::); + i = i & 0x0ffffff0; + asm volatile ("":::); + l = (unsigned long)i; + + return l; +} + +/* Check that no pattern containing an AND expression was used. */ +/* { dg-final { scan-assembler-not "\\(and:" } } */ diff --git a/gcc/testsuite/gcc.dg/zero_bits_compound-2.c b/gcc/testsuite/gcc.dg/zero_bits_compound-2.c new file mode 100644 index 0000000..a509966 --- /dev/null +++ b/gcc/testsuite/gcc.dg/zero_bits_compound-2.c @@ -0,0 +1,39 @@ +/* Test whether an AND mask or'ed with the know zero bits that equals a mode + mask is a candidate for zero extendion. */ + +/* { dg-do compile } */ +/* { dg-require-effective-target lp64 } */ +/* { dg-options "-O3 -dP" } */ + +unsigned long foo (unsigned char c) +{ + unsigned long l; + unsigned int i; + + i = ((unsigned int)c) << 8; + i |= ((unsigned int)c) << 20; + asm volatile ("":::); + i = i & 0x0fe0fe00; + asm volatile ("":::); + l = (unsigned long)i; + + return l; +} + +unsigned long bar (unsigned char c) +{ + unsigned long l; + unsigned int i; + + i = ((unsigned int)c) << 8; + i |= ((unsigned int)c) << 20; + asm volatile ("":::); + i = i & 0x07f007f0; + asm volatile ("":::); + l = (unsigned long)i; + + return l; +} + +/* Check that an AND expression was used. */ +/* { dg-final { scan-assembler-times "\\(and:" 2 } } */ -- 2.3.0