Message ID | 20110330160546.GP18914@tyan-ft48-01.lab.bos.redhat.com |
---|---|
State | New |
Headers | show |
On Wed, Mar 30, 2011 at 6:05 PM, Jakub Jelinek <jakub@redhat.com> wrote: > Hi! > > MEM_REFs which can represent type punning on lhs don't force > non-gimple types to be addressable. This causes various problems > in the expander, which wasn't prepared to handle that. > > This patch tries to fix what I've found and adds a bunch of > testcases. The original report was with just -O2 on some large testcase > from Eigen, most of the testcases have -fno-tree-sra just because > I've given up on delta when it stopped reducing at 32KB. > > The first problem (the one from Eigen) is _mm_store_pd into > a std::complex<double> var, which is a single field and thus > has DCmode TYPE_MODE. As starting with 4.6 that var is not > TREE_ADDRESSABLE, its DECL_RTL is a CONCAT, and for assignments > to concat expand_assignment was expecting either that > from has COMPLEX_TYPE (and matching mode to the store), or > that it is a real or imaginary subpart store, thus when > trying to store a V2DF mode value it expected it to be > real part store (bitpos == 0) and tried to set a DFmode pseudo > from V2DFmode rtx. > Further testing revealed that it is possible to hit many other > cases with CONCAT destination, it can be a store of just a few bits, > or can overlap parts of both real and imaginary, or be partially > out of bounds. > The patch handles the case from Eigen - bitpos == 0 bitsize == > GET_MODE_BITSIZE (GET_MODE (to_rtx)) non-COMPLEX_TYPE by setting > each half separately, if it is a store which is not touching > one of the parts by just adjusting bitpos for the imaginary > case and storing just to one of the parts (this is > the bitpos + bitsize < half_bitsize resp. bitpos >= half_bitsize > case) and finally adds a generic slow one for the very unusual > cases with partial overlap of both. > > After testing it with the testcases I wrote, I found a bunch of > other ICEs though, and reproduced them even without CONCAT > on the LHS (the testcases below which don't contain any _Complex > keyword). > > Bootstrapped/regtested on x86_64-linux and i686-linux, > regtested with a cross compiler on these new testcases also > for powerpc{,64}-linux and s390{,x}-linux. > > Ok for trunk and after a while for 4.6? Looks good to me. 2nd eyes welcome though. Thanks, Richard. > 2011-03-30 Jakub Jelinek <jakub@redhat.com> > > PR middle-end/48335 > * expr.c (expand_assignment): Handle all possibilities > if TO_RTX is CONCAT. > * expmed.c (store_bit_field_1): Avoid trying to create > invalid SUBREGs. > (store_split_bit_field): If SUBREG_REG (op0) or > op0 itself has smaller mode than word, return it > for offset 0 and const0_rtx for out of bounds stores. > If word is const0_rtx, skip it. > > * gcc.c-torture/compile/pr48335-1.c: New test. > * gcc.dg/pr48335-1.c: New test. > * gcc.dg/pr48335-2.c: New test. > * gcc.dg/pr48335-3.c: New test. > * gcc.dg/pr48335-4.c: New test. > * gcc.dg/pr48335-5.c: New test. > * gcc.dg/pr48335-6.c: New test. > * gcc.dg/pr48335-7.c: New test. > * gcc.dg/pr48335-8.c: New test. > * gcc.target/i386/pr48335-1.c: New test. > > --- gcc/expr.c.jj 2011-03-23 17:15:55.000000000 +0100 > +++ gcc/expr.c 2011-03-30 11:38:15.000000000 +0200 > @@ -4278,16 +4278,47 @@ expand_assignment (tree to, tree from, b > /* Handle expand_expr of a complex value returning a CONCAT. */ > else if (GET_CODE (to_rtx) == CONCAT) > { > - if (COMPLEX_MODE_P (TYPE_MODE (TREE_TYPE (from)))) > + unsigned short mode_bitsize = GET_MODE_BITSIZE (GET_MODE (to_rtx)); > + if (COMPLEX_MODE_P (TYPE_MODE (TREE_TYPE (from))) > + && bitpos == 0 > + && bitsize == mode_bitsize) > + result = store_expr (from, to_rtx, false, nontemporal); > + else if (bitsize == mode_bitsize / 2 > + && (bitpos == 0 || bitpos == GET_MODE_BITSIZE (mode1))) > + result = store_expr (from, XEXP (to_rtx, bitpos != 0), false, > + nontemporal); > + else if (bitpos + bitsize <= mode_bitsize / 2) > + result = store_field (XEXP (to_rtx, 0), bitsize, bitpos, > + mode1, from, TREE_TYPE (tem), > + get_alias_set (to), nontemporal); > + else if (bitpos >= mode_bitsize / 2) > + result = store_field (XEXP (to_rtx, 1), bitsize, > + bitpos - mode_bitsize / 2, mode1, from, > + TREE_TYPE (tem), get_alias_set (to), > + nontemporal); > + else if (bitpos == 0 && bitsize == mode_bitsize) > { > - gcc_assert (bitpos == 0); > - result = store_expr (from, to_rtx, false, nontemporal); > + rtx from_rtx = expand_normal (from); > + from_rtx = simplify_gen_subreg (GET_MODE (to_rtx), from_rtx, > + TYPE_MODE (TREE_TYPE (from)), 0); > + emit_move_insn (XEXP (to_rtx, 0), > + read_complex_part (from_rtx, false)); > + emit_move_insn (XEXP (to_rtx, 1), > + read_complex_part (from_rtx, true)); > + result = NULL; > } > else > { > - gcc_assert (bitpos == 0 || bitpos == GET_MODE_BITSIZE (mode1)); > - result = store_expr (from, XEXP (to_rtx, bitpos != 0), false, > - nontemporal); > + rtx temp = assign_stack_temp (GET_MODE (to_rtx), > + GET_MODE_SIZE (GET_MODE (to_rtx)), > + 0); > + write_complex_part (temp, XEXP (to_rtx, 0), false); > + write_complex_part (temp, XEXP (to_rtx, 1), true); > + result = store_field (temp, bitsize, bitpos, mode1, from, > + TREE_TYPE (tem), get_alias_set (to), > + nontemporal); > + emit_move_insn (XEXP (to_rtx, 0), read_complex_part (temp, false)); > + emit_move_insn (XEXP (to_rtx, 1), read_complex_part (temp, true)); > } > } > else > --- gcc/expmed.c.jj 2011-03-23 17:15:55.000000000 +0100 > +++ gcc/expmed.c 2011-03-30 13:37:54.000000000 +0200 > @@ -422,7 +422,10 @@ store_bit_field_1 (rtx str_rtx, unsigned > && byte_offset % GET_MODE_SIZE (fieldmode) == 0) > : (! SLOW_UNALIGNED_ACCESS (fieldmode, MEM_ALIGN (op0)) > || (offset * BITS_PER_UNIT % bitsize == 0 > - && MEM_ALIGN (op0) % GET_MODE_BITSIZE (fieldmode) == 0)))) > + && MEM_ALIGN (op0) % GET_MODE_BITSIZE (fieldmode) == 0))) > + && (MEM_P (op0) > + || GET_MODE (op0) == fieldmode > + || validate_subreg (fieldmode, GET_MODE (op0), op0, byte_offset))) > { > if (MEM_P (op0)) > op0 = adjust_address (op0, fieldmode, offset); > @@ -479,6 +482,7 @@ store_bit_field_1 (rtx str_rtx, unsigned > struct expand_operand ops[2]; > enum insn_code icode = optab_handler (movstrict_optab, fieldmode); > rtx arg0 = op0; > + unsigned HOST_WIDE_INT subreg_off; > > if (GET_CODE (arg0) == SUBREG) > { > @@ -491,15 +495,18 @@ store_bit_field_1 (rtx str_rtx, unsigned > arg0 = SUBREG_REG (arg0); > } > > - arg0 = gen_rtx_SUBREG (fieldmode, arg0, > - (bitnum % BITS_PER_WORD) / BITS_PER_UNIT > - + (offset * UNITS_PER_WORD)); > - > - create_fixed_operand (&ops[0], arg0); > - /* Shrink the source operand to FIELDMODE. */ > - create_convert_operand_to (&ops[1], value, fieldmode, false); > - if (maybe_expand_insn (icode, 2, ops)) > - return true; > + subreg_off = (bitnum % BITS_PER_WORD) / BITS_PER_UNIT > + + (offset * UNITS_PER_WORD); > + if (validate_subreg (fieldmode, GET_MODE (arg0), arg0, subreg_off)) > + { > + arg0 = gen_rtx_SUBREG (fieldmode, arg0, subreg_off); > + > + create_fixed_operand (&ops[0], arg0); > + /* Shrink the source operand to FIELDMODE. */ > + create_convert_operand_to (&ops[1], value, fieldmode, false); > + if (maybe_expand_insn (icode, 2, ops)) > + return true; > + } > } > > /* Handle fields bigger than a word. */ > @@ -1045,22 +1052,32 @@ store_split_bit_field (rtx op0, unsigned > if (GET_CODE (op0) == SUBREG) > { > int word_offset = (SUBREG_BYTE (op0) / UNITS_PER_WORD) + offset; > - word = operand_subword_force (SUBREG_REG (op0), word_offset, > - GET_MODE (SUBREG_REG (op0))); > + enum machine_mode sub_mode = GET_MODE (SUBREG_REG (op0)); > + if (sub_mode != BLKmode && GET_MODE_SIZE (sub_mode) < UNITS_PER_WORD) > + word = word_offset ? const0_rtx : op0; > + else > + word = operand_subword_force (SUBREG_REG (op0), word_offset, > + GET_MODE (SUBREG_REG (op0))); > offset = 0; > } > else if (REG_P (op0)) > { > - word = operand_subword_force (op0, offset, GET_MODE (op0)); > + enum machine_mode op0_mode = GET_MODE (op0); > + if (op0_mode != BLKmode && GET_MODE_SIZE (op0_mode) < UNITS_PER_WORD) > + word = offset ? const0_rtx : op0; > + else > + word = operand_subword_force (op0, offset, GET_MODE (op0)); > offset = 0; > } > else > word = op0; > > /* OFFSET is in UNITs, and UNIT is in bits. > - store_fixed_bit_field wants offset in bytes. */ > - store_fixed_bit_field (word, offset * unit / BITS_PER_UNIT, thissize, > - thispos, part); > + store_fixed_bit_field wants offset in bytes. If WORD is const0_rtx, > + it is jut an out of bounds access. Ignore it. */ > + if (word != const0_rtx) > + store_fixed_bit_field (word, offset * unit / BITS_PER_UNIT, thissize, > + thispos, part); > bitsdone += thissize; > } > } > --- gcc/testsuite/gcc.c-torture/compile/pr48335-1.c.jj 2011-03-30 11:32:24.000000000 +0200 > +++ gcc/testsuite/gcc.c-torture/compile/pr48335-1.c 2011-03-30 11:31:47.000000000 +0200 > @@ -0,0 +1,41 @@ > +/* PR middle-end/48335 */ > + > +struct S { float d; }; > + > +void bar (struct S); > + > +void > +f0 (int x) > +{ > + struct S s = {.d = 0.0f }; > + ((char *) &s.d)[0] = x; > + s.d *= 7.0; > + bar (s); > +} > + > +void > +f1 (int x) > +{ > + struct S s = {.d = 0.0f }; > + ((char *) &s.d)[1] = x; > + s.d *= 7.0; > + bar (s); > +} > + > +void > +f2 (int x) > +{ > + struct S s = {.d = 0.0f }; > + ((char *) &s.d)[2] = x; > + s.d *= 7.0; > + bar (s); > +} > + > +void > +f3 (int x) > +{ > + struct S s = {.d = 0.0f }; > + ((char *) &s.d)[3] = x; > + s.d *= 7.0; > + bar (s); > +} > --- gcc/testsuite/gcc.dg/pr48335-1.c.jj 2011-03-30 10:57:29.000000000 +0200 > +++ gcc/testsuite/gcc.dg/pr48335-1.c 2011-03-29 18:28:03.000000000 +0200 > @@ -0,0 +1,48 @@ > +/* PR middle-end/48335 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fno-tree-sra" } */ > + > +typedef long long T __attribute__((may_alias)); > + > +struct S > +{ > + _Complex float d __attribute__((aligned (8))); > +}; > + > +void bar (struct S); > + > +void > +f1 (T x) > +{ > + struct S s; > + *(T *) &s.d = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > + > +void > +f2 (int x) > +{ > + struct S s = { .d = 0.0f }; > + *(char *) &s.d = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > + > +void > +f3 (int x) > +{ > + struct S s = { .d = 0.0f }; > + ((char *) &s.d)[2] = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > + > +void > +f4 (int x, int y) > +{ > + struct S s = { .d = 0.0f }; > + ((char *) &s.d)[y] = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > --- gcc/testsuite/gcc.dg/pr48335-2.c.jj 2011-03-30 10:57:29.000000000 +0200 > +++ gcc/testsuite/gcc.dg/pr48335-2.c 2011-03-29 18:27:53.000000000 +0200 > @@ -0,0 +1,58 @@ > +/* PR middle-end/48335 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fno-tree-sra" } */ > + > +typedef long long T __attribute__((may_alias, aligned (1))); > +typedef short U __attribute__((may_alias, aligned (1))); > + > +struct S > +{ > + _Complex float d __attribute__((aligned (8))); > +}; > + > +void bar (struct S); > + > +void > +f1 (T x) > +{ > + struct S s; > + *(T *) ((char *) &s.d + 1) = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > + > +void > +f2 (int x) > +{ > + struct S s = { .d = 0.0f }; > + ((U *)((char *) &s.d + 1))[0] = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > + > +void > +f3 (int x) > +{ > + struct S s = { .d = 0.0f }; > + ((U *)((char *) &s.d + 1))[1] = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > + > +void > +f4 (int x) > +{ > + struct S s = { .d = 0.0f }; > + ((U *)((char *) &s.d + 1))[2] = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > + > +void > +f5 (int x) > +{ > + struct S s = { .d = 0.0f }; > + ((U *)((char *) &s.d + 1))[3] = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > --- gcc/testsuite/gcc.dg/pr48335-3.c.jj 2011-03-30 11:43:38.000000000 +0200 > +++ gcc/testsuite/gcc.dg/pr48335-3.c 2011-03-30 11:42:54.000000000 +0200 > @@ -0,0 +1,48 @@ > +/* PR middle-end/48335 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fno-tree-sra" } */ > + > +typedef short U __attribute__((may_alias, aligned (1))); > + > +struct S > +{ > + double d; > +}; > + > +void bar (struct S); > + > +void > +f1 (int x) > +{ > + struct S s = { .d = 0.0 }; > + ((U *)((char *) &s.d + 1))[0] = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > + > +void > +f2 (int x) > +{ > + struct S s = { .d = 0.0 }; > + ((U *)((char *) &s.d + 1))[1] = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > + > +void > +f3 (int x) > +{ > + struct S s = { .d = 0.0 }; > + ((U *)((char *) &s.d + 1))[2] = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > + > +void > +f4 (int x) > +{ > + struct S s = { .d = 0.0 }; > + ((U *)((char *) &s.d + 1))[3] = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > --- gcc/testsuite/gcc.dg/pr48335-4.c.jj 2011-03-30 12:59:11.000000000 +0200 > +++ gcc/testsuite/gcc.dg/pr48335-4.c 2011-03-30 12:55:51.000000000 +0200 > @@ -0,0 +1,39 @@ > +/* PR middle-end/48335 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fno-tree-sra" } */ > + > +typedef short U __attribute__((may_alias, aligned (1))); > + > +struct S > +{ > + double d; > +}; > + > +void bar (struct S); > + > +void > +f1 (int x) > +{ > + struct S s = { .d = 0.0 }; > + ((U *)((char *) &s.d + 1))[-1] = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > + > +void > +f2 (int x) > +{ > + struct S s = { .d = 0.0 }; > + ((U *)((char *) &s.d + 1))[-2] = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > + > +void > +f3 (int x) > +{ > + struct S s = { .d = 0.0 }; > + ((U *)((char *) &s.d + 1))[5] = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > --- gcc/testsuite/gcc.dg/pr48335-5.c.jj 2011-03-30 13:09:30.000000000 +0200 > +++ gcc/testsuite/gcc.dg/pr48335-5.c 2011-03-30 13:01:10.000000000 +0200 > @@ -0,0 +1,38 @@ > +/* PR middle-end/48335 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fno-tree-sra" } */ > + > +typedef long long T __attribute__((may_alias)); > + > +struct S > +{ > + _Complex float d __attribute__((aligned (8))); > +}; > + > +int > +f1 (struct S x) > +{ > + struct S s = x; > + return *(T *) &s.d; > +} > + > +int > +f2 (struct S x) > +{ > + struct S s = x; > + return *(char *) &s.d; > +} > + > +int > +f3 (struct S x) > +{ > + struct S s = x; > + return ((char *) &s.d)[2]; > +} > + > +int > +f4 (struct S x, int y) > +{ > + struct S s = x; > + return ((char *) &s.d)[y]; > +} > --- gcc/testsuite/gcc.dg/pr48335-6.c.jj 2011-03-30 13:09:30.000000000 +0200 > +++ gcc/testsuite/gcc.dg/pr48335-6.c 2011-03-30 13:04:58.000000000 +0200 > @@ -0,0 +1,46 @@ > +/* PR middle-end/48335 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fno-tree-sra" } */ > + > +typedef long long T __attribute__((may_alias, aligned (1))); > +typedef short U __attribute__((may_alias, aligned (1))); > + > +struct S > +{ > + _Complex float d __attribute__((aligned (8))); > +}; > + > +T > +f1 (struct S x) > +{ > + struct S s = x; > + return *(T *) ((char *) &s.d + 1); > +} > + > +int > +f2 (struct S x) > +{ > + struct S s = x; > + return ((U *)((char *) &s.d + 1))[0]; > +} > + > +int > +f3 (struct S x) > +{ > + struct S s = x; > + return ((U *)((char *) &s.d + 1))[1]; > +} > + > +int > +f4 (struct S x) > +{ > + struct S s = x; > + return ((U *)((char *) &s.d + 1))[2]; > +} > + > +int > +f5 (struct S x) > +{ > + struct S s = x; > + return ((U *)((char *) &s.d + 1))[3]; > +} > --- gcc/testsuite/gcc.dg/pr48335-7.c.jj 2011-03-30 13:09:30.000000000 +0200 > +++ gcc/testsuite/gcc.dg/pr48335-7.c 2011-03-30 13:07:17.000000000 +0200 > @@ -0,0 +1,38 @@ > +/* PR middle-end/48335 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fno-tree-sra" } */ > + > +typedef short U __attribute__((may_alias, aligned (1))); > + > +struct S > +{ > + double d; > +}; > + > +int > +f1 (struct S x) > +{ > + struct S s = x; > + return ((U *)((char *) &s.d + 1))[0]; > +} > + > +int > +f2 (struct S x) > +{ > + struct S s = x; > + return ((U *)((char *) &s.d + 1))[1]; > +} > + > +int > +f3 (struct S x) > +{ > + struct S s = x; > + return ((U *)((char *) &s.d + 1))[2]; > +} > + > +int > +f4 (struct S x) > +{ > + struct S s = x; > + return ((U *)((char *) &s.d + 1))[3]; > +} > --- gcc/testsuite/gcc.dg/pr48335-8.c.jj 2011-03-30 13:09:30.000000000 +0200 > +++ gcc/testsuite/gcc.dg/pr48335-8.c 2011-03-30 13:08:17.000000000 +0200 > @@ -0,0 +1,31 @@ > +/* PR middle-end/48335 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fno-tree-sra" } */ > + > +typedef short U __attribute__((may_alias, aligned (1))); > + > +struct S > +{ > + double d; > +}; > + > +int > +f1 (struct S x) > +{ > + struct S s = x; > + return ((U *)((char *) &s.d + 1))[-1]; > +} > + > +int > +f2 (struct S x) > +{ > + struct S s = x; > + return ((U *)((char *) &s.d + 1))[-2]; > +} > + > +int > +f3 (struct S x) > +{ > + struct S s = x; > + return ((U *)((char *) &s.d + 1))[5]; > +} > --- gcc/testsuite/gcc.target/i386/pr48335-1.c.jj 2011-03-30 10:57:16.000000000 +0200 > +++ gcc/testsuite/gcc.target/i386/pr48335-1.c 2011-03-29 18:28:12.000000000 +0200 > @@ -0,0 +1,32 @@ > +/* PR middle-end/48335 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fno-tree-sra -msse2" } */ > + > +#include <emmintrin.h> > + > +typedef __float128 T __attribute__((may_alias)); > + > +struct S > +{ > + _Complex double d __attribute__((aligned (16))); > +}; > + > +void bar (struct S); > + > +void > +f1 (T x) > +{ > + struct S s; > + *(T *) &s.d = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > + > +void > +f2 (__m128d x) > +{ > + struct S s; > + _mm_store_pd ((double *) &s.d, x); > + __real__ s.d *= 7.0; > + bar (s); > +} > > Jakub >
> --- gcc/expr.c.jj 2011-03-23 17:15:55.000000000 +0100 > +++ gcc/expr.c 2011-03-30 11:38:15.000000000 +0200 > @@ -4278,16 +4278,47 @@ expand_assignment (tree to, tree from, b > /* Handle expand_expr of a complex value returning a CONCAT. */ > else if (GET_CODE (to_rtx) == CONCAT) > { > - if (COMPLEX_MODE_P (TYPE_MODE (TREE_TYPE (from)))) > + unsigned short mode_bitsize = GET_MODE_BITSIZE (GET_MODE (to_rtx)); > + if (COMPLEX_MODE_P (TYPE_MODE (TREE_TYPE (from))) > + && bitpos == 0 > + && bitsize == mode_bitsize) > + result = store_expr (from, to_rtx, false, nontemporal); > + else if (bitsize == mode_bitsize / 2 > + && (bitpos == 0 || bitpos == GET_MODE_BITSIZE (mode1))) > + result = store_expr (from, XEXP (to_rtx, bitpos != 0), false, > + nontemporal); Why GET_MODE_BITSIZE (mode1) and not mode_bitsize / 2 here? > { > - gcc_assert (bitpos == 0 || bitpos == GET_MODE_BITSIZE (mode1)); > - result = store_expr (from, XEXP (to_rtx, bitpos != 0), false, > - nontemporal); > + rtx temp = assign_stack_temp (GET_MODE (to_rtx), > + GET_MODE_SIZE (GET_MODE (to_rtx)), > + 0); > + write_complex_part (temp, XEXP (to_rtx, 0), false); > + write_complex_part (temp, XEXP (to_rtx, 1), true); > + result = store_field (temp, bitsize, bitpos, mode1, from, > + TREE_TYPE (tem), get_alias_set (to), > + nontemporal); > + emit_move_insn (XEXP (to_rtx, 0), read_complex_part (temp, false)); > + emit_move_insn (XEXP (to_rtx, 1), read_complex_part (temp, true)); > } Can't you add result = NULL at the end of the block? > : (! SLOW_UNALIGNED_ACCESS (fieldmode, MEM_ALIGN (op0)) > : > || (offset * BITS_PER_UNIT % bitsize == 0 > > - && MEM_ALIGN (op0) % GET_MODE_BITSIZE (fieldmode) == 0)))) > + && MEM_ALIGN (op0) % GET_MODE_BITSIZE (fieldmode) == 0))) > + && (MEM_P (op0) > + || GET_MODE (op0) == fieldmode > + || validate_subreg (fieldmode, GET_MODE (op0), op0, byte_offset))) This partially duplicates the existing test. Can't the new code be retrofitted into the existing test? > @@ -1045,22 +1052,32 @@ store_split_bit_field (rtx op0, unsigned > if (GET_CODE (op0) == SUBREG) > { > int word_offset = (SUBREG_BYTE (op0) / UNITS_PER_WORD) + offset; > - word = operand_subword_force (SUBREG_REG (op0), word_offset, > - GET_MODE (SUBREG_REG (op0))); > + enum machine_mode sub_mode = GET_MODE (SUBREG_REG (op0)); > + if (sub_mode != BLKmode && GET_MODE_SIZE (sub_mode) < UNITS_PER_WORD) > + word = word_offset ? const0_rtx : op0; > + else > + word = operand_subword_force (SUBREG_REG (op0), word_offset, > + GET_MODE (SUBREG_REG (op0))); > offset = 0; > } > else if (REG_P (op0)) > { > - word = operand_subword_force (op0, offset, GET_MODE (op0)); > + enum machine_mode op0_mode = GET_MODE (op0); > + if (op0_mode != BLKmode && GET_MODE_SIZE (op0_mode) < UNITS_PER_WORD) > + word = offset ? const0_rtx : op0; > + else > + word = operand_subword_force (op0, offset, GET_MODE (op0)); > offset = 0; > } > else > word = op0; > > /* OFFSET is in UNITs, and UNIT is in bits. > - store_fixed_bit_field wants offset in bytes. */ > - store_fixed_bit_field (word, offset * unit / BITS_PER_UNIT, > thissize, - thispos, part); > + store_fixed_bit_field wants offset in bytes. If WORD is const0_rtx, > + it is jut an out of bounds access. Ignore it. */ > + if (word != const0_rtx) > + store_fixed_bit_field (word, offset * unit / BITS_PER_UNIT, thissize, > + thispos, part); > bitsdone += thissize; "it is just an out-of-bounds access."
On Fri, Apr 01, 2011 at 12:52:04PM +0200, Eric Botcazou wrote: > > --- gcc/expr.c.jj 2011-03-23 17:15:55.000000000 +0100 > > +++ gcc/expr.c 2011-03-30 11:38:15.000000000 +0200 > > @@ -4278,16 +4278,47 @@ expand_assignment (tree to, tree from, b > > /* Handle expand_expr of a complex value returning a CONCAT. */ > > else if (GET_CODE (to_rtx) == CONCAT) > > { > > - if (COMPLEX_MODE_P (TYPE_MODE (TREE_TYPE (from)))) > > + unsigned short mode_bitsize = GET_MODE_BITSIZE (GET_MODE (to_rtx)); > > + if (COMPLEX_MODE_P (TYPE_MODE (TREE_TYPE (from))) > > + && bitpos == 0 > > + && bitsize == mode_bitsize) > > + result = store_expr (from, to_rtx, false, nontemporal); > > + else if (bitsize == mode_bitsize / 2 > > + && (bitpos == 0 || bitpos == GET_MODE_BITSIZE (mode1))) > > + result = store_expr (from, XEXP (to_rtx, bitpos != 0), false, > > + nontemporal); > > Why GET_MODE_BITSIZE (mode1) and not mode_bitsize / 2 here? It should be mode_bitsize / 2 yeah, GET_MODE_BITSIZE (mode1) just came from the original code. Will change. > > { > > - gcc_assert (bitpos == 0 || bitpos == GET_MODE_BITSIZE (mode1)); > > - result = store_expr (from, XEXP (to_rtx, bitpos != 0), false, > > - nontemporal); > > + rtx temp = assign_stack_temp (GET_MODE (to_rtx), > > + GET_MODE_SIZE (GET_MODE (to_rtx)), > > + 0); > > + write_complex_part (temp, XEXP (to_rtx, 0), false); > > + write_complex_part (temp, XEXP (to_rtx, 1), true); > > + result = store_field (temp, bitsize, bitpos, mode1, from, > > + TREE_TYPE (tem), get_alias_set (to), > > + nontemporal); > > + emit_move_insn (XEXP (to_rtx, 0), read_complex_part (temp, false)); > > + emit_move_insn (XEXP (to_rtx, 1), read_complex_part (temp, true)); > > } > > Can't you add result = NULL at the end of the block? result is set there from store_field, and result is only used in if (result) preserve_temp_slots (result); afterwards. store_field might want to preserve_temp_slots perhaps, so clearing result afterwards might be wrong. > > : (! SLOW_UNALIGNED_ACCESS (fieldmode, MEM_ALIGN (op0)) > > : > > || (offset * BITS_PER_UNIT % bitsize == 0 > > > > - && MEM_ALIGN (op0) % GET_MODE_BITSIZE (fieldmode) == 0)))) > > + && MEM_ALIGN (op0) % GET_MODE_BITSIZE (fieldmode) == 0))) > > + && (MEM_P (op0) > > + || GET_MODE (op0) == fieldmode > > + || validate_subreg (fieldmode, GET_MODE (op0), op0, byte_offset))) > > This partially duplicates the existing test. Can't the new code be retrofitted > into the existing test? You mean just with the MEM_P test? So something like: @@ -418,8 +418,11 @@ store_bit_field_1 (rtx str_rtx, unsigned && bitsize == GET_MODE_BITSIZE (fieldmode) && (!MEM_P (op0) ? ((GET_MODE_SIZE (fieldmode) >= UNITS_PER_WORD - || GET_MODE_SIZE (GET_MODE (op0)) == GET_MODE_SIZE (fieldmode)) - && byte_offset % GET_MODE_SIZE (fieldmode) == 0) + || GET_MODE_SIZE (GET_MODE (op0)) == GET_MODE_SIZE (fieldmode)) + && byte_offset % GET_MODE_SIZE (fieldmode) == 0 + && (GET_MODE (op0) == fieldmode + || validate_subreg (fieldmode, GET_MODE (op0), op0, + byte_offset))) : (! SLOW_UNALIGNED_ACCESS (fieldmode, MEM_ALIGN (op0)) || (offset * BITS_PER_UNIT % bitsize == 0 && MEM_ALIGN (op0) % GET_MODE_BITSIZE (fieldmode) == 0)))) instead? > > /* OFFSET is in UNITs, and UNIT is in bits. > > - store_fixed_bit_field wants offset in bytes. */ > > - store_fixed_bit_field (word, offset * unit / BITS_PER_UNIT, > > thissize, - thispos, part); > > + store_fixed_bit_field wants offset in bytes. If WORD is const0_rtx, > > + it is jut an out of bounds access. Ignore it. */ > > + if (word != const0_rtx) > > + store_fixed_bit_field (word, offset * unit / BITS_PER_UNIT, thissize, > > + thispos, part); > > bitsdone += thissize; > > "it is just an out-of-bounds access." Ok, will change, thanks. Jakub
On Mar 30, 2011, at 9:05 AM, Jakub Jelinek <jakub@redhat.com> wrote: > MEM_REFs which can represent type punning on lhs don't force > non-gimple types to be addressable. This causes various problems > in the expander, which wasn't prepared to handle that. > > This patch tries to fix what I've found and adds a bunch of > testcases. The original report was with just -O2 on some large testcase > from Eigen, most of the testcases have -fno-tree-sra just because > I've given up on delta when it stopped reducing at 32KB. > > The first problem (the one from Eigen) is _mm_store_pd into > a std::complex<double> var, which is a single field and thus > has DCmode TYPE_MODE. As starting with 4.6 that var is not > TREE_ADDRESSABLE, its DECL_RTL is a CONCAT, and for assignments > to concat expand_assignment was expecting either that > from has COMPLEX_TYPE (and matching mode to the store), or > that it is a real or imaginary subpart store, thus when > trying to store a V2DF mode value it expected it to be > real part store (bitpos == 0) and tried to set a DFmode pseudo > from V2DFmode rtx. > Further testing revealed that it is possible to hit many other > cases with CONCAT destination, it can be a store of just a few bits, > or can overlap parts of both real and imaginary, or be partially > out of bounds. > The patch handles the case from Eigen - bitpos == 0 bitsize == > GET_MODE_BITSIZE (GET_MODE (to_rtx)) non-COMPLEX_TYPE by setting > each half separately, if it is a store which is not touching > one of the parts by just adjusting bitpos for the imaginary > case and storing just to one of the parts (this is > the bitpos + bitsize < half_bitsize resp. bitpos >= half_bitsize > case) and finally adds a generic slow one for the very unusual > cases with partial overlap of both. > > After testing it with the testcases I wrote, I found a bunch of > other ICEs though, and reproduced them even without CONCAT > on the LHS (the testcases below which don't contain any _Complex > keyword). > --- gcc/testsuite/gcc.dg/pr48335-2.c.jj 2011-03-30 10:57:29.000000000 +0200 > +++ gcc/testsuite/gcc.dg/pr48335-2.c 2011-03-29 18:27:53.000000000 +0200 > @@ -0,0 +1,58 @@ > +/* PR middle-end/48335 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fno-tree-sra" } */ > + > +typedef long long T __attribute__((may_alias, aligned (1))); > +typedef short U __attribute__((may_alias, aligned (1))); > + > +struct S > +{ > + _Complex float d __attribute__((aligned (8))); > +}; > + > +void bar (struct S); > + > +void > +f1 (T x) > +{ > + struct S s; > + *(T *) ((char *) &s.d + 1) = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > + > +void > +f2 (int x) > +{ > + struct S s = { .d = 0.0f }; > + ((U *)((char *) &s.d + 1))[0] = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > + > +void > +f3 (int x) > +{ > + struct S s = { .d = 0.0f }; > + ((U *)((char *) &s.d + 1))[1] = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > + > +void > +f4 (int x) > +{ > + struct S s = { .d = 0.0f }; > + ((U *)((char *) &s.d + 1))[2] = x; > + __real__ s.d *= 7.0; > + bar (s); > +} > + > +void > +f5 (int x) > +{ > + struct S s = { .d = 0.0f }; > + ((U *)((char *) &s.d + 1))[3] = x; > + __real__ s.d *= 7.0; > + bar (s); > +} So, this test case writes outside the bounds of s. I think it is a nice test case for a missed diagnostic of some form, but for code that is supposed to compile, kinda hate it. Thoughts?
--- gcc/expr.c.jj 2011-03-23 17:15:55.000000000 +0100 +++ gcc/expr.c 2011-03-30 11:38:15.000000000 +0200 @@ -4278,16 +4278,47 @@ expand_assignment (tree to, tree from, b /* Handle expand_expr of a complex value returning a CONCAT. */ else if (GET_CODE (to_rtx) == CONCAT) { - if (COMPLEX_MODE_P (TYPE_MODE (TREE_TYPE (from)))) + unsigned short mode_bitsize = GET_MODE_BITSIZE (GET_MODE (to_rtx)); + if (COMPLEX_MODE_P (TYPE_MODE (TREE_TYPE (from))) + && bitpos == 0 + && bitsize == mode_bitsize) + result = store_expr (from, to_rtx, false, nontemporal); + else if (bitsize == mode_bitsize / 2 + && (bitpos == 0 || bitpos == GET_MODE_BITSIZE (mode1))) + result = store_expr (from, XEXP (to_rtx, bitpos != 0), false, + nontemporal); + else if (bitpos + bitsize <= mode_bitsize / 2) + result = store_field (XEXP (to_rtx, 0), bitsize, bitpos, + mode1, from, TREE_TYPE (tem), + get_alias_set (to), nontemporal); + else if (bitpos >= mode_bitsize / 2) + result = store_field (XEXP (to_rtx, 1), bitsize, + bitpos - mode_bitsize / 2, mode1, from, + TREE_TYPE (tem), get_alias_set (to), + nontemporal); + else if (bitpos == 0 && bitsize == mode_bitsize) { - gcc_assert (bitpos == 0); - result = store_expr (from, to_rtx, false, nontemporal); + rtx from_rtx = expand_normal (from); + from_rtx = simplify_gen_subreg (GET_MODE (to_rtx), from_rtx, + TYPE_MODE (TREE_TYPE (from)), 0); + emit_move_insn (XEXP (to_rtx, 0), + read_complex_part (from_rtx, false)); + emit_move_insn (XEXP (to_rtx, 1), + read_complex_part (from_rtx, true)); + result = NULL; } else { - gcc_assert (bitpos == 0 || bitpos == GET_MODE_BITSIZE (mode1)); - result = store_expr (from, XEXP (to_rtx, bitpos != 0), false, - nontemporal); + rtx temp = assign_stack_temp (GET_MODE (to_rtx), + GET_MODE_SIZE (GET_MODE (to_rtx)), + 0); + write_complex_part (temp, XEXP (to_rtx, 0), false); + write_complex_part (temp, XEXP (to_rtx, 1), true); + result = store_field (temp, bitsize, bitpos, mode1, from, + TREE_TYPE (tem), get_alias_set (to), + nontemporal); + emit_move_insn (XEXP (to_rtx, 0), read_complex_part (temp, false)); + emit_move_insn (XEXP (to_rtx, 1), read_complex_part (temp, true)); } } else --- gcc/expmed.c.jj 2011-03-23 17:15:55.000000000 +0100 +++ gcc/expmed.c 2011-03-30 13:37:54.000000000 +0200 @@ -422,7 +422,10 @@ store_bit_field_1 (rtx str_rtx, unsigned && byte_offset % GET_MODE_SIZE (fieldmode) == 0) : (! SLOW_UNALIGNED_ACCESS (fieldmode, MEM_ALIGN (op0)) || (offset * BITS_PER_UNIT % bitsize == 0 - && MEM_ALIGN (op0) % GET_MODE_BITSIZE (fieldmode) == 0)))) + && MEM_ALIGN (op0) % GET_MODE_BITSIZE (fieldmode) == 0))) + && (MEM_P (op0) + || GET_MODE (op0) == fieldmode + || validate_subreg (fieldmode, GET_MODE (op0), op0, byte_offset))) { if (MEM_P (op0)) op0 = adjust_address (op0, fieldmode, offset); @@ -479,6 +482,7 @@ store_bit_field_1 (rtx str_rtx, unsigned struct expand_operand ops[2]; enum insn_code icode = optab_handler (movstrict_optab, fieldmode); rtx arg0 = op0; + unsigned HOST_WIDE_INT subreg_off; if (GET_CODE (arg0) == SUBREG) { @@ -491,15 +495,18 @@ store_bit_field_1 (rtx str_rtx, unsigned arg0 = SUBREG_REG (arg0); } - arg0 = gen_rtx_SUBREG (fieldmode, arg0, - (bitnum % BITS_PER_WORD) / BITS_PER_UNIT - + (offset * UNITS_PER_WORD)); - - create_fixed_operand (&ops[0], arg0); - /* Shrink the source operand to FIELDMODE. */ - create_convert_operand_to (&ops[1], value, fieldmode, false); - if (maybe_expand_insn (icode, 2, ops)) - return true; + subreg_off = (bitnum % BITS_PER_WORD) / BITS_PER_UNIT + + (offset * UNITS_PER_WORD); + if (validate_subreg (fieldmode, GET_MODE (arg0), arg0, subreg_off)) + { + arg0 = gen_rtx_SUBREG (fieldmode, arg0, subreg_off); + + create_fixed_operand (&ops[0], arg0); + /* Shrink the source operand to FIELDMODE. */ + create_convert_operand_to (&ops[1], value, fieldmode, false); + if (maybe_expand_insn (icode, 2, ops)) + return true; + } } /* Handle fields bigger than a word. */ @@ -1045,22 +1052,32 @@ store_split_bit_field (rtx op0, unsigned if (GET_CODE (op0) == SUBREG) { int word_offset = (SUBREG_BYTE (op0) / UNITS_PER_WORD) + offset; - word = operand_subword_force (SUBREG_REG (op0), word_offset, - GET_MODE (SUBREG_REG (op0))); + enum machine_mode sub_mode = GET_MODE (SUBREG_REG (op0)); + if (sub_mode != BLKmode && GET_MODE_SIZE (sub_mode) < UNITS_PER_WORD) + word = word_offset ? const0_rtx : op0; + else + word = operand_subword_force (SUBREG_REG (op0), word_offset, + GET_MODE (SUBREG_REG (op0))); offset = 0; } else if (REG_P (op0)) { - word = operand_subword_force (op0, offset, GET_MODE (op0)); + enum machine_mode op0_mode = GET_MODE (op0); + if (op0_mode != BLKmode && GET_MODE_SIZE (op0_mode) < UNITS_PER_WORD) + word = offset ? const0_rtx : op0; + else + word = operand_subword_force (op0, offset, GET_MODE (op0)); offset = 0; } else word = op0; /* OFFSET is in UNITs, and UNIT is in bits. - store_fixed_bit_field wants offset in bytes. */ - store_fixed_bit_field (word, offset * unit / BITS_PER_UNIT, thissize, - thispos, part); + store_fixed_bit_field wants offset in bytes. If WORD is const0_rtx, + it is jut an out of bounds access. Ignore it. */ + if (word != const0_rtx) + store_fixed_bit_field (word, offset * unit / BITS_PER_UNIT, thissize, + thispos, part); bitsdone += thissize; } } --- gcc/testsuite/gcc.c-torture/compile/pr48335-1.c.jj 2011-03-30 11:32:24.000000000 +0200 +++ gcc/testsuite/gcc.c-torture/compile/pr48335-1.c 2011-03-30 11:31:47.000000000 +0200 @@ -0,0 +1,41 @@ +/* PR middle-end/48335 */ + +struct S { float d; }; + +void bar (struct S); + +void +f0 (int x) +{ + struct S s = {.d = 0.0f }; + ((char *) &s.d)[0] = x; + s.d *= 7.0; + bar (s); +} + +void +f1 (int x) +{ + struct S s = {.d = 0.0f }; + ((char *) &s.d)[1] = x; + s.d *= 7.0; + bar (s); +} + +void +f2 (int x) +{ + struct S s = {.d = 0.0f }; + ((char *) &s.d)[2] = x; + s.d *= 7.0; + bar (s); +} + +void +f3 (int x) +{ + struct S s = {.d = 0.0f }; + ((char *) &s.d)[3] = x; + s.d *= 7.0; + bar (s); +} --- gcc/testsuite/gcc.dg/pr48335-1.c.jj 2011-03-30 10:57:29.000000000 +0200 +++ gcc/testsuite/gcc.dg/pr48335-1.c 2011-03-29 18:28:03.000000000 +0200 @@ -0,0 +1,48 @@ +/* PR middle-end/48335 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-tree-sra" } */ + +typedef long long T __attribute__((may_alias)); + +struct S +{ + _Complex float d __attribute__((aligned (8))); +}; + +void bar (struct S); + +void +f1 (T x) +{ + struct S s; + *(T *) &s.d = x; + __real__ s.d *= 7.0; + bar (s); +} + +void +f2 (int x) +{ + struct S s = { .d = 0.0f }; + *(char *) &s.d = x; + __real__ s.d *= 7.0; + bar (s); +} + +void +f3 (int x) +{ + struct S s = { .d = 0.0f }; + ((char *) &s.d)[2] = x; + __real__ s.d *= 7.0; + bar (s); +} + +void +f4 (int x, int y) +{ + struct S s = { .d = 0.0f }; + ((char *) &s.d)[y] = x; + __real__ s.d *= 7.0; + bar (s); +} --- gcc/testsuite/gcc.dg/pr48335-2.c.jj 2011-03-30 10:57:29.000000000 +0200 +++ gcc/testsuite/gcc.dg/pr48335-2.c 2011-03-29 18:27:53.000000000 +0200 @@ -0,0 +1,58 @@ +/* PR middle-end/48335 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-tree-sra" } */ + +typedef long long T __attribute__((may_alias, aligned (1))); +typedef short U __attribute__((may_alias, aligned (1))); + +struct S +{ + _Complex float d __attribute__((aligned (8))); +}; + +void bar (struct S); + +void +f1 (T x) +{ + struct S s; + *(T *) ((char *) &s.d + 1) = x; + __real__ s.d *= 7.0; + bar (s); +} + +void +f2 (int x) +{ + struct S s = { .d = 0.0f }; + ((U *)((char *) &s.d + 1))[0] = x; + __real__ s.d *= 7.0; + bar (s); +} + +void +f3 (int x) +{ + struct S s = { .d = 0.0f }; + ((U *)((char *) &s.d + 1))[1] = x; + __real__ s.d *= 7.0; + bar (s); +} + +void +f4 (int x) +{ + struct S s = { .d = 0.0f }; + ((U *)((char *) &s.d + 1))[2] = x; + __real__ s.d *= 7.0; + bar (s); +} + +void +f5 (int x) +{ + struct S s = { .d = 0.0f }; + ((U *)((char *) &s.d + 1))[3] = x; + __real__ s.d *= 7.0; + bar (s); +} --- gcc/testsuite/gcc.dg/pr48335-3.c.jj 2011-03-30 11:43:38.000000000 +0200 +++ gcc/testsuite/gcc.dg/pr48335-3.c 2011-03-30 11:42:54.000000000 +0200 @@ -0,0 +1,48 @@ +/* PR middle-end/48335 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-tree-sra" } */ + +typedef short U __attribute__((may_alias, aligned (1))); + +struct S +{ + double d; +}; + +void bar (struct S); + +void +f1 (int x) +{ + struct S s = { .d = 0.0 }; + ((U *)((char *) &s.d + 1))[0] = x; + __real__ s.d *= 7.0; + bar (s); +} + +void +f2 (int x) +{ + struct S s = { .d = 0.0 }; + ((U *)((char *) &s.d + 1))[1] = x; + __real__ s.d *= 7.0; + bar (s); +} + +void +f3 (int x) +{ + struct S s = { .d = 0.0 }; + ((U *)((char *) &s.d + 1))[2] = x; + __real__ s.d *= 7.0; + bar (s); +} + +void +f4 (int x) +{ + struct S s = { .d = 0.0 }; + ((U *)((char *) &s.d + 1))[3] = x; + __real__ s.d *= 7.0; + bar (s); +} --- gcc/testsuite/gcc.dg/pr48335-4.c.jj 2011-03-30 12:59:11.000000000 +0200 +++ gcc/testsuite/gcc.dg/pr48335-4.c 2011-03-30 12:55:51.000000000 +0200 @@ -0,0 +1,39 @@ +/* PR middle-end/48335 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-tree-sra" } */ + +typedef short U __attribute__((may_alias, aligned (1))); + +struct S +{ + double d; +}; + +void bar (struct S); + +void +f1 (int x) +{ + struct S s = { .d = 0.0 }; + ((U *)((char *) &s.d + 1))[-1] = x; + __real__ s.d *= 7.0; + bar (s); +} + +void +f2 (int x) +{ + struct S s = { .d = 0.0 }; + ((U *)((char *) &s.d + 1))[-2] = x; + __real__ s.d *= 7.0; + bar (s); +} + +void +f3 (int x) +{ + struct S s = { .d = 0.0 }; + ((U *)((char *) &s.d + 1))[5] = x; + __real__ s.d *= 7.0; + bar (s); +} --- gcc/testsuite/gcc.dg/pr48335-5.c.jj 2011-03-30 13:09:30.000000000 +0200 +++ gcc/testsuite/gcc.dg/pr48335-5.c 2011-03-30 13:01:10.000000000 +0200 @@ -0,0 +1,38 @@ +/* PR middle-end/48335 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-tree-sra" } */ + +typedef long long T __attribute__((may_alias)); + +struct S +{ + _Complex float d __attribute__((aligned (8))); +}; + +int +f1 (struct S x) +{ + struct S s = x; + return *(T *) &s.d; +} + +int +f2 (struct S x) +{ + struct S s = x; + return *(char *) &s.d; +} + +int +f3 (struct S x) +{ + struct S s = x; + return ((char *) &s.d)[2]; +} + +int +f4 (struct S x, int y) +{ + struct S s = x; + return ((char *) &s.d)[y]; +} --- gcc/testsuite/gcc.dg/pr48335-6.c.jj 2011-03-30 13:09:30.000000000 +0200 +++ gcc/testsuite/gcc.dg/pr48335-6.c 2011-03-30 13:04:58.000000000 +0200 @@ -0,0 +1,46 @@ +/* PR middle-end/48335 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-tree-sra" } */ + +typedef long long T __attribute__((may_alias, aligned (1))); +typedef short U __attribute__((may_alias, aligned (1))); + +struct S +{ + _Complex float d __attribute__((aligned (8))); +}; + +T +f1 (struct S x) +{ + struct S s = x; + return *(T *) ((char *) &s.d + 1); +} + +int +f2 (struct S x) +{ + struct S s = x; + return ((U *)((char *) &s.d + 1))[0]; +} + +int +f3 (struct S x) +{ + struct S s = x; + return ((U *)((char *) &s.d + 1))[1]; +} + +int +f4 (struct S x) +{ + struct S s = x; + return ((U *)((char *) &s.d + 1))[2]; +} + +int +f5 (struct S x) +{ + struct S s = x; + return ((U *)((char *) &s.d + 1))[3]; +} --- gcc/testsuite/gcc.dg/pr48335-7.c.jj 2011-03-30 13:09:30.000000000 +0200 +++ gcc/testsuite/gcc.dg/pr48335-7.c 2011-03-30 13:07:17.000000000 +0200 @@ -0,0 +1,38 @@ +/* PR middle-end/48335 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-tree-sra" } */ + +typedef short U __attribute__((may_alias, aligned (1))); + +struct S +{ + double d; +}; + +int +f1 (struct S x) +{ + struct S s = x; + return ((U *)((char *) &s.d + 1))[0]; +} + +int +f2 (struct S x) +{ + struct S s = x; + return ((U *)((char *) &s.d + 1))[1]; +} + +int +f3 (struct S x) +{ + struct S s = x; + return ((U *)((char *) &s.d + 1))[2]; +} + +int +f4 (struct S x) +{ + struct S s = x; + return ((U *)((char *) &s.d + 1))[3]; +} --- gcc/testsuite/gcc.dg/pr48335-8.c.jj 2011-03-30 13:09:30.000000000 +0200 +++ gcc/testsuite/gcc.dg/pr48335-8.c 2011-03-30 13:08:17.000000000 +0200 @@ -0,0 +1,31 @@ +/* PR middle-end/48335 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-tree-sra" } */ + +typedef short U __attribute__((may_alias, aligned (1))); + +struct S +{ + double d; +}; + +int +f1 (struct S x) +{ + struct S s = x; + return ((U *)((char *) &s.d + 1))[-1]; +} + +int +f2 (struct S x) +{ + struct S s = x; + return ((U *)((char *) &s.d + 1))[-2]; +} + +int +f3 (struct S x) +{ + struct S s = x; + return ((U *)((char *) &s.d + 1))[5]; +} --- gcc/testsuite/gcc.target/i386/pr48335-1.c.jj 2011-03-30 10:57:16.000000000 +0200 +++ gcc/testsuite/gcc.target/i386/pr48335-1.c 2011-03-29 18:28:12.000000000 +0200 @@ -0,0 +1,32 @@ +/* PR middle-end/48335 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-tree-sra -msse2" } */ + +#include <emmintrin.h> + +typedef __float128 T __attribute__((may_alias)); + +struct S +{ + _Complex double d __attribute__((aligned (16))); +}; + +void bar (struct S); + +void +f1 (T x) +{ + struct S s; + *(T *) &s.d = x; + __real__ s.d *= 7.0; + bar (s); +} + +void +f2 (__m128d x) +{ + struct S s; + _mm_store_pd ((double *) &s.d, x); + __real__ s.d *= 7.0; + bar (s); +}