Message ID | 5842907.YBOs7q9Hb0@hardin.shanghai.arm.com |
---|---|
State | New |
Headers | show |
On Tuesday, January 05, 2016 01:53:37 PM you wrote: > > Regression testsuite was run on a bootstrapped native x86_64-linux-gnu GCC > and on an arm-none-eabi GCC cross-compiler without any regression. I'm > waiting for a slot on gcc110 to do a big endian bootstrap but at least the > testcase works on mips-linux. I'll send an update once bootstrap is > complete. Bootstrap went fine on gcc110 with the following language enabled: c,c++,objc,obj-c++,java,fortran,ada,go,lto. Best regards, Thomas
On Tue, 5 Jan 2016, Thomas Preud'homme wrote: > Hi, > > bswap optimization pass generate wrong code on big endian targets when the > result of a bit operation it analyzed is a partial load of the range of memory > accessed by the original expression (when one or more bytes at lowest address > were lost in the computation). This is due to the way cmpxchg and cmpnop are > adjusted in find_bswap_or_nop before being compared to the result of the > symbolic expression. Part of the adjustment is endian independent: it's to > ignore the bytes that were not accessed by the original gimple expression. > However, when the result has less byte than that original expression, some > more byte need to be ignored and this is endian dependent. > > The current code only support loss of bytes at the highest addresses because > there is no code to adjust the address of the load. However, for little and > big endian targets the bytes at highest address translate into different byte > significance in the result. This patch first separate cmpxchg and cmpnop > adjustement into 2 steps and then deal with endianness correctly for the > second step. > > ChangeLog entries are as follow: > > > *** gcc/ChangeLog *** > > 2015-12-16 Thomas Preud'homme <thomas.preudhomme@arm.com> > > PR tree-optimization/67781 > * tree-ssa-math-opts.c (find_bswap_or_nop): Zero out bytes in cmpxchg > and cmpnop in two steps: first the ones not accessed in original > gimple expression in a endian independent way and then the ones not > accessed in the final result in an endian-specific way. > > > *** gcc/testsuite/ChangeLog *** > > 2015-12-16 Thomas Preud'homme <thomas.preudhomme@arm.com> > > PR tree-optimization/67781 > * gcc.c-torture/execute/pr67781.c: New file. > > > diff --git a/gcc/testsuite/gcc.c-torture/execute/pr67781.c > b/gcc/testsuite/gcc.c-torture/execute/pr67781.c > new file mode 100644 > index 0000000..bf50aa2 > --- /dev/null > +++ b/gcc/testsuite/gcc.c-torture/execute/pr67781.c > @@ -0,0 +1,34 @@ > +#ifdef __UINT32_TYPE__ > +typedef __UINT32_TYPE__ uint32_t; > +#else > +typedef unsigned uint32_t; > +#endif > + > +#ifdef __UINT8_TYPE__ > +typedef __UINT8_TYPE__ uint8_t; > +#else > +typedef unsigned char uint8_t; > +#endif > + > +struct > +{ > + uint32_t a; > + uint8_t b; > +} s = { 0x123456, 0x78 }; > + > +int pr67781() > +{ > + uint32_t c = (s.a << 8) | s.b; > + return c; > +} > + > +int > +main () > +{ > + if (sizeof (uint32_t) * __CHAR_BIT__ != 32) > + return 0; > + > + if (pr67781 () != 0x12345678) > + __builtin_abort (); > + return 0; > +} > diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c > index b00f046..e5a185f 100644 > --- a/gcc/tree-ssa-math-opts.c > +++ b/gcc/tree-ssa-math-opts.c > @@ -2441,6 +2441,8 @@ find_bswap_or_nop_1 (gimple *stmt, struct symbolic_number > *n, int limit) > static gimple * > find_bswap_or_nop (gimple *stmt, struct symbolic_number *n, bool *bswap) > { > + unsigned rsize; > + uint64_t tmpn, mask; > /* The number which the find_bswap_or_nop_1 result should match in order > to have a full byte swap. The number is shifted to the right > according to the size of the symbolic number before using it. */ > @@ -2464,24 +2466,38 @@ find_bswap_or_nop (gimple *stmt, struct symbolic_number > *n, bool *bswap) > > /* Find real size of result (highest non-zero byte). */ > if (n->base_addr) > - { > - int rsize; > - uint64_t tmpn; > - > - for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, rsize++); > - n->range = rsize; > - } > + for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, rsize++); > + else > + rsize = n->range; > > - /* Zero out the extra bits of N and CMP*. */ > + /* Zero out the bits corresponding to untouched bytes in original gimple > + expression. */ > if (n->range < (int) sizeof (int64_t)) > { > - uint64_t mask; > - > mask = ((uint64_t) 1 << (n->range * BITS_PER_MARKER)) - 1; > cmpxchg >>= (64 / BITS_PER_MARKER - n->range) * BITS_PER_MARKER; > cmpnop &= mask; > } > > + /* Zero out the bits corresponding to unused bytes in the result of the > + gimple expression. */ > + if (rsize < n->range) > + { > + if (BYTES_BIG_ENDIAN) > + { > + mask = ((uint64_t) 1 << (rsize * BITS_PER_MARKER)) - 1; > + cmpxchg &= mask; > + cmpnop >>= (n->range - rsize) * BITS_PER_MARKER; > + } > + else > + { > + mask = ((uint64_t) 1 << (rsize * BITS_PER_MARKER)) - 1; > + cmpxchg >>= (n->range - rsize) * BITS_PER_MARKER; > + cmpnop &= mask; > + } > + n->range = rsize; > + } > + > /* A complete byte swap should make the symbolic number to start with > the largest digit in the highest order byte. Unchanged symbolic > number indicates a read with same endianness as target architecture. */ > > > > Regression testsuite was run on a bootstrapped native x86_64-linux-gnu GCC and > on an arm-none-eabi GCC cross-compiler without any regression. I'm waiting for > a slot on gcc110 to do a big endian bootstrap but at least the testcase works > on mips-linux. I'll send an update once bootstrap is complete. > > Is this ok for trunk and 5 branch in a week time if no regression is reported? Yes. Thanks, Richar. > Best regards, > > Thomas > >
On Friday, January 08, 2016 10:05:25 AM Richard Biener wrote: > On Tue, 5 Jan 2016, Thomas Preud'homme wrote: > > Hi, > > > > bswap optimization pass generate wrong code on big endian targets when the > > result of a bit operation it analyzed is a partial load of the range of > > memory accessed by the original expression (when one or more bytes at > > lowest address were lost in the computation). This is due to the way > > cmpxchg and cmpnop are adjusted in find_bswap_or_nop before being > > compared to the result of the symbolic expression. Part of the adjustment > > is endian independent: it's to ignore the bytes that were not accessed by > > the original gimple expression. However, when the result has less byte > > than that original expression, some more byte need to be ignored and this > > is endian dependent. > > > > The current code only support loss of bytes at the highest addresses > > because there is no code to adjust the address of the load. However, for > > little and big endian targets the bytes at highest address translate into > > different byte significance in the result. This patch first separate > > cmpxchg and cmpnop adjustement into 2 steps and then deal with endianness > > correctly for the second step. > > > > ChangeLog entries are as follow: > > > > > > *** gcc/ChangeLog *** > > > > 2015-12-16 Thomas Preud'homme <thomas.preudhomme@arm.com> > > > > PR tree-optimization/67781 > > * tree-ssa-math-opts.c (find_bswap_or_nop): Zero out bytes in > > cmpxchg > > and cmpnop in two steps: first the ones not accessed in original > > gimple expression in a endian independent way and then the ones > > not > > accessed in the final result in an endian-specific way. > > > > *** gcc/testsuite/ChangeLog *** > > > > 2015-12-16 Thomas Preud'homme <thomas.preudhomme@arm.com> > > > > PR tree-optimization/67781 > > * gcc.c-torture/execute/pr67781.c: New file. > > > > diff --git a/gcc/testsuite/gcc.c-torture/execute/pr67781.c > > b/gcc/testsuite/gcc.c-torture/execute/pr67781.c > > new file mode 100644 > > index 0000000..bf50aa2 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.c-torture/execute/pr67781.c > > @@ -0,0 +1,34 @@ > > +#ifdef __UINT32_TYPE__ > > +typedef __UINT32_TYPE__ uint32_t; > > +#else > > +typedef unsigned uint32_t; > > +#endif > > + > > +#ifdef __UINT8_TYPE__ > > +typedef __UINT8_TYPE__ uint8_t; > > +#else > > +typedef unsigned char uint8_t; > > +#endif > > + > > +struct > > +{ > > + uint32_t a; > > + uint8_t b; > > +} s = { 0x123456, 0x78 }; > > + > > +int pr67781() > > +{ > > + uint32_t c = (s.a << 8) | s.b; > > + return c; > > +} > > + > > +int > > +main () > > +{ > > + if (sizeof (uint32_t) * __CHAR_BIT__ != 32) > > + return 0; > > + > > + if (pr67781 () != 0x12345678) > > + __builtin_abort (); > > + return 0; > > +} > > diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c > > index b00f046..e5a185f 100644 > > --- a/gcc/tree-ssa-math-opts.c > > +++ b/gcc/tree-ssa-math-opts.c > > @@ -2441,6 +2441,8 @@ find_bswap_or_nop_1 (gimple *stmt, struct > > symbolic_number *n, int limit) > > > > static gimple * > > find_bswap_or_nop (gimple *stmt, struct symbolic_number *n, bool *bswap) > > { > > > > + unsigned rsize; > > + uint64_t tmpn, mask; > > > > /* The number which the find_bswap_or_nop_1 result should match in order > > > > to have a full byte swap. The number is shifted to the right > > according to the size of the symbolic number before using it. */ > > > > @@ -2464,24 +2466,38 @@ find_bswap_or_nop (gimple *stmt, struct > > symbolic_number *n, bool *bswap) > > > > /* Find real size of result (highest non-zero byte). */ > > if (n->base_addr) > > > > - { > > - int rsize; > > - uint64_t tmpn; > > - > > - for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, > > rsize++); - n->range = rsize; > > - } > > + for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, > > rsize++); > > + else > > + rsize = n->range; > > > > - /* Zero out the extra bits of N and CMP*. */ > > + /* Zero out the bits corresponding to untouched bytes in original > > gimple > > + expression. */ > > > > if (n->range < (int) sizeof (int64_t)) > > > > { > > > > - uint64_t mask; > > - > > > > mask = ((uint64_t) 1 << (n->range * BITS_PER_MARKER)) - 1; > > cmpxchg >>= (64 / BITS_PER_MARKER - n->range) * BITS_PER_MARKER; > > cmpnop &= mask; > > > > } > > > > + /* Zero out the bits corresponding to unused bytes in the result of the > > + gimple expression. */ > > + if (rsize < n->range) > > + { > > + if (BYTES_BIG_ENDIAN) > > + { > > + mask = ((uint64_t) 1 << (rsize * BITS_PER_MARKER)) - 1; > > + cmpxchg &= mask; > > + cmpnop >>= (n->range - rsize) * BITS_PER_MARKER; > > + } > > + else > > + { > > + mask = ((uint64_t) 1 << (rsize * BITS_PER_MARKER)) - 1; > > + cmpxchg >>= (n->range - rsize) * BITS_PER_MARKER; > > + cmpnop &= mask; > > + } > > + n->range = rsize; > > + } > > + > > > > /* A complete byte swap should make the symbolic number to start with > > > > the largest digit in the highest order byte. Unchanged symbolic > > number indicates a read with same endianness as target architecture. > > */ > > > > Regression testsuite was run on a bootstrapped native x86_64-linux-gnu GCC > > and on an arm-none-eabi GCC cross-compiler without any regression. I'm > > waiting for a slot on gcc110 to do a big endian bootstrap but at least > > the testcase works on mips-linux. I'll send an update once bootstrap is > > complete. > > > > Is this ok for trunk and 5 branch in a week time if no regression is > > reported? > Yes. What about a backport to GCC 5? The patch applies cleanly and shows no regression when running the testsuite with an arm-none-eabi GCC cross-compiler and a natively bootstrapped x86_64-linux-gnu GCC compiler. It also pass the test added in the patch with a nativly bootstrapped powerpc64-unknown-linux-gnu GCC compiler (on gcc110 in compile farm). Best regards, Thomas
On Thu, 21 Jan 2016, Thomas Preud'homme wrote: > On Friday, January 08, 2016 10:05:25 AM Richard Biener wrote: > > On Tue, 5 Jan 2016, Thomas Preud'homme wrote: > > > Hi, > > > > > > bswap optimization pass generate wrong code on big endian targets when the > > > result of a bit operation it analyzed is a partial load of the range of > > > memory accessed by the original expression (when one or more bytes at > > > lowest address were lost in the computation). This is due to the way > > > cmpxchg and cmpnop are adjusted in find_bswap_or_nop before being > > > compared to the result of the symbolic expression. Part of the adjustment > > > is endian independent: it's to ignore the bytes that were not accessed by > > > the original gimple expression. However, when the result has less byte > > > than that original expression, some more byte need to be ignored and this > > > is endian dependent. > > > > > > The current code only support loss of bytes at the highest addresses > > > because there is no code to adjust the address of the load. However, for > > > little and big endian targets the bytes at highest address translate into > > > different byte significance in the result. This patch first separate > > > cmpxchg and cmpnop adjustement into 2 steps and then deal with endianness > > > correctly for the second step. > > > > > > ChangeLog entries are as follow: > > > > > > > > > *** gcc/ChangeLog *** > > > > > > 2015-12-16 Thomas Preud'homme <thomas.preudhomme@arm.com> > > > > > > PR tree-optimization/67781 > > > * tree-ssa-math-opts.c (find_bswap_or_nop): Zero out bytes in > > > cmpxchg > > > and cmpnop in two steps: first the ones not accessed in original > > > gimple expression in a endian independent way and then the ones > > > not > > > accessed in the final result in an endian-specific way. > > > > > > *** gcc/testsuite/ChangeLog *** > > > > > > 2015-12-16 Thomas Preud'homme <thomas.preudhomme@arm.com> > > > > > > PR tree-optimization/67781 > > > * gcc.c-torture/execute/pr67781.c: New file. > > > > > > diff --git a/gcc/testsuite/gcc.c-torture/execute/pr67781.c > > > b/gcc/testsuite/gcc.c-torture/execute/pr67781.c > > > new file mode 100644 > > > index 0000000..bf50aa2 > > > --- /dev/null > > > +++ b/gcc/testsuite/gcc.c-torture/execute/pr67781.c > > > @@ -0,0 +1,34 @@ > > > +#ifdef __UINT32_TYPE__ > > > +typedef __UINT32_TYPE__ uint32_t; > > > +#else > > > +typedef unsigned uint32_t; > > > +#endif > > > + > > > +#ifdef __UINT8_TYPE__ > > > +typedef __UINT8_TYPE__ uint8_t; > > > +#else > > > +typedef unsigned char uint8_t; > > > +#endif > > > + > > > +struct > > > +{ > > > + uint32_t a; > > > + uint8_t b; > > > +} s = { 0x123456, 0x78 }; > > > + > > > +int pr67781() > > > +{ > > > + uint32_t c = (s.a << 8) | s.b; > > > + return c; > > > +} > > > + > > > +int > > > +main () > > > +{ > > > + if (sizeof (uint32_t) * __CHAR_BIT__ != 32) > > > + return 0; > > > + > > > + if (pr67781 () != 0x12345678) > > > + __builtin_abort (); > > > + return 0; > > > +} > > > diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c > > > index b00f046..e5a185f 100644 > > > --- a/gcc/tree-ssa-math-opts.c > > > +++ b/gcc/tree-ssa-math-opts.c > > > @@ -2441,6 +2441,8 @@ find_bswap_or_nop_1 (gimple *stmt, struct > > > symbolic_number *n, int limit) > > > > > > static gimple * > > > find_bswap_or_nop (gimple *stmt, struct symbolic_number *n, bool *bswap) > > > { > > > > > > + unsigned rsize; > > > + uint64_t tmpn, mask; > > > > > > /* The number which the find_bswap_or_nop_1 result should match in order > > > > > > to have a full byte swap. The number is shifted to the right > > > according to the size of the symbolic number before using it. */ > > > > > > @@ -2464,24 +2466,38 @@ find_bswap_or_nop (gimple *stmt, struct > > > symbolic_number *n, bool *bswap) > > > > > > /* Find real size of result (highest non-zero byte). */ > > > if (n->base_addr) > > > > > > - { > > > - int rsize; > > > - uint64_t tmpn; > > > - > > > - for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, > > > rsize++); - n->range = rsize; > > > - } > > > + for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, > > > rsize++); > > > + else > > > + rsize = n->range; > > > > > > - /* Zero out the extra bits of N and CMP*. */ > > > + /* Zero out the bits corresponding to untouched bytes in original > > > gimple > > > + expression. */ > > > > > > if (n->range < (int) sizeof (int64_t)) > > > > > > { > > > > > > - uint64_t mask; > > > - > > > > > > mask = ((uint64_t) 1 << (n->range * BITS_PER_MARKER)) - 1; > > > cmpxchg >>= (64 / BITS_PER_MARKER - n->range) * BITS_PER_MARKER; > > > cmpnop &= mask; > > > > > > } > > > > > > + /* Zero out the bits corresponding to unused bytes in the result of the > > > + gimple expression. */ > > > + if (rsize < n->range) > > > + { > > > + if (BYTES_BIG_ENDIAN) > > > + { > > > + mask = ((uint64_t) 1 << (rsize * BITS_PER_MARKER)) - 1; > > > + cmpxchg &= mask; > > > + cmpnop >>= (n->range - rsize) * BITS_PER_MARKER; > > > + } > > > + else > > > + { > > > + mask = ((uint64_t) 1 << (rsize * BITS_PER_MARKER)) - 1; > > > + cmpxchg >>= (n->range - rsize) * BITS_PER_MARKER; > > > + cmpnop &= mask; > > > + } > > > + n->range = rsize; > > > + } > > > + > > > > > > /* A complete byte swap should make the symbolic number to start with > > > > > > the largest digit in the highest order byte. Unchanged symbolic > > > number indicates a read with same endianness as target architecture. > > > */ > > > > > > Regression testsuite was run on a bootstrapped native x86_64-linux-gnu GCC > > > and on an arm-none-eabi GCC cross-compiler without any regression. I'm > > > waiting for a slot on gcc110 to do a big endian bootstrap but at least > > > the testcase works on mips-linux. I'll send an update once bootstrap is > > > complete. > > > > > > Is this ok for trunk and 5 branch in a week time if no regression is > > > reported? > > Yes. > > What about a backport to GCC 5? The patch applies cleanly and shows no > regression when running the testsuite with an arm-none-eabi GCC cross-compiler > and a natively bootstrapped x86_64-linux-gnu GCC compiler. > > It also pass the test added in the patch with a nativly bootstrapped > powerpc64-unknown-linux-gnu GCC compiler (on gcc110 in compile farm). The "Yes" above approved a backport. Richard.
On Thursday, January 21, 2016 09:21:52 AM Richard Biener wrote: > On Thu, 21 Jan 2016, Thomas Preud'homme wrote: > > On Friday, January 08, 2016 10:05:25 AM Richard Biener wrote: > > > On Tue, 5 Jan 2016, Thomas Preud'homme wrote: > > > > Hi, > > > > > > > > bswap optimization pass generate wrong code on big endian targets when > > > > the > > > > result of a bit operation it analyzed is a partial load of the range > > > > of > > > > memory accessed by the original expression (when one or more bytes at > > > > lowest address were lost in the computation). This is due to the way > > > > cmpxchg and cmpnop are adjusted in find_bswap_or_nop before being > > > > compared to the result of the symbolic expression. Part of the > > > > adjustment > > > > is endian independent: it's to ignore the bytes that were not accessed > > > > by > > > > the original gimple expression. However, when the result has less byte > > > > than that original expression, some more byte need to be ignored and > > > > this > > > > is endian dependent. > > > > > > > > The current code only support loss of bytes at the highest addresses > > > > because there is no code to adjust the address of the load. However, > > > > for > > > > little and big endian targets the bytes at highest address translate > > > > into > > > > different byte significance in the result. This patch first separate > > > > cmpxchg and cmpnop adjustement into 2 steps and then deal with > > > > endianness > > > > correctly for the second step. > > > > > > > > ChangeLog entries are as follow: > > > > > > > > > > > > *** gcc/ChangeLog *** > > > > > > > > 2015-12-16 Thomas Preud'homme <thomas.preudhomme@arm.com> > > > > > > > > PR tree-optimization/67781 > > > > * tree-ssa-math-opts.c (find_bswap_or_nop): Zero out bytes in > > > > cmpxchg > > > > and cmpnop in two steps: first the ones not accessed in > > > > original > > > > gimple expression in a endian independent way and then the > > > > ones > > > > not > > > > accessed in the final result in an endian-specific way. > > > > > > > > *** gcc/testsuite/ChangeLog *** > > > > > > > > 2015-12-16 Thomas Preud'homme <thomas.preudhomme@arm.com> > > > > > > > > PR tree-optimization/67781 > > > > * gcc.c-torture/execute/pr67781.c: New file. > > > > > > > > diff --git a/gcc/testsuite/gcc.c-torture/execute/pr67781.c > > > > b/gcc/testsuite/gcc.c-torture/execute/pr67781.c > > > > new file mode 100644 > > > > index 0000000..bf50aa2 > > > > --- /dev/null > > > > +++ b/gcc/testsuite/gcc.c-torture/execute/pr67781.c > > > > @@ -0,0 +1,34 @@ > > > > +#ifdef __UINT32_TYPE__ > > > > +typedef __UINT32_TYPE__ uint32_t; > > > > +#else > > > > +typedef unsigned uint32_t; > > > > +#endif > > > > + > > > > +#ifdef __UINT8_TYPE__ > > > > +typedef __UINT8_TYPE__ uint8_t; > > > > +#else > > > > +typedef unsigned char uint8_t; > > > > +#endif > > > > + > > > > +struct > > > > +{ > > > > + uint32_t a; > > > > + uint8_t b; > > > > +} s = { 0x123456, 0x78 }; > > > > + > > > > +int pr67781() > > > > +{ > > > > + uint32_t c = (s.a << 8) | s.b; > > > > + return c; > > > > +} > > > > + > > > > +int > > > > +main () > > > > +{ > > > > + if (sizeof (uint32_t) * __CHAR_BIT__ != 32) > > > > + return 0; > > > > + > > > > + if (pr67781 () != 0x12345678) > > > > + __builtin_abort (); > > > > + return 0; > > > > +} > > > > diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c > > > > index b00f046..e5a185f 100644 > > > > --- a/gcc/tree-ssa-math-opts.c > > > > +++ b/gcc/tree-ssa-math-opts.c > > > > @@ -2441,6 +2441,8 @@ find_bswap_or_nop_1 (gimple *stmt, struct > > > > symbolic_number *n, int limit) > > > > > > > > static gimple * > > > > find_bswap_or_nop (gimple *stmt, struct symbolic_number *n, bool > > > > *bswap) > > > > { > > > > > > > > + unsigned rsize; > > > > + uint64_t tmpn, mask; > > > > > > > > /* The number which the find_bswap_or_nop_1 result should match in > > > > order > > > > > > > > to have a full byte swap. The number is shifted to the right > > > > according to the size of the symbolic number before using it. */ > > > > > > > > @@ -2464,24 +2466,38 @@ find_bswap_or_nop (gimple *stmt, struct > > > > symbolic_number *n, bool *bswap) > > > > > > > > /* Find real size of result (highest non-zero byte). */ > > > > if (n->base_addr) > > > > > > > > - { > > > > - int rsize; > > > > - uint64_t tmpn; > > > > - > > > > - for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, > > > > rsize++); - n->range = rsize; > > > > - } > > > > + for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, > > > > rsize++); > > > > + else > > > > + rsize = n->range; > > > > > > > > - /* Zero out the extra bits of N and CMP*. */ > > > > + /* Zero out the bits corresponding to untouched bytes in original > > > > gimple > > > > + expression. */ > > > > > > > > if (n->range < (int) sizeof (int64_t)) > > > > > > > > { > > > > > > > > - uint64_t mask; > > > > - > > > > > > > > mask = ((uint64_t) 1 << (n->range * BITS_PER_MARKER)) - 1; > > > > cmpxchg >>= (64 / BITS_PER_MARKER - n->range) * > > > > BITS_PER_MARKER; > > > > cmpnop &= mask; > > > > > > > > } > > > > > > > > + /* Zero out the bits corresponding to unused bytes in the result of > > > > the > > > > + gimple expression. */ > > > > + if (rsize < n->range) > > > > + { > > > > + if (BYTES_BIG_ENDIAN) > > > > + { > > > > + mask = ((uint64_t) 1 << (rsize * BITS_PER_MARKER)) - 1; > > > > + cmpxchg &= mask; > > > > + cmpnop >>= (n->range - rsize) * BITS_PER_MARKER; > > > > + } > > > > + else > > > > + { > > > > + mask = ((uint64_t) 1 << (rsize * BITS_PER_MARKER)) - 1; > > > > + cmpxchg >>= (n->range - rsize) * BITS_PER_MARKER; > > > > + cmpnop &= mask; > > > > + } > > > > + n->range = rsize; > > > > + } > > > > + > > > > > > > > /* A complete byte swap should make the symbolic number to start > > > > with > > > > > > > > the largest digit in the highest order byte. Unchanged symbolic > > > > number indicates a read with same endianness as target > > > > architecture. > > > > > > > > */ > > > > > > > > Regression testsuite was run on a bootstrapped native x86_64-linux-gnu > > > > GCC > > > > and on an arm-none-eabi GCC cross-compiler without any regression. I'm > > > > waiting for a slot on gcc110 to do a big endian bootstrap but at least > > > > the testcase works on mips-linux. I'll send an update once bootstrap > > > > is > > > > complete. > > > > > > > > Is this ok for trunk and 5 branch in a week time if no regression is > > > > reported? > > > > > > Yes. > > > > What about a backport to GCC 5? The patch applies cleanly and shows no > > regression when running the testsuite with an arm-none-eabi GCC > > cross-compiler and a natively bootstrapped x86_64-linux-gnu GCC compiler. > > > > It also pass the test added in the patch with a nativly bootstrapped > > powerpc64-unknown-linux-gnu GCC compiler (on gcc110 in compile farm). > > The "Yes" above approved a backport. Doh indeed. My apologize. Best regards, Thomas
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr67781.c b/gcc/testsuite/gcc.c-torture/execute/pr67781.c new file mode 100644 index 0000000..bf50aa2 --- /dev/null +++ b/gcc/testsuite/gcc.c-torture/execute/pr67781.c @@ -0,0 +1,34 @@ +#ifdef __UINT32_TYPE__ +typedef __UINT32_TYPE__ uint32_t; +#else +typedef unsigned uint32_t; +#endif + +#ifdef __UINT8_TYPE__ +typedef __UINT8_TYPE__ uint8_t; +#else +typedef unsigned char uint8_t; +#endif + +struct +{ + uint32_t a; + uint8_t b; +} s = { 0x123456, 0x78 }; + +int pr67781() +{ + uint32_t c = (s.a << 8) | s.b; + return c; +} + +int +main () +{ + if (sizeof (uint32_t) * __CHAR_BIT__ != 32) + return 0; + + if (pr67781 () != 0x12345678) + __builtin_abort (); + return 0; +} diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c index b00f046..e5a185f 100644 --- a/gcc/tree-ssa-math-opts.c +++ b/gcc/tree-ssa-math-opts.c @@ -2441,6 +2441,8 @@ find_bswap_or_nop_1 (gimple *stmt, struct symbolic_number *n, int limit) static gimple * find_bswap_or_nop (gimple *stmt, struct symbolic_number *n, bool *bswap) { + unsigned rsize; + uint64_t tmpn, mask; /* The number which the find_bswap_or_nop_1 result should match in order to have a full byte swap. The number is shifted to the right