Message ID: 5827ae6b-a0ec-ed71-b18d-63d54e1d1449@linux.ibm.com
State: New
Series: [rs6000] Use unaligned vector types for some pointer casts
Hi!

On Fri, Oct 19, 2018 at 04:27:27PM -0500, Bill Schmidt wrote:
> The x86 intrinsic compatibility headers contain a couple of instances of
> undefined behavior where a cast to an aligned type is used when that
> alignment is not guaranteed by the expression to be cast from.  This
> patch fixes that problem by replacing the aligned types with unaligned
> versions of the same type.

How did you find these?  What I'm after is, did you find all instances?

> --- gcc/config/rs6000/xmmintrin.h	(revision 265318)
> +++ gcc/config/rs6000/xmmintrin.h	(working copy)
> @@ -85,6 +85,9 @@
>      vector types, and their scalar components.  */
>  typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));
>
> +/* Unaligned version of the same type.  */
> +typedef float __m128_u __attribute__ ((__vector_size__ (16), __may_alias__));

This is identical to __m128.  Do you want aligned(1) as well?

Okay for trunk if you want that; if not, this needs explanation (a code
comment or similar).

Thanks!


Segher
On 10/20/18 10:53 AM, Segher Boessenkool wrote:
> Hi!
>
> On Fri, Oct 19, 2018 at 04:27:27PM -0500, Bill Schmidt wrote:
>> The x86 intrinsic compatibility headers contain a couple of instances of
>> undefined behavior where a cast to an aligned type is used when that
>> alignment is not guaranteed by the expression to be cast from.  This
>> patch fixes that problem by replacing the aligned types with unaligned
>> versions of the same type.
> How did you find these?  What I'm after is, did you find all instances?

Jinsong found these by using the test cases for these header files when
incorporating the headers into Clang.  I will ask him whether he scanned
for additional similar cases.

>> --- gcc/config/rs6000/xmmintrin.h	(revision 265318)
>> +++ gcc/config/rs6000/xmmintrin.h	(working copy)
>> @@ -85,6 +85,9 @@
>>      vector types, and their scalar components.  */
>>  typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));
>>
>> +/* Unaligned version of the same type.  */
>> +typedef float __m128_u __attribute__ ((__vector_size__ (16), __may_alias__));
> This is identical to __m128.  Do you want aligned(1) as well?

Argh, yes.  I failed to copy this correctly from Jinsong.  I will re-test
before applying.

Thanks!
Bill

> Okay for trunk if you want that; if not, this needs explanation (a code
> comment or similar).
>
> Thanks!
>
>
> Segher
>
Just to follow up...

On 10/21/18 6:13 PM, Bill Schmidt wrote:
> On 10/20/18 10:53 AM, Segher Boessenkool wrote:
>> Hi!
>>
>> On Fri, Oct 19, 2018 at 04:27:27PM -0500, Bill Schmidt wrote:
>>> The x86 intrinsic compatibility headers contain a couple of instances of
>>> undefined behavior where a cast to an aligned type is used when that
>>> alignment is not guaranteed by the expression to be cast from.  This
>>> patch fixes that problem by replacing the aligned types with unaligned
>>> versions of the same type.
>> How did you find these?  What I'm after is, did you find all instances?
> Jinsong found these by using the test cases for these header files when
> incorporating the headers into Clang.  I will ask him whether he scanned
> for additional similar cases.

Only one of the issues was found by test case.  The rest were found by
scanning.

Thanks,
Bill

>>> --- gcc/config/rs6000/xmmintrin.h	(revision 265318)
>>> +++ gcc/config/rs6000/xmmintrin.h	(working copy)
>>> @@ -85,6 +85,9 @@
>>>      vector types, and their scalar components.  */
>>>  typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));
>>>
>>> +/* Unaligned version of the same type.  */
>>> +typedef float __m128_u __attribute__ ((__vector_size__ (16), __may_alias__));
>> This is identical to __m128.  Do you want aligned(1) as well?
> Argh, yes.  I failed to copy this correctly from Jinsong.  I will re-test
> before applying.
>
> Thanks!
> Bill
>
>> Okay for trunk if you want that; if not, this needs explanation (a code
>> comment or similar).
>>
>> Thanks!
>>
>>
>> Segher
>>
Index: gcc/config/rs6000/emmintrin.h
===================================================================
--- gcc/config/rs6000/emmintrin.h	(revision 265318)
+++ gcc/config/rs6000/emmintrin.h	(working copy)
@@ -85,7 +85,7 @@ typedef double __m128d __attribute__ ((__vector_si
 typedef long long __m128i_u __attribute__ ((__vector_size__ (16), __may_alias__, __aligned__ (1)));
 typedef double __m128d_u __attribute__ ((__vector_size__ (16), __may_alias__, __aligned__ (1)));
 
-/* Define two value permute mask */
+/* Define two value permute mask.  */
 #define _MM_SHUFFLE2(x,y) (((x) << 1) | (y))
 
 /* Create a vector with element 0 as F and the rest zero.  */
@@ -201,7 +201,7 @@ _mm_store_pd (double *__P, __m128d __A)
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_storeu_pd (double *__P, __m128d __A)
 {
-  *(__m128d *)__P = __A;
+  *(__m128d_u *)__P = __A;
 }
 
 /* Stores the lower DPFP value.  */
@@ -2175,7 +2175,7 @@ _mm_maskmoveu_si128 (__m128i __A, __m128i __B, cha
 {
   __v2du hibit = { 0x7f7f7f7f7f7f7f7fUL, 0x7f7f7f7f7f7f7f7fUL};
   __v16qu mask, tmp;
-  __m128i *p = (__m128i*)__C;
+  __m128i_u *p = (__m128i_u*)__C;
 
   tmp = (__v16qu)_mm_loadu_si128(p);
   mask = (__v16qu)vec_cmpgt ((__v16qu)__B, (__v16qu)hibit);
Index: gcc/config/rs6000/xmmintrin.h
===================================================================
--- gcc/config/rs6000/xmmintrin.h	(revision 265318)
+++ gcc/config/rs6000/xmmintrin.h	(working copy)
@@ -85,6 +85,9 @@
    vector types, and their scalar components.  */
 typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));
 
+/* Unaligned version of the same type.  */
+typedef float __m128_u __attribute__ ((__vector_size__ (16), __may_alias__));
+
 /* Internal data types for implementing the intrinsics.  */
 typedef float __v4sf __attribute__ ((__vector_size__ (16)));
 
@@ -172,7 +175,7 @@ _mm_store_ps (float *__P, __m128 __A)
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_storeu_ps (float *__P, __m128 __A)
 {
-  *(__m128 *)__P = __A;
+  *(__m128_u *)__P = __A;
 }
 
 /* Store four SPFP values in reverse order.  The address must be aligned.  */