From patchwork Fri Oct 22 11:21:25 2010 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Artem Shinkarov X-Patchwork-Id: 68838 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id 486B2B70A8 for ; Fri, 22 Oct 2010 22:22:06 +1100 (EST) Received: (qmail 22421 invoked by alias); 22 Oct 2010 11:22:01 -0000 Received: (qmail 22404 invoked by uid 22791); 22 Oct 2010 11:21:57 -0000 X-SWARE-Spam-Status: No, hits=-1.8 required=5.0 tests=AWL, BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, RCVD_IN_DNSWL_NONE, TW_DQ, TW_VD, TW_VZ, TW_WT, TW_ZW X-Spam-Check-By: sourceware.org Received: from mail-iw0-f175.google.com (HELO mail-iw0-f175.google.com) (209.85.214.175) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 22 Oct 2010 11:21:49 +0000 Received: by iwn6 with SMTP id 6so151264iwn.20 for ; Fri, 22 Oct 2010 04:21:47 -0700 (PDT) Received: by 10.231.183.77 with SMTP id cf13mr299569ibb.113.1287746507078; Fri, 22 Oct 2010 04:21:47 -0700 (PDT) MIME-Version: 1.0 Received: by 10.229.182.207 with HTTP; Fri, 22 Oct 2010 04:21:25 -0700 (PDT) In-Reply-To: References: <4CBF3BAD.6080305@redhat.com> <4CBF5B6E.3050109@redhat.com> <4CBF60F9.8070702@redhat.com> <4CBF67E3.2090102@redhat.com> From: Artem Shinkarov Date: Fri, 22 Oct 2010 12:21:25 +0100 Message-ID: Subject: Re: Vector subscription patch To: Richard Guenther Cc: Richard Henderson , gcc-patches@gcc.gnu.org, "Joseph S. Myers" X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Ok, so here is the same patch with fixed error messages and changelog. All the optimizing issues we should bring in the separate patch on the level of middle-end. ChangeLog: 2010-10-20 Artjoms Sinkarovs Andrew Pinski gcc/ * c-typeck.c (build_array_ref): Handle subscripting of vectors. gcc/c-family/ * c-common.h (c_common_mark_addressable_vec): Declare. * c-common.c (c_common_mark_addressable_vec): New function. gcc/testsuite/ * gcc.c-torture/execute/vector-subscript-1.c: Likewise. * gcc.c-torture/execute/vector-subscript-2.c: Likewise. * gcc.c-torture/execute/vector-subscript-3.c: New testcase. * gcc.dg/vector-subscript-1.c: Likewise. * gcc.dg/vector-subscript-2.c: Likewise. * gcc.dg/vector-subscript-3.c: New testcase. * gcc.dg/array-8.c: Adjust. gcc/doc/ * extend.texi: New paragraph bootstrapped and tested on x86_64_unknown-linux On Thu, Oct 21, 2010 at 2:36 PM, Richard Guenther wrote: > On Thu, Oct 21, 2010 at 1:15 PM, Artem Shinkarov > wrote: >> Richard is right about the fact that backend setterd and getters are >> not used properly. For instance, the code like this: >> >> #define vector(elcount, type)  \ >> __attribute__((vector_size((elcount)*sizeof(type)))) type >> >> extern vector (8, short) vec; >> >> int main (int argc, char *argv[]) { >>    vec += (vector (8, short)) {4,4,4,4,4,4,4,4}; >>    return vec[0]; >> } >> >> is compiled down to the assembly: >> >>  main: >>  .LFB0: >>         .cfi_startproc >>         movdqa  .LC0(%rip), %xmm0 >>         paddw   vec(%rip), %xmm0 >>         movdqa  %xmm0, -24(%rsp) >>         movzwl  -24(%rsp), %eax >>         movdqa  %xmm0, vec(%rip) >>         cwtl >>         ret >>         .cfi_endproc >> >> but in theory we could use pextrw instead of assigning through the >> memory. Like this: >> >>  main: >>  .LFB0: >>         .cfi_startproc >>         movdqa  .LC0(%rip), %xmm0 >>         paddw   vec(%rip), %xmm0 >>         pextrw  $0x0, %xmm0, %eax >>         movdqa  %xmm0, vec(%rip) >>         cwtl >>         ret >>         .cfi_endproc >> >> Although in the middleend we have transformation of vec[0] to >> BIT_FIELD_REF ; >> >> So the question is why the backend did not recognize this as a pattern >> for pextrw. And how much effort would it take to fix this. >> >> As for new codes, I think that it would be nicer to recognize the >> pattern in the middle end rather than generate new codes at the very >> beginning. For instance if I'll have: >> >> vector (8, short) vec; >> short x = *((short *) vec + 3); >> >> then it means that I would not be able to get my pextrw out of this? >> >> Any ideas why backend does not catch this pattern? > > In this particular case it is because the global variable vec isn't > re-written into SSA form and thus the BIT_FIELD_REF operates > on memory (and we do not CSE the BIT_FIELD_REFs variable > operand, something we could fix easily at the tree level). > > On the original mem-ref branch BIT_FIELD_REFs were not > allowed on memory, with that lowering in place it would have been > optimized already.  It would already help if sole BIT_FIELD_REF > (and REAL/IMAGPART_EXPR) operating on register type variables > would act as non-reference operations.  For the vector selection > case that is one more reason in favor of a VECT_SELECT_EXPR > of course. > > OTOH the case with the global variable is probably not too common > to worry about at the moment. > > The argument that we should optimize vec[3] the same as *((short *)vec + 3) > is valid (same as in as good), so the FE could just rely on that. > > Richard. > >> >> Artem. >> >> On Wed, Oct 20, 2010 at 11:47 PM, Richard Guenther >> wrote: >>> On Thu, Oct 21, 2010 at 12:06 AM, Richard Henderson wrote: >>>> On 10/20/2010 02:47 PM, Richard Guenther wrote: >>>>> On Wed, Oct 20, 2010 at 11:36 PM, Richard Henderson wrote: >>>>>> On 10/20/2010 02:27 PM, Richard Guenther wrote: >>>>>>> A VIEW_CONVERT_EXPR [index] should also work. >>>>>> >>>>>> That should mark the variable addressable iff index is non-constant? >>>>> >>>>> No, the variable will still be in SSA form, thus the expanders have to take >>>>> extra care.  The variable will be non-addressable but in non-SSA form >>>>> if there are partial stores though. >>>> >>>> Er, of course there will be partial stores.  That's the >>>> whole point of setting an element. >>>> >>>> It's almost certainly better, then, to have a new code >>>> >>>>  vector v; >>>>  T t; >>>> >>>>  v1 = VEC_SET_EXPR(v0, index, t0); >>>>  t1 = VEC_EXT_EXPR(v0, index) >>>> >>>> which can at least keep V in SSA form, and avoid playing >>>> silly games with ARRAY_REF_EXPR. >>> >>> ;) >>> >>> We also lack a genuine vector construction tree and use CONSTRUCTOR >>> for it.  But your VEC_SET_EXPR sounds much like the BIT_FIELD_EXPR >>> I want to invent for the modify part of a read-modify-write bitfield store. >>> >>> Anyway, the question is what we want to do now, given we're close to >>> stage3 and nothing in the middle-end currently knows about a >>> VEC_SET_EXPR or a VEC_EXT_EXPR. >>> >>> Doing the addressable memory thing will pessimize -O0 for sure, but >>> the optimizers should be able to fix it up.  We can easily special-case >>> constant indices on the RHS with BIT_FIELD_REF. >>> >>> Richard. >>> >>>> >>>> r~ >>>> >>> >> > Index: gcc/doc/extend.texi =================================================================== --- gcc/doc/extend.texi (revision 165700) +++ gcc/doc/extend.texi (working copy) @@ -6310,6 +6310,12 @@ minus or complement operators on a vecto elements are the negative or complemented values of the corresponding elements in the operand. +In C vectors can be subscripted as if the vector were an array with +the same number of elements and base type. Out of bound accesses +invoke undefined behavior at runtime. Warnings for out of bound +accesses for vector subscription can be enabled with +@option{-Warray-bounds}. + You can declare variables and use them in function calls and returns, as well as in assignments and some casts. You can specify a vector type as a return type for a function. Vector types can also be used as function Index: gcc/c-family/c-common.c =================================================================== --- gcc/c-family/c-common.c (revision 165700) +++ gcc/c-family/c-common.c (working copy) @@ -8703,6 +8703,18 @@ complete_array_type (tree *ptype, tree i return failure; } +/* Like c_mark_addressable but don't check register qualifier. */ +void +c_common_mark_addressable_vec (tree t) +{ + while (handled_component_p (t)) + t = TREE_OPERAND (t, 0); + if (TREE_CODE (t) != VAR_DECL && TREE_CODE (t) != PARM_DECL) + return; + TREE_ADDRESSABLE (t) = 1; +} + + /* Used to help initialize the builtin-types.def table. When a type of the correct size doesn't exist, use error_mark_node instead of NULL. Index: gcc/c-family/c-common.h =================================================================== --- gcc/c-family/c-common.h (revision 165700) +++ gcc/c-family/c-common.h (working copy) @@ -933,6 +933,8 @@ extern int complete_array_type (tree *, extern tree builtin_type_for_size (int, bool); +extern void c_common_mark_addressable_vec (tree); + extern void warn_array_subscript_with_type_char (tree); extern void warn_about_parentheses (enum tree_code, enum tree_code, tree, Index: gcc/testsuite/gcc.c-torture/execute/vector-subscript-3.c =================================================================== --- gcc/testsuite/gcc.c-torture/execute/vector-subscript-3.c (revision 0) +++ gcc/testsuite/gcc.c-torture/execute/vector-subscript-3.c (revision 0) @@ -0,0 +1,26 @@ +/* dg-do run */ +#define vector __attribute__((vector_size(16) )) + +/* Check whether register declaration of vector type still + allow us to subscript this type. */ + +typedef vector short myvec_t; + +struct vec_s { + vector short member; +}; + + +int main () { + register short vector v0 = {1,2,3,4,5,6,7}; + register myvec_t v1 = {1,2,3,4,5,6,7}; + register struct vec_s v2; + + v2.member = v1; + + short r = v0[0] + v1[1] + v2.member[2]; + if (r != 6) + __builtin_abort (); + + return 0; +} Index: gcc/testsuite/gcc.c-torture/execute/vector-subscript-2.c =================================================================== --- gcc/testsuite/gcc.c-torture/execute/vector-subscript-2.c (revision 0) +++ gcc/testsuite/gcc.c-torture/execute/vector-subscript-2.c (revision 0) @@ -0,0 +1,67 @@ +#define vector __attribute__((vector_size(sizeof(int)*4) )) + +/* Check to make sure that we extract and insert the vector at the same + location for vector subscripting (with constant indexes) and + that vectors layout are the same as arrays. */ + +struct TV4 +{ + vector int v; +}; + +typedef struct TV4 MYV4; + +static inline MYV4 myfunc2( int x, int y, int z, int w ) +{ + MYV4 temp; + temp.v[0] = x; + temp.v[1] = y; + temp.v[2] = z; + temp.v[3] = w; + return temp; +} +MYV4 val3; +__attribute__((noinline)) void modify (void) +{ + val3 = myfunc2( 1, 2, 3, 4 ); +} +int main( int argc, char* argv[] ) +{ + int a[4]; + int i; + + /* Set up the vector. */ + modify(); + + /* Check the vector via the global variable. */ + if (val3.v[0] != 1) + __builtin_abort (); + if (val3.v[1] != 2) + __builtin_abort (); + if (val3.v[2] != 3) + __builtin_abort (); + if (val3.v[3] != 4) + __builtin_abort (); + + vector int a1 = val3.v; + + /* Check the vector via a local variable. */ + if (a1[0] != 1) + __builtin_abort (); + if (a1[1] != 2) + __builtin_abort (); + if (a1[2] != 3) + __builtin_abort (); + if (a1[3] != 4) + __builtin_abort (); + + __builtin_memcpy(a, &val3, sizeof(a)); + /* Check the vector via copying it to an array. */ + for(i = 0; i < 4; i++) + if (a[i] != i+1) + __builtin_abort (); + + + return 0; +} + Index: gcc/testsuite/gcc.c-torture/execute/vector-subscript-1.c =================================================================== --- gcc/testsuite/gcc.c-torture/execute/vector-subscript-1.c (revision 0) +++ gcc/testsuite/gcc.c-torture/execute/vector-subscript-1.c (revision 0) @@ -0,0 +1,60 @@ +/* dg-do run */ +#define vector __attribute__((vector_size(sizeof(int)*4) )) + +/* Check to make sure that we extract and insert the vector at the same + location for vector subscripting and that vectors layout are the same + as arrays. */ + +struct TV4 +{ + vector int v; +}; + +typedef struct TV4 MYV4; +static inline int *f(MYV4 *a, int i) +{ + return &(a->v[i]); +} + +static inline MYV4 myfunc2( int x, int y, int z, int w ) +{ + MYV4 temp; + *f(&temp, 0 ) = x; + *f(&temp, 1 ) = y; + *f(&temp, 2 ) = z; + *f(&temp, 3 ) = w; + return temp; +} + +MYV4 val3; + +__attribute__((noinline)) void modify (void) +{ + val3 = myfunc2( 1, 2, 3, 4 ); +} + +int main( int argc, char* argv[] ) +{ + int a[4]; + int i; + + modify(); + + if (*f(&val3, 0 ) != 1) + __builtin_abort (); + if (*f(&val3, 1 ) != 2) + __builtin_abort (); + if (*f(&val3, 2 ) != 3) + __builtin_abort (); + if (*f(&val3, 3 ) != 4) + __builtin_abort (); + + __builtin_memcpy(a, &val3, 16); + for(i = 0; i < 4; i++) + if (a[i] != i+1) + __builtin_abort (); + + + return 0; +} + Index: gcc/testsuite/gcc.dg/vector-subscript-3.c =================================================================== --- gcc/testsuite/gcc.dg/vector-subscript-3.c (revision 0) +++ gcc/testsuite/gcc.dg/vector-subscript-3.c (revision 0) @@ -0,0 +1,19 @@ +/* Check the case when index is out of bound */ +/* { dg-do compile } */ +/* { dg-options "-Warray-bounds" } */ + +#define vector __attribute__((vector_size(16) )) + + +int test0(void) +{ + vector int a; + return a[10]; /* { dg-warning "index value is out of bound" } */ +} + +int test1(void) +{ + vector int a; + return a[-1]; /* { dg-warning "index value is out of bound" } */ +} + Index: gcc/testsuite/gcc.dg/array-8.c =================================================================== --- gcc/testsuite/gcc.dg/array-8.c (revision 165700) +++ gcc/testsuite/gcc.dg/array-8.c (working copy) @@ -35,7 +35,7 @@ g (void) f().c[0]; 0[f().c]; /* Various invalid cases. */ - c[c]; /* { dg-error "subscripted value is neither array nor pointer" } */ + c[c]; /* { dg-error "subscripted value is neither array nor pointer nor vector" } */ p[1.0]; /* { dg-error "array subscript is not an integer" } */ 1.0[a]; /* { dg-error "array subscript is not an integer" } */ fp[0]; /* { dg-error "subscripted value is pointer to function" } */ Index: gcc/testsuite/gcc.dg/vector-subscript-1.c =================================================================== --- gcc/testsuite/gcc.dg/vector-subscript-1.c (revision 0) +++ gcc/testsuite/gcc.dg/vector-subscript-1.c (revision 0) @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-w" } */ + +#define vector __attribute__((vector_size(16) )) +/* Check that vector[index] works and index[vector] is rejected. */ + +float vf(vector float a) +{ + return 0[a]; /* { dg-error "subscripted value is neither array nor pointer nor vector" } */ +} + + +float fv(vector float a) +{ + return a[0]; +} + Index: gcc/testsuite/gcc.dg/vector-subscript-2.c =================================================================== --- gcc/testsuite/gcc.dg/vector-subscript-2.c (revision 0) +++ gcc/testsuite/gcc.dg/vector-subscript-2.c (revision 0) @@ -0,0 +1,13 @@ +/* { dg-do compile } */ + +/* Check that subscripting of vectors work with register storage class decls. */ + +#define vector __attribute__((vector_size(16) )) + + +float vf(int i) +{ + register vector float a; + return a[0]; +} + Index: gcc/c-typeck.c =================================================================== --- gcc/c-typeck.c (revision 165700) +++ gcc/c-typeck.c (working copy) @@ -2305,6 +2305,9 @@ build_indirect_ref (location_t loc, tree arrays that are not lvalues (for example, members of structures returned by functions). + For vector types, allow vector[i] but not i[vector], and create + *(((type*)&vectortype) + i) for the expression. + LOC is the location to use for the returned expression. */ tree @@ -2317,13 +2320,17 @@ build_array_ref (location_t loc, tree ar return error_mark_node; if (TREE_CODE (TREE_TYPE (array)) != ARRAY_TYPE - && TREE_CODE (TREE_TYPE (array)) != POINTER_TYPE) + && TREE_CODE (TREE_TYPE (array)) != POINTER_TYPE + /* Allow vector[index] but not index[vector]. */ + && TREE_CODE (TREE_TYPE (array)) != VECTOR_TYPE) { tree temp; if (TREE_CODE (TREE_TYPE (index)) != ARRAY_TYPE && TREE_CODE (TREE_TYPE (index)) != POINTER_TYPE) { - error_at (loc, "subscripted value is neither array nor pointer"); + error_at (loc, + "subscripted value is neither array nor pointer nor vector"); + return error_mark_node; } temp = array; @@ -2353,6 +2360,27 @@ build_array_ref (location_t loc, tree ar index = default_conversion (index); gcc_assert (TREE_CODE (TREE_TYPE (index)) == INTEGER_TYPE); + + /* For vector[index], convert the vector to a + pointer of the underlying type. */ + if (TREE_CODE (TREE_TYPE (array)) == VECTOR_TYPE) + { + tree type = TREE_TYPE (array); + tree type1; + + if (TREE_CODE (index) == INTEGER_CST) + if (!host_integerp (index, 1) + || ((unsigned HOST_WIDE_INT) tree_low_cst (index, 1) + >= TYPE_VECTOR_SUBPARTS (TREE_TYPE (array)))) + warning_at (loc, OPT_Warray_bounds, "index value is out of bound"); + + c_common_mark_addressable_vec (array); + type = build_qualified_type (TREE_TYPE (type), TYPE_QUALS (type)); + type = build_pointer_type (type); + type1 = build_pointer_type (TREE_TYPE (array)); + array = build1 (ADDR_EXPR, type1, array); + array = convert (type, array); + } if (TREE_CODE (TREE_TYPE (array)) == ARRAY_TYPE) {