diff mbox series

[RFC] RISC-V: Add support for LP64DV

Message ID 20240904202611.16794-1-palmer@rivosinc.com
State New
Headers show
Series [RFC] RISC-V: Add support for LP64DV | expand

Commit Message

Palmer Dabbelt Sept. 4, 2024, 8:26 p.m. UTC
Now that we've got the riscv_vector_cc attribute it's pretty much free
to add a system-wide ABI -- at least in terms of implementation.  So
this just adds a new ABI command-line value that defaults to enabling
the vector calling convention, essentially the same as scattering the
attribute on every function.

gcc/ChangeLog:

	* config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Add LP64DV.
	* config/riscv/riscv-d.cc (riscv_d_handle_target_float_abi):
	Likewise.
	* config/riscv/riscv-opts.h (enum riscv_abi_type): Likewise.
	* config/riscv/riscv.cc (riscv_vector_cc_function_p): Use
	LP64DV.
	(riscv_option_override): Likewise.
	* config/riscv/riscv.opt: Add LP64DV.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/lp64dv.c: New test.
---
So this is very much an RFC, again.  As such it's basically not tested,
I just manually inspected the test case and it looks sane.

This concept of a yes-V-by-default ABI has come up a bunch of times.
There's some marginal performance benefit here (the added test saves a
stack spill, for example).  I have no idea how exciting this would be in
real code, but I don't think having autovectorized values with lifetimes
that cross function calls is super esoteric or anything.  The
implementation is basically free, though, and it seems kind of odd to
just leave some performance on the floor for the sake of compatibility
with the pre-official distro ABIs.

Normally adding another ABI would be a big ask on the testing side of
things, but for this I think it might actually be net easier: any bugs
that would show up via `-mabi=lp64dv` would also show up via
`__attribute__((riscv_vector_cc))`, so this would basically just give us
a bunch of free tests.  Of course it's way more exposed having a
command-line argument and thus those bugs become way more important, but
we'd need to fix them all eventually anyway.

Presumably we'd want a full suite of V-default ABIs, but I just started
with a single one -- there's really no code here, just boilerplate, so
that's just mostly me being lazy.

I'd assume we also want psABI coverage here.  IIRC it's come up over
there, but I don't think there's a PR to add it or anything (though I'm
not paying much attention to the psABI these days).  I figured it'd be
best to feel things out over here first, though -- no sense in starting
an argument over there if we're not even going to support it.
---
 gcc/config/riscv/riscv-c.cc                   |  8 ++++++
 gcc/config/riscv/riscv-d.cc                   |  2 ++
 gcc/config/riscv/riscv-opts.h                 |  3 +-
 gcc/config/riscv/riscv.cc                     |  8 ++++++
 gcc/config/riscv/riscv.opt                    |  3 ++
 .../gcc.target/riscv/rvv/base/lp64dv.c        | 28 +++++++++++++++++++
 6 files changed, 51 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/lp64dv.c

Comments

Palmer Dabbelt Sept. 4, 2024, 9:30 p.m. UTC | #1
On Wed, 04 Sep 2024 13:26:11 PDT (-0700), Palmer Dabbelt wrote:
> Now that we've got the riscv_vector_cc attribute it's pretty much free
> to add a system-wide ABI -- at least in terms of implementation.  So
> this just adds a new ABI command-line value that defaults to enabling
> the vector calling convention, essentially the same as scattering the
> attribute on every function.
>
> gcc/ChangeLog:
>
> 	* config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Add LP64DV.
> 	* config/riscv/riscv-d.cc (riscv_d_handle_target_float_abi):
> 	Likewise.
> 	* config/riscv/riscv-opts.h (enum riscv_abi_type): Likewise.
> 	* config/riscv/riscv.cc (riscv_vector_cc_function_p): Use
> 	LP64DV.
> 	(riscv_option_override): Likewise.
> 	* config/riscv/riscv.opt: Add LP64DV.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.target/riscv/rvv/base/lp64dv.c: New test.
> ---
> So this is very much an RFC, again.  As such it's basically not tested,
> I just manually inspected the test case and it looks sane.
>
> This concept of a yes-V-by-default ABI has come up a bunch of times.
> There's some marginal performance benefit here (the added test saves a
> stack spill, for example).  I have no idea how exciting this would be in
> real code, but I don't think having autovectorized values with lifetimes
> that cross function calls is super esoteric or anything.  The
> implementation is basically free, though, and it seems kind of odd to
> just leave some performance on the floor for the sake of compatibility
> with the pre-official distro ABIs.
>
> Normally adding another ABI would be a big ask on the testing side of
> things, but for this I think it might actually be net easier: any bugs
> that would show up via `-mabi=lp64dv` would also show up via
> `__attribute__((riscv_vector_cc))`, so this would basically just give us
> a bunch of free tests.  Of course it's way more exposed having a
> command-line argument and thus those bugs become way more important, but
> we'd need to fix them all eventually anyway.
>
> Presumably we'd want a full suite of V-default ABIs, but I just started
> with a single one -- there's really no code here, just boilerplate, so
> that's just mostly me being lazy.
>
> I'd assume we also want psABI coverage here.  IIRC it's come up over
> there, but I don't think there's a PR to add it or anything (though I'm
> not paying much attention to the psABI these days).  I figured it'd be
> best to feel things out over here first, though -- no sense in starting
> an argument over there if we're not even going to support it.
> ---
>  gcc/config/riscv/riscv-c.cc                   |  8 ++++++
>  gcc/config/riscv/riscv-d.cc                   |  2 ++
>  gcc/config/riscv/riscv-opts.h                 |  3 +-
>  gcc/config/riscv/riscv.cc                     |  8 ++++++
>  gcc/config/riscv/riscv.opt                    |  3 ++
>  .../gcc.target/riscv/rvv/base/lp64dv.c        | 28 +++++++++++++++++++
>  6 files changed, 51 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/lp64dv.c
>
> diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
> index 71112d9c66d..c114da376ef 100644
> --- a/gcc/config/riscv/riscv-c.cc
> +++ b/gcc/config/riscv/riscv-c.cc
> @@ -159,10 +159,18 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile)
>
>      case ABI_ILP32D:
>      case ABI_LP64D:
> +    case ABI_LP64DV:
>        builtin_define ("__riscv_float_abi_double");
>        break;
>      }
>
> +  switch (riscv_abi)
> +    {
> +    case ABI_LP64DV:
> +      builtin_define ("__riscv_vector_abi_always");
> +      break;
> +    }
> +
>    switch (riscv_cmodel)
>      {
>      case CM_MEDLOW:
> diff --git a/gcc/config/riscv/riscv-d.cc b/gcc/config/riscv/riscv-d.cc
> index bb4539243f8..d4f814dc0d3 100644
> --- a/gcc/config/riscv/riscv-d.cc
> +++ b/gcc/config/riscv/riscv-d.cc
> @@ -64,6 +64,8 @@ riscv_d_handle_target_float_abi (void)
>
>      case ABI_ILP32D:
>      case ABI_LP64D:
> +    /* FIXME: Should we even have the V ABI for D? */
> +    case ABI_LP64DV:
>        abi = "double";
>        break;
>
> diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
> index 5497d1173c4..64e1e27ea29 100644
> --- a/gcc/config/riscv/riscv-opts.h
> +++ b/gcc/config/riscv/riscv-opts.h
> @@ -29,7 +29,8 @@ enum riscv_abi_type {
>    ABI_LP64,
>    ABI_LP64E,
>    ABI_LP64F,
> -  ABI_LP64D
> +  ABI_LP64D,
> +  ABI_LP64DV
>  };
>  extern enum riscv_abi_type riscv_abi;
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index f82e64a6fec..605fb67b808 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -6151,6 +6151,9 @@ riscv_arguments_is_vector_type_p (const_tree fntype)
>  static bool
>  riscv_vector_cc_function_p (const_tree fntype)
>  {
> +  if (riscv_abi == ABI_LP64DV)
> +    return true;
> +
>    tree attr = TYPE_ATTRIBUTES (fntype);
>    bool vector_cc_p = lookup_attribute ("vector_cc", attr) != NULL_TREE
>      || lookup_attribute ("riscv_vector_cc", attr) != NULL_TREE;
> @@ -10137,6 +10140,11 @@ riscv_option_override (void)
>  		"project via %{PR116152%}", "https://gcc.gnu.org/PR116152");
>      }
>
> +  if (riscv_abi == ABI_LP64DV && !TARGET_VECTOR)
> +    {
> +      error ("lp64dv requires the V extension");
> +    }
> +
>    /* Zfinx require abi ilp32, ilp32e, lp64 or lp64e.  */
>    if (TARGET_ZFINX
>        && riscv_abi != ABI_ILP32 && riscv_abi != ABI_LP64
> diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
> index a8758abc918..5100af9b7d6 100644
> --- a/gcc/config/riscv/riscv.opt
> +++ b/gcc/config/riscv/riscv.opt
> @@ -73,6 +73,9 @@ Enum(abi_type) String(lp64f) Value(ABI_LP64F)
>  EnumValue
>  Enum(abi_type) String(lp64d) Value(ABI_LP64D)
>
> +EnumValue
> +Enum(abi_type) String(lp64dv) Value(ABI_LP64DV)
> +
>  mfdiv
>  Target Mask(FDIV)
>  Use hardware floating-point divide and square root instructions.
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/lp64dv.c b/gcc/testsuite/gcc.target/riscv/rvv/base/lp64dv.c
> new file mode 100644
> index 00000000000..76815d5e4d0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/lp64dv.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64dv -O3" } */

and I forgot to include my `_zvl256b` and `-mrvv-vector-bits=zvl` in the 
test case...

> +
> +void func_vcc(long i);
> +
> +void call(const double * restrict a, double * restrict c)
> +{
> +    for (long i = 0; i < 1024; i += 8)
> +      {
> +        double a0, a1, a2, a3;
> +        a0 = a[i+0];
> +        a1 = a[i+1];
> +        a2 = a[i+2];
> +        a3 = a[i+3];
> +
> +        c[i+0] = a0;
> +        c[i+1] = a1;
> +        c[i+2] = a2;
> +        c[i+3] = a3;
> +        func_vcc(i);
> +        c[i+4] = a0 + a[i+4];
> +        c[i+5] = a1 + a[i+5];
> +        c[i+6] = a2 + a[i+6];
> +        c[i+7] = a3 + a[i+7];
> +      }
> +}
> +
> +/* { dg-final { scan-assembler-times {vl1re64\.v} 2 } } */
Jeff Law Sept. 4, 2024, 10:56 p.m. UTC | #2
On 9/4/24 2:26 PM, Palmer Dabbelt wrote:
> Now that we've got the riscv_vector_cc attribute it's pretty much free
> to add a system-wide ABI -- at least in terms of implementation.  So
> this just adds a new ABI command-line value that defaults to enabling
> the vector calling convention, essentially the same as scattering the
> attribute on every function.
> 
> gcc/ChangeLog:
> 
> 	* config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Add LP64DV.
> 	* config/riscv/riscv-d.cc (riscv_d_handle_target_float_abi):
> 	Likewise.
> 	* config/riscv/riscv-opts.h (enum riscv_abi_type): Likewise.
> 	* config/riscv/riscv.cc (riscv_vector_cc_function_p): Use
> 	LP64DV.
> 	(riscv_option_override): Likewise.
> 	* config/riscv/riscv.opt: Add LP64DV.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/riscv/rvv/base/lp64dv.c: New test.
> ---
> So this is very much an RFC, again.  As such it's basically not tested,
> I just manually inspected the test case and it looks sane.
> 
> This concept of a yes-V-by-default ABI has come up a bunch of times.
> There's some marginal performance benefit here (the added test saves a
> stack spill, for example).  I have no idea how exciting this would be in
> real code, but I don't think having autovectorized values with lifetimes
> that cross function calls is super esoteric or anything.  The
> implementation is basically free, though, and it seems kind of odd to
> just leave some performance on the floor for the sake of compatibility
> with the pre-official distro ABIs.
Well, that's really the question, isn't it.  Will the distros pick it up 
or not?  If they don't, then it's just an academic exercise.  I don't 
think we've ever managed to get any kind of distro level buy-in on a 
baseline architecture.

So I don't object to the idea, I just don't know if it's going to end up 
being a dead end of effort or not.

jeff
Kito Cheng Sept. 5, 2024, 2:24 a.m. UTC | #3
Just remember adding a system wide vector calling convention has wide
compatible issues we need to worry about, like jump buf (for
setjmp/longjmp) will need to keep vector status, it doesn't need to
keep before since all vectors are call-clobber by default.

Also that may cause performance issue for vector, that will increase
the init cost for vector register - because part of vector reg become
callee save register now, so most case in current vector code gen
don't need backup/restore at prologue/epilogue, but it will change
once we change the default to vector calling convention by default.

So I would suggest system wilde should still keep using lp64d even
though the vector is available as one of the proposers for the vector
calling convention, but I am fine if the intention is having an option
to do some exercise or experiment.

On Thu, Sep 5, 2024 at 6:56 AM Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
> On 9/4/24 2:26 PM, Palmer Dabbelt wrote:
> > Now that we've got the riscv_vector_cc attribute it's pretty much free
> > to add a system-wide ABI -- at least in terms of implementation.  So
> > this just adds a new ABI command-line value that defaults to enabling
> > the vector calling convention, essentially the same as scattering the
> > attribute on every function.
> >
> > gcc/ChangeLog:
> >
> >       * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Add LP64DV.
> >       * config/riscv/riscv-d.cc (riscv_d_handle_target_float_abi):
> >       Likewise.
> >       * config/riscv/riscv-opts.h (enum riscv_abi_type): Likewise.
> >       * config/riscv/riscv.cc (riscv_vector_cc_function_p): Use
> >       LP64DV.
> >       (riscv_option_override): Likewise.
> >       * config/riscv/riscv.opt: Add LP64DV.
> >
> > gcc/testsuite/ChangeLog:
> >
> >       * gcc.target/riscv/rvv/base/lp64dv.c: New test.
> > ---
> > So this is very much an RFC, again.  As such it's basically not tested,
> > I just manually inspected the test case and it looks sane.
> >
> > This concept of a yes-V-by-default ABI has come up a bunch of times.
> > There's some marginal performance benefit here (the added test saves a
> > stack spill, for example).  I have no idea how exciting this would be in
> > real code, but I don't think having autovectorized values with lifetimes
> > that cross function calls is super esoteric or anything.  The
> > implementation is basically free, though, and it seems kind of odd to
> > just leave some performance on the floor for the sake of compatibility
> > with the pre-official distro ABIs.
> Well, that's really the question, isn't it.  Will the distros pick it up
> or not?  If they don't, then it's just an academic exercise.  I don't
> think we've ever managed to get any kind of distro level buy-in on a
> baseline architecture.
>
> So I don't object to the idea, I just don't know if it's going to end up
> being a dead end of effort or not.
>
> jeff
>
Palmer Dabbelt Sept. 5, 2024, 2:39 a.m. UTC | #4
On Wed, 04 Sep 2024 19:24:41 PDT (-0700), Kito Cheng wrote:
> Just remember adding a system wide vector calling convention has wide
> compatible issues we need to worry about, like jump buf (for
> setjmp/longjmp) will need to keep vector status, it doesn't need to
> keep before since all vectors are call-clobber by default.
>
> Also that may cause performance issue for vector, that will increase
> the init cost for vector register - because part of vector reg become
> callee save register now, so most case in current vector code gen
> don't need backup/restore at prologue/epilogue, but it will change
> once we change the default to vector calling convention by default.

Ya, I think we went through a bunch of that earlier on in vector land 
when the design was still a bit vaguer and we weren't sure how it was 
all going to fit together.  Since it's a new ABI we don't have to worry 
about cross-compatibility for the structs, so I think most of that stuff 
is pretty managable (and I thought it was all in glibc, but sorry if I 
missed something).

I think the trickiest bit is going to be the dynamic resolver, that was 
the big thing that ended up being easy with the variant-only approach 
-- and presumably we wouldn't want to tag everything as VARIANT_CC if 
we're changing the system ABI, which IIRC this will end up doing.

> So I would suggest system wilde should still keep using lp64d even
> though the vector is available as one of the proposers for the vector
> calling convention, but I am fine if the intention is having an option
> to do some exercise or experiment.

Even if we were to merge it glibc would just break without support, so 
IMO it's best to at least get a proof of concept for glibc before 
merging anything.

Maybe we'll get lucky and this will trick a friendly glibc release 
maintainer into doing it for us... ;)

>
> On Thu, Sep 5, 2024 at 6:56 AM Jeff Law <jeffreyalaw@gmail.com> wrote:
>>
>>
>>
>> On 9/4/24 2:26 PM, Palmer Dabbelt wrote:
>> > Now that we've got the riscv_vector_cc attribute it's pretty much free
>> > to add a system-wide ABI -- at least in terms of implementation.  So
>> > this just adds a new ABI command-line value that defaults to enabling
>> > the vector calling convention, essentially the same as scattering the
>> > attribute on every function.
>> >
>> > gcc/ChangeLog:
>> >
>> >       * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Add LP64DV.
>> >       * config/riscv/riscv-d.cc (riscv_d_handle_target_float_abi):
>> >       Likewise.
>> >       * config/riscv/riscv-opts.h (enum riscv_abi_type): Likewise.
>> >       * config/riscv/riscv.cc (riscv_vector_cc_function_p): Use
>> >       LP64DV.
>> >       (riscv_option_override): Likewise.
>> >       * config/riscv/riscv.opt: Add LP64DV.
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> >       * gcc.target/riscv/rvv/base/lp64dv.c: New test.
>> > ---
>> > So this is very much an RFC, again.  As such it's basically not tested,
>> > I just manually inspected the test case and it looks sane.
>> >
>> > This concept of a yes-V-by-default ABI has come up a bunch of times.
>> > There's some marginal performance benefit here (the added test saves a
>> > stack spill, for example).  I have no idea how exciting this would be in
>> > real code, but I don't think having autovectorized values with lifetimes
>> > that cross function calls is super esoteric or anything.  The
>> > implementation is basically free, though, and it seems kind of odd to
>> > just leave some performance on the floor for the sake of compatibility
>> > with the pre-official distro ABIs.
>> Well, that's really the question, isn't it.  Will the distros pick it up
>> or not?  If they don't, then it's just an academic exercise.  I don't
>> think we've ever managed to get any kind of distro level buy-in on a
>> baseline architecture.
>>
>> So I don't object to the idea, I just don't know if it's going to end up
>> being a dead end of effort or not.

Ya, I agree it's useless if it doesn't get used ;).  It can't get used 
if it doesn't exist, though, so it's kind of one of those 
chicken-and-egg things.  Hence the RFC...

>>
>> jeff
>>
diff mbox series

Patch

diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
index 71112d9c66d..c114da376ef 100644
--- a/gcc/config/riscv/riscv-c.cc
+++ b/gcc/config/riscv/riscv-c.cc
@@ -159,10 +159,18 @@  riscv_cpu_cpp_builtins (cpp_reader *pfile)
 
     case ABI_ILP32D:
     case ABI_LP64D:
+    case ABI_LP64DV:
       builtin_define ("__riscv_float_abi_double");
       break;
     }
 
+  switch (riscv_abi)
+    {
+    case ABI_LP64DV:
+      builtin_define ("__riscv_vector_abi_always");
+      break;
+    }
+
   switch (riscv_cmodel)
     {
     case CM_MEDLOW:
diff --git a/gcc/config/riscv/riscv-d.cc b/gcc/config/riscv/riscv-d.cc
index bb4539243f8..d4f814dc0d3 100644
--- a/gcc/config/riscv/riscv-d.cc
+++ b/gcc/config/riscv/riscv-d.cc
@@ -64,6 +64,8 @@  riscv_d_handle_target_float_abi (void)
 
     case ABI_ILP32D:
     case ABI_LP64D:
+    /* FIXME: Should we even have the V ABI for D? */
+    case ABI_LP64DV:
       abi = "double";
       break;
 
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 5497d1173c4..64e1e27ea29 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -29,7 +29,8 @@  enum riscv_abi_type {
   ABI_LP64,
   ABI_LP64E,
   ABI_LP64F,
-  ABI_LP64D
+  ABI_LP64D,
+  ABI_LP64DV
 };
 extern enum riscv_abi_type riscv_abi;
 
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index f82e64a6fec..605fb67b808 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -6151,6 +6151,9 @@  riscv_arguments_is_vector_type_p (const_tree fntype)
 static bool
 riscv_vector_cc_function_p (const_tree fntype)
 {
+  if (riscv_abi == ABI_LP64DV)
+    return true;
+
   tree attr = TYPE_ATTRIBUTES (fntype);
   bool vector_cc_p = lookup_attribute ("vector_cc", attr) != NULL_TREE
     || lookup_attribute ("riscv_vector_cc", attr) != NULL_TREE;
@@ -10137,6 +10140,11 @@  riscv_option_override (void)
 		"project via %{PR116152%}", "https://gcc.gnu.org/PR116152");
     }
 
+  if (riscv_abi == ABI_LP64DV && !TARGET_VECTOR)
+    {
+      error ("lp64dv requires the V extension");
+    }
+
   /* Zfinx require abi ilp32, ilp32e, lp64 or lp64e.  */
   if (TARGET_ZFINX
       && riscv_abi != ABI_ILP32 && riscv_abi != ABI_LP64
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index a8758abc918..5100af9b7d6 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -73,6 +73,9 @@  Enum(abi_type) String(lp64f) Value(ABI_LP64F)
 EnumValue
 Enum(abi_type) String(lp64d) Value(ABI_LP64D)
 
+EnumValue
+Enum(abi_type) String(lp64dv) Value(ABI_LP64DV)
+
 mfdiv
 Target Mask(FDIV)
 Use hardware floating-point divide and square root instructions.
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/lp64dv.c b/gcc/testsuite/gcc.target/riscv/rvv/base/lp64dv.c
new file mode 100644
index 00000000000..76815d5e4d0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/lp64dv.c
@@ -0,0 +1,28 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64dv -O3" } */
+
+void func_vcc(long i);
+
+void call(const double * restrict a, double * restrict c)
+{
+    for (long i = 0; i < 1024; i += 8)
+      {
+        double a0, a1, a2, a3;
+        a0 = a[i+0];
+        a1 = a[i+1];
+        a2 = a[i+2];
+        a3 = a[i+3];
+
+        c[i+0] = a0;
+        c[i+1] = a1;
+        c[i+2] = a2;
+        c[i+3] = a3;
+        func_vcc(i);
+        c[i+4] = a0 + a[i+4];
+        c[i+5] = a1 + a[i+5];
+        c[i+6] = a2 + a[i+6];
+        c[i+7] = a3 + a[i+7];
+      }
+}
+
+/* { dg-final { scan-assembler-times {vl1re64\.v} 2 } } */