diff mbox series

arm: Fix wrong code with MVE V2DImode loads and stores [PR99960]

Message ID 20210426101524.a6atjl3hyo2fpvyq@arm.com
State New
Headers show
Series arm: Fix wrong code with MVE V2DImode loads and stores [PR99960] | expand

Commit Message

Alex Coplan April 26, 2021, 10:15 a.m. UTC
Hi,

As the PR shows, we currently miscompile V2DImode loads and stores for
MVE. We're currently using 64-bit loads/stores, but need to be using
128-bit vector loads and stores.

Some intrinsics tests were checking that we (incorrectly) used the
64-bit loads/stores: these have been updated.

Regression tested an arm-eabi cross configured
--with-arch=armv8.1-m.main+mve --with-float=hard. The patch has the
following effect on the testsuite:

FAIL->PASS: c-c++-common/torture/vector-compare-1.c   -O0  execution test
FAIL->PASS: gcc.c-torture/execute/pr92618.c   -O0  execution test
FAIL->PASS: gcc.c-torture/execute/pr92618.c   -O1  execution test
FAIL->PASS: gcc.c-torture/execute/pr92618.c   -O2  execution test
FAIL->PASS: gcc.c-torture/execute/pr92618.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL->PASS: gcc.c-torture/execute/pr92618.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
FAIL->PASS: gcc.c-torture/execute/pr92618.c   -O3 -g  execution test
FAIL->PASS: gcc.c-torture/execute/pr92618.c   -Os  execution test
FAIL->PASS: gcc.dg/compat/vector-1 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL->PASS: gcc.dg/torture/pr52407.c   -O0  execution test
FAIL->PASS: gcc.dg/torture/pr52407.c   -O1  execution test
FAIL->PASS: gcc.dg/torture/pr52407.c   -O2  execution test
FAIL->PASS: gcc.dg/torture/pr52407.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL->PASS: gcc.dg/torture/pr52407.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
FAIL->PASS: gcc.dg/torture/pr52407.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL->PASS: gcc.dg/torture/pr52407.c   -O3 -g  execution test
FAIL->PASS: gcc.dg/torture/pr52407.c   -Os  execution test
FAIL->PASS: gcc.dg/torture/pr57748-1.c   -O0  execution test
FAIL->PASS: gcc.dg/torture/pr57748-2.c   -O0  execution test
FAIL->PASS: gcc.dg/torture/pr57748-3.c   -O0  execution test
FAIL->PASS: gcc.dg/torture/pr57748-4.c   -O0  execution test
FAIL->PASS: gcc.dg/torture/pr58041.c   -O0  execution test
FAIL->PASS: gcc.dg/torture/pr58041.c   -O1  execution test
FAIL->PASS: gcc.dg/torture/pr58041.c   -O2  execution test
FAIL->PASS: gcc.dg/torture/pr58041.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL->PASS: gcc.dg/torture/pr58041.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
FAIL->PASS: gcc.dg/torture/pr58041.c   -O3 -g  execution test FAIL->PASS: gcc.dg/torture/pr58041.c   -Os  execution test
FAIL->PASS: gcc.dg/torture/pr61346.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL->PASS: gcc.dg/torture/pr61346.c   -O3 -g  execution test
FAIL->PASS: gcc.dg/torture/vshuf-v2df.c   -O2  execution test
FAIL->PASS: gcc.dg/torture/vshuf-v2di.c   -O2  execution test

Bootstrap and regtest on arm-linux-gnueabihf in progress.

OK for trunk and eventual backports to 11 and 10 branches if regstrap
looks good?

Thanks,
Alex

gcc/ChangeLog:

	PR target/99960
	* config/arm/mve.md (*mve_mov<mode>): Simplify output code. Use
	vldrw.u32 and vstrw.32 for V2D[IF]mode loads and stores.

gcc/testsuite/ChangeLog:

	PR target/99960
	* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c:
	Update now that we're (correctly) using full 128-bit vector
	loads/stores.
	* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s64.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u64.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vuninitializedq_int.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vuninitializedq_int1.c:
	Likewise.

Comments

Alex Coplan May 10, 2021, 8:36 a.m. UTC | #1
Ping:
https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568669.html

On 26/04/2021 11:15, Alex Coplan via Gcc-patches wrote:
> Hi,
> 
> As the PR shows, we currently miscompile V2DImode loads and stores for
> MVE. We're currently using 64-bit loads/stores, but need to be using
> 128-bit vector loads and stores.
> 
> Some intrinsics tests were checking that we (incorrectly) used the
> 64-bit loads/stores: these have been updated.
> 
> Regression tested an arm-eabi cross configured
> --with-arch=armv8.1-m.main+mve --with-float=hard. The patch has the
> following effect on the testsuite:
> 
> FAIL->PASS: c-c++-common/torture/vector-compare-1.c   -O0  execution test
> FAIL->PASS: gcc.c-torture/execute/pr92618.c   -O0  execution test
> FAIL->PASS: gcc.c-torture/execute/pr92618.c   -O1  execution test
> FAIL->PASS: gcc.c-torture/execute/pr92618.c   -O2  execution test
> FAIL->PASS: gcc.c-torture/execute/pr92618.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
> FAIL->PASS: gcc.c-torture/execute/pr92618.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
> FAIL->PASS: gcc.c-torture/execute/pr92618.c   -O3 -g  execution test
> FAIL->PASS: gcc.c-torture/execute/pr92618.c   -Os  execution test
> FAIL->PASS: gcc.dg/compat/vector-1 c_compat_x_tst.o-c_compat_y_tst.o execute
> FAIL->PASS: gcc.dg/torture/pr52407.c   -O0  execution test
> FAIL->PASS: gcc.dg/torture/pr52407.c   -O1  execution test
> FAIL->PASS: gcc.dg/torture/pr52407.c   -O2  execution test
> FAIL->PASS: gcc.dg/torture/pr52407.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
> FAIL->PASS: gcc.dg/torture/pr52407.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
> FAIL->PASS: gcc.dg/torture/pr52407.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
> FAIL->PASS: gcc.dg/torture/pr52407.c   -O3 -g  execution test
> FAIL->PASS: gcc.dg/torture/pr52407.c   -Os  execution test
> FAIL->PASS: gcc.dg/torture/pr57748-1.c   -O0  execution test
> FAIL->PASS: gcc.dg/torture/pr57748-2.c   -O0  execution test
> FAIL->PASS: gcc.dg/torture/pr57748-3.c   -O0  execution test
> FAIL->PASS: gcc.dg/torture/pr57748-4.c   -O0  execution test
> FAIL->PASS: gcc.dg/torture/pr58041.c   -O0  execution test
> FAIL->PASS: gcc.dg/torture/pr58041.c   -O1  execution test
> FAIL->PASS: gcc.dg/torture/pr58041.c   -O2  execution test
> FAIL->PASS: gcc.dg/torture/pr58041.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
> FAIL->PASS: gcc.dg/torture/pr58041.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
> FAIL->PASS: gcc.dg/torture/pr58041.c   -O3 -g  execution test FAIL->PASS: gcc.dg/torture/pr58041.c   -Os  execution test
> FAIL->PASS: gcc.dg/torture/pr61346.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
> FAIL->PASS: gcc.dg/torture/pr61346.c   -O3 -g  execution test
> FAIL->PASS: gcc.dg/torture/vshuf-v2df.c   -O2  execution test
> FAIL->PASS: gcc.dg/torture/vshuf-v2di.c   -O2  execution test
> 
> Bootstrap and regtest on arm-linux-gnueabihf in progress.
> 
> OK for trunk and eventual backports to 11 and 10 branches if regstrap
> looks good?
> 
> Thanks,
> Alex
> 
> gcc/ChangeLog:
> 
> 	PR target/99960
> 	* config/arm/mve.md (*mve_mov<mode>): Simplify output code. Use
> 	vldrw.u32 and vstrw.32 for V2D[IF]mode loads and stores.
> 
> gcc/testsuite/ChangeLog:
> 
> 	PR target/99960
> 	* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c:
> 	Update now that we're (correctly) using full 128-bit vector
> 	loads/stores.
> 	* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c:
> 	Likewise.
> 	* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s64.c:
> 	Likewise.
> 	* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u64.c:
> 	Likewise.
> 	* gcc.target/arm/mve/intrinsics/vuninitializedq_int.c: Likewise.
> 	* gcc.target/arm/mve/intrinsics/vuninitializedq_int1.c:
> 	Likewise.
Kyrylo Tkachov May 10, 2021, 8:39 a.m. UTC | #2
> -----Original Message-----
> From: Alex Coplan <Alex.Coplan@arm.com>
> Sent: 26 April 2021 11:15
> To: gcc-patches@gcc.gnu.org
> Cc: nickc@redhat.com; Richard Earnshaw <Richard.Earnshaw@arm.com>;
> Ramana Radhakrishnan <Ramana.Radhakrishnan@arm.com>; Kyrylo
> Tkachov <Kyrylo.Tkachov@arm.com>
> Subject: [PATCH] arm: Fix wrong code with MVE V2DImode loads and stores
> [PR99960]
> 
> Hi,
> 
> As the PR shows, we currently miscompile V2DImode loads and stores for
> MVE. We're currently using 64-bit loads/stores, but need to be using
> 128-bit vector loads and stores.
> 
> Some intrinsics tests were checking that we (incorrectly) used the
> 64-bit loads/stores: these have been updated.
> 
> Regression tested an arm-eabi cross configured
> --with-arch=armv8.1-m.main+mve --with-float=hard. The patch has the
> following effect on the testsuite:
> 
> FAIL->PASS: c-c++-common/torture/vector-compare-1.c   -O0  execution
> test
> FAIL->PASS: gcc.c-torture/execute/pr92618.c   -O0  execution test
> FAIL->PASS: gcc.c-torture/execute/pr92618.c   -O1  execution test
> FAIL->PASS: gcc.c-torture/execute/pr92618.c   -O2  execution test
> FAIL->PASS: gcc.c-torture/execute/pr92618.c   -O2 -flto -fno-use-linker-
> plugin -flto-partition=none  execution test
> FAIL->PASS: gcc.c-torture/execute/pr92618.c   -O2 -flto -fuse-linker-plugin -
> fno-fat-lto-objects  execution test
> FAIL->PASS: gcc.c-torture/execute/pr92618.c   -O3 -g  execution test
> FAIL->PASS: gcc.c-torture/execute/pr92618.c   -Os  execution test
> FAIL->PASS: gcc.dg/compat/vector-1 c_compat_x_tst.o-c_compat_y_tst.o
> execute
> FAIL->PASS: gcc.dg/torture/pr52407.c   -O0  execution test
> FAIL->PASS: gcc.dg/torture/pr52407.c   -O1  execution test
> FAIL->PASS: gcc.dg/torture/pr52407.c   -O2  execution test
> FAIL->PASS: gcc.dg/torture/pr52407.c   -O2 -flto -fno-use-linker-plugin -flto-
> partition=none  execution test
> FAIL->PASS: gcc.dg/torture/pr52407.c   -O2 -flto -fuse-linker-plugin -fno-fat-
> lto-objects  execution test
> FAIL->PASS: gcc.dg/torture/pr52407.c   -O3 -fomit-frame-pointer -funroll-
> loops -fpeel-loops -ftracer -finline-functions  execution test
> FAIL->PASS: gcc.dg/torture/pr52407.c   -O3 -g  execution test
> FAIL->PASS: gcc.dg/torture/pr52407.c   -Os  execution test
> FAIL->PASS: gcc.dg/torture/pr57748-1.c   -O0  execution test
> FAIL->PASS: gcc.dg/torture/pr57748-2.c   -O0  execution test
> FAIL->PASS: gcc.dg/torture/pr57748-3.c   -O0  execution test
> FAIL->PASS: gcc.dg/torture/pr57748-4.c   -O0  execution test
> FAIL->PASS: gcc.dg/torture/pr58041.c   -O0  execution test
> FAIL->PASS: gcc.dg/torture/pr58041.c   -O1  execution test
> FAIL->PASS: gcc.dg/torture/pr58041.c   -O2  execution test
> FAIL->PASS: gcc.dg/torture/pr58041.c   -O2 -flto -fno-use-linker-plugin -flto-
> partition=none  execution test
> FAIL->PASS: gcc.dg/torture/pr58041.c   -O2 -flto -fuse-linker-plugin -fno-fat-
> lto-objects  execution test
> FAIL->PASS: gcc.dg/torture/pr58041.c   -O3 -g  execution test FAIL->PASS:
> gcc.dg/torture/pr58041.c   -Os  execution test
> FAIL->PASS: gcc.dg/torture/pr61346.c   -O3 -fomit-frame-pointer -funroll-
> loops -fpeel-loops -ftracer -finline-functions  execution test
> FAIL->PASS: gcc.dg/torture/pr61346.c   -O3 -g  execution test
> FAIL->PASS: gcc.dg/torture/vshuf-v2df.c   -O2  execution test
> FAIL->PASS: gcc.dg/torture/vshuf-v2di.c   -O2  execution test
> 
> Bootstrap and regtest on arm-linux-gnueabihf in progress.
> 
> OK for trunk and eventual backports to 11 and 10 branches if regstrap
> looks good?

Ok. I had looked at it earlier but had forgotten to reply...
Thanks,
Kyrill

> 
> Thanks,
> Alex
> 
> gcc/ChangeLog:
> 
> 	PR target/99960
> 	* config/arm/mve.md (*mve_mov<mode>): Simplify output code.
> Use
> 	vldrw.u32 and vstrw.32 for V2D[IF]mode loads and stores.
> 
> gcc/testsuite/ChangeLog:
> 
> 	PR target/99960
> 	* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c:
> 	Update now that we're (correctly) using full 128-bit vector
> 	loads/stores.
> 	* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c:
> 	Likewise.
> 	* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s64.c:
> 	Likewise.
> 	* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u64.c:
> 	Likewise.
> 	* gcc.target/arm/mve/intrinsics/vuninitializedq_int.c: Likewise.
> 	* gcc.target/arm/mve/intrinsics/vuninitializedq_int1.c:
> 	Likewise.
diff mbox series

Patch

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 7467d5f4d57..5c11885fb73 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -41,44 +41,19 @@  (define_insn "*mve_mov<mode>"
 
   if (which_alternative == 4 || which_alternative == 7)
     {
-      rtx ops[2];
-      int regno = (which_alternative == 7)
-		  ? REGNO (operands[1]) : REGNO (operands[0]);
-
-      ops[0] = operands[0];
-      ops[1] = operands[1];
-      if (<MODE>mode == V2DFmode || <MODE>mode == V2DImode)
-	{
-	  if (which_alternative == 7)
-	    {
-	      ops[1] = gen_rtx_REG (DImode, regno);
-	      output_asm_insn ("vstr.64\t%P1, %E0",ops);
-	    }
-	  else
-	    {
-	      ops[0] = gen_rtx_REG (DImode, regno);
-	      output_asm_insn ("vldr.64\t%P0, %E1",ops);
-	    }
-	}
-      else if (<MODE>mode == TImode)
+      if (<MODE>mode == V2DFmode || <MODE>mode == V2DImode || <MODE>mode == TImode)
 	{
 	  if (which_alternative == 7)
-	    output_asm_insn ("vstr.64\t%q1, %E0",ops);
+	    output_asm_insn ("vstrw.32\t%q1, %E0", operands);
 	  else
-	    output_asm_insn ("vldr.64\t%q0, %E1",ops);
+	    output_asm_insn ("vldrw.u32\t%q0, %E1",operands);
 	}
       else
 	{
 	  if (which_alternative == 7)
-	    {
-	      ops[1] = gen_rtx_REG (TImode, regno);
-	      output_asm_insn ("vstr<V_sz_elem1>.<V_sz_elem>\t%q1, %E0",ops);
-	    }
+	    output_asm_insn ("vstr<V_sz_elem1>.<V_sz_elem>\t%q1, %E0", operands);
 	  else
-	    {
-	      ops[0] = gen_rtx_REG (TImode, regno);
-	      output_asm_insn ("vldr<V_sz_elem1>.<V_sz_elem>\t%q0, %E1",ops);
-	    }
+	    output_asm_insn ("vldr<V_sz_elem1>.<V_sz_elem>\t%q0, %E1", operands);
 	}
       return "";
     }
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c
index 7420d0198e7..a9b1f81b62d 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c
@@ -11,6 +11,6 @@  foo (uint64x2_t * addr)
 }
 
 /* { dg-final { scan-assembler "vldrd.64\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
-/* { dg-final { scan-assembler-times "vldr.64" 1 } } */
-/* { dg-final { scan-assembler-times "vstr.64" 1 } } */
+/* { dg-final { scan-assembler-times "vldrw.u32" 1 } } */
+/* { dg-final { scan-assembler-times "vstrw.32" 1 } } */
 /* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c
index ebe5b2fd70c..e32a06695ae 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c
@@ -11,6 +11,6 @@  foo (uint64x2_t * addr)
 }
 
 /* { dg-final { scan-assembler "vldrd.64\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
-/* { dg-final { scan-assembler-times "vldr.64" 1 } } */
-/* { dg-final { scan-assembler-times "vstr.64" 1 } } */
+/* { dg-final { scan-assembler-times "vldrw.u32" 1 } } */
+/* { dg-final { scan-assembler-times "vstrw.32" 1 } } */
 /* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s64.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s64.c
index 231a24a1e55..bb06cf88e32 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s64.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s64.c
@@ -10,6 +10,6 @@  int64x2_t foo (uint64x2_t * addr, mve_pred16_t p)
 
 /* { dg-final { scan-assembler "vpst" } } */
 /* { dg-final { scan-assembler "vldrdt.u64\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
-/* { dg-final { scan-assembler-times "vldr.64" 1 } } */
-/* { dg-final { scan-assembler-times "vstr.64" 1 } } */
+/* { dg-final { scan-assembler-times "vldrw.u32" 1 } } */
+/* { dg-final { scan-assembler-times "vstrw.32" 1 } } */
 /* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u64.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u64.c
index b8d9b5c1391..558115d49ef 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u64.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u64.c
@@ -10,6 +10,6 @@  uint64x2_t foo (uint64x2_t * addr, mve_pred16_t p)
 
 /* { dg-final { scan-assembler "vpst" } } */
 /* { dg-final { scan-assembler "vldrdt.u64\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
-/* { dg-final { scan-assembler-times "vldr.64" 1 } } */
-/* { dg-final { scan-assembler-times "vstr.64" 1 } } */
+/* { dg-final { scan-assembler-times "vldrw.u32" 1 } } */
+/* { dg-final { scan-assembler-times "vstrw.32" 1 } } */
 /* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int.c
index bf6692fe573..cc5e6358bee 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int.c
@@ -27,6 +27,5 @@  foo ()
 
 /* { dg-final { scan-assembler-times "vstrb.8" 2 } } */
 /* { dg-final { scan-assembler-times "vstrh.16" 2 } } */
-/* { dg-final { scan-assembler-times "vstrw.32" 2 } } */
-/* { dg-final { scan-assembler-times "vstr.64" 2 } } */
+/* { dg-final { scan-assembler-times "vstrw.32" 4 } } */
 /* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int1.c
index 4f66a07ac29..bfeb52b4dd6 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int1.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int1.c
@@ -28,6 +28,5 @@  foo ()
 
 /* { dg-final { scan-assembler-times "vstrb.8" 2 } } */
 /* { dg-final { scan-assembler-times "vstrh.16" 2 } } */
-/* { dg-final { scan-assembler-times "vstrw.32" 2 } } */
-/* { dg-final { scan-assembler-times "vstr.64" 2 } } */
+/* { dg-final { scan-assembler-times "vstrw.32" 4 } } */
 /* { dg-final { scan-assembler-not "__ARM_undef" } } */