diff mbox

[AArch64] PR63870 Improve error messages for NEON single lane memory access intrinsics

Message ID 5575615E.30509@arm.com
State New
Headers show

Commit Message

Alan Lawrence June 8, 2015, 9:33 a.m. UTC
Thanks for working on this!

I'd been fiddling around with a patch with some similar elements to this, but 
many trials with union types, subregs, etc., all worsened the register 
allocation and led to more unnecessary shuffling / moves. The only real thing I 
tried which you don't do here, was to introduce a set_dreg expander to clean up 
some of those macro definitions in arm_neon.h. That could easily follow in a 
separate patch if desired!

So your patch looks good to me.

A couple of style nits:


also the dg-error messages in the testsuite, do not need to be on the same line 
as the statement generating the error, because the trailing 0 tells dg that the 
position/line number doesn't matter (i.e. dg should allow the error to be 
reported at any line); so these could be brought under 80 chars.

Thanks, Alan

Charles Baylis wrote:
> This is another attempt at fixing this PR63870 for AArch64 (ARM is
> still to come).
> 
> As before, the Q register variants are handled by moving the check for
> the lane bounds into builtin expansion. The handling of lane numbers
> is made consistent wrt endianess with other NEON single lane
> operations - lane numbers in RTL are flipped for big-endian, and
> flipped back at assembly time.
> 
> The D register variants are now handled by adding new builtins for all
> the 64bit operations. These behave identically to Q register variants,
> except that the permitted lane bounds are different.
> 
> In the iterators used by the relevant patterns are changed from VQ and
> VALLDIF so that the correct vector sizes are used in the endian-flip
> at assembly time.
> 
> Finally, a set of machine-generated test cases is added. These do need
> to be in separate files, because of testsuite limitations.
> 
> Regression tested on qemu for aarch64-linux-gnu with no regressions
> and all new tests pass.
> 
> OK for trunk?
> 
> 
> gcc/ChangeLog:
> 
> <DATE>  Charles Baylis  <charles.baylis@linaro.org>
> 
>     PR target/63870
>     * config/aarch64/aarch64-builtins.c (enum aarch64_type_qualifiers):
>     Add qualifier_struct_load_store_lane_index.
>     (aarch64_types_loadstruct_lane_qualifiers): Use
>     qualifier_struct_load_store_lane_index for lane index argument for
>     last argument.
>     (aarch64_types_storestruct_lane_qualifiers): Ditto.
>     (builtin_simd_arg): Add SIMD_ARG_STRUCT_LOAD_STORE_LANE_INDEX.
>     (aarch64_simd_expand_args): Add new argument describing mode of
>     builtin. Check lane bounds for arguments with
>     SIMD_ARG_STRUCT_LOAD_STORE_LANE_INDEX.
>     (aarch64_simd_expand_builtin): Emit error for incorrect lane indices
>     if marked with SIMD_ARG_STRUCT_LOAD_STORE_LANE_INDEX.
>     (aarch64_simd_expand_builtin): Handle arguments with
>     qualifier_struct_load_store_lane_index. Pass machine mode of builtin to
>     aarch64_simd_expand_args.
>     * config/aarch64/aarch64-simd-builtins.def: Declare ld[234]_lane and
>     vst[234]_lane with BUILTIN_VALLDIF.
>     * config/aarch64/aarch64-simd.md:
>     (aarch64_vec_load_lanesoi_lane<mode>): Use VALLDIF iterator. Perform
>     endianness reversal on lane index.
>     (aarch64_vec_load_lanesci_lane<mode>): Ditto.
>     (aarch64_vec_load_lanesxi_lane<mode>): Ditto.
>     (vec_store_lanesoi_lane<mode>): Use VALLDIF iterator. Fix typo
>     in attribute.
>     (vec_store_lanesci_lane<mode>): Use VALLDIF iterator.
>     (vec_store_lanesxi_lane<mode>): Ditto.
>     (aarch64_ld2_lane<mode>): Use VALLDIF iterator. Remove endianness
>     reversal of lane index.
>     (aarch64_ld3_lane<mode>): Ditto.
>     (aarch64_ld4_lane<mode>): Ditto.
>     (aarch64_st2_lane<mode>): Ditto.
>     (aarch64_st3_lane<mode>): Ditto.
>     (aarch64_st4_lane<mode>): Ditto.
>     * config/aarch64/arm_neon.h (__LD2_LANE_FUNC): Rename mode parameter
>     to qmode. Add new mode parameter. Update uses.
>     (__LD3_LANE_FUNC): Ditto.
>     (__LD4_LANE_FUNC): Ditto.
>     (__ST2_LANE_FUNC): Ditto.
>     (__ST3_LANE_FUNC): Ditto.
>     (__ST4_LANE_FUNC): Ditto.
> 
> 
> <DATE>  Charles Baylis  <charles.baylis@linaro.org>
> 
>     * gcc.target/aarch64/simd/vld2_lane_f32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2_lane_f64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2_lane_p8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2_lane_s16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2_lane_s32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2_lane_s64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2_lane_s8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2_lane_u16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2_lane_u32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2_lane_u64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2_lane_u8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2q_lane_f32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2q_lane_f64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2q_lane_p8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2q_lane_s16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2q_lane_s32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2q_lane_s64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2q_lane_s8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2q_lane_u16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2q_lane_u32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2q_lane_u64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld2q_lane_u8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3_lane_f32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3_lane_f64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3_lane_p8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3_lane_s16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3_lane_s32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3_lane_s64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3_lane_s8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3_lane_u16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3_lane_u32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3_lane_u64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3_lane_u8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3q_lane_f32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3q_lane_f64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3q_lane_p8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3q_lane_s16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3q_lane_s32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3q_lane_s64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3q_lane_s8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3q_lane_u16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3q_lane_u32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3q_lane_u64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld3q_lane_u8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4_lane_f32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4_lane_f64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4_lane_p8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4_lane_s16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4_lane_s32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4_lane_s64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4_lane_s8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4_lane_u16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4_lane_u32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4_lane_u64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4_lane_u8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4q_lane_f32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4q_lane_f64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4q_lane_p8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4q_lane_s16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4q_lane_s32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4q_lane_s64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4q_lane_s8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4q_lane_u16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4q_lane_u32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4q_lane_u64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vld4q_lane_u8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2_lane_f32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2_lane_f64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2_lane_p8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2_lane_s16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2_lane_s32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2_lane_s64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2_lane_s8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2_lane_u16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2_lane_u32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2_lane_u64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2_lane_u8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2q_lane_f32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2q_lane_f64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2q_lane_p8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2q_lane_s16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2q_lane_s32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2q_lane_s64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2q_lane_s8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2q_lane_u16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2q_lane_u32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2q_lane_u64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst2q_lane_u8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3_lane_f32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3_lane_f64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3_lane_p8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3_lane_s16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3_lane_s32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3_lane_s64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3_lane_s8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3_lane_u16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3_lane_u32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3_lane_u64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3_lane_u8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3q_lane_f32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3q_lane_f64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3q_lane_p8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3q_lane_s16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3q_lane_s32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3q_lane_s64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3q_lane_s8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3q_lane_u16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3q_lane_u32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3q_lane_u64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst3q_lane_u8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4_lane_f32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4_lane_f64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4_lane_p8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4_lane_s16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4_lane_s32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4_lane_s64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4_lane_s8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4_lane_u16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4_lane_u32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4_lane_u64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4_lane_u8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4q_lane_f32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4q_lane_f64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4q_lane_p8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4q_lane_s16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4q_lane_s32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4q_lane_s64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4q_lane_s8_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4q_lane_u16_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4q_lane_u32_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4q_lane_u64_indices_1.c: New test.
>     * gcc.target/aarch64/simd/vst4q_lane_u8_indices_1.c: New test.

Comments

Charles Baylis June 10, 2015, 9:13 a.m. UTC | #1
On 8 June 2015 at 10:33, Alan Lawrence <alan.lawrence@arm.com> wrote:
> Thanks for working on this!
>
> I'd been fiddling around with a patch with some similar elements to this,
> but many trials with union types, subregs, etc., all worsened the register
> allocation and led to more unnecessary shuffling / moves.

Kugan has been looking into this at Linaro. We should avoid
duplicating effort here.

> The only real
> thing I tried which you don't do here, was to introduce a set_dreg expander
> to clean up some of those macro definitions in arm_neon.h. That could easily
> follow in a separate patch if desired!

I'd prefer that to be a separate step.

> So your patch looks good to me.
>
> A couple of style nits:
>
> --- a/gcc/config/aarch64/aarch64-builtins.c
> +++ b/gcc/config/aarch64/aarch64-builtins.c
> @@ -128,7 +128,9 @@ enum aarch64_type_qualifiers
>    /* Polynomial types.  */
>    qualifier_poly = 0x100,
>    /* Lane indices - must be in range, and flipped for bigendian.  */
> -  qualifier_lane_index = 0x200
> +  qualifier_lane_index = 0x200,
> +  /* Lane indices for single lane structure loads and stores */
> +  qualifier_struct_load_store_lane_index = 0x400
>  };
>
> should be ...'loads and stores.  */'
>
> also the dg-error messages in the testsuite, do not need to be on the same
> line as the statement generating the error, because the trailing 0 tells dg
> that the position/line number doesn't matter (i.e. dg should allow the error
> to be reported at any line); so these could be brought under 80 chars.

OK, thanks. I'll re-spin once I've tested on big endian.

> Oh, have you tested bigendian?

I have started a bigendian build on our validation infrastructure here.

Thanks for the review
Charles
Alan Lawrence June 10, 2015, 12:18 p.m. UTC | #2
Charles Baylis wrote:
> On 8 June 2015 at 10:33, Alan Lawrence <alan.lawrence@arm.com> wrote:
>> Thanks for working on this!
>>
>> I'd been fiddling around with a patch with some similar elements to this,
>> but many trials with union types, subregs, etc., all worsened the register
>> allocation and led to more unnecessary shuffling / moves.
> 
> Kugan has been looking into this at Linaro. We should avoid
> duplicating effort here.

Yes. I stopped short of looking into the internals of the register allocator, 
although I believe any proper solution is going to have to make changes here. 
However, I am working on (/nearly finished, just some tidying!) a patch series 
to add D-registers to TARGET_ARRAY_MODE_SUPPORTED_P, which may help matters.

>> The only real
>> thing I tried which you don't do here, was to introduce a set_dreg expander
>> to clean up some of those macro definitions in arm_neon.h. That could easily
>> follow in a separate patch if desired!
> 
> I'd prefer that to be a separate step.

Sure. (*If* we go that route - I hope to have another look after 
aarch64_array_mode_supported_p).

Cheers,
Alan
diff mbox

Patch

--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -128,7 +128,9 @@  enum aarch64_type_qualifiers
    /* Polynomial types.  */
    qualifier_poly = 0x100,
    /* Lane indices - must be in range, and flipped for bigendian.  */
-  qualifier_lane_index = 0x200
+  qualifier_lane_index = 0x200,
+  /* Lane indices for single lane structure loads and stores */
+  qualifier_struct_load_store_lane_index = 0x400
  };

should be ...'loads and stores.  */'