Message ID | AM5PR0802MB2610FEC899BA4DC8E9E6AF5C83AD0@AM5PR0802MB2610.eurprd08.prod.outlook.com |
---|---|
State | New |
On Fri, Oct 28, 2016 at 04:54:05PM +0100, Wilco Dijkstra wrote:
> James Greenhalgh wrote:
> > On Wed, Oct 26, 2016 at 12:11:44PM +0000, Wilco Dijkstra wrote:
> > > Add a SHA1H pattern with a V2SI input. This avoids unnecessary
> > > DUPs when using intrinsics like vsha1h_u32 (vgetq_lane_u32 (x, 0)).
> >
> > I think this is incorrect for big endian - element 0 of a vec_select in
> > big-endian for V4SImode is the high 32-bits (i.e. bits 96-127 of the
> > architected register). I think you'd need two patterns, one as below for
> > !BYTES_BIG_ENDIAN, and one selecting element 3 for BYTES_BIG_ENDIAN.
>
> Yes that's true, big-endian SIMD works in mysterious ways... Here is the updated
> patch (tested on aarch64_be-none-elf too):
>
> Add LE/BE SHA1H patterns with a V2SI input. This avoids unnecessary
> DUPs when using intrinsics like vsha1h_u32 (vgetq_lane_u32 (x, 0)).

Thanks, this respin looks OK to me.

James

> ChangeLog:
> 2016-10-28  Wilco Dijkstra  <wdijkstr@arm.com>
>
>     * config/aarch64/aarch64-simd.md (aarch64_crypto_sha1hv4si): New pattern.
>     (aarch64_be_crypto_sha1hv4si): New pattern.
> --
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 9ce7f00050913aebd9f83ae9c4ce4ad469dd0d98..89bdcb3f7ed53d092dd95c81fe4a15fb15dc907c 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -5705,6 +5705,26 @@
   [(set_attr "type" "crypto_sha1_fast")]
 )
 
+(define_insn "aarch64_crypto_sha1hv4si"
+  [(set (match_operand:SI 0 "register_operand" "=w")
+        (unspec:SI [(vec_select:SI (match_operand:V4SI 1 "register_operand" "w")
+                     (parallel [(const_int 0)]))]
+         UNSPEC_SHA1H))]
+  "TARGET_SIMD && TARGET_CRYPTO && !BYTES_BIG_ENDIAN"
+  "sha1h\\t%s0, %s1"
+  [(set_attr "type" "crypto_sha1_fast")]
+)
+
+(define_insn "aarch64_be_crypto_sha1hv4si"
+  [(set (match_operand:SI 0 "register_operand" "=w")
+        (unspec:SI [(vec_select:SI (match_operand:V4SI 1 "register_operand" "w")
+                     (parallel [(const_int 3)]))]
+         UNSPEC_SHA1H))]
+  "TARGET_SIMD && TARGET_CRYPTO && BYTES_BIG_ENDIAN"
+  "sha1h\\t%s0, %s1"
+  [(set_attr "type" "crypto_sha1_fast")]
+)
+
 (define_insn "aarch64_crypto_sha1su1v4si"
   [(set (match_operand:V4SI 0 "register_operand" "=w")
         (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "0")
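
For reference, the intrinsic sequence that the new patterns are meant to match, vsha1h_u32 (vgetq_lane_u32 (x, 0)), can be sketched in C roughly as follows. This is an illustration only and not part of the patch. The helper names sha1h_model, sha1h_lane0 and sha1h_self_check are hypothetical. The portable fallback assumes only the architectural behaviour of SHA1H (a fixed rotate left by 30 bits), so the sketch also builds on targets without the crypto extension.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_FEATURE_CRYPTO)
#include <arm_neon.h>
#define USE_SHA1H_INTRINSIC 1
#endif

/* Portable model of SHA1H: a fixed rotate left by 30 bits.
   Hypothetical helper for illustration, not part of the patch.  */
static uint32_t sha1h_model(uint32_t x)
{
    return (uint32_t)((x << 30) | (x >> 2));
}

/* Apply SHA1H to intrinsic lane 0 of a four-lane vector.
   Hypothetical helper: with the crypto extension available it uses the
   exact intrinsic sequence quoted in the thread, otherwise it falls
   back to the portable model above.  */
static uint32_t sha1h_lane0(const uint32_t lanes[4])
{
#ifdef USE_SHA1H_INTRINSIC
    uint32x4_t x = vld1q_u32(lanes);   /* lanes[0] is loaded into lane 0 */
    return vsha1h_u32(vgetq_lane_u32(x, 0));
#else
    return sha1h_model(lanes[0]);
#endif
}

/* Small consistency check between the two paths (hypothetical).  */
static void sha1h_self_check(void)
{
    const uint32_t lanes[4] = { 1u, 2u, 3u, 4u };
    assert(sha1h_lane0(lanes) == sha1h_model(lanes[0]));
}
```

As the thread describes, the point of the patch is to avoid an unnecessary DUP for this sequence, with the register-level element index being 0 for little-endian and 3 for big-endian. The intrinsic lane number in the C source stays 0 in both cases.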