From patchwork Mon Jan 18 12:14:41 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alan Lawrence <alan.lawrence@arm.com>
X-Patchwork-Id: 569533
Return-Path: 
 <gcc-patches-return-419334-incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
	bits)) (No client certificate requested)
	by ozlabs.org (Postfix) with ESMTPS id C1A09140AD9
	for <incoming@patchwork.ozlabs.org>;
	Mon, 18 Jan 2016 23:15:04 +1100 (AEDT)
Authentication-Results: ozlabs.org; dkim=pass (1024-bit key;
	unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org
	header.b=Zyyn/CC6; dkim-atps=neutral
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender:from
	:to:cc:subject:date:message-id:in-reply-to:references; q=dns; s=
	default; b=HI5ApZQUrTfmEqc/KsJaw7Av52qU3H15XNLKZ9sETJfVbwC6N82bu
	uebNDpPbqHB79LWiL5iSpGwRrtiK+VTP9JcDULURAG04nMQ0TwisxHdnCAq5j8wi
	6tGHK4AhszujMv2D+wiah9u+9iWmHLia4FCOB6WcvJlbp9DZOMJYp8=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender:from
	:to:cc:subject:date:message-id:in-reply-to:references; s=
	default; bh=DTEfyg0qU7yi+zSRQnYC5NS2HtE=; b=Zyyn/CC6cZQ/cCb8j5oq
	UjIWkpTWD0I3GzB0jDFHRgHJbwjObncx12f7FEY5Hdjk7e1P/mvhlSTRC3Glm7dV
	MW8hzDplX6xr3+sBa4SlsHylFqJWH/l2SN3UlAMEmmUOMW/bX241ErAV6l6FsbmR
	p2UUn0Azf90fIMiAoWIoBsM=
Received: (qmail 90157 invoked by alias); 18 Jan 2016 12:14:57 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Unsubscribe: 
 <mailto:gcc-patches-unsubscribe-incoming=patchwork.ozlabs.org@gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Received: (qmail 90146 invoked by uid 89); 18 Jan 2016 12:14:55 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=BAYES_00,
	RP_MATCHES_RCVD,
	SPF_PASS autolearn=ham version=3.3.2 spammy=599, 1607, 160, 7,
	machine_mode
X-HELO: foss.arm.com
Received: from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by
	sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP;
	Mon, 18 Jan 2016 12:14:54 +0000
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249])	by
	usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id
	BB25848D; Mon, 18 Jan 2016 04:14:15 -0800 (PST)
Received: from arm.com (e105915-lin.emea.arm.com [10.2.206.30])	by
	usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id
	AE7FB3F246; Mon, 18 Jan 2016 04:14:52 -0800 (PST)
From: Alan Lawrence <alan.lawrence@arm.com>
To: gcc-patches@gcc.gnu.org
Cc: kyrylo.tkachov@arm.com,	ramana.radhakrishnan@arm.com
Subject: [PATCH][ARM] Add movv4hf/v8hf expanders & later insns;
	disable VnHF immediates.
Date: Mon, 18 Jan 2016 12:14:41 +0000
Message-Id: <1453119281-14441-2-git-send-email-alan.lawrence@arm.com>
In-Reply-To: <1453119281-14441-1-git-send-email-alan.lawrence@arm.com>
References: <1453119281-14441-1-git-send-email-alan.lawrence@arm.com>
X-IsSubscribed: yes

This fixes ICEs on armeb for float16x[48]_t vectors, e.g. in
check_effective_target_arm_neon_fp_16_ok.

At present, without the expander, moving v4hf/v8hf values around is done
via subregs. On armeb, this ICEs because REG_CANNOT_CHANGE_MODE_P. (On arm-*,
moving via two subregs is less efficient than one native insn!)

However, adding the expanders, reveals a latent bug in the V4HF variant of
*neon_mov, that vector constants are not handled properly in the
neon_valid_immediate code. Hence, for now I've used a separate expander that
disallows immediates, and disabled VnHF vectors as immediates in
neon_valid_immediate_for_move; I'll file a PR for this.

Also to fix the advsimd-intrinsics/vcombine test I had to add HF vector modes to
the VX iterator and V_reg attribute, for vdup_n, as loading a vector of
identical HF elements is now done by loading the scalar + vdup rather than
forcing the vector out to the constant pool.

On armeb, one of the ICEs this fixes, is in the test program for
check_effective_target_arm_neon_fp_16_ok. This means the advsimd-intrinsics
vcvt_f16 test now runs (and passes), and also that the other tests now run
with neon-fp16, rather than only neon as previously (on armeb).
This reveals that the fp16 cases of vld1_lane and vset_lane are (and were)
failing. Since those tests would previously have failed *if fp16 had been
passed in*, I think this is still a step forward; one can still run the tests
with an explicit non-fp16 multilib if the old behaviour is desired.

Note the previous patch removes other uses of VQXMOV (not strictly a dependency,
generating V4HF/V8HF reinterpret patterns is harmless, they just aren't used).

Bootstrapped + check-gcc on arm-none-linux-gnueabihf;
cross-tested armeb-none-eabi.

gcc/ChangeLog:

	* config/arm/arm.c (neon_valid_immediate): Disallow vectors of HFmode.
	* config/arm/iterators.md (V_HF): New.
	(VQXMOV): Add V8HF.
	(VX): Add V4HF, V8HF.
	(V_reg): Add cases for V4HF, V8HF.
	* config/arm/vec-common.md (mov<mode> V_HF): New.
---
 gcc/config/arm/arm.c         |  2 ++
 gcc/config/arm/iterators.md  |  8 ++++++--
 gcc/config/arm/vec-common.md | 20 ++++++++++++++++++++
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 3276b03..4fdba38 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -12371,6 +12371,8 @@ neon_valid_immediate (rtx op, machine_mode mode, int inverse,
   /* Vectors of float constants.  */
   if (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT)
     {
+      if (GET_MODE_INNER (mode) == HFmode)
+	return -1;
       rtx el0 = CONST_VECTOR_ELT (op, 0);
       const REAL_VALUE_TYPE *r0;
 
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 974cf51..c5db868 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -59,6 +59,9 @@
 ;; Integer and float modes supported by Neon and IWMMXT.
 (define_mode_iterator VALL [V2DI V2SI V4HI V8QI V2SF V4SI V8HI V16QI V4SF])
 
+;; Vectors of half-precision floats.
+(define_mode_iterator V_HF [V4HF V8HF])
+
 ;; Integer and float modes supported by Neon and IWMMXT, except V2DI.
 (define_mode_iterator VALLW [V2SI V4HI V8QI V2SF V4SI V8HI V16QI V4SF])
 
@@ -99,7 +102,7 @@
 (define_mode_iterator VQI [V16QI V8HI V4SI])
 
 ;; Quad-width vector modes, with TImode added, for moves.
-(define_mode_iterator VQXMOV [V16QI V8HI V4SI V4SF V2DI TI])
+(define_mode_iterator VQXMOV [V16QI V8HI V8HF V4SI V4SF V2DI TI])
 
 ;; Opaque structure types wider than TImode.
 (define_mode_iterator VSTRUCT [EI OI CI XI])
@@ -160,7 +163,7 @@
 (define_mode_iterator VMDQI [V4HI V2SI V8HI V4SI])
 
 ;; Modes with 8-bit and 16-bit elements.
-(define_mode_iterator VX [V8QI V4HI V16QI V8HI])
+(define_mode_iterator VX [V8QI V4HI V4HF V16QI V8HI V8HF])
 
 ;; Modes with 8-bit elements.
 (define_mode_iterator VE [V8QI V16QI])
@@ -428,6 +431,7 @@
 ;; Register width from element mode
 (define_mode_attr V_reg [(V8QI "P") (V16QI "q")
                          (V4HI "P") (V8HI  "q")
+			 (V4HF "P") (V8HF  "q")
                          (V2SI "P") (V4SI  "q")
                          (V2SF "P") (V4SF  "q")
                          (DI   "P") (V2DI  "q")
diff --git a/gcc/config/arm/vec-common.md b/gcc/config/arm/vec-common.md
index ce98f71..c27578a 100644
--- a/gcc/config/arm/vec-common.md
+++ b/gcc/config/arm/vec-common.md
@@ -38,6 +38,26 @@
     }
 })
 
+;; This exists separately from the above pattern to exclude an immediate RHS.
+
+(define_expand "mov<mode>"
+  [(set (match_operand:V_HF 0 "nonimmediate_operand" "")
+	(match_operand:V_HF 1 "nonimmediate_operand" ""))]
+  "TARGET_NEON
+   || (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))"
+{
+  if (can_create_pseudo_p ())
+    {
+      if (!REG_P (operands[0]))
+	operands[1] = force_reg (<MODE>mode, operands[1]);
+      else if (TARGET_NEON && CONSTANT_P (operands[1]))
+	{
+	  operands[1] = neon_make_constant (operands[1]);
+	  gcc_assert (operands[1] != NULL_RTX);
+	}
+    }
+})
+
 ;; Vector arithmetic. Expanders are blank, then unnamed insns implement
 ;; patterns separately for IWMMXT and Neon.