From patchwork Wed Jul  9 16:02:25 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ulrich Weigand <uweigand@de.ibm.com>
X-Patchwork-Id: 368255
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Unsubscribe: 
 <mailto:gcc-patches-unsubscribe-incoming=patchwork.ozlabs.org@gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Message-Id: <201407091602.s69G2QOm014547@d06av02.portsmouth.uk.ibm.com>
Subject: [PATCH, rs6000] Fix ELFv2 homogeneous float aggregate ABI bug
To: gcc-patches@gcc.gnu.org
Date: Wed, 9 Jul 2014 18:02:25 +0200 (CEST)
From: "Ulrich Weigand" <uweigand@de.ibm.com>
Cc: dje.gcc@gmail.com

Hello,

the implementation of homogeneous float aggregates for the ELFv2 ABI has
unfortunately turned out to have a bug in a corner case.

The problem is that because such aggregates are packed in the argument
save area, but each (4-byte) float occupies one of just 13 registers
on its own, we may run out of registers while we're still within the
first 64 bytes of the argument save area.

Usually, any argument that doesn't fit into registers should go in
memory.  But that rule doesn't apply within the first 64 bytes, where
such arguments need to go into GPRs first.  This is important since
the ABI guarantees that the first 64 bytes of the save area are free,
e.g. to store GPRs into.  If an argument is actually passed within
those first 64 bytes, some code (e.g. libffi assembler stubs) may
clobber its contents.

Now, the existing rs6000_function_arg code will handle this case
correctly if the extra floats come in a *new* argument.  For example,
in the following test case

struct float2 { float x[2]; };
struct float6 { float x[6]; };
struct float8 { float x[8]; };

float func (struct float8 a, struct float6 b, struct float2 c);

both parts of "c" are correctly expected in r10.

However, the code incorrectly handles the case where a *single*
aggregate argument is split between FPRs and "extra" floats.
For example, "b.x[5]" is expected in memory, although it ought
to reside in r9.

The appended patch fixes the implementation to comply with the ABI.

This is an ABI change for the affected corner cases of the ELFv2
ABI.  However, those cases should be extremely rare; the full
compat.exp and struct-layout-1.exp ABI compatibility test suites
passed, with the exception of two tests specifically intended
to test multiple homogeneous float aggregates.

Tested on powerpc64le-linux.

OK for mainline?
[ The patch should then also go into the 4.8 and 4.9 branches for
consistency. ]

Bye,
Ulrich


ChangeLog:

        * config/rs6000/rs6000.c (rs6000_function_arg): If a float argument
	does not fit fully into floating-point registers, and there is still
	space in the register parameter area, use GPRs to pass those parts
	of the argument.
        (rs6000_arg_partial_bytes): Likewise.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 212100)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -10227,6 +10227,7 @@
 	  rtx r, off;
 	  int i, k = 0;
 	  unsigned long n_fpreg = (GET_MODE_SIZE (elt_mode) + 7) >> 3;
+	  int fpr_words;
 
 	  /* Do we also need to pass this argument in the parameter
 	     save area?  */
@@ -10255,6 +10256,37 @@
 	      rvec[k++] = gen_rtx_EXPR_LIST (VOIDmode, r, off);
 	    }
 
+	  /* If there were not enough FPRs to hold the argument, the rest
+	     usually goes into memory.  However, if the current position
+	     is still within the register parameter area, a portion may
+	     actually have to go into GPRs.
+
+	     Note that it may happen that the portion of the argument
+	     passed in the first "half" of the first GPR was already
+	     passed in the last FPR as well.
+
+	     For unnamed arguments, we already set up GPRs to cover the
+	     whole argument in rs6000_psave_function_arg, so there is
+	     nothing further to do at this point.  */
+	  fpr_words = (i * GET_MODE_SIZE (elt_mode)) / (TARGET_32BIT ? 4 : 8);
+	  if (i < n_elts && align_words + fpr_words < GP_ARG_NUM_REG
+	      && cum->nargs_prototype > 0)
+            {
+	      enum machine_mode rmode = TARGET_32BIT ? SImode : DImode;
+	      int n_words = rs6000_arg_size (mode, type);
+
+	      align_words += fpr_words;
+	      n_words -= fpr_words;
+
+	      do
+		{
+		  r = gen_rtx_REG (rmode, GP_ARG_MIN_REG + align_words);
+		  off = GEN_INT (fpr_words++ * GET_MODE_SIZE (rmode));
+		  rvec[k++] = gen_rtx_EXPR_LIST (VOIDmode, r, off);
+		}
+	      while (++align_words < GP_ARG_NUM_REG && --n_words != 0);
+	    }
+
 	  return rs6000_finish_function_arg (mode, rvec, k);
 	}
       else if (align_words < GP_ARG_NUM_REG)
@@ -10330,8 +10362,23 @@
       /* Otherwise, we pass in FPRs only.  Check for partial copies.  */
       passed_in_gprs = false;
       if (cum->fregno + n_elts * n_fpreg > FP_ARG_MAX_REG + 1)
-	ret = ((FP_ARG_MAX_REG + 1 - cum->fregno)
-	       * MIN (8, GET_MODE_SIZE (elt_mode)));
+	{
+	  /* Compute number of bytes / words passed in FPRs.  If there
+	     is still space available in the register parameter area
+	     *after* that amount, a part of the argument will be passed
+	     in GPRs.  In that case, the total amount passed in any
+	     registers is equal to the amount that would have been passed
+	     in GPRs if everything were passed there, so we fall back to
+	     the GPR code below to compute the appropriate value.  */
+	  int fpr = ((FP_ARG_MAX_REG + 1 - cum->fregno)
+		     * MIN (8, GET_MODE_SIZE (elt_mode)));
+	  int fpr_words = fpr / (TARGET_32BIT ? 4 : 8);
+
+	  if (align_words + fpr_words < GP_ARG_NUM_REG)
+	    passed_in_gprs = true;
+	  else
+	    ret = fpr;
+	}
     }
 
   if (passed_in_gprs