From patchwork Wed Mar 19 19:32:21 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bill Schmidt <wschmidt@linux.vnet.ibm.com>
X-Patchwork-Id: 331851
Message-ID: <1395257541.17148.15.camel@gnopaine>
Subject: [4.8, PATCH 13/26] Backport Power8 and LE support:  libffi
From: Bill Schmidt <wschmidt@linux.vnet.ibm.com>
To: gcc-patches@gcc.gnu.org
Cc: dje.gcc@gmail.com, jakub@redhat.com, rguenther@suse.de
Date: Wed, 19 Mar 2014 14:32:21 -0500
Mime-Version: 1.0

Hi,

This patch (diff-abi-libffi) backports the libffi implementation of the
ELFv2 ABI to the 4.8 branch.  Copying Jakub and Richard for the generic
bits.

Thanks,
Bill


2014-03-29  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	Backport mainline r205844.
	2013-11-18  Alan Modra  <amodra@gmail.com>
	* libffi/src/powerpc/ffitarget.h: Import from upstream.
	* libffi/src/powerpc/ffi_powerpc.h: Likewise.
	* libffi/src/powerpc/ffi.c: Likewise.
	* libffi/src/powerpc/ffi_sysv.c: Likewise.
	* libffi/src/powerpc/ffi_linux64.c: Likewise.
	* libffi/src/powerpc/sysv.S: Likewise.
	* libffi/src/powerpc/ppc_closure.S: Likewise.
	* libffi/src/powerpc/linux64.S: Likewise.
	* libffi/src/powerpc/linux64_closure.S: Likewise.
	* libffi/src/types.c: Likewise.
	* libffi/Makefile.am (EXTRA_DIST): Add new src/powerpc files.
	(nodist_libffi_la_SOURCES <POWERPC, POWERPC_FREEBSD>): Likewise.
	* libffi/configure.ac (HAVE_LONG_DOUBLE_VARIANT): Define for powerpc.
	* libffi/include/ffi.h.in (ffi_prep_types): Declare.
	* libffi/src/prep_cif.c (ffi_prep_cif_core): Call ffi_prep_types.
	* libffi/configure: Regenerate.
	* libffi/fficonfig.h.in: Regenerate.
	* libffi/Makefile.in: Regenerate.
	* libffi/man/Makefile.in: Regenerate.
	* libffi/include/Makefile.in: Regenerate.
	* libffi/testsuite/Makefile.in: Regenerate.

	* libffi/src/powerpc/ppc_closure.S: Don't bl .Luint128.

	* libffi/src/powerpc/ffitarget.h: Import from upstream.
	* libffi/src/powerpc/ffi.c: Likewise.
	* libffi/src/powerpc/linux64.S: Likewise.
	* libffi/src/powerpc/linux64_closure.S: Likewise.
	* libffi/doc/libffi.texi: Likewise.
	* libffi/testsuite/libffi.call/cls_double_va.c: Likewise.
	* libffi/testsuite/libffi.call/cls_longdouble_va.c: Likewise.

Index: gcc-4_8-branch/libffi/doc/libffi.texi
===================================================================
--- gcc-4_8-branch.orig/libffi/doc/libffi.texi	2013-12-28 17:41:31.723625410 +0100
+++ gcc-4_8-branch/libffi/doc/libffi.texi	2013-12-28 17:50:47.091374717 +0100
@@ -184,11 +184,11 @@ This calls the function @var{fn} accordi
 
 @var{rvalue} is a pointer to a chunk of memory that will hold the
 result of the function call.  This must be large enough to hold the
-result and must be suitably aligned; it is the caller's responsibility
+result, no smaller than the system register size (generally 32 or 64
+bits), and must be suitably aligned; it is the caller's responsibility
 to ensure this.  If @var{cif} declares that the function returns
 @code{void} (using @code{ffi_type_void}), then @var{rvalue} is
-ignored.  If @var{rvalue} is @samp{NULL}, then the return value is
-discarded.
+ignored.
 
 @var{avalues} is a vector of @code{void *} pointers that point to the
 memory locations holding the argument values for a call.  If @var{cif}
@@ -214,7 +214,7 @@ int main()
   ffi_type *args[1];
   void *values[1];
   char *s;
-  int rc;
+  ffi_arg rc;
   
   /* Initialize the argument info vectors */    
   args[0] = &ffi_type_pointer;
@@ -222,7 +222,7 @@ int main()
   
   /* Initialize the cif */
   if (ffi_prep_cif(&cif, FFI_DEFAULT_ABI, 1, 
-		       &ffi_type_uint, args) == FFI_OK)
+		       &ffi_type_sint, args) == FFI_OK)
     @{
       s = "Hello World!";
       ffi_call(&cif, puts, &rc, values);
@@ -360,7 +360,7 @@ You must first describe the structure to
 new @code{ffi_type} object for it.
 
 @tindex ffi_type
-@deftp ffi_type
+@deftp {Data type} ffi_type
 The @code{ffi_type} has the following members:
 @table @code
 @item size_t size
@@ -414,6 +414,7 @@ Here is the corresponding code to descri
       int i;
 
       tm_type.size = tm_type.alignment = 0;
+      tm_type.type = FFI_TYPE_STRUCT;
       tm_type.elements = &tm_type_elements;
     
       for (i = 0; i < 9; i++)
@@ -540,21 +541,23 @@ A trivial example that creates a new @co
 #include <ffi.h>
 
 /* Acts like puts with the file given at time of enclosure. */
-void puts_binding(ffi_cif *cif, unsigned int *ret, void* args[], 
-                  FILE *stream)
+void puts_binding(ffi_cif *cif, void *ret, void* args[],
+                  void *stream)
 @{
-  *ret = fputs(*(char **)args[0], stream);
+  *(ffi_arg *)ret = fputs(*(char **)args[0], (FILE *)stream);
 @}
 
+typedef int (*puts_t)(char *);
+
 int main()
 @{
   ffi_cif cif;
   ffi_type *args[1];
   ffi_closure *closure;
 
-  int (*bound_puts)(char *);
+  void *bound_puts;
   int rc;
-  
+
   /* Allocate closure and bound_puts */
   closure = ffi_closure_alloc(sizeof(ffi_closure), &bound_puts);
 
@@ -565,13 +568,13 @@ int main()
 
       /* Initialize the cif */
       if (ffi_prep_cif(&cif, FFI_DEFAULT_ABI, 1,
-                       &ffi_type_uint, args) == FFI_OK)
+                       &ffi_type_sint, args) == FFI_OK)
         @{
           /* Initialize the closure, setting stream to stdout */
-          if (ffi_prep_closure_loc(closure, &cif, puts_binding, 
+          if (ffi_prep_closure_loc(closure, &cif, puts_binding,
                                    stdout, bound_puts) == FFI_OK)
             @{
-              rc = bound_puts("Hello World!");
+              rc = ((puts_t)bound_puts)("Hello World!");
               /* rc now holds the result of the call to fputs */
             @}
         @}
Index: gcc-4_8-branch/libffi/src/powerpc/ffi.c
===================================================================
--- gcc-4_8-branch.orig/libffi/src/powerpc/ffi.c	2013-12-28 17:41:31.722625405 +0100
+++ gcc-4_8-branch/libffi/src/powerpc/ffi.c	2013-12-28 17:50:47.097374747 +0100
@@ -1,5 +1,6 @@
 /* -----------------------------------------------------------------------
-   ffi.c - Copyright (C) 2011 Anthony Green
+   ffi.c - Copyright (C) 2013 IBM
+           Copyright (C) 2011 Anthony Green
            Copyright (C) 2011 Kyle Moffett
            Copyright (C) 2008 Red Hat, Inc
            Copyright (C) 2007, 2008 Free Software Foundation, Inc
@@ -27,966 +28,104 @@
    OTHER DEALINGS IN THE SOFTWARE.
    ----------------------------------------------------------------------- */
 
-#include <ffi.h>
-#include <ffi_common.h>
-
-#include <stdlib.h>
-#include <stdio.h>
-
-
-extern void ffi_closure_SYSV (void);
-extern void FFI_HIDDEN ffi_closure_LINUX64 (void);
-
-enum {
-  /* The assembly depends on these exact flags.  */
-  FLAG_RETURNS_SMST	= 1 << (31-31), /* Used for FFI_SYSV small structs.  */
-  FLAG_RETURNS_NOTHING  = 1 << (31-30), /* These go in cr7 */
-#ifndef __NO_FPRS__
-  FLAG_RETURNS_FP       = 1 << (31-29),
-#endif
-  FLAG_RETURNS_64BITS   = 1 << (31-28),
-
-  FLAG_RETURNS_128BITS  = 1 << (31-27), /* cr6  */
-
-  FLAG_ARG_NEEDS_COPY   = 1 << (31- 7),
-#ifndef __NO_FPRS__
-  FLAG_FP_ARGUMENTS     = 1 << (31- 6), /* cr1.eq; specified by ABI */
-#endif
-  FLAG_4_GPR_ARGUMENTS  = 1 << (31- 5),
-  FLAG_RETVAL_REFERENCE = 1 << (31- 4)
-};
-
-/* About the SYSV ABI.  */
-#define ASM_NEEDS_REGISTERS 4
-#define NUM_GPR_ARG_REGISTERS 8
-#ifndef __NO_FPRS__
-# define NUM_FPR_ARG_REGISTERS 8
-#endif
-
-/* ffi_prep_args_SYSV is called by the assembly routine once stack space
-   has been allocated for the function's arguments.
-
-   The stack layout we want looks like this:
-
-   |   Return address from ffi_call_SYSV 4bytes	|	higher addresses
-   |--------------------------------------------|
-   |   Previous backchain pointer	4	|       stack pointer here
-   |--------------------------------------------|<+ <<<	on entry to
-   |   Saved r28-r31			4*4	| |	ffi_call_SYSV
-   |--------------------------------------------| |
-   |   GPR registers r3-r10		8*4	| |	ffi_call_SYSV
-   |--------------------------------------------| |
-   |   FPR registers f1-f8 (optional)	8*8	| |
-   |--------------------------------------------| |	stack	|
-   |   Space for copied structures		| |	grows	|
-   |--------------------------------------------| |	down    V
-   |   Parameters that didn't fit in registers  | |
-   |--------------------------------------------| |	lower addresses
-   |   Space for callee's LR		4	| |
-   |--------------------------------------------| |	stack pointer here
-   |   Current backchain pointer	4	|-/	during
-   |--------------------------------------------|   <<<	ffi_call_SYSV
-
-*/
-
-void
-ffi_prep_args_SYSV (extended_cif *ecif, unsigned *const stack)
-{
-  const unsigned bytes = ecif->cif->bytes;
-  const unsigned flags = ecif->cif->flags;
-
-  typedef union {
-    char *c;
-    unsigned *u;
-    long long *ll;
-    float *f;
-    double *d;
-  } valp;
-
-  /* 'stacktop' points at the previous backchain pointer.  */
-  valp stacktop;
-
-  /* 'gpr_base' points at the space for gpr3, and grows upwards as
-     we use GPR registers.  */
-  valp gpr_base;
-  int intarg_count;
-
-#ifndef __NO_FPRS__
-  /* 'fpr_base' points at the space for fpr1, and grows upwards as
-     we use FPR registers.  */
-  valp fpr_base;
-  int fparg_count;
-#endif
-
-  /* 'copy_space' grows down as we put structures in it.  It should
-     stay 16-byte aligned.  */
-  valp copy_space;
-
-  /* 'next_arg' grows up as we put parameters in it.  */
-  valp next_arg;
-
-  int i;
-  ffi_type **ptr;
-#ifndef __NO_FPRS__
-  double double_tmp;
-#endif
-  union {
-    void **v;
-    char **c;
-    signed char **sc;
-    unsigned char **uc;
-    signed short **ss;
-    unsigned short **us;
-    unsigned int **ui;
-    long long **ll;
-    float **f;
-    double **d;
-  } p_argv;
-  size_t struct_copy_size;
-  unsigned gprvalue;
-
-  stacktop.c = (char *) stack + bytes;
-  gpr_base.u = stacktop.u - ASM_NEEDS_REGISTERS - NUM_GPR_ARG_REGISTERS;
-  intarg_count = 0;
-#ifndef __NO_FPRS__
-  fpr_base.d = gpr_base.d - NUM_FPR_ARG_REGISTERS;
-  fparg_count = 0;
-  copy_space.c = ((flags & FLAG_FP_ARGUMENTS) ? fpr_base.c : gpr_base.c);
-#else
-  copy_space.c = gpr_base.c;
-#endif
-  next_arg.u = stack + 2;
-
-  /* Check that everything starts aligned properly.  */
-  FFI_ASSERT (((unsigned long) (char *) stack & 0xF) == 0);
-  FFI_ASSERT (((unsigned long) copy_space.c & 0xF) == 0);
-  FFI_ASSERT (((unsigned long) stacktop.c & 0xF) == 0);
-  FFI_ASSERT ((bytes & 0xF) == 0);
-  FFI_ASSERT (copy_space.c >= next_arg.c);
-
-  /* Deal with return values that are actually pass-by-reference.  */
-  if (flags & FLAG_RETVAL_REFERENCE)
-    {
-      *gpr_base.u++ = (unsigned long) (char *) ecif->rvalue;
-      intarg_count++;
-    }
-
-  /* Now for the arguments.  */
-  p_argv.v = ecif->avalue;
-  for (ptr = ecif->cif->arg_types, i = ecif->cif->nargs;
-       i > 0;
-       i--, ptr++, p_argv.v++)
-    {
-      unsigned short typenum = (*ptr)->type;
-
-      /* We may need to handle some values depending on ABI */
-      if (ecif->cif->abi == FFI_LINUX_SOFT_FLOAT) {
-		if (typenum == FFI_TYPE_FLOAT)
-			typenum = FFI_TYPE_UINT32;
-		if (typenum == FFI_TYPE_DOUBLE)
-			typenum = FFI_TYPE_UINT64;
-		if (typenum == FFI_TYPE_LONGDOUBLE)
-			typenum = FFI_TYPE_UINT128;
-      } else if (ecif->cif->abi != FFI_LINUX) {
-#if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
-		if (typenum == FFI_TYPE_LONGDOUBLE)
-			typenum = FFI_TYPE_STRUCT;
-#endif
-      }
-
-      /* Now test the translated value */
-      switch (typenum) {
-#ifndef __NO_FPRS__
-	case FFI_TYPE_FLOAT:
-	  /* With FFI_LINUX_SOFT_FLOAT floats are handled like UINT32.  */
-	  double_tmp = **p_argv.f;
-	  if (fparg_count >= NUM_FPR_ARG_REGISTERS)
-	    {
-	      *next_arg.f = (float) double_tmp;
-	      next_arg.u += 1;
-	      intarg_count++;
-	    }
-	  else
-	    *fpr_base.d++ = double_tmp;
-	  fparg_count++;
-	  FFI_ASSERT (flags & FLAG_FP_ARGUMENTS);
-	  break;
-
-	case FFI_TYPE_DOUBLE:
-	  /* With FFI_LINUX_SOFT_FLOAT doubles are handled like UINT64.  */
-	  double_tmp = **p_argv.d;
-
-	  if (fparg_count >= NUM_FPR_ARG_REGISTERS)
-	    {
-	      if (intarg_count >= NUM_GPR_ARG_REGISTERS
-		  && intarg_count % 2 != 0)
-		{
-		  intarg_count++;
-		  next_arg.u++;
-		}
-	      *next_arg.d = double_tmp;
-	      next_arg.u += 2;
-	    }
-	  else
-	    *fpr_base.d++ = double_tmp;
-	  fparg_count++;
-	  FFI_ASSERT (flags & FLAG_FP_ARGUMENTS);
-	  break;
-
-#if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
-	case FFI_TYPE_LONGDOUBLE:
-	      double_tmp = (*p_argv.d)[0];
-
-	      if (fparg_count >= NUM_FPR_ARG_REGISTERS - 1)
-		{
-		  if (intarg_count >= NUM_GPR_ARG_REGISTERS
-		      && intarg_count % 2 != 0)
-		    {
-		      intarg_count++;
-		      next_arg.u++;
-		    }
-		  *next_arg.d = double_tmp;
-		  next_arg.u += 2;
-		  double_tmp = (*p_argv.d)[1];
-		  *next_arg.d = double_tmp;
-		  next_arg.u += 2;
-		}
-	      else
-		{
-		  *fpr_base.d++ = double_tmp;
-		  double_tmp = (*p_argv.d)[1];
-		  *fpr_base.d++ = double_tmp;
-		}
-
-	      fparg_count += 2;
-	      FFI_ASSERT (flags & FLAG_FP_ARGUMENTS);
-	  break;
-#endif
-#endif /* have FPRs */
-
-	/*
-	 * The soft float ABI for long doubles works like this, a long double
-	 * is passed in four consecutive GPRs if available.  A maximum of 2
-	 * long doubles can be passed in gprs.  If we do not have 4 GPRs
-	 * left, the long double is passed on the stack, 4-byte aligned.
-	 */
-	case FFI_TYPE_UINT128: {
-		unsigned int int_tmp = (*p_argv.ui)[0];
-		unsigned int ii;
-		if (intarg_count >= NUM_GPR_ARG_REGISTERS - 3) {
-			if (intarg_count < NUM_GPR_ARG_REGISTERS)
-				intarg_count += NUM_GPR_ARG_REGISTERS - intarg_count;
-			*(next_arg.u++) = int_tmp;
-			for (ii = 1; ii < 4; ii++) {
-				int_tmp = (*p_argv.ui)[ii];
-				*(next_arg.u++) = int_tmp;
-			}
-		} else {
-			*(gpr_base.u++) = int_tmp;
-			for (ii = 1; ii < 4; ii++) {
-				int_tmp = (*p_argv.ui)[ii];
-				*(gpr_base.u++) = int_tmp;
-			}
-		}
-		intarg_count += 4;
-		break;
-	}
-
-	case FFI_TYPE_UINT64:
-	case FFI_TYPE_SINT64:
-	  if (intarg_count == NUM_GPR_ARG_REGISTERS-1)
-	    intarg_count++;
-	  if (intarg_count >= NUM_GPR_ARG_REGISTERS)
-	    {
-	      if (intarg_count % 2 != 0)
-		{
-		  intarg_count++;
-		  next_arg.u++;
-		}
-	      *next_arg.ll = **p_argv.ll;
-	      next_arg.u += 2;
-	    }
-	  else
-	    {
-	      /* whoops: abi states only certain register pairs
-	       * can be used for passing long long int
-	       * specifically (r3,r4), (r5,r6), (r7,r8),
-	       * (r9,r10) and if next arg is long long but
-	       * not correct starting register of pair then skip
-	       * until the proper starting register
-	       */
-	      if (intarg_count % 2 != 0)
-		{
-		  intarg_count ++;
-		  gpr_base.u++;
-		}
-	      *gpr_base.ll++ = **p_argv.ll;
-	    }
-	  intarg_count += 2;
-	  break;
-
-	case FFI_TYPE_STRUCT:
-	  struct_copy_size = ((*ptr)->size + 15) & ~0xF;
-	  copy_space.c -= struct_copy_size;
-	  memcpy (copy_space.c, *p_argv.c, (*ptr)->size);
-
-	  gprvalue = (unsigned long) copy_space.c;
-
-	  FFI_ASSERT (copy_space.c > next_arg.c);
-	  FFI_ASSERT (flags & FLAG_ARG_NEEDS_COPY);
-	  goto putgpr;
-
-	case FFI_TYPE_UINT8:
-	  gprvalue = **p_argv.uc;
-	  goto putgpr;
-	case FFI_TYPE_SINT8:
-	  gprvalue = **p_argv.sc;
-	  goto putgpr;
-	case FFI_TYPE_UINT16:
-	  gprvalue = **p_argv.us;
-	  goto putgpr;
-	case FFI_TYPE_SINT16:
-	  gprvalue = **p_argv.ss;
-	  goto putgpr;
-
-	case FFI_TYPE_INT:
-	case FFI_TYPE_UINT32:
-	case FFI_TYPE_SINT32:
-	case FFI_TYPE_POINTER:
-
-	  gprvalue = **p_argv.ui;
-
-	putgpr:
-	  if (intarg_count >= NUM_GPR_ARG_REGISTERS)
-	    *next_arg.u++ = gprvalue;
-	  else
-	    *gpr_base.u++ = gprvalue;
-	  intarg_count++;
-	  break;
-	}
-    }
-
-  /* Check that we didn't overrun the stack...  */
-  FFI_ASSERT (copy_space.c >= next_arg.c);
-  FFI_ASSERT (gpr_base.u <= stacktop.u - ASM_NEEDS_REGISTERS);
-#ifndef __NO_FPRS__
-  FFI_ASSERT (fpr_base.u
-	      <= stacktop.u - ASM_NEEDS_REGISTERS - NUM_GPR_ARG_REGISTERS);
-#endif
-  FFI_ASSERT (flags & FLAG_4_GPR_ARGUMENTS || intarg_count <= 4);
-}
-
-/* About the LINUX64 ABI.  */
-enum {
-  NUM_GPR_ARG_REGISTERS64 = 8,
-  NUM_FPR_ARG_REGISTERS64 = 13
-};
-enum { ASM_NEEDS_REGISTERS64 = 4 };
-
-/* ffi_prep_args64 is called by the assembly routine once stack space
-   has been allocated for the function's arguments.
-
-   The stack layout we want looks like this:
-
-   |   Ret addr from ffi_call_LINUX64	8bytes	|	higher addresses
-   |--------------------------------------------|
-   |   CR save area			8bytes	|
-   |--------------------------------------------|
-   |   Previous backchain pointer	8	|	stack pointer here
-   |--------------------------------------------|<+ <<<	on entry to
-   |   Saved r28-r31			4*8	| |	ffi_call_LINUX64
-   |--------------------------------------------| |
-   |   GPR registers r3-r10		8*8	| |
-   |--------------------------------------------| |
-   |   FPR registers f1-f13 (optional)	13*8	| |
-   |--------------------------------------------| |
-   |   Parameter save area		        | |
-   |--------------------------------------------| |
-   |   TOC save area			8	| |
-   |--------------------------------------------| |	stack	|
-   |   Linker doubleword		8	| |	grows	|
-   |--------------------------------------------| |	down	V
-   |   Compiler doubleword		8	| |
-   |--------------------------------------------| |	lower addresses
-   |   Space for callee's LR		8	| |
-   |--------------------------------------------| |
-   |   CR save area			8	| |
-   |--------------------------------------------| |	stack pointer here
-   |   Current backchain pointer	8	|-/	during
-   |--------------------------------------------|   <<<	ffi_call_LINUX64
-
-*/
+#include "ffi.h"
+#include "ffi_common.h"
+#include "ffi_powerpc.h"
 
+#if HAVE_LONG_DOUBLE_VARIANT
+/* Adjust ffi_type_longdouble.  */
 void FFI_HIDDEN
-ffi_prep_args64 (extended_cif *ecif, unsigned long *const stack)
+ffi_prep_types (ffi_abi abi)
 {
-  const unsigned long bytes = ecif->cif->bytes;
-  const unsigned long flags = ecif->cif->flags;
-
-  typedef union {
-    char *c;
-    unsigned long *ul;
-    float *f;
-    double *d;
-  } valp;
-
-  /* 'stacktop' points at the previous backchain pointer.  */
-  valp stacktop;
-
-  /* 'next_arg' points at the space for gpr3, and grows upwards as
-     we use GPR registers, then continues at rest.  */
-  valp gpr_base;
-  valp gpr_end;
-  valp rest;
-  valp next_arg;
-
-  /* 'fpr_base' points at the space for fpr3, and grows upwards as
-     we use FPR registers.  */
-  valp fpr_base;
-  int fparg_count;
-
-  int i, words;
-  ffi_type **ptr;
-  double double_tmp;
-  union {
-    void **v;
-    char **c;
-    signed char **sc;
-    unsigned char **uc;
-    signed short **ss;
-    unsigned short **us;
-    signed int **si;
-    unsigned int **ui;
-    unsigned long **ul;
-    float **f;
-    double **d;
-  } p_argv;
-  unsigned long gprvalue;
-
-  stacktop.c = (char *) stack + bytes;
-  gpr_base.ul = stacktop.ul - ASM_NEEDS_REGISTERS64 - NUM_GPR_ARG_REGISTERS64;
-  gpr_end.ul = gpr_base.ul + NUM_GPR_ARG_REGISTERS64;
-  rest.ul = stack + 6 + NUM_GPR_ARG_REGISTERS64;
-  fpr_base.d = gpr_base.d - NUM_FPR_ARG_REGISTERS64;
-  fparg_count = 0;
-  next_arg.ul = gpr_base.ul;
-
-  /* Check that everything starts aligned properly.  */
-  FFI_ASSERT (((unsigned long) (char *) stack & 0xF) == 0);
-  FFI_ASSERT (((unsigned long) stacktop.c & 0xF) == 0);
-  FFI_ASSERT ((bytes & 0xF) == 0);
-
-  /* Deal with return values that are actually pass-by-reference.  */
-  if (flags & FLAG_RETVAL_REFERENCE)
-    *next_arg.ul++ = (unsigned long) (char *) ecif->rvalue;
-
-  /* Now for the arguments.  */
-  p_argv.v = ecif->avalue;
-  for (ptr = ecif->cif->arg_types, i = ecif->cif->nargs;
-       i > 0;
-       i--, ptr++, p_argv.v++)
-    {
-      switch ((*ptr)->type)
-	{
-	case FFI_TYPE_FLOAT:
-	  double_tmp = **p_argv.f;
-	  *next_arg.f = (float) double_tmp;
-	  if (++next_arg.ul == gpr_end.ul)
-	    next_arg.ul = rest.ul;
-	  if (fparg_count < NUM_FPR_ARG_REGISTERS64)
-	    *fpr_base.d++ = double_tmp;
-	  fparg_count++;
-	  FFI_ASSERT (flags & FLAG_FP_ARGUMENTS);
-	  break;
-
-	case FFI_TYPE_DOUBLE:
-	  double_tmp = **p_argv.d;
-	  *next_arg.d = double_tmp;
-	  if (++next_arg.ul == gpr_end.ul)
-	    next_arg.ul = rest.ul;
-	  if (fparg_count < NUM_FPR_ARG_REGISTERS64)
-	    *fpr_base.d++ = double_tmp;
-	  fparg_count++;
-	  FFI_ASSERT (flags & FLAG_FP_ARGUMENTS);
-	  break;
-
-#if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
-	case FFI_TYPE_LONGDOUBLE:
-	  double_tmp = (*p_argv.d)[0];
-	  *next_arg.d = double_tmp;
-	  if (++next_arg.ul == gpr_end.ul)
-	    next_arg.ul = rest.ul;
-	  if (fparg_count < NUM_FPR_ARG_REGISTERS64)
-	    *fpr_base.d++ = double_tmp;
-	  fparg_count++;
-	  double_tmp = (*p_argv.d)[1];
-	  *next_arg.d = double_tmp;
-	  if (++next_arg.ul == gpr_end.ul)
-	    next_arg.ul = rest.ul;
-	  if (fparg_count < NUM_FPR_ARG_REGISTERS64)
-	    *fpr_base.d++ = double_tmp;
-	  fparg_count++;
-	  FFI_ASSERT (__LDBL_MANT_DIG__ == 106);
-	  FFI_ASSERT (flags & FLAG_FP_ARGUMENTS);
-	  break;
-#endif
-
-	case FFI_TYPE_STRUCT:
-	  words = ((*ptr)->size + 7) / 8;
-	  if (next_arg.ul >= gpr_base.ul && next_arg.ul + words > gpr_end.ul)
-	    {
-	      size_t first = gpr_end.c - next_arg.c;
-	      memcpy (next_arg.c, *p_argv.c, first);
-	      memcpy (rest.c, *p_argv.c + first, (*ptr)->size - first);
-	      next_arg.c = rest.c + words * 8 - first;
-	    }
-	  else
-	    {
-	      char *where = next_arg.c;
-
-#ifndef __LITTLE_ENDIAN__
-	      /* Structures with size less than eight bytes are passed
-		 left-padded.  */
-	      if ((*ptr)->size < 8)
-		where += 8 - (*ptr)->size;
-#endif
-	      memcpy (where, *p_argv.c, (*ptr)->size);
-	      next_arg.ul += words;
-	      if (next_arg.ul == gpr_end.ul)
-		next_arg.ul = rest.ul;
-	    }
-	  break;
-
-	case FFI_TYPE_UINT8:
-	  gprvalue = **p_argv.uc;
-	  goto putgpr;
-	case FFI_TYPE_SINT8:
-	  gprvalue = **p_argv.sc;
-	  goto putgpr;
-	case FFI_TYPE_UINT16:
-	  gprvalue = **p_argv.us;
-	  goto putgpr;
-	case FFI_TYPE_SINT16:
-	  gprvalue = **p_argv.ss;
-	  goto putgpr;
-	case FFI_TYPE_UINT32:
-	  gprvalue = **p_argv.ui;
-	  goto putgpr;
-	case FFI_TYPE_INT:
-	case FFI_TYPE_SINT32:
-	  gprvalue = **p_argv.si;
-	  goto putgpr;
-
-	case FFI_TYPE_UINT64:
-	case FFI_TYPE_SINT64:
-	case FFI_TYPE_POINTER:
-	  gprvalue = **p_argv.ul;
-	putgpr:
-	  *next_arg.ul++ = gprvalue;
-	  if (next_arg.ul == gpr_end.ul)
-	    next_arg.ul = rest.ul;
-	  break;
-	}
-    }
-
-  FFI_ASSERT (flags & FLAG_4_GPR_ARGUMENTS
-	      || (next_arg.ul >= gpr_base.ul
-		  && next_arg.ul <= gpr_base.ul + 4));
+# if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
+#  ifdef POWERPC64
+  ffi_prep_types_linux64 (abi);
+#  else
+  ffi_prep_types_sysv (abi);
+#  endif
+# endif
 }
-
-
+#endif
 
 /* Perform machine dependent cif processing */
-ffi_status
+ffi_status FFI_HIDDEN
 ffi_prep_cif_machdep (ffi_cif *cif)
 {
-  /* All this is for the SYSV and LINUX64 ABI.  */
-  int i;
-  ffi_type **ptr;
-  unsigned bytes;
-  int fparg_count = 0, intarg_count = 0;
-  unsigned flags = 0;
-  unsigned struct_copy_size = 0;
-  unsigned type = cif->rtype->type;
-  unsigned size = cif->rtype->size;
-
-  if (cif->abi != FFI_LINUX64)
-    {
-      /* All the machine-independent calculation of cif->bytes will be wrong.
-	 Redo the calculation for SYSV.  */
-
-      /* Space for the frame pointer, callee's LR, and the asm's temp regs.  */
-      bytes = (2 + ASM_NEEDS_REGISTERS) * sizeof (int);
-
-      /* Space for the GPR registers.  */
-      bytes += NUM_GPR_ARG_REGISTERS * sizeof (int);
-    }
-  else
-    {
-      /* 64-bit ABI.  */
-
-      /* Space for backchain, CR, LR, cc/ld doubleword, TOC and the asm's temp
-	 regs.  */
-      bytes = (6 + ASM_NEEDS_REGISTERS64) * sizeof (long);
-
-      /* Space for the mandatory parm save area and general registers.  */
-      bytes += 2 * NUM_GPR_ARG_REGISTERS64 * sizeof (long);
-    }
-
-  /* Return value handling.  The rules for SYSV are as follows:
-     - 32-bit (or less) integer values are returned in gpr3;
-     - Structures of size <= 4 bytes also returned in gpr3;
-     - 64-bit integer values and structures between 5 and 8 bytes are returned
-     in gpr3 and gpr4;
-     - Single/double FP values are returned in fpr1;
-     - Larger structures are allocated space and a pointer is passed as
-     the first argument.
-     - long doubles (if not equivalent to double) are returned in
-     fpr1,fpr2 for Linux and as for large structs for SysV.
-     For LINUX64:
-     - integer values in gpr3;
-     - Structures/Unions by reference;
-     - Single/double FP values in fpr1, long double in fpr1,fpr2.
-     - soft-float float/doubles are treated as UINT32/UINT64 respectivley.
-     - soft-float long doubles are returned in gpr3-gpr6.  */
-  /* First translate for softfloat/nonlinux */
-  if (cif->abi == FFI_LINUX_SOFT_FLOAT) {
-	if (type == FFI_TYPE_FLOAT)
-		type = FFI_TYPE_UINT32;
-	if (type == FFI_TYPE_DOUBLE)
-		type = FFI_TYPE_UINT64;
-	if (type == FFI_TYPE_LONGDOUBLE)
-		type = FFI_TYPE_UINT128;
-  } else if (cif->abi != FFI_LINUX && cif->abi != FFI_LINUX64) {
-#if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
-	if (type == FFI_TYPE_LONGDOUBLE)
-		type = FFI_TYPE_STRUCT;
-#endif
-  }
-
-  switch (type)
-    {
-#ifndef __NO_FPRS__
-#if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
-    case FFI_TYPE_LONGDOUBLE:
-      flags |= FLAG_RETURNS_128BITS;
-      /* Fall through.  */
-#endif
-    case FFI_TYPE_DOUBLE:
-      flags |= FLAG_RETURNS_64BITS;
-      /* Fall through.  */
-    case FFI_TYPE_FLOAT:
-      flags |= FLAG_RETURNS_FP;
-      break;
-#endif
-
-    case FFI_TYPE_UINT128:
-      flags |= FLAG_RETURNS_128BITS;
-      /* Fall through.  */
-    case FFI_TYPE_UINT64:
-    case FFI_TYPE_SINT64:
-      flags |= FLAG_RETURNS_64BITS;
-      break;
-
-    case FFI_TYPE_STRUCT:
-      /*
-       * The final SYSV ABI says that structures smaller or equal 8 bytes
-       * are returned in r3/r4. The FFI_GCC_SYSV ABI instead returns them
-       * in memory.
-       *
-       * NOTE: The assembly code can safely assume that it just needs to
-       *       store both r3 and r4 into a 8-byte word-aligned buffer, as
-       *       we allocate a temporary buffer in ffi_call() if this flag is
-       *       set.
-       */
-      if (cif->abi == FFI_SYSV && size <= 8)
-	flags |= FLAG_RETURNS_SMST;
-      intarg_count++;
-      flags |= FLAG_RETVAL_REFERENCE;
-      /* Fall through.  */
-    case FFI_TYPE_VOID:
-      flags |= FLAG_RETURNS_NOTHING;
-      break;
-
-    default:
-      /* Returns 32-bit integer, or similar.  Nothing to do here.  */
-      break;
-    }
-
-  if (cif->abi != FFI_LINUX64)
-    /* The first NUM_GPR_ARG_REGISTERS words of integer arguments, and the
-       first NUM_FPR_ARG_REGISTERS fp arguments, go in registers; the rest
-       goes on the stack.  Structures and long doubles (if not equivalent
-       to double) are passed as a pointer to a copy of the structure.
-       Stuff on the stack needs to keep proper alignment.  */
-    for (ptr = cif->arg_types, i = cif->nargs; i > 0; i--, ptr++)
-      {
-	unsigned short typenum = (*ptr)->type;
-
-	/* We may need to handle some values depending on ABI */
-	if (cif->abi == FFI_LINUX_SOFT_FLOAT) {
-		if (typenum == FFI_TYPE_FLOAT)
-			typenum = FFI_TYPE_UINT32;
-		if (typenum == FFI_TYPE_DOUBLE)
-			typenum = FFI_TYPE_UINT64;
-		if (typenum == FFI_TYPE_LONGDOUBLE)
-			typenum = FFI_TYPE_UINT128;
-	} else if (cif->abi != FFI_LINUX && cif->abi != FFI_LINUX64) {
-#if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
-		if (typenum == FFI_TYPE_LONGDOUBLE)
-			typenum = FFI_TYPE_STRUCT;
-#endif
-	}
-
-	switch (typenum) {
-#ifndef __NO_FPRS__
-	  case FFI_TYPE_FLOAT:
-	    fparg_count++;
-	    /* floating singles are not 8-aligned on stack */
-	    break;
-
-#if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
-	  case FFI_TYPE_LONGDOUBLE:
-	    fparg_count++;
-	    /* Fall thru */
-#endif
-	  case FFI_TYPE_DOUBLE:
-	    fparg_count++;
-	    /* If this FP arg is going on the stack, it must be
-	       8-byte-aligned.  */
-	    if (fparg_count > NUM_FPR_ARG_REGISTERS
-		&& intarg_count >= NUM_GPR_ARG_REGISTERS
-		&& intarg_count % 2 != 0)
-	      intarg_count++;
-	    break;
-#endif
-	  case FFI_TYPE_UINT128:
-		/*
-		 * A long double in FFI_LINUX_SOFT_FLOAT can use only a set
-		 * of four consecutive gprs. If we do not have enough, we
-		 * have to adjust the intarg_count value.
-		 */
-		if (intarg_count >= NUM_GPR_ARG_REGISTERS - 3
-				&& intarg_count < NUM_GPR_ARG_REGISTERS)
-			intarg_count = NUM_GPR_ARG_REGISTERS;
-		intarg_count += 4;
-		break;
-
-	  case FFI_TYPE_UINT64:
-	  case FFI_TYPE_SINT64:
-	    /* 'long long' arguments are passed as two words, but
-	       either both words must fit in registers or both go
-	       on the stack.  If they go on the stack, they must
-	       be 8-byte-aligned.
-
-	       Also, only certain register pairs can be used for
-	       passing long long int -- specifically (r3,r4), (r5,r6),
-	       (r7,r8), (r9,r10).
-	    */
-	    if (intarg_count == NUM_GPR_ARG_REGISTERS-1
-		|| intarg_count % 2 != 0)
-	      intarg_count++;
-	    intarg_count += 2;
-	    break;
-
-	  case FFI_TYPE_STRUCT:
-	    /* We must allocate space for a copy of these to enforce
-	       pass-by-value.  Pad the space up to a multiple of 16
-	       bytes (the maximum alignment required for anything under
-	       the SYSV ABI).  */
-	    struct_copy_size += ((*ptr)->size + 15) & ~0xF;
-	    /* Fall through (allocate space for the pointer).  */
-
-	  case FFI_TYPE_POINTER:
-	  case FFI_TYPE_INT:
-	  case FFI_TYPE_UINT32:
-	  case FFI_TYPE_SINT32:
-	  case FFI_TYPE_UINT16:
-	  case FFI_TYPE_SINT16:
-	  case FFI_TYPE_UINT8:
-	  case FFI_TYPE_SINT8:
-	    /* Everything else is passed as a 4-byte word in a GPR, either
-	       the object itself or a pointer to it.  */
-	    intarg_count++;
-	    break;
-	  default:
-		FFI_ASSERT (0);
-	  }
-      }
-  else
-    for (ptr = cif->arg_types, i = cif->nargs; i > 0; i--, ptr++)
-      {
-	switch ((*ptr)->type)
-	  {
-#if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
-	  case FFI_TYPE_LONGDOUBLE:
-	    if (cif->abi == FFI_LINUX_SOFT_FLOAT)
-	      intarg_count += 4;
-	    else
-	      {
-		fparg_count += 2;
-		intarg_count += 2;
-	      }
-	    break;
-#endif
-	  case FFI_TYPE_FLOAT:
-	  case FFI_TYPE_DOUBLE:
-	    fparg_count++;
-	    intarg_count++;
-	    break;
-
-	  case FFI_TYPE_STRUCT:
-	    intarg_count += ((*ptr)->size + 7) / 8;
-	    break;
-
-	  case FFI_TYPE_POINTER:
-	  case FFI_TYPE_UINT64:
-	  case FFI_TYPE_SINT64:
-	  case FFI_TYPE_INT:
-	  case FFI_TYPE_UINT32:
-	  case FFI_TYPE_SINT32:
-	  case FFI_TYPE_UINT16:
-	  case FFI_TYPE_SINT16:
-	  case FFI_TYPE_UINT8:
-	  case FFI_TYPE_SINT8:
-	    /* Everything else is passed as a 8-byte word in a GPR, either
-	       the object itself or a pointer to it.  */
-	    intarg_count++;
-	    break;
-	  default:
-		FFI_ASSERT (0);
-	  }
-      }
-
-#ifndef __NO_FPRS__
-  if (fparg_count != 0)
-    flags |= FLAG_FP_ARGUMENTS;
-#endif
-  if (intarg_count > 4)
-    flags |= FLAG_4_GPR_ARGUMENTS;
-  if (struct_copy_size != 0)
-    flags |= FLAG_ARG_NEEDS_COPY;
-
-  if (cif->abi != FFI_LINUX64)
-    {
-#ifndef __NO_FPRS__
-      /* Space for the FPR registers, if needed.  */
-      if (fparg_count != 0)
-	bytes += NUM_FPR_ARG_REGISTERS * sizeof (double);
+#ifdef POWERPC64
+  return ffi_prep_cif_linux64 (cif);
+#else
+  return ffi_prep_cif_sysv (cif);
 #endif
+}
 
-      /* Stack space.  */
-      if (intarg_count > NUM_GPR_ARG_REGISTERS)
-	bytes += (intarg_count - NUM_GPR_ARG_REGISTERS) * sizeof (int);
-#ifndef __NO_FPRS__
-      if (fparg_count > NUM_FPR_ARG_REGISTERS)
-	bytes += (fparg_count - NUM_FPR_ARG_REGISTERS) * sizeof (double);
-#endif
-    }
-  else
-    {
-#ifndef __NO_FPRS__
-      /* Space for the FPR registers, if needed.  */
-      if (fparg_count != 0)
-	bytes += NUM_FPR_ARG_REGISTERS64 * sizeof (double);
+ffi_status FFI_HIDDEN
+ffi_prep_cif_machdep_var (ffi_cif *cif,
+			  unsigned int nfixedargs MAYBE_UNUSED,
+			  unsigned int ntotalargs MAYBE_UNUSED)
+{
+#ifdef POWERPC64
+  return ffi_prep_cif_linux64_var (cif, nfixedargs, ntotalargs);
+#else
+  return ffi_prep_cif_sysv (cif);
 #endif
-
-      /* Stack space.  */
-      if (intarg_count > NUM_GPR_ARG_REGISTERS64)
-	bytes += (intarg_count - NUM_GPR_ARG_REGISTERS64) * sizeof (long);
-    }
-
-  /* The stack space allocated needs to be a multiple of 16 bytes.  */
-  bytes = (bytes + 15) & ~0xF;
-
-  /* Add in the space for the copied structures.  */
-  bytes += struct_copy_size;
-
-  cif->flags = flags;
-  cif->bytes = bytes;
-
-  return FFI_OK;
 }
 
-extern void ffi_call_SYSV(extended_cif *, unsigned, unsigned, unsigned *,
-			  void (*fn)(void));
-extern void FFI_HIDDEN ffi_call_LINUX64(extended_cif *, unsigned long,
-					unsigned long, unsigned long *,
-					void (*fn)(void));
-
 void
 ffi_call(ffi_cif *cif, void (*fn)(void), void *rvalue, void **avalue)
 {
-  /*
-   * The final SYSV ABI says that structures smaller or equal 8 bytes
-   * are returned in r3/r4. The FFI_GCC_SYSV ABI instead returns them
-   * in memory.
-   *
-   * Just to keep things simple for the assembly code, we will always
-   * bounce-buffer struct return values less than or equal to 8 bytes.
-   * This allows the ASM to handle SYSV small structures by directly
-   * writing r3 and r4 to memory without worrying about struct size.
-   */
-  unsigned int smst_buffer[2];
+  /* The final SYSV ABI says that structures smaller than or equal to
+     8 bytes are returned in r3/r4.  A draft ABI used by Linux instead
+     returns them in memory.
+
+     We bounce-buffer SYSV small struct return values so that sysv.S
+     can write r3 and r4 to memory without worrying about struct size.
+   
+     For ELFv2 ABI, use a bounce buffer for homogeneous structs too,
+     for similar reasons.  */
+  unsigned long smst_buffer[8];
   extended_cif ecif;
-  unsigned int rsize = 0;
 
   ecif.cif = cif;
   ecif.avalue = avalue;
 
-  /* Ensure that we have a valid struct return value */
   ecif.rvalue = rvalue;
-  if (cif->rtype->type == FFI_TYPE_STRUCT) {
-    rsize = cif->rtype->size;
-    if (rsize <= 8)
-      ecif.rvalue = smst_buffer;
-    else if (!rvalue)
-      ecif.rvalue = alloca(rsize);
-  }
+  if ((cif->flags & FLAG_RETURNS_SMST) != 0)
+    ecif.rvalue = smst_buffer;
+  /* Ensure that we have a valid struct return value.
+     FIXME: Isn't this just papering over a user problem?  */
+  else if (!rvalue && cif->rtype->type == FFI_TYPE_STRUCT)
+    ecif.rvalue = alloca (cif->rtype->size);
 
-  switch (cif->abi)
-    {
-#ifndef POWERPC64
-# ifndef __NO_FPRS__
-    case FFI_SYSV:
-    case FFI_GCC_SYSV:
-    case FFI_LINUX:
-# endif
-    case FFI_LINUX_SOFT_FLOAT:
-      ffi_call_SYSV (&ecif, -cif->bytes, cif->flags, ecif.rvalue, fn);
-      break;
+#ifdef POWERPC64
+  ffi_call_LINUX64 (&ecif, -(long) cif->bytes, cif->flags, ecif.rvalue, fn);
 #else
-    case FFI_LINUX64:
-      ffi_call_LINUX64 (&ecif, -(long) cif->bytes, cif->flags, ecif.rvalue, fn);
-      break;
+  ffi_call_SYSV (&ecif, -cif->bytes, cif->flags, ecif.rvalue, fn);
 #endif
-    default:
-      FFI_ASSERT (0);
-      break;
-    }
 
   /* Check for a bounce-buffered return value */
   if (rvalue && ecif.rvalue == smst_buffer)
-    memcpy(rvalue, smst_buffer, rsize);
+    {
+      unsigned int rsize = cif->rtype->size;
+#ifndef __LITTLE_ENDIAN__
+      /* The SYSV ABI returns a structure of up to 4 bytes in size
+	 left-padded in r3.  */
+# ifndef POWERPC64
+      if (rsize <= 4)
+	memcpy (rvalue, (char *) smst_buffer + 4 - rsize, rsize);
+      else
+# endif
+	/* The SYSV ABI returns a structure of up to 8 bytes in size
+	   left-padded in r3/r4, and the ELFv2 ABI similarly returns a
+	   structure of up to 8 bytes in size left-padded in r3.  */
+	if (rsize <= 8)
+	  memcpy (rvalue, (char *) smst_buffer + 8 - rsize, rsize);
+	else
+#endif
+	  memcpy (rvalue, smst_buffer, rsize);
+    }
 }
 

-#ifndef POWERPC64
-#define MIN_CACHE_LINE_SIZE 8
-
-static void
-flush_icache (char *wraddr, char *xaddr, int size)
-{
-  int i;
-  for (i = 0; i < size; i += MIN_CACHE_LINE_SIZE)
-    __asm__ volatile ("icbi 0,%0;" "dcbf 0,%1;"
-		      : : "r" (xaddr + i), "r" (wraddr + i) : "memory");
-  __asm__ volatile ("icbi 0,%0;" "dcbf 0,%1;" "sync;" "isync;"
-		    : : "r"(xaddr + size - 1), "r"(wraddr + size - 1)
-		    : "memory");
-}
-#endif
-
 ffi_status
 ffi_prep_closure_loc (ffi_closure *closure,
 		      ffi_cif *cif,
@@ -995,487 +134,8 @@ ffi_prep_closure_loc (ffi_closure *closu
 		      void *codeloc)
 {
 #ifdef POWERPC64
-  void **tramp = (void **) &closure->tramp[0];
-
-  if (cif->abi != FFI_LINUX64)
-    return FFI_BAD_ABI;
-  /* Copy function address and TOC from ffi_closure_LINUX64.  */
-  memcpy (tramp, (char *) ffi_closure_LINUX64, 16);
-  tramp[2] = codeloc;
+  return ffi_prep_closure_loc_linux64 (closure, cif, fun, user_data, codeloc);
 #else
-  unsigned int *tramp;
-
-  if (! (cif->abi == FFI_GCC_SYSV 
-	 || cif->abi == FFI_SYSV
-	 || cif->abi == FFI_LINUX
-	 || cif->abi == FFI_LINUX_SOFT_FLOAT))
-    return FFI_BAD_ABI;
-
-  tramp = (unsigned int *) &closure->tramp[0];
-  tramp[0] = 0x7c0802a6;  /*   mflr    r0 */
-  tramp[1] = 0x4800000d;  /*   bl      10 <trampoline_initial+0x10> */
-  tramp[4] = 0x7d6802a6;  /*   mflr    r11 */
-  tramp[5] = 0x7c0803a6;  /*   mtlr    r0 */
-  tramp[6] = 0x800b0000;  /*   lwz     r0,0(r11) */
-  tramp[7] = 0x816b0004;  /*   lwz     r11,4(r11) */
-  tramp[8] = 0x7c0903a6;  /*   mtctr   r0 */
-  tramp[9] = 0x4e800420;  /*   bctr */
-  *(void **) &tramp[2] = (void *) ffi_closure_SYSV; /* function */
-  *(void **) &tramp[3] = codeloc;                   /* context */
-
-  /* Flush the icache.  */
-  flush_icache ((char *)tramp, (char *)codeloc, FFI_TRAMPOLINE_SIZE);
-#endif
-
-  closure->cif = cif;
-  closure->fun = fun;
-  closure->user_data = user_data;
-
-  return FFI_OK;
-}
-
-typedef union
-{
-  float f;
-  double d;
-} ffi_dblfl;
-
-int ffi_closure_helper_SYSV (ffi_closure *, void *, unsigned long *,
-			     ffi_dblfl *, unsigned long *);
-
-/* Basically the trampoline invokes ffi_closure_SYSV, and on
- * entry, r11 holds the address of the closure.
- * After storing the registers that could possibly contain
- * parameters to be passed into the stack frame and setting
- * up space for a return value, ffi_closure_SYSV invokes the
- * following helper function to do most of the work
- */
-
-int
-ffi_closure_helper_SYSV (ffi_closure *closure, void *rvalue,
-			 unsigned long *pgr, ffi_dblfl *pfr,
-			 unsigned long *pst)
-{
-  /* rvalue is the pointer to space for return value in closure assembly */
-  /* pgr is the pointer to where r3-r10 are stored in ffi_closure_SYSV */
-  /* pfr is the pointer to where f1-f8 are stored in ffi_closure_SYSV  */
-  /* pst is the pointer to outgoing parameter stack in original caller */
-
-  void **          avalue;
-  ffi_type **      arg_types;
-  long             i, avn;
-#ifndef __NO_FPRS__
-  long             nf = 0;   /* number of floating registers already used */
-#endif
-  long             ng = 0;   /* number of general registers already used */
-
-  ffi_cif *cif = closure->cif;
-  unsigned       size     = cif->rtype->size;
-  unsigned short rtypenum = cif->rtype->type;
-
-  avalue = alloca (cif->nargs * sizeof (void *));
-
-  /* First translate for softfloat/nonlinux */
-  if (cif->abi == FFI_LINUX_SOFT_FLOAT) {
-	if (rtypenum == FFI_TYPE_FLOAT)
-		rtypenum = FFI_TYPE_UINT32;
-	if (rtypenum == FFI_TYPE_DOUBLE)
-		rtypenum = FFI_TYPE_UINT64;
-	if (rtypenum == FFI_TYPE_LONGDOUBLE)
-		rtypenum = FFI_TYPE_UINT128;
-  } else if (cif->abi != FFI_LINUX && cif->abi != FFI_LINUX64) {
-#if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
-	if (rtypenum == FFI_TYPE_LONGDOUBLE)
-		rtypenum = FFI_TYPE_STRUCT;
-#endif
-  }
-
-
-  /* Copy the caller's structure return value address so that the closure
-     returns the data directly to the caller.
-     For FFI_SYSV the result is passed in r3/r4 if the struct size is less
-     or equal 8 bytes.  */
-  if (rtypenum == FFI_TYPE_STRUCT && ((cif->abi != FFI_SYSV) || (size > 8))) {
-      rvalue = (void *) *pgr;
-      ng++;
-      pgr++;
-    }
-
-  i = 0;
-  avn = cif->nargs;
-  arg_types = cif->arg_types;
-
-  /* Grab the addresses of the arguments from the stack frame.  */
-  while (i < avn) {
-      unsigned short typenum = arg_types[i]->type;
-
-      /* We may need to handle some values depending on ABI */
-      if (cif->abi == FFI_LINUX_SOFT_FLOAT) {
-		if (typenum == FFI_TYPE_FLOAT)
-			typenum = FFI_TYPE_UINT32;
-		if (typenum == FFI_TYPE_DOUBLE)
-			typenum = FFI_TYPE_UINT64;
-		if (typenum == FFI_TYPE_LONGDOUBLE)
-			typenum = FFI_TYPE_UINT128;
-      } else if (cif->abi != FFI_LINUX && cif->abi != FFI_LINUX64) {
-#if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
-		if (typenum == FFI_TYPE_LONGDOUBLE)
-			typenum = FFI_TYPE_STRUCT;
-#endif
-      }
-
-      switch (typenum) {
-#ifndef __NO_FPRS__
-	case FFI_TYPE_FLOAT:
-	  /* unfortunately float values are stored as doubles
-	   * in the ffi_closure_SYSV code (since we don't check
-	   * the type in that routine).
-	   */
-
-	  /* there are 8 64bit floating point registers */
-
-	  if (nf < 8)
-	    {
-	      double temp = pfr->d;
-	      pfr->f = (float) temp;
-	      avalue[i] = pfr;
-	      nf++;
-	      pfr++;
-	    }
-	  else
-	    {
-	      /* FIXME? here we are really changing the values
-	       * stored in the original calling routines outgoing
-	       * parameter stack.  This is probably a really
-	       * naughty thing to do but...
-	       */
-	      avalue[i] = pst;
-	      pst += 1;
-	    }
-	  break;
-
-	case FFI_TYPE_DOUBLE:
-	  /* On the outgoing stack all values are aligned to 8 */
-	  /* there are 8 64bit floating point registers */
-
-	  if (nf < 8)
-	    {
-	      avalue[i] = pfr;
-	      nf++;
-	      pfr++;
-	    }
-	  else
-	    {
-	      if (((long) pst) & 4)
-		pst++;
-	      avalue[i] = pst;
-	      pst += 2;
-	    }
-	  break;
-
-#if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
-	case FFI_TYPE_LONGDOUBLE:
-	  if (nf < 7)
-	    {
-	      avalue[i] = pfr;
-	      pfr += 2;
-	      nf += 2;
-	    }
-	  else
-	    {
-	      if (((long) pst) & 4)
-		pst++;
-	      avalue[i] = pst;
-	      pst += 4;
-	      nf = 8;
-	    }
-	  break;
-#endif
-#endif /* have FPRS */
-
-	case FFI_TYPE_UINT128:
-		/*
-		 * Test if for the whole long double, 4 gprs are available.
-		 * otherwise the stuff ends up on the stack.
-		 */
-		if (ng < 5) {
-			avalue[i] = pgr;
-			pgr += 4;
-			ng += 4;
-		} else {
-			avalue[i] = pst;
-			pst += 4;
-			ng = 8+4;
-		}
-		break;
-
-	case FFI_TYPE_SINT8:
-	case FFI_TYPE_UINT8:
-#ifndef __LITTLE_ENDIAN__
-	  /* there are 8 gpr registers used to pass values */
-	  if (ng < 8)
-	    {
-	      avalue[i] = (char *) pgr + 3;
-	      ng++;
-	      pgr++;
-	    }
-	  else
-	    {
-	      avalue[i] = (char *) pst + 3;
-	      pst++;
-	    }
-	  break;
+  return ffi_prep_closure_loc_sysv (closure, cif, fun, user_data, codeloc);
 #endif
-	case FFI_TYPE_SINT16:
-	case FFI_TYPE_UINT16:
-#ifndef __LITTLE_ENDIAN__
-	  /* there are 8 gpr registers used to pass values */
-	  if (ng < 8)
-	    {
-	      avalue[i] = (char *) pgr + 2;
-	      ng++;
-	      pgr++;
-	    }
-	  else
-	    {
-	      avalue[i] = (char *) pst + 2;
-	      pst++;
-	    }
-	  break;
-#endif
-	case FFI_TYPE_SINT32:
-	case FFI_TYPE_UINT32:
-	case FFI_TYPE_POINTER:
-	  /* there are 8 gpr registers used to pass values */
-	  if (ng < 8)
-	    {
-	      avalue[i] = pgr;
-	      ng++;
-	      pgr++;
-	    }
-	  else
-	    {
-	      avalue[i] = pst;
-	      pst++;
-	    }
-	  break;
-
-	case FFI_TYPE_STRUCT:
-	  /* Structs are passed by reference. The address will appear in a
-	     gpr if it is one of the first 8 arguments.  */
-	  if (ng < 8)
-	    {
-	      avalue[i] = (void *) *pgr;
-	      ng++;
-	      pgr++;
-	    }
-	  else
-	    {
-	      avalue[i] = (void *) *pst;
-	      pst++;
-	    }
-	  break;
-
-	case FFI_TYPE_SINT64:
-	case FFI_TYPE_UINT64:
-	  /* passing long long ints are complex, they must
-	   * be passed in suitable register pairs such as
-	   * (r3,r4) or (r5,r6) or (r6,r7), or (r7,r8) or (r9,r10)
-	   * and if the entire pair aren't available then the outgoing
-	   * parameter stack is used for both but an alignment of 8
-	   * must will be kept.  So we must either look in pgr
-	   * or pst to find the correct address for this type
-	   * of parameter.
-	   */
-	  if (ng < 7)
-	    {
-	      if (ng & 0x01)
-		{
-		  /* skip r4, r6, r8 as starting points */
-		  ng++;
-		  pgr++;
-		}
-	      avalue[i] = pgr;
-	      ng += 2;
-	      pgr += 2;
-	    }
-	  else
-	    {
-	      if (((long) pst) & 4)
-		pst++;
-	      avalue[i] = pst;
-	      pst += 2;
-	      ng = 8;
-	    }
-	  break;
-
-	default:
-		FFI_ASSERT (0);
-	}
-
-      i++;
-    }
-
-
-  (closure->fun) (cif, rvalue, avalue, closure->user_data);
-
-  /* Tell ffi_closure_SYSV how to perform return type promotions.
-     Because the FFI_SYSV ABI returns the structures <= 8 bytes in r3/r4
-     we have to tell ffi_closure_SYSV how to treat them. We combine the base
-     type FFI_SYSV_TYPE_SMALL_STRUCT - 1  with the size of the struct.
-     So a one byte struct gets the return type 16. Return type 1 to 15 are
-     already used and we never have a struct with size zero. That is the reason
-     for the subtraction of 1. See the comment in ffitarget.h about ordering.
-  */
-  if (cif->abi == FFI_SYSV && rtypenum == FFI_TYPE_STRUCT && size <= 8)
-    return (FFI_SYSV_TYPE_SMALL_STRUCT - 1) + size;
-  return rtypenum;
-}
-
-int FFI_HIDDEN ffi_closure_helper_LINUX64 (ffi_closure *, void *,
-					   unsigned long *, ffi_dblfl *);
-
-int FFI_HIDDEN
-ffi_closure_helper_LINUX64 (ffi_closure *closure, void *rvalue,
-			    unsigned long *pst, ffi_dblfl *pfr)
-{
-  /* rvalue is the pointer to space for return value in closure assembly */
-  /* pst is the pointer to parameter save area
-     (r3-r10 are stored into its first 8 slots by ffi_closure_LINUX64) */
-  /* pfr is the pointer to where f1-f13 are stored in ffi_closure_LINUX64 */
-
-  void **avalue;
-  ffi_type **arg_types;
-  long i, avn;
-  ffi_cif *cif;
-  ffi_dblfl *end_pfr = pfr + NUM_FPR_ARG_REGISTERS64;
-
-  cif = closure->cif;
-  avalue = alloca (cif->nargs * sizeof (void *));
-
-  /* Copy the caller's structure return value address so that the closure
-     returns the data directly to the caller.  */
-  if (cif->rtype->type == FFI_TYPE_STRUCT)
-    {
-      rvalue = (void *) *pst;
-      pst++;
-    }
-
-  i = 0;
-  avn = cif->nargs;
-  arg_types = cif->arg_types;
-
-  /* Grab the addresses of the arguments from the stack frame.  */
-  while (i < avn)
-    {
-      switch (arg_types[i]->type)
-	{
-	case FFI_TYPE_SINT8:
-	case FFI_TYPE_UINT8:
-#ifndef __LITTLE_ENDIAN__
-	  avalue[i] = (char *) pst + 7;
-	  pst++;
-	  break;
-#endif
-	case FFI_TYPE_SINT16:
-	case FFI_TYPE_UINT16:
-#ifndef __LITTLE_ENDIAN__
-	  avalue[i] = (char *) pst + 6;
-	  pst++;
-	  break;
-#endif
-	case FFI_TYPE_SINT32:
-	case FFI_TYPE_UINT32:
-#ifndef __LITTLE_ENDIAN__
-	  avalue[i] = (char *) pst + 4;
-	  pst++;
-	  break;
-#endif
-	case FFI_TYPE_SINT64:
-	case FFI_TYPE_UINT64:
-	case FFI_TYPE_POINTER:
-	  avalue[i] = pst;
-	  pst++;
-	  break;
-
-	case FFI_TYPE_STRUCT:
-#ifndef __LITTLE_ENDIAN__
-	  /* Structures with size less than eight bytes are passed
-	     left-padded.  */
-	  if (arg_types[i]->size < 8)
-	    avalue[i] = (char *) pst + 8 - arg_types[i]->size;
-	  else
-#endif
-	    avalue[i] = pst;
-	  pst += (arg_types[i]->size + 7) / 8;
-	  break;
-
-	case FFI_TYPE_FLOAT:
-	  /* unfortunately float values are stored as doubles
-	   * in the ffi_closure_LINUX64 code (since we don't check
-	   * the type in that routine).
-	   */
-
-	  /* there are 13 64bit floating point registers */
-
-	  if (pfr < end_pfr)
-	    {
-	      double temp = pfr->d;
-	      pfr->f = (float) temp;
-	      avalue[i] = pfr;
-	      pfr++;
-	    }
-	  else
-	    avalue[i] = pst;
-	  pst++;
-	  break;
-
-	case FFI_TYPE_DOUBLE:
-	  /* On the outgoing stack all values are aligned to 8 */
-	  /* there are 13 64bit floating point registers */
-
-	  if (pfr < end_pfr)
-	    {
-	      avalue[i] = pfr;
-	      pfr++;
-	    }
-	  else
-	    avalue[i] = pst;
-	  pst++;
-	  break;
-
-#if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
-	case FFI_TYPE_LONGDOUBLE:
-	  if (pfr + 1 < end_pfr)
-	    {
-	      avalue[i] = pfr;
-	      pfr += 2;
-	    }
-	  else
-	    {
-	      if (pfr < end_pfr)
-		{
-		  /* Passed partly in f13 and partly on the stack.
-		     Move it all to the stack.  */
-		  *pst = *(unsigned long *) pfr;
-		  pfr++;
-		}
-	      avalue[i] = pst;
-	    }
-	  pst += 2;
-	  break;
-#endif
-
-	default:
-	  FFI_ASSERT (0);
-	}
-
-      i++;
-    }
-
-
-  (closure->fun) (cif, rvalue, avalue, closure->user_data);
-
-  /* Tell ffi_closure_LINUX64 how to perform return type promotions.  */
-  return cif->rtype->type;
 }
Index: gcc-4_8-branch/libffi/src/powerpc/ffitarget.h
===================================================================
--- gcc-4_8-branch.orig/libffi/src/powerpc/ffitarget.h	2013-12-28 17:41:31.722625405 +0100
+++ gcc-4_8-branch/libffi/src/powerpc/ffitarget.h	2013-12-28 17:50:47.100374762 +0100
@@ -60,45 +60,76 @@ typedef signed long            ffi_sarg;
 typedef enum ffi_abi {
   FFI_FIRST_ABI = 0,
 
-#ifdef POWERPC
-  FFI_SYSV,
-  FFI_GCC_SYSV,
-  FFI_LINUX64,
-  FFI_LINUX,
-  FFI_LINUX_SOFT_FLOAT,
-# if defined(POWERPC64)
-  FFI_DEFAULT_ABI = FFI_LINUX64,
-# elif defined(__NO_FPRS__)
-  FFI_DEFAULT_ABI = FFI_LINUX_SOFT_FLOAT,
-# elif (__LDBL_MANT_DIG__ == 106)
-  FFI_DEFAULT_ABI = FFI_LINUX,
-# else
-  FFI_DEFAULT_ABI = FFI_GCC_SYSV,
-# endif
-#endif
-
-#ifdef POWERPC_AIX
+#if defined (POWERPC_AIX)
   FFI_AIX,
   FFI_DARWIN,
   FFI_DEFAULT_ABI = FFI_AIX,
-#endif
+  FFI_LAST_ABI
 
-#ifdef POWERPC_DARWIN
+#elif defined (POWERPC_DARWIN)
   FFI_AIX,
   FFI_DARWIN,
   FFI_DEFAULT_ABI = FFI_DARWIN,
-#endif
+  FFI_LAST_ABI
+
+#else
+  /* The FFI_COMPAT values are used by old code.  Since libffi may be
+     a shared library we have to support old values for backwards
+     compatibility.  */
+  FFI_COMPAT_SYSV,
+  FFI_COMPAT_GCC_SYSV,
+  FFI_COMPAT_LINUX64,
+  FFI_COMPAT_LINUX,
+  FFI_COMPAT_LINUX_SOFT_FLOAT,
+
+# if defined (POWERPC64)
+  /* This bit, always set in new code, must not be set in any of the
+     old FFI_COMPAT values that might be used for 64-bit linux.  We
+     only need worry about FFI_COMPAT_LINUX64, but to be safe avoid
+     all old values.  */
+  FFI_LINUX = 8,
+  /* This and following bits can reuse FFI_COMPAT values.  */
+  FFI_LINUX_STRUCT_ALIGN = 1,
+  FFI_LINUX_LONG_DOUBLE_128 = 2,
+  FFI_DEFAULT_ABI = (FFI_LINUX
+#  ifdef __STRUCT_PARM_ALIGN__
+		     | FFI_LINUX_STRUCT_ALIGN
+#  endif
+#  ifdef __LONG_DOUBLE_128__
+		     | FFI_LINUX_LONG_DOUBLE_128
+#  endif
+		     ),
+  FFI_LAST_ABI = 12
 
-#ifdef POWERPC_FREEBSD
-  FFI_SYSV,
-  FFI_GCC_SYSV,
-  FFI_LINUX64,
-  FFI_LINUX,
-  FFI_LINUX_SOFT_FLOAT,
-  FFI_DEFAULT_ABI = FFI_SYSV,
+# else
+  /* This bit, always set in new code, must not be set in any of the
+     old FFI_COMPAT values that might be used for 32-bit linux/sysv/bsd.  */
+  FFI_SYSV = 8,
+  /* This and following bits can reuse FFI_COMPAT values.  */
+  FFI_SYSV_SOFT_FLOAT = 1,
+  FFI_SYSV_STRUCT_RET = 2,
+  FFI_SYSV_IBM_LONG_DOUBLE = 4,
+  FFI_SYSV_LONG_DOUBLE_128 = 16,
+
+  FFI_DEFAULT_ABI = (FFI_SYSV
+#  ifdef __NO_FPRS__
+		     | FFI_SYSV_SOFT_FLOAT
+#  endif
+#  if (defined (__SVR4_STRUCT_RETURN)					\
+       || defined (POWERPC_FREEBSD) && !defined (__AIX_STRUCT_RETURN))
+		     | FFI_SYSV_STRUCT_RET
+#  endif
+#  if __LDBL_MANT_DIG__ == 106
+		     | FFI_SYSV_IBM_LONG_DOUBLE
+#  endif
+#  ifdef __LONG_DOUBLE_128__
+		     | FFI_SYSV_LONG_DOUBLE_128
+#  endif
+		     ),
+  FFI_LAST_ABI = 32
+# endif
 #endif
 
-  FFI_LAST_ABI
 } ffi_abi;
 #endif
 
@@ -106,6 +137,10 @@ typedef enum ffi_abi {
 
 #define FFI_CLOSURES 1
 #define FFI_NATIVE_RAW_API 0
+#if defined (POWERPC) || defined (POWERPC_FREEBSD)
+# define FFI_TARGET_SPECIFIC_VARIADIC 1
+# define FFI_EXTRA_CIF_FIELDS unsigned nfixedargs
+#endif
 
 /* For additional types like the below, take care about the order in
    ppc_closures.S. They must follow after the FFI_TYPE_LAST.  */
@@ -113,19 +148,26 @@ typedef enum ffi_abi {
 /* Needed for soft-float long-double-128 support.  */
 #define FFI_TYPE_UINT128 (FFI_TYPE_LAST + 1)
 
-/* Needed for FFI_SYSV small structure returns.
-   We use two flag bits, (FLAG_SYSV_SMST_R3, FLAG_SYSV_SMST_R4) which are
-   defined in ffi.c, to determine the exact return type and its size.  */
+/* Needed for FFI_SYSV small structure returns.  */
 #define FFI_SYSV_TYPE_SMALL_STRUCT (FFI_TYPE_LAST + 2)
 
-#if defined(POWERPC64) || defined(POWERPC_AIX)
+/* Used by ELFv2 for homogeneous structure returns.  */
+#define FFI_V2_TYPE_FLOAT_HOMOG		(FFI_TYPE_LAST + 1)
+#define FFI_V2_TYPE_DOUBLE_HOMOG	(FFI_TYPE_LAST + 2)
+#define FFI_V2_TYPE_SMALL_STRUCT	(FFI_TYPE_LAST + 3)
+
+#if _CALL_ELF == 2
+# define FFI_TRAMPOLINE_SIZE 32
+#else
+# if defined(POWERPC64) || defined(POWERPC_AIX)
 #  if defined(POWERPC_DARWIN64)
 #    define FFI_TRAMPOLINE_SIZE 48
 #  else
 #    define FFI_TRAMPOLINE_SIZE 24
 #  endif
-#else /* POWERPC || POWERPC_AIX */
+# else /* POWERPC || POWERPC_AIX */
 #  define FFI_TRAMPOLINE_SIZE 40
+# endif
 #endif
 
 #ifndef LIBFFI_ASM
Index: gcc-4_8-branch/libffi/src/powerpc/linux64.S
===================================================================
--- gcc-4_8-branch.orig/libffi/src/powerpc/linux64.S	2013-12-28 17:41:31.722625405 +0100
+++ gcc-4_8-branch/libffi/src/powerpc/linux64.S	2013-12-28 17:50:47.104374782 +0100
@@ -29,18 +29,25 @@
 #include <fficonfig.h>
 #include <ffi.h>
 
-#ifdef __powerpc64__
+#ifdef POWERPC64
 	.hidden	ffi_call_LINUX64
 	.globl	ffi_call_LINUX64
+# if _CALL_ELF == 2
+	.text
+ffi_call_LINUX64:
+	addis	%r2, %r12, .TOC.-ffi_call_LINUX64@ha
+	addi	%r2, %r2, .TOC.-ffi_call_LINUX64@l
+	.localentry ffi_call_LINUX64, . - ffi_call_LINUX64
+# else
 	.section	".opd","aw"
 	.align	3
 ffi_call_LINUX64:
-#ifdef _CALL_LINUX
+#  ifdef _CALL_LINUX
 	.quad	.L.ffi_call_LINUX64,.TOC.@tocbase,0
 	.type	ffi_call_LINUX64,@function
 	.text
 .L.ffi_call_LINUX64:
-#else
+#  else
 	.hidden	.ffi_call_LINUX64
 	.globl	.ffi_call_LINUX64
 	.quad	.ffi_call_LINUX64,.TOC.@tocbase,0
@@ -48,7 +55,8 @@ ffi_call_LINUX64:
 	.type	.ffi_call_LINUX64,@function
 	.text
 .ffi_call_LINUX64:
-#endif
+#  endif
+# endif
 .LFB1:
 	mflr	%r0
 	std	%r28, -32(%r1)
@@ -63,26 +71,35 @@ ffi_call_LINUX64:
 	mr	%r31, %r5	/* flags, */
 	mr	%r30, %r6	/* rvalue, */
 	mr	%r29, %r7	/* function address.  */
+/* Save toc pointer, not for the ffi_prep_args64 call, but for the later
+   bctrl function call.  */
+# if _CALL_ELF == 2
+	std	%r2, 24(%r1)
+# else
 	std	%r2, 40(%r1)
+# endif
 
 	/* Call ffi_prep_args64.  */
 	mr	%r4, %r1
-#ifdef _CALL_LINUX
+# if defined _CALL_LINUX || _CALL_ELF == 2
 	bl	ffi_prep_args64
-#else
+# else
 	bl	.ffi_prep_args64
-#endif
+# endif
 
-	ld	%r0, 0(%r29)
+# if _CALL_ELF == 2
+	mr	%r12, %r29
+# else
+	ld	%r12, 0(%r29)
 	ld	%r2, 8(%r29)
 	ld	%r11, 16(%r29)
-
+# endif
 	/* Now do the call.  */
 	/* Set up cr1 with bits 4-7 of the flags.  */
 	mtcrf	0x40, %r31
 
 	/* Get the address to call into CTR.  */
-	mtctr	%r0
+	mtctr	%r12
 	/* Load all those argument registers.  */
 	ld	%r3, -32-(8*8)(%r28)
 	ld	%r4, -32-(7*8)(%r28)
@@ -117,12 +134,17 @@ ffi_call_LINUX64:
 
 	/* This must follow the call immediately, the unwinder
 	   uses this to find out if r2 has been saved or not.  */
+# if _CALL_ELF == 2
+	ld	%r2, 24(%r1)
+# else
 	ld	%r2, 40(%r1)
+# endif
 
 	/* Now, deal with the return value.  */
 	mtcrf	0x01, %r31
-	bt-	30, .Ldone_return_value
-	bt-	29, .Lfp_return_value
+	bt	31, .Lstruct_return_value
+	bt	30, .Ldone_return_value
+	bt	29, .Lfp_return_value
 	std	%r3, 0(%r30)
 	/* Fall through...  */
 
@@ -130,7 +152,7 @@ ffi_call_LINUX64:
 	/* Restore the registers we used and return.  */
 	mr	%r1, %r28
 	ld	%r0, 16(%r28)
-	ld	%r28, -32(%r1)
+	ld	%r28, -32(%r28)
 	mtlr	%r0
 	ld	%r29, -24(%r1)
 	ld	%r30, -16(%r1)
@@ -147,14 +169,48 @@ ffi_call_LINUX64:
 .Lfloat_return_value:
 	stfs	%f1, 0(%r30)
 	b	.Ldone_return_value
+
+.Lstruct_return_value:
+	bf	29, .Lsmall_struct
+	bf	28, .Lfloat_homog_return_value
+	stfd	%f1, 0(%r30)
+	stfd	%f2, 8(%r30)
+	stfd	%f3, 16(%r30)
+	stfd	%f4, 24(%r30)
+	stfd	%f5, 32(%r30)
+	stfd	%f6, 40(%r30)
+	stfd	%f7, 48(%r30)
+	stfd	%f8, 56(%r30)
+	b	.Ldone_return_value
+
+.Lfloat_homog_return_value:
+	stfs	%f1, 0(%r30)
+	stfs	%f2, 4(%r30)
+	stfs	%f3, 8(%r30)
+	stfs	%f4, 12(%r30)
+	stfs	%f5, 16(%r30)
+	stfs	%f6, 20(%r30)
+	stfs	%f7, 24(%r30)
+	stfs	%f8, 28(%r30)
+	b	.Ldone_return_value
+
+.Lsmall_struct:
+	std	%r3, 0(%r30)
+	std	%r4, 8(%r30)
+	b	.Ldone_return_value
+
 .LFE1:
 	.long	0
 	.byte	0,12,0,1,128,4,0,0
-#ifdef _CALL_LINUX
+# if _CALL_ELF == 2
+	.size	ffi_call_LINUX64,.-ffi_call_LINUX64
+# else
+#  ifdef _CALL_LINUX
 	.size	ffi_call_LINUX64,.-.L.ffi_call_LINUX64
-#else
+#  else
 	.size	.ffi_call_LINUX64,.-.ffi_call_LINUX64
-#endif
+#  endif
+# endif
 
 	.section	.eh_frame,EH_FRAME_FLAGS,@progbits
 .Lframe1:
@@ -197,8 +253,8 @@ ffi_call_LINUX64:
 	.uleb128 0x4
 	.align 3
 .LEFDE1:
-#endif
 
-#if defined __ELF__ && defined __linux__
+# if (defined __ELF__ && defined __linux__) || _CALL_ELF == 2
 	.section	.note.GNU-stack,"",@progbits
+# endif
 #endif
Index: gcc-4_8-branch/libffi/src/powerpc/linux64_closure.S
===================================================================
--- gcc-4_8-branch.orig/libffi/src/powerpc/linux64_closure.S	2013-12-28 17:41:31.722625405 +0100
+++ gcc-4_8-branch/libffi/src/powerpc/linux64_closure.S	2013-12-28 17:50:47.107374797 +0100
@@ -30,18 +30,25 @@
 
 	.file	"linux64_closure.S"
 
-#ifdef __powerpc64__
+#ifdef POWERPC64
 	FFI_HIDDEN (ffi_closure_LINUX64)
 	.globl  ffi_closure_LINUX64
+# if _CALL_ELF == 2
+	.text
+ffi_closure_LINUX64:
+	addis	%r2, %r12, .TOC.-ffi_closure_LINUX64@ha
+	addi	%r2, %r2, .TOC.-ffi_closure_LINUX64@l
+	.localentry ffi_closure_LINUX64, . - ffi_closure_LINUX64
+# else
 	.section        ".opd","aw"
 	.align  3
 ffi_closure_LINUX64:
-#ifdef _CALL_LINUX
+#  ifdef _CALL_LINUX
 	.quad   .L.ffi_closure_LINUX64,.TOC.@tocbase,0
 	.type   ffi_closure_LINUX64,@function
 	.text
 .L.ffi_closure_LINUX64:
-#else
+#  else
 	FFI_HIDDEN (.ffi_closure_LINUX64)
 	.globl  .ffi_closure_LINUX64
 	.quad   .ffi_closure_LINUX64,.TOC.@tocbase,0
@@ -49,61 +56,101 @@ ffi_closure_LINUX64:
 	.type   .ffi_closure_LINUX64,@function
 	.text
 .ffi_closure_LINUX64:
-#endif
+#  endif
+# endif
+
+# if _CALL_ELF == 2
+#  32 byte special reg save area + 64 byte parm save area
+#  + 64 byte retval area + 13*8 fpr save area + round to 16
+#  define STACKFRAME 272
+#  define PARMSAVE 32
+#  define RETVAL PARMSAVE+64
+# else
+#  48 bytes special reg save area + 64 bytes parm save area
+#  + 16 bytes retval area + 13*8 bytes fpr save area + round to 16
+#  define STACKFRAME 240
+#  define PARMSAVE 48
+#  define RETVAL PARMSAVE+64
+# endif
+
 .LFB1:
-	# save general regs into parm save area
-	std	%r3, 48(%r1)
-	std	%r4, 56(%r1)
-	std	%r5, 64(%r1)
-	std	%r6, 72(%r1)
+# if _CALL_ELF == 2
+	ld	%r12, FFI_TRAMPOLINE_SIZE(%r11)		# closure->cif
+	mflr	%r0
+	lwz	%r12, 28(%r12)				# cif->flags
+	mtcrf	0x40, %r12
+	addi	%r12, %r1, PARMSAVE
+	bt	7, .Lparmsave
+	# Our caller has not allocated a parameter save area.
+	# We need to allocate one here and use it to pass gprs to
+	# ffi_closure_helper_LINUX64.
+	addi	%r12, %r1, -STACKFRAME+PARMSAVE
+.Lparmsave:
+	std	%r0, 16(%r1)
+	# Save general regs into parm save area
+	std	%r3, 0(%r12)
+	std	%r4, 8(%r12)
+	std	%r5, 16(%r12)
+	std	%r6, 24(%r12)
+	std	%r7, 32(%r12)
+	std	%r8, 40(%r12)
+	std	%r9, 48(%r12)
+	std	%r10, 56(%r12)
+
+	# load up the pointer to the parm save area
+	mr	%r5, %r12
+# else
 	mflr	%r0
+	# Save general regs into parm save area
+	# This is the parameter save area set up by our caller.
+	std	%r3, PARMSAVE+0(%r1)
+	std	%r4, PARMSAVE+8(%r1)
+	std	%r5, PARMSAVE+16(%r1)
+	std	%r6, PARMSAVE+24(%r1)
+	std	%r7, PARMSAVE+32(%r1)
+	std	%r8, PARMSAVE+40(%r1)
+	std	%r9, PARMSAVE+48(%r1)
+	std	%r10, PARMSAVE+56(%r1)
 
-	std	%r7, 80(%r1)
-	std	%r8, 88(%r1)
-	std	%r9, 96(%r1)
-	std	%r10, 104(%r1)
 	std	%r0, 16(%r1)
 
-	# mandatory 48 bytes special reg save area + 64 bytes parm save area
-	# + 16 bytes retval area + 13*8 bytes fpr save area + round to 16
-	stdu	%r1, -240(%r1)
-.LCFI0:
+	# load up the pointer to the parm save area
+	addi	%r5, %r1, PARMSAVE
+# endif
 
 	# next save fpr 1 to fpr 13
-	stfd  %f1, 128+(0*8)(%r1)
-	stfd  %f2, 128+(1*8)(%r1)
-	stfd  %f3, 128+(2*8)(%r1)
-	stfd  %f4, 128+(3*8)(%r1)
-	stfd  %f5, 128+(4*8)(%r1)
-	stfd  %f6, 128+(5*8)(%r1)
-	stfd  %f7, 128+(6*8)(%r1)
-	stfd  %f8, 128+(7*8)(%r1)
-	stfd  %f9, 128+(8*8)(%r1)
-	stfd  %f10, 128+(9*8)(%r1)
-	stfd  %f11, 128+(10*8)(%r1)
-	stfd  %f12, 128+(11*8)(%r1)
-	stfd  %f13, 128+(12*8)(%r1)
+	stfd	%f1, -104+(0*8)(%r1)
+	stfd	%f2, -104+(1*8)(%r1)
+	stfd	%f3, -104+(2*8)(%r1)
+	stfd	%f4, -104+(3*8)(%r1)
+	stfd	%f5, -104+(4*8)(%r1)
+	stfd	%f6, -104+(5*8)(%r1)
+	stfd	%f7, -104+(6*8)(%r1)
+	stfd	%f8, -104+(7*8)(%r1)
+	stfd	%f9, -104+(8*8)(%r1)
+	stfd	%f10, -104+(9*8)(%r1)
+	stfd	%f11, -104+(10*8)(%r1)
+	stfd	%f12, -104+(11*8)(%r1)
+	stfd	%f13, -104+(12*8)(%r1)
 
-	# set up registers for the routine that actually does the work
-	# get the context pointer from the trampoline
-	mr %r3, %r11
+	# load up the pointer to the saved fpr registers
+	addi	%r6, %r1, -104
 
-	# now load up the pointer to the result storage
-	addi %r4, %r1, 112
+	# load up the pointer to the result storage
+	addi	%r4, %r1, -STACKFRAME+RETVAL
 
-	# now load up the pointer to the parameter save area
-	# in the previous frame
-	addi %r5, %r1, 240 + 48
+	stdu	%r1, -STACKFRAME(%r1)
+.LCFI0:
 
-	# now load up the pointer to the saved fpr registers */
-	addi %r6, %r1, 128
+	# get the context pointer from the trampoline
+	mr	%r3, %r11
 
 	# make the call
-#ifdef _CALL_LINUX
+# if defined _CALL_LINUX || _CALL_ELF == 2
 	bl ffi_closure_helper_LINUX64
-#else
+# else
 	bl .ffi_closure_helper_LINUX64
-#endif
+# endif
 .Lret:
 
 	# now r3 contains the return type
@@ -112,10 +159,12 @@ ffi_closure_LINUX64:
 
 	# look up the proper starting point in table
 	# by using return type as offset
+	ld %r0, STACKFRAME+16(%r1)
+	cmpldi %r3, FFI_V2_TYPE_SMALL_STRUCT
+	bge .Lsmall
 	mflr %r4		# move address of .Lret to r4
 	sldi %r3, %r3, 4	# now multiply return type by 16
 	addi %r4, %r4, .Lret_type0 - .Lret
-	ld %r0, 240+16(%r1)
 	add %r3, %r3, %r4	# add contents of table to table address
 	mtctr %r3
 	bctr			# jump to it
@@ -128,117 +177,175 @@ ffi_closure_LINUX64:
 .Lret_type0:
 # case FFI_TYPE_VOID
 	mtlr %r0
-	addi %r1, %r1, 240
+	addi %r1, %r1, STACKFRAME
 	blr
 	nop
 # case FFI_TYPE_INT
-#ifdef __LITTLE_ENDIAN__
-	lwa %r3, 112+0(%r1)
-#else
-	lwa %r3, 112+4(%r1)
-#endif
+# ifdef __LITTLE_ENDIAN__
+	lwa %r3, RETVAL+0(%r1)
+# else
+	lwa %r3, RETVAL+4(%r1)
+# endif
 	mtlr %r0
-	addi %r1, %r1, 240
+	addi %r1, %r1, STACKFRAME
 	blr
 # case FFI_TYPE_FLOAT
-	lfs %f1, 112+0(%r1)
+	lfs %f1, RETVAL+0(%r1)
 	mtlr %r0
-	addi %r1, %r1, 240
+	addi %r1, %r1, STACKFRAME
 	blr
 # case FFI_TYPE_DOUBLE
-	lfd %f1, 112+0(%r1)
+	lfd %f1, RETVAL+0(%r1)
 	mtlr %r0
-	addi %r1, %r1, 240
+	addi %r1, %r1, STACKFRAME
 	blr
 # case FFI_TYPE_LONGDOUBLE
-	lfd %f1, 112+0(%r1)
+	lfd %f1, RETVAL+0(%r1)
 	mtlr %r0
-	lfd %f2, 112+8(%r1)
+	lfd %f2, RETVAL+8(%r1)
 	b .Lfinish
 # case FFI_TYPE_UINT8
-#ifdef __LITTLE_ENDIAN__
-	lbz %r3, 112+0(%r1)
-#else
-	lbz %r3, 112+7(%r1)
-#endif
+# ifdef __LITTLE_ENDIAN__
+	lbz %r3, RETVAL+0(%r1)
+# else
+	lbz %r3, RETVAL+7(%r1)
+# endif
 	mtlr %r0
-	addi %r1, %r1, 240
+	addi %r1, %r1, STACKFRAME
 	blr
 # case FFI_TYPE_SINT8
-#ifdef __LITTLE_ENDIAN__
-	lbz %r3, 112+0(%r1)
-#else
-	lbz %r3, 112+7(%r1)
-#endif
+# ifdef __LITTLE_ENDIAN__
+	lbz %r3, RETVAL+0(%r1)
+# else
+	lbz %r3, RETVAL+7(%r1)
+# endif
 	extsb %r3,%r3
 	mtlr %r0
 	b .Lfinish
 # case FFI_TYPE_UINT16
-#ifdef __LITTLE_ENDIAN__
-	lhz %r3, 112+0(%r1)
-#else
-	lhz %r3, 112+6(%r1)
-#endif
+# ifdef __LITTLE_ENDIAN__
+	lhz %r3, RETVAL+0(%r1)
+# else
+	lhz %r3, RETVAL+6(%r1)
+# endif
 	mtlr %r0
 .Lfinish:
-	addi %r1, %r1, 240
+	addi %r1, %r1, STACKFRAME
 	blr
 # case FFI_TYPE_SINT16
-#ifdef __LITTLE_ENDIAN__
-	lha %r3, 112+0(%r1)
-#else
-	lha %r3, 112+6(%r1)
-#endif
+# ifdef __LITTLE_ENDIAN__
+	lha %r3, RETVAL+0(%r1)
+# else
+	lha %r3, RETVAL+6(%r1)
+# endif
 	mtlr %r0
-	addi %r1, %r1, 240
+	addi %r1, %r1, STACKFRAME
 	blr
 # case FFI_TYPE_UINT32
-#ifdef __LITTLE_ENDIAN__
-	lwz %r3, 112+0(%r1)
-#else
-	lwz %r3, 112+4(%r1)
-#endif
+# ifdef __LITTLE_ENDIAN__
+	lwz %r3, RETVAL+0(%r1)
+# else
+	lwz %r3, RETVAL+4(%r1)
+# endif
 	mtlr %r0
-	addi %r1, %r1, 240
+	addi %r1, %r1, STACKFRAME
 	blr
 # case FFI_TYPE_SINT32
-#ifdef __LITTLE_ENDIAN__
-	lwa %r3, 112+0(%r1)
-#else
-	lwa %r3, 112+4(%r1)
-#endif
+# ifdef __LITTLE_ENDIAN__
+	lwa %r3, RETVAL+0(%r1)
+# else
+	lwa %r3, RETVAL+4(%r1)
+# endif
 	mtlr %r0
-	addi %r1, %r1, 240
+	addi %r1, %r1, STACKFRAME
 	blr
 # case FFI_TYPE_UINT64
-	ld %r3, 112+0(%r1)
+	ld %r3, RETVAL+0(%r1)
 	mtlr %r0
-	addi %r1, %r1, 240
+	addi %r1, %r1, STACKFRAME
 	blr
 # case FFI_TYPE_SINT64
-	ld %r3, 112+0(%r1)
+	ld %r3, RETVAL+0(%r1)
 	mtlr %r0
-	addi %r1, %r1, 240
+	addi %r1, %r1, STACKFRAME
 	blr
 # case FFI_TYPE_STRUCT
 	mtlr %r0
-	addi %r1, %r1, 240
+	addi %r1, %r1, STACKFRAME
 	blr
 	nop
 # case FFI_TYPE_POINTER
-	ld %r3, 112+0(%r1)
+	ld %r3, RETVAL+0(%r1)
 	mtlr %r0
-	addi %r1, %r1, 240
+	addi %r1, %r1, STACKFRAME
+	blr
+# case FFI_V2_TYPE_FLOAT_HOMOG
+	lfs %f1, RETVAL+0(%r1)
+	lfs %f2, RETVAL+4(%r1)
+	lfs %f3, RETVAL+8(%r1)
+	b .Lmorefloat
+# case FFI_V2_TYPE_DOUBLE_HOMOG
+	lfd %f1, RETVAL+0(%r1)
+	lfd %f2, RETVAL+8(%r1)
+	lfd %f3, RETVAL+16(%r1)
+	lfd %f4, RETVAL+24(%r1)
+	mtlr %r0
+	lfd %f5, RETVAL+32(%r1)
+	lfd %f6, RETVAL+40(%r1)
+	lfd %f7, RETVAL+48(%r1)
+	lfd %f8, RETVAL+56(%r1)
+	addi %r1, %r1, STACKFRAME
+	blr
+.Lmorefloat:
+	lfs %f4, RETVAL+12(%r1)
+	mtlr %r0
+	lfs %f5, RETVAL+16(%r1)
+	lfs %f6, RETVAL+20(%r1)
+	lfs %f7, RETVAL+24(%r1)
+	lfs %f8, RETVAL+28(%r1)
+	addi %r1, %r1, STACKFRAME
+	blr
+.Lsmall:
+# ifdef __LITTLE_ENDIAN__
+	ld %r3,RETVAL+0(%r1)
+	mtlr %r0
+	ld %r4,RETVAL+8(%r1)
+	addi %r1, %r1, STACKFRAME
+	blr
+# else
+	# A struct smaller than a dword is returned in the low bits of r3
+	# i.e. right justified.  Larger structs are passed left justified
+	# in r3 and r4.  The return value area on the stack will have
+	# the structs as they are usually stored in memory.
+	cmpldi %r3, FFI_V2_TYPE_SMALL_STRUCT + 7 # size 8 bytes?
+	neg %r5, %r3
+	ld %r3,RETVAL+0(%r1)
+	blt .Lsmalldown
+	mtlr %r0
+	ld %r4,RETVAL+8(%r1)
+	addi %r1, %r1, STACKFRAME
+	blr
+.Lsmalldown:
+	addi %r5, %r5, FFI_V2_TYPE_SMALL_STRUCT + 7
+	mtlr %r0
+	sldi %r5, %r5, 3
+	addi %r1, %r1, STACKFRAME
+	srd %r3, %r3, %r5
 	blr
-# esac
+# endif
+
 .LFE1:
 	.long	0
 	.byte	0,12,0,1,128,0,0,0
-#ifdef _CALL_LINUX
+# if _CALL_ELF == 2
+	.size	ffi_closure_LINUX64,.-ffi_closure_LINUX64
+# else
+#  ifdef _CALL_LINUX
 	.size	ffi_closure_LINUX64,.-.L.ffi_closure_LINUX64
-#else
+#  else
 	.size	.ffi_closure_LINUX64,.-.ffi_closure_LINUX64
-#endif
+#  endif
+# endif
 
 	.section	.eh_frame,EH_FRAME_FLAGS,@progbits
 .Lframe1:
@@ -267,14 +374,14 @@ ffi_closure_LINUX64:
 	.byte	0x2	 # DW_CFA_advance_loc1
 	.byte	.LCFI0-.LFB1
 	.byte	0xe	 # DW_CFA_def_cfa_offset
-	.uleb128 240
+	.uleb128 STACKFRAME
 	.byte	0x11	 # DW_CFA_offset_extended_sf
 	.uleb128 0x41
 	.sleb128 -2
 	.align 3
 .LEFDE1:
-#endif
 
-#if defined __ELF__ && defined __linux__
+# if defined __ELF__ && defined __linux__
 	.section	.note.GNU-stack,"",@progbits
+# endif
 #endif
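Reviewer note on the linux64_closure.S changes above: the C sketch below reproduces only the arithmetic behind the two STACKFRAME values and the shift used on the big-endian .Lsmalldown path. The function names (round_up16, closure_frame_size, small_struct_shift, right_justify) are hypothetical and are not part of libffi. It assumes, as the "# size 8 bytes?" comment suggests, that the return type code minus FFI_V2_TYPE_SMALL_STRUCT equals the struct size minus 1.

```c
#include <assert.h>
#include <stdint.h>

/* Round n up to the next multiple of 16, as the frame comments require.  */
unsigned round_up16(unsigned n)
{
  return (n + 15u) & ~15u;
}

/* Frame size = special reg save area + parm save area + retval area
   + 13 FPR slots of 8 bytes each, rounded up to 16.  */
unsigned closure_frame_size(unsigned special, unsigned parmsave,
                            unsigned retval)
{
  return round_up16(special + parmsave + retval + 13u * 8u);
}

/* offset is (return type code - FFI_V2_TYPE_SMALL_STRUCT), assumed to be
   the struct size minus 1.  The asm computes
   (FFI_V2_TYPE_SMALL_STRUCT + 7 - type) * 8 with neg/addi/sldi.  */
unsigned small_struct_shift(unsigned offset)
{
  assert(offset <= 7);
  return (7u - offset) * 8u;
}

/* Right-justify a struct of (offset + 1) bytes that was loaded from
   memory as one big-endian doubleword, mirroring the srd instruction.  */
uint64_t right_justify(uint64_t dword, unsigned offset)
{
  return dword >> small_struct_shift(offset);
}
```

With these definitions, closure_frame_size(32, 64, 64) gives 272 and closure_frame_size(48, 64, 16) gives 240, matching the ELFv2 and ELFv1 STACKFRAME defines. A 2-byte struct (offset 1) is shifted right by 48 bits. In the assembly the shift path is only taken for offsets below 7; the sketch simply returns a shift of 0 for offset 7.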
Index: gcc-4_8-branch/libffi/src/powerpc/ppc_closure.S
===================================================================
--- gcc-4_8-branch.orig/libffi/src/powerpc/ppc_closure.S	2013-12-28 17:41:31.722625405 +0100
+++ gcc-4_8-branch/libffi/src/powerpc/ppc_closure.S	2013-12-28 17:50:47.110374812 +0100
@@ -31,7 +31,7 @@
 
 	.file   "ppc_closure.S"
 
-#ifndef __powerpc64__
+#ifndef POWERPC64
 
 ENTRY(ffi_closure_SYSV)
 .LFB1:
@@ -238,7 +238,7 @@ ENTRY(ffi_closure_SYSV)
 	lwz %r3,112+0(%r1)
 	lwz %r4,112+4(%r1)
 	lwz %r5,112+8(%r1)
-	bl .Luint128
+	b .Luint128
 
 # The return types below are only used when the ABI type is FFI_SYSV.
 # case FFI_SYSV_TYPE_SMALL_STRUCT + 1. One byte struct.
@@ -378,8 +378,7 @@ END(ffi_closure_SYSV)
 	.align 2
 .LEFDE1:
 
-#endif
-
 #if defined __ELF__ && defined __linux__
 	.section	.note.GNU-stack,"",@progbits
 #endif
+#endif
Index: gcc-4_8-branch/libffi/testsuite/libffi.call/cls_double_va.c
===================================================================
--- gcc-4_8-branch.orig/libffi/testsuite/libffi.call/cls_double_va.c	2013-12-28 17:41:31.723625410 +0100
+++ gcc-4_8-branch/libffi/testsuite/libffi.call/cls_double_va.c	2013-12-28 17:50:47.113374827 +0100
@@ -38,26 +38,24 @@ int main (void)
 
 	/* This printf call is variadic */
 	CHECK(ffi_prep_cif_var(&cif, FFI_DEFAULT_ABI, 1, 2, &ffi_type_sint,
-		arg_types) == FFI_OK);
+			       arg_types) == FFI_OK);
 
 	args[0] = &format;
 	args[1] = &doubleArg;
 	args[2] = NULL;
 
 	ffi_call(&cif, FFI_FN(printf), &res, args);
-	// { dg-output "7.0" }
+	/* { dg-output "7.0" } */
 	printf("res: %d\n", (int) res);
-	// { dg-output "\nres: 4" }
+	/* { dg-output "\nres: 4" } */
 
-	/* The call to cls_double_va_fn is static, so have to use a normal prep_cif */
-	CHECK(ffi_prep_cif(&cif, FFI_DEFAULT_ABI, 2, &ffi_type_sint, arg_types) == FFI_OK);
+	CHECK(ffi_prep_closure_loc(pcl, &cif, cls_double_va_fn, NULL,
+				   code) == FFI_OK);
 
-	CHECK(ffi_prep_closure_loc(pcl, &cif, cls_double_va_fn, NULL, code) == FFI_OK);
-
-	res	= ((int(*)(char*, double))(code))(format, doubleArg);
-	// { dg-output "\n7.0" }
+	res = ((int(*)(char*, ...))(code))(format, doubleArg);
+	/* { dg-output "\n7.0" } */
 	printf("res: %d\n", (int) res);
-	// { dg-output "\nres: 4" }
+	/* { dg-output "\nres: 4" } */
 
 	exit(0);
 }
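Reviewer note on the test change above: the closure is now prepared from the cif that ffi_prep_cif_var set up, and the test calls it through a variadic pointer type, int(*)(char*, ...), rather than a fixed-argument one. The sketch below shows the same calling pattern in plain C without libffi. The names sum_va and call_through_variadic_pointer are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdarg.h>

/* A variadic callee: adds up `count` double arguments and truncates
   the total to int.  */
int sum_va(int count, ...)
{
  va_list ap;
  double total = 0.0;
  int i;

  assert(count >= 0);
  va_start(ap, count);
  for (i = 0; i < count; i++)
    total += va_arg(ap, double);
  va_end(ap);
  return (int) total;
}

/* Call through a pointer whose type is itself variadic, so the call
   site is compiled with the variadic calling convention.  */
int call_through_variadic_pointer(double a, double b)
{
  int (*fn)(int, ...) = sum_va;
  return fn(2, a, b);
}
```

Calling a variadic function through a pointer to a non-variadic function type is undefined behavior in C, which is presumably why the cast in the test was changed along with the cif setup.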
Index: gcc-4_8-branch/libffi/testsuite/libffi.call/cls_longdouble_va.c
===================================================================
--- gcc-4_8-branch.orig/libffi/testsuite/libffi.call/cls_longdouble_va.c	2013-12-28 17:41:31.723625410 +0100
+++ gcc-4_8-branch/libffi/testsuite/libffi.call/cls_longdouble_va.c	2013-12-28 17:50:47.116374842 +0100
@@ -38,27 +38,24 @@ int main (void)
 
 	/* This printf call is variadic */
 	CHECK(ffi_prep_cif_var(&cif, FFI_DEFAULT_ABI, 1, 2, &ffi_type_sint,
-		arg_types) == FFI_OK);
+			       arg_types) == FFI_OK);
 
 	args[0] = &format;
 	args[1] = &ldArg;
 	args[2] = NULL;
 
 	ffi_call(&cif, FFI_FN(printf), &res, args);
-	// { dg-output "7.0" }
+	/* { dg-output "7.0" } */
 	printf("res: %d\n", (int) res);
-	// { dg-output "\nres: 4" }
+	/* { dg-output "\nres: 4" } */
 
-	/* The call to cls_longdouble_va_fn is static, so have to use a normal prep_cif */
-	CHECK(ffi_prep_cif(&cif, FFI_DEFAULT_ABI, 2, &ffi_type_sint,
-		arg_types) == FFI_OK);
+	CHECK(ffi_prep_closure_loc(pcl, &cif, cls_longdouble_va_fn, NULL,
+				   code) == FFI_OK);
 
-	CHECK(ffi_prep_closure_loc(pcl, &cif, cls_longdouble_va_fn, NULL, code) == FFI_OK);
-
-	res	= ((int(*)(char*, long double))(code))(format, ldArg);
-	// { dg-output "\n7.0" }
+	res = ((int(*)(char*, ...))(code))(format, ldArg);
+	/* { dg-output "\n7.0" } */
 	printf("res: %d\n", (int) res);
-	// { dg-output "\nres: 4" }
+	/* { dg-output "\nres: 4" } */
 
 	exit(0);
 }
Index: gcc-4_8-branch/libffi/Makefile.am
===================================================================
--- gcc-4_8-branch.orig/libffi/Makefile.am	2013-12-28 17:41:31.721625400 +0100
+++ gcc-4_8-branch/libffi/Makefile.am	2013-12-28 17:50:47.119374857 +0100
@@ -15,10 +15,12 @@ EXTRA_DIST = LICENSE ChangeLog.v1 Change
 	 src/ia64/unix.S src/mips/ffi.c src/mips/n32.S src/mips/o32.S	\
 	 src/mips/ffitarget.h src/m32r/ffi.c src/m32r/sysv.S		\
 	 src/m32r/ffitarget.h src/m68k/ffi.c src/m68k/sysv.S		\
-	 src/m68k/ffitarget.h src/powerpc/ffi.c src/powerpc/sysv.S	\
-	 src/powerpc/linux64.S src/powerpc/linux64_closure.S		\
-	 src/powerpc/ppc_closure.S src/powerpc/asm.h			\
-	src/powerpc/aix.S src/powerpc/darwin.S				\
+	 src/m68k/ffitarget.h						\
+	src/powerpc/ffi.c src/powerpc/ffi_powerpc.h			\
+	src/powerpc/ffi_sysv.c src/powerpc/ffi_linux64.c		\
+	src/powerpc/sysv.S src/powerpc/linux64.S			\
+	src/powerpc/linux64_closure.S src/powerpc/ppc_closure.S		\
+	src/powerpc/asm.h src/powerpc/aix.S src/powerpc/darwin.S	\
 	src/powerpc/aix_closure.S src/powerpc/darwin_closure.S		\
 	src/powerpc/ffi_darwin.c src/powerpc/ffitarget.h		\
 	src/s390/ffi.c src/s390/sysv.S src/s390/ffitarget.h		\
@@ -179,7 +181,7 @@ if M68K
 nodist_libffi_la_SOURCES += src/m68k/ffi.c src/m68k/sysv.S
 endif
 if POWERPC
-nodist_libffi_la_SOURCES += src/powerpc/ffi.c src/powerpc/sysv.S src/powerpc/ppc_closure.S src/powerpc/linux64.S src/powerpc/linux64_closure.S
+nodist_libffi_la_SOURCES += src/powerpc/ffi.c src/powerpc/ffi_sysv.c src/powerpc/ffi_linux64.c src/powerpc/sysv.S src/powerpc/ppc_closure.S src/powerpc/linux64.S src/powerpc/linux64_closure.S
 endif
 if POWERPC_AIX
 nodist_libffi_la_SOURCES += src/powerpc/ffi_darwin.c src/powerpc/aix.S src/powerpc/aix_closure.S
@@ -188,7 +190,7 @@ if POWERPC_DARWIN
 nodist_libffi_la_SOURCES += src/powerpc/ffi_darwin.c src/powerpc/darwin.S src/powerpc/darwin_closure.S
 endif
 if POWERPC_FREEBSD
-nodist_libffi_la_SOURCES += src/powerpc/ffi.c src/powerpc/sysv.S src/powerpc/ppc_closure.S
+nodist_libffi_la_SOURCES += src/powerpc/ffi.c src/powerpc/ffi_sysv.c src/powerpc/sysv.S src/powerpc/ppc_closure.S
 endif
 if AARCH64
 nodist_libffi_la_SOURCES += src/aarch64/sysv.S src/aarch64/ffi.c
Index: gcc-4_8-branch/libffi/Makefile.in
===================================================================
--- gcc-4_8-branch.orig/libffi/Makefile.in	2013-12-28 17:41:31.721625400 +0100
+++ gcc-4_8-branch/libffi/Makefile.in	2013-12-28 17:50:47.122374872 +0100
@@ -48,10 +48,10 @@ target_triplet = @target@
 @IA64_TRUE@am__append_11 = src/ia64/ffi.c src/ia64/unix.S
 @M32R_TRUE@am__append_12 = src/m32r/sysv.S src/m32r/ffi.c
 @M68K_TRUE@am__append_13 = src/m68k/ffi.c src/m68k/sysv.S
-@POWERPC_TRUE@am__append_14 = src/powerpc/ffi.c src/powerpc/sysv.S src/powerpc/ppc_closure.S src/powerpc/linux64.S src/powerpc/linux64_closure.S
+@POWERPC_TRUE@am__append_14 = src/powerpc/ffi.c src/powerpc/ffi_sysv.c src/powerpc/ffi_linux64.c src/powerpc/sysv.S src/powerpc/ppc_closure.S src/powerpc/linux64.S src/powerpc/linux64_closure.S
 @POWERPC_AIX_TRUE@am__append_15 = src/powerpc/ffi_darwin.c src/powerpc/aix.S src/powerpc/aix_closure.S
 @POWERPC_DARWIN_TRUE@am__append_16 = src/powerpc/ffi_darwin.c src/powerpc/darwin.S src/powerpc/darwin_closure.S
-@POWERPC_FREEBSD_TRUE@am__append_17 = src/powerpc/ffi.c src/powerpc/sysv.S src/powerpc/ppc_closure.S
+@POWERPC_FREEBSD_TRUE@am__append_17 = src/powerpc/ffi.c src/powerpc/ffi_sysv.c src/powerpc/sysv.S src/powerpc/ppc_closure.S
 @AARCH64_TRUE@am__append_18 = src/aarch64/sysv.S src/aarch64/ffi.c
 @ARM_TRUE@am__append_19 = src/arm/sysv.S src/arm/ffi.c
 @ARM_TRUE@@FFI_EXEC_TRAMPOLINE_TABLE_TRUE@am__append_20 = src/arm/trampoline.S
@@ -133,7 +133,9 @@ am_libffi_la_OBJECTS = src/prep_cif.lo s
 @IA64_TRUE@am__objects_11 = src/ia64/ffi.lo src/ia64/unix.lo
 @M32R_TRUE@am__objects_12 = src/m32r/sysv.lo src/m32r/ffi.lo
 @M68K_TRUE@am__objects_13 = src/m68k/ffi.lo src/m68k/sysv.lo
-@POWERPC_TRUE@am__objects_14 = src/powerpc/ffi.lo src/powerpc/sysv.lo \
+@POWERPC_TRUE@am__objects_14 = src/powerpc/ffi.lo \
+@POWERPC_TRUE@	src/powerpc/ffi_sysv.lo \
+@POWERPC_TRUE@	src/powerpc/ffi_linux64.lo src/powerpc/sysv.lo \
 @POWERPC_TRUE@	src/powerpc/ppc_closure.lo \
 @POWERPC_TRUE@	src/powerpc/linux64.lo \
 @POWERPC_TRUE@	src/powerpc/linux64_closure.lo
@@ -144,6 +146,7 @@ am_libffi_la_OBJECTS = src/prep_cif.lo s
 @POWERPC_DARWIN_TRUE@	src/powerpc/darwin.lo \
 @POWERPC_DARWIN_TRUE@	src/powerpc/darwin_closure.lo
 @POWERPC_FREEBSD_TRUE@am__objects_17 = src/powerpc/ffi.lo \
+@POWERPC_FREEBSD_TRUE@	src/powerpc/ffi_sysv.lo \
 @POWERPC_FREEBSD_TRUE@	src/powerpc/sysv.lo \
 @POWERPC_FREEBSD_TRUE@	src/powerpc/ppc_closure.lo
 @AARCH64_TRUE@am__objects_18 = src/aarch64/sysv.lo src/aarch64/ffi.lo
@@ -278,6 +281,7 @@ FFI_EXEC_TRAMPOLINE_TABLE = @FFI_EXEC_TR
 FGREP = @FGREP@
 GREP = @GREP@
 HAVE_LONG_DOUBLE = @HAVE_LONG_DOUBLE@
+HAVE_LONG_DOUBLE_VARIANT = @HAVE_LONG_DOUBLE_VARIANT@
 INSTALL = @INSTALL@
 INSTALL_DATA = @INSTALL_DATA@
 INSTALL_PROGRAM = @INSTALL_PROGRAM@
@@ -387,10 +391,12 @@ EXTRA_DIST = LICENSE ChangeLog.v1 Change
 	 src/ia64/unix.S src/mips/ffi.c src/mips/n32.S src/mips/o32.S	\
 	 src/mips/ffitarget.h src/m32r/ffi.c src/m32r/sysv.S		\
 	 src/m32r/ffitarget.h src/m68k/ffi.c src/m68k/sysv.S		\
-	 src/m68k/ffitarget.h src/powerpc/ffi.c src/powerpc/sysv.S	\
-	 src/powerpc/linux64.S src/powerpc/linux64_closure.S		\
-	 src/powerpc/ppc_closure.S src/powerpc/asm.h			\
-	src/powerpc/aix.S src/powerpc/darwin.S				\
+	 src/m68k/ffitarget.h						\
+	src/powerpc/ffi.c src/powerpc/ffi_powerpc.h			\
+	src/powerpc/ffi_sysv.c src/powerpc/ffi_linux64.c		\
+	src/powerpc/sysv.S src/powerpc/linux64.S			\
+	src/powerpc/linux64_closure.S src/powerpc/ppc_closure.S		\
+	src/powerpc/asm.h src/powerpc/aix.S src/powerpc/darwin.S	\
 	src/powerpc/aix_closure.S src/powerpc/darwin_closure.S		\
 	src/powerpc/ffi_darwin.c src/powerpc/ffitarget.h		\
 	src/s390/ffi.c src/s390/sysv.S src/s390/ffitarget.h		\
@@ -711,6 +717,10 @@ src/powerpc/$(DEPDIR)/$(am__dirstamp):
 	@: > src/powerpc/$(DEPDIR)/$(am__dirstamp)
 src/powerpc/ffi.lo: src/powerpc/$(am__dirstamp) \
 	src/powerpc/$(DEPDIR)/$(am__dirstamp)
+src/powerpc/ffi_sysv.lo: src/powerpc/$(am__dirstamp) \
+	src/powerpc/$(DEPDIR)/$(am__dirstamp)
+src/powerpc/ffi_linux64.lo: src/powerpc/$(am__dirstamp) \
+	src/powerpc/$(DEPDIR)/$(am__dirstamp)
 src/powerpc/sysv.lo: src/powerpc/$(am__dirstamp) \
 	src/powerpc/$(DEPDIR)/$(am__dirstamp)
 src/powerpc/ppc_closure.lo: src/powerpc/$(am__dirstamp) \
@@ -912,6 +922,10 @@ mostlyclean-compile:
 	-rm -f src/powerpc/ffi.lo
 	-rm -f src/powerpc/ffi_darwin.$(OBJEXT)
 	-rm -f src/powerpc/ffi_darwin.lo
+	-rm -f src/powerpc/ffi_linux64.$(OBJEXT)
+	-rm -f src/powerpc/ffi_linux64.lo
+	-rm -f src/powerpc/ffi_sysv.$(OBJEXT)
+	-rm -f src/powerpc/ffi_sysv.lo
 	-rm -f src/powerpc/linux64.$(OBJEXT)
 	-rm -f src/powerpc/linux64.lo
 	-rm -f src/powerpc/linux64_closure.$(OBJEXT)
@@ -1009,6 +1023,8 @@ distclean-compile:
 @AMDEP_TRUE@@am__include@ @am__quote@src/powerpc/$(DEPDIR)/darwin_closure.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@src/powerpc/$(DEPDIR)/ffi.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@src/powerpc/$(DEPDIR)/ffi_darwin.Plo@am__quote@
+@AMDEP_TRUE@@am__include@ @am__quote@src/powerpc/$(DEPDIR)/ffi_linux64.Plo@am__quote@
+@AMDEP_TRUE@@am__include@ @am__quote@src/powerpc/$(DEPDIR)/ffi_sysv.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@src/powerpc/$(DEPDIR)/linux64.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@src/powerpc/$(DEPDIR)/linux64_closure.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@src/powerpc/$(DEPDIR)/ppc_closure.Plo@am__quote@
Index: gcc-4_8-branch/libffi/configure
===================================================================
--- gcc-4_8-branch.orig/libffi/configure	2013-12-28 17:50:38.678332861 +0100
+++ gcc-4_8-branch/libffi/configure	2013-12-28 17:50:47.129374907 +0100
@@ -613,6 +613,7 @@ TARGET
 FFI_EXEC_TRAMPOLINE_TABLE
 FFI_EXEC_TRAMPOLINE_TABLE_FALSE
 FFI_EXEC_TRAMPOLINE_TABLE_TRUE
+HAVE_LONG_DOUBLE_VARIANT
 HAVE_LONG_DOUBLE
 ALLOCA
 TILE_FALSE
@@ -10906,7 +10907,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 10909 "configure"
+#line 10910 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -11012,7 +11013,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11015 "configure"
+#line 11016 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -11449,6 +11450,7 @@ fi
 

 TARGETDIR="unknown"
+HAVE_LONG_DOUBLE_VARIANT=0
 case "$host" in
   aarch64*-*-*)
 	TARGET=AARCH64; TARGETDIR=aarch64
@@ -11546,6 +11548,7 @@ case "$host" in
 
   powerpc*-*-linux* | powerpc-*-sysv*)
 	TARGET=POWERPC; TARGETDIR=powerpc
+	HAVE_LONG_DOUBLE_VARIANT=1
 	;;
   powerpc-*-amigaos*)
 	TARGET=POWERPC; TARGETDIR=powerpc
@@ -11561,6 +11564,7 @@ case "$host" in
 	;;
   powerpc-*-freebsd* | powerpc-*-openbsd*)
 	TARGET=POWERPC_FREEBSD; TARGETDIR=powerpc
+	HAVE_LONG_DOUBLE_VARIANT=1
 	;;
   powerpc64-*-freebsd*)
 	TARGET=POWERPC; TARGETDIR=powerpc
@@ -12236,17 +12240,25 @@ _ACEOF
 # Also AC_SUBST this variable for ffi.h.
 if test -z "$HAVE_LONG_DOUBLE"; then
   HAVE_LONG_DOUBLE=0
-  if test $ac_cv_sizeof_double != $ac_cv_sizeof_long_double; then
-    if test $ac_cv_sizeof_long_double != 0; then
+  if test $ac_cv_sizeof_long_double != 0; then
+    if test $HAVE_LONG_DOUBLE_VARIANT != 0; then
+
+$as_echo "#define HAVE_LONG_DOUBLE_VARIANT 1" >>confdefs.h
+
       HAVE_LONG_DOUBLE=1
+    else
+      if test $ac_cv_sizeof_double != $ac_cv_sizeof_long_double; then
+        HAVE_LONG_DOUBLE=1
 
 $as_echo "#define HAVE_LONG_DOUBLE 1" >>confdefs.h
 
+      fi
     fi
   fi
 fi
 

+
  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether byte ordering is bigendian" >&5
 $as_echo_n "checking whether byte ordering is bigendian... " >&6; }
 if test "${ac_cv_c_bigendian+set}" = set; then :
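Reviewer note on the long double detection above (configure.ac carries the same logic): the decision can be restated as a small C function, which may make the new branch order easier to check. This is a sketch only. The struct and function names are hypothetical, and it models just the case where HAVE_LONG_DOUBLE was not already set by the user.

```c
#include <assert.h>

struct ld_config {
  int have_long_double;         /* HAVE_LONG_DOUBLE shell variable (AC_SUBST) */
  int define_have_long_double;  /* "#define HAVE_LONG_DOUBLE 1" emitted */
  int define_variant;           /* "#define HAVE_LONG_DOUBLE_VARIANT 1" emitted */
};

/* variant mirrors the HAVE_LONG_DOUBLE_VARIANT shell variable that the
   host case statement sets to 1 for the PowerPC targets shown above.  */
struct ld_config detect_long_double(int sizeof_double,
                                    int sizeof_long_double,
                                    int variant)
{
  struct ld_config c = { 0, 0, 0 };

  assert(sizeof_double > 0 && sizeof_long_double >= 0);
  if (sizeof_long_double != 0)
    {
      if (variant != 0)
        {
          /* Target supports more than one long double size.  */
          c.define_variant = 1;
          c.have_long_double = 1;
        }
      else if (sizeof_double != sizeof_long_double)
        {
          /* Original behavior: long double is bigger than double.  */
          c.have_long_double = 1;
          c.define_have_long_double = 1;
        }
    }
  return c;
}
```

Note that in the variant branch only HAVE_LONG_DOUBLE_VARIANT is written to confdefs.h, while the substituted HAVE_LONG_DOUBLE value is still set to 1 even when long double and double have the same size.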
Index: gcc-4_8-branch/libffi/configure.ac
===================================================================
--- gcc-4_8-branch.orig/libffi/configure.ac	2013-12-28 17:41:31.721625400 +0100
+++ gcc-4_8-branch/libffi/configure.ac	2013-12-28 17:50:47.132374922 +0100
@@ -65,6 +65,7 @@ dnl The -no-testsuite modules omit the t
 AM_CONDITIONAL(TESTSUBDIR, test -d $srcdir/testsuite)
 
 TARGETDIR="unknown"
+HAVE_LONG_DOUBLE_VARIANT=0
 case "$host" in
   aarch64*-*-*)
 	TARGET=AARCH64; TARGETDIR=aarch64
@@ -162,6 +163,7 @@ case "$host" in
 
   powerpc*-*-linux* | powerpc-*-sysv*)
 	TARGET=POWERPC; TARGETDIR=powerpc
+	HAVE_LONG_DOUBLE_VARIANT=1
 	;;
   powerpc-*-amigaos*)
 	TARGET=POWERPC; TARGETDIR=powerpc
@@ -177,6 +179,7 @@ case "$host" in
 	;;
   powerpc-*-freebsd* | powerpc-*-openbsd*)
 	TARGET=POWERPC_FREEBSD; TARGETDIR=powerpc
+	HAVE_LONG_DOUBLE_VARIANT=1
 	;;
   powerpc64-*-freebsd*)
 	TARGET=POWERPC; TARGETDIR=powerpc
@@ -273,14 +276,20 @@ AC_CHECK_SIZEOF(long double)
 # Also AC_SUBST this variable for ffi.h.
 if test -z "$HAVE_LONG_DOUBLE"; then
   HAVE_LONG_DOUBLE=0
-  if test $ac_cv_sizeof_double != $ac_cv_sizeof_long_double; then
-    if test $ac_cv_sizeof_long_double != 0; then
+  if test $ac_cv_sizeof_long_double != 0; then
+    if test $HAVE_LONG_DOUBLE_VARIANT != 0; then
+      AC_DEFINE(HAVE_LONG_DOUBLE_VARIANT, 1, [Define if you support more than one size of the long double type])
       HAVE_LONG_DOUBLE=1
-      AC_DEFINE(HAVE_LONG_DOUBLE, 1, [Define if you have the long double type and it is bigger than a double])
+    else
+      if test $ac_cv_sizeof_double != $ac_cv_sizeof_long_double; then
+        HAVE_LONG_DOUBLE=1
+        AC_DEFINE(HAVE_LONG_DOUBLE, 1, [Define if you have the long double type and it is bigger than a double])
+      fi
     fi
   fi
 fi
 AC_SUBST(HAVE_LONG_DOUBLE)
+AC_SUBST(HAVE_LONG_DOUBLE_VARIANT)
 
 AC_C_BIGENDIAN
 
Index: gcc-4_8-branch/libffi/fficonfig.h.in
===================================================================
--- gcc-4_8-branch.orig/libffi/fficonfig.h.in	2013-12-28 17:41:31.721625400 +0100
+++ gcc-4_8-branch/libffi/fficonfig.h.in	2013-12-28 17:50:47.135374937 +0100
@@ -73,6 +73,9 @@
 /* Define if you have the long double type and it is bigger than a double */
 #undef HAVE_LONG_DOUBLE
 
+/* Define if you support more than one size of the long double type */
+#undef HAVE_LONG_DOUBLE_VARIANT
+
 /* Define to 1 if you have the `memcpy' function. */
 #undef HAVE_MEMCPY
 
Index: gcc-4_8-branch/libffi/include/Makefile.in
===================================================================
--- gcc-4_8-branch.orig/libffi/include/Makefile.in	2013-12-28 17:41:31.723625410 +0100
+++ gcc-4_8-branch/libffi/include/Makefile.in	2013-12-28 17:50:47.138374952 +0100
@@ -113,6 +113,7 @@ FFI_EXEC_TRAMPOLINE_TABLE = @FFI_EXEC_TR
 FGREP = @FGREP@
 GREP = @GREP@
 HAVE_LONG_DOUBLE = @HAVE_LONG_DOUBLE@
+HAVE_LONG_DOUBLE_VARIANT = @HAVE_LONG_DOUBLE_VARIANT@
 INSTALL = @INSTALL@
 INSTALL_DATA = @INSTALL_DATA@
 INSTALL_PROGRAM = @INSTALL_PROGRAM@
Index: gcc-4_8-branch/libffi/include/ffi.h.in
===================================================================
--- gcc-4_8-branch.orig/libffi/include/ffi.h.in	2013-12-28 17:41:31.723625410 +0100
+++ gcc-4_8-branch/libffi/include/ffi.h.in	2013-12-28 17:50:47.141374967 +0100
@@ -207,6 +207,11 @@ typedef struct {
 #endif
 } ffi_cif;
 
+#if HAVE_LONG_DOUBLE_VARIANT
+/* Used to adjust size/alignment of ffi types.  */
+void ffi_prep_types (ffi_abi abi);
+#endif
+
 /* Used internally, but overridden by some architectures */
 ffi_status ffi_prep_cif_core(ffi_cif *cif,
 			     ffi_abi abi,
Index: gcc-4_8-branch/libffi/man/Makefile.in
===================================================================
--- gcc-4_8-branch.orig/libffi/man/Makefile.in	2013-12-28 17:41:31.724625415 +0100
+++ gcc-4_8-branch/libffi/man/Makefile.in	2013-12-28 17:50:47.144374982 +0100
@@ -111,6 +111,7 @@ FFI_EXEC_TRAMPOLINE_TABLE = @FFI_EXEC_TR
 FGREP = @FGREP@
 GREP = @GREP@
 HAVE_LONG_DOUBLE = @HAVE_LONG_DOUBLE@
+HAVE_LONG_DOUBLE_VARIANT = @HAVE_LONG_DOUBLE_VARIANT@
 INSTALL = @INSTALL@
 INSTALL_DATA = @INSTALL_DATA@
 INSTALL_PROGRAM = @INSTALL_PROGRAM@
Index: gcc-4_8-branch/libffi/src/powerpc/ffi_linux64.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc-4_8-branch/libffi/src/powerpc/ffi_linux64.c	2013-12-28 17:50:47.148375002 +0100
@@ -0,0 +1,942 @@
+/* -----------------------------------------------------------------------
+   ffi_linux64.c - Copyright (C) 2013 IBM
+                   Copyright (C) 2011 Anthony Green
+                   Copyright (C) 2011 Kyle Moffett
+                   Copyright (C) 2008 Red Hat, Inc
+                   Copyright (C) 2007, 2008 Free Software Foundation, Inc
+                   Copyright (c) 1998 Geoffrey Keating
+
+   PowerPC Foreign Function Interface
+
+   Permission is hereby granted, free of charge, to any person obtaining
+   a copy of this software and associated documentation files (the
+   ``Software''), to deal in the Software without restriction, including
+   without limitation the rights to use, copy, modify, merge, publish,
+   distribute, sublicense, and/or sell copies of the Software, and to
+   permit persons to whom the Software is furnished to do so, subject to
+   the following conditions:
+
+   The above copyright notice and this permission notice shall be included
+   in all copies or substantial portions of the Software.
+
+   THE SOFTWARE IS PROVIDED ``AS IS'', WITHOUT WARRANTY OF ANY KIND, EXPRESS
+   OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+   IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY CLAIM, DAMAGES OR
+   OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+   ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+   OTHER DEALINGS IN THE SOFTWARE.
+   ----------------------------------------------------------------------- */
+
+#include "ffi.h"
+
+#ifdef POWERPC64
+#include "ffi_common.h"
+#include "ffi_powerpc.h"
+
+
+/* About the LINUX64 ABI.  */
+enum {
+  NUM_GPR_ARG_REGISTERS64 = 8,
+  NUM_FPR_ARG_REGISTERS64 = 13
+};
+enum { ASM_NEEDS_REGISTERS64 = 4 };
+
+
+#if HAVE_LONG_DOUBLE_VARIANT && FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
+/* Adjust size of ffi_type_longdouble.  */
+void FFI_HIDDEN
+ffi_prep_types_linux64 (ffi_abi abi)
+{
+  if ((abi & (FFI_LINUX | FFI_LINUX_LONG_DOUBLE_128)) == FFI_LINUX)
+    {
+      ffi_type_longdouble.size = 8;
+      ffi_type_longdouble.alignment = 8;
+    }
+  else
+    {
+      ffi_type_longdouble.size = 16;
+      ffi_type_longdouble.alignment = 16;
+    }
+}
+#endif
+
+
+#if _CALL_ELF == 2
+static unsigned int
+discover_homogeneous_aggregate (const ffi_type *t, unsigned int *elnum)
+{
+  switch (t->type)
+    {
+    case FFI_TYPE_FLOAT:
+    case FFI_TYPE_DOUBLE:
+      *elnum = 1;
+      return (int) t->type;
+
+    case FFI_TYPE_STRUCT:;
+      {
+	unsigned int base_elt = 0, total_elnum = 0;
+	ffi_type **el = t->elements;
+	while (*el)
+	  {
+	    unsigned int el_elt, el_elnum = 0;
+	    el_elt = discover_homogeneous_aggregate (*el, &el_elnum);
+	    if (el_elt == 0
+		|| (base_elt && base_elt != el_elt))
+	      return 0;
+	    base_elt = el_elt;
+	    total_elnum += el_elnum;
+	    if (total_elnum > 8)
+	      return 0;
+	    el++;
+	  }
+	*elnum = total_elnum;
+	return base_elt;
+      }
+
+    default:
+      return 0;
+    }
+}
+#endif
+
+
+/* Perform machine dependent cif processing */
+static ffi_status
+ffi_prep_cif_linux64_core (ffi_cif *cif)
+{
+  ffi_type **ptr;
+  unsigned bytes;
+  unsigned i, fparg_count = 0, intarg_count = 0;
+  unsigned flags = cif->flags;
+#if _CALL_ELF == 2
+  unsigned int elt, elnum;
+#endif
+
+#if FFI_TYPE_LONGDOUBLE == FFI_TYPE_DOUBLE
+  /* If compiled without long double support.  */
+  if ((cif->abi & FFI_LINUX_LONG_DOUBLE_128) != 0)
+    return FFI_BAD_ABI;
+#endif
+
+  /* The machine-independent calculation of cif->bytes doesn't work
+     for us.  Redo the calculation.  */
+#if _CALL_ELF == 2
+  /* Space for backchain, CR, LR, TOC and the asm's temp regs.  */
+  bytes = (4 + ASM_NEEDS_REGISTERS64) * sizeof (long);
+
+  /* Space for the general registers.  */
+  bytes += NUM_GPR_ARG_REGISTERS64 * sizeof (long);
+#else
+  /* Space for backchain, CR, LR, cc/ld doubleword, TOC and the asm's temp
+     regs.  */
+  bytes = (6 + ASM_NEEDS_REGISTERS64) * sizeof (long);
+
+  /* Space for the mandatory parm save area and general registers.  */
+  bytes += 2 * NUM_GPR_ARG_REGISTERS64 * sizeof (long);
+#endif
+
+  /* Return value handling.  */
+  switch (cif->rtype->type)
+    {
+#if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
+    case FFI_TYPE_LONGDOUBLE:
+      if ((cif->abi & FFI_LINUX_LONG_DOUBLE_128) != 0)
+	flags |= FLAG_RETURNS_128BITS;
+      /* Fall through.  */
+#endif
+    case FFI_TYPE_DOUBLE:
+      flags |= FLAG_RETURNS_64BITS;
+      /* Fall through.  */
+    case FFI_TYPE_FLOAT:
+      flags |= FLAG_RETURNS_FP;
+      break;
+
+    case FFI_TYPE_UINT128:
+      flags |= FLAG_RETURNS_128BITS;
+      /* Fall through.  */
+    case FFI_TYPE_UINT64:
+    case FFI_TYPE_SINT64:
+      flags |= FLAG_RETURNS_64BITS;
+      break;
+
+    case FFI_TYPE_STRUCT:
+#if _CALL_ELF == 2
+      elt = discover_homogeneous_aggregate (cif->rtype, &elnum);
+      if (elt)
+	{
+	  if (elt == FFI_TYPE_DOUBLE)
+	    flags |= FLAG_RETURNS_64BITS;
+	  flags |= FLAG_RETURNS_FP | FLAG_RETURNS_SMST;
+	  break;
+	}
+      if (cif->rtype->size <= 16)
+	{
+	  flags |= FLAG_RETURNS_SMST;
+	  break;
+	}
+#endif
+      intarg_count++;
+      flags |= FLAG_RETVAL_REFERENCE;
+      /* Fall through.  */
+    case FFI_TYPE_VOID:
+      flags |= FLAG_RETURNS_NOTHING;
+      break;
+
+    default:
+      /* Returns 32-bit integer, or similar.  Nothing to do here.  */
+      break;
+    }
+
+  for (ptr = cif->arg_types, i = cif->nargs; i > 0; i--, ptr++)
+    {
+      unsigned int align;
+
+      switch ((*ptr)->type)
+	{
+#if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
+	case FFI_TYPE_LONGDOUBLE:
+	  if ((cif->abi & FFI_LINUX_LONG_DOUBLE_128) != 0)
+	    {
+	      fparg_count++;
+	      intarg_count++;
+	    }
+	  /* Fall through.  */
+#endif
+	case FFI_TYPE_DOUBLE:
+	case FFI_TYPE_FLOAT:
+	  fparg_count++;
+	  intarg_count++;
+	  if (fparg_count > NUM_FPR_ARG_REGISTERS64)
+	    flags |= FLAG_ARG_NEEDS_PSAVE;
+	  break;
+
+	case FFI_TYPE_STRUCT:
+	  if ((cif->abi & FFI_LINUX_STRUCT_ALIGN) != 0)
+	    {
+	      align = (*ptr)->alignment;
+	      if (align > 16)
+		align = 16;
+	      align = align / 8;
+	      if (align > 1)
+		intarg_count = ALIGN (intarg_count, align);
+	    }
+	  intarg_count += ((*ptr)->size + 7) / 8;
+#if _CALL_ELF == 2
+	  elt = discover_homogeneous_aggregate (*ptr, &elnum);
+	  if (elt)
+	    {
+	      fparg_count += elnum;
+	      if (fparg_count > NUM_FPR_ARG_REGISTERS64)
+		flags |= FLAG_ARG_NEEDS_PSAVE;
+	    }
+	  else
+#endif
+	    {
+	      if (intarg_count > NUM_GPR_ARG_REGISTERS64)
+		flags |= FLAG_ARG_NEEDS_PSAVE;
+	    }
+	  break;
+
+	case FFI_TYPE_POINTER:
+	case FFI_TYPE_UINT64:
+	case FFI_TYPE_SINT64:
+	case FFI_TYPE_INT:
+	case FFI_TYPE_UINT32:
+	case FFI_TYPE_SINT32:
+	case FFI_TYPE_UINT16:
+	case FFI_TYPE_SINT16:
+	case FFI_TYPE_UINT8:
+	case FFI_TYPE_SINT8:
+	  /* Everything else is passed as an 8-byte word in a GPR, either
+	     the object itself or a pointer to it.  */
+	  intarg_count++;
+	  if (intarg_count > NUM_GPR_ARG_REGISTERS64)
+	    flags |= FLAG_ARG_NEEDS_PSAVE;
+	  break;
+	default:
+	  FFI_ASSERT (0);
+	}
+    }
+
+  if (fparg_count != 0)
+    flags |= FLAG_FP_ARGUMENTS;
+  if (intarg_count > 4)
+    flags |= FLAG_4_GPR_ARGUMENTS;
+
+  /* Space for the FPR registers, if needed.  */
+  if (fparg_count != 0)
+    bytes += NUM_FPR_ARG_REGISTERS64 * sizeof (double);
+
+  /* Stack space.  */
+#if _CALL_ELF == 2
+  if ((flags & FLAG_ARG_NEEDS_PSAVE) != 0)
+    bytes += intarg_count * sizeof (long);
+#else
+  if (intarg_count > NUM_GPR_ARG_REGISTERS64)
+    bytes += (intarg_count - NUM_GPR_ARG_REGISTERS64) * sizeof (long);
+#endif
+
+  /* The stack space allocated needs to be a multiple of 16 bytes.  */
+  bytes = (bytes + 15) & ~0xF;
+
+  cif->flags = flags;
+  cif->bytes = bytes;
+
+  return FFI_OK;
+}
+
+ffi_status FFI_HIDDEN
+ffi_prep_cif_linux64 (ffi_cif *cif)
+{
+  if ((cif->abi & FFI_LINUX) != 0)
+    cif->nfixedargs = cif->nargs;
+#if _CALL_ELF != 2
+  else if (cif->abi == FFI_COMPAT_LINUX64)
+    {
+      /* This call is from old code.  Don't touch cif->nfixedargs
+	 since old code will be using a smaller cif.  */
+      cif->flags |= FLAG_COMPAT;
+      /* Translate to new abi value.  */
+      cif->abi = FFI_LINUX | FFI_LINUX_LONG_DOUBLE_128;
+    }
+#endif
+  else
+    return FFI_BAD_ABI;
+  return ffi_prep_cif_linux64_core (cif);
+}
+
+ffi_status FFI_HIDDEN
+ffi_prep_cif_linux64_var (ffi_cif *cif,
+			  unsigned int nfixedargs,
+			  unsigned int ntotalargs MAYBE_UNUSED)
+{
+  if ((cif->abi & FFI_LINUX) != 0)
+    cif->nfixedargs = nfixedargs;
+#if _CALL_ELF != 2
+  else if (cif->abi == FFI_COMPAT_LINUX64)
+    {
+      /* This call is from old code.  Don't touch cif->nfixedargs
+	 since old code will be using a smaller cif.  */
+      cif->flags |= FLAG_COMPAT;
+      /* Translate to new abi value.  */
+      cif->abi = FFI_LINUX | FFI_LINUX_LONG_DOUBLE_128;
+    }
+#endif
+  else
+    return FFI_BAD_ABI;
+#if _CALL_ELF == 2
+  cif->flags |= FLAG_ARG_NEEDS_PSAVE;
+#endif
+  return ffi_prep_cif_linux64_core (cif);
+}
+
+
+/* ffi_prep_args64 is called by the assembly routine once stack space
+   has been allocated for the function's arguments.
+
+   The stack layout we want looks like this:
+
+   |   Ret addr from ffi_call_LINUX64	8bytes	|	higher addresses
+   |--------------------------------------------|
+   |   CR save area			8bytes	|
+   |--------------------------------------------|
+   |   Previous backchain pointer	8	|	stack pointer here
+   |--------------------------------------------|<+ <<<	on entry to
+   |   Saved r28-r31			4*8	| |	ffi_call_LINUX64
+   |--------------------------------------------| |
+   |   GPR registers r3-r10		8*8	| |
+   |--------------------------------------------| |
+   |   FPR registers f1-f13 (optional)	13*8	| |
+   |--------------------------------------------| |
+   |   Parameter save area		        | |
+   |--------------------------------------------| |
+   |   TOC save area			8	| |
+   |--------------------------------------------| |	stack	|
+   |   Linker doubleword		8	| |	grows	|
+   |--------------------------------------------| |	down	V
+   |   Compiler doubleword		8	| |
+   |--------------------------------------------| |	lower addresses
+   |   Space for callee's LR		8	| |
+   |--------------------------------------------| |
+   |   CR save area			8	| |
+   |--------------------------------------------| |	stack pointer here
+   |   Current backchain pointer	8	|-/	during
+   |--------------------------------------------|   <<<	ffi_call_LINUX64
+
+*/
+
+void FFI_HIDDEN
+ffi_prep_args64 (extended_cif *ecif, unsigned long *const stack)
+{
+  const unsigned long bytes = ecif->cif->bytes;
+  const unsigned long flags = ecif->cif->flags;
+
+  typedef union
+  {
+    char *c;
+    unsigned long *ul;
+    float *f;
+    double *d;
+    size_t p;
+  } valp;
+
+  /* 'stacktop' points at the previous backchain pointer.  */
+  valp stacktop;
+
+  /* 'next_arg' points at the space for gpr3, and grows upwards as
+     we use GPR registers, then continues at rest.  */
+  valp gpr_base;
+  valp gpr_end;
+  valp rest;
+  valp next_arg;
+
+  /* 'fpr_base' points at the space for fpr1, and grows upwards as
+     we use FPR registers.  */
+  valp fpr_base;
+  unsigned int fparg_count;
+
+  unsigned int i, words, nargs, nfixedargs;
+  ffi_type **ptr;
+  double double_tmp;
+  union
+  {
+    void **v;
+    char **c;
+    signed char **sc;
+    unsigned char **uc;
+    signed short **ss;
+    unsigned short **us;
+    signed int **si;
+    unsigned int **ui;
+    unsigned long **ul;
+    float **f;
+    double **d;
+  } p_argv;
+  unsigned long gprvalue;
+  unsigned long align;
+
+  stacktop.c = (char *) stack + bytes;
+  gpr_base.ul = stacktop.ul - ASM_NEEDS_REGISTERS64 - NUM_GPR_ARG_REGISTERS64;
+  gpr_end.ul = gpr_base.ul + NUM_GPR_ARG_REGISTERS64;
+#if _CALL_ELF == 2
+  rest.ul = stack + 4 + NUM_GPR_ARG_REGISTERS64;
+#else
+  rest.ul = stack + 6 + NUM_GPR_ARG_REGISTERS64;
+#endif
+  fpr_base.d = gpr_base.d - NUM_FPR_ARG_REGISTERS64;
+  fparg_count = 0;
+  next_arg.ul = gpr_base.ul;
+
+  /* Check that everything starts aligned properly.  */
+  FFI_ASSERT (((unsigned long) (char *) stack & 0xF) == 0);
+  FFI_ASSERT (((unsigned long) stacktop.c & 0xF) == 0);
+  FFI_ASSERT ((bytes & 0xF) == 0);
+
+  /* Deal with return values that are actually pass-by-reference.  */
+  if (flags & FLAG_RETVAL_REFERENCE)
+    *next_arg.ul++ = (unsigned long) (char *) ecif->rvalue;
+
+  /* Now for the arguments.  */
+  p_argv.v = ecif->avalue;
+  nargs = ecif->cif->nargs;
+#if _CALL_ELF != 2
+  nfixedargs = (unsigned) -1;
+  if ((flags & FLAG_COMPAT) == 0)
+#endif
+    nfixedargs = ecif->cif->nfixedargs;
+  for (ptr = ecif->cif->arg_types, i = 0;
+       i < nargs;
+       i++, ptr++, p_argv.v++)
+    {
+#if _CALL_ELF == 2
+      unsigned int elt, elnum;
+#endif
+
+      switch ((*ptr)->type)
+	{
+#if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
+	case FFI_TYPE_LONGDOUBLE:
+	  if ((ecif->cif->abi & FFI_LINUX_LONG_DOUBLE_128) != 0)
+	    {
+	      double_tmp = (*p_argv.d)[0];
+	      if (fparg_count < NUM_FPR_ARG_REGISTERS64 && i < nfixedargs)
+		{
+		  *fpr_base.d++ = double_tmp;
+# if _CALL_ELF != 2
+		  if ((flags & FLAG_COMPAT) != 0)
+		    *next_arg.d = double_tmp;
+# endif
+		}
+	      else
+		*next_arg.d = double_tmp;
+	      if (++next_arg.ul == gpr_end.ul)
+		next_arg.ul = rest.ul;
+	      fparg_count++;
+	      double_tmp = (*p_argv.d)[1];
+	      if (fparg_count < NUM_FPR_ARG_REGISTERS64 && i < nfixedargs)
+		{
+		  *fpr_base.d++ = double_tmp;
+# if _CALL_ELF != 2
+		  if ((flags & FLAG_COMPAT) != 0)
+		    *next_arg.d = double_tmp;
+# endif
+		}
+	      else
+		*next_arg.d = double_tmp;
+	      if (++next_arg.ul == gpr_end.ul)
+		next_arg.ul = rest.ul;
+	      fparg_count++;
+	      FFI_ASSERT (__LDBL_MANT_DIG__ == 106);
+	      FFI_ASSERT (flags & FLAG_FP_ARGUMENTS);
+	      break;
+	    }
+	  /* Fall through.  */
+#endif
+	case FFI_TYPE_DOUBLE:
+	  double_tmp = **p_argv.d;
+	  if (fparg_count < NUM_FPR_ARG_REGISTERS64 && i < nfixedargs)
+	    {
+	      *fpr_base.d++ = double_tmp;
+#if _CALL_ELF != 2
+	      if ((flags & FLAG_COMPAT) != 0)
+		*next_arg.d = double_tmp;
+#endif
+	    }
+	  else
+	    *next_arg.d = double_tmp;
+	  if (++next_arg.ul == gpr_end.ul)
+	    next_arg.ul = rest.ul;
+	  fparg_count++;
+	  FFI_ASSERT (flags & FLAG_FP_ARGUMENTS);
+	  break;
+
+	case FFI_TYPE_FLOAT:
+	  double_tmp = **p_argv.f;
+	  if (fparg_count < NUM_FPR_ARG_REGISTERS64 && i < nfixedargs)
+	    {
+	      *fpr_base.d++ = double_tmp;
+#if _CALL_ELF != 2
+	      if ((flags & FLAG_COMPAT) != 0)
+		*next_arg.f = (float) double_tmp;
+#endif
+	    }
+	  else
+	    *next_arg.f = (float) double_tmp;
+	  if (++next_arg.ul == gpr_end.ul)
+	    next_arg.ul = rest.ul;
+	  fparg_count++;
+	  FFI_ASSERT (flags & FLAG_FP_ARGUMENTS);
+	  break;
+
+	case FFI_TYPE_STRUCT:
+	  if ((ecif->cif->abi & FFI_LINUX_STRUCT_ALIGN) != 0)
+	    {
+	      align = (*ptr)->alignment;
+	      if (align > 16)
+		align = 16;
+	      if (align > 1)
+		next_arg.p = ALIGN (next_arg.p, align);
+	    }
+#if _CALL_ELF == 2
+	  elt = discover_homogeneous_aggregate (*ptr, &elnum);
+	  if (elt)
+	    {
+	      union {
+		void *v;
+		float *f;
+		double *d;
+	      } arg;
+
+	      arg.v = *p_argv.v;
+	      if (elt == FFI_TYPE_FLOAT)
+		{
+		  do
+		    {
+		      double_tmp = *arg.f++;
+		      if (fparg_count < NUM_FPR_ARG_REGISTERS64
+			  && i < nfixedargs)
+			*fpr_base.d++ = double_tmp;
+		      else
+			*next_arg.f = (float) double_tmp;
+		      if (++next_arg.f == gpr_end.f)
+			next_arg.f = rest.f;
+		      fparg_count++;
+		    }
+		  while (--elnum != 0);
+		  if ((next_arg.p & 7) != 0)
+		    {
+		      if (++next_arg.f == gpr_end.f)
+			next_arg.f = rest.f;
+		    }
+		}
+	      else
+		do
+		  {
+		    double_tmp = *arg.d++;
+		    if (fparg_count < NUM_FPR_ARG_REGISTERS64 && i < nfixedargs)
+		      *fpr_base.d++ = double_tmp;
+		    else
+		      *next_arg.d = double_tmp;
+		    if (++next_arg.d == gpr_end.d)
+		      next_arg.d = rest.d;
+		    fparg_count++;
+		  }
+		while (--elnum != 0);
+	    }
+	  else
+#endif
+	    {
+	      words = ((*ptr)->size + 7) / 8;
+	      if (next_arg.ul >= gpr_base.ul && next_arg.ul + words > gpr_end.ul)
+		{
+		  size_t first = gpr_end.c - next_arg.c;
+		  memcpy (next_arg.c, *p_argv.c, first);
+		  memcpy (rest.c, *p_argv.c + first, (*ptr)->size - first);
+		  next_arg.c = rest.c + words * 8 - first;
+		}
+	      else
+		{
+		  char *where = next_arg.c;
+
+#ifndef __LITTLE_ENDIAN__
+		  /* Structures with size less than eight bytes are passed
+		     left-padded.  */
+		  if ((*ptr)->size < 8)
+		    where += 8 - (*ptr)->size;
+#endif
+		  memcpy (where, *p_argv.c, (*ptr)->size);
+		  next_arg.ul += words;
+		  if (next_arg.ul == gpr_end.ul)
+		    next_arg.ul = rest.ul;
+		}
+	    }
+	  break;
+
+	case FFI_TYPE_UINT8:
+	  gprvalue = **p_argv.uc;
+	  goto putgpr;
+	case FFI_TYPE_SINT8:
+	  gprvalue = **p_argv.sc;
+	  goto putgpr;
+	case FFI_TYPE_UINT16:
+	  gprvalue = **p_argv.us;
+	  goto putgpr;
+	case FFI_TYPE_SINT16:
+	  gprvalue = **p_argv.ss;
+	  goto putgpr;
+	case FFI_TYPE_UINT32:
+	  gprvalue = **p_argv.ui;
+	  goto putgpr;
+	case FFI_TYPE_INT:
+	case FFI_TYPE_SINT32:
+	  gprvalue = **p_argv.si;
+	  goto putgpr;
+
+	case FFI_TYPE_UINT64:
+	case FFI_TYPE_SINT64:
+	case FFI_TYPE_POINTER:
+	  gprvalue = **p_argv.ul;
+	putgpr:
+	  *next_arg.ul++ = gprvalue;
+	  if (next_arg.ul == gpr_end.ul)
+	    next_arg.ul = rest.ul;
+	  break;
+	}
+    }
+
+  FFI_ASSERT (flags & FLAG_4_GPR_ARGUMENTS
+	      || (next_arg.ul >= gpr_base.ul
+		  && next_arg.ul <= gpr_base.ul + 4));
+}
+
+
+#if _CALL_ELF == 2
+#define MIN_CACHE_LINE_SIZE 8
+
+static void
+flush_icache (char *wraddr, char *xaddr, int size)
+{
+  int i;
+  for (i = 0; i < size; i += MIN_CACHE_LINE_SIZE)
+    __asm__ volatile ("icbi 0,%0;" "dcbf 0,%1;"
+		      : : "r" (xaddr + i), "r" (wraddr + i) : "memory");
+  __asm__ volatile ("icbi 0,%0;" "dcbf 0,%1;" "sync;" "isync;"
+		    : : "r"(xaddr + size - 1), "r"(wraddr + size - 1)
+		    : "memory");
+}
+#endif
+
+ffi_status
+ffi_prep_closure_loc_linux64 (ffi_closure *closure,
+			      ffi_cif *cif,
+			      void (*fun) (ffi_cif *, void *, void **, void *),
+			      void *user_data,
+			      void *codeloc)
+{
+#if _CALL_ELF == 2
+  unsigned int *tramp = (unsigned int *) &closure->tramp[0];
+
+  if (cif->abi < FFI_LINUX || cif->abi >= FFI_LAST_ABI)
+    return FFI_BAD_ABI;
+
+  tramp[0] = 0xe96c0018;	/* 0:	ld	11,2f-0b(12)	*/
+  tramp[1] = 0xe98c0010;	/*	ld	12,1f-0b(12)	*/
+  tramp[2] = 0x7d8903a6;	/*	mtctr	12		*/
+  tramp[3] = 0x4e800420;	/*	bctr			*/
+				/* 1:	.quad	function_addr	*/
+				/* 2:	.quad	context		*/
+  *(void **) &tramp[4] = (void *) ffi_closure_LINUX64;
+  *(void **) &tramp[6] = codeloc;
+  flush_icache ((char *)tramp, (char *)codeloc, FFI_TRAMPOLINE_SIZE);
+#else
+  void **tramp = (void **) &closure->tramp[0];
+
+  if (cif->abi < FFI_LINUX || cif->abi >= FFI_LAST_ABI)
+    return FFI_BAD_ABI;
+
+  /* Copy function address and TOC from ffi_closure_LINUX64.  */
+  memcpy (tramp, (char *) ffi_closure_LINUX64, 16);
+  tramp[2] = codeloc;
+#endif
+
+  closure->cif = cif;
+  closure->fun = fun;
+  closure->user_data = user_data;
+
+  return FFI_OK;
+}
+
+
+int FFI_HIDDEN
+ffi_closure_helper_LINUX64 (ffi_closure *closure, void *rvalue,
+			    unsigned long *pst, ffi_dblfl *pfr)
+{
+  /* rvalue is the pointer to space for return value in closure assembly.  */
+  /* pst is the pointer to the parameter save area
+     (r3-r10 are stored into its first 8 slots by ffi_closure_LINUX64).  */
+  /* pfr is the pointer to where f1-f13 are stored in ffi_closure_LINUX64.  */
+
+  void **avalue;
+  ffi_type **arg_types;
+  unsigned long i, avn, nfixedargs;
+  ffi_cif *cif;
+  ffi_dblfl *end_pfr = pfr + NUM_FPR_ARG_REGISTERS64;
+  unsigned long align;
+
+  cif = closure->cif;
+  avalue = alloca (cif->nargs * sizeof (void *));
+
+  /* Copy the caller's structure return value address so that the
+     closure returns the data directly to the caller.  */
+  if (cif->rtype->type == FFI_TYPE_STRUCT
+      && (cif->flags & FLAG_RETURNS_SMST) == 0)
+    {
+      rvalue = (void *) *pst;
+      pst++;
+    }
+
+  i = 0;
+  avn = cif->nargs;
+#if _CALL_ELF != 2
+  nfixedargs = (unsigned) -1;
+  if ((cif->flags & FLAG_COMPAT) == 0)
+#endif
+    nfixedargs = cif->nfixedargs;
+  arg_types = cif->arg_types;
+
+  /* Grab the addresses of the arguments from the stack frame.  */
+  while (i < avn)
+    {
+      unsigned int elt, elnum;
+
+      switch (arg_types[i]->type)
+	{
+	case FFI_TYPE_SINT8:
+	case FFI_TYPE_UINT8:
+#ifndef __LITTLE_ENDIAN__
+	  avalue[i] = (char *) pst + 7;
+	  pst++;
+	  break;
+#endif
+
+	case FFI_TYPE_SINT16:
+	case FFI_TYPE_UINT16:
+#ifndef __LITTLE_ENDIAN__
+	  avalue[i] = (char *) pst + 6;
+	  pst++;
+	  break;
+#endif
+
+	case FFI_TYPE_SINT32:
+	case FFI_TYPE_UINT32:
+#ifndef __LITTLE_ENDIAN__
+	  avalue[i] = (char *) pst + 4;
+	  pst++;
+	  break;
+#endif
+
+	case FFI_TYPE_SINT64:
+	case FFI_TYPE_UINT64:
+	case FFI_TYPE_POINTER:
+	  avalue[i] = pst;
+	  pst++;
+	  break;
+
+	case FFI_TYPE_STRUCT:
+	  if ((cif->abi & FFI_LINUX_STRUCT_ALIGN) != 0)
+	    {
+	      align = arg_types[i]->alignment;
+	      if (align > 16)
+		align = 16;
+	      if (align > 1)
+		pst = (unsigned long *) ALIGN ((size_t) pst, align);
+	    }
+	  elt = 0;
+#if _CALL_ELF == 2
+	  elt = discover_homogeneous_aggregate (arg_types[i], &elnum);
+#endif
+	  if (elt)
+	    {
+	      union {
+		void *v;
+		unsigned long *ul;
+		float *f;
+		double *d;
+		size_t p;
+	      } to, from;
+
+	      /* Repackage the aggregate from its parts.  The
+		 aggregate size is not greater than the space taken by
+		 the registers so store back to the register/parameter
+		 save arrays.  */
+	      if (pfr + elnum <= end_pfr)
+		to.v = pfr;
+	      else
+		to.v = pst;
+
+	      avalue[i] = to.v;
+	      from.ul = pst;
+	      if (elt == FFI_TYPE_FLOAT)
+		{
+		  do
+		    {
+		      if (pfr < end_pfr && i < nfixedargs)
+			{
+			  *to.f = (float) pfr->d;
+			  pfr++;
+			}
+		      else
+			*to.f = *from.f;
+		      to.f++;
+		      from.f++;
+		    }
+		  while (--elnum != 0);
+		}
+	      else
+		{
+		  do
+		    {
+		      if (pfr < end_pfr && i < nfixedargs)
+			{
+			  *to.d = pfr->d;
+			  pfr++;
+			}
+		      else
+			*to.d = *from.d;
+		      to.d++;
+		      from.d++;
+		    }
+		  while (--elnum != 0);
+		}
+	    }
+	  else
+	    {
+#ifndef __LITTLE_ENDIAN__
+	      /* Structures with size less than eight bytes are passed
+		 left-padded.  */
+	      if (arg_types[i]->size < 8)
+		avalue[i] = (char *) pst + 8 - arg_types[i]->size;
+	      else
+#endif
+		avalue[i] = pst;
+	    }
+	  pst += (arg_types[i]->size + 7) / 8;
+	  break;
+
+#if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
+	case FFI_TYPE_LONGDOUBLE:
+	  if ((cif->abi & FFI_LINUX_LONG_DOUBLE_128) != 0)
+	    {
+	      if (pfr + 1 < end_pfr && i + 1 < nfixedargs)
+		{
+		  avalue[i] = pfr;
+		  pfr += 2;
+		}
+	      else
+		{
+		  if (pfr < end_pfr && i < nfixedargs)
+		    {
+		      /* Passed partly in f13 and partly on the stack.
+			 Move it all to the stack.  */
+		      *pst = *(unsigned long *) pfr;
+		      pfr++;
+		    }
+		  avalue[i] = pst;
+		}
+	      pst += 2;
+	      break;
+	    }
+	  /* Fall through.  */
+#endif
+	case FFI_TYPE_DOUBLE:
+	  /* On the outgoing stack all values are aligned to 8 bytes.  */
+	  /* There are 13 64-bit floating-point argument registers.  */
+
+	  if (pfr < end_pfr && i < nfixedargs)
+	    {
+	      avalue[i] = pfr;
+	      pfr++;
+	    }
+	  else
+	    avalue[i] = pst;
+	  pst++;
+	  break;
+
+	case FFI_TYPE_FLOAT:
+	  if (pfr < end_pfr && i < nfixedargs)
+	    {
+	      /* Float values are stored as doubles in the
+		 ffi_closure_LINUX64 code.  Fix them here.  */
+	      pfr->f = (float) pfr->d;
+	      avalue[i] = pfr;
+	      pfr++;
+	    }
+	  else
+	    avalue[i] = pst;
+	  pst++;
+	  break;
+
+	default:
+	  FFI_ASSERT (0);
+	}
+
+      i++;
+    }
+
+
+  (closure->fun) (cif, rvalue, avalue, closure->user_data);
+
+  /* Tell ffi_closure_LINUX64 how to perform return type promotions.  */
+  if ((cif->flags & FLAG_RETURNS_SMST) != 0)
+    {
+      if ((cif->flags & FLAG_RETURNS_FP) == 0)
+	return FFI_V2_TYPE_SMALL_STRUCT + cif->rtype->size - 1;
+      else if ((cif->flags & FLAG_RETURNS_64BITS) != 0)
+	return FFI_V2_TYPE_DOUBLE_HOMOG;
+      else
+	return FFI_V2_TYPE_FLOAT_HOMOG;
+    }
+  return cif->rtype->type;
+}
+#endif
Index: gcc-4_8-branch/libffi/src/powerpc/ffi_powerpc.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc-4_8-branch/libffi/src/powerpc/ffi_powerpc.h	2013-12-28 17:50:47.153375027 +0100
@@ -0,0 +1,77 @@
+/* -----------------------------------------------------------------------
+   ffi_powerpc.h - Copyright (C) 2013 IBM
+                   Copyright (C) 2011 Anthony Green
+                   Copyright (C) 2011 Kyle Moffett
+                   Copyright (C) 2008 Red Hat, Inc
+                   Copyright (C) 2007, 2008 Free Software Foundation, Inc
+                   Copyright (c) 1998 Geoffrey Keating
+
+   PowerPC Foreign Function Interface
+
+   Permission is hereby granted, free of charge, to any person obtaining
+   a copy of this software and associated documentation files (the
+   ``Software''), to deal in the Software without restriction, including
+   without limitation the rights to use, copy, modify, merge, publish,
+   distribute, sublicense, and/or sell copies of the Software, and to
+   permit persons to whom the Software is furnished to do so, subject to
+   the following conditions:
+
+   The above copyright notice and this permission notice shall be included
+   in all copies or substantial portions of the Software.
+
+   THE SOFTWARE IS PROVIDED ``AS IS'', WITHOUT WARRANTY OF ANY KIND, EXPRESS
+   OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+   IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY CLAIM, DAMAGES OR
+   OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+   ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+   OTHER DEALINGS IN THE SOFTWARE.
+   ----------------------------------------------------------------------- */
+
+enum {
+  /* The assembly depends on these exact flags.  */
+  /* These go in cr7 */
+  FLAG_RETURNS_SMST	= 1 << (31-31), /* Used for FFI_SYSV small structs.  */
+  FLAG_RETURNS_NOTHING  = 1 << (31-30),
+  FLAG_RETURNS_FP       = 1 << (31-29),
+  FLAG_RETURNS_64BITS   = 1 << (31-28),
+
+  /* This goes in cr6 */
+  FLAG_RETURNS_128BITS  = 1 << (31-27),
+
+  FLAG_COMPAT		= 1 << (31- 8), /* Not used by assembly */
+
+  /* These go in cr1 */
+  FLAG_ARG_NEEDS_COPY   = 1 << (31- 7), /* Used by sysv code */
+  FLAG_ARG_NEEDS_PSAVE  = FLAG_ARG_NEEDS_COPY, /* Used by linux64 code */
+  FLAG_FP_ARGUMENTS     = 1 << (31- 6), /* cr1.eq; specified by ABI */
+  FLAG_4_GPR_ARGUMENTS  = 1 << (31- 5),
+  FLAG_RETVAL_REFERENCE = 1 << (31- 4)
+};
+
+typedef union
+{
+  float f;
+  double d;
+} ffi_dblfl;
+
+void FFI_HIDDEN ffi_closure_SYSV (void);
+void FFI_HIDDEN ffi_call_SYSV(extended_cif *, unsigned, unsigned, unsigned *,
+			      void (*)(void));
+
+void FFI_HIDDEN ffi_prep_types_sysv (ffi_abi);
+ffi_status FFI_HIDDEN ffi_prep_cif_sysv (ffi_cif *);
+int FFI_HIDDEN ffi_closure_helper_SYSV (ffi_closure *, void *, unsigned long *,
+					ffi_dblfl *, unsigned long *);
+
+void FFI_HIDDEN ffi_call_LINUX64(extended_cif *, unsigned long, unsigned long,
+				 unsigned long *, void (*)(void));
+void FFI_HIDDEN ffi_closure_LINUX64 (void);
+
+void FFI_HIDDEN ffi_prep_types_linux64 (ffi_abi);
+ffi_status FFI_HIDDEN ffi_prep_cif_linux64 (ffi_cif *);
+ffi_status FFI_HIDDEN ffi_prep_cif_linux64_var (ffi_cif *, unsigned int,
+						unsigned int);
+void FFI_HIDDEN ffi_prep_args64 (extended_cif *, unsigned long *const);
+int FFI_HIDDEN ffi_closure_helper_LINUX64 (ffi_closure *, void *,
+					   unsigned long *, ffi_dblfl *);
Index: gcc-4_8-branch/libffi/src/powerpc/ffi_sysv.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc-4_8-branch/libffi/src/powerpc/ffi_sysv.c	2013-12-28 17:50:47.158375052 +0100
@@ -0,0 +1,931 @@
+/* -----------------------------------------------------------------------
+   ffi_sysv.c - Copyright (C) 2013 IBM
+                Copyright (C) 2011 Anthony Green
+                Copyright (C) 2011 Kyle Moffett
+                Copyright (C) 2008 Red Hat, Inc
+                Copyright (C) 2007, 2008 Free Software Foundation, Inc
+                Copyright (c) 1998 Geoffrey Keating
+
+   PowerPC Foreign Function Interface
+
+   Permission is hereby granted, free of charge, to any person obtaining
+   a copy of this software and associated documentation files (the
+   ``Software''), to deal in the Software without restriction, including
+   without limitation the rights to use, copy, modify, merge, publish,
+   distribute, sublicense, and/or sell copies of the Software, and to
+   permit persons to whom the Software is furnished to do so, subject to
+   the following conditions:
+
+   The above copyright notice and this permission notice shall be included
+   in all copies or substantial portions of the Software.
+
+   THE SOFTWARE IS PROVIDED ``AS IS'', WITHOUT WARRANTY OF ANY KIND, EXPRESS
+   OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+   IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY CLAIM, DAMAGES OR
+   OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+   ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+   OTHER DEALINGS IN THE SOFTWARE.
+   ----------------------------------------------------------------------- */
+
+#include "ffi.h"
+
+#ifndef POWERPC64
+#include "ffi_common.h"
+#include "ffi_powerpc.h"
+
+
+/* About the SYSV ABI.  */
+#define ASM_NEEDS_REGISTERS 4
+#define NUM_GPR_ARG_REGISTERS 8
+#define NUM_FPR_ARG_REGISTERS 8
+
+
+#if HAVE_LONG_DOUBLE_VARIANT && FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
+/* Adjust size of ffi_type_longdouble.  */
+void FFI_HIDDEN
+ffi_prep_types_sysv (ffi_abi abi)
+{
+  if ((abi & (FFI_SYSV | FFI_SYSV_LONG_DOUBLE_128)) == FFI_SYSV)
+    {
+      ffi_type_longdouble.size = 8;
+      ffi_type_longdouble.alignment = 8;
+    }
+  else
+    {
+      ffi_type_longdouble.size = 16;
+      ffi_type_longdouble.alignment = 16;
+    }
+}
+#endif
+
+/* Transform long double, double and float to other types as per abi.  */
+static int
+translate_float (int abi, int type)
+{
+#if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
+  if (type == FFI_TYPE_LONGDOUBLE
+      && (abi & FFI_SYSV_LONG_DOUBLE_128) == 0)
+    type = FFI_TYPE_DOUBLE;
+#endif
+  if ((abi & FFI_SYSV_SOFT_FLOAT) != 0)
+    {
+      if (type == FFI_TYPE_FLOAT)
+	type = FFI_TYPE_UINT32;
+      else if (type == FFI_TYPE_DOUBLE)
+	type = FFI_TYPE_UINT64;
+#if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
+      else if (type == FFI_TYPE_LONGDOUBLE)
+	type = FFI_TYPE_UINT128;
+    }
+  else if ((abi & FFI_SYSV_IBM_LONG_DOUBLE) == 0)
+    {
+      if (type == FFI_TYPE_LONGDOUBLE)
+	type = FFI_TYPE_STRUCT;
+#endif
+    }
+  return type;
+}
+
+/* Perform machine-dependent cif processing.  */
+static ffi_status
+ffi_prep_cif_sysv_core (ffi_cif *cif)
+{
+  ffi_type **ptr;
+  unsigned bytes;
+  unsigned i, fparg_count = 0, intarg_count = 0;
+  unsigned flags = cif->flags;
+  unsigned struct_copy_size = 0;
+  unsigned type = cif->rtype->type;
+  unsigned size = cif->rtype->size;
+
+  /* The machine-independent calculation of cif->bytes doesn't work
+     for us.  Redo the calculation.  */
+
+  /* Space for the frame pointer, callee's LR, and the asm's temp regs.  */
+  bytes = (2 + ASM_NEEDS_REGISTERS) * sizeof (int);
+
+  /* Space for the GPR registers.  */
+  bytes += NUM_GPR_ARG_REGISTERS * sizeof (int);
+
+  /* Return value handling.  The rules for SYSV are as follows:
+     - 32-bit (or less) integer values are returned in gpr3;
+     - Structures of size <= 4 bytes are also returned in gpr3;
+     - 64-bit integer values and structures between 5 and 8 bytes are returned
+     in gpr3 and gpr4;
+     - Larger structures are allocated space and a pointer is passed as
+     the first argument;
+     - Single/double FP values are returned in fpr1;
+     - long doubles (if not equivalent to double) are returned in
+     fpr1,fpr2 for Linux and as for large structs for SysV.  */
+
+  type = translate_float (cif->abi, type);
+
+  switch (type)
+    {
+#if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
+    case FFI_TYPE_LONGDOUBLE:
+      flags |= FLAG_RETURNS_128BITS;
+      /* Fall through.  */
+#endif
+    case FFI_TYPE_DOUBLE:
+      flags |= FLAG_RETURNS_64BITS;
+      /* Fall through.  */
+    case FFI_TYPE_FLOAT:
+      flags |= FLAG_RETURNS_FP;
+#ifdef __NO_FPRS__
+      return FFI_BAD_ABI;
+#endif
+      break;
+
+    case FFI_TYPE_UINT128:
+      flags |= FLAG_RETURNS_128BITS;
+      /* Fall through.  */
+    case FFI_TYPE_UINT64:
+    case FFI_TYPE_SINT64:
+      flags |= FLAG_RETURNS_64BITS;
+      break;
+
+    case FFI_TYPE_STRUCT:
+      /* The final SYSV ABI says that structures of 8 bytes or fewer
+	 are returned in r3/r4.  A draft ABI used by Linux instead
+	 returns them in memory.  */
+      if ((cif->abi & FFI_SYSV_STRUCT_RET) != 0 && size <= 8)
+	{
+	  flags |= FLAG_RETURNS_SMST;
+	  break;
+	}
+      intarg_count++;
+      flags |= FLAG_RETVAL_REFERENCE;
+      /* Fall through.  */
+    case FFI_TYPE_VOID:
+      flags |= FLAG_RETURNS_NOTHING;
+      break;
+
+    default:
+      /* Returns 32-bit integer, or similar.  Nothing to do here.  */
+      break;
+    }
+
+  /* The first NUM_GPR_ARG_REGISTERS words of integer arguments, and the
+     first NUM_FPR_ARG_REGISTERS fp arguments, go in registers; the rest
+     goes on the stack.  Structures and long doubles (if not equivalent
+     to double) are passed as a pointer to a copy of the structure.
+     Stuff on the stack needs to keep proper alignment.  */
+  for (ptr = cif->arg_types, i = cif->nargs; i > 0; i--, ptr++)
+    {
+      unsigned short typenum = (*ptr)->type;
+
+      typenum = translate_float (cif->abi, typenum);
+
+      switch (typenum)
+	{
+#if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
+	case FFI_TYPE_LONGDOUBLE:
+	  fparg_count++;
+	  /* Fall through.  */
+#endif
+	case FFI_TYPE_DOUBLE:
+	  fparg_count++;
+	  /* If this FP arg is going on the stack, it must be
+	     8-byte-aligned.  */
+	  if (fparg_count > NUM_FPR_ARG_REGISTERS
+	      && intarg_count >= NUM_GPR_ARG_REGISTERS
+	      && intarg_count % 2 != 0)
+	    intarg_count++;
+#ifdef __NO_FPRS__
+	  return FFI_BAD_ABI;
+#endif
+	  break;
+
+	case FFI_TYPE_FLOAT:
+	  fparg_count++;
+#ifdef __NO_FPRS__
+	  return FFI_BAD_ABI;
+#endif
+	  break;
+
+	case FFI_TYPE_UINT128:
+	  /* A long double in FFI_LINUX_SOFT_FLOAT can use only a set
+	     of four consecutive gprs. If we do not have enough, we
+	     have to adjust the intarg_count value.  */
+	  if (intarg_count >= NUM_GPR_ARG_REGISTERS - 3
+	      && intarg_count < NUM_GPR_ARG_REGISTERS)
+	    intarg_count = NUM_GPR_ARG_REGISTERS;
+	  intarg_count += 4;
+	  break;
+
+	case FFI_TYPE_UINT64:
+	case FFI_TYPE_SINT64:
+	  /* 'long long' arguments are passed as two words, but
+	     either both words must fit in registers or both go
+	     on the stack.  If they go on the stack, they must
+	     be 8-byte-aligned.
+
+	     Also, only certain register pairs can be used for
+	     passing long long int -- specifically (r3,r4), (r5,r6),
+	     (r7,r8), (r9,r10).  */
+	  if (intarg_count == NUM_GPR_ARG_REGISTERS-1
+	      || intarg_count % 2 != 0)
+	    intarg_count++;
+	  intarg_count += 2;
+	  break;
+
+	case FFI_TYPE_STRUCT:
+	  /* We must allocate space for a copy of these to enforce
+	     pass-by-value.  Pad the space up to a multiple of 16
+	     bytes (the maximum alignment required for anything under
+	     the SYSV ABI).  */
+	  struct_copy_size += ((*ptr)->size + 15) & ~0xF;
+	  /* Fall through (allocate space for the pointer).  */
+
+	case FFI_TYPE_POINTER:
+	case FFI_TYPE_INT:
+	case FFI_TYPE_UINT32:
+	case FFI_TYPE_SINT32:
+	case FFI_TYPE_UINT16:
+	case FFI_TYPE_SINT16:
+	case FFI_TYPE_UINT8:
+	case FFI_TYPE_SINT8:
+	  /* Everything else is passed as a 4-byte word in a GPR, either
+	     the object itself or a pointer to it.  */
+	  intarg_count++;
+	  break;
+
+	default:
+	  FFI_ASSERT (0);
+	}
+    }
+
+  if (fparg_count != 0)
+    flags |= FLAG_FP_ARGUMENTS;
+  if (intarg_count > 4)
+    flags |= FLAG_4_GPR_ARGUMENTS;
+  if (struct_copy_size != 0)
+    flags |= FLAG_ARG_NEEDS_COPY;
+
+  /* Space for the FPR registers, if needed.  */
+  if (fparg_count != 0)
+    bytes += NUM_FPR_ARG_REGISTERS * sizeof (double);
+
+  /* Stack space.  */
+  if (intarg_count > NUM_GPR_ARG_REGISTERS)
+    bytes += (intarg_count - NUM_GPR_ARG_REGISTERS) * sizeof (int);
+  if (fparg_count > NUM_FPR_ARG_REGISTERS)
+    bytes += (fparg_count - NUM_FPR_ARG_REGISTERS) * sizeof (double);
+
+  /* The stack space allocated needs to be a multiple of 16 bytes.  */
+  bytes = (bytes + 15) & ~0xF;
+
+  /* Add in the space for the copied structures.  */
+  bytes += struct_copy_size;
+
+  cif->flags = flags;
+  cif->bytes = bytes;
+
+  return FFI_OK;
+}
+
+ffi_status FFI_HIDDEN
+ffi_prep_cif_sysv (ffi_cif *cif)
+{
+  if ((cif->abi & FFI_SYSV) == 0)
+    {
+      /* This call is from old code.  Translate to new ABI values.  */
+      cif->flags |= FLAG_COMPAT;
+      switch (cif->abi)
+	{
+	default:
+	  return FFI_BAD_ABI;
+
+	case FFI_COMPAT_SYSV:
+	  cif->abi = FFI_SYSV | FFI_SYSV_STRUCT_RET | FFI_SYSV_LONG_DOUBLE_128;
+	  break;
+
+	case FFI_COMPAT_GCC_SYSV:
+	  cif->abi = FFI_SYSV | FFI_SYSV_LONG_DOUBLE_128;
+	  break;
+
+	case FFI_COMPAT_LINUX:
+	  cif->abi = (FFI_SYSV | FFI_SYSV_IBM_LONG_DOUBLE
+		      | FFI_SYSV_LONG_DOUBLE_128);
+	  break;
+
+	case FFI_COMPAT_LINUX_SOFT_FLOAT:
+	  cif->abi = (FFI_SYSV | FFI_SYSV_SOFT_FLOAT | FFI_SYSV_IBM_LONG_DOUBLE
+		      | FFI_SYSV_LONG_DOUBLE_128);
+	  break;
+	}
+    }
+  return ffi_prep_cif_sysv_core (cif);
+}
+
+/* ffi_prep_args_SYSV is called by the assembly routine once stack space
+   has been allocated for the function's arguments.
+
+   The stack layout we want looks like this:
+
+   |   Return address from ffi_call_SYSV 4bytes	|	higher addresses
+   |--------------------------------------------|
+   |   Previous backchain pointer	4	|       stack pointer here
+   |--------------------------------------------|<+ <<<	on entry to
+   |   Saved r28-r31			4*4	| |	ffi_call_SYSV
+   |--------------------------------------------| |
+   |   GPR registers r3-r10		8*4	| |	ffi_call_SYSV
+   |--------------------------------------------| |
+   |   FPR registers f1-f8 (optional)	8*8	| |
+   |--------------------------------------------| |	stack	|
+   |   Space for copied structures		| |	grows	|
+   |--------------------------------------------| |	down    V
+   |   Parameters that didn't fit in registers  | |
+   |--------------------------------------------| |	lower addresses
+   |   Space for callee's LR		4	| |
+   |--------------------------------------------| |	stack pointer here
+   |   Current backchain pointer	4	|-/	during
+   |--------------------------------------------|   <<<	ffi_call_SYSV
+
+*/
+
+void FFI_HIDDEN
+ffi_prep_args_SYSV (extended_cif *ecif, unsigned *const stack)
+{
+  const unsigned bytes = ecif->cif->bytes;
+  const unsigned flags = ecif->cif->flags;
+
+  typedef union
+  {
+    char *c;
+    unsigned *u;
+    long long *ll;
+    float *f;
+    double *d;
+  } valp;
+
+  /* 'stacktop' points at the previous backchain pointer.  */
+  valp stacktop;
+
+  /* 'gpr_base' points at the space for gpr3, and grows upwards as
+     we use GPR registers.  */
+  valp gpr_base;
+  int intarg_count;
+
+#ifndef __NO_FPRS__
+  /* 'fpr_base' points at the space for fpr1, and grows upwards as
+     we use FPR registers.  */
+  valp fpr_base;
+  int fparg_count;
+#endif
+
+  /* 'copy_space' grows down as we put structures in it.  It should
+     stay 16-byte aligned.  */
+  valp copy_space;
+
+  /* 'next_arg' grows up as we put parameters in it.  */
+  valp next_arg;
+
+  int i;
+  ffi_type **ptr;
+#ifndef __NO_FPRS__
+  double double_tmp;
+#endif
+  union
+  {
+    void **v;
+    char **c;
+    signed char **sc;
+    unsigned char **uc;
+    signed short **ss;
+    unsigned short **us;
+    unsigned int **ui;
+    long long **ll;
+    float **f;
+    double **d;
+  } p_argv;
+  size_t struct_copy_size;
+  unsigned gprvalue;
+
+  stacktop.c = (char *) stack + bytes;
+  gpr_base.u = stacktop.u - ASM_NEEDS_REGISTERS - NUM_GPR_ARG_REGISTERS;
+  intarg_count = 0;
+#ifndef __NO_FPRS__
+  fpr_base.d = gpr_base.d - NUM_FPR_ARG_REGISTERS;
+  fparg_count = 0;
+  copy_space.c = ((flags & FLAG_FP_ARGUMENTS) ? fpr_base.c : gpr_base.c);
+#else
+  copy_space.c = gpr_base.c;
+#endif
+  next_arg.u = stack + 2;
+
+  /* Check that everything starts aligned properly.  */
+  FFI_ASSERT (((unsigned long) (char *) stack & 0xF) == 0);
+  FFI_ASSERT (((unsigned long) copy_space.c & 0xF) == 0);
+  FFI_ASSERT (((unsigned long) stacktop.c & 0xF) == 0);
+  FFI_ASSERT ((bytes & 0xF) == 0);
+  FFI_ASSERT (copy_space.c >= next_arg.c);
+
+  /* Deal with return values that are actually pass-by-reference.  */
+  if (flags & FLAG_RETVAL_REFERENCE)
+    {
+      *gpr_base.u++ = (unsigned long) (char *) ecif->rvalue;
+      intarg_count++;
+    }
+
+  /* Now for the arguments.  */
+  p_argv.v = ecif->avalue;
+  for (ptr = ecif->cif->arg_types, i = ecif->cif->nargs;
+       i > 0;
+       i--, ptr++, p_argv.v++)
+    {
+      unsigned int typenum = (*ptr)->type;
+
+      typenum = translate_float (ecif->cif->abi, typenum);
+
+      /* Now test the translated value */
+      switch (typenum)
+	{
+#ifndef __NO_FPRS__
+# if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
+	case FFI_TYPE_LONGDOUBLE:
+	  double_tmp = (*p_argv.d)[0];
+
+	  if (fparg_count >= NUM_FPR_ARG_REGISTERS - 1)
+	    {
+	      if (intarg_count >= NUM_GPR_ARG_REGISTERS
+		  && intarg_count % 2 != 0)
+		{
+		  intarg_count++;
+		  next_arg.u++;
+		}
+	      *next_arg.d = double_tmp;
+	      next_arg.u += 2;
+	      double_tmp = (*p_argv.d)[1];
+	      *next_arg.d = double_tmp;
+	      next_arg.u += 2;
+	    }
+	  else
+	    {
+	      *fpr_base.d++ = double_tmp;
+	      double_tmp = (*p_argv.d)[1];
+	      *fpr_base.d++ = double_tmp;
+	    }
+
+	  fparg_count += 2;
+	  FFI_ASSERT (flags & FLAG_FP_ARGUMENTS);
+	  break;
+# endif
+	case FFI_TYPE_DOUBLE:
+	  double_tmp = **p_argv.d;
+
+	  if (fparg_count >= NUM_FPR_ARG_REGISTERS)
+	    {
+	      if (intarg_count >= NUM_GPR_ARG_REGISTERS
+		  && intarg_count % 2 != 0)
+		{
+		  intarg_count++;
+		  next_arg.u++;
+		}
+	      *next_arg.d = double_tmp;
+	      next_arg.u += 2;
+	    }
+	  else
+	    *fpr_base.d++ = double_tmp;
+	  fparg_count++;
+	  FFI_ASSERT (flags & FLAG_FP_ARGUMENTS);
+	  break;
+
+	case FFI_TYPE_FLOAT:
+	  double_tmp = **p_argv.f;
+	  if (fparg_count >= NUM_FPR_ARG_REGISTERS)
+	    {
+	      *next_arg.f = (float) double_tmp;
+	      next_arg.u += 1;
+	      intarg_count++;
+	    }
+	  else
+	    *fpr_base.d++ = double_tmp;
+	  fparg_count++;
+	  FFI_ASSERT (flags & FLAG_FP_ARGUMENTS);
+	  break;
+#endif /* have FPRs */
+
+	case FFI_TYPE_UINT128:
+	  /* The soft float ABI for long doubles works like this: a long double
+	     is passed in four consecutive GPRs if available.  A maximum of 2
+	     long doubles can be passed in GPRs.  If we do not have 4 GPRs
+	     left, the long double is passed on the stack, 4-byte aligned.  */
+	  {
+	    unsigned int int_tmp;
+	    unsigned int ii;
+	    if (intarg_count >= NUM_GPR_ARG_REGISTERS - 3)
+	      {
+		if (intarg_count < NUM_GPR_ARG_REGISTERS)
+		  intarg_count = NUM_GPR_ARG_REGISTERS;
+		for (ii = 0; ii < 4; ii++)
+		  {
+		    int_tmp = (*p_argv.ui)[ii];
+		    *next_arg.u++ = int_tmp;
+		  }
+	      }
+	    else
+	      {
+		for (ii = 0; ii < 4; ii++)
+		  {
+		    int_tmp = (*p_argv.ui)[ii];
+		    *gpr_base.u++ = int_tmp;
+		  }
+	      }
+	    intarg_count += 4;
+	    break;
+	  }
+
+	case FFI_TYPE_UINT64:
+	case FFI_TYPE_SINT64:
+	  if (intarg_count == NUM_GPR_ARG_REGISTERS-1)
+	    intarg_count++;
+	  if (intarg_count >= NUM_GPR_ARG_REGISTERS)
+	    {
+	      if (intarg_count % 2 != 0)
+		{
+		  intarg_count++;
+		  next_arg.u++;
+		}
+	      *next_arg.ll = **p_argv.ll;
+	      next_arg.u += 2;
+	    }
+	  else
+	    {
+	      /* The ABI allows only certain register pairs for passing
+		 a long long int, specifically (r3,r4), (r5,r6), (r7,r8)
+		 and (r9,r10).  If the next GPR is not the first register
+		 of such a pair, skip it so that the value starts on the
+		 proper register.  */
+	      if (intarg_count % 2 != 0)
+		{
+		  intarg_count ++;
+		  gpr_base.u++;
+		}
+	      *gpr_base.ll++ = **p_argv.ll;
+	    }
+	  intarg_count += 2;
+	  break;
+
+	case FFI_TYPE_STRUCT:
+	  struct_copy_size = ((*ptr)->size + 15) & ~0xF;
+	  copy_space.c -= struct_copy_size;
+	  memcpy (copy_space.c, *p_argv.c, (*ptr)->size);
+
+	  gprvalue = (unsigned long) copy_space.c;
+
+	  FFI_ASSERT (copy_space.c > next_arg.c);
+	  FFI_ASSERT (flags & FLAG_ARG_NEEDS_COPY);
+	  goto putgpr;
+
+	case FFI_TYPE_UINT8:
+	  gprvalue = **p_argv.uc;
+	  goto putgpr;
+	case FFI_TYPE_SINT8:
+	  gprvalue = **p_argv.sc;
+	  goto putgpr;
+	case FFI_TYPE_UINT16:
+	  gprvalue = **p_argv.us;
+	  goto putgpr;
+	case FFI_TYPE_SINT16:
+	  gprvalue = **p_argv.ss;
+	  goto putgpr;
+
+	case FFI_TYPE_INT:
+	case FFI_TYPE_UINT32:
+	case FFI_TYPE_SINT32:
+	case FFI_TYPE_POINTER:
+
+	  gprvalue = **p_argv.ui;
+
+	putgpr:
+	  if (intarg_count >= NUM_GPR_ARG_REGISTERS)
+	    *next_arg.u++ = gprvalue;
+	  else
+	    *gpr_base.u++ = gprvalue;
+	  intarg_count++;
+	  break;
+	}
+    }
+
+  /* Check that we didn't overrun the stack...  */
+  FFI_ASSERT (copy_space.c >= next_arg.c);
+  FFI_ASSERT (gpr_base.u <= stacktop.u - ASM_NEEDS_REGISTERS);
+  /* The assert below is testing that the number of integer arguments agrees
+     with the number found in ffi_prep_cif_machdep().  However, intarg_count
+     is incremented whenever we place an FP arg on the stack, so account for
+     that before our assert test.  */
+#ifndef __NO_FPRS__
+  if (fparg_count > NUM_FPR_ARG_REGISTERS)
+    intarg_count -= fparg_count - NUM_FPR_ARG_REGISTERS;
+  FFI_ASSERT (fpr_base.u
+	      <= stacktop.u - ASM_NEEDS_REGISTERS - NUM_GPR_ARG_REGISTERS);
+#endif
+  FFI_ASSERT (flags & FLAG_4_GPR_ARGUMENTS || intarg_count <= 4);
+}
+
+#define MIN_CACHE_LINE_SIZE 8
+
+static void
+flush_icache (char *wraddr, char *xaddr, int size)
+{
+  int i;
+  for (i = 0; i < size; i += MIN_CACHE_LINE_SIZE)
+    __asm__ volatile ("icbi 0,%0;" "dcbf 0,%1;"
+		      : : "r" (xaddr + i), "r" (wraddr + i) : "memory");
+  __asm__ volatile ("icbi 0,%0;" "dcbf 0,%1;" "sync;" "isync;"
+		    : : "r"(xaddr + size - 1), "r"(wraddr + size - 1)
+		    : "memory");
+}
+
+ffi_status FFI_HIDDEN
+ffi_prep_closure_loc_sysv (ffi_closure *closure,
+			   ffi_cif *cif,
+			   void (*fun) (ffi_cif *, void *, void **, void *),
+			   void *user_data,
+			   void *codeloc)
+{
+  unsigned int *tramp;
+
+  if (cif->abi < FFI_SYSV || cif->abi >= FFI_LAST_ABI)
+    return FFI_BAD_ABI;
+
+  tramp = (unsigned int *) &closure->tramp[0];
+  tramp[0] = 0x7c0802a6;  /*   mflr    r0 */
+  tramp[1] = 0x4800000d;  /*   bl      10 <trampoline_initial+0x10> */
+  tramp[4] = 0x7d6802a6;  /*   mflr    r11 */
+  tramp[5] = 0x7c0803a6;  /*   mtlr    r0 */
+  tramp[6] = 0x800b0000;  /*   lwz     r0,0(r11) */
+  tramp[7] = 0x816b0004;  /*   lwz     r11,4(r11) */
+  tramp[8] = 0x7c0903a6;  /*   mtctr   r0 */
+  tramp[9] = 0x4e800420;  /*   bctr */
+  *(void **) &tramp[2] = (void *) ffi_closure_SYSV; /* function */
+  *(void **) &tramp[3] = codeloc;                   /* context */
+
+  /* Flush the icache.  */
+  flush_icache ((char *)tramp, (char *)codeloc, FFI_TRAMPOLINE_SIZE);
+
+  closure->cif = cif;
+  closure->fun = fun;
+  closure->user_data = user_data;
+
+  return FFI_OK;
+}
+
+/* Basically the trampoline invokes ffi_closure_SYSV, and on
+   entry, r11 holds the address of the closure.
+   After storing the registers that could possibly contain
+   parameters to be passed into the stack frame and setting
+   up space for a return value, ffi_closure_SYSV invokes the
+   following helper function to do most of the work.  */
+
+int
+ffi_closure_helper_SYSV (ffi_closure *closure, void *rvalue,
+			 unsigned long *pgr, ffi_dblfl *pfr,
+			 unsigned long *pst)
+{
+  /* rvalue is the pointer to space for return value in closure assembly */
+  /* pgr is the pointer to where r3-r10 are stored in ffi_closure_SYSV */
+  /* pfr is the pointer to where f1-f8 are stored in ffi_closure_SYSV  */
+  /* pst is the pointer to outgoing parameter stack in original caller */
+
+  void **          avalue;
+  ffi_type **      arg_types;
+  long             i, avn;
+#ifndef __NO_FPRS__
+  long             nf = 0;   /* number of floating registers already used */
+#endif
+  long             ng = 0;   /* number of general registers already used */
+
+  ffi_cif *cif = closure->cif;
+  unsigned       size     = cif->rtype->size;
+  unsigned short rtypenum = cif->rtype->type;
+
+  avalue = alloca (cif->nargs * sizeof (void *));
+
+  /* First translate for softfloat/nonlinux */
+  rtypenum = translate_float (cif->abi, rtypenum);
+
+  /* Copy the caller's structure return value address so that the closure
+     returns the data directly to the caller.
+     For FFI_SYSV the result is passed in r3/r4 if the struct size is less
+     than or equal to 8 bytes.  */
+  if (rtypenum == FFI_TYPE_STRUCT
+      && !((cif->abi & FFI_SYSV_STRUCT_RET) != 0 && size <= 8))
+    {
+      rvalue = (void *) *pgr;
+      ng++;
+      pgr++;
+    }
+
+  i = 0;
+  avn = cif->nargs;
+  arg_types = cif->arg_types;
+
+  /* Grab the addresses of the arguments from the stack frame.  */
+  while (i < avn) {
+    unsigned short typenum = arg_types[i]->type;
+
+    /* We may need to handle some values depending on ABI.  */
+    typenum = translate_float (cif->abi, typenum);
+
+    switch (typenum)
+      {
+#ifndef __NO_FPRS__
+      case FFI_TYPE_FLOAT:
+	/* Unfortunately float values are stored as doubles
+	   in the ffi_closure_SYSV code (since we don't check
+	   the type in that routine).  */
+	if (nf < NUM_FPR_ARG_REGISTERS)
+	  {
+	    /* FIXME? here we are really changing the values
+	       stored in the original calling routine's outgoing
+	       parameter stack.  This is probably a really
+	       naughty thing to do but...  */
+	    double temp = pfr->d;
+	    pfr->f = (float) temp;
+	    avalue[i] = pfr;
+	    nf++;
+	    pfr++;
+	  }
+	else
+	  {
+	    avalue[i] = pst;
+	    pst += 1;
+	  }
+	break;
+
+      case FFI_TYPE_DOUBLE:
+	if (nf < NUM_FPR_ARG_REGISTERS)
+	  {
+	    avalue[i] = pfr;
+	    nf++;
+	    pfr++;
+	  }
+	else
+	  {
+	    if (((long) pst) & 4)
+	      pst++;
+	    avalue[i] = pst;
+	    pst += 2;
+	  }
+	break;
+
+# if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
+      case FFI_TYPE_LONGDOUBLE:
+	if (nf < NUM_FPR_ARG_REGISTERS - 1)
+	  {
+	    avalue[i] = pfr;
+	    pfr += 2;
+	    nf += 2;
+	  }
+	else
+	  {
+	    if (((long) pst) & 4)
+	      pst++;
+	    avalue[i] = pst;
+	    pst += 4;
+	    nf = 8;
+	  }
+	break;
+# endif
+#endif
+
+      case FFI_TYPE_UINT128:
+	/* Test whether 4 GPRs are available for the whole long double;
+	   otherwise it ends up on the stack.  */
+	if (ng < NUM_GPR_ARG_REGISTERS - 3)
+	  {
+	    avalue[i] = pgr;
+	    pgr += 4;
+	    ng += 4;
+	  }
+	else
+	  {
+	    avalue[i] = pst;
+	    pst += 4;
+	    ng = 8+4;
+	  }
+	break;
+
+      case FFI_TYPE_SINT8:
+      case FFI_TYPE_UINT8:
+#ifndef __LITTLE_ENDIAN__
+	if (ng < NUM_GPR_ARG_REGISTERS)
+	  {
+	    avalue[i] = (char *) pgr + 3;
+	    ng++;
+	    pgr++;
+	  }
+	else
+	  {
+	    avalue[i] = (char *) pst + 3;
+	    pst++;
+	  }
+	break;
+#endif
+
+      case FFI_TYPE_SINT16:
+      case FFI_TYPE_UINT16:
+#ifndef __LITTLE_ENDIAN__
+	if (ng < NUM_GPR_ARG_REGISTERS)
+	  {
+	    avalue[i] = (char *) pgr + 2;
+	    ng++;
+	    pgr++;
+	  }
+	else
+	  {
+	    avalue[i] = (char *) pst + 2;
+	    pst++;
+	  }
+	break;
+#endif
+
+      case FFI_TYPE_SINT32:
+      case FFI_TYPE_UINT32:
+      case FFI_TYPE_POINTER:
+	if (ng < NUM_GPR_ARG_REGISTERS)
+	  {
+	    avalue[i] = pgr;
+	    ng++;
+	    pgr++;
+	  }
+	else
+	  {
+	    avalue[i] = pst;
+	    pst++;
+	  }
+	break;
+
+      case FFI_TYPE_STRUCT:
+	/* Structs are passed by reference. The address will appear in a
+	   gpr if it is one of the first 8 arguments.  */
+	if (ng < NUM_GPR_ARG_REGISTERS)
+	  {
+	    avalue[i] = (void *) *pgr;
+	    ng++;
+	    pgr++;
+	  }
+	else
+	  {
+	    avalue[i] = (void *) *pst;
+	    pst++;
+	  }
+	break;
+
+      case FFI_TYPE_SINT64:
+      case FFI_TYPE_UINT64:
+	/* Passing long long ints is complex: they must
+	   be passed in suitable register pairs such as
+	   (r3,r4), (r5,r6), (r7,r8) or (r9,r10),
+	   and if the entire pair isn't available then the outgoing
+	   parameter stack is used for both but an alignment of 8
+	   must be kept.  So we must either look in pgr
+	   or pst to find the correct address for this type
+	   of parameter.  */
+	if (ng < NUM_GPR_ARG_REGISTERS - 1)
+	  {
+	    if (ng & 1)
+	      {
+		/* skip r4, r6, r8 as starting points */
+		ng++;
+		pgr++;
+	      }
+	    avalue[i] = pgr;
+	    ng += 2;
+	    pgr += 2;
+	  }
+	else
+	  {
+	    if (((long) pst) & 4)
+	      pst++;
+	    avalue[i] = pst;
+	    pst += 2;
+	    ng = NUM_GPR_ARG_REGISTERS;
+	  }
+	break;
+
+      default:
+	FFI_ASSERT (0);
+      }
+
+    i++;
+  }
+
+  (closure->fun) (cif, rvalue, avalue, closure->user_data);
+
+  /* Tell ffi_closure_SYSV how to perform return type promotions.
+     Because the FFI_SYSV ABI returns the structures <= 8 bytes in
+     r3/r4 we have to tell ffi_closure_SYSV how to treat them.  We
+     combine the base type FFI_SYSV_TYPE_SMALL_STRUCT with the size of
+     the struct less one.  We never have a struct with size zero.
+     See the comment in ffitarget.h about ordering.  */
+  if (rtypenum == FFI_TYPE_STRUCT
+      && (cif->abi & FFI_SYSV_STRUCT_RET) != 0 && size <= 8)
+    return FFI_SYSV_TYPE_SMALL_STRUCT - 1 + size;
+  return rtypenum;
+}
+#endif
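
Reviewer note, not part of the patch: the register-pair rule that ffi_closure_helper_SYSV applies to 64-bit integers can be modelled in isolation. The sketch below is only an illustration under stated assumptions. The helper name place_int64 and its slot numbering (0 = r3) are hypothetical and do not exist in libffi; only NUM_GPR_ARG_REGISTERS = 8 (r3 to r10) is taken from the code above.

```c
#include <assert.h>

#define NUM_GPR_ARG_REGISTERS 8	/* r3..r10, as in the patch */

/* Hypothetical helper, not a libffi function.  *used is the number of
   GPR argument slots already consumed.  Returns the slot index
   (0 = r3) where a 64-bit integer starts, or -1 if the value has to
   go on the stack.  *used is updated in the same way the closure
   helper updates 'ng'.  */
static int
place_int64 (int *used)
{
  int n = *used;

  assert (n >= 0);

  if (n < NUM_GPR_ARG_REGISTERS - 1)
    {
      /* A pair may only start at r3, r5, r7 or r9, that is, at an
	 even slot index; an odd slot is skipped and left unused.  */
      if (n & 1)
	n++;
      *used = n + 2;
      return n;
    }

  /* No complete pair is left: the value goes on the stack and the
     remaining GPR slots are no longer used for integer arguments.  */
  *used = NUM_GPR_ARG_REGISTERS;
  return -1;
}
```

With one slot already used (r3 taken), the helper skips r4 and returns slot 2, the (r5,r6) pair. With seven slots used it returns -1, and r10 stays empty.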
Index: gcc-4_8-branch/libffi/src/powerpc/sysv.S
===================================================================
--- gcc-4_8-branch.orig/libffi/src/powerpc/sysv.S	2013-12-28 17:41:31.722625405 +0100
+++ gcc-4_8-branch/libffi/src/powerpc/sysv.S	2013-12-28 17:50:47.163375075 +0100
@@ -30,7 +30,7 @@
 #include <ffi.h>
 #include <powerpc/asm.h>
 
-#ifndef __powerpc64__
+#ifndef POWERPC64
 	.globl ffi_prep_args_SYSV
 ENTRY(ffi_call_SYSV)
 .LFB1:
@@ -213,8 +213,8 @@ END(ffi_call_SYSV)
       .uleb128  0x1c
       .align 2
 .LEFDE1:
-#endif
 
 #if defined __ELF__ && defined __linux__
 	.section	.note.GNU-stack,"",@progbits
 #endif
+#endif
Index: gcc-4_8-branch/libffi/src/prep_cif.c
===================================================================
--- gcc-4_8-branch.orig/libffi/src/prep_cif.c	2013-12-28 17:41:31.723625410 +0100
+++ gcc-4_8-branch/libffi/src/prep_cif.c	2013-12-28 17:50:47.166375090 +0100
@@ -126,6 +126,10 @@ ffi_status FFI_HIDDEN ffi_prep_cif_core(
 
   cif->flags = 0;
 
+#if HAVE_LONG_DOUBLE_VARIANT
+  ffi_prep_types (abi);
+#endif
+
   /* Initialize the return type if necessary */
   if ((cif->rtype->size == 0) && (initialize_aggregate(cif->rtype) != FFI_OK))
     return FFI_BAD_TYPEDEF;
Index: gcc-4_8-branch/libffi/src/types.c
===================================================================
--- gcc-4_8-branch.orig/libffi/src/types.c	2013-12-28 17:41:31.723625410 +0100
+++ gcc-4_8-branch/libffi/src/types.c	2013-12-28 17:50:47.169375105 +0100
@@ -44,6 +44,17 @@ const ffi_type ffi_type_##name = {		\
   id, NULL					\
 }
 
+#define FFI_NONCONST_TYPEDEF(name, type, id)	\
+struct struct_align_##name {			\
+  char c;					\
+  type x;					\
+};						\
+ffi_type ffi_type_##name = {			\
+  sizeof(type),					\
+  offsetof(struct struct_align_##name, x),	\
+  id, NULL					\
+}
+
 /* Size and alignment are fake here. They must not be 0. */
 const ffi_type ffi_type_void = {
   1, 1, FFI_TYPE_VOID, NULL
@@ -73,5 +84,9 @@ FFI_TYPEDEF(double, double, FFI_TYPE_DOU
 # endif
 const ffi_type ffi_type_longdouble = { 16, 16, 4, NULL };
 #elif FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
+# if HAVE_LONG_DOUBLE_VARIANT
+FFI_NONCONST_TYPEDEF(longdouble, long double, FFI_TYPE_LONGDOUBLE);
+# else
 FFI_TYPEDEF(longdouble, long double, FFI_TYPE_LONGDOUBLE);
+# endif
 #endif
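
Reviewer note, not part of the patch: FFI_NONCONST_TYPEDEF uses the same char-plus-member struct as FFI_TYPEDEF to obtain the alignment through offsetof, but drops the const qualifier so that the descriptor can be adjusted at run time. Presumably that is what ffi_prep_types (abi) relies on, although its body is not shown in this part of the patch. The standalone sketch below illustrates the idea; struct type_info, DEMO_TYPEDEF and demo_use_double_layout are hypothetical names that do not exist in libffi.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the size and alignment fields of ffi_type.  */
struct type_info
{
  size_t size;
  unsigned short alignment;
};

/* Same trick as in types.c: placing the member after a single char
   makes its offset equal to the alignment the compiler requires.  */
#define DEMO_TYPEDEF(name, type)			\
struct demo_align_##name {				\
  char c;						\
  type x;						\
};							\
static struct type_info demo_type_##name = {		\
  sizeof (type),					\
  offsetof (struct demo_align_##name, x)		\
}

DEMO_TYPEDEF (sint8, signed char);
DEMO_TYPEDEF (double, double);
DEMO_TYPEDEF (longdouble, long double);

/* Because the descriptor is not const, it can be rewritten when the
   selected ABI treats long double like double.  This is an assumption
   about how such a hook might be used, not the patch's actual code.  */
static void
demo_use_double_layout (void)
{
  demo_type_longdouble.size = demo_type_double.size;
  demo_type_longdouble.alignment = demo_type_double.alignment;
}
```

A const descriptor, as produced by FFI_TYPEDEF, could not be updated this way, which appears to be the reason for the separate macro when HAVE_LONG_DOUBLE_VARIANT is set.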
Index: gcc-4_8-branch/libffi/testsuite/Makefile.in
===================================================================
--- gcc-4_8-branch.orig/libffi/testsuite/Makefile.in	2013-12-28 17:41:31.723625410 +0100
+++ gcc-4_8-branch/libffi/testsuite/Makefile.in	2013-12-28 17:50:47.172375120 +0100
@@ -88,6 +88,7 @@ FFI_EXEC_TRAMPOLINE_TABLE = @FFI_EXEC_TR
 FGREP = @FGREP@
 GREP = @GREP@
 HAVE_LONG_DOUBLE = @HAVE_LONG_DOUBLE@
+HAVE_LONG_DOUBLE_VARIANT = @HAVE_LONG_DOUBLE_VARIANT@
 INSTALL = @INSTALL@
 INSTALL_DATA = @INSTALL_DATA@
 INSTALL_PROGRAM = @INSTALL_PROGRAM@