diff mbox

, rs6000 cleanup, make constraints tighter

Message ID 20140807184122.GA15905@ibm-tiger.the-meissners.org
State New
Headers show

Commit Message

Michael Meissner Aug. 7, 2014, 6:41 p.m. UTC
I'm starting to look at updating my old address branch with an eye towards
getting the changes committed in GCC 4.10.  The address branch is meant to
rewrite handling of addresses in the rs6000 backend, to generalize the
addresses before register allocation to allow more general forms, and then
within register allocation, be more specific, depending on the register classes
involved.

The goals of the address work are:

   1)	On power7, enable using the traditional Altivec registers to hold
	double precision floating point;

   2)	On power8, enable using the traditional Altivec registers to hold
	single precision floating point;

   3)	On power8, enable better fusion support for loads, by keeping the
	extended addressing forms together, so more fused instructions can be
	emitted.

I believe that for the first two features, we will also need to flip the switch
on LRA, to make it default, in order to better deal with having reg+offset
addressing in the traditional floating point registers, but only reg+reg
addressing in the traditional Altivec registers.

These patches primarily tighten the constraints, so that the "wa" constraing
(all VSX registers) is not used on types that are not allowed in the
traditional Altivec registers.  This can throw off the pressure sensitive
optimizations like -fsched-pressure because the pass thinks there are more
registers available than could be allocated.

I also have a simplification in rs6000.c that makes it easier to decide if
scalar values can go in traditional Altivec registers in the patch.

I have done bootstrap and C/C++ regression testing on power7, power8 big
endian, and power8 little endian systems with no regressions.  At this time, I
have not run the fortran regression tests, since there are various failures due
to memory allocation in the trunk when I tested it (bugzilla 61950).  Are these
patches ok to be installed in the trunk?  I would like to install them in 4.9
and 4.8 trees as well.

2014-08-07  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/constraints.md (wh constraint): New constraint,
	for FP registers if direct move is available.
	(wi constraint): New constraint, for VSX/FP registers that can
	handle 64-bit integers.
	(wj constraint): New constraint for VSX/FP registers that can
	handle 64-bit integers for direct moves.
	(wk constraint): New constraint for VSX/FP registers that can
	handle 64-bit doubles for direct moves.
	(wy constraint): Make documentation match implementation.

	* config/rs6000/rs6000.c (struct rs6000_reg_addr): Add
	scalar_in_vmx_p field to simplify tests of whether SFmode or
	DFmode can go in the Altivec registers.
	(rs6000_hard_regno_mode_ok): Use scalar_in_vmx_p field.
	(rs6000_setup_reg_addr_masks): Likewise.
	(rs6000_debug_print_mode): Add debug support for scalar_in_vmx_p
	field, and wh/wi/wj/wk constraints.
	(rs6000_init_hard_regno_mode_ok): Setup scalar_in_vmx_p field, and
	the wh/wi/wj/wk constraints.
	(rs6000_preferred_reload_class): If SFmode/DFmode can go in the
	upper registers, prefer VSX registers unless the operation is a
	memory operation with REG+OFFSET addressing.

	* config/rs6000/vsx.md (VSr mode attribute): Add support for
	DImode.  Change SFmode to use ww constraint instead of d to allow
	SF registers in the upper registers.
	(VSr2): Likewise.
	(VSr3): Likewise.
	(VSr5): Fix thinko in comment.
	(VSa): New mode attribute that is an alternative to wa, that
	returns the VSX register class that a mode can go in, but may not
	be the preferred register class.
	(VS_64dm): New mode attribute for appropriate register classes for
	referencing 64-bit elements of vectors for direct moves and normal
	moves.
	(VS_64reg): Likewise.
	(vsx_mov<mode>): Change wa constraint to <VSa> to limit the
	register allocator to only registers the data type can handle.
	(vsx_le_perm_load_<mode>): Likewise.
	(vsx_le_perm_store_<mode>): Likewise.
	(vsx_xxpermdi2_le_<mode>): Likewise.
	(vsx_xxpermdi4_le_<mode>): Likewise.
	(vsx_lxvd2x2_le_<mode>): Likewise.
	(vsx_lxvd2x4_le_<mode>): Likewise.
	(vsx_stxvd2x2_le_<mode>): Likewise.
	(vsx_add<mode>3): Likewise.
	(vsx_sub<mode>3): Likewise.
	(vsx_mul<mode>3): Likewise.
	(vsx_div<mode>3): Likewise.
	(vsx_tdiv<mode>3_internal): Likewise.
	(vsx_fre<mode>2): Likewise.
	(vsx_neg<mode>2): Likewise.
	(vsx_abs<mode>2): Likewise.
	(vsx_nabs<mode>2): Likewise.
	(vsx_smax<mode>3): Likewise.
	(vsx_smin<mode>3): Likewise.
	(vsx_sqrt<mode>2): Likewise.
	(vsx_rsqrte<mode>2): Likewise.
	(vsx_tsqrt<mode>2_internal): Likewise.
	(vsx_fms<mode>4): Likewise.
	(vsx_nfma<mode>4): Likewise.
	(vsx_eq<mode>): Likewise.
	(vsx_gt<mode>): Likewise.
	(vsx_ge<mode>): Likewise.
	(vsx_eq<mode>_p): Likewise.
	(vsx_gt<mode>_p): Likewise.
	(vsx_ge<mode>_p): Likewise.
	(vsx_xxsel<mode>): Likewise.
	(vsx_xxsel<mode>_uns): Likewise.
	(vsx_copysign<mode>3): Likewise.
	(vsx_float<VSi><mode>2): Likewise.
	(vsx_floatuns<VSi><mode>2): Likewise.
	(vsx_fix_trunc<mode><VSi>2): Likewise.
	(vsx_fixuns_trunc<mode><VSi>2): Likewise.
	(vsx_x<VSv>r<VSs>i): Likewise.
	(vsx_x<VSv>r<VSs>ic): Likewise.
	(vsx_btrunc<mode>2): Likewise.
	(vsx_b2trunc<mode>2): Likewise.
	(vsx_floor<mode>2): Likewise.
	(vsx_ceil<mode>2): Likewise.
	(vsx_<VS_spdp_insn>): Likewise.
	(vsx_xscvspdp): Likewise.
	(vsx_xvcvspuxds): Likewise.
	(vsx_float_fix_<mode>2): Likewise.
	(vsx_set_<mode>): Likewise.
	(vsx_extract_<mode>_internal1): Likewise.
	(vsx_extract_<mode>_internal2): Likewise.
	(vsx_extract_<mode>_load): Likewise.
	(vsx_extract_<mode>_store): Likewise.
	(vsx_splat_<mode>): Likewise.
	(vsx_xxspltw_<mode>): Likewise.
	(vsx_xxspltw_<mode>_direct): Likewise.
	(vsx_xxmrghw_<mode>): Likewise.
	(vsx_xxmrglw_<mode>): Likewise.
	(vsx_xxsldwi_<mode>): Likewise.
	(vsx_xscvdpspn): Tighten constraints to only use register classes
	the types use.
	(vsx_xscvspdpn): Likewise.
	(vsx_xscvdpspn_scalar): Likewise.

	* config/rs6000/rs6000.h (enum rs6000_reg_class_enum): Add wh, wi,
	wj, and wk constraints.
	(GPR_REG_CLASS_P): New helper macro for register classes targeting
	general purpose registers.

	* config/rs6000/rs6000.md (f32_dm): Use wh constraint for SDmode
	direct moves.
	(zero_extendsidi2_lfiwz): Use wj constraint for direct move of
	DImode instead of wm.  Use wk constraint for direct move of DFmode
	instead of wm.
	(extendsidi2_lfiwax): Likewise.
	(lfiwax): Likewise.
	(lfiwzx): Likewise.
	(movdi_internal64): Likewise.

	* doc/md.texi (PowerPC and IBM RS6000): Document wh, wi, wj, and
	wk constraints. Make the wy constraint documentation match them
	implementation.

Comments

David Edelsohn Aug. 8, 2014, 7:31 p.m. UTC | #1
On Thu, Aug 7, 2014 at 2:41 PM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> I'm starting to look at updating my old address branch with an eye towards
> getting the changes committed in GCC 4.10.  The address branch is meant to
> rewrite handling of addresses in the rs6000 backend, to generalize the
> addresses before register allocation to allow more general forms, and then
> within register allocation, be more specific, depending on the register classes
> involved.
>
> The goals of the address work are:
>
>    1)   On power7, enable using the traditional Altivec registers to hold
>         double precision floating point;
>
>    2)   On power8, enable using the traditional Altivec registers to hold
>         single precision floating point;
>
>    3)   On power8, enable better fusion support for loads, by keeping the
>         extended addressing forms together, so more fused instructions can be
>         emitted.
>
> I believe that for the first two features, we will also need to flip the switch
> on LRA, to make it default, in order to better deal with having reg+offset
> addressing in the traditional floating point registers, but only reg+reg
> addressing in the traditional Altivec registers.
>
> These patches primarily tighten the constraints, so that the "wa" constraing
> (all VSX registers) is not used on types that are not allowed in the
> traditional Altivec registers.  This can throw off the pressure sensitive
> optimizations like -fsched-pressure because the pass thinks there are more
> registers available than could be allocated.
>
> I also have a simplification in rs6000.c that makes it easier to decide if
> scalar values can go in traditional Altivec registers in the patch.
>
> I have done bootstrap and C/C++ regression testing on power7, power8 big
> endian, and power8 little endian systems with no regressions.  At this time, I
> have not run the fortran regression tests, since there are various failures due
> to memory allocation in the trunk when I tested it (bugzilla 61950).  Are these
> patches ok to be installed in the trunk?  I would like to install them in 4.9
> and 4.8 trees as well.
>
> 2014-08-07  Michael Meissner  <meissner@linux.vnet.ibm.com>
>
>         * config/rs6000/constraints.md (wh constraint): New constraint,
>         for FP registers if direct move is available.
>         (wi constraint): New constraint, for VSX/FP registers that can
>         handle 64-bit integers.
>         (wj constraint): New constraint for VSX/FP registers that can
>         handle 64-bit integers for direct moves.
>         (wk constraint): New constraint for VSX/FP registers that can
>         handle 64-bit doubles for direct moves.
>         (wy constraint): Make documentation match implementation.
>
>         * config/rs6000/rs6000.c (struct rs6000_reg_addr): Add
>         scalar_in_vmx_p field to simplify tests of whether SFmode or
>         DFmode can go in the Altivec registers.
>         (rs6000_hard_regno_mode_ok): Use scalar_in_vmx_p field.
>         (rs6000_setup_reg_addr_masks): Likewise.
>         (rs6000_debug_print_mode): Add debug support for scalar_in_vmx_p
>         field, and wh/wi/wj/wk constraints.
>         (rs6000_init_hard_regno_mode_ok): Setup scalar_in_vmx_p field, and
>         the wh/wi/wj/wk constraints.
>         (rs6000_preferred_reload_class): If SFmode/DFmode can go in the
>         upper registers, prefer VSX registers unless the operation is a
>         memory operation with REG+OFFSET addressing.
>
>         * config/rs6000/vsx.md (VSr mode attribute): Add support for
>         DImode.  Change SFmode to use ww constraint instead of d to allow
>         SF registers in the upper registers.
>         (VSr2): Likewise.
>         (VSr3): Likewise.
>         (VSr5): Fix thinko in comment.
>         (VSa): New mode attribute that is an alternative to wa, that
>         returns the VSX register class that a mode can go in, but may not
>         be the preferred register class.
>         (VS_64dm): New mode attribute for appropriate register classes for
>         referencing 64-bit elements of vectors for direct moves and normal
>         moves.
>         (VS_64reg): Likewise.
>         (vsx_mov<mode>): Change wa constraint to <VSa> to limit the
>         register allocator to only registers the data type can handle.
>         (vsx_le_perm_load_<mode>): Likewise.
>         (vsx_le_perm_store_<mode>): Likewise.
>         (vsx_xxpermdi2_le_<mode>): Likewise.
>         (vsx_xxpermdi4_le_<mode>): Likewise.
>         (vsx_lxvd2x2_le_<mode>): Likewise.
>         (vsx_lxvd2x4_le_<mode>): Likewise.
>         (vsx_stxvd2x2_le_<mode>): Likewise.
>         (vsx_add<mode>3): Likewise.
>         (vsx_sub<mode>3): Likewise.
>         (vsx_mul<mode>3): Likewise.
>         (vsx_div<mode>3): Likewise.
>         (vsx_tdiv<mode>3_internal): Likewise.
>         (vsx_fre<mode>2): Likewise.
>         (vsx_neg<mode>2): Likewise.
>         (vsx_abs<mode>2): Likewise.
>         (vsx_nabs<mode>2): Likewise.
>         (vsx_smax<mode>3): Likewise.
>         (vsx_smin<mode>3): Likewise.
>         (vsx_sqrt<mode>2): Likewise.
>         (vsx_rsqrte<mode>2): Likewise.
>         (vsx_tsqrt<mode>2_internal): Likewise.
>         (vsx_fms<mode>4): Likewise.
>         (vsx_nfma<mode>4): Likewise.
>         (vsx_eq<mode>): Likewise.
>         (vsx_gt<mode>): Likewise.
>         (vsx_ge<mode>): Likewise.
>         (vsx_eq<mode>_p): Likewise.
>         (vsx_gt<mode>_p): Likewise.
>         (vsx_ge<mode>_p): Likewise.
>         (vsx_xxsel<mode>): Likewise.
>         (vsx_xxsel<mode>_uns): Likewise.
>         (vsx_copysign<mode>3): Likewise.
>         (vsx_float<VSi><mode>2): Likewise.
>         (vsx_floatuns<VSi><mode>2): Likewise.
>         (vsx_fix_trunc<mode><VSi>2): Likewise.
>         (vsx_fixuns_trunc<mode><VSi>2): Likewise.
>         (vsx_x<VSv>r<VSs>i): Likewise.
>         (vsx_x<VSv>r<VSs>ic): Likewise.
>         (vsx_btrunc<mode>2): Likewise.
>         (vsx_b2trunc<mode>2): Likewise.
>         (vsx_floor<mode>2): Likewise.
>         (vsx_ceil<mode>2): Likewise.
>         (vsx_<VS_spdp_insn>): Likewise.
>         (vsx_xscvspdp): Likewise.
>         (vsx_xvcvspuxds): Likewise.
>         (vsx_float_fix_<mode>2): Likewise.
>         (vsx_set_<mode>): Likewise.
>         (vsx_extract_<mode>_internal1): Likewise.
>         (vsx_extract_<mode>_internal2): Likewise.
>         (vsx_extract_<mode>_load): Likewise.
>         (vsx_extract_<mode>_store): Likewise.
>         (vsx_splat_<mode>): Likewise.
>         (vsx_xxspltw_<mode>): Likewise.
>         (vsx_xxspltw_<mode>_direct): Likewise.
>         (vsx_xxmrghw_<mode>): Likewise.
>         (vsx_xxmrglw_<mode>): Likewise.
>         (vsx_xxsldwi_<mode>): Likewise.
>         (vsx_xscvdpspn): Tighten constraints to only use register classes
>         the types use.
>         (vsx_xscvspdpn): Likewise.
>         (vsx_xscvdpspn_scalar): Likewise.
>
>         * config/rs6000/rs6000.h (enum rs6000_reg_class_enum): Add wh, wi,
>         wj, and wk constraints.
>         (GPR_REG_CLASS_P): New helper macro for register classes targeting
>         general purpose registers.
>
>         * config/rs6000/rs6000.md (f32_dm): Use wh constraint for SDmode
>         direct moves.
>         (zero_extendsidi2_lfiwz): Use wj constraint for direct move of
>         DImode instead of wm.  Use wk constraint for direct move of DFmode
>         instead of wm.
>         (extendsidi2_lfiwax): Likewise.
>         (lfiwax): Likewise.
>         (lfiwzx): Likewise.
>         (movdi_internal64): Likewise.
>
>         * doc/md.texi (PowerPC and IBM RS6000): Document wh, wi, wj, and
>         wk constraints. Make the wy constraint documentation match them
>         implementation.

Okay.

Note that you or Carrot will have to be careful about the merge of
movdi_internal64 as both of your patches affect that pattern.

Thanks, David
Michael Meissner Aug. 11, 2014, 5:21 p.m. UTC | #2
On Fri, Aug 08, 2014 at 03:31:48PM -0400, David Edelsohn wrote:
> Okay.
> 
> Note that you or Carrot will have to be careful about the merge of
> movdi_internal64 as both of your patches affect that pattern.

Carrot has checked in his patch, and I will adapt my patch (and replace the
constraint wm Carrot used with wi, since wi is now the constraint to use for
DImode in FP/VSX registers.
diff mbox

Patch

Index: gcc/config/rs6000/constraints.md
===================================================================
--- gcc/config/rs6000/constraints.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk)	(revision 213676)
+++ gcc/config/rs6000/constraints.md	(working copy)
@@ -68,6 +68,20 @@  (define_register_constraint "wf" "rs6000
 (define_register_constraint "wg" "rs6000_constraints[RS6000_CONSTRAINT_wg]"
   "If -mmfpgpr was used, a floating point register or NO_REGS.")
 
+(define_register_constraint "wh" "rs6000_constraints[RS6000_CONSTRAINT_wh]"
+  "Floating point register if direct moves are available, or NO_REGS.")
+
+;; At present, DImode is not allowed in the Altivec registers.  If in the
+;; future it is allowed, wi/wj can be set to VSX_REGS instead of FLOAT_REGS.
+(define_register_constraint "wi" "rs6000_constraints[RS6000_CONSTRAINT_wi]"
+  "FP or VSX register to hold 64-bit integers or NO_REGS.")
+
+(define_register_constraint "wj" "rs6000_constraints[RS6000_CONSTRAINT_wj]"
+  "FP or VSX register to hold 64-bit integers for direct moves or NO_REGS.")
+
+(define_register_constraint "wk" "rs6000_constraints[RS6000_CONSTRAINT_wk]"
+  "FP or VSX register to hold 64-bit doubles for direct moves or NO_REGS.")
+
 (define_register_constraint "wl" "rs6000_constraints[RS6000_CONSTRAINT_wl]"
   "Floating point register if the LFIWAX instruction is enabled or NO_REGS.")
 
@@ -101,7 +115,7 @@  (define_register_constraint "wx" "rs6000
   "Floating point register if the STFIWX instruction is enabled or NO_REGS.")
 
 (define_register_constraint "wy" "rs6000_constraints[RS6000_CONSTRAINT_wy]"
-  "VSX vector register to hold scalar float values or NO_REGS.")
+  "FP or VSX register to perform ISA 2.07 float ops or NO_REGS.")
 
 (define_register_constraint "wz" "rs6000_constraints[RS6000_CONSTRAINT_wz]"
   "Floating point register if the LFIWZX instruction is enabled or NO_REGS.")
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk)	(revision 213676)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -389,6 +389,7 @@  struct rs6000_reg_addr {
   enum insn_code reload_gpr_vsx;	/* INSN to move from GPR to VSX.  */
   enum insn_code reload_vsx_gpr;	/* INSN to move from VSX to GPR.  */
   addr_mask_type addr_mask[(int)N_RELOAD_REG]; /* Valid address masks.  */
+  bool scalar_in_vmx_p;			/* Scalar value can go in VMX.  */
 };
 
 static struct rs6000_reg_addr reg_addr[NUM_MACHINE_MODES];
@@ -1733,8 +1734,7 @@  rs6000_hard_regno_mode_ok (int regno, en
      asked for it.  */
   if (TARGET_VSX && VSX_REGNO_P (regno)
       && (VECTOR_MEM_VSX_P (mode)
-	  || (TARGET_VSX_SCALAR_FLOAT && mode == SFmode)
-	  || (TARGET_VSX_SCALAR_DOUBLE && (mode == DFmode || mode == DImode))
+	  || reg_addr[mode].scalar_in_vmx_p
 	  || (TARGET_VSX_TIMODE && mode == TImode)
 	  || (TARGET_VADDUQM && mode == V1TImode)))
     {
@@ -1743,10 +1743,7 @@  rs6000_hard_regno_mode_ok (int regno, en
 
       if (ALTIVEC_REGNO_P (regno))
 	{
-	  if (mode == SFmode && !TARGET_UPPER_REGS_SF)
-	    return 0;
-
-	  if ((mode == DFmode || mode == DImode) && !TARGET_UPPER_REGS_DF)
+	  if (GET_MODE_SIZE (mode) != 16 && !reg_addr[mode].scalar_in_vmx_p)
 	    return 0;
 
 	  return ALTIVEC_REGNO_P (last_regno);
@@ -1926,14 +1923,16 @@  rs6000_debug_print_mode (ssize_t m)
   if (rs6000_vector_unit[m] != VECTOR_NONE
       || rs6000_vector_mem[m] != VECTOR_NONE
       || (reg_addr[m].reload_store != CODE_FOR_nothing)
-      || (reg_addr[m].reload_load != CODE_FOR_nothing))
+      || (reg_addr[m].reload_load != CODE_FOR_nothing)
+      || reg_addr[m].scalar_in_vmx_p)
     {
       fprintf (stderr,
-	       "  Vector-arith=%-10s Vector-mem=%-10s Reload=%c%c",
+	       "  Vector-arith=%-10s Vector-mem=%-10s Reload=%c%c Upper=%c",
 	       rs6000_debug_vector_unit (rs6000_vector_unit[m]),
 	       rs6000_debug_vector_unit (rs6000_vector_mem[m]),
 	       (reg_addr[m].reload_store != CODE_FOR_nothing) ? 's' : '*',
-	       (reg_addr[m].reload_load != CODE_FOR_nothing) ? 'l' : '*');
+	       (reg_addr[m].reload_load != CODE_FOR_nothing) ? 'l' : '*',
+	       (reg_addr[m].scalar_in_vmx_p) ? 'y' : 'n');
     }
 
   fputs ("\n", stderr);
@@ -2050,6 +2049,9 @@  rs6000_debug_reg_global (void)
 	   "wd reg_class = %s\n"
 	   "wf reg_class = %s\n"
 	   "wg reg_class = %s\n"
+	   "wi reg_class = %s\n"
+	   "wj reg_class = %s\n"
+	   "wk reg_class = %s\n"
 	   "wl reg_class = %s\n"
 	   "wm reg_class = %s\n"
 	   "wr reg_class = %s\n"
@@ -2069,6 +2071,9 @@  rs6000_debug_reg_global (void)
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wd]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wf]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wg]],
+	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wi]],
+	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wj]],
+	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wk]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wl]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wm]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wr]],
@@ -2398,8 +2403,7 @@  rs6000_setup_reg_addr_masks (void)
 		  && !COMPLEX_MODE_P (m2)
 		  && !indexed_only_p
 		  && !(TARGET_E500_DOUBLE && GET_MODE_SIZE (m2) == 8)
-		  && !(m2 == DFmode && TARGET_UPPER_REGS_DF)
-		  && !(m2 == SFmode && TARGET_UPPER_REGS_SF))
+		  && !reg_addr[m2].scalar_in_vmx_p)
 		{
 		  addr_mask |= RELOAD_REG_PRE_INCDEC;
 
@@ -2630,37 +2634,46 @@  rs6000_init_hard_regno_mode_ok (bool glo
 	f  - Register class to use with traditional SFmode instructions.
 	v  - Altivec register.
 	wa - Any VSX register.
+	wc - Reserved to represent individual CR bits (used in LLVM).
 	wd - Preferred register class for V2DFmode.
 	wf - Preferred register class for V4SFmode.
 	wg - Float register for power6x move insns.
+	wh - FP register for direct move instructions.
+	wi - FP or VSX register to hold 64-bit integers.
+	wj - FP or VSX register to hold 64-bit integers for direct moves.
+	wk - FP or VSX register to hold 64-bit doubles for direct moves.
 	wl - Float register if we can do 32-bit signed int loads.
 	wm - VSX register for ISA 2.07 direct move operations.
+	wn - always NO_REGS.
 	wr - GPR if 64-bit mode is permitted.
 	ws - Register class to do ISA 2.06 DF operations.
+	wt - VSX register for TImode in VSX registers.
 	wu - Altivec register for ISA 2.07 VSX SF/SI load/stores.
 	wv - Altivec register for ISA 2.06 VSX DF/DI load/stores.
-	wt - VSX register for TImode in VSX registers.
 	ww - Register class to do SF conversions in with VSX operations.
 	wx - Float register if we can do 32-bit int stores.
 	wy - Register class to do ISA 2.07 SF operations.
 	wz - Float register if we can do 32-bit unsigned int loads.  */
 
   if (TARGET_HARD_FLOAT && TARGET_FPRS)
-    rs6000_constraints[RS6000_CONSTRAINT_f] = FLOAT_REGS;
+    rs6000_constraints[RS6000_CONSTRAINT_f] = FLOAT_REGS;	/* SFmode  */
 
   if (TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT)
-    rs6000_constraints[RS6000_CONSTRAINT_d] = FLOAT_REGS;
+    {
+      rs6000_constraints[RS6000_CONSTRAINT_d]  = FLOAT_REGS;	/* DFmode  */
+      rs6000_constraints[RS6000_CONSTRAINT_wi] = FLOAT_REGS;	/* DImode  */
+    }
 
   if (TARGET_VSX)
     {
       rs6000_constraints[RS6000_CONSTRAINT_wa] = VSX_REGS;
-      rs6000_constraints[RS6000_CONSTRAINT_wd] = VSX_REGS;
-      rs6000_constraints[RS6000_CONSTRAINT_wf] = VSX_REGS;
+      rs6000_constraints[RS6000_CONSTRAINT_wd] = VSX_REGS;	/* V2DFmode  */
+      rs6000_constraints[RS6000_CONSTRAINT_wf] = VSX_REGS;	/* V4SFmode  */
 
       if (TARGET_VSX_TIMODE)
-	rs6000_constraints[RS6000_CONSTRAINT_wt] = VSX_REGS;
+	rs6000_constraints[RS6000_CONSTRAINT_wt] = VSX_REGS;	/* TImode  */
 
-      if (TARGET_UPPER_REGS_DF)
+      if (TARGET_UPPER_REGS_DF)					/* DFmode  */
 	{
 	  rs6000_constraints[RS6000_CONSTRAINT_ws] = VSX_REGS;
 	  rs6000_constraints[RS6000_CONSTRAINT_wv] = ALTIVEC_REGS;
@@ -2674,19 +2687,26 @@  rs6000_init_hard_regno_mode_ok (bool glo
   if (TARGET_ALTIVEC)
     rs6000_constraints[RS6000_CONSTRAINT_v] = ALTIVEC_REGS;
 
-  if (TARGET_MFPGPR)
+  if (TARGET_MFPGPR)						/* DFmode  */
     rs6000_constraints[RS6000_CONSTRAINT_wg] = FLOAT_REGS;
 
   if (TARGET_LFIWAX)
-    rs6000_constraints[RS6000_CONSTRAINT_wl] = FLOAT_REGS;
+    rs6000_constraints[RS6000_CONSTRAINT_wl] = FLOAT_REGS;	/* DImode  */
 
   if (TARGET_DIRECT_MOVE)
-    rs6000_constraints[RS6000_CONSTRAINT_wm] = VSX_REGS;
+    {
+      rs6000_constraints[RS6000_CONSTRAINT_wh] = FLOAT_REGS;
+      rs6000_constraints[RS6000_CONSTRAINT_wj]			/* DImode  */
+	= rs6000_constraints[RS6000_CONSTRAINT_wi];
+      rs6000_constraints[RS6000_CONSTRAINT_wk]			/* DFmode  */
+	= rs6000_constraints[RS6000_CONSTRAINT_ws];
+      rs6000_constraints[RS6000_CONSTRAINT_wm] = VSX_REGS;
+    }
 
   if (TARGET_POWERPC64)
     rs6000_constraints[RS6000_CONSTRAINT_wr] = GENERAL_REGS;
 
-  if (TARGET_P8_VECTOR && TARGET_UPPER_REGS_SF)
+  if (TARGET_P8_VECTOR && TARGET_UPPER_REGS_SF)			/* SFmode  */
     {
       rs6000_constraints[RS6000_CONSTRAINT_wu] = ALTIVEC_REGS;
       rs6000_constraints[RS6000_CONSTRAINT_wy] = VSX_REGS;
@@ -2701,10 +2721,10 @@  rs6000_init_hard_regno_mode_ok (bool glo
     rs6000_constraints[RS6000_CONSTRAINT_ww] = FLOAT_REGS;
 
   if (TARGET_STFIWX)
-    rs6000_constraints[RS6000_CONSTRAINT_wx] = FLOAT_REGS;
+    rs6000_constraints[RS6000_CONSTRAINT_wx] = FLOAT_REGS;	/* DImode  */
 
   if (TARGET_LFIWZX)
-    rs6000_constraints[RS6000_CONSTRAINT_wz] = FLOAT_REGS;
+    rs6000_constraints[RS6000_CONSTRAINT_wz] = FLOAT_REGS;	/* DImode  */
 
   /* Set up the reload helper and direct move functions.  */
   if (TARGET_VSX || TARGET_ALTIVEC)
@@ -2727,10 +2747,11 @@  rs6000_init_hard_regno_mode_ok (bool glo
 	  reg_addr[V2DFmode].reload_load   = CODE_FOR_reload_v2df_di_load;
 	  if (TARGET_VSX && TARGET_UPPER_REGS_DF)
 	    {
-	      reg_addr[DFmode].reload_store  = CODE_FOR_reload_df_di_store;
-	      reg_addr[DFmode].reload_load   = CODE_FOR_reload_df_di_load;
-	      reg_addr[DDmode].reload_store  = CODE_FOR_reload_dd_di_store;
-	      reg_addr[DDmode].reload_load   = CODE_FOR_reload_dd_di_load;
+	      reg_addr[DFmode].reload_store    = CODE_FOR_reload_df_di_store;
+	      reg_addr[DFmode].reload_load     = CODE_FOR_reload_df_di_load;
+	      reg_addr[DFmode].scalar_in_vmx_p = true;
+	      reg_addr[DDmode].reload_store    = CODE_FOR_reload_dd_di_store;
+	      reg_addr[DDmode].reload_load     = CODE_FOR_reload_dd_di_load;
 	    }
 	  if (TARGET_P8_VECTOR)
 	    {
@@ -2738,6 +2759,8 @@  rs6000_init_hard_regno_mode_ok (bool glo
 	      reg_addr[SFmode].reload_load   = CODE_FOR_reload_sf_di_load;
 	      reg_addr[SDmode].reload_store  = CODE_FOR_reload_sd_di_store;
 	      reg_addr[SDmode].reload_load   = CODE_FOR_reload_sd_di_load;
+	      if (TARGET_UPPER_REGS_SF)
+		reg_addr[SFmode].scalar_in_vmx_p = true;
 	    }
 	  if (TARGET_VSX_TIMODE)
 	    {
@@ -2794,10 +2817,11 @@  rs6000_init_hard_regno_mode_ok (bool glo
 	  reg_addr[V2DFmode].reload_load   = CODE_FOR_reload_v2df_si_load;
 	  if (TARGET_VSX && TARGET_UPPER_REGS_DF)
 	    {
-	      reg_addr[DFmode].reload_store  = CODE_FOR_reload_df_si_store;
-	      reg_addr[DFmode].reload_load   = CODE_FOR_reload_df_si_load;
-	      reg_addr[DDmode].reload_store  = CODE_FOR_reload_dd_si_store;
-	      reg_addr[DDmode].reload_load   = CODE_FOR_reload_dd_si_load;
+	      reg_addr[DFmode].reload_store    = CODE_FOR_reload_df_si_store;
+	      reg_addr[DFmode].reload_load     = CODE_FOR_reload_df_si_load;
+	      reg_addr[DFmode].scalar_in_vmx_p = true;
+	      reg_addr[DDmode].reload_store    = CODE_FOR_reload_dd_si_store;
+	      reg_addr[DDmode].reload_load     = CODE_FOR_reload_dd_si_load;
 	    }
 	  if (TARGET_P8_VECTOR)
 	    {
@@ -2805,6 +2829,8 @@  rs6000_init_hard_regno_mode_ok (bool glo
 	      reg_addr[SFmode].reload_load   = CODE_FOR_reload_sf_si_load;
 	      reg_addr[SDmode].reload_store  = CODE_FOR_reload_sd_si_store;
 	      reg_addr[SDmode].reload_load   = CODE_FOR_reload_sd_si_load;
+	      if (TARGET_UPPER_REGS_SF)
+		reg_addr[SFmode].scalar_in_vmx_p = true;
 	    }
 	  if (TARGET_VSX_TIMODE)
 	    {
@@ -17216,7 +17242,14 @@  rs6000_preferred_reload_class (rtx x, en
      prefer Altivec loads..  */
   if (rclass == VSX_REGS)
     {
-      if (GET_MODE_SIZE (mode) <= 8)
+      if (MEM_P (x) && reg_addr[mode].scalar_in_vmx_p)
+	{
+	  rtx addr = XEXP (x, 0);
+	  if (rs6000_legitimate_offset_address_p (mode, addr, false, true)
+	      || legitimate_lo_sum_address_p (mode, addr, false))
+	    return FLOAT_REGS;
+	}
+      else if (GET_MODE_SIZE (mode) <= 8 && !reg_addr[mode].scalar_in_vmx_p)
 	return FLOAT_REGS;
 
       if (VECTOR_UNIT_ALTIVEC_P (mode) || VECTOR_MEM_ALTIVEC_P (mode)
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk)	(revision 213676)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -86,19 +86,26 @@  (define_mode_attr VSr	[(V16QI "v")
 			 (V4SF  "wf")
 			 (V2DI  "wd")
 			 (V2DF  "wd")
+			 (DI	"wi")
 			 (DF    "ws")
-			 (SF	"d")
+			 (SF	"ww")
 			 (V1TI  "v")
 			 (TI    "wt")])
 
-;; Map the register class used for float<->int conversions
+;; Map the register class used for float<->int conversions (floating point side)
+;; VSr2 is the preferred register class, VSr3 is any register class that will
+;; hold the data
 (define_mode_attr VSr2	[(V2DF  "wd")
 			 (V4SF  "wf")
-			 (DF    "ws")])
+			 (DF    "ws")
+			 (SF	"ww")
+			 (DI	"wi")])
 
 (define_mode_attr VSr3	[(V2DF  "wa")
 			 (V4SF  "wa")
-			 (DF    "ws")])
+			 (DF    "ws")
+			 (SF	"ww")
+			 (DI	"wi")])
 
 ;; Map the register class for sp<->dp float conversions, destination
 (define_mode_attr VSr4	[(SF	"ws")
@@ -106,12 +113,27 @@  (define_mode_attr VSr4	[(SF	"ws")
 			 (V2DF  "wd")
 			 (V4SF	"v")])
 
-;; Map the register class for sp<->dp float conversions, destination
+;; Map the register class for sp<->dp float conversions, source
 (define_mode_attr VSr5	[(SF	"ws")
 			 (DF	"f")
 			 (V2DF  "v")
 			 (V4SF	"wd")])
 
+;; The VSX register class that a type can occupy, even if it is not the
+;; preferred register class (VSr is the preferred register class that will get
+;; allocated first).
+(define_mode_attr VSa	[(V16QI "wa")
+			 (V8HI  "wa")
+			 (V4SI  "wa")
+			 (V4SF  "wa")
+			 (V2DI  "wa")
+			 (V2DF  "wa")
+			 (DI	"wi")
+			 (DF    "ws")
+			 (SF	"ww")
+			 (V1TI	"wa")
+			 (TI    "wt")])
+
 ;; Same size integer type for floating point data
 (define_mode_attr VSi [(V4SF  "v4si")
 		       (V2DF  "v2di")
@@ -207,6 +229,16 @@  (define_mode_attr VS_double [(V4SI	"V8SI
 			     (V2DF	"V4DF")
 			     (V1TI	"V2TI")])
 
+;; Map register class for 64-bit element in 128-bit vector for direct moves
+;; to/from gprs
+(define_mode_attr VS_64dm [(V2DF	"wk")
+			   (V2DI	"wj")])
+
+;; Map register class for 64-bit element in 128-bit vector for normal register
+;; to register moves
+(define_mode_attr VS_64reg [(V2DF	"ws")
+			    (V2DI	"wi")])
+
 ;; Constants for creating unspecs
 (define_c_enum "unspec"
   [UNSPEC_VSX_CONCAT
@@ -235,7 +267,7 @@  (define_c_enum "unspec"
 ;; The patterns for LE permuted loads and stores come before the general
 ;; VSX moves so they match first.
 (define_insn_and_split "*vsx_le_perm_load_<mode>"
-  [(set (match_operand:VSX_LE 0 "vsx_register_operand" "=wa")
+  [(set (match_operand:VSX_LE 0 "vsx_register_operand" "=<VSa>")
         (match_operand:VSX_LE 1 "memory_operand" "Z"))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX"
   "#"
@@ -258,7 +290,7 @@  (define_insn_and_split "*vsx_le_perm_loa
    (set_attr "length" "8")])
 
 (define_insn_and_split "*vsx_le_perm_load_<mode>"
-  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wa")
+  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=<VSa>")
         (match_operand:VSX_W 1 "memory_operand" "Z"))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX"
   "#"
@@ -350,7 +382,7 @@  (define_insn_and_split "*vsx_le_perm_loa
 
 (define_insn "*vsx_le_perm_store_<mode>"
   [(set (match_operand:VSX_LE 0 "memory_operand" "=Z")
-        (match_operand:VSX_LE 1 "vsx_register_operand" "+wa"))]
+        (match_operand:VSX_LE 1 "vsx_register_operand" "+<VSa>"))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX"
   "#"
   [(set_attr "type" "vecstore")
@@ -395,7 +427,7 @@  (define_split
 
 (define_insn "*vsx_le_perm_store_<mode>"
   [(set (match_operand:VSX_W 0 "memory_operand" "=Z")
-        (match_operand:VSX_W 1 "vsx_register_operand" "+wa"))]
+        (match_operand:VSX_W 1 "vsx_register_operand" "+<VSa>"))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX"
   "#"
   [(set_attr "type" "vecstore")
@@ -585,8 +617,8 @@  (define_split
 
 
 (define_insn "*vsx_mov<mode>"
-  [(set (match_operand:VSX_M 0 "nonimmediate_operand" "=Z,<VSr>,<VSr>,?Z,?wa,?wa,wQ,?&r,??Y,??r,??r,<VSr>,?wa,*r,v,wZ, v")
-	(match_operand:VSX_M 1 "input_operand" "<VSr>,Z,<VSr>,wa,Z,wa,r,wQ,r,Y,r,j,j,j,W,v,wZ"))]
+  [(set (match_operand:VSX_M 0 "nonimmediate_operand" "=Z,<VSr>,<VSr>,?Z,?<VSa>,?<VSa>,wQ,?&r,??Y,??r,??r,<VSr>,?<VSa>,*r,v,wZ, v")
+	(match_operand:VSX_M 1 "input_operand" "<VSr>,Z,<VSr>,<VSa>,Z,<VSa>,r,wQ,r,Y,r,j,j,j,W,v,wZ"))]
   "VECTOR_MEM_VSX_P (<MODE>mode)
    && (register_operand (operands[0], <MODE>mode) 
        || register_operand (operands[1], <MODE>mode))"
@@ -689,36 +721,36 @@  (define_expand "vsx_store_<mode>"
 ;; instructions are now combined with the insn for the traditional floating
 ;; point unit.
 (define_insn "*vsx_add<mode>3"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
-        (plus:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
-		    (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
+        (plus:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSa>")
+		    (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,<VSa>")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvadd<VSs> %x0,%x1,%x2"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_sub<mode>3"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
-        (minus:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
-		     (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
+        (minus:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSa>")
+		     (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,<VSa>")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvsub<VSs> %x0,%x1,%x2"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_mul<mode>3"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
-        (mult:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
-		    (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
+        (mult:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSa>")
+		    (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,<VSa>")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvmul<VSs> %x0,%x1,%x2"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_mul>")])
 
 (define_insn "*vsx_div<mode>3"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
-        (div:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
-		   (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
+        (div:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSa>")
+		   (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,<VSa>")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvdiv<VSs> %x0,%x1,%x2"
   [(set_attr "type" "<VStype_div>")
@@ -754,8 +786,8 @@  (define_expand "vsx_tdiv<mode>3_fe"
 
 (define_insn "*vsx_tdiv<mode>3_internal"
   [(set (match_operand:CCFP 0 "cc_reg_operand" "=x,x")
-	(unspec:CCFP [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")
-		      (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")]
+	(unspec:CCFP [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,<VSa>")
+		      (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,<VSa>")]
 		   UNSPEC_VSX_TDIV))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "x<VSv>tdiv<VSs> %0,%x1,%x2"
@@ -763,8 +795,8 @@  (define_insn "*vsx_tdiv<mode>3_internal"
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_fre<mode>2"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
-	(unspec:VSX_F [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
+	(unspec:VSX_F [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSa>")]
 		      UNSPEC_FRES))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvre<VSs> %x0,%x1"
@@ -772,60 +804,60 @@  (define_insn "vsx_fre<mode>2"
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_neg<mode>2"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
-        (neg:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
+        (neg:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSa>")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvneg<VSs> %x0,%x1"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_abs<mode>2"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
-        (abs:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
+        (abs:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSa>")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvabs<VSs> %x0,%x1"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_nabs<mode>2"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
         (neg:VSX_F
 	 (abs:VSX_F
-	  (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa"))))]
+	  (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSa>"))))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvnabs<VSs> %x0,%x1"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_smax<mode>3"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
-        (smax:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
-		    (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
+        (smax:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSa>")
+		    (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,<VSa>")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvmax<VSs> %x0,%x1,%x2"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_smin<mode>3"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
-        (smin:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
-		    (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
+        (smin:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSa>")
+		    (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,<VSa>")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvmin<VSs> %x0,%x1,%x2"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_sqrt<mode>2"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
-        (sqrt:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
+        (sqrt:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSa>")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvsqrt<VSs> %x0,%x1"
   [(set_attr "type" "<VStype_sqrt>")
    (set_attr "fp_type" "<VSfptype_sqrt>")])
 
 (define_insn "*vsx_rsqrte<mode>2"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
-	(unspec:VSX_F [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
+	(unspec:VSX_F [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSa>")]
 		      UNSPEC_RSQRT))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvrsqrte<VSs> %x0,%x1"
@@ -860,7 +892,7 @@  (define_expand "vsx_tsqrt<mode>2_fe"
 
 (define_insn "*vsx_tsqrt<mode>2_internal"
   [(set (match_operand:CCFP 0 "cc_reg_operand" "=x,x")
-	(unspec:CCFP [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
+	(unspec:CCFP [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,<VSa>")]
 		     UNSPEC_VSX_TSQRT))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "x<VSv>tsqrt<VSs> %0,%x1"
@@ -902,12 +934,12 @@  (define_insn "*vsx_fmav2df4"
   [(set_attr "type" "vecdouble")])
 
 (define_insn "*vsx_fms<mode>4"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,<VSr>,?wa,?wa")
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,<VSr>,?<VSa>,?<VSa>")
 	(fma:VSX_F
-	  (match_operand:VSX_F 1 "vsx_register_operand" "%<VSr>,<VSr>,wa,wa")
-	  (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,0,wa,0")
+	  (match_operand:VSX_F 1 "vsx_register_operand" "%<VSr>,<VSr>,<VSa>,<VSa>")
+	  (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,0,<VSa>,0")
 	  (neg:VSX_F
-	    (match_operand:VSX_F 3 "vsx_register_operand" "0,<VSr>,0,wa"))))]
+	    (match_operand:VSX_F 3 "vsx_register_operand" "0,<VSr>,0,<VSa>"))))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "@
    xvmsuba<VSs> %x0,%x1,%x2
@@ -917,12 +949,12 @@  (define_insn "*vsx_fms<mode>4"
   [(set_attr "type" "<VStype_mul>")])
 
 (define_insn "*vsx_nfma<mode>4"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,<VSr>,?wa,?wa")
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,<VSr>,?<VSa>,?<VSa>")
 	(neg:VSX_F
 	 (fma:VSX_F
-	  (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSr>,wa,wa")
-	  (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,0,wa,0")
-	  (match_operand:VSX_F 3 "vsx_register_operand" "0,<VSr>,0,wa"))))]
+	  (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSr>,<VSa>,<VSa>")
+	  (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,0,<VSa>,0")
+	  (match_operand:VSX_F 3 "vsx_register_operand" "0,<VSr>,0,<VSa>"))))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "@
    xvnmadda<VSs> %x0,%x1,%x2
@@ -967,27 +999,27 @@  (define_insn "*vsx_nfmsv2df4"
 
 ;; Vector conditional expressions (no scalar version for these instructions)
 (define_insn "vsx_eq<mode>"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
-	(eq:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
-		  (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
+	(eq:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSa>")
+		  (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,<VSa>")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvcmpeq<VSs> %x0,%x1,%x2"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_gt<mode>"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
-	(gt:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
-		  (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
+	(gt:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSa>")
+		  (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,<VSa>")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvcmpgt<VSs> %x0,%x1,%x2"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_ge<mode>"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
-	(ge:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
-		  (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
+	(ge:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSa>")
+		  (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,<VSa>")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvcmpge<VSs> %x0,%x1,%x2"
   [(set_attr "type" "<VStype_simple>")
@@ -998,10 +1030,10 @@  (define_insn "*vsx_ge<mode>"
 (define_insn "*vsx_eq_<mode>_p"
   [(set (reg:CC 74)
 	(unspec:CC
-	 [(eq:CC (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,?wa")
-		 (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,?wa"))]
+	 [(eq:CC (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,?<VSa>")
+		 (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,?<VSa>"))]
 	 UNSPEC_PREDICATE))
-   (set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+   (set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
 	(eq:VSX_F (match_dup 1)
 		  (match_dup 2)))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
@@ -1011,10 +1043,10 @@  (define_insn "*vsx_eq_<mode>_p"
 (define_insn "*vsx_gt_<mode>_p"
   [(set (reg:CC 74)
 	(unspec:CC
-	 [(gt:CC (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,?wa")
-		 (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,?wa"))]
+	 [(gt:CC (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,?<VSa>")
+		 (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,?<VSa>"))]
 	 UNSPEC_PREDICATE))
-   (set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+   (set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
 	(gt:VSX_F (match_dup 1)
 		  (match_dup 2)))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
@@ -1024,10 +1056,10 @@  (define_insn "*vsx_gt_<mode>_p"
 (define_insn "*vsx_ge_<mode>_p"
   [(set (reg:CC 74)
 	(unspec:CC
-	 [(ge:CC (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,?wa")
-		 (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,?wa"))]
+	 [(ge:CC (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,?<VSa>")
+		 (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,?<VSa>"))]
 	 UNSPEC_PREDICATE))
-   (set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+   (set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
 	(ge:VSX_F (match_dup 1)
 		  (match_dup 2)))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
@@ -1036,33 +1068,33 @@  (define_insn "*vsx_ge_<mode>_p"
 
 ;; Vector select
 (define_insn "*vsx_xxsel<mode>"
-  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
+  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?<VSa>")
 	(if_then_else:VSX_L
-	 (ne:CC (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
+	 (ne:CC (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,<VSa>")
 		(match_operand:VSX_L 4 "zero_constant" ""))
-	 (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
-	 (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa")))]
+	 (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,<VSa>")
+	 (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,<VSa>")))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
   "xxsel %x0,%x3,%x2,%x1"
   [(set_attr "type" "vecperm")])
 
 (define_insn "*vsx_xxsel<mode>_uns"
-  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
+  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?<VSa>")
 	(if_then_else:VSX_L
-	 (ne:CCUNS (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
+	 (ne:CCUNS (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,<VSa>")
 		   (match_operand:VSX_L 4 "zero_constant" ""))
-	 (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
-	 (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa")))]
+	 (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,<VSa>")
+	 (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,<VSa>")))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
   "xxsel %x0,%x3,%x2,%x1"
   [(set_attr "type" "vecperm")])
 
 ;; Copy sign
 (define_insn "vsx_copysign<mode>3"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
 	(unspec:VSX_F
-	 [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
-	  (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")]
+	 [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSa>")
+	  (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,<VSa>")]
 	 UNSPEC_COPYSIGN))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvcpsgn<VSs> %x0,%x2,%x1"
@@ -1075,7 +1107,7 @@  (define_insn "vsx_copysign<mode>3"
 ;; in rs6000.md so don't test VECTOR_UNIT_VSX_P, just test against VSX.
 ;; Don't use vsx_register_operand here, use gpc_reg_operand to match rs6000.md.
 (define_insn "vsx_float<VSi><mode>2"
-  [(set (match_operand:VSX_B 0 "gpc_reg_operand" "=<VSr>,?wa")
+  [(set (match_operand:VSX_B 0 "gpc_reg_operand" "=<VSr>,?<VSa>")
 	(float:VSX_B (match_operand:<VSI> 1 "gpc_reg_operand" "<VSr2>,<VSr3>")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "x<VSv>cvsx<VSc><VSs> %x0,%x1"
@@ -1083,7 +1115,7 @@  (define_insn "vsx_float<VSi><mode>2"
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_floatuns<VSi><mode>2"
-  [(set (match_operand:VSX_B 0 "gpc_reg_operand" "=<VSr>,?wa")
+  [(set (match_operand:VSX_B 0 "gpc_reg_operand" "=<VSr>,?<VSa>")
 	(unsigned_float:VSX_B (match_operand:<VSI> 1 "gpc_reg_operand" "<VSr2>,<VSr3>")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "x<VSv>cvux<VSc><VSs> %x0,%x1"
@@ -1092,7 +1124,7 @@  (define_insn "vsx_floatuns<VSi><mode>2"
 
 (define_insn "vsx_fix_trunc<mode><VSi>2"
   [(set (match_operand:<VSI> 0 "gpc_reg_operand" "=<VSr2>,?<VSr3>")
-	(fix:<VSI> (match_operand:VSX_B 1 "gpc_reg_operand" "<VSr>,wa")))]
+	(fix:<VSI> (match_operand:VSX_B 1 "gpc_reg_operand" "<VSr>,<VSa>")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "x<VSv>cv<VSs>sx<VSc>s %x0,%x1"
   [(set_attr "type" "<VStype_simple>")
@@ -1100,7 +1132,7 @@  (define_insn "vsx_fix_trunc<mode><VSi>2"
 
 (define_insn "vsx_fixuns_trunc<mode><VSi>2"
   [(set (match_operand:<VSI> 0 "gpc_reg_operand" "=<VSr2>,?<VSr3>")
-	(unsigned_fix:<VSI> (match_operand:VSX_B 1 "gpc_reg_operand" "<VSr>,wa")))]
+	(unsigned_fix:<VSI> (match_operand:VSX_B 1 "gpc_reg_operand" "<VSr>,<VSa>")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "x<VSv>cv<VSs>ux<VSc>s %x0,%x1"
   [(set_attr "type" "<VStype_simple>")
@@ -1108,8 +1140,8 @@  (define_insn "vsx_fixuns_trunc<mode><VSi
 
 ;; Math rounding functions
 (define_insn "vsx_x<VSv>r<VSs>i"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-	(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
+  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?<VSa>")
+	(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,<VSa>")]
 		      UNSPEC_VSX_ROUND_I))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "x<VSv>r<VSs>i %x0,%x1"
@@ -1117,8 +1149,8 @@  (define_insn "vsx_x<VSv>r<VSs>i"
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_x<VSv>r<VSs>ic"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-	(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
+  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?<VSa>")
+	(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,<VSa>")]
 		      UNSPEC_VSX_ROUND_IC))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "x<VSv>r<VSs>ic %x0,%x1"
@@ -1126,16 +1158,16 @@  (define_insn "vsx_x<VSv>r<VSs>ic"
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_btrunc<mode>2"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
-	(fix:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
+	(fix:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSa>")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvr<VSs>iz %x0,%x1"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_b2trunc<mode>2"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-	(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
+  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?<VSa>")
+	(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,<VSa>")]
 		      UNSPEC_FRIZ))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "x<VSv>r<VSs>iz %x0,%x1"
@@ -1143,8 +1175,8 @@  (define_insn "*vsx_b2trunc<mode>2"
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_floor<mode>2"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
-	(unspec:VSX_F [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
+	(unspec:VSX_F [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSa>")]
 		      UNSPEC_FRIM))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvr<VSs>im %x0,%x1"
@@ -1152,8 +1184,8 @@  (define_insn "vsx_floor<mode>2"
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_ceil<mode>2"
-  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
-	(unspec:VSX_F [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?<VSa>")
+	(unspec:VSX_F [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,<VSa>")]
 		      UNSPEC_FRIP))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvr<VSs>ip %x0,%x1"
@@ -1168,8 +1200,8 @@  (define_insn "vsx_ceil<mode>2"
 ;; scalar single precision instructions internally use the double format.
 ;; Prefer the altivec registers, since we likely will need to do a vperm
 (define_insn "vsx_<VS_spdp_insn>"
-  [(set (match_operand:<VS_spdp_res> 0 "vsx_register_operand" "=<VSr4>,?wa")
-	(unspec:<VS_spdp_res> [(match_operand:VSX_SPDP 1 "vsx_register_operand" "<VSr5>,wa")]
+  [(set (match_operand:<VS_spdp_res> 0 "vsx_register_operand" "=<VSr4>,?<VSa>")
+	(unspec:<VS_spdp_res> [(match_operand:VSX_SPDP 1 "vsx_register_operand" "<VSr5>,<VSa>")]
 			      UNSPEC_VSX_CVSPDP))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "<VS_spdp_insn> %x0,%x1"
@@ -1177,8 +1209,8 @@  (define_insn "vsx_<VS_spdp_insn>"
 
 ;; xscvspdp, represent the scalar SF type as V4SF
 (define_insn "vsx_xscvspdp"
-  [(set (match_operand:DF 0 "vsx_register_operand" "=ws,?wa")
-	(unspec:DF [(match_operand:V4SF 1 "vsx_register_operand" "wa,wa")]
+  [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+	(unspec:DF [(match_operand:V4SF 1 "vsx_register_operand" "wa")]
 		   UNSPEC_VSX_CVSPDP))]
   "VECTOR_UNIT_VSX_P (V4SFmode)"
   "xscvspdp %x0,%x1"
@@ -1205,7 +1237,7 @@  (define_insn "vsx_xscvspdp_scalar2"
 
 ;; ISA 2.07 xscvdpspn/xscvspdpn that does not raise an error on signalling NaNs
 (define_insn "vsx_xscvdpspn"
-  [(set (match_operand:V4SF 0 "vsx_register_operand" "=ws,?wa")
+  [(set (match_operand:V4SF 0 "vsx_register_operand" "=ww,?ww")
 	(unspec:V4SF [(match_operand:DF 1 "vsx_register_operand" "wd,wa")]
 		     UNSPEC_VSX_CVDPSPN))]
   "TARGET_XSCVDPSPN"
@@ -1213,16 +1245,16 @@  (define_insn "vsx_xscvdpspn"
   [(set_attr "type" "fp")])
 
 (define_insn "vsx_xscvspdpn"
-  [(set (match_operand:DF 0 "vsx_register_operand" "=ws,?wa")
-	(unspec:DF [(match_operand:V4SF 1 "vsx_register_operand" "wa,wa")]
+  [(set (match_operand:DF 0 "vsx_register_operand" "=ws,?ws")
+	(unspec:DF [(match_operand:V4SF 1 "vsx_register_operand" "wf,wa")]
 		   UNSPEC_VSX_CVSPDPN))]
   "TARGET_XSCVSPDPN"
   "xscvspdpn %x0,%x1"
   [(set_attr "type" "fp")])
 
 (define_insn "vsx_xscvdpspn_scalar"
-  [(set (match_operand:V4SF 0 "vsx_register_operand" "=wa")
-	(unspec:V4SF [(match_operand:SF 1 "vsx_register_operand" "f")]
+  [(set (match_operand:V4SF 0 "vsx_register_operand" "=wf,?wa")
+	(unspec:V4SF [(match_operand:SF 1 "vsx_register_operand" "ww,ww")]
 		     UNSPEC_VSX_CVDPSPN))]
   "TARGET_XSCVDPSPN"
   "xscvdpspn %x0,%x1"
@@ -1310,10 +1342,10 @@  (define_insn "vsx_xvcvspuxds"
 ;; since the xsrdpiz instruction does not truncate the value if the floating
 ;; point value is < LONG_MIN or > LONG_MAX.
 (define_insn "*vsx_float_fix_<mode>2"
-  [(set (match_operand:VSX_DF 0 "vsx_register_operand" "=<VSr>,?wa")
+  [(set (match_operand:VSX_DF 0 "vsx_register_operand" "=<VSr>,?<VSa>")
 	(float:VSX_DF
 	 (fix:<VSI>
-	  (match_operand:VSX_DF 1 "vsx_register_operand" "<VSr>,?wa"))))]
+	  (match_operand:VSX_DF 1 "vsx_register_operand" "<VSr>,?<VSa>"))))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
    && VECTOR_UNIT_VSX_P (<MODE>mode) && flag_unsafe_math_optimizations
    && !flag_trapping_math && TARGET_FRIZ"
@@ -1326,10 +1358,10 @@  (define_insn "*vsx_float_fix_<mode>2"
 
 ;; Build a V2DF/V2DI vector from two scalars
 (define_insn "vsx_concat_<mode>"
-  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=<VSr>,?wa")
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=<VSr>,?<VSa>")
 	(vec_concat:VSX_D
-	 (match_operand:<VS_scalar> 1 "vsx_register_operand" "ws,wa")
-	 (match_operand:<VS_scalar> 2 "vsx_register_operand" "ws,wa")))]
+	 (match_operand:<VS_scalar> 1 "vsx_register_operand" "ws,<VSa>")
+	 (match_operand:<VS_scalar> 2 "vsx_register_operand" "ws,<VSa>")))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
 {
   if (BYTES_BIG_ENDIAN)
@@ -1360,18 +1392,18 @@  (define_insn "vsx_concat_v2sf"
 ;; xxpermdi for little endian loads and stores.  We need several of
 ;; these since the form of the PARALLEL differs by mode.
 (define_insn "*vsx_xxpermdi2_le_<mode>"
-  [(set (match_operand:VSX_LE 0 "vsx_register_operand" "=wa")
+  [(set (match_operand:VSX_LE 0 "vsx_register_operand" "=<VSa>")
         (vec_select:VSX_LE
-          (match_operand:VSX_LE 1 "vsx_register_operand" "wa")
+          (match_operand:VSX_LE 1 "vsx_register_operand" "<VSa>")
           (parallel [(const_int 1) (const_int 0)])))]
   "!BYTES_BIG_ENDIAN && VECTOR_MEM_VSX_P (<MODE>mode)"
   "xxpermdi %x0,%x1,%x1,2"
   [(set_attr "type" "vecperm")])
 
 (define_insn "*vsx_xxpermdi4_le_<mode>"
-  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wa")
+  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=<VSa>")
         (vec_select:VSX_W
-          (match_operand:VSX_W 1 "vsx_register_operand" "wa")
+          (match_operand:VSX_W 1 "vsx_register_operand" "<VSa>")
           (parallel [(const_int 2) (const_int 3)
                      (const_int 0) (const_int 1)])))]
   "!BYTES_BIG_ENDIAN && VECTOR_MEM_VSX_P (<MODE>mode)"
@@ -1409,7 +1441,7 @@  (define_insn "*vsx_xxpermdi16_le_V16QI"
 ;; lxvd2x for little endian loads.  We need several of
 ;; these since the form of the PARALLEL differs by mode.
 (define_insn "*vsx_lxvd2x2_le_<mode>"
-  [(set (match_operand:VSX_LE 0 "vsx_register_operand" "=wa")
+  [(set (match_operand:VSX_LE 0 "vsx_register_operand" "=<VSa>")
         (vec_select:VSX_LE
           (match_operand:VSX_LE 1 "memory_operand" "Z")
           (parallel [(const_int 1) (const_int 0)])))]
@@ -1418,7 +1450,7 @@  (define_insn "*vsx_lxvd2x2_le_<mode>"
   [(set_attr "type" "vecload")])
 
 (define_insn "*vsx_lxvd2x4_le_<mode>"
-  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wa")
+  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=<VSa>")
         (vec_select:VSX_W
           (match_operand:VSX_W 1 "memory_operand" "Z")
           (parallel [(const_int 2) (const_int 3)
@@ -1460,7 +1492,7 @@  (define_insn "*vsx_lxvd2x16_le_V16QI"
 (define_insn "*vsx_stxvd2x2_le_<mode>"
   [(set (match_operand:VSX_LE 0 "memory_operand" "=Z")
         (vec_select:VSX_LE
-          (match_operand:VSX_LE 1 "vsx_register_operand" "wa")
+          (match_operand:VSX_LE 1 "vsx_register_operand" "<VSa>")
           (parallel [(const_int 1) (const_int 0)])))]
   "!BYTES_BIG_ENDIAN && VECTOR_MEM_VSX_P (<MODE>mode)"
   "stxvd2x %x1,%y0"
@@ -1469,7 +1501,7 @@  (define_insn "*vsx_stxvd2x2_le_<mode>"
 (define_insn "*vsx_stxvd2x4_le_<mode>"
   [(set (match_operand:VSX_W 0 "memory_operand" "=Z")
         (vec_select:VSX_W
-          (match_operand:VSX_W 1 "vsx_register_operand" "wa")
+          (match_operand:VSX_W 1 "vsx_register_operand" "<VSa>")
           (parallel [(const_int 2) (const_int 3)
                      (const_int 0) (const_int 1)])))]
   "!BYTES_BIG_ENDIAN && VECTOR_MEM_VSX_P (<MODE>mode)"
@@ -1521,11 +1553,12 @@  (define_expand "vsx_set_v1ti"
 
 ;; Set the element of a V2DI/VD2F mode
 (define_insn "vsx_set_<mode>"
-  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wd,?wa")
-	(unspec:VSX_D [(match_operand:VSX_D 1 "vsx_register_operand" "wd,wa")
-		       (match_operand:<VS_scalar> 2 "vsx_register_operand" "ws,wa")
-		       (match_operand:QI 3 "u5bit_cint_operand" "i,i")]
-		      UNSPEC_VSX_SET))]
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wd,?<VSa>")
+	(unspec:VSX_D
+	 [(match_operand:VSX_D 1 "vsx_register_operand" "wd,<VSa>")
+	  (match_operand:<VS_scalar> 2 "vsx_register_operand" "<VS_64reg>,<VSa>")
+	  (match_operand:QI 3 "u5bit_cint_operand" "i,i")]
+	 UNSPEC_VSX_SET))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
 {
   int idx_first = BYTES_BIG_ENDIAN ? 0 : 1;
@@ -1550,11 +1583,11 @@  (define_expand "vsx_extract_<mode>"
 ;; Optimize cases were we can do a simple or direct move.
 ;; Or see if we can avoid doing the move at all
 (define_insn "*vsx_extract_<mode>_internal1"
-  [(set (match_operand:<VS_scalar> 0 "register_operand" "=d,ws,?wa,r")
+  [(set (match_operand:<VS_scalar> 0 "register_operand" "=d,<VS_64reg>,r")
 	(vec_select:<VS_scalar>
-	 (match_operand:VSX_D 1 "register_operand" "d,wd,wa,wm")
+	 (match_operand:VSX_D 1 "register_operand" "d,<VS_64reg>,<VS_64dm>")
 	 (parallel
-	  [(match_operand:QI 2 "vsx_scalar_64bit" "wD,wD,wD,wD")])))]
+	  [(match_operand:QI 2 "vsx_scalar_64bit" "wD,wD,wD")])))]
   "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
 {
   int op0_regno = REGNO (operands[0]);
@@ -1571,14 +1604,14 @@  (define_insn "*vsx_extract_<mode>_intern
 
   return "xxlor %x0,%x1,%x1";
 }
-  [(set_attr "type" "fp,vecsimple,vecsimple,mftgpr")
+  [(set_attr "type" "fp,vecsimple,mftgpr")
    (set_attr "length" "4")])
 
 (define_insn "*vsx_extract_<mode>_internal2"
-  [(set (match_operand:<VS_scalar> 0 "vsx_register_operand" "=d,ws,ws,?wa")
+  [(set (match_operand:<VS_scalar> 0 "vsx_register_operand" "=d,<VS_64reg>,<VS_64reg>")
 	(vec_select:<VS_scalar>
-	 (match_operand:VSX_D 1 "vsx_register_operand" "d,wd,wd,wa")
-	 (parallel [(match_operand:QI 2 "u5bit_cint_operand" "wD,wD,i,i")])))]
+	 (match_operand:VSX_D 1 "vsx_register_operand" "d,wd,wd")
+	 (parallel [(match_operand:QI 2 "u5bit_cint_operand" "wD,wD,i")])))]
   "VECTOR_MEM_VSX_P (<MODE>mode)
    && (!TARGET_POWERPC64 || !TARGET_DIRECT_MOVE
        || INTVAL (operands[2]) != VECTOR_ELEMENT_SCALAR_64BIT)"
@@ -1606,7 +1639,7 @@  (define_insn "*vsx_extract_<mode>_intern
   operands[3] = GEN_INT (fldDM);
   return "xxpermdi %x0,%x1,%x1,%3";
 }
-  [(set_attr "type" "fp,vecsimple,vecperm,vecperm")
+  [(set_attr "type" "fp,vecsimple,vecperm")
    (set_attr "length" "4")])
 
 ;; Optimize extracting a single scalar element from memory if the scalar is in
@@ -1629,7 +1662,7 @@  (define_insn "*vsx_extract_<mode>_load"
 (define_insn "*vsx_extract_<mode>_store"
   [(set (match_operand:<VS_scalar> 0 "memory_operand" "=m,Z,?Z")
 	(vec_select:<VS_scalar>
-	 (match_operand:VSX_D 1 "register_operand" "d,wd,wa")
+	 (match_operand:VSX_D 1 "register_operand" "d,wd,<VSa>")
 	 (parallel [(match_operand:QI 2 "vsx_scalar_64bit" "wD,wD,wD")])))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
   "@
@@ -1643,7 +1676,7 @@  (define_insn "*vsx_extract_<mode>_store"
 (define_insn_and_split "vsx_extract_v4sf"
   [(set (match_operand:SF 0 "vsx_register_operand" "=f,f")
 	(vec_select:SF
-	 (match_operand:V4SF 1 "vsx_register_operand" "wa,wa")
+	 (match_operand:V4SF 1 "vsx_register_operand" "<VSa>,<VSa>")
 	 (parallel [(match_operand:QI 2 "u5bit_cint_operand" "O,i")])))
    (clobber (match_scratch:V4SF 3 "=X,0"))]
   "VECTOR_UNIT_VSX_P (V4SFmode)"
@@ -1826,9 +1859,9 @@  (define_expand "vsx_mergeh_<mode>"
 
 ;; V2DF/V2DI splat
 (define_insn "vsx_splat_<mode>"
-  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wd,wd,wd,?wa,?wa,?wa")
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wd,wd,wd,?<VSa>,?<VSa>,?<VSa>")
 	(vec_duplicate:VSX_D
-	 (match_operand:<VS_scalar> 1 "splat_input_operand" "ws,f,Z,wa,wa,Z")))]
+	 (match_operand:<VS_scalar> 1 "splat_input_operand" "<VS_64reg>,f,Z,<VSa>,<VSa>,Z")))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
   "@
    xxpermdi %x0,%x1,%x1,0
@@ -1841,10 +1874,10 @@  (define_insn "vsx_splat_<mode>"
 
 ;; V4SF/V4SI splat
 (define_insn "vsx_xxspltw_<mode>"
-  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wf,?wa")
+  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wf,?<VSa>")
 	(vec_duplicate:VSX_W
 	 (vec_select:<VS_scalar>
-	  (match_operand:VSX_W 1 "vsx_register_operand" "wf,wa")
+	  (match_operand:VSX_W 1 "vsx_register_operand" "wf,<VSa>")
 	  (parallel
 	   [(match_operand:QI 2 "u5bit_cint_operand" "i,i")]))))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
@@ -1857,8 +1890,8 @@  (define_insn "vsx_xxspltw_<mode>"
   [(set_attr "type" "vecperm")])
 
 (define_insn "vsx_xxspltw_<mode>_direct"
-  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wf,?wa")
-        (unspec:VSX_W [(match_operand:VSX_W 1 "vsx_register_operand" "wf,wa")
+  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wf,?<VSa>")
+        (unspec:VSX_W [(match_operand:VSX_W 1 "vsx_register_operand" "wf,<VSa>")
                        (match_operand:QI 2 "u5bit_cint_operand" "i,i")]
                       UNSPEC_VSX_XXSPLTW))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
@@ -1867,11 +1900,11 @@  (define_insn "vsx_xxspltw_<mode>_direct"
 
 ;; V4SF/V4SI interleave
 (define_insn "vsx_xxmrghw_<mode>"
-  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wf,?wa")
+  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wf,?<VSa>")
         (vec_select:VSX_W
 	  (vec_concat:<VS_double>
-	    (match_operand:VSX_W 1 "vsx_register_operand" "wf,wa")
-	    (match_operand:VSX_W 2 "vsx_register_operand" "wf,wa"))
+	    (match_operand:VSX_W 1 "vsx_register_operand" "wf,<VSa>")
+	    (match_operand:VSX_W 2 "vsx_register_operand" "wf,<VSa>"))
 	  (parallel [(const_int 0) (const_int 4)
 		     (const_int 1) (const_int 5)])))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
@@ -1884,11 +1917,11 @@  (define_insn "vsx_xxmrghw_<mode>"
   [(set_attr "type" "vecperm")])
 
 (define_insn "vsx_xxmrglw_<mode>"
-  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wf,?wa")
+  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wf,?<VSa>")
 	(vec_select:VSX_W
 	  (vec_concat:<VS_double>
-	    (match_operand:VSX_W 1 "vsx_register_operand" "wf,wa")
-	    (match_operand:VSX_W 2 "vsx_register_operand" "wf,?wa"))
+	    (match_operand:VSX_W 1 "vsx_register_operand" "wf,<VSa>")
+	    (match_operand:VSX_W 2 "vsx_register_operand" "wf,?<VSa>"))
 	  (parallel [(const_int 2) (const_int 6)
 		     (const_int 3) (const_int 7)])))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
@@ -1902,9 +1935,9 @@  (define_insn "vsx_xxmrglw_<mode>"
 
 ;; Shift left double by word immediate
 (define_insn "vsx_xxsldwi_<mode>"
-  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=wa")
-	(unspec:VSX_L [(match_operand:VSX_L 1 "vsx_register_operand" "wa")
-		       (match_operand:VSX_L 2 "vsx_register_operand" "wa")
+  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSa>")
+	(unspec:VSX_L [(match_operand:VSX_L 1 "vsx_register_operand" "<VSa>")
+		       (match_operand:VSX_L 2 "vsx_register_operand" "<VSa>")
 		       (match_operand:QI 3 "u5bit_cint_operand" "i")]
 		      UNSPEC_VSX_SLDWI))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk)	(revision 213676)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -1477,6 +1477,10 @@  enum r6000_reg_class_enum {
   RS6000_CONSTRAINT_wd,		/* VSX register for V2DF */
   RS6000_CONSTRAINT_wf,		/* VSX register for V4SF */
   RS6000_CONSTRAINT_wg,		/* FPR register for -mmfpgpr */
+  RS6000_CONSTRAINT_wh,		/* FPR register for direct moves.  */
+  RS6000_CONSTRAINT_wi,		/* FPR/VSX register to hold DImode */
+  RS6000_CONSTRAINT_wj,		/* FPR/VSX register for DImode direct moves. */
+  RS6000_CONSTRAINT_wk,		/* FPR/VSX register for DFmode direct moves. */
   RS6000_CONSTRAINT_wl,		/* FPR register for LFIWAX */
   RS6000_CONSTRAINT_wm,		/* VSX register for direct move */
   RS6000_CONSTRAINT_wr,		/* GPR register if 64-bit  */
@@ -1501,6 +1505,9 @@  extern enum reg_class rs6000_constraints
 #define VSX_REG_CLASS_P(CLASS)			\
   ((CLASS) == VSX_REGS || (CLASS) == FLOAT_REGS || (CLASS) == ALTIVEC_REGS)
 
+/* Return whether a given register class targets general purpose registers.  */
+#define GPR_REG_CLASS_P(CLASS) ((CLASS) == GENERAL_REGS || (CLASS) == BASE_REGS)
+
 /* Given an rtx X being reloaded into a reg required to be
    in class CLASS, return the class of reg to actually use.
    In general this is just CLASS; but on some machines
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk)	(revision 213676)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -394,7 +394,7 @@  (define_mode_attr f32_si [(SF "stfs%U0%X
 (define_mode_attr f32_sv [(SF "stxsspx %x1,%y0")  (SD "stxsiwzx %x1,%y0")])
 
 ; Definitions for 32-bit fpr direct move
-(define_mode_attr f32_dm [(SF "wn") (SD "wm")])
+(define_mode_attr f32_dm [(SF "wn") (SD "wh")])
 
 ; These modes do not fit in integer registers in 32-bit mode.
 ; but on e500v2, the gpr are 64 bit registers
@@ -638,7 +638,7 @@  (define_split
   "")
 
 (define_insn "*zero_extendsidi2_lfiwzx"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r,??wm,!wz,!wu")
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r,??wj,!wz,!wu")
 	(zero_extend:DI (match_operand:SI 1 "reg_or_mem_operand" "m,r,r,Z,Z")))]
   "TARGET_POWERPC64 && TARGET_LFIWZX"
   "@
@@ -790,7 +790,7 @@  (define_expand "extendsidi2"
   "")
 
 (define_insn "*extendsidi2_lfiwax"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r,??wm,!wl,!wu")
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r,??wj,!wl,!wu")
 	(sign_extend:DI (match_operand:SI 1 "lwa_operand" "Y,r,r,Z,Z")))]
   "TARGET_POWERPC64 && TARGET_LFIWAX"
   "@
@@ -5631,7 +5631,7 @@  (define_insn "*fselsfdf4"
 ; We don't define lfiwax/lfiwzx with the normal definition, because we
 ; don't want to support putting SImode in FPR registers.
 (define_insn "lfiwax"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wm,!wm")
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wi,!wj")
 	(unspec:DI [(match_operand:SI 1 "reg_or_indexed_operand" "Z,Z,r")]
 		   UNSPEC_LFIWAX))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWAX"
@@ -5711,7 +5711,7 @@  (define_insn_and_split "floatsi<mode>2_l
    (set_attr "type" "fpload")])
 
 (define_insn "lfiwzx"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wm,!wm")
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wi,!wj")
 	(unspec:DI [(match_operand:SI 1 "reg_or_indexed_operand" "Z,Z,r")]
 		   UNSPEC_LFIWZX))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWZX"
@@ -8962,8 +8962,8 @@  (define_insn "*mov<mode>_softfloat32"
 ; ld/std require word-aligned displacements -> 'Y' constraint.
 ; List Y->r and r->Y before r->r for reload.
 (define_insn "*mov<mode>_hardfloat64"
-  [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=m,d,d,wv,Z,wa,wa,Y,r,!r,*c*l,!r,*h,!r,!r,!r,r,wg,r,wm")
-	(match_operand:FMOVE64 1 "input_operand" "d,m,d,Z,wv,wa,j,r,Y,r,r,h,0,G,H,F,wg,r,wm,r"))]
+  [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=m,d,d,wv,Z,wa,wa,Y,r,!r,*c*l,!r,*h,!r,!r,!r,r,wg,r,wk")
+	(match_operand:FMOVE64 1 "input_operand" "d,m,d,Z,wv,wa,j,r,Y,r,r,h,0,G,H,F,wg,r,wk,r"))]
   "TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
    && (gpc_reg_operand (operands[0], <MODE>mode)
        || gpc_reg_operand (operands[1], <MODE>mode))"
@@ -9652,8 +9652,8 @@  (define_split
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
 
 (define_insn "*movdi_internal64"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "=Y,r,r,r,r,r,?m,?*d,?*d,r,*h,*h,r,?*wg,r,?*wm")
-	(match_operand:DI 1 "input_operand" "r,Y,r,I,L,nF,d,m,d,*h,r,0,*wg,r,*wm,r"))]
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=Y,r,r,r,r,r,?m,?*d,?*d,r,*h,*h,r,?*wg,r,?*wj")
+	(match_operand:DI 1 "input_operand" "r,Y,r,I,L,nF,d,m,d,*h,r,0,*wg,r,*wj,r"))]
   "TARGET_POWERPC64
    && (gpc_reg_operand (operands[0], DImode)
        || gpc_reg_operand (operands[1], DImode))"

Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk)	(revision 213676)
+++ gcc/doc/md.texi	(working copy)
@@ -2132,6 +2132,18 @@  VSX vector register to hold vector float
 @item wg
 If @option{-mmfpgpr} was used, a floating point register or NO_REGS.
 
+@item wh
+Floating point register if direct moves are available, or NO_REGS.
+
+@item wi
+FP or VSX register to hold 64-bit integers or NO_REGS.
+
+@item wj
+FP or VSX register to hold 64-bit integers for direct moves or NO_REGS.
+
+@item wk
+FP or VSX register to hold 64-bit doubles for direct moves or NO_REGS.
+
 @item wl
 Floating point register if the LFIWAX instruction is enabled or NO_REGS.
 
@@ -2163,7 +2175,7 @@  FP or VSX register to perform float oper
 Floating point register if the STFIWX instruction is enabled or NO_REGS.
 
 @item wy
-VSX vector register to hold scalar float values or NO_REGS.
+FP or VSX register to perform ISA 2.07 float ops or NO_REGS.
 
 @item wz
 Floating point register if the LFIWZX instruction is enabled or NO_REGS.