Message ID | 3bd3efa2-14af-1622-bcb9-d583137f7d5e@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | [RS6000] Fix PR87870: ppc64 generates poor code when loading constants into TImode vars | expand |
On Tue, Nov 13, 2018 at 11:29:20AM -0600, Peter Bergner wrote: > PR87870 shows a problem loading simple constant values into TImode variables. > This is a regression ever since VSX was added and we started using the > vsx_mov<mode>_64bit pattern. We still get the correct code on trunk if we > compile with -mno-vsx, since we fall back to using the older mov<mode>_ppc64 > move pattern, which has an alternative "r" <- "n". > > Our current vsx_mov<mode>_64bit pattern currently has two alternatives for > loading constants into GPRs, one using "*r" <- "jwM" and "??r" <- "W". > These look redundant to me, since "W" contains support for both all-zero > constants (ie, "j") and all-one constants (ie, wM) as well as a few more. > My patch below consolidates them both and uses a new mode iterator that > uses "W" for the vector modes and "n" for TImode like mov<mode>_ppc64 > used. Why does the "r" have a "*"? Should it have been just a "?"? "W" is easy_vector_constant, which requires const_vector always; is that okay here? > I'll note I didn't change the vsx_mov<mode>_32bit pattern, since TImode > isn't supported with -m32. However, if you want, I could remove the > redundant "*r" <- "jwM" alternative there too? Yeah, please keep the patterns in synch. Segher
On 11/16/18 11:06 AM, Segher Boessenkool wrote: > Why does the "r" have a "*"? Should it have been just a "?"? Maybe Mike remembers why he added it? I'd guess it was probably to try and steer the vector modes away from GPRs. However, since that alternative was also suppose to handle TImode, "*" is probably not what we want for that. The constraint for the new combines alternative uses a mode iterator so that the vector modes use "??r" while TImode uses "r". I think that is what we want. > "W" is easy_vector_constant, which requires const_vector always; is that > okay here? Well, "wM" calls all_ones_constant which also requires const_vector, so if it isn't correct now, then it wasn't correct before my patch either. Since I'm using the (new) mode iterator <nW> and the "W" is only used for the vector modes, I think we're ok here, aren't we? >> I'll note I didn't change the vsx_mov<mode>_32bit pattern, since TImode >> isn't supported with -m32. However, if you want, I could remove the >> redundant "*r" <- "jwM" alternative there too? > > Yeah, please keep the patterns in synch. Ok, I'll do the same thing and retest. Peter
On Fri, Nov 16, 2018 at 01:33:58PM -0600, Peter Bergner wrote: > On 11/16/18 11:06 AM, Segher Boessenkool wrote: > > Why does the "r" have a "*"? Should it have been just a "?"? > > Maybe Mike remembers why he added it? I'd guess it was probably to > try and steer the vector modes away from GPRs. However, since that > alternative was also suppose to handle TImode, "*" is probably not > what we want for that. The constraint for the new combines alternative > uses a mode iterator so that the vector modes use "??r" while TImode > uses "r". I think that is what we want. I imagine when I made the original change we had a different move for TImode (and the -mvsx-timode hack) than for vector modes. And it was likely pre-LRA also.
On 11/16/18 11:06 AM, Segher Boessenkool wrote: > On Tue, Nov 13, 2018 at 11:29:20AM -0600, Peter Bergner wrote: >> I'll note I didn't change the vsx_mov<mode>_32bit pattern, since TImode >> isn't supported with -m32. However, if you want, I could remove the >> redundant "*r" <- "jwM" alternative there too? > > Yeah, please keep the patterns in synch. Ok, here is the patch with the vsx_mov<mod>_32bit pattern changed too. However, when I made the change below, the length attribute seems a little off. For *_64bit, we have a length of 4, but for *_32bit, we have a length of 32. The "4" looks correct for both *_64bit and *_32bit if we're loading an easy_vector_constant into one of the vector regs. For loading a TImode constant into a GPR, then it could be anything from 8 bytes to 40bytes (loading 0xdeadbeefcafebabefacefeedbaaaaaad) for -m64. Since TImode isn't supposrted in -m32 (yet?), who knows, probably it would be 16 bytes to ??? bytes. Should those sizes be updated too? If so, what should they be? The smallest, average or worst case lengths? I assume we could use another iterator to separate the vector lengths from the gpr lengths if we need to. Peter gcc/ PR target/87870 * config/rs6000/vsx.md (nW): New mode iterator. (vsx_mov<mode>_64bit): Use it. Remove redundant GPR 0/-1 alternative. gcc/testsuite/ PR target/87870 * gcc.target/powerpc/pr87870.c: New test. Index: gcc/config/rs6000/vsx.md =================================================================== --- gcc/config/rs6000/vsx.md (revision 265971) +++ gcc/config/rs6000/vsx.md (working copy) @@ -183,6 +183,18 @@ (define_mode_attr ??r [(V16QI "??r") (TF "??r") (TI "r")]) +;; A mode attribute used for 128-bit constant values. +(define_mode_attr nW [(V16QI "W") + (V8HI "W") + (V4SI "W") + (V4SF "W") + (V2DI "W") + (V2DF "W") + (V1TI "W") + (KF "W") + (TF "W") + (TI "n")]) + ;; Same size integer type for floating point data (define_mode_attr VSi [(V4SF "v4si") (V2DF "v2di") @@ -1193,17 +1205,17 @@ (define_insn_and_split "*xxspltib_<mode> ;; VSX store VSX load VSX move VSX->GPR GPR->VSX LQ (GPR) ;; STQ (GPR) GPR load GPR store GPR move XXSPLTIB VSPLTISW -;; VSX 0/-1 GPR 0/-1 VMX const GPR const LVX (VMX) STVX (VMX) +;; VSX 0/-1 VMX const GPR const LVX (VMX) STVX (VMX) (define_insn "vsx_mov<mode>_64bit" [(set (match_operand:VSX_M 0 "nonimmediate_operand" "=ZwO, <VSa>, <VSa>, r, we, ?wQ, ?&r, ??r, ??Y, <??r>, wo, v, - ?<VSa>, *r, v, ??r, wZ, v") + ?<VSa>, v, <??r>, wZ, v") (match_operand:VSX_M 1 "input_operand" "<VSa>, ZwO, <VSa>, we, r, r, wQ, Y, r, r, wE, jwM, - ?jwM, jwM, W, W, v, wZ"))] + ?jwM, W, <nW>, v, wZ"))] "TARGET_POWERPC64 && VECTOR_MEM_VSX_P (<MODE>mode) && (register_operand (operands[0], <MODE>mode) @@ -1214,25 +1226,25 @@ (define_insn "vsx_mov<mode>_64bit" [(set_attr "type" "vecstore, vecload, vecsimple, mffgpr, mftgpr, load, store, load, store, *, vecsimple, vecsimple, - vecsimple, *, *, *, vecstore, vecload") + vecsimple, *, *, vecstore, vecload") (set_attr "length" "4, 4, 4, 8, 4, 8, 8, 8, 8, 8, 4, 4, - 4, 8, 20, 20, 4, 4")]) + 4, 20, 20, 4, 4")]) ;; VSX store VSX load VSX move GPR load GPR store GPR move -;; XXSPLTIB VSPLTISW VSX 0/-1 GPR 0/-1 VMX const GPR const +;; XXSPLTIB VSPLTISW VSX 0/-1 VMX const GPR const ;; LVX (VMX) STVX (VMX) (define_insn "*vsx_mov<mode>_32bit" [(set (match_operand:VSX_M 0 "nonimmediate_operand" "=ZwO, <VSa>, <VSa>, ??r, ??Y, <??r>, - wo, v, ?<VSa>, *r, v, ??r, + wo, v, ?<VSa>, v, <??r>, wZ, v") (match_operand:VSX_M 1 "input_operand" "<VSa>, ZwO, <VSa>, Y, r, r, - wE, jwM, ?jwM, jwM, W, W, + wE, jwM, ?jwM, W, <nW>, v, wZ"))] "!TARGET_POWERPC64 && VECTOR_MEM_VSX_P (<MODE>mode) @@ -1243,12 +1255,12 @@ (define_insn "*vsx_mov<mode>_32bit" } [(set_attr "type" "vecstore, vecload, vecsimple, load, store, *, - vecsimple, vecsimple, vecsimple, *, *, *, + vecsimple, vecsimple, vecsimple, *, *, vecstore, vecload") (set_attr "length" "4, 4, 4, 16, 16, 16, - 4, 4, 4, 16, 20, 32, + 4, 4, 4, 20, 32, 4, 4")]) ;; Explicit load/store expanders for the builtin functions Index: gcc/testsuite/gcc.target/powerpc/pr87870.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/pr87870.c (nonexistent) +++ gcc/testsuite/gcc.target/powerpc/pr87870.c (working copy) @@ -0,0 +1,28 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-options "-O2" } */ + +__int128 +test0 (void) +{ + return 0; +} + +__int128 +test1 (void) +{ + return 1; +} + +__int128 +test2 (void) +{ + return -1; +} + +__int128 +test3 (void) +{ + return ((__int128)0xdeadbeefcafebabe << 64) | 0xfacefeedbaaaaaad; +} + +/* { dg-final { scan-assembler-not {\mld\M} } } */
On Fri, Nov 16, 2018 at 01:33:58PM -0600, Peter Bergner wrote: > On 11/16/18 11:06 AM, Segher Boessenkool wrote: > > "W" is easy_vector_constant, which requires const_vector always; is that > > okay here? > > Well, "wM" calls all_ones_constant which also requires const_vector, so Does it? (define_predicate "all_ones_constant" (and (match_code "const_int,const_double,const_wide_int,const_vector") (match_test "op == CONSTM1_RTX (mode) && !FLOAT_MODE_P (mode)"))) And "j" is defined as (define_constraint "j" "Zero vector constant" (match_test "op == const0_rtx || op == CONST0_RTX (mode)")) > if it isn't correct now, then it wasn't correct before my patch either. > Since I'm using the (new) mode iterator <nW> and the "W" is only used > for the vector modes, I think we're ok here, aren't we? That's a good point yes. Okay. > >> I'll note I didn't change the vsx_mov<mode>_32bit pattern, since TImode > >> isn't supported with -m32. However, if you want, I could remove the > >> redundant "*r" <- "jwM" alternative there too? > > > > Yeah, please keep the patterns in synch. > > Ok, I'll do the same thing and retest. Could you also check (manually) that the compiler still handles things as it should here? Okay for trunk if it does. Thanks! Segher
On Fri, Nov 16, 2018 at 04:26:18PM -0600, Peter Bergner wrote: > On 11/16/18 11:06 AM, Segher Boessenkool wrote: > > On Tue, Nov 13, 2018 at 11:29:20AM -0600, Peter Bergner wrote: > >> I'll note I didn't change the vsx_mov<mode>_32bit pattern, since TImode > >> isn't supported with -m32. However, if you want, I could remove the > >> redundant "*r" <- "jwM" alternative there too? > > > > Yeah, please keep the patterns in synch. > > Ok, here is the patch with the vsx_mov<mod>_32bit pattern changed too. > > However, when I made the change below, the length attribute seems a > little off. For *_64bit, we have a length of 4, but for *_32bit, we > have a length of 32. The "4" looks correct for both *_64bit and *_32bit > if we're loading an easy_vector_constant into one of the vector regs. > For loading a TImode constant into a GPR, then it could be anything from > 8 bytes to 40bytes (loading 0xdeadbeefcafebabefacefeedbaaaaaad) for > -m64. Since TImode isn't supposrted in -m32 (yet?), who knows, probably > it would be 16 bytes to ??? bytes. > > Should those sizes be updated too? If so, what should they be? > The smallest, average or worst case lengths? I assume we could use > another iterator to separate the vector lengths from the gpr lengths > if we need to. Worst case. This is required for correctness. Segher
On 11/16/18 5:29 PM, Segher Boessenkool wrote: > On Fri, Nov 16, 2018 at 04:26:18PM -0600, Peter Bergner wrote: >> However, when I made the change below, the length attribute seems a >> little off. For *_64bit, we have a length of 4, but for *_32bit, we >> have a length of 32. The "4" looks correct for both *_64bit and *_32bit >> if we're loading an easy_vector_constant into one of the vector regs. >> For loading a TImode constant into a GPR, then it could be anything from >> 8 bytes to 40bytes (loading 0xdeadbeefcafebabefacefeedbaaaaaad) for >> -m64. Since TImode isn't supposrted in -m32 (yet?), who knows, probably >> it would be 16 bytes to ??? bytes. >> >> Should those sizes be updated too? If so, what should they be? >> The smallest, average or worst case lengths? I assume we could use >> another iterator to separate the vector lengths from the gpr lengths >> if we need to. > > Worst case. This is required for correctness. Ok, I looked into this and the point where we must have correct length info is in final assembly generation, so very very late. For the alternatives I'm changing, we're loading into GPR regs and these alternatives are always split (split2), so these length values are never used/seen at final assembly time. Given the above, I'm guessing we should probably go with the most common length value (ie, 8 for 64-bit and 16 for 32-bit)? The following patch implements that. Does this seem reasonable to you? Peter gcc/ PR target/87870 * config/rs6000/vsx.md (nW): New mode iterator. (vsx_mov<mode>_64bit): Use it. Remove redundant GPR 0/-1 alternative. Update length attribute for (<??r>, <nW>) alternative. (vsx_mov<mode>_32bit): Likewise. gcc/testsuite/ PR target/87870 * gcc.target/powerpc/pr87870.c: New test. Index: gcc/config/rs6000/vsx.md =================================================================== --- gcc/config/rs6000/vsx.md (revision 265971) +++ gcc/config/rs6000/vsx.md (working copy) @@ -183,6 +183,18 @@ (define_mode_attr ??r [(V16QI "??r") (TF "??r") (TI "r")]) +;; A mode attribute used for 128-bit constant values. +(define_mode_attr nW [(V16QI "W") + (V8HI "W") + (V4SI "W") + (V4SF "W") + (V2DI "W") + (V2DF "W") + (V1TI "W") + (KF "W") + (TF "W") + (TI "n")]) + ;; Same size integer type for floating point data (define_mode_attr VSi [(V4SF "v4si") (V2DF "v2di") @@ -1193,17 +1205,17 @@ (define_insn_and_split "*xxspltib_<mode> ;; VSX store VSX load VSX move VSX->GPR GPR->VSX LQ (GPR) ;; STQ (GPR) GPR load GPR store GPR move XXSPLTIB VSPLTISW -;; VSX 0/-1 GPR 0/-1 VMX const GPR const LVX (VMX) STVX (VMX) +;; VSX 0/-1 VMX const GPR const LVX (VMX) STVX (VMX) (define_insn "vsx_mov<mode>_64bit" [(set (match_operand:VSX_M 0 "nonimmediate_operand" "=ZwO, <VSa>, <VSa>, r, we, ?wQ, ?&r, ??r, ??Y, <??r>, wo, v, - ?<VSa>, *r, v, ??r, wZ, v") + ?<VSa>, v, <??r>, wZ, v") (match_operand:VSX_M 1 "input_operand" "<VSa>, ZwO, <VSa>, we, r, r, wQ, Y, r, r, wE, jwM, - ?jwM, jwM, W, W, v, wZ"))] + ?jwM, W, <nW>, v, wZ"))] "TARGET_POWERPC64 && VECTOR_MEM_VSX_P (<MODE>mode) && (register_operand (operands[0], <MODE>mode) @@ -1214,25 +1226,25 @@ (define_insn "vsx_mov<mode>_64bit" [(set_attr "type" "vecstore, vecload, vecsimple, mffgpr, mftgpr, load, store, load, store, *, vecsimple, vecsimple, - vecsimple, *, *, *, vecstore, vecload") + vecsimple, *, *, vecstore, vecload") (set_attr "length" "4, 4, 4, 8, 4, 8, 8, 8, 8, 8, 4, 4, - 4, 8, 20, 20, 4, 4")]) + 4, 20, 8, 4, 4")]) ;; VSX store VSX load VSX move GPR load GPR store GPR move -;; XXSPLTIB VSPLTISW VSX 0/-1 GPR 0/-1 VMX const GPR const +;; XXSPLTIB VSPLTISW VSX 0/-1 VMX const GPR const ;; LVX (VMX) STVX (VMX) (define_insn "*vsx_mov<mode>_32bit" [(set (match_operand:VSX_M 0 "nonimmediate_operand" "=ZwO, <VSa>, <VSa>, ??r, ??Y, <??r>, - wo, v, ?<VSa>, *r, v, ??r, + wo, v, ?<VSa>, v, <??r>, wZ, v") (match_operand:VSX_M 1 "input_operand" "<VSa>, ZwO, <VSa>, Y, r, r, - wE, jwM, ?jwM, jwM, W, W, + wE, jwM, ?jwM, W, <nW>, v, wZ"))] "!TARGET_POWERPC64 && VECTOR_MEM_VSX_P (<MODE>mode) @@ -1243,12 +1255,12 @@ (define_insn "*vsx_mov<mode>_32bit" } [(set_attr "type" "vecstore, vecload, vecsimple, load, store, *, - vecsimple, vecsimple, vecsimple, *, *, *, + vecsimple, vecsimple, vecsimple, *, *, vecstore, vecload") (set_attr "length" "4, 4, 4, 16, 16, 16, - 4, 4, 4, 16, 20, 32, + 4, 4, 4, 20, 16, 4, 4")]) ;; Explicit load/store expanders for the builtin functions Index: gcc/testsuite/gcc.target/powerpc/pr87870.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/pr87870.c (nonexistent) +++ gcc/testsuite/gcc.target/powerpc/pr87870.c (working copy) @@ -0,0 +1,28 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-options "-O2" } */ + +__int128 +test0 (void) +{ + return 0; +} + +__int128 +test1 (void) +{ + return 1; +} + +__int128 +test2 (void) +{ + return -1; +} + +__int128 +test3 (void) +{ + return ((__int128)0xdeadbeefcafebabe << 64) | 0xfacefeedbaaaaaad; +} + +/* { dg-final { scan-assembler-not {\mld\M} } } */
On Thu, Dec 13, 2018 at 10:59:36AM -0600, Peter Bergner wrote: > On 11/16/18 5:29 PM, Segher Boessenkool wrote: > > On Fri, Nov 16, 2018 at 04:26:18PM -0600, Peter Bergner wrote: > >> However, when I made the change below, the length attribute seems a > >> little off. For *_64bit, we have a length of 4, but for *_32bit, we > >> have a length of 32. The "4" looks correct for both *_64bit and *_32bit > >> if we're loading an easy_vector_constant into one of the vector regs. > >> For loading a TImode constant into a GPR, then it could be anything from > >> 8 bytes to 40bytes (loading 0xdeadbeefcafebabefacefeedbaaaaaad) for > >> -m64. Since TImode isn't supposrted in -m32 (yet?), who knows, probably > >> it would be 16 bytes to ??? bytes. > >> > >> Should those sizes be updated too? If so, what should they be? > >> The smallest, average or worst case lengths? I assume we could use > >> another iterator to separate the vector lengths from the gpr lengths > >> if we need to. > > > > Worst case. This is required for correctness. > > Ok, I looked into this and the point where we must have correct length info > is in final assembly generation, so very very late. I'm not convinced this is true. And problems with this only show up with unusual code, so typically after releases :-( Maybe we could make a mode where conditional jumps can jump only 128 bytes or similar, that would make testing much easier (problems will show up much more often than with the 32kB max distance we have). > For the alternatives > I'm changing, we're loading into GPR regs and these alternatives are always > split (split2), so these length values are never used/seen at final assembly > time. But some move instructions are created *after* split2. > Given the above, I'm guessing we should probably go with the most common > length value (ie, 8 for 64-bit and 16 for 32-bit)? The following patch > implements that. Does this seem reasonable to you? I do like the patch. Let me sleep on it. Segher
On 12/14/18 8:23 PM, Segher Boessenkool wrote: > On Thu, Dec 13, 2018 at 10:59:36AM -0600, Peter Bergner wrote: >> For the alternatives >> I'm changing, we're loading into GPR regs and these alternatives are always >> split (split2), so these length values are never used/seen at final assembly >> time. > > But some move instructions are created *after* split2. That may be so, but I do not know how we could create a move insn using this alternative, since rs6000_output_move_128bit() does the following for this alternative (ie, loading a constant into a GPR): /* Constants. */ else if (dest_regno >= 0 && (GET_CODE (src) == CONST_INT || GET_CODE (src) == CONST_WIDE_INT || GET_CODE (src) == CONST_DOUBLE || GET_CODE (src) == CONST_VECTOR)) { if (dest_gpr_p) return "#"; This means we require the insn to eventually be split and not reach final assembly, does it not? Peter
On Mon, Dec 17, 2018 at 02:23:54PM -0600, Peter Bergner wrote: > On 12/14/18 8:23 PM, Segher Boessenkool wrote: > > On Thu, Dec 13, 2018 at 10:59:36AM -0600, Peter Bergner wrote: > >> For the alternatives > >> I'm changing, we're loading into GPR regs and these alternatives are always > >> split (split2), so these length values are never used/seen at final assembly > >> time. > > > > But some move instructions are created *after* split2. > > That may be so, but I do not know how we could create a move insn using > this alternative, since rs6000_output_move_128bit() does the following > for this alternative (ie, loading a constant into a GPR): > > > /* Constants. */ > else if (dest_regno >= 0 > && (GET_CODE (src) == CONST_INT > || GET_CODE (src) == CONST_WIDE_INT > || GET_CODE (src) == CONST_DOUBLE > || GET_CODE (src) == CONST_VECTOR)) > { > if (dest_gpr_p) > return "#"; > > This means we require the insn to eventually be split and not reach final > assembly, does it not? Yes, but is the length attribute used for something that matters before that? For correctness, not just for better code. It isn't clear to me that this will work. OTOH I cannot currently show it does not work either. So, okay for trunk, and I hope it actually works :-) Segher
On 12/17/18 3:55 PM, Segher Boessenkool wrote: > On Mon, Dec 17, 2018 at 02:23:54PM -0600, Peter Bergner wrote: >> This means we require the insn to eventually be split and not reach final >> assembly, does it not? > > Yes, but is the length attribute used for something that matters before > that? For correctness, not just for better code. It isn't clear to me > that this will work. > > OTOH I cannot currently show it does not work either. So, okay for trunk, > and I hope it actually works :-) Ok, committed. I'll try and keep an eye out for any fallout, but I'm fairly certain there won't be. Thanks! Peter
Index: gcc/config/rs6000/vsx.md =================================================================== --- gcc/config/rs6000/vsx.md (revision 265971) +++ gcc/config/rs6000/vsx.md (working copy) @@ -183,6 +183,18 @@ (define_mode_attr ??r [(V16QI "??r") (TF "??r") (TI "r")]) +;; A mode attribute used for 128-bit constant values. +(define_mode_attr nW [(V16QI "W") + (V8HI "W") + (V4SI "W") + (V4SF "W") + (V2DI "W") + (V2DF "W") + (V1TI "W") + (KF "W") + (TF "W") + (TI "n")]) + ;; Same size integer type for floating point data (define_mode_attr VSi [(V4SF "v4si") (V2DF "v2di") @@ -1193,17 +1205,17 @@ (define_insn_and_split "*xxspltib_<mode> ;; VSX store VSX load VSX move VSX->GPR GPR->VSX LQ (GPR) ;; STQ (GPR) GPR load GPR store GPR move XXSPLTIB VSPLTISW -;; VSX 0/-1 GPR 0/-1 VMX const GPR const LVX (VMX) STVX (VMX) +;; VSX 0/-1 VMX const GPR const LVX (VMX) STVX (VMX) (define_insn "vsx_mov<mode>_64bit" [(set (match_operand:VSX_M 0 "nonimmediate_operand" "=ZwO, <VSa>, <VSa>, r, we, ?wQ, ?&r, ??r, ??Y, <??r>, wo, v, - ?<VSa>, *r, v, ??r, wZ, v") + ?<VSa>, v, <??r>, wZ, v") (match_operand:VSX_M 1 "input_operand" "<VSa>, ZwO, <VSa>, we, r, r, wQ, Y, r, r, wE, jwM, - ?jwM, jwM, W, W, v, wZ"))] + ?jwM, W, <nW>, v, wZ"))] "TARGET_POWERPC64 && VECTOR_MEM_VSX_P (<MODE>mode) && (register_operand (operands[0], <MODE>mode) @@ -1214,12 +1226,12 @@ (define_insn "vsx_mov<mode>_64bit" [(set_attr "type" "vecstore, vecload, vecsimple, mffgpr, mftgpr, load, store, load, store, *, vecsimple, vecsimple, - vecsimple, *, *, *, vecstore, vecload") + vecsimple, *, *, vecstore, vecload") (set_attr "length" "4, 4, 4, 8, 4, 8, 8, 8, 8, 8, 4, 4, - 4, 8, 20, 20, 4, 4")]) + 4, 20, 20, 4, 4")]) ;; VSX store VSX load VSX move GPR load GPR store GPR move ;; XXSPLTIB VSPLTISW VSX 0/-1 GPR 0/-1 VMX const GPR const Index: gcc/config/rs6000/vsx.md =================================================================== --- gcc/config/rs6000/vsx.md (revision 265971) +++ gcc/config/rs6000/vsx.md (working copy) @@ -183,6 +183,18 @@ (define_mode_attr ??r [(V16QI "??r") (TF "??r") (TI "r")]) +;; A mode attribute used for 128-bit constant values. +(define_mode_attr nW [(V16QI "W") + (V8HI "W") + (V4SI "W") + (V4SF "W") + (V2DI "W") + (V2DF "W") + (V1TI "W") + (KF "W") + (TF "W") + (TI "n")]) + ;; Same size integer type for floating point data (define_mode_attr VSi [(V4SF "v4si") (V2DF "v2di") @@ -1193,17 +1205,17 @@ (define_insn_and_split "*xxspltib_<mode> ;; VSX store VSX load VSX move VSX->GPR GPR->VSX LQ (GPR) ;; STQ (GPR) GPR load GPR store GPR move XXSPLTIB VSPLTISW -;; VSX 0/-1 GPR 0/-1 VMX const GPR const LVX (VMX) STVX (VMX) +;; VSX 0/-1 VMX const GPR const LVX (VMX) STVX (VMX) (define_insn "vsx_mov<mode>_64bit" [(set (match_operand:VSX_M 0 "nonimmediate_operand" "=ZwO, <VSa>, <VSa>, r, we, ?wQ, ?&r, ??r, ??Y, <??r>, wo, v, - ?<VSa>, *r, v, ??r, wZ, v") + ?<VSa>, v, <??r>, wZ, v") (match_operand:VSX_M 1 "input_operand" "<VSa>, ZwO, <VSa>, we, r, r, wQ, Y, r, r, wE, jwM, - ?jwM, jwM, W, W, v, wZ"))] + ?jwM, W, <nW>, v, wZ"))] "TARGET_POWERPC64 && VECTOR_MEM_VSX_P (<MODE>mode) && (register_operand (operands[0], <MODE>mode) @@ -1214,12 +1226,12 @@ (define_insn "vsx_mov<mode>_64bit" [(set_attr "type" "vecstore, vecload, vecsimple, mffgpr, mftgpr, load, store, load, store, *, vecsimple, vecsimple, - vecsimple, *, *, *, vecstore, vecload") + vecsimple, *, *, vecstore, vecload") (set_attr "length" "4, 4, 4, 8, 4, 8, 8, 8, 8, 8, 4, 4, - 4, 8, 20, 20, 4, 4")]) + 4, 20, 20, 4, 4")]) ;; VSX store VSX load VSX move GPR load GPR store GPR move ;; XXSPLTIB VSPLTISW VSX 0/-1 GPR 0/-1 VMX const GPR const Index: gcc/testsuite/gcc.target/powerpc/pr87870.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/pr87870.c (nonexistent) +++ gcc/testsuite/gcc.target/powerpc/pr87870.c (working copy) @@ -0,0 +1,28 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-options "-O2" } */ + +__int128 +test0 (void) +{ + return 0; +} + +__int128 +test1 (void) +{ + return 1; +} + +__int128 +test2 (void) +{ + return -1; +} + +__int128 +test3 (void) +{ + return ((__int128)0xdeadbeefcafebabe << 64) | 0xfacefeedbaaaaaad; +} + +/* { dg-final { scan-assembler-not {\mld\M} } } */