Message ID | alpine.DEB.1.10.1302200035560.6762@tp.orcam.me.uk |
---|---|
State | Superseded |
Headers | show |
"Maciej W. Rozycki" <macro@codesourcery.com> writes: > This issue was originally raised here: > > http://gcc.gnu.org/ml/gcc-patches/2012-12/msg00863.html > > We have a shortcoming in GCC in that we only allow the use half of the FP > MADD instruction subset (MADD.fmt and MSUB.fmt) in the 64-bit/32-register > mode (CP0.Status.FR == 1) on MIPS32r2 processors. Furthermore we never > enable the other half (NMADD.fmt and NMSUB.fmt) on those processors. > However this whole instruction subset is always available on MIPS32r2 FPUs > regardless of the mode selected, just as it always has been on FPUs of the > 64-bit ISA line from MIPS IV up. Hmm, this was discussed here: http://gcc.gnu.org/ml/gcc-patches/2006-11/msg00488.html http://gcc.gnu.org/ml/gcc-patches/2006-11/msg00492.html The footnote for COP1X instructions on page 12 of volume 1 of the MIPS32 ISA (v2.50) says: 1. In Release 1 of the Architecture, these instructions are legal only with a MIPS64 processor with 64-bit operations enabled (they are, in effect, actually MIPS64 instructions). In Release 2 of the Architecture, these instructions are legal with either a MIPS32 or MIPS64 processor _which includes a 64-bit floating point unit_. (my emphasis). "which" rather than "that" makes this a bit ambiguous, but various comments in the rest of the manual imply that MIPS32r2 allows an implementation choice between 32-bit and 64-bit FPUs. E.g. page 8 says: Support for 64-bit coprocessors with 32-bit CPUs: These changes allow a 64-bit coprocessor (including an FPU) to be attached to a 32-bit CPU. This enhancement is optional in a Release 2 implementation. and page 45 says: In addition to an Instruction Set Architecture, the MIPS architecture definition includes processing resources such as the set of coprocessor general registers. In Release 1 of the Architecture, the 32-bit registers in MIPS32 were enlarged to 64-bits in MIPS64; however, these 64-bit FPU registers are not backwards compatible. Instead, processors implementing the MIPS64 Architecture provide a mode bit to select either the 32-bit or 64-bit register model. In Release 2 of the Architecture, a 32-bit CPU _may_ include a full 64-bit coprocessor, including a floating point unit which implements the same mode bit to select 32-bit or 64-bit FPU register model. On page 322 of volume 2, the footnote for "Table A-20 MIPS64 COP1X Encoding of Function Field" uses slightly different wording: COP1X instructions are legal only if 64-bit floating point operations are enabled. So was this all a big misunderstanding on my part? The TARGET_FLOAT64 condition came from MIPS themselves, and when challenged they seemed pretty adamant that it was correct. If I was wrong to be convinced by the explanation, I hope you can at least see why I was convinced. :-) If it wasn't a misunderstanding, then the point is that we can't tell from ISA_MIPS32R2 alone whether the target has a 32-bit or 64-bit FPU, but we know that it must have a 64-bit FPU if using TARGET_FLOAT64. > Also, according to MIPS IV ISA documentation these operations are only > fused (i.e. don't match original IEEE 754-1985 accuracy requirements) on > the original MIPS IV R8000 CPU, and MIPS architecture specs don't mention > any limitations of these instructions either, so I have updated the GCC > manual to document that on non-R8000 CPUs (which are ones we really care > about) they are numerically equivalent to computations made with > corresponding individual operations. This part is OK, thanks, and is probably the only bit that's suitable for 4.8 at this stage. Would you mind applying it separately? > Finally, while at it, I found it interesting that we have separate > conditions to cover MADD.fmt/MSUB.fmt (ISA_HAS_FP_MADD4_MSUB4) and > NMADD.fmt/NMADD.fmt (ISA_HAS_NMADD4_NMSUB4) while all the four > instructions need to be implemented as a whole group per data format > supported and cannot be separated (the MIPS architecture specification > explicitly forbids subsetting). The difference between the two conditions > is the former expands to ISA_HAS_FP4, that is enables the subsubset for > any MIPS IV and up FPU while the latter has an extra "&& (!TARGET_MIPS5400 > || TARGET_MAD)" qualifier. > > I went ahead and checked available NEC VR54xx documentation and here's > what I came up with: > > 1. "VR5400 MIPS RISC Microprocessor Family" datasheet (NEC doc #13362) > says: > > "The VR5400 processor family complies with the MIPS IV instruction set > and IEEE-754 floating-point and IEEE-1149.1/1149.1a JTAG specification, > [...]" > > 2. "VR5432 MIPS RISC Microprocessor User's Manual, Volume 2" (NEC doc > #13751) lists all the individual MADD.fmt, MSUB.fmt, NMADD.fmt and > NMSUB.fmt instructions in Chapter 18 "Floating-Point Unit Instruction > Set" with no restrictions as to their availability (the only other > member of the VR54xx family I know of is the VR5464 that is a > high-performance version of the VR5432 and is fully software > compatible). > > Further to that TARGET_MAD controls whether to "Use PMC-style 'mad' > instructions" that are all CPU rather than FPU instructions. The VR5432 > indeed supports extra integer multiply-accumulate instructions, as > documented in #2 above; these are the MACC/MACCHI/MACCHIU/MACCU and > MSAC/MSACHI/MSACHIU/MSACU instructions as roughly covered by our > ISA_HAS_MACC, ISA_HAS_MSAC and ISA_HAS_MACCHI knobs (the latter is not > implied for TARGET_MIPS5400, perhaps because the family does not support > the doubleword variants). > > All in all it looks to me like a misplaced hunk. It was introduced in > rev. 56471 (you were named as one of the contributors on that commit, so > you may be able to remember and/or correct me if I am wrong here anywhere) > and it looks to me it should have been applied to the ISA_HAS_MADD_MSUB > macro instead that's still just a few lines above ISA_HAS_NMADD4_NMSUB4 > (and was even closer to ISA_HAS_NMADD_NMSUB as the latter was then called; > the bodies were close enough back then for a hunk to apply cleanly to > either). I was named in that commit but the VR54xx stuff wasn't mine. I do remember that Mike put a lot of effort into tuning the VR54xx madd stuff though, because of the difficulty of having multiply-accumulate instructions that force the use of HI/LO on an architecture that also has efficient three-operand multiplies. So I'm pretty sure that this worked correctly in the Cygnus devo tree, and your explanation of a misplaced hunk seems very convincing. Richard
On Thu, 21 Feb 2013, Richard Sandiford wrote: > > This issue was originally raised here: > > > > http://gcc.gnu.org/ml/gcc-patches/2012-12/msg00863.html > > > > We have a shortcoming in GCC in that we only allow the use half of the FP > > MADD instruction subset (MADD.fmt and MSUB.fmt) in the 64-bit/32-register > > mode (CP0.Status.FR == 1) on MIPS32r2 processors. Furthermore we never > > enable the other half (NMADD.fmt and NMSUB.fmt) on those processors. > > However this whole instruction subset is always available on MIPS32r2 FPUs > > regardless of the mode selected, just as it always has been on FPUs of the > > 64-bit ISA line from MIPS IV up. > > Hmm, this was discussed here: > > http://gcc.gnu.org/ml/gcc-patches/2006-11/msg00488.html > http://gcc.gnu.org/ml/gcc-patches/2006-11/msg00492.html > > The footnote for COP1X instructions on page 12 of volume 1 of the MIPS32 ISA > (v2.50) says: Please note that v2.50 is obsolete, several typos were corrected in later revisions. Sadly I don't even have a copy of that exact revision of volume I. > 1. In Release 1 of the Architecture, these instructions are legal only > with a MIPS64 processor with 64-bit operations enabled (they are, in > effect, actually MIPS64 instructions). In Release 2 of the Architecture, > these instructions are legal with either a MIPS32 or MIPS64 processor > _which includes a 64-bit floating point unit_. > > (my emphasis). "which" rather than "that" makes this a bit ambiguous, > but various comments in the rest of the manual imply that MIPS32r2 allows > an implementation choice between 32-bit and 64-bit FPUs. E.g. page 8 says: > > Support for 64-bit coprocessors with 32-bit CPUs: These changes allow > a 64-bit coprocessor (including an FPU) to be attached to a 32-bit > CPU. This enhancement is optional in a Release 2 implementation. > > and page 45 says: > > In addition to an Instruction Set Architecture, the MIPS architecture > definition includes processing resources such as the set of > coprocessor general registers. In Release 1 of the Architecture, the > 32-bit registers in MIPS32 were enlarged to 64-bits in MIPS64; > however, these 64-bit FPU registers are not backwards > compatible. Instead, processors implementing the MIPS64 Architecture > provide a mode bit to select either the 32-bit or 64-bit register > model. In Release 2 of the Architecture, a 32-bit CPU _may_ include a > full 64-bit coprocessor, including a floating point unit which > implements the same mode bit to select 32-bit or 64-bit FPU register > model. > > On page 322 of volume 2, the footnote for "Table A-20 MIPS64 COP1X > Encoding of Function Field" uses slightly different wording: > > COP1X instructions are legal only if 64-bit floating point operations > are enabled. > > So was this all a big misunderstanding on my part? The TARGET_FLOAT64 > condition came from MIPS themselves, and when challenged they seemed > pretty adamant that it was correct. If I was wrong to be convinced > by the explanation, I hope you can at least see why I was convinced. :-) As good as my English comprehension might be I believe your understanding of documentation is right. However several inconsistencies have already been spotted and it may well be that for example the restriction on COP1X instructions is simply a leftover that leaked from rev.1 of the MIPS64 architecture and was not spotted on the update (please note that obviously all the architecture documents are produced from templates shared between the MIPS32 and MIPS64 architectures with some parts being output conditionally as appropriate). > If it wasn't a misunderstanding, then the point is that we can't tell > from ISA_MIPS32R2 alone whether the target has a 32-bit or 64-bit FPU, > but we know that it must have a 64-bit FPU if using TARGET_FLOAT64. Well, the important point is as I understand all the MIPS32r2 FPU instructions (not all the FP formats of course though, L and PS are optional) are mandatory no matter if the FPU is 32-bit or 64-bit. This should rule out any question as to what kind of FPU has been implemented or whether the 64-bit mode has been enabled. I could be wrong however. However the good news is it's not the architecture documentation set that is normative for architecture implementers, it's the AVP (Architecture Verification Program) they have to run successfully on their processors to claim compliance. Steve, would you therefore please do us a favour and check what the AVP for MIPS32r2 requires for support of the floating-point MADD instruction subset, or preferably, the whole COP1X instruction set? Are these instructions only mandatory for a 64-bit FPU (CP1.FIR.F64 == 1), or do they have to be included in all FPU implementations? > This part is OK, thanks, and is probably the only bit that's suitable for > 4.8 at this stage. Would you mind applying it separately? Applied now, thanks. > I was named in that commit but the VR54xx stuff wasn't mine. I do remember > that Mike put a lot of effort into tuning the VR54xx madd stuff though, > because of the difficulty of having multiply-accumulate instructions > that force the use of HI/LO on an architecture that also has efficient > three-operand multiplies. So I'm pretty sure that this worked correctly > in the Cygnus devo tree, and your explanation of a misplaced hunk seems > very convincing. I'll make appropriate adjustments then. Also, regardless of the outcome of the instruction presence on 32-bit FPUs, do you agree it is a good idea to fold ISA_HAS_FP_MADD4_MSUB4 and ISA_HAS_NMADD4_NMSUB4 into one macro? Maciej
On Feb 20, 2013, at 3:50 PM, Maciej W. Rozycki <macro@codesourcery.com> wrote: > BTW, do you happen to know a way to reliable force all our testsuites NOT > to delete executables after run? Personally I think it's missing the > point to have them deleted -- how can one debug any regressions then? So, I go into the source and comment out the cleanup lines… Would be nice to have them conditional on a variable so that one can set that variable in their .dejagnurc file. The limitations of this are, multiple test cases can use the same intermediate files and they would not give you a nice 1 to 1 mapping back. However, this I usually side step by only running a subset of the test suite that contains the part I'm interested in. > What do other people do? We usually can cut and paste the one line and run the one case by hand. Test cases that don't fall into this category, well, suck, I mean, are more annoying.
>> What do other people do? > > We usually can cut and paste the one line and run the one case by hand. Test cases that don't fall into this category, well, suck, I mean, are more annoying. > These days I find contrib/repro_fail <regexpforthefailingtest> file.log.useful to rebuild the failing test(s). regards Ramana
From: Maciej W. Rozycki [macro@codesourcery.com] > Steve, would you therefore please do us a favour and check what the AVP > for MIPS32r2 requires for support of the floating-point MADD instruction > subset, or preferably, the whole COP1X instruction set? Are these > instructions only mandatory for a 64-bit FPU (CP1.FIR.F64 == 1), or do > they have to be included in all FPU implementations Maciej, I checked with one of the AVP engineers and he said: "madd/msub/nmadd/nmsub.fmt instructions are required for all release 2 implementations, whether or not that implementation has a 64 bit fpu. They are also required for mips64 release 1 implementations. The AVPs check for this." Steve Ellcey
Steve Ellcey <Steve.Ellcey@imgtec.com> writes: > From: Maciej W. Rozycki [macro@codesourcery.com] > >> Steve, would you therefore please do us a favour and check what the AVP >> for MIPS32r2 requires for support of the floating-point MADD instruction >> subset, or preferably, the whole COP1X instruction set? Are these >> instructions only mandatory for a 64-bit FPU (CP1.FIR.F64 == 1), or do >> they have to be included in all FPU implementations > > Maciej, > > I checked with one of the AVP engineers and he said: > > "madd/msub/nmadd/nmsub.fmt instructions are required for all release 2 > implementations, > whether or not that implementation has a 64 bit fpu. They are also > required for mips64 > release 1 implementations. The AVPs check for this." Steve, thanks for checking. Maciej, in that case, the rest of the patch is OK for 4.9, thanks. Richard
On Wed, 27 Feb 2013, Richard Sandiford wrote: > > I checked with one of the AVP engineers and he said: > > > > "madd/msub/nmadd/nmsub.fmt instructions are required for all release 2 > > implementations, > > whether or not that implementation has a 64 bit fpu. They are also > > required for mips64 > > release 1 implementations. The AVPs check for this." > > Steve, thanks for checking. I concur, thank you, Steve. > Maciej, in that case, the rest of the patch is OK for 4.9, thanks. I will apply in due course then, thanks for your review. Maciej
On Thu, 21 Feb 2013, Mike Stump wrote: > > BTW, do you happen to know a way to reliable force all our testsuites NOT > > to delete executables after run? Personally I think it's missing the > > point to have them deleted -- how can one debug any regressions then? > > So, I go into the source and comment out the cleanup lines… Would be > nice to have them conditional on a variable so that one can set that > variable in their .dejagnurc file. Hmm, good idea, I'll see if the hack I use could be adapted easily. > The limitations of this are, multiple test cases can use the same > intermediate files and they would not give you a nice 1 to 1 mapping > back. However, this I usually side step by only running a subset of the > test suite that contains the part I'm interested in. Yeah, I think final executables that don't have unique names are even worse. It would be good to avoid at least making such new cases. > > What do other people do? > > We usually can cut and paste the one line and run the one case by hand. > Test cases that don't fall into this category, well, suck, I mean, are > more annoying. For this to work you need to have at least all the auxiliary stuff needed for linking, etc. that's built once per test suite run still available. Somehow I found it troublesome to rebuild it manually. Thanks for all the hints -- it looks to me like there's no easy answer. In which case I'll see if I can do anything about the hack I have in case someone else finds it useful. Maciej
Index: gcc-fsf-trunk-quilt/gcc/config/mips/mips.c =================================================================== --- gcc-fsf-trunk-quilt.orig/gcc/config/mips/mips.c 2013-02-07 02:59:05.465114046 +0000 +++ gcc-fsf-trunk-quilt/gcc/config/mips/mips.c 2013-02-07 02:59:48.575511623 +0000 @@ -3798,7 +3798,7 @@ mips_rtx_costs (rtx x, int code, int out case MINUS: if (float_mode_p - && (ISA_HAS_NMADD4_NMSUB4 (mode) || ISA_HAS_NMADD3_NMSUB3 (mode)) + && (ISA_HAS_NMADD4_NMSUB4 || ISA_HAS_NMADD3_NMSUB3) && TARGET_FUSED_MADD && !HONOR_NANS (mode) && !HONOR_SIGNED_ZEROS (mode)) @@ -3850,7 +3850,7 @@ mips_rtx_costs (rtx x, int code, int out case NEG: if (float_mode_p - && (ISA_HAS_NMADD4_NMSUB4 (mode) || ISA_HAS_NMADD3_NMSUB3 (mode)) + && (ISA_HAS_NMADD4_NMSUB4 || ISA_HAS_NMADD3_NMSUB3) && TARGET_FUSED_MADD && !HONOR_NANS (mode) && HONOR_SIGNED_ZEROS (mode)) Index: gcc-fsf-trunk-quilt/gcc/config/mips/mips.h =================================================================== --- gcc-fsf-trunk-quilt.orig/gcc/config/mips/mips.h 2013-02-07 02:35:34.024073830 +0000 +++ gcc-fsf-trunk-quilt/gcc/config/mips/mips.h 2013-02-07 02:59:48.575511623 +0000 @@ -855,7 +855,7 @@ struct mips_cpu_info { FP madd and msub instructions, and the FP recip and recip sqrt instructions. */ #define ISA_HAS_FP4 ((ISA_MIPS4 \ - || (ISA_MIPS32R2 && TARGET_FLOAT64) \ + || ISA_MIPS32R2 \ || ISA_MIPS64 \ || ISA_MIPS64R2) \ && !TARGET_MIPS16) @@ -885,18 +885,12 @@ struct mips_cpu_info { /* ISA has floating-point nmadd and nmsub instructions 'd = -((a * b) [+-] c)'. */ -#define ISA_HAS_NMADD4_NMSUB4(MODE) \ - ((ISA_MIPS4 \ - || (ISA_MIPS32R2 && (MODE) == V2SFmode) \ - || ISA_MIPS64 \ - || ISA_MIPS64R2) \ - && (!TARGET_MIPS5400 || TARGET_MAD) \ - && !TARGET_MIPS16) +#define ISA_HAS_NMADD4_NMSUB4 (ISA_HAS_FP4 \ + && (!TARGET_MIPS5400 || TARGET_MAD)) /* ISA has floating-point nmadd and nmsub instructions 'c = -((a * b) [+-] c)'. */ -#define ISA_HAS_NMADD3_NMSUB3(MODE) \ - TARGET_LOONGSON_2EF +#define ISA_HAS_NMADD3_NMSUB3 TARGET_LOONGSON_2EF /* ISA has count leading zeroes/ones instruction (not implemented). */ #define ISA_HAS_CLZ_CLO ((ISA_MIPS32 \ Index: gcc-fsf-trunk-quilt/gcc/config/mips/mips.md =================================================================== --- gcc-fsf-trunk-quilt.orig/gcc/config/mips/mips.md 2013-02-07 02:35:34.004034605 +0000 +++ gcc-fsf-trunk-quilt/gcc/config/mips/mips.md 2013-02-07 02:59:48.585476315 +0000 @@ -2344,7 +2344,7 @@ (mult:ANYF (match_operand:ANYF 1 "register_operand" "f") (match_operand:ANYF 2 "register_operand" "f")) (match_operand:ANYF 3 "register_operand" "f"))))] - "ISA_HAS_NMADD4_NMSUB4 (<MODE>mode) + "ISA_HAS_NMADD4_NMSUB4 && TARGET_FUSED_MADD && HONOR_SIGNED_ZEROS (<MODE>mode) && !HONOR_NANS (<MODE>mode)" @@ -2359,7 +2359,7 @@ (mult:ANYF (match_operand:ANYF 1 "register_operand" "f") (match_operand:ANYF 2 "register_operand" "f")) (match_operand:ANYF 3 "register_operand" "0"))))] - "ISA_HAS_NMADD3_NMSUB3 (<MODE>mode) + "ISA_HAS_NMADD3_NMSUB3 && TARGET_FUSED_MADD && HONOR_SIGNED_ZEROS (<MODE>mode) && !HONOR_NANS (<MODE>mode)" @@ -2374,7 +2374,7 @@ (mult:ANYF (neg:ANYF (match_operand:ANYF 1 "register_operand" "f")) (match_operand:ANYF 2 "register_operand" "f")) (match_operand:ANYF 3 "register_operand" "f")))] - "ISA_HAS_NMADD4_NMSUB4 (<MODE>mode) + "ISA_HAS_NMADD4_NMSUB4 && TARGET_FUSED_MADD && !HONOR_SIGNED_ZEROS (<MODE>mode) && !HONOR_NANS (<MODE>mode)" @@ -2389,7 +2389,7 @@ (mult:ANYF (neg:ANYF (match_operand:ANYF 1 "register_operand" "f")) (match_operand:ANYF 2 "register_operand" "f")) (match_operand:ANYF 3 "register_operand" "0")))] - "ISA_HAS_NMADD3_NMSUB3 (<MODE>mode) + "ISA_HAS_NMADD3_NMSUB3 && TARGET_FUSED_MADD && !HONOR_SIGNED_ZEROS (<MODE>mode) && !HONOR_NANS (<MODE>mode)" @@ -2404,7 +2404,7 @@ (mult:ANYF (match_operand:ANYF 2 "register_operand" "f") (match_operand:ANYF 3 "register_operand" "f")) (match_operand:ANYF 1 "register_operand" "f"))))] - "ISA_HAS_NMADD4_NMSUB4 (<MODE>mode) + "ISA_HAS_NMADD4_NMSUB4 && TARGET_FUSED_MADD && HONOR_SIGNED_ZEROS (<MODE>mode) && !HONOR_NANS (<MODE>mode)" @@ -2419,7 +2419,7 @@ (mult:ANYF (match_operand:ANYF 2 "register_operand" "f") (match_operand:ANYF 3 "register_operand" "f")) (match_operand:ANYF 1 "register_operand" "0"))))] - "ISA_HAS_NMADD3_NMSUB3 (<MODE>mode) + "ISA_HAS_NMADD3_NMSUB3 && TARGET_FUSED_MADD && HONOR_SIGNED_ZEROS (<MODE>mode) && !HONOR_NANS (<MODE>mode)" @@ -2434,7 +2434,7 @@ (match_operand:ANYF 1 "register_operand" "f") (mult:ANYF (match_operand:ANYF 2 "register_operand" "f") (match_operand:ANYF 3 "register_operand" "f"))))] - "ISA_HAS_NMADD4_NMSUB4 (<MODE>mode) + "ISA_HAS_NMADD4_NMSUB4 && TARGET_FUSED_MADD && !HONOR_SIGNED_ZEROS (<MODE>mode) && !HONOR_NANS (<MODE>mode)" @@ -2449,7 +2449,7 @@ (match_operand:ANYF 1 "register_operand" "f") (mult:ANYF (match_operand:ANYF 2 "register_operand" "f") (match_operand:ANYF 3 "register_operand" "0"))))] - "ISA_HAS_NMADD3_NMSUB3 (<MODE>mode) + "ISA_HAS_NMADD3_NMSUB3 && TARGET_FUSED_MADD && !HONOR_SIGNED_ZEROS (<MODE>mode) && !HONOR_NANS (<MODE>mode)" Index: gcc-fsf-trunk-quilt/gcc/doc/invoke.texi =================================================================== --- gcc-fsf-trunk-quilt.orig/gcc/doc/invoke.texi 2013-02-07 02:21:07.574131467 +0000 +++ gcc-fsf-trunk-quilt/gcc/doc/invoke.texi 2013-02-07 02:59:48.585476315 +0000 @@ -16440,10 +16440,12 @@ Enable (disable) use of the floating-poi instructions, when they are available. The default is @option{-mfused-madd}. -When multiply-accumulate instructions are used, the intermediate -product is calculated to infinite precision and is not subject to -the FCSR Flush to Zero bit. This may be undesirable in some -circumstances. +On the R8000 CPU when multiply-accumulate instructions are used, +the intermediate product is calculated to infinite precision +and is not subject to the FCSR Flush to Zero bit. This may be +undesirable in some circumstances. On other processors the result +is numerically identical to the equivalent computation using +separate multiply, add, subtract and negate instructions. @item -nocpp @opindex nocpp