From patchwork Wed Feb 20 23:50:31 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Maciej W. Rozycki"
X-Patchwork-Id: 222162
Date: Wed, 20 Feb 2013 23:50:31 +0000
From: "Maciej W. Rozycki"
To: Richard Sandiford
CC: Steve Ellcey
Subject: [PATCH] MIPS: MIPS32r2 FP MADD instruction set support
User-Agent: Alpine 1.10 (DEB 962 2008-03-14)
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org

Hi,

This issue was originally raised here:

http://gcc.gnu.org/ml/gcc-patches/2012-12/msg00863.html

We have a shortcoming in GCC in that we allow the use of only half of
the FP MADD instruction subset (MADD.fmt and MSUB.fmt) on MIPS32r2
processors, and then only in the 64-bit/32-register mode
(CP0.Status.FR == 1). Furthermore we never enable the other half
(NMADD.fmt and NMSUB.fmt) on those processors. However this whole
instruction subset is always available on MIPS32r2 FPUs regardless of
the mode selected, just as it always has been on FPUs of the 64-bit ISA
line from MIPS IV up.

The paired-single format, however, is indeed only available in the
64-bit/32-register mode, from the MIPS V ISA up. Yet for some (or no)
reason we explicitly allow it for NMADD.PS and NMSUB.PS on MIPS32r2
processors in the 32-bit/16-register FPU mode (this is probably
globally overridden elsewhere).

I'm not sure where this GCC limitation came from, but there were typos
in the formats listed for the MSUB.S, MSUB.D, NMADD.S, NMADD.D, NMSUB.S
and NMSUB.D instructions up to and including rev. 2.50 of vol. II of the
MIPS32r2 architecture documentation set (MIPS doc #MD00086). This may
or may not have contributed to this problem as these instructions were
listed as available from "MIPS64" up rather than from "MIPS64, MIPS32
Release 2" up, with no mention of the FPU mode there.

The change below lifts the relevant restrictions, removing a lot of
clutter that's no longer needed now that the data mode does not have to
be checked.
Also, according to MIPS IV ISA documentation these operations are only
fused (i.e. don't match original IEEE 754-1985 accuracy requirements) on
the original MIPS IV R8000 CPU, and MIPS architecture specs don't
mention any limitations of these instructions either, so I have updated
the GCC manual to document that on non-R8000 CPUs (which are the ones
we really care about) they are numerically equivalent to computations
made with the corresponding individual operations.

Finally, while at it, I found it interesting that we have separate
conditions to cover MADD.fmt/MSUB.fmt (ISA_HAS_FP_MADD4_MSUB4) and
NMADD.fmt/NMSUB.fmt (ISA_HAS_NMADD4_NMSUB4) while all four instructions
need to be implemented as a whole group per data format supported and
cannot be separated (the MIPS architecture specification explicitly
forbids subsetting). The difference between the two conditions is that
the former expands to ISA_HAS_FP4, that is, it enables the subset for
any MIPS IV and up FPU, while the latter has an extra
"&& (!TARGET_MIPS5400 || TARGET_MAD)" qualifier.

I went ahead and checked the available NEC VR54xx documentation and
here's what I came up with:

1. "VR5400 MIPS RISC Microprocessor Family" datasheet (NEC doc #13362)
   says:

   "The VR5400 processor family complies with the MIPS IV instruction
   set and IEEE-754 floating-point and IEEE-1149.1/1149.1a JTAG
   specification, [...]"

2. "VR5432 MIPS RISC Microprocessor User's Manual, Volume 2" (NEC doc
   #13751) lists all the individual MADD.fmt, MSUB.fmt, NMADD.fmt and
   NMSUB.fmt instructions in Chapter 18 "Floating-Point Unit
   Instruction Set" with no restrictions as to their availability (the
   only other member of the VR54xx family I know of is the VR5464 that
   is a high-performance version of the VR5432 and is fully software
   compatible).

Further to that TARGET_MAD controls whether to "Use PMC-style 'mad'
instructions" that are all CPU rather than FPU instructions.
The VR5432 indeed supports extra integer multiply-accumulate
instructions, as documented in #2 above; these are the
MACC/MACCHI/MACCHIU/MACCU and MSAC/MSACHI/MSACHIU/MSACU instructions as
roughly covered by our ISA_HAS_MACC, ISA_HAS_MSAC and ISA_HAS_MACCHI
knobs (the last of these is not implied for TARGET_MIPS5400, perhaps
because the family does not support the doubleword variants).

All in all it looks to me like a misplaced hunk. It was introduced in
rev. 56471 (you were named as one of the contributors on that commit,
so you may be able to remember and/or correct me if I am wrong here
anywhere) and it looks to me like it should have been applied to the
ISA_HAS_MADD_MSUB macro instead, which is still just a few lines above
ISA_HAS_NMADD4_NMSUB4 (and was even closer to ISA_HAS_NMADD_NMSUB, as
the latter was then called; the bodies were close enough back then for
a hunk to apply cleanly to either).

These days we handle ISA_HAS_MADD_MSUB indirectly through
GENERATE_MADD_MSUB and in many more places than back at rev. 56471. We
also handle TARGET_MAD and ISA_HAS_MACC/ISA_HAS_MSAC/ISA_HAS_MACCHI
explicitly throughout mips.md, so I think we should simply discard this
incorrect condition, and then, as ISA_HAS_FP_MADD4_MSUB4 and
ISA_HAS_NMADD4_NMSUB4 will have become identical, fold the two macros
into one, perhaps ISA_HAS_FP_MADD4. And likewise ISA_HAS_FP_MADD3.
Thoughts?

Back to the change considered here, it was successfully
regression-tested with the gcc, g++ and libstdc++ testsuites, with the
mips-linux-gnu target and the o32/mips32r2 and n64/mips64r2 multilibs,
both endiannesses. I examined some of the test cases executed to verify
that the instructions concerned are now emitted as appropriate where
previously they were not.

I hope this change is OK to apply as soon as 4.9 has opened.

BTW, do you happen to know a way to reliably force all our testsuites
NOT to delete executables after a run?
Personally I think it's missing the point to have them deleted -- how
can one debug any regressions then? I have a rather gross hack patching
up most of the TCL scripts throughout to remove the various instances
of file deletion commands and stick the -keep-output option onto
dg-test calls, but there always seems to be something that escapes. I
find it very frustrating. What do other people do? I can't believe all
the GCC developers are happy to accept this pain.

2013-02-20  Maciej W. Rozycki

	gcc/
	* config/mips/mips.h (ISA_HAS_FP4): Don't restrict ISA_MIPS32R2
	to TARGET_FLOAT64.
	(ISA_HAS_NMADD4_NMSUB4): Remove the MODE argument; don't
	restrict ISA_MIPS32R2.
	(ISA_HAS_NMADD3_NMSUB3): Remove the MODE argument.
	* config/mips/mips.c (mips_rtx_costs): Update according to
	changes to ISA_HAS_NMADD4_NMSUB4 and ISA_HAS_NMADD3_NMSUB3.
	* config/mips/mips.md (nmadd4, nmadd3): Likewise.
	(nmadd4_fastmath, nmadd3_fastmath): Likewise.
	(nmsub4, nmsub3): Likewise.
	(nmsub4_fastmath, nmsub3_fastmath): Likewise.
	* doc/invoke.texi (MIPS Options): Update documentation of the
	floating-point multiply-accumulate instruction restrictions.
  Maciej

gcc-mips32r2-madd.patch
Index: gcc-fsf-trunk-quilt/gcc/config/mips/mips.c
===================================================================
--- gcc-fsf-trunk-quilt.orig/gcc/config/mips/mips.c	2013-02-07 02:59:05.465114046 +0000
+++ gcc-fsf-trunk-quilt/gcc/config/mips/mips.c	2013-02-07 02:59:48.575511623 +0000
@@ -3798,7 +3798,7 @@ mips_rtx_costs (rtx x, int code, int out
 
     case MINUS:
       if (float_mode_p
-          && (ISA_HAS_NMADD4_NMSUB4 (mode) || ISA_HAS_NMADD3_NMSUB3 (mode))
+          && (ISA_HAS_NMADD4_NMSUB4 || ISA_HAS_NMADD3_NMSUB3)
           && TARGET_FUSED_MADD
           && !HONOR_NANS (mode)
           && !HONOR_SIGNED_ZEROS (mode))
@@ -3850,7 +3850,7 @@ mips_rtx_costs (rtx x, int code, int out
 
     case NEG:
       if (float_mode_p
-          && (ISA_HAS_NMADD4_NMSUB4 (mode) || ISA_HAS_NMADD3_NMSUB3 (mode))
+          && (ISA_HAS_NMADD4_NMSUB4 || ISA_HAS_NMADD3_NMSUB3)
           && TARGET_FUSED_MADD
           && !HONOR_NANS (mode)
           && HONOR_SIGNED_ZEROS (mode))
Index: gcc-fsf-trunk-quilt/gcc/config/mips/mips.h
===================================================================
--- gcc-fsf-trunk-quilt.orig/gcc/config/mips/mips.h	2013-02-07 02:35:34.024073830 +0000
+++ gcc-fsf-trunk-quilt/gcc/config/mips/mips.h	2013-02-07 02:59:48.575511623 +0000
@@ -855,7 +855,7 @@ struct mips_cpu_info {
    FP madd and msub instructions, and the FP recip and recip sqrt
    instructions.  */
 #define ISA_HAS_FP4             ((ISA_MIPS4                             \
-                                  || (ISA_MIPS32R2 && TARGET_FLOAT64)   \
+                                  || ISA_MIPS32R2                       \
                                   || ISA_MIPS64                         \
                                   || ISA_MIPS64R2)                      \
                                  && !TARGET_MIPS16)
@@ -885,18 +885,12 @@ struct mips_cpu_info {
 
 /* ISA has floating-point nmadd and nmsub instructions
    'd = -((a * b) [+-] c)'.  */
-#define ISA_HAS_NMADD4_NMSUB4(MODE)                                     \
-  ((ISA_MIPS4                                                           \
-    || (ISA_MIPS32R2 && (MODE) == V2SFmode)                             \
-    || ISA_MIPS64                                                       \
-    || ISA_MIPS64R2)                                                    \
-   && (!TARGET_MIPS5400 || TARGET_MAD)                                  \
-   && !TARGET_MIPS16)
+#define ISA_HAS_NMADD4_NMSUB4   (ISA_HAS_FP4                            \
+                                 && (!TARGET_MIPS5400 || TARGET_MAD))
 
 /* ISA has floating-point nmadd and nmsub instructions
    'c = -((a * b) [+-] c)'.  */
-#define ISA_HAS_NMADD3_NMSUB3(MODE)                                     \
-  TARGET_LOONGSON_2EF
+#define ISA_HAS_NMADD3_NMSUB3   TARGET_LOONGSON_2EF
 
 /* ISA has count leading zeroes/ones instruction (not implemented).  */
 #define ISA_HAS_CLZ_CLO         ((ISA_MIPS32                            \
Index: gcc-fsf-trunk-quilt/gcc/config/mips/mips.md
===================================================================
--- gcc-fsf-trunk-quilt.orig/gcc/config/mips/mips.md	2013-02-07 02:35:34.004034605 +0000
+++ gcc-fsf-trunk-quilt/gcc/config/mips/mips.md	2013-02-07 02:59:48.585476315 +0000
@@ -2344,7 +2344,7 @@
                    (mult:ANYF (match_operand:ANYF 1 "register_operand" "f")
                               (match_operand:ANYF 2 "register_operand" "f"))
                    (match_operand:ANYF 3 "register_operand" "f"))))]
-  "ISA_HAS_NMADD4_NMSUB4 (mode)
+  "ISA_HAS_NMADD4_NMSUB4
    && TARGET_FUSED_MADD
    && HONOR_SIGNED_ZEROS (mode)
    && !HONOR_NANS (mode)"
@@ -2359,7 +2359,7 @@
                    (mult:ANYF (match_operand:ANYF 1 "register_operand" "f")
                               (match_operand:ANYF 2 "register_operand" "f"))
                    (match_operand:ANYF 3 "register_operand" "0"))))]
-  "ISA_HAS_NMADD3_NMSUB3 (mode)
+  "ISA_HAS_NMADD3_NMSUB3
    && TARGET_FUSED_MADD
    && HONOR_SIGNED_ZEROS (mode)
    && !HONOR_NANS (mode)"
@@ -2374,7 +2374,7 @@
          (mult:ANYF (neg:ANYF (match_operand:ANYF 1 "register_operand" "f"))
                     (match_operand:ANYF 2 "register_operand" "f"))
          (match_operand:ANYF 3 "register_operand" "f")))]
-  "ISA_HAS_NMADD4_NMSUB4 (mode)
+  "ISA_HAS_NMADD4_NMSUB4
    && TARGET_FUSED_MADD
    && !HONOR_SIGNED_ZEROS (mode)
    && !HONOR_NANS (mode)"
@@ -2389,7 +2389,7 @@
          (mult:ANYF (neg:ANYF (match_operand:ANYF 1 "register_operand" "f"))
                     (match_operand:ANYF 2 "register_operand" "f"))
          (match_operand:ANYF 3 "register_operand" "0")))]
-  "ISA_HAS_NMADD3_NMSUB3 (mode)
+  "ISA_HAS_NMADD3_NMSUB3
    && TARGET_FUSED_MADD
    && !HONOR_SIGNED_ZEROS (mode)
    && !HONOR_NANS (mode)"
@@ -2404,7 +2404,7 @@
                    (mult:ANYF (match_operand:ANYF 2 "register_operand" "f")
                               (match_operand:ANYF 3 "register_operand" "f"))
                    (match_operand:ANYF 1 "register_operand" "f"))))]
-  "ISA_HAS_NMADD4_NMSUB4 (mode)
+  "ISA_HAS_NMADD4_NMSUB4
    && TARGET_FUSED_MADD
    && HONOR_SIGNED_ZEROS (mode)
    && !HONOR_NANS (mode)"
@@ -2419,7 +2419,7 @@
                    (mult:ANYF (match_operand:ANYF 2 "register_operand" "f")
                               (match_operand:ANYF 3 "register_operand" "f"))
                    (match_operand:ANYF 1 "register_operand" "0"))))]
-  "ISA_HAS_NMADD3_NMSUB3 (mode)
+  "ISA_HAS_NMADD3_NMSUB3
    && TARGET_FUSED_MADD
    && HONOR_SIGNED_ZEROS (mode)
    && !HONOR_NANS (mode)"
@@ -2434,7 +2434,7 @@
          (match_operand:ANYF 1 "register_operand" "f")
          (mult:ANYF (match_operand:ANYF 2 "register_operand" "f")
                     (match_operand:ANYF 3 "register_operand" "f"))))]
-  "ISA_HAS_NMADD4_NMSUB4 (mode)
+  "ISA_HAS_NMADD4_NMSUB4
    && TARGET_FUSED_MADD
    && !HONOR_SIGNED_ZEROS (mode)
    && !HONOR_NANS (mode)"
@@ -2449,7 +2449,7 @@
          (match_operand:ANYF 1 "register_operand" "f")
          (mult:ANYF (match_operand:ANYF 2 "register_operand" "f")
                     (match_operand:ANYF 3 "register_operand" "0"))))]
-  "ISA_HAS_NMADD3_NMSUB3 (mode)
+  "ISA_HAS_NMADD3_NMSUB3
    && TARGET_FUSED_MADD
    && !HONOR_SIGNED_ZEROS (mode)
    && !HONOR_NANS (mode)"
Index: gcc-fsf-trunk-quilt/gcc/doc/invoke.texi
===================================================================
--- gcc-fsf-trunk-quilt.orig/gcc/doc/invoke.texi	2013-02-07 02:21:07.574131467 +0000
+++ gcc-fsf-trunk-quilt/gcc/doc/invoke.texi	2013-02-07 02:59:48.585476315 +0000
@@ -16440,10 +16440,12 @@ Enable (disable) use of the floating-poi
 instructions, when they are available.  The default is
 @option{-mfused-madd}.
 
-When multiply-accumulate instructions are used, the intermediate
-product is calculated to infinite precision and is not subject to
-the FCSR Flush to Zero bit.  This may be undesirable in some
-circumstances.
+On the R8000 CPU when multiply-accumulate instructions are used,
+the intermediate product is calculated to infinite precision
+and is not subject to the FCSR Flush to Zero bit.  This may be
+undesirable in some circumstances.  On other processors the result
+is numerically identical to the equivalent computation using
+separate multiply, add, subtract and negate instructions.
 
 @item -nocpp
 @opindex nocpp