From patchwork Wed Nov 10 23:08:04 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 70719
Message-ID: <4CDB25D4.9080204@redhat.com>
Date: Wed, 10 Nov 2010 15:08:04 -0800
From: Richard Henderson
To: "gcc-patches >> GCC Patches"
CC: rguenther@suse.de
Subject: [patch 3/N][i386] -mfused-madd cleanup
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org

This patch continues the elimination of -mfused-madd in favor of
-ffp-contract, converting the i386 backend.

This target was already mostly tidied up.  Only the -mfused-madd
splitters remain to be eliminated.

I won't check this in until its prerequisites are approved.


r~

From 40c1cbaf9c329bfcad531ce2ec34cd87ebaba7c9 Mon Sep 17 00:00:00 2001
From: Richard Henderson
Date: Wed, 10 Nov 2010 13:47:26 -0800
Subject: [PATCH 3/3] i386: move -mfused-madd to -ffp-contract.

Delete the TARGET_FUSED_MADD splitters.
---
 gcc/config.gcc                         |    2 +
 gcc/config/i386/i386.c                 |    3 +-
 gcc/config/i386/i386.opt               |    6 --
 gcc/config/i386/sse.md                 |   98 +------------------------------
 gcc/testsuite/gcc.target/i386/sse-24.c |    2 +-
 5 files changed, 8 insertions(+), 103 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 6a6e47d..1865c86 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -311,6 +311,7 @@ i[34567]86-*-*)
         cpu_type=i386
         c_target_objs="i386-c.o"
         cxx_target_objs="i386-c.o"
+        extra_options="${extra_options} fused-madd.opt"
         extra_headers="cpuid.h mmintrin.h mm3dnow.h xmmintrin.h emmintrin.h
                        pmmintrin.h tmmintrin.h ammintrin.h smmintrin.h
                        nmmintrin.h bmmintrin.h fma4intrin.h wmmintrin.h
@@ -322,6 +323,7 @@ x86_64-*-*)
         cpu_type=i386
         c_target_objs="i386-c.o"
         cxx_target_objs="i386-c.o"
+        extra_options="${extra_options} fused-madd.opt"
         extra_headers="cpuid.h mmintrin.h mm3dnow.h xmmintrin.h emmintrin.h
                        pmmintrin.h tmmintrin.h ammintrin.h smmintrin.h
                        nmmintrin.h bmmintrin.h fma4intrin.h wmmintrin.h
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 138fb3f..13590bc 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -34394,8 +34394,7 @@ ix86_autovectorize_vector_sizes (void)
 #define TARGET_DEFAULT_TARGET_FLAGS     \
   (TARGET_DEFAULT                       \
    | TARGET_SUBTARGET_DEFAULT           \
-   | TARGET_TLS_DIRECT_SEG_REFS_DEFAULT \
-   | MASK_FUSED_MADD)
+   | TARGET_TLS_DIRECT_SEG_REFS_DEFAULT)
 
 #undef TARGET_HANDLE_OPTION
 #define TARGET_HANDLE_OPTION ix86_handle_option
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 28a921f..f55c96a 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -261,12 +261,6 @@ Target Report Mask(VZEROUPPER) Save
 Generate vzeroupper instruction before a transfer of control flow out of
 the function.
 
-mfused-madd
-Target Report Mask(FUSED_MADD) Save
-Enable automatic generation of fused floating point multiply-add instructions
-if the ISA supports such instructions. The -mfused-madd option is on by
-default.
-
 mdispatch-scheduler
 Target RejectNegative Var(flag_dispatch_scheduler)
 Do dispatch scheduling if processor is bdver1 and Haifa scheduling
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 717f7fe..279f111 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1856,6 +1856,10 @@
 ;; (set (reg1) (mem (addr1)))
 ;; (set (reg2) (mult (reg1) (mem (addr2))))
 ;; (set (reg3) (plus (reg2) (mem (addr3))))
+;;
+;; ??? This is historic, pre-dating the gimple fma transformation.
+;; We could now properly represent that only one memory operand is
+;; allowed and not be penalized during optimization.
 
 ;; Intrinsic FMA operations.
 
@@ -2180,100 +2184,6 @@
 
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;
-;; Non-intrinsic versions, matched when fused-multiply-add is allowed.
-;;
-;; ??? If fused-madd were a generic flag, combine could do this without
-;; needing splitters here in the backend. Irritatingly, combine won't
-;; recognize many of these with mere splits, since only 3 or more insns
-;; are allowed to split during combine. Thankfully, there's always a
-;; split_all_insns pass that runs before reload.
-;;
-;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
-
-(define_insn_and_split "*split_fma"
-  [(set (match_operand:FMAMODE 0 "register_operand")
-        (plus:FMAMODE
-          (mult:FMAMODE
-            (match_operand:FMAMODE 1 "nonimmediate_operand")
-            (match_operand:FMAMODE 2 "nonimmediate_operand"))
-          (match_operand:FMAMODE 3 "nonimmediate_operand")))]
-  "TARGET_SSE_MATH && TARGET_FUSED_MADD
-   && (TARGET_FMA || TARGET_FMA4)
-   && !(reload_in_progress || reload_completed)"
-  { gcc_unreachable (); }
-  "&& 1"
-  [(set (match_dup 0)
-        (fma:FMAMODE
-          (match_dup 1)
-          (match_dup 2)
-          (match_dup 3)))]
-  "")
-
-;; Floating multiply and subtract.
-(define_insn_and_split "*split_fms"
-  [(set (match_operand:FMAMODE 0 "register_operand")
-        (minus:FMAMODE
-          (mult:FMAMODE
-            (match_operand:FMAMODE 1 "nonimmediate_operand")
-            (match_operand:FMAMODE 2 "nonimmediate_operand"))
-          (match_operand:FMAMODE 3 "nonimmediate_operand")))]
-  "TARGET_SSE_MATH && TARGET_FUSED_MADD
-   && (TARGET_FMA || TARGET_FMA4)
-   && !(reload_in_progress || reload_completed)"
-  { gcc_unreachable (); }
-  "&& 1"
-  [(set (match_dup 0)
-        (fma:FMAMODE
-          (match_dup 1)
-          (match_dup 2)
-          (neg:FMAMODE (match_dup 3))))]
-  "")
-
-;; Floating point negative multiply and add.
-;; Recognize (-a * b + c) via the canonical form: c - (a * b).
-(define_insn_and_split "*split_fnma"
-  [(set (match_operand:FMAMODE 0 "register_operand")
-        (minus:FMAMODE
-          (match_operand:FMAMODE 3 "nonimmediate_operand")
-          (mult:FMAMODE
-            (match_operand:FMAMODE 1 "nonimmediate_operand")
-            (match_operand:FMAMODE 2 "nonimmediate_operand"))))]
-  "TARGET_SSE_MATH && TARGET_FUSED_MADD
-   && (TARGET_FMA || TARGET_FMA4)
-   && !(reload_in_progress || reload_completed)"
-  { gcc_unreachable (); }
-  "&& 1"
-  [(set (match_dup 0)
-        (fma:FMAMODE
-          (neg:FMAMODE (match_dup 1))
-          (match_dup 2)
-          (match_dup 3)))]
-  "")
-
-;; Floating point negative multiply and subtract.
-;; Recognize (-a * b - c) via the canonical form: c - (-a * b).
-(define_insn_and_split "*split_fnms"
-  [(set (match_operand:FMAMODE 0 "register_operand")
-        (minus:FMAMODE
-          (mult:FMAMODE
-            (neg:FMAMODE
-              (match_operand:FMAMODE 1 "nonimmediate_operand"))
-            (match_operand:FMAMODE 2 "nonimmediate_operand"))
-          (match_operand:FMAMODE 3 "nonimmediate_operand")))]
-  "TARGET_SSE_MATH && TARGET_FUSED_MADD
-   && (TARGET_FMA || TARGET_FMA4)
-   && !(reload_in_progress || reload_completed)"
-  { gcc_unreachable (); }
-  "&& 1"
-  [(set (match_dup 0)
-        (fma:FMAMODE
-          (neg:FMAMODE (match_dup 1))
-          (match_dup 2)
-          (neg:FMAMODE (match_dup 3))))]
-  "")
-
-;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
-;;
 ;; Parallel single-precision floating point conversion operations
 ;;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
diff --git a/gcc/testsuite/gcc.target/i386/sse-24.c b/gcc/testsuite/gcc.target/i386/sse-24.c
index d18b08e..daeb968 100644
--- a/gcc/testsuite/gcc.target/i386/sse-24.c
+++ b/gcc/testsuite/gcc.target/i386/sse-24.c
@@ -1,5 +1,5 @@
 /* PR target/44338 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -mno-fused-madd" } */
+/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -ffp-contract=off" } */
 
 #include "sse-23.c"