From patchwork Fri Mar 30 14:18:58 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Teresa Johnson
X-Patchwork-Id: 149662
To: reply@codereview.appspotmail.com,gcc-patches@gcc.gnu.org
Subject: [Patch, i386] Avoid LCP stalls (issue5975045)
Message-Id: <20120330141858.420D5615CC@tjsboxrox.mtv.corp.google.com>
Date: Fri, 30 Mar 2012 07:18:58 -0700 (PDT)
From: tejohnson@google.com (Teresa Johnson)

This patch addresses instructions that incur expensive length-changing
prefix (LCP) stalls on some x86-64 implementations, notably Core2 and
Corei7. Specifically, a move of a 16-bit constant into memory requires
a length-changing prefix and can incur significant penalties. The
attached patch avoids this by forcing such instructions to be split
into two: a move of the corresponding 32-bit constant into a register,
and a move of the register's lower 16 bits into memory.

Bootstrapped and tested on x86_64-unknown-linux-gnu. Is this ok for
trunk?

Thanks,
Teresa

2012-03-29  Teresa Johnson

	* config/i386/i386.h (ix86_tune_indices): Add X86_TUNE_LCP_STALL.
	* config/i386/i386.md (movhi_internal): Split to movhi_internal and
	movhi_imm_internal.
	* config/i386/i386.c (initial_ix86_tune_features): Initialize
	X86_TUNE_LCP_STALL entry.

---
This patch is available for review at http://codereview.appspot.com/5975045

Index: config/i386/i386.h
===================================================================
--- config/i386/i386.h (revision 185920)
+++ config/i386/i386.h (working copy)
@@ -262,6 +262,7 @@ enum ix86_tune_indices {
   X86_TUNE_MOVX,
   X86_TUNE_PARTIAL_REG_STALL,
   X86_TUNE_PARTIAL_FLAG_REG_STALL,
+  X86_TUNE_LCP_STALL,
   X86_TUNE_USE_HIMODE_FIOP,
   X86_TUNE_USE_SIMODE_FIOP,
   X86_TUNE_USE_MOV0,
@@ -340,6 +341,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_L
 #define TARGET_PARTIAL_REG_STALL ix86_tune_features[X86_TUNE_PARTIAL_REG_STALL]
 #define TARGET_PARTIAL_FLAG_REG_STALL \
   ix86_tune_features[X86_TUNE_PARTIAL_FLAG_REG_STALL]
+#define TARGET_LCP_STALL \
+  ix86_tune_features[X86_TUNE_LCP_STALL]
 #define TARGET_USE_HIMODE_FIOP ix86_tune_features[X86_TUNE_USE_HIMODE_FIOP]
 #define TARGET_USE_SIMODE_FIOP ix86_tune_features[X86_TUNE_USE_SIMODE_FIOP]
 #define TARGET_USE_MOV0 ix86_tune_features[X86_TUNE_USE_MOV0]
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md (revision 185920)
+++ config/i386/i386.md (working copy)
@@ -2262,9 +2262,19 @@
     ]
     (const_string "SI")))])
 
+(define_insn "*movhi_imm_internal"
+  [(set (match_operand:HI 0 "memory_operand" "=m")
+        (match_operand:HI 1 "immediate_operand" "n"))]
+  "!TARGET_LCP_STALL && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
+{
+  return "mov{w}\t{%1, %0|%0, %1}";
+}
+  [(set (attr "type") (const_string "imov"))
+   (set (attr "mode") (const_string "HI"))])
+
 (define_insn "*movhi_internal"
   [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,r,m")
-        (match_operand:HI 1 "general_operand" "r,rn,rm,rn"))]
+        (match_operand:HI 1 "general_operand" "r,rn,rm,r"))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))"
 {
   switch (get_attr_type (insn))
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c (revision 185920)
+++ config/i386/i386.c (working copy)
@@ -1964,6 +1964,11 @@ static unsigned int initial_ix86_tune_features[X86
   /* X86_TUNE_PARTIAL_FLAG_REG_STALL */
   m_CORE2I7 | m_GENERIC,
 
+  /* X86_TUNE_LCP_STALL: Avoid an expensive length-changing prefix stall
+   * on 16-bit immediate moves into memory on Core2 and Corei7,
+   * which may also affect AMD implementations.  */
+  m_CORE2I7 | m_GENERIC | m_AMD_MULTIPLE,
+
   /* X86_TUNE_USE_HIMODE_FIOP */
   m_386 | m_486 | m_K6_GEODE,
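
For readers less familiar with the stall, here is a small C sketch of
the two code shapes the description talks about. It is illustration
only and not part of the patch: the function names store_imm16 and
store_via_reg are made up, and the assembly and byte encodings in the
comments are what one would typically expect for these stores, not
verified output from a patched compiler.

```c
/* Illustration only; not part of the patch.  Function names are
   hypothetical.  The assembly in the comments is the expected shape
   of the code, not captured compiler output.  */
#include <stdint.h>

/* A 16-bit constant store.  This can be emitted as one instruction:
       movw $0x1234, (%rdi)      # bytes: 66 c7 07 34 12
   Without the 0x66 operand-size prefix, opcode c7 takes a 4-byte
   immediate; with it, the immediate is 2 bytes.  The prefix therefore
   changes the instruction length, which is the LCP case.  */
void
store_imm16 (uint16_t *p)
{
  *p = 0x1234;
}

/* The same store written as the two steps described in the message:
       movl $0x1234, %eax        # bytes: b8 34 12 00 00
       movw %ax, (%rdi)          # bytes: 66 89 07
   The second instruction still carries a 0x66 prefix, but it has no
   immediate, so the prefix does not change its length.  */
void
store_via_reg (uint16_t *p)
{
  uint32_t tmp = 0x1234;   /* 32-bit constant into a register */
  *p = (uint16_t) tmp;     /* store only the low 16 bits */
}
```

Both functions store the same value, and an optimizing compiler is
free to emit identical code for them; the sketch only names the two
instruction sequences, it does not force either one.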