From patchwork Wed Nov  9 19:56:15 2011
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 124689
Message-ID: <4EBADADF.5060602@redhat.com>
Date: Wed, 09 Nov 2011 11:56:15 -0800
From: Richard Henderson
To: GCC Patches
CC: Rainer Orth
Subject: [libitm] avoid non-portable branch mnemonics

I said elsewhere that I would convert this to __atomic, but then I re-read
my commentary about using cmpxchg *without* a lock prefix.  What we're
looking for here is more or less non-interruptible, rather than atomic.
And apparently I benchmarked this a while back as a 10x performance
improvement.

Seems like the easiest thing is simply to use .byte instead of ,pn.

Committed.


r~

commit f3210a53394de39a8aa74ec9dcb23f2cc0551322
Author: rth
Date:   Wed Nov 9 19:51:49 2011 +0000

    libitm: Avoid non-portable x86 branch prediction mnemonic.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@181233 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libitm/ChangeLog b/libitm/ChangeLog
index e78716d..0501d16 100644
--- a/libitm/ChangeLog
+++ b/libitm/ChangeLog
@@ -1,5 +1,8 @@
 2011-11-09  Richard Henderson
 
+	* config/x86/cacheline.h (gtm_cacheline::store_mask): Use .byte
+	to emit branch prediction hint.
+
 	* config/x86/sjlj.S: Protect elf directives with __ELF__.
 	Protect .note.GNU-stack with __linux__.
 
diff --git a/libitm/config/x86/cacheline.h b/libitm/config/x86/cacheline.h
index 15a95b0..f91d7cc 100644
--- a/libitm/config/x86/cacheline.h
+++ b/libitm/config/x86/cacheline.h
@@ -144,7 +144,7 @@ gtm_cacheline::operator= (const gtm_cacheline & __restrict s)
 }
 #endif
 
-// ??? Support masked integer stores more efficiently with an unlocked cmpxchg
+// Support masked integer stores more efficiently with an unlocked cmpxchg
 // insn.  My reasoning is that while we write to locations that we do not wish
 // to modify, we do it in an uninterruptable insn, and so we either truely
 // write back the original data or the insn fails -- unlike with a
@@ -171,7 +171,8 @@ gtm_cacheline::store_mask (uint32_t *d, uint32_t s, uint8_t m)
	 "and	%[m], %[n]\n\t"
	 "or	%[s], %[n]\n\t"
	 "cmpxchg %[n], %[d]\n\t"
-	 "jnz,pn 0b"
+	 ".byte	0x2e\n\t"		// predict not-taken, aka jnz,pn
+	 "jnz	0b"
	 : [d] "+m"(*d), [n] "=&r" (n), [o] "+a"(o)
	 : [s] "r" (s & bm), [m] "r" (~bm));
 }
@@ -198,7 +199,8 @@ gtm_cacheline::store_mask (uint64_t *d, uint64_t s, uint8_t m)
	 "and	%[m], %[n]\n\t"
	 "or	%[s], %[n]\n\t"
	 "cmpxchg %[n], %[d]\n\t"
-	 "jnz,pn 0b"
+	 ".byte	0x2e\n\t"		// predict not-taken, aka jnz,pn
+	 "jnz	0b"
	 : [d] "+m"(*d), [n] "=&r" (n), [o] "+a"(o)
	 : [s] "r" (s & bm), [m] "r" (~bm));
 #else
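[Editor's note: for context, 0x2e is the CS segment-override prefix; placed
immediately before a conditional jump, some x86 implementations (notably the
Pentium 4) treat it as a static "branch not taken" hint, and any assembler
will emit the raw byte even if it does not understand the ,pn suffix.  Below
is a standalone, hedged sketch of the masked-store idiom the patch touches,
written outside the libitm tree; the names store_mask_u32 and expand_mask and
the mask-expansion helper are illustrative, not part of the patch.]

/* Standalone sketch of the technique; not the libitm sources.  */

#include <stdint.h>

/* Expand a 4-bit byte-select mask into a 32-bit mask, 0xff per chosen byte.  */
static inline uint32_t
expand_mask (uint8_t m)
{
  uint32_t bm = 0;
  for (int i = 0; i < 4; ++i)
    if (m & (1u << i))
      bm |= (uint32_t) 0xff << (i * 8);
  return bm;
}

/* Store the bytes of S selected by M into *D.  The merge of old and new
   bytes is committed with a cmpxchg carrying no LOCK prefix: the rewrite of
   the unselected bytes happens inside one non-interruptible instruction,
   which either writes back the original data exactly or fails and retries.  */
static inline void
store_mask_u32 (uint32_t *d, uint32_t s, uint8_t m)
{
  uint32_t bm = expand_mask (m);
  if (bm == 0)
    return;                     /* nothing selected */
  if (bm == 0xffffffffu)
    {
      *d = s;                   /* full store needs no merge */
      return;
    }

  uint32_t n, o = *d;
  __asm__ ("\n0:\t"
           "mov %[o], %[n]\n\t"         /* n = current value (from %eax)   */
           "and %[m], %[n]\n\t"         /* keep the unselected bytes       */
           "or  %[s], %[n]\n\t"         /* merge in the selected bytes     */
           "cmpxchg %[n], %[d]\n\t"     /* commit only if *d still equals o */
           ".byte 0x2e\n\t"             /* CS prefix: predict not-taken, aka jnz,pn */
           "jnz 0b"                     /* cmpxchg reloaded %eax; retry    */
           : [d] "+m" (*d), [n] "=&r" (n), [o] "+a" (o)
           : [s] "r" (s & bm), [m] "r" (~bm));
}

[Compiling this with gcc -O2 -c and disassembling should show a 2e byte
immediately in front of the jne, which is all the patch relies on: the hint
byte is emitted portably, with no assembler support for ,pn required.]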