From patchwork Mon Nov 28 14:02:52 2011
From: Alan Modra
Date: Tue, 29 Nov 2011 00:32:52 +1030
To: Jakub Jelinek, gcc-patches@gcc.gnu.org, Richard Henderson
Subject: Re: Fix PR51298, libgomp barrier failure
Message-ID: <20111128140252.GK7827@bubble.grove.modra.org>
References: <20111125000328.GD5085@bubble.grove.modra.org> <20111125073839.GN27242@tyan-ft48-01.lab.bos.redhat.com> <20111127234209.GC7827@bubble.grove.modra.org> <20111128081327.GU27242@tyan-ft48-01.lab.bos.redhat.com>
In-Reply-To: <20111128093901.GJ7827@bubble.grove.modra.org>
User-Agent: Mutt/1.5.20 (2009-06-14)

On Mon, Nov 28, 2011 at 08:09:01PM +1030, Alan Modra wrote:
> On Mon, Nov 28, 2011 at 09:13:27AM +0100, Jakub Jelinek wrote:
> > On Mon, Nov 28, 2011 at 10:12:09AM +1030, Alan Modra wrote:
> > > --- libgomp/config/linux/bar.c	(revision 181718)
> > > +++ libgomp/config/linux/bar.c	(working copy)
> > > @@ -36,18 +36,15 @@ gomp_barrier_wait_end (gomp_barrier_t *b
> > >    if (__builtin_expect ((state & 1) != 0, 0))
> > >      {
> > >        /* Next time we'll be awaiting TOTAL threads again.  */
> > > -      bar->awaited = bar->total;
> > > -      atomic_write_barrier ();
> > > -      bar->generation += 4;
> > > +      __atomic_store_4 (&bar->awaited, bar->total, MEMMODEL_RELEASE);
> > > +      __atomic_add_fetch (&bar->generation, 4, MEMMODEL_ACQ_REL);
> > >        futex_wake ((int *) &bar->generation, INT_MAX);
> > >      }
> >
> > The above is unfortunate, bar->generation should be only modified from a
> > single thread at this point, but the __atomic_add_fetch will enforce there
> > an atomic instruction on it rather than just some kind of barrier.
>
> I will try without the atomic add and see what happens.

Seems to be fine.  I took out the extra write barriers too, so we now
have just two MEMMODEL_RELEASE atomic stores replacing the two old
atomic_write_barrier calls, plus a number of new acquire barriers.

	PR libgomp/51298
	* config/linux/bar.h: Use atomic rather than sync builtins.
	* config/linux/bar.c: Likewise.  Add missing acquire
	synchronisation on generation field.
	* task.c (gomp_barrier_handle_tasks): Regain lock so as to not
	double unlock.
Index: libgomp/task.c
===================================================================
--- libgomp/task.c	(revision 181718)
+++ libgomp/task.c	(working copy)
@@ -273,6 +273,7 @@ gomp_barrier_handle_tasks (gomp_barrier_
 	      gomp_team_barrier_done (&team->barrier, state);
 	      gomp_mutex_unlock (&team->task_lock);
 	      gomp_team_barrier_wake (&team->barrier, 0);
+	      gomp_mutex_lock (&team->task_lock);
 	    }
 	}
     }
Index: libgomp/config/linux/bar.h
===================================================================
--- libgomp/config/linux/bar.h	(revision 181770)
+++ libgomp/config/linux/bar.h	(working copy)
@@ -50,7 +50,7 @@ static inline void gomp_barrier_init (go
 static inline void
 gomp_barrier_reinit (gomp_barrier_t *bar, unsigned count)
 {
-  __sync_fetch_and_add (&bar->awaited, count - bar->total);
+  __atomic_add_fetch (&bar->awaited, count - bar->total, MEMMODEL_ACQ_REL);
   bar->total = count;
 }
@@ -69,10 +69,8 @@ extern void gomp_team_barrier_wake (gomp
 static inline gomp_barrier_state_t
 gomp_barrier_wait_start (gomp_barrier_t *bar)
 {
-  unsigned int ret = bar->generation & ~3;
-  /* Do we need any barrier here or is __sync_add_and_fetch acting
-     as the needed LoadLoad barrier already?  */
-  ret += __sync_add_and_fetch (&bar->awaited, -1) == 0;
+  unsigned int ret = __atomic_load_4 (&bar->generation, MEMMODEL_ACQUIRE) & ~3;
+  ret += __atomic_add_fetch (&bar->awaited, -1, MEMMODEL_ACQ_REL) == 0;
   return ret;
 }
Index: libgomp/config/linux/bar.c
===================================================================
--- libgomp/config/linux/bar.c	(revision 181770)
+++ libgomp/config/linux/bar.c	(working copy)
@@ -37,17 +37,15 @@ gomp_barrier_wait_end (gomp_barrier_t *b
     {
       /* Next time we'll be awaiting TOTAL threads again.  */
       bar->awaited = bar->total;
-      atomic_write_barrier ();
-      bar->generation += 4;
+      __atomic_store_4 (&bar->generation, bar->generation + 4,
+			MEMMODEL_RELEASE);
       futex_wake ((int *) &bar->generation, INT_MAX);
     }
   else
     {
-      unsigned int generation = state;
-
       do
-	do_wait ((int *) &bar->generation, generation);
-      while (bar->generation == generation);
+	do_wait ((int *) &bar->generation, state);
+      while (__atomic_load_4 (&bar->generation, MEMMODEL_ACQUIRE) == state);
     }
 }
@@ -81,15 +79,15 @@ gomp_team_barrier_wake (gomp_barrier_t *
 void
 gomp_team_barrier_wait_end (gomp_barrier_t *bar, gomp_barrier_state_t state)
 {
-  unsigned int generation;
+  unsigned int generation, gen;

   if (__builtin_expect ((state & 1) != 0, 0))
     {
       /* Next time we'll be awaiting TOTAL threads again.  */
       struct gomp_thread *thr = gomp_thread ();
       struct gomp_team *team = thr->ts.team;
+
       bar->awaited = bar->total;
-      atomic_write_barrier ();
       if (__builtin_expect (team->task_count, 0))
	{
	  gomp_barrier_handle_tasks (state);
@@ -97,7 +95,7 @@ gomp_team_barrier_wait_end (gomp_barrier
	}
       else
	{
-	  bar->generation = state + 3;
+	  __atomic_store_4 (&bar->generation, state + 3, MEMMODEL_RELEASE);
	  futex_wake ((int *) &bar->generation, INT_MAX);
	  return;
	}
@@ -107,12 +105,16 @@ gomp_team_barrier_wait_end (gomp_barrier
   do
     {
       do_wait ((int *) &bar->generation, generation);
-      if (__builtin_expect (bar->generation & 1, 0))
-	gomp_barrier_handle_tasks (state);
-      if ((bar->generation & 2))
+      gen = __atomic_load_4 (&bar->generation, MEMMODEL_ACQUIRE);
+      if (__builtin_expect (gen & 1, 0))
+	{
+	  gomp_barrier_handle_tasks (state);
+	  gen = __atomic_load_4 (&bar->generation, MEMMODEL_ACQUIRE);
+	}
+      if ((gen & 2) != 0)
	generation |= 2;
     }
-  while (bar->generation != state + 4);
+  while (gen != state + 4);
 }

 void