From patchwork Wed Jun 3 15:24:49 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Siddhesh Poyarekar
X-Patchwork-Id: 480012
Date: Wed, 3 Jun 2015 20:54:49 +0530
From: Siddhesh Poyarekar
To: Alexandre Oliva
Cc: libc-alpha@sourceware.org
Subject: Re: [PR18457] Don't require rtld lock to compute DTV addr for static TLS
Message-ID: <20150603152448.GC32684@spoyarek.pnq.redhat.com>

On Wed, Jun 03, 2015 at 03:44:58AM -0300, Alexandre Oliva wrote:
> We used to store static TLS addrs in the DTV at module load time, but
> this required one thread to modify another thread's DTV.  Now that we
> defer the DTV update to the first use in the thread, we should do so
> without taking the rtld lock if the module is already assigned to static
> TLS.  Taking the lock introduces deadlocks where there weren't any
> before.
>
> This patch fixes the deadlock caused by tls_get_addr's unnecessarily
> taking the rtld lock to initialize the DTV entry for tls_dtor_list
> within __call_dtors_list, which deadlocks with a dlclose of a module
> whose finalizer joins with that thread.  The patch does not, however,
> attempt to fix other potential sources of similar deadlocks, such as
> the explicit rtld locks taken by call_dtors_list, when the dtor list
> is not empty; lazy relocation of the reference to tls_dtor_list, when
> TLS Descriptors are in use; when tls dtors call functions through the
> PLT and lazy relocation needs to be performed, or any such called
> functions interact with the dynamic loader in ways that require its
> lock to be taken.
>
> Ok to install?

It's not good enough and is in fact probably just dancing around the
problem.  The simple patch to the test case below will cause the test
case to deadlock.

Andreas' reproducer can be fixed by simply setting the TLS variables
in cxa_thread_atexit as initial exec; I've got a patch for it that
I'll post shortly.  That would leave two other problems:

1. All of the lock taking and NODELETE flag clearing in
   cxa_thread_atexit.  Not only can it cause a deadlock, clearing the
   flag like that may actually be wrong.  We may be better off not
   unloading the DSO at all, but I'll see if there's another way out.

2. The lock taking in tls_get_addr_tail.  That has to go and we need
   to figure out another way to wait for another dlopen to complete.
   I haven't wrapped my head around this bit of the code properly yet
   and you may be better placed to debug this.  If you don't have
   enough time, I could run with any kind of help/guidance you may
   provide.

Siddhesh

diff --git a/nptl/tst-join7mod.c b/nptl/tst-join7mod.c
index a8c7bc0..a066a1f 100644
--- a/nptl/tst-join7mod.c
+++ b/nptl/tst-join7mod.c
@@ -4,12 +4,15 @@
 static pthread_t th;
 static int running = 1;
 
+static __thread int foo;
+
 static void *
 test_run (void *p)
 {
   while (running)
     fprintf (stderr, "XXX test_run\n");
   fprintf (stderr, "XXX test_run FINISHED\n");
+  foo = 42;
   return NULL;
 }