From patchwork Fri Jun 28 23:27:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Florian Weimer X-Patchwork-Id: 1954186 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=MXQ+cjGO; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4W9s6c5Vdgz20Xg for ; Sat, 29 Jun 2024 09:27:36 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C5E103861833 for ; Fri, 28 Jun 2024 23:27:34 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 404FA3858D3C for ; Fri, 28 Jun 2024 23:27:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 404FA3858D3C Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 404FA3858D3C Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1719617238; cv=none; b=tx4ifyMX7HiVuFSO2gI709qkm7SPWjdkdZtHH0sLdvJYG2jzzt/OcBaZzAfPjG8yH7B6KYUzQxyQdfqgbJ64LZF8f8C7h3DJi1vROYbGtS149b8czzv5syjWvW+VyrWTMA5iifundtZO0OhCdCng64LTXZ1ZwV7KNfq5jQESAvE= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1719617238; c=relaxed/simple; bh=xobsrcm834Pa/UTGndFDRivA3E3mHD2XPKAPjU7dFcA=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=attiXLVA/niy7QtWr9J5qp7hFkz6zzNKip/v4/KXabZgBSn6VLL3FD4rMskkfTU4GtBfm3Lb4fvcJ5Q5Gu0CzEYuhSZ197c+wnUdLzv7dsnDYFH/y0b22AE4mljG641sxvmnR/L5X76Z3l1cOO/z/mRMZXsbjN/rzV9IgzcgPg4= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1719617233; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type; bh=nCuGEnJHOb/ymz6HknJZqrWeds8rWVQzm9MgL6P2pXU=; b=MXQ+cjGOKRl66Tgc4nAaqK3RnGgVLyHxr+BSCj2RJNVguHu8MdYNsqVhjZZ6Ko3GrW9mlI 9pRBNolNAVimKIgolXq2orm1WmJnnjOAWA3CR2yc9Lz2ESeYPSDXtMKxhVLoVgC7Aful/8 IuccTPX/v8m5JEWzLzLyTCheqF4iIus= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-322-NJ0HcGFdOHarxMGjyBYLZA-1; Fri, 28 Jun 2024 19:27:12 -0400 X-MC-Unique: NJ0HcGFdOHarxMGjyBYLZA-1 Received: from mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.15]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 4A5F319560A6 for ; Fri, 28 Jun 2024 23:27:11 +0000 (UTC) Received: from oldenburg.str.redhat.com (unknown [10.45.225.145]) by mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id A9E8D1955BD4 for ; Fri, 28 Jun 2024 23:27:09 +0000 (UTC) From: Florian Weimer To: libc-alpha@sourceware.org Subject: [PATCH] stdlib: Make getenv thread-safe in more cases Date: Sat, 29 Jun 2024 01:27:06 +0200 Message-ID: <87le2od4xh.fsf@oldenburg.str.redhat.com> User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.15 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-11.1 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org Async-signal-safety is improved as well. I think this achieves full thread safety and async-signal-safety for getenv if the environment is only updated using setenv, unsetenv, clearenv. If putenv is used, it depends on how the passed string is updated afterwards. Likewise for direct environ writes. This is fairly easy to accomplish because the current implementation never frees any variables set using setenv. The variables and their values observed using getenv do not necessarily follow the order in which they were set. But at least the phenomenon that random unrelated variables can get temporarily become lost (getenv returns NULL for them) during concurrent unsetenv calls should no longer happen. The cost is two extra acquire loads in getenv (only one in case the environment variable is found), and between one and two extra pointers per active environment variable (which is only a very moderate increase, given that tsearch already uses three pointers per variable that was ever set, not just every active one). In setenv/putenv/unsetenv, there are new atomic instructions, but these operations are already quite slow, use locking, and are generally not performance-critical. Some cases of setenv/putenv are faster because the new implementation can detect when the environ has a known size and use unused space in it, avoid the need for a realloc on every variable addition. I have something that can deal with internal environ use for passing to execve, too, but I want to get some feedback if this is a direction in which we want to move. All this may be too late for 2.40, but there's probably a bug or two that is fixed by this. The testsuite passes on aarch64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu, s390x-linux-gnu, x86_64-linux-gnu. The snapshotting for getenv is admittedly rather tricky, hence the very long comment there. Initially, it was not working correctly on aarch64 because I had a relaxed MO store here at first: /* This store overwrites a value that has been removed, or that has already been written to a previous value. Release MO so that this store does not get reordered before the counter update in the previous loop iteration. */ atomic_store_release (dp, next_value); And I assume the incorrect snapshot was a result of stores being moved reordered quite far back, so that the core assumption of getenv (at most one write to the array if the counter does not change) was not correct. This led to a failure of stdlib/tst-getenv-unsetenv under load. Thanks, Florian --- stdlib/Makefile | 7 ++ stdlib/getenv.c | 138 ++++++++++++++++++++++++--- stdlib/putenv.c | 2 + stdlib/setenv.c | 217 ++++++++++++++++++++++++++++++------------- stdlib/setenv.h | 73 +++++++++++++++ stdlib/tst-environ.c | 13 +-- stdlib/tst-getenv-signal.c | 94 +++++++++++++++++++ stdlib/tst-getenv-thread.c | 61 ++++++++++++ stdlib/tst-getenv-unsetenv.c | 75 +++++++++++++++ 9 files changed, 596 insertions(+), 84 deletions(-) base-commit: c0f21e073d45670cb54811b07fce8e612a91f107 diff --git a/stdlib/Makefile b/stdlib/Makefile index 8b0ac63ddb..697b547540 100644 --- a/stdlib/Makefile +++ b/stdlib/Makefile @@ -275,6 +275,9 @@ tests := \ tst-canon-bz26341 \ tst-cxa_atexit \ tst-environ \ + tst-getenv-signal \ + tst-getenv-thread \ + tst-getenv-unsetenv \ tst-getrandom \ tst-labs \ tst-limits \ @@ -623,3 +626,7 @@ $(objpfx)tst-setcontext3.out: tst-setcontext3.sh $(objpfx)tst-setcontext3 $(evaluate-test) $(objpfx)tst-qsort5: $(libm) + +$(objpfx)tst-getenv-signal: $(shared-thread-library) +$(objpfx)tst-getenv-thread: $(shared-thread-library) +$(objpfx)tst-getenv-unsetenv: $(shared-thread-library) diff --git a/stdlib/getenv.c b/stdlib/getenv.c index bea69e0f40..ad7f0d00c8 100644 --- a/stdlib/getenv.c +++ b/stdlib/getenv.c @@ -15,24 +15,140 @@ License along with the GNU C Library; if not, see . */ -#include +#include +#include #include #include +struct environ_array *__environ_array_list; +environ_counter __environ_counter; + char * getenv (const char *name) { - if (__environ == NULL || name[0] == '\0') - return NULL; - - size_t len = strlen (name); - for (char **ep = __environ; *ep != NULL; ++ep) + while (true) { - if (name[0] == (*ep)[0] - && strncmp (name, *ep, len) == 0 && (*ep)[len] == '=') - return *ep + len + 1; + /* Used to deal with concurrent unsetenv. */ + environ_counter start_counter = atomic_load_acquire (&__environ_counter); + + /* We use relaxed MO here because we assume that loads carry a + dependency, matching the release MO store to __environ in + __add_to_environ. Objects pointed to by pointers stored in the + __environ array are never deallocated (except perhaps if putenv + is used, but then synchronization is the responsibility of the + applications). The backing store for __environ is allocated + zeroed. In summary, we can assume that the pointers we observe + are either valid or null. String stores into the environment + array use release MO, so we can assume the loads here see the + initialized contents (againt except for putenv). */ + char **start_environ = atomic_load_relaxed (&__environ); + if (start_environ == NULL || name[0] == '\0') + return NULL; + + size_t len = strlen (name); + for (char **ep = start_environ; ; ++ep) + { + char *entry = atomic_load_relaxed (ep); + if (entry == NULL) + break; + + /* If there is a match, return that value. It was valid at + one point, so we can return it. */ + if (name[0] == entry[0] + && strncmp (name, entry, len) == 0 && entry[len] == '=') + return entry + len + 1; + } + + /* The variable was not found. This might be a false negative + because unsetenv has shuffled around entries. Check if it is + necessary to retry. */ + + /* See Hans Boehm, Can Seqlocks Get Along with Programming Language + Memory Models?, Section 4. This is necessary so that loads in + the loop above are not ordered past the counter check below. */ + atomic_thread_fence_acquire (); + + if (atomic_load_acquire (&__environ_counter) == start_counter) + /* If we reach this point and there was a concurrent + unsetenv call which removed the key we tried to find, the + NULL return value is valid. We can also try again, not + find the value, and then return NULL (assuming there are + no further concurrent unsetenv calls). + + However, if getenv is called to find a value that is + present originally and not removed by any of the + concurrent unsetenv calls, we must not return NULL here. + + If the counter did not change, there was at most one + write to the array in unsetenv while the scanning loop + above was running. This means that there are at most two + different versions of the array to consider. For the + sake of argument, we assume that each load can make an + independent choice which version to use. An arbitrary + number of unsetenv and setenv calls may have happened + since start of getenv. Lets write E[0], E[1], ... for + the original environment elements, a(0) < (1) < ... for a + sequence of increasing integers that are the indices of + the environment variables remaing after the removals, and + N[0], N[1], ... for the new variables added by setenv or + putenv. Then at the start of the last unsetenv call, the + environment contains + + E[a(0)], E[a(1)], ..., N[0], N[1], ... + + (the N[k] are optional.) Let's assume that + we are looking for the value E[j]. Then one of the + a(i) == j (otherwise we may return NULL here because + of a unsetenv for the value we are looking for). In the + discussion below it will become clear that the N[k] do + not actually matter. + + The two versions of array we can choose from differ only + in one element, say E[a(i)]. There are two cases: + + Case (A): E[a(i)] is an element being removed by unsetenv + (the target of the first write). We can see the original + version: + + ..., E[a(i-1)], E[a(i)], E[a(i+1)], ..., N[0], ... + ------- + And the overwritten version: + + ..., E[a(i-1)], E[a(i+1)], E[a(i+1)], ..., N[0], ... + --------- + + (The valueE[a(i+1)] can be the terminating NULL.) + As discussed, we are not considering the removal of the + variable being searched for, so a(i) != j, and the + variable getenv is looking for is available in either + version, and we would have found it above. + + Case (B): E[a(i)] is an element that has already been + moved forward and is now itself being overwritten with + its sucessor value E[a(i+1)]. The two versions of the + array look like this: + + ..., E[a(i-1)], E[a(i)], E[a(i)], E[a(i+1)], ..., N[0], ... + ------- + And with the overwrite in place: + + ..., E[a(i-1)], E[a(i)], E[a(i+1)], E[a(i+1)], ..., N[0], ... + --------- + + The key observation here is that even in the second + version with the overwrite present, the scanning loop + will still encounter the overwritten value E[a(i)] in the + previous array element. This means that as long as the + E[j] is still present among the initial E[a(...)] (as we + assumed because there is no concurrent unsetenv for + E[j]), we encounter it while scanning here in getenv. + + In summary, if there was at most one write, a negative + result is a true negative, and we can return NULL. */ + return NULL; + /* Otherwise retry. This will never happen in a signal handler + that interrupted unsetenv because the suspended unsetenv call + cannot change the counter value. */ } - - return NULL; } libc_hidden_def (getenv) diff --git a/stdlib/putenv.c b/stdlib/putenv.c index b234ce0612..c61beb226f 100644 --- a/stdlib/putenv.c +++ b/stdlib/putenv.c @@ -23,6 +23,8 @@ # include #endif +#include + #if _LIBC || HAVE_STDLIB_H # include #endif diff --git a/stdlib/setenv.c b/stdlib/setenv.c index e2164371ad..32859bdfa2 100644 --- a/stdlib/setenv.c +++ b/stdlib/setenv.c @@ -19,6 +19,9 @@ # include #endif +#include +#include + /* Pacify GCC; see the commentary about VALLEN below. This is needed at least through GCC 4.9.2. Pacify GCC for the entire file, as there seems to be no way to pacify GCC selectively, only for the @@ -100,25 +103,51 @@ static void *known_values; #endif +/* Allocate a new environment array and put it o the + __environ_array_list. Returns NULL on memory allocation + failure. */ +static struct environ_array * +__environ_new_array (size_t required_size) +{ + /* No backing array yet, or insufficient room. */ + size_t new_size; + if (__environ_array_list == NULL + || __environ_array_list->allocated * 2 < required_size) + /* Add some unused space for future growth. */ + new_size = required_size + 16; + else + new_size = __environ_array_list->allocated * 2; -/* If this variable is not a null pointer we allocated the current - environment. */ -static char **last_environ; + size_t new_size_in_bytes; + if (__builtin_mul_overflow (new_size, sizeof (char *), + &new_size_in_bytes) + || __builtin_add_overflow (new_size_in_bytes, + offsetof (struct environ_array, + array), + &new_size_in_bytes)) + { + __set_errno (ENOMEM); + return NULL; + } + /* Zero-initialize everything, so that getenv can only + observe valid or null pointers. */ + struct environ_array *target_array = calloc (1, new_size_in_bytes); + if (target_array == NULL) + return NULL; + target_array->allocated = new_size; + assert (new_size >= target_array->allocated); + + /* Put it onto the list. */ + target_array->next = __environ_array_list; + __environ_array_list = target_array; + return target_array; +} -/* This function is used by `setenv' and `putenv'. The difference between - the two functions is that for the former must create a new string which - is then placed in the environment, while the argument of `putenv' - must be used directly. This is all complicated by the fact that we try - to reuse values once generated for a `setenv' call since we can never - free the strings. */ int __add_to_environ (const char *name, const char *value, const char *combined, int replace) { - char **ep; - size_t size; - /* Compute lengths before locking, so that the critical section is less of a performance bottleneck. VALLEN is needed only if COMBINED is null (unfortunately GCC is not smart enough to deduce @@ -133,45 +162,80 @@ __add_to_environ (const char *name, const char *value, const char *combined, LOCK; /* We have to get the pointer now that we have the lock and not earlier - since another thread might have created a new environment. */ - ep = __environ; + since another thread might have created a new environment. */ + char **start_environ = atomic_load_relaxed (&__environ); + char **ep = start_environ; - size = 0; + /* This gets written to __environ in the end. */ + char **result_environ = start_environ; + + /* Size of the environment if *ep == NULL. */ if (ep != NULL) - { - for (; *ep != NULL; ++ep) - if (!strncmp (*ep, name, namelen) && (*ep)[namelen] == '=') - break; - else - ++size; - } + for (; *ep != NULL; ++ep) + if (strncmp (*ep, name, namelen) == 0 && (*ep)[namelen] == '=') + break; - if (ep == NULL || __builtin_expect (*ep == NULL, 1)) + if (ep == NULL || __glibc_likely (*ep == NULL)) { - char **new_environ; + /* The scanning loop above reached the end of the environment. + Add a new string to it. */ + replace = true; - /* We allocated this space; we can extend it. Avoid using the raw - reallocated pointer to avoid GCC -Wuse-after-free. */ - uintptr_t ip_last_environ = (uintptr_t)last_environ; - new_environ = (char **) realloc (last_environ, - (size + 2) * sizeof (char *)); - if (new_environ == NULL) + /* + 2 for the new entry and the terminating NULL. */ + size_t required_size = (ep - start_environ) + 2; + if (__environ_is_from_array_list (start_environ) + && required_size <= __environ_array_list->allocated) + /* The __environ array is ours, and we have room in it. We + can use ep as-is. Add a null terminator in case current + usage is less than previous usage. */ + ep[1] = NULL; + else { - UNLOCK; - return -1; + /* We cannot use __environ as is and need to copy over the + __environ contents into an array managed via + __environ_array_list. */ + + struct environ_array *target_array; + if (__environ_array_list != NULL + && required_size <= __environ_array_list->allocated) + /* Existing array has enough room. Contents is copied below. */ + target_array = __environ_array_list; + else + { + /* Allocate a new array. */ + target_array = __environ_new_array (required_size); + if (target_array == NULL) + { + UNLOCK; + return -1; + } + } + + /* Copy over the __environ array contents. This is a + backwards copy if __environ points into + target_array->array, and it does not clobber values still + needing copying. This handles the case start_environ + == ep == NULL, too. */ + size_t i; + for (i = 0; start_environ + i < ep; ++i) + /* Regular store because unless there has been direct + manipulation of the environment, target_array is still + a private copy. */ + target_array->array[i] = atomic_load_relaxed (start_environ + i); + + /* This is the new place where we should add the element. */ + ep = target_array->array + i; + + /* Add the null terminator in case there was a pointer there + previously. */ + ep[1] = NULL; + + /* And __environ should be repointed to our array. */ + result_environ = &target_array->array[0]; } - - if ((uintptr_t)__environ != ip_last_environ) - memcpy ((char *) new_environ, (char *) __environ, - size * sizeof (char *)); - - new_environ[size] = NULL; - new_environ[size + 1] = NULL; - ep = new_environ + size; - - last_environ = __environ = new_environ; } - if (*ep == NULL || replace) + + if (replace || *ep == NULL) { char *np; @@ -213,9 +277,15 @@ __add_to_environ (const char *name, const char *value, const char *combined, #endif } - *ep = np; + /* Use release MO so that loads are sufficient to observe the + pointer contents because the CPU carries the dependency for + us. This also acts as a thread fence, making getenv + async-signal-safe. */ + atomic_store_release (ep, np); + atomic_store_release (&__environ, result_environ); } + UNLOCK; return 0; @@ -249,18 +319,40 @@ unsetenv (const char *name) LOCK; - ep = __environ; + ep = atomic_load_relaxed (&__environ); if (ep != NULL) - while (*ep != NULL) + while (true) { - if (!strncmp (*ep, name, len) && (*ep)[len] == '=') + char *entry = atomic_load_relaxed (ep); + if (entry == NULL) + break; + if (!strncmp (entry, name, len) && entry[len] == '=') { /* Found it. Remove this pointer by moving later ones back. */ char **dp = ep; - do - dp[0] = dp[1]; - while (*dp++); + while (true) + { + char *next_value = atomic_load_relaxed (dp + 1); + /* This store overwrites a value that has been + removed, or that has already been written to a + previous value. Release MO so that this store does + not get reordered before the counter update in the + previous loop iteration. */ + atomic_store_release (dp, next_value); + /* Release store synchronizes with acquire loads in + getenv. Non-atomic update because there is just + one writer due to the lock. + + See discussion of the counter check in getenv for + an explanation why this is sufficient synchronization. */ + atomic_store_release (&__environ_counter, + atomic_load_relaxed (&__environ_counter) + + 1); + if (next_value == NULL) + break; + ++dp; + } /* Continue the loop in case NAME appears again. */ } else @@ -278,19 +370,10 @@ unsetenv (const char *name) int clearenv (void) { - LOCK; - - if (__environ == last_environ && __environ != NULL) - { - /* We allocated this environment so we can free it. */ - free (__environ); - last_environ = NULL; - } - - /* Clear the environment pointer removes the whole environment. */ - __environ = NULL; - - UNLOCK; + /* Clear the environment pointer removes the whole environment. We + cannot deallocate environment arrays because other threads might + be iterating over them. */ + atomic_store_relaxed (&__environ, NULL); return 0; } @@ -301,6 +384,14 @@ __libc_setenv_freemem (void) /* Remove all traces. */ clearenv (); + /* Clear all backing arrays. */ + while (__environ_array_list != NULL) + { + void *ptr = __environ_array_list; + __environ_array_list = __environ_array_list->next; + free (ptr); + } + /* Now remove the search tree. */ __tdestroy (known_values, free); known_values = NULL; diff --git a/stdlib/setenv.h b/stdlib/setenv.h new file mode 100644 index 0000000000..d83160595c --- /dev/null +++ b/stdlib/setenv.h @@ -0,0 +1,73 @@ +/* Common declarations for the setenv/getenv family of functions. + Copyright (C) 2024 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#ifndef _SETENV_H +#define _SETENV_H + +#include +#include + +/* We use an exponential sizing policy for environment arrays. The + arrays are not deallocating during the lifetime of the process. + This adds between one and two additional pointers per active + environemnt entry, on top of what is used by setenv to keep track + of environment values used before. */ +struct environ_array +{ + struct environ_array *next; /* Previously used environment array. */ + size_t allocated; /* Number of allocated array elments. */ + char * array[]; /* The actual environment array. */ +}; + +/* After initialization, and until the user resets environ (perhaps by + calling clearenv), &__environ[0] == &environ_array_list->array[0]. */ +extern struct environ_array *__environ_array_list attribute_hidden; + +/* Returns true if EP (which should be an __environ value) is a + pointer managed by setenv. */ +static inline _Bool +__environ_is_from_array_list (char **ep) +{ + struct environ_array *eal = atomic_load_relaxed (&__environ_array_list); + return eal != NULL && &eal->array[0] == ep; +} + +/* Counter for detecting concurrent modification in unsetenv. + Ideally, this should be a 64-bit counter that cannot wrap around, + but given that counter wrapround is probably impossible to hit + (2**32 operations in unsetenv concurrently with getenv), using + seems unnecessary. */ +#if __HAVE_64B_ATOMICS +typedef uint64_t environ_counter; +#else +typedef uint32_t environ_counter; +#endif + +/* Updated by unsetenv to detect multiple overwrites in getenv. */ +extern environ_counter __environ_counter attribute_hidden; + +/* This function is used by `setenv' and `putenv'. The difference between + the two functions is that for the former must create a new string which + is then placed in the environment, while the argument of `putenv' + must be used directly. This is all complicated by the fact that we try + to reuse values once generated for a `setenv' call since we can never + free the strings. */ +int __add_to_environ (const char *name, const char *value, + const char *combines, int replace) attribute_hidden; + +#endif /* _SETENV_H */ diff --git a/stdlib/tst-environ.c b/stdlib/tst-environ.c index bab8873f08..aff191bb4d 100644 --- a/stdlib/tst-environ.c +++ b/stdlib/tst-environ.c @@ -20,6 +20,7 @@ #include #include #include +#include #define VAR "FOOBAR" @@ -50,11 +51,7 @@ do_test (void) /* Getting this value should now be possible. */ valp = getenv (VAR); - if (valp == NULL || strcmp (valp, "one") != 0) - { - puts ("getenv #2 failed"); - result = 1; - } + TEST_COMPARE_STRING (valp, "one"); /* Try to replace without the replace flag set. This should fail. */ if (setenv (VAR, "two", 0) != 0) @@ -65,11 +62,7 @@ do_test (void) /* The value shouldn't have changed. */ valp = getenv (VAR); - if (valp == NULL || strcmp (valp, "one") != 0) - { - puts ("getenv #3 failed"); - result = 1; - } + TEST_COMPARE_STRING (valp, "one"); /* Now replace the value using putenv. */ if (putenv (putenv_val) != 0) diff --git a/stdlib/tst-getenv-signal.c b/stdlib/tst-getenv-signal.c new file mode 100644 index 0000000000..e6c5afde2a --- /dev/null +++ b/stdlib/tst-getenv-signal.c @@ -0,0 +1,94 @@ +/* Test getenv from a signal handler interrupting environent updates. + Copyright (C) 2024 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include +#include +#include +#include +#include +#include +#include + +/* Set to false by the main thread after doing all the setenv + calls. */ +static bool running = true; + +/* Used to synchronize the start of signal sending. */ +static pthread_barrier_t barrier; + +/* Identity of the main thread. */ +static pthread_t main_thread; + +/* Send SIGUSR1 signals to main_thread. */ +static void * +signal_thread (void *ignored) +{ + xpthread_barrier_wait (&barrier); + while (__atomic_load_n (&running, __ATOMIC_RELAXED)) + xpthread_kill (main_thread, SIGUSR1); + return NULL; +} + +/* Call getenv from a signal handler. */ +static void +signal_handler (int signo) +{ + TEST_COMPARE_STRING (getenv ("unset_variable"), NULL); + char *value = getenv ("set_variable"); + TEST_VERIFY (strncmp (value, "value", strlen ("value")) == 0); +} + +static int +do_test (void) +{ + /* Added to the environment using putenv. */ + char *variables[30]; + for (int i = 0; i < array_length (variables); ++i) + variables[i] = xasprintf("v%d=%d", i, i); + + xsignal (SIGUSR1, signal_handler); + TEST_COMPARE (setenv ("set_variable", "value", 1), 0); + xraise (SIGUSR1); + main_thread = pthread_self (); + xpthread_barrier_init (&barrier, NULL, 2); + pthread_t thr = xpthread_create (NULL, signal_thread, NULL); + xpthread_barrier_wait (&barrier); + + for (int i = 0; i < array_length (variables); ++i) + { + char buf[30]; + TEST_COMPARE (setenv ("temporary_variable", "1", 1), 0); + snprintf (buf, sizeof (buf), "V%d", i); + TEST_COMPARE (setenv (buf, buf + 1, 1), 0); + TEST_COMPARE (putenv (variables[i]), 0); + snprintf (buf, sizeof (buf), "value%d", i); + TEST_COMPARE (setenv ("set_variable", buf, 1), 0); + TEST_COMPARE (unsetenv ("temporary_variable"), 0); + } + + __atomic_store_n (&running, false, __ATOMIC_RELAXED); + xpthread_join (thr); + xpthread_barrier_destroy (&barrier); + + for (int i = 0; i < array_length (variables); ++i) + free (variables[i]); + return 0; +} + +#include diff --git a/stdlib/tst-getenv-thread.c b/stdlib/tst-getenv-thread.c new file mode 100644 index 0000000000..e5a79c5227 --- /dev/null +++ b/stdlib/tst-getenv-thread.c @@ -0,0 +1,61 @@ +/* Test getenv with concurrent setenv. + Copyright (C) 2024 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include +#include +#include + +/* Set to false by the main thread after doing all the setenv + calls. */ +static bool running = true; + +/* Used to synchronize the start of the getenv thread. */ +static pthread_barrier_t barrier; + +/* Invoke getenv for a nonexisting environment variable in a loop. + This checks that concurrent setenv does not invalidate the + environment array while getenv reads it. */ +static void * +getenv_thread (void *ignored) +{ + xpthread_barrier_wait (&barrier); + while (__atomic_load_n (&running, __ATOMIC_RELAXED)) + TEST_VERIFY (getenv ("unset_variable") == NULL); + return NULL; +} + +static int +do_test (void) +{ + xpthread_barrier_init (&barrier, NULL, 2); + pthread_t thr = xpthread_create (NULL, getenv_thread, NULL); + xpthread_barrier_wait (&barrier); + for (int i = 0; i < 1000; ++i) + { + char buf[30]; + snprintf (buf, sizeof (buf), "V%d", i); + TEST_COMPARE (setenv (buf, buf + 1, 1), 0); + } + __atomic_store_n (&running, false, __ATOMIC_RELAXED); + xpthread_join (thr); + xpthread_barrier_destroy (&barrier); + return 0; +} + +#include diff --git a/stdlib/tst-getenv-unsetenv.c b/stdlib/tst-getenv-unsetenv.c new file mode 100644 index 0000000000..88ce92fd53 --- /dev/null +++ b/stdlib/tst-getenv-unsetenv.c @@ -0,0 +1,75 @@ +/* Test getenv with concurrent unsetenv. + Copyright (C) 2024 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include +#include +#include +#include +#include + +/* Used to synchronize the start of each test iteration. */ +static pthread_barrier_t barrier; + +/* Number of iterations. */ +enum { iterations = 10000 }; + +/* Check that even with concurrent unsetenv, a variable that is known + to be there is found. */ +static void * +getenv_thread (void *ignored) +{ + for (int i = 0; i < iterations; ++i) + { + xpthread_barrier_wait (&barrier); + TEST_COMPARE_STRING (getenv ("variable"), "value"); + xpthread_barrier_wait (&barrier); + } + return NULL; +} + +static int +do_test (void) +{ + xpthread_barrier_init (&barrier, NULL, 2); + pthread_t thr = xpthread_create (NULL, getenv_thread, NULL); + + char *variables[50]; + for (int i = 0; i < array_length (variables); ++i) + variables[i] = xasprintf("V%d", i); + + for (int i = 0; i < iterations; ++i) + { + clearenv (); + for (int j = 0; j < array_length(variables); ++j) + TEST_COMPARE (setenv (variables[j], variables[j] + 1, 1), 0); + TEST_COMPARE (setenv ("variable", "value", 1), 0); + xpthread_barrier_wait (&barrier); + /* Test runs. */ + for (int j = 0; j < array_length(variables); ++j) + TEST_COMPARE (unsetenv (variables[j]), 0); + xpthread_barrier_wait (&barrier); + } + xpthread_join (thr); + xpthread_barrier_destroy (&barrier); + for (int i = 0; i < array_length (variables); ++i) + free (variables[i]); + return 0; +} + +#include