From: Jonathan Wakely
To: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: [PATCH 1/2] libstdc++: Fix data race in std::basic_ios::fill() [PR77704]
Date: Tue, 7 May 2024 14:52:29 +0100
Message-ID: <20240507140415.3821279-1-jwakely@redhat.com>

Tested x86_64-linux.

This seems "obviously correct", and I'd like to push it. The current code
definitely has a data race, i.e. undefined behaviour.

-- >8 --

The lazy caching in std::basic_ios::fill() updates a mutable member
without synchronization, which can cause a data race if two threads both
call fill() on the same stream object when _M_fill_init is false.

To avoid this we can just cache the _M_fill member and set _M_fill_init
early in std::basic_ios::init, instead of doing it lazily. As explained
by the comment in init, there's a good reason for doing it lazily. When
char_type is neither char nor wchar_t, the locale might not have a
std::ctype, so getting the fill character would throw an exception.
The current lazy init allows using unformatted I/O with such a stream,
because the fill character is never needed and so it doesn't matter if
the locale doesn't have a ctype facet.
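As a rough illustration of the idea (not part of the patch), the sketch
below isolates the caching scheme in a small stand-alone class. FillCache
and its member names are hypothetical and only mimic the roles of _M_fill
and _M_fill_init; the point is that the const accessor never writes to a
member, so concurrent readers have nothing to race on.

```cpp
#include <cassert>
#include <locale>

// Hypothetical stand-in for the fill-character state inside std::basic_ios.
// FillCache and its member names are illustrative, not libstdc++ identifiers.
template <typename CharT>
class FillCache
{
public:
  // Eager caching: runs once, while only one thread can see the object.
  explicit FillCache(const std::locale& loc)
  : loc_(loc), fill_(), cached_(false)
  {
    if (std::has_facet<std::ctype<CharT> >(loc_))
      {
        fill_ = std::use_facet<std::ctype<CharT> >(loc_).widen(' ');
        cached_ = true;
      }
  }

  // Read-only accessor: it never writes to a member, so two threads may
  // call it on the same object at the same time without a data race.
  CharT fill() const
  {
    if (!cached_)
      // The facet was missing at construction. Look it up again (this may
      // throw std::bad_cast) but do not store the result.
      return std::use_facet<std::ctype<CharT> >(loc_).widen(' ');
    return fill_;
  }

  // Non-const setter: callers already have to synchronize writes.
  CharT fill(CharT ch)
  {
    CharT old = fill_;
    fill_ = ch;
    cached_ = true;
    return old;
  }

private:
  std::locale loc_;
  CharT fill_;
  bool cached_;
};

// Small usage example with the classic locale, which has ctype<char>.
inline void fill_cache_example()
{
  FillCache<char> cache(std::locale::classic());
  assert(cache.fill() == ' ');     // cached at construction
  assert(cache.fill('*') == ' ');  // the setter returns the old value
  assert(cache.fill() == '*');
}
```

The uncached branch repeats the facet lookup on every call. That is
slower than caching, but it only affects streams whose locale lacked the
facet at construction time.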
We can maintain this property by only setting the fill character in
std::basic_ios::init if the ctype facet is present at that time. If
fill() is called later and the fill character wasn't set by init, we can
get it from the stream's current locale at the point when fill() is
called (and not try to cache it without synchronization).

This causes a change in behaviour for the following program:

  std::ostringstream out;
  out.imbue(loc);
  auto fill = out.fill();

Previously the fill character would have been set when fill() is called,
and so would have used the new locale. This commit changes it so that the
fill character is set on construction and isn't affected by the new
locale being imbued later. This new behaviour seems to be what the
standard requires, and matches MSVC.

The new 27_io/basic_ios/fill/char/1.cc test verifies that it's still
possible to use a std::basic_ios without the ctype facet being present at
construction.

libstdc++-v3/ChangeLog:

	PR libstdc++/77704
	* include/bits/basic_ios.h (basic_ios::fill()): Do not modify
	_M_fill and _M_fill_init in a const member function.
	(basic_ios::fill(char_type)): Use _M_fill directly instead of
	calling fill(). Set _M_fill_init to true.
	* include/bits/basic_ios.tcc (basic_ios::init): Set _M_fill and
	_M_fill_init here instead.
	* testsuite/27_io/basic_ios/fill/char/1.cc: New test.
	* testsuite/27_io/basic_ios/fill/wchar_t/1.cc: New test.
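For anyone who wants to see which behaviour their library has, the
three-line program from the commit message can be made self-contained
roughly as follows. This is only a sketch: tabby_ctype is a hypothetical
facet modelled on the one in the new tests, and the two helper function
names are made up for illustration. The first helper returns '\t' with
the old lazy caching and ' ' with this patch applied; the second should
return '*' either way.

```cpp
#include <cassert>
#include <locale>
#include <sstream>

// Hypothetical facet modelled on tabby_mctype from the new tests:
// it widens ' ' to '\t' and leaves every other character unchanged.
struct tabby_ctype : std::ctype<char>
{
protected:
  char do_widen(char c) const override { return c == ' ' ? '\t' : c; }

  const char*
  do_widen(const char* lo, const char* hi, char* to) const override
  {
    while (lo != hi)
      *to++ = do_widen(*lo++);
    return hi;
  }
};

// The program from the commit message: imbue first, then ask for the fill.
// The result depends on whether the library caches the fill character at
// construction (' ') or on the first call to fill() ('\t').
inline char fill_after_imbue()
{
  std::ostringstream out;
  out.imbue(std::locale(std::locale::classic(), new tabby_ctype));
  return out.fill();
}

// An explicitly set fill character is not replaced by a later imbue().
inline char explicit_fill_after_imbue()
{
  std::ostringstream out;
  out.fill('*');
  out.imbue(std::locale(std::locale::classic(), new tabby_ctype));
  return out.fill();
}
```

Comparing the result of fill_after_imbue() against ' ' is enough to tell
the two behaviours apart.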
---
 libstdc++-v3/include/bits/basic_ios.h         | 10 +--
 libstdc++-v3/include/bits/basic_ios.tcc       | 15 +++-
 .../testsuite/27_io/basic_ios/fill/char/1.cc  | 78 +++++++++++++++++++
 .../27_io/basic_ios/fill/wchar_t/1.cc         | 55 +++++++++++++
 4 files changed, 148 insertions(+), 10 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/27_io/basic_ios/fill/char/1.cc
 create mode 100644 libstdc++-v3/testsuite/27_io/basic_ios/fill/wchar_t/1.cc

diff --git a/libstdc++-v3/include/bits/basic_ios.h b/libstdc++-v3/include/bits/basic_ios.h
index 258e6042b8f..bc3be4d2e37 100644
--- a/libstdc++-v3/include/bits/basic_ios.h
+++ b/libstdc++-v3/include/bits/basic_ios.h
@@ -373,11 +373,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       char_type
       fill() const
       {
-        if (!_M_fill_init)
-          {
-            _M_fill = this->widen(' ');
-            _M_fill_init = true;
-          }
+        if (__builtin_expect(!_M_fill_init, false))
+          return this->widen(' ');
         return _M_fill;
       }
 
@@ -393,8 +390,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       char_type
       fill(char_type __ch)
       {
-        char_type __old = this->fill();
+        char_type __old = _M_fill;
         _M_fill = __ch;
+        _M_fill_init = true;
         return __old;
       }
 
diff --git a/libstdc++-v3/include/bits/basic_ios.tcc b/libstdc++-v3/include/bits/basic_ios.tcc
index a9313736e32..0197bdf8f67 100644
--- a/libstdc++-v3/include/bits/basic_ios.tcc
+++ b/libstdc++-v3/include/bits/basic_ios.tcc
@@ -138,13 +138,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       // return without throwing an exception. Unfortunately,
       // ctype is not necessarily a required facet, so
       // streams with char_type != [char, wchar_t] will not have it by
-      // default. Because of this, the correct value for _M_fill is
-      // constructed on the first call of fill(). That way,
+      // default. If the ctype facet is available now,
+      // _M_fill is set here, but otherwise no fill character will be
+      // cached and a call to fill() will check for the facet again later
+      // (and will throw if the facet is still not present). This way
       // unformatted input and output with non-required basic_ios
       // instantiations is possible even without imbuing the expected
       // ctype facet.
-      _M_fill = _CharT();
-      _M_fill_init = false;
+      if (_M_ctype)
+        {
+          _M_fill = _M_ctype->widen(' ');
+          _M_fill_init = true;
+        }
+      else
+        _M_fill_init = false;
 
       _M_tie = 0;
       _M_exception = goodbit;
diff --git a/libstdc++-v3/testsuite/27_io/basic_ios/fill/char/1.cc b/libstdc++-v3/testsuite/27_io/basic_ios/fill/char/1.cc
new file mode 100644
index 00000000000..d5747c7507f
--- /dev/null
+++ b/libstdc++-v3/testsuite/27_io/basic_ios/fill/char/1.cc
@@ -0,0 +1,78 @@
+// { dg-do run }
+
+#include <ios>
+#include <locale>
+#include <streambuf>
+#include <testsuite_hooks.h>
+
+typedef char C;
+
+struct tabby_mctype : std::ctype<C>
+{
+  C do_widen(char c) const { return c == ' ' ? '\t' : c; }
+
+  const char*
+  do_widen(const char* lo, const char* hi, C* to) const
+  {
+    while (lo != hi)
+      *to++ = do_widen(*lo++);
+    return hi;
+  }
+};
+
+void
+test01()
+{
+  std::basic_ios<C> out(0);
+  std::locale loc(std::locale(), new tabby_mctype);
+  out.imbue(loc);
+  VERIFY( out.fill() == ' ' ); // Imbuing a new locale doesn't affect fill().
+  out.fill('*');
+  VERIFY( out.fill() == '*' ); // This will be cached now.
+  out.imbue(std::locale());
+  VERIFY( out.fill() == '*' ); // Imbuing a new locale doesn't affect fill().
+}
+
+void
+test02()
+{
+  std::locale loc(std::locale(), new tabby_mctype);
+  std::locale::global(loc);
+  std::basic_ios<C> out(0);
+  VERIFY( out.fill() == '\t' );
+  out.imbue(std::locale::classic());
+  VERIFY( out.fill() == '\t' ); // Imbuing a new locale doesn't affect fill().
+  out.fill('*');
+  VERIFY( out.fill() == '*' ); // This will be cached now.
+  out.imbue(std::locale());
+  VERIFY( out.fill() == '*' ); // Imbuing a new locale doesn't affect fill().
+}
+
+void
+test03()
+{
+  // This function tests a libstdc++ extension: if no ctype facet
+  // is present when the stream is initialized, a fill character will not be
+  // cached. Calling fill() will obtain a fill character from the locale each
+  // time it's called.
+  typedef signed char C2;
+  std::basic_ios<C2> out(0);
+#if __cpp_exceptions
+  try {
+    (void) out.fill(); // No ctype in the locale.
+    VERIFY( false );
+  } catch (...) {
+  }
+#endif
+  out.fill('*');
+  VERIFY( out.fill() == '*' ); // This will be cached now.
+  out.imbue(std::locale());
+  VERIFY( out.fill() == '*' ); // Imbuing a new locale doesn't affect fill().
+}
+
+int main()
+{
+  test01();
+  test02();
+  test03();
+}
diff --git a/libstdc++-v3/testsuite/27_io/basic_ios/fill/wchar_t/1.cc b/libstdc++-v3/testsuite/27_io/basic_ios/fill/wchar_t/1.cc
new file mode 100644
index 00000000000..2d639a0844d
--- /dev/null
+++ b/libstdc++-v3/testsuite/27_io/basic_ios/fill/wchar_t/1.cc
@@ -0,0 +1,55 @@
+// { dg-do run }
+
+#include <ios>
+#include <locale>
+#include <streambuf>
+#include <testsuite_hooks.h>
+
+typedef wchar_t C;
+
+struct tabby_mctype : std::ctype<C>
+{
+  C do_widen(char c) const { return c == ' ' ? L'\t' : c; }
+
+  const char*
+  do_widen(const char* lo, const char* hi, C* to) const
+  {
+    while (lo != hi)
+      *to++ = do_widen(*lo++);
+    return hi;
+  }
+};
+
+void
+test01()
+{
+  std::basic_ios<C> out(0);
+  std::locale loc(std::locale(), new tabby_mctype);
+  out.imbue(loc);
+  VERIFY( out.fill() == L' ' ); // Imbuing a new locale doesn't affect fill().
+  out.fill(L'*');
+  VERIFY( out.fill() == L'*' ); // This will be cached now.
+  out.imbue(std::locale());
+  VERIFY( out.fill() == L'*' ); // Imbuing a new locale doesn't affect fill().
+}
+
+void
+test02()
+{
+  std::locale loc(std::locale(), new tabby_mctype);
+  std::locale::global(loc);
+  std::basic_ios<C> out(0);
+  VERIFY( out.fill() == L'\t' );
+  out.imbue(std::locale::classic());
+  VERIFY( out.fill() == L'\t' ); // Imbuing a new locale doesn't affect fill().
+  out.fill(L'*');
+  VERIFY( out.fill() == L'*' ); // This will be cached now.
+  out.imbue(std::locale());
+  VERIFY( out.fill() == L'*' ); // Imbuing a new locale doesn't affect fill().
+}
+
+int main()
+{
+  test01();
+  test02();
+}
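
The test02 scenario can also be tried outside the testsuite, without
testsuite_hooks.h. The sketch below is an approximation rather than a
copy of the test: tabby_ctype and fill_survives_later_imbue are
hypothetical names, and unlike the test it restores the previous global
locale before returning. It assumes only that the stream picks up the
global locale at construction, which holds with or without the patch
because fill() is called before the locale is replaced.

```cpp
#include <cassert>
#include <ios>
#include <locale>

// Hypothetical facet modelled on tabby_mctype: widens ' ' to '\t'.
struct tabby_ctype : std::ctype<char>
{
protected:
  char do_widen(char c) const override { return c == ' ' ? '\t' : c; }

  const char*
  do_widen(const char* lo, const char* hi, char* to) const override
  {
    while (lo != hi)
      *to++ = do_widen(*lo++);
    return hi;
  }
};

// Construct a stream while the tabby locale is global, read the fill
// character, then imbue the classic locale and read it again.
inline bool fill_survives_later_imbue()
{
  // std::locale::global returns the previous global locale.
  const std::locale saved = std::locale::global(
      std::locale(std::locale::classic(), new tabby_ctype));

  std::basic_ios<char> out(nullptr);  // no stream buffer is needed here
  const char at_construction = out.fill();
  out.imbue(std::locale::classic());
  const char after_imbue = out.fill();

  std::locale::global(saved);         // undo the global side effect
  return at_construction == '\t' && after_imbue == '\t';
}
```

Restoring the saved locale keeps the helper from leaking global state
into whatever runs after it.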