From patchwork Thu Aug 22 12:15:59 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Wakely
X-Patchwork-Id: 1975471
From: Jonathan Wakely
To: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
Cc: Giovanni Bajo
Subject: [PATCH] libstdc++: Fix std::random_shuffle for low RAND_MAX [PR88935]
Date: Thu, 22 Aug 2024 13:15:59 +0100
Message-ID: <20240822122324.721216-1-jwakely@redhat.com>

This is a revised version of a patch Giovanni submitted some years ago,
which has been unreviewed until recently.

Tested x86_64-linux. I would like to push this to trunk.

-- >8 --

When RAND_MAX is small and the number of elements being shuffled is
close to it, we get very uneven distributions in std::random_shuffle.

This uses a simple xorshift generator seeded by std::rand if we can't
rely on std::rand itself.

libstdc++-v3/ChangeLog:

	PR libstdc++/88935
	* include/bits/stl_algo.h (random_shuffle) [RAND_MAX < INT_MAX]: Use
	xorshift instead of rand().
	* testsuite/25_algorithms/random_shuffle/88935.cc: New test.

Co-authored-by: Jonathan Wakely
Signed-off-by: Giovanni Bajo
---
 libstdc++-v3/include/bits/stl_algo.h          | 42 +++++++++++++++----
 .../25_algorithms/random_shuffle/88935.cc     | 24 +++++++++++
 2 files changed, 57 insertions(+), 9 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/25_algorithms/random_shuffle/88935.cc

diff --git a/libstdc++-v3/include/bits/stl_algo.h b/libstdc++-v3/include/bits/stl_algo.h
index 541f588883b..778a37ac46f 100644
--- a/libstdc++-v3/include/bits/stl_algo.h
+++ b/libstdc++-v3/include/bits/stl_algo.h
@@ -4521,15 +4521,39 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
             _RandomAccessIterator>)
       __glibcxx_requires_valid_range(__first, __last);
 
-      if (__first != __last)
-        for (_RandomAccessIterator __i = __first + 1; __i != __last; ++__i)
-          {
-            // XXX rand() % N is not uniformly distributed
-            _RandomAccessIterator __j = __first
-                                        + std::rand() % ((__i - __first) + 1);
-            if (__i != __j)
-              std::iter_swap(__i, __j);
-          }
+      if (__first == __last)
+        return;
+
+#if RAND_MAX < __INT_MAX__
+      if (__builtin_expect((__last - __first) >= RAND_MAX / 4, 0))
+        {
+          // Use a xorshift implementation seeded by two calls to rand()
+          // instead of using rand() for all the random numbers needed.
+          unsigned __xss
+            = (unsigned)std::rand() ^ ((unsigned)std::rand() << 15);
+          for (_RandomAccessIterator __i = __first + 1; __i != __last; ++__i)
+            {
+              __xss += !__xss;
+              __xss ^= __xss << 13;
+              __xss ^= __xss >> 17;
+              __xss ^= __xss << 5;
+              _RandomAccessIterator __j = __first
+                                          + (__xss % ((__i - __first) + 1));
+              if (__i != __j)
+                std::iter_swap(__i, __j);
+            }
+          return;
+        }
+#endif
+
+      for (_RandomAccessIterator __i = __first + 1; __i != __last; ++__i)
+        {
+          // XXX rand() % N is not uniformly distributed
+          _RandomAccessIterator __j = __first
+                                      + (std::rand() % ((__i - __first) + 1));
+          if (__i != __j)
+            std::iter_swap(__i, __j);
+        }
     }
 
   /**
diff --git a/libstdc++-v3/testsuite/25_algorithms/random_shuffle/88935.cc b/libstdc++-v3/testsuite/25_algorithms/random_shuffle/88935.cc
new file mode 100644
index 00000000000..30dca2a897a
--- /dev/null
+++ b/libstdc++-v3/testsuite/25_algorithms/random_shuffle/88935.cc
@@ -0,0 +1,24 @@
+// { dg-do run }
+// { dg-options "-Wno-deprecated-declarations" }
+
+// Bug 88935 std::random_shuffle does not work if the sequence
+// is longer than RAND_MAX elements
+
+#include <algorithm>
+#include <vector>
+#include <testsuite_hooks.h>
+
+int main()
+{
+  const std::size_t N = 30000;
+  std::vector<unsigned char> v(N, (unsigned char)0);
+  std::fill(v.begin() + (N / 5 * 4), v.end(), (unsigned char)1);
+  std::random_shuffle(v.begin(), v.end());
+  double sum = 0;
+  for (std::size_t i = 0; i < v.size(); ++i)
+    {
+      sum += v[i];
+      if (i > 0 && i % (N / 100) == 0)
+        VERIFY( (sum / i) < 0.3 );
+    }
+}
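
For anyone who wants to try the xorshift path outside the library, here is
a rough standalone sketch of the same loop. It is not part of the patch.
The names xorshift32 and shuffle_with_xorshift are made up for
illustration, and the seed is passed in by the caller rather than built
from two std::rand() calls, so results are reproducible. It also uses
std::uint32_t where the patch uses plain unsigned.

```cpp
#include <algorithm>  // std::iter_swap, std::sort
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not in the patch): one step of a 32-bit xorshift
// generator using the same 13/17/5 shift triple as the loop in the patch.
inline std::uint32_t xorshift32(std::uint32_t& state)
{
  state += !state;          // zero is a fixed point, so nudge 0 up to 1
  assert(state != 0);
  state ^= state << 13;
  state ^= state >> 17;
  state ^= state << 5;
  return state;
}

// Hypothetical helper (not in the patch): the shuffle loop from the
// xorshift branch, with an explicit seed in place of std::rand().
// Like the patch, it reduces the random value with %, so a small
// modulo bias remains.
template<typename RandomIt>
void shuffle_with_xorshift(RandomIt first, RandomIt last, std::uint32_t seed)
{
  if (first == last)
    return;

  std::uint32_t state = seed;
  for (RandomIt i = first + 1; i != last; ++i)
    {
      // Pick j uniformly-ish from [first, i] and swap it into place.
      const std::uint32_t span = static_cast<std::uint32_t>(i - first) + 1;
      RandomIt j = first + (xorshift32(state) % span);
      if (i != j)
        std::iter_swap(i, j);
    }
}
```

Calling shuffle_with_xorshift(v.begin(), v.end(), some_seed) on a
std::vector permutes it in place. The 32-bit state means the generator
does not depend on RAND_MAX at all, which is the point of the change.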