From patchwork Fri Dec 20 01:12:18 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hans-Peter Nilsson X-Patchwork-Id: 303863 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 501702C00D3 for ; Fri, 20 Dec 2013 15:33:58 +1100 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :message-id:from:to:cc:in-reply-to:subject:mime-version :content-type:content-transfer-encoding; q=dns; s=default; b=PzR pXp3yKpvTFZrFRmgzcREeaVMscppe+hgg/skBwsSHWM01M0HDuniY41QFZjyMH6f fmo0CZIq2XV1FoXWK2bBXvrPDpX00nvAuWx2z+7GXjb0Id0V0OtOw6xxleA5KnwY OBJELHaEAnJ4iStDRNxxcKV7Mc/A6A1d9lLvYoPI= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :message-id:from:to:cc:in-reply-to:subject:mime-version :content-type:content-transfer-encoding; s=default; bh=AtTBLXs+4 LrUvAM9FZvNA7ntfZo=; b=gjvGXxpMcb9s98DKD+wKN6iOJ36jXNu0ymPvF9suB g32sR0+cfFMZDWWqvux2Uh7eHpjC+D8Hqb7BKEEGS1Pwm2pDJI9RSoI3p31mf5AH 7b6vuw5GBIBaeLEr8hglOFf+hngpYidReIT8VbvUN2Q1gd7VSHXxtiVMXdLYbJAB wQ= Received: (qmail 29298 invoked by alias); 20 Dec 2013 04:33:39 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 29135 invoked by uid 89); 20 Dec 2013 04:33:38 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.4 required=5.0 tests=AWL, BAYES_00, RP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2 X-Spam-User: qpsmtpd, 2 recipients X-HELO: ra.se.axis.com Received: from ra.se.axis.com (HELO ra.se.axis.com) (195.60.68.13) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 20 Dec 2013 04:33:34 +0000 Received: from localhost (localhost [127.0.0.1]) by ra.se.axis.com (Postfix) with ESMTP id 3AAB93F273D; Fri, 20 Dec 2013 05:33:32 +0100 (CET) Received: from ra.se.axis.com ([127.0.0.1]) by localhost (ra.se.axis.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id PfYe+wJldPQn; Fri, 20 Dec 2013 05:33:28 +0100 (CET) Received: from boulder.se.axis.com (boulder.se.axis.com [10.0.2.104]) by ra.se.axis.com (Postfix) with ESMTP id 2656C3FE3F1; Fri, 20 Dec 2013 02:12:19 +0100 (CET) Received: from boulder.se.axis.com (localhost [127.0.0.1]) by postfix.imss71 (Postfix) with ESMTP id AEEAEBA3; Fri, 20 Dec 2013 02:12:19 +0100 (CET) Received: from seth.se.axis.com (seth.se.axis.com [10.0.2.172]) by boulder.se.axis.com (Postfix) with ESMTP id A006892D; Fri, 20 Dec 2013 02:12:19 +0100 (CET) Received: from ignucius.se.axis.com (ignucius.se.axis.com [10.88.21.50]) by seth.se.axis.com (Postfix) with ESMTP id 9C3E13E091; Fri, 20 Dec 2013 02:12:19 +0100 (CET) Received: from ignucius.se.axis.com (localhost [127.0.0.1]) by ignucius.se.axis.com (8.12.8p1/8.12.8/Debian-2woody1) with ESMTP id rBK1CJh9029485; Fri, 20 Dec 2013 02:12:19 +0100 Received: (from hp@localhost) by ignucius.se.axis.com (8.12.8p1/8.12.8/Debian-2woody1) id rBK1CIxx029481; Fri, 20 Dec 2013 02:12:18 +0100 Date: Fri, 20 Dec 2013 02:12:18 +0100 Message-Id: <201312200112.rBK1CIxx029481@ignucius.se.axis.com> From: Hans-Peter Nilsson To: jwakely.gcc@gmail.com CC: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org, Yufeng.Zhang@arm.com, Renlin.Li@arm.com In-reply-to: (message from Jonathan Wakely on Mon, 16 Dec 2013 22:09:27 +0100) Subject: RFA: fix libstdc++ regression, simulator timeout from r205810 MIME-Version: 1.0 Here's a patch that splits up 20_util/hash/chi2_quality.cc *and* increases some of the iteration numbers for simulator targets to something that passes for all working targets mentioned below. I am a bit worried about the stability of these tests and the implementation, seeing this amount of target-specific differences in results. Maybe a person interested in the quality of the implementation and knowing statistics should have a look. I originally naively thought just splitting up the test would help; allowing to remove the simulator target-test-constraints completely, but that only produced timeouts for cris-elf. The effect of this patch is therefore to split it up and increase some of the SAMPLES values compared to the the current commit, assuming there's a useful linear dependence. Don't be fooled by test_document_words now being excluded at the test-top: it already was (for all SAMPLES < 100000), except for compiling and running an empty function; see the original. I've tested this on two different x86_64-linux-gnu hosts, Fedora 17 and Debian 7 aka. "wheezy" (to eliminate my suspicion of distro differences) and three simulator targets: cris-elf, powerpc-eabi and mipsisa32r2el-elf. I also tried running these tests for arm-eabi / arm-sim.exp but they all failed apparently because of memory resource constraints within the simulator setup: in libstdc++.log I see "terminate called after throwing an instance of 'std::bad_alloc'". I felt I honestly had to increase some of the SAMPLES numbers from the current number of 30000; the final number I came up with, passes for all the mentioned working targets. I didn't want to *decrease* any of the numbers (e.g. for simulator only) to exploit some local minima of the "k" values (for example there's one such for 10000 for all targets above except x86_64/32 and the ones the ARM people mentioned). So, I increased the respective number to be higher than where at least one target showed a test-suite failure. When checking the SAMPLES numbers for the hosts, I of course just hacked the default SAMPLES temporarily; they were not tested as simulator targets. The SAMPLES number for all but test_bit_flip_set change to 35000; a number somewhat arbitrarily chosen as higher than 30000. Curiously, I never saw test_bit_flip_set fail as mentioned in . Maybe that was a mistaken observation and what was observed was test_bit_string_set failing. (That's a good reason to always copy-paste instead of *typing* from memory or from reading.) That test certainly failed for most targets, but also for SAMPLES=30000, not mentioned in the referenced message, which made me suspect a distribution- or glibc-version-related difference. A higher SAMPLES number definitely has a noticeable cost in terms of test-time, challenging the statement "it doesn't take that much time either". For both cris-elf-sim and powerpc-eabi-sim running on a x86_64 host of yesteryear, the time for the affected tests went from about 30 seconds to 4 min 28 seconds and 6 min 20 seconds respectively, going from 10000 to these numbers. For the original r205810 change compared to r205803, test-time for cris-elf for a *complete "make check"-run* (C, C++, ObjC, Fortran including libraries) went from 3h56min to 4h5min (when the test timed out). Ok to commit? libstdc++-v3: * testsuite/20_util/hash/chi2_quality.h: Break out from chi2_quality.cc. * testsuite/20_util/hash/chi2_q_bit_flip_set.cc: Ditto. * testsuite/20_util/hash/chi2_q_document_words.cc: Ditto. * testsuite/20_util/hash/chi2_q_bit_string_set.cc: Ditto. Increase SAMPLES to 35000 for simulator targets. * testsuite/20_util/hash/chi2_q_numeric_pattern_set.cc: Ditto. * testsuite/20_util/hash/chi2_q_uniform_random.cc: Ditto. * testsuite/20_util/hash/chi2_quality.cc: Remove. brgds, H-P PS. Happy Holidays! --- libstdc++-v3/testsuite/20_util/hash/chi2_quality.cc Sun Dec 15 15:01:43 2013 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,218 +0,0 @@ -// { dg-options "-std=gnu++0x" } - -// Use smaller statistics when running on simulators, so it takes less time. -// { dg-options "-std=gnu++0x -DSAMPLES=30000" { target simulator } } - -// Copyright (C) 2010-2013 Free Software Foundation, Inc. -// -// This file is part of the GNU ISO C++ Library. This library is free -// software; you can redistribute it and/or modify it under the -// terms of the GNU General Public License as published by the -// Free Software Foundation; either version 3, or (at your option) -// any later version. -// -// This library is distributed in the hope that it will be useful, -// but WITHOUT ANY WARRANTY; without even the implied warranty of -// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -// GNU General Public License for more details. -// -// You should have received a copy of the GNU General Public License -// along with this library; see the file COPYING3. If not see -// . - -// This file uses the chi^2 test to measure the quality of a hash -// function, by computing the uniformity with which it distributes a set -// of N strings into k buckets (where k is significantly greater than N). -// -// Each bucket has B[i] strings in it. The expected value of each bucket -// for a uniform distribution is z = N/k, so -// chi^2 = Sum_i (B[i] - z)^2 / z. -// -// We check whether chi^2 is small enough to be consistent with the -// hypothesis of a uniform distribution. If F(chi^2, k-1) is close to -// 0 (where F is the cumulative probability distribution), we can -// reject that hypothesis. So we don't want F to be too small, which -// for large k, means we want chi^2 to be not too much larger than k. -// -// We use the chi^2 test for several sets of strings. Any non-horrible -// hash function should do well with purely random strings. A really -// good hash function will also do well with more structured sets, -// including ones where the strings differ by only a few bits. - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#ifndef SAMPLES -#define SAMPLES 300000 -#endif - -template - double - chi2_hash(const Container& c, long buckets) - { - std::vector counts(buckets); - std::hash hasher; - double elements = 0; - for (auto i = c.begin(); i != c.end(); ++i) - { - ++counts[hasher(*i) % buckets]; - ++elements; - } - - const double z = elements / buckets; - double sum = 0; - for (long i = 0; i < buckets; ++i) - { - double delta = counts[i] - z; - sum += delta*delta; - } - return sum/z; - } - -// Tests chi^2 for a distribution of uniformly generated random strings. -void -test_uniform_random() -{ - bool test __attribute__((unused)) = true; - std::srand(137); - std::unordered_set set; - std::string s; - const unsigned long N = SAMPLES; - const unsigned long k = N/100; - const unsigned int len = 25; - while (set.size() < N) - { - s.clear(); - for (unsigned int i = 0; i < len; ++i) - s.push_back(rand() % 128); - set.insert(s); - } - - double chi2 = chi2_hash(set, k); - VERIFY( chi2 < k*1.1 ); -} - -// Tests chi^2 for a distribution of strings that differ from each -// other by only a few bits. We start with an arbitrary base string, and -// flip three random bits for each member of the set. -void -test_bit_flip_set() -{ - bool test __attribute__((unused)) = true; - const unsigned long N = SAMPLES; - const unsigned long k = N/100; - const unsigned int len = 67; - const unsigned int bitlen = len * 8; - const unsigned int bits_to_flip = 3; - const char base[len+1] = "abcdefghijklmnopqrstuvwxyz" - "ABCDEFGHIJKLMNOPQRSTUVWXYZ" - "0123456789!@#$%"; - - std::unordered_set set; - while (set.size() < N) - { - std::string s(base, base+len); - for (unsigned int i = 0; i < bits_to_flip; ++i) - { - int bit = rand() % bitlen; - s[bit/8] ^= (1 << (bit%8)); - } - set.insert(s); - } - - double chi2 = chi2_hash(set, k); - VERIFY( chi2 < k*1.1 ); -} - -// Tests chi^2 of a set of strings that all have a similar pattern, -// intended to mimic some sort of ID string. -void -test_numeric_pattern_set() -{ - bool test __attribute__((unused)) = true; - const unsigned long N = SAMPLES; - const unsigned long k = N/100; - std::vector set; - for (unsigned long i = 0; i < N; ++i) - { - long i1 = i % 100000; - long i2 = i / 100000; - char buf[16]; - std::sprintf(buf, "XX-%05lu-%05lu", i1, i2); - set.push_back(buf); - } - - double chi2 = chi2_hash(set, k); - VERIFY( chi2 < k*1.1 ); -} - -// Tests chi^2 for a set of strings that all consist of '1' and '0'. -void -test_bit_string_set() -{ - bool test __attribute__((unused)) = true; - const unsigned long N = SAMPLES; - const unsigned long k = N/100; - std::vector set; - std::string s; - for (unsigned long i = 0; i < N; ++i) - { - s.clear(); - for (unsigned int j = 0; j < sizeof(unsigned long) * 8; ++j) - { - const bool bit = (1UL << j) & i; - s.push_back(bit ? '1' : '0'); - } - set.push_back(s); - } - - double chi2 = chi2_hash(set, k); - VERIFY( chi2 < k*1.1 ); -} - -// Tests chi^2 for a set of words taken from a document written in English. -void -test_document_words() -{ - // That file is 187587 single-word lines. To avoid a timeout, just skip - // this part, which would take up to 95% of the program runtime (with - // SAMPLES == 10000), if we're not supposed to run anywhere that long. -#if SAMPLES >= 100000 - bool test __attribute__((unused)) = true; - const std::string f_name = "thirty_years_among_the_dead_preproc.txt"; - std::ifstream in(f_name); - VERIFY( in.is_open() ); - std::vector words; - words.assign(std::istream_iterator(in), - std::istream_iterator()); - VERIFY( words.size() > 100000 ); - std::sort(words.begin(), words.end()); - auto it = std::unique(words.begin(), words.end()); - words.erase(it, words.end()); - VERIFY( words.size() > 5000 ); - - const unsigned long k = words.size() / 20; - double chi2 = chi2_hash(words, k); - VERIFY( chi2 < k*1.1 ); -#endif -} - -int -main() -{ - test_uniform_random(); - test_bit_flip_set(); - test_numeric_pattern_set(); - test_bit_string_set(); - test_document_words(); - return 0; -} --- /dev/null Tue Oct 29 15:57:07 2002 +++ libstdc++-v3/testsuite/20_util/hash/chi2_quality.h Tue Dec 17 08:14:06 2013 @@ -0,0 +1,74 @@ +// Copyright (C) 2010-2013 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. +// +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. +// +// You should have received a copy of the GNU General Public License +// along with this library; see the file COPYING3. If not see +// . + +// This file uses the chi^2 test to measure the quality of a hash +// function, by computing the uniformity with which it distributes a set +// of N strings into k buckets (where k is significantly greater than N). +// +// Each bucket has B[i] strings in it. The expected value of each bucket +// for a uniform distribution is z = N/k, so +// chi^2 = Sum_i (B[i] - z)^2 / z. +// +// We check whether chi^2 is small enough to be consistent with the +// hypothesis of a uniform distribution. If F(chi^2, k-1) is close to +// 0 (where F is the cumulative probability distribution), we can +// reject that hypothesis. So we don't want F to be too small, which +// for large k, means we want chi^2 to be not too much larger than k. +// +// We use the chi^2 test for several sets of strings. Any non-horrible +// hash function should do well with purely random strings. A really +// good hash function will also do well with more structured sets, +// including ones where the strings differ by only a few bits. + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#ifndef SAMPLES +#define SAMPLES 300000 +#endif + +template + double + chi2_hash(const Container& c, long buckets) + { + std::vector counts(buckets); + std::hash hasher; + double elements = 0; + for (auto i = c.begin(); i != c.end(); ++i) + { + ++counts[hasher(*i) % buckets]; + ++elements; + } + + const double z = elements / buckets; + double sum = 0; + for (long i = 0; i < buckets; ++i) + { + double delta = counts[i] - z; + sum += delta*delta; + } + return sum/z; + } --- /dev/null Tue Oct 29 15:57:07 2002 +++ libstdc++-v3/testsuite/20_util/hash/chi2_q_bit_flip_set.cc Tue Dec 17 08:18:53 2013 @@ -0,0 +1,61 @@ +// { dg-options "-std=gnu++0x" } +// Use smaller statistics when running on simulators, so it takes less time. +// { dg-options "-std=gnu++0x -DSAMPLES=30000" { target simulator } } + +// Copyright (C) 2010-2013 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. +// +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. +// +// You should have received a copy of the GNU General Public License +// along with this library; see the file COPYING3. If not see +// . + +#include "chi2_quality.h" + +// Tests chi^2 for a distribution of strings that differ from each +// other by only a few bits. We start with an arbitrary base string, and +// flip three random bits for each member of the set. +void +test_bit_flip_set() +{ + bool test __attribute__((unused)) = true; + const unsigned long N = SAMPLES; + const unsigned long k = N/100; + const unsigned int len = 67; + const unsigned int bitlen = len * 8; + const unsigned int bits_to_flip = 3; + const char base[len+1] = "abcdefghijklmnopqrstuvwxyz" + "ABCDEFGHIJKLMNOPQRSTUVWXYZ" + "0123456789!@#$%"; + + std::unordered_set set; + while (set.size() < N) + { + std::string s(base, base+len); + for (unsigned int i = 0; i < bits_to_flip; ++i) + { + int bit = rand() % bitlen; + s[bit/8] ^= (1 << (bit%8)); + } + set.insert(s); + } + + double chi2 = chi2_hash(set, k); + VERIFY( chi2 < k*1.1 ); +} + +int +main() +{ + test_bit_flip_set(); + return 0; +} --- /dev/null Tue Oct 29 15:57:07 2002 +++ libstdc++-v3/testsuite/20_util/hash/chi2_q_bit_string_set.cc Tue Dec 17 08:23:42 2013 @@ -0,0 +1,55 @@ +// { dg-options "-std=gnu++0x" } +// Use smaller statistics when running on simulators, so it takes less time. +// For e.g. cris-elf, mipsisa32r2el-elf, powerpc-eabi and i386-linux-gnu, +// this test fails for SAMPLES=30000. +// { dg-options "-std=gnu++0x -DSAMPLES=35000" { target simulator } } + +// Copyright (C) 2010-2013 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. +// +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. +// +// You should have received a copy of the GNU General Public License +// along with this library; see the file COPYING3. If not see +// . + +#include "chi2_quality.h" + +// Tests chi^2 for a set of strings that all consist of '1' and '0'. +void +test_bit_string_set() +{ + bool test __attribute__((unused)) = true; + const unsigned long N = SAMPLES; + const unsigned long k = N/100; + std::vector set; + std::string s; + for (unsigned long i = 0; i < N; ++i) + { + s.clear(); + for (unsigned int j = 0; j < sizeof(unsigned long) * 8; ++j) + { + const bool bit = (1UL << j) & i; + s.push_back(bit ? '1' : '0'); + } + set.push_back(s); + } + + double chi2 = chi2_hash(set, k); + VERIFY( chi2 < k*1.1 ); +} + +int +main() +{ + test_bit_string_set(); + return 0; +} --- /dev/null Tue Oct 29 15:57:07 2002 +++ libstdc++-v3/testsuite/20_util/hash/chi2_q_document_words.cc Tue Dec 17 08:38:03 2013 @@ -0,0 +1,52 @@ +// On some simulators, the workload is simply too large with values big +// enough for the test to pass the quality test, so just skip it altogether. +// { dg-do run { target { ! simulator } } } +// { dg-options "-std=gnu++0x" } + +// Copyright (C) 2010-2013 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. +// +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. +// +// You should have received a copy of the GNU General Public License +// along with this library; see the file COPYING3. If not see +// . + +#include "chi2_quality.h" + +// Tests chi^2 for a set of words taken from a document written in English. +void +test_document_words() +{ + bool test __attribute__((unused)) = true; + const std::string f_name = "thirty_years_among_the_dead_preproc.txt"; + std::ifstream in(f_name); + VERIFY( in.is_open() ); + std::vector words; + words.assign(std::istream_iterator(in), + std::istream_iterator()); + VERIFY( words.size() > 100000 ); + std::sort(words.begin(), words.end()); + auto it = std::unique(words.begin(), words.end()); + words.erase(it, words.end()); + VERIFY( words.size() > 5000 ); + + const unsigned long k = words.size() / 20; + double chi2 = chi2_hash(words, k); + VERIFY( chi2 < k*1.1 ); +} + +int +main() +{ + test_document_words(); + return 0; +} --- /dev/null Tue Oct 29 15:57:07 2002 +++ libstdc++-v3/testsuite/20_util/hash/chi2_q_numeric_pattern_set.cc Tue Dec 17 08:20:52 2013 @@ -0,0 +1,52 @@ +// { dg-options "-std=gnu++0x" } +// Use smaller statistics when running on simulators, so it takes less time. +// For x86_64-linux-gnu SAMPLES=30000 fails, so increase slightly. +// { dg-options "-std=gnu++0x -DSAMPLES=35000" { target simulator } } + +// Copyright (C) 2010-2013 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. +// +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. +// +// You should have received a copy of the GNU General Public License +// along with this library; see the file COPYING3. If not see +// . + +#include "chi2_quality.h" + +// Tests chi^2 of a set of strings that all have a similar pattern, +// intended to mimic some sort of ID string. +void +test_numeric_pattern_set() +{ + bool test __attribute__((unused)) = true; + const unsigned long N = SAMPLES; + const unsigned long k = N/100; + std::vector set; + for (unsigned long i = 0; i < N; ++i) + { + long i1 = i % 100000; + long i2 = i / 100000; + char buf[16]; + std::sprintf(buf, "XX-%05lu-%05lu", i1, i2); + set.push_back(buf); + } + + double chi2 = chi2_hash(set, k); + VERIFY( chi2 < k*1.1 ); +} + +int +main() +{ + test_numeric_pattern_set(); + return 0; +} --- /dev/null Tue Oct 29 15:57:07 2002 +++ libstdc++-v3/testsuite/20_util/hash/chi2_q_uniform_random.cc Tue Dec 17 08:16:42 2013 @@ -0,0 +1,53 @@ +// { dg-options "-std=gnu++0x" } +// Use smaller statistics when running on simulators, so it takes less time. +// For powerpc-eabi, SAMPLES=30000 fails. +// { dg-options "-std=gnu++0x -DSAMPLES=35000" { target simulator } } + +// Copyright (C) 2010-2013 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. +// +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. +// +// You should have received a copy of the GNU General Public License +// along with this library; see the file COPYING3. If not see +// . + +#include "chi2_quality.h" + +// Tests chi^2 for a distribution of uniformly generated random strings. +void +test_uniform_random() +{ + bool test __attribute__((unused)) = true; + std::srand(137); + std::unordered_set set; + std::string s; + const unsigned long N = SAMPLES; + const unsigned long k = N/100; + const unsigned int len = 25; + while (set.size() < N) + { + s.clear(); + for (unsigned int i = 0; i < len; ++i) + s.push_back(rand() % 128); + set.insert(s); + } + + double chi2 = chi2_hash(set, k); + VERIFY( chi2 < k*1.1 ); +} + +int +main() +{ + test_uniform_random(); + return 0; +}