From patchwork Tue May 12 21:09:46 2015
X-Patchwork-Submitter: Ondřej Bílka
X-Patchwork-Id: 471567
Date: Tue, 12 May 2015 23:09:46 +0200
From: Ondřej Bílka
To: libc-alpha@sourceware.org
Subject: strstr benchtest.
Message-ID: <20150512210946.GB24057@domone>
References: <20150512205311.GA24057@domone>
In-Reply-To: <20150512205311.GA24057@domone>

On Tue, May 12, 2015 at 10:53:11PM +0200, Ondřej Bílka wrote:
> Hi,
>

As I mentioned strstr benchmarking, here is what I used. The current
benchmark is as useless as the rest of the string benchmarks. To compile
you need to delete benchtests/bench-strstr.o, as strstr is tested by
header inclusion and the benchtests Makefile doesn't detect that
strstr.c changed.

The main problem with the current benchtest was its very biased input.
I replaced it with randomly generated inputs that are closer to the
average case.

One problem with periodic input of period 16 is that it skews branch
prediction a lot. It creates a periodic pattern of branches that are
predicted perfectly, while in reality performance would be an order of
magnitude worse.

The second problem is that it was a worst-case scenario for this
heuristic instead of an average one, as the first three characters
always form the searched triplet instead of being randomly distributed,
as they are when characters are selected independently.

I couldn't get a definite answer. As I said before, this approach is
good when a string has high entropy and problematic on lower-entropy
ones. Which one is better depends on the entropy of the strings used by
a given application. It isn't always better, but the benefits outweigh
the cost.

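To make the difference between the two kinds of input concrete, here is
a minimal standalone sketch; it is not part of the patch. The helper
names fill_periodic, fill_random and count_prefix_hits are made up for
illustration, and fill_random draws bytes from 1..127 (a slight
variation on the patch, chosen so the values fit in a plain char on any
platform). It counts how often the first three characters of the needle
occur in the haystack, which is the situation the triplet heuristic is
sensitive to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Fill s[0..len-1] with the repeating 16-byte pattern the old benchmark
   used, then NUL-terminate.  s must have room for len + 1 bytes.  */
static void
fill_periodic (char *s, size_t len)
{
  static const char d[] = "1234567890abcdef";
  assert (s != NULL);
  for (size_t i = 0; i < len; i++)
    s[i] = d[i % (sizeof (d) - 1)];
  s[len] = '\0';
}

/* Fill s[0..len-1] with pseudo-random nonzero bytes in 1..127, then
   NUL-terminate.  Assumption: 1..127 rather than the patch's range.  */
static void
fill_random (char *s, size_t len)
{
  assert (s != NULL);
  for (size_t i = 0; i < len; i++)
    s[i] = (char) (rand () % 127 + 1);
  s[len] = '\0';
}

/* Count the haystack positions at which the first three bytes of the
   needle (or the whole needle, if it is shorter) match.  */
static size_t
count_prefix_hits (const char *haystack, const char *needle)
{
  size_t n = strlen (needle);
  size_t hlen = strlen (haystack);
  size_t hits = 0;

  if (n > 3)
    n = 3;
  if (n == 0 || hlen < n)
    return 0;
  for (size_t i = 0; i + n <= hlen; i++)
    if (memcmp (haystack + i, needle, n) == 0)
      hits++;
  return hits;
}
```

With a 64-byte periodic haystack and a 16-byte periodic needle,
count_prefix_hits reports one hit per period, so every period starts a
candidate match. With independently chosen random bytes a triplet hit
should be rare, which is closer to what the heuristic sees on
high-entropy strings.
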
diff --git a/benchtests/bench-strstr.c b/benchtests/bench-strstr.c
index 74f3ee8..0412c1b 100644
--- a/benchtests/bench-strstr.c
+++ b/benchtests/bench-strstr.c
@@ -81,31 +81,15 @@ do_test (size_t align1, size_t align2, size_t len1, size_t len2,
   char *s1 = (char *) (buf1 + align1);
   char *s2 = (char *) (buf2 + align2);
 
-  static const char d[] = "1234567890abcdef";
-#define dl (sizeof (d) - 1)
-  char *ss2 = s2;
-  for (size_t l = len2; l > 0; l = l > dl ? l - dl : 0)
+  for (size_t l = 0; l < len2; l++)
     {
-      size_t t = l > dl ? dl : l;
-      ss2 = mempcpy (ss2, d, t);
+      s2[l] = (rand () % 128) + 1;
     }
   s2[len2] = '\0';
 
-  if (fail)
+  for (size_t l = 0; l < len1; l++)
     {
-      char *ss1 = s1;
-      for (size_t l = len1; l > 0; l = l > dl ? l - dl : 0)
-	{
-	  size_t t = l > dl ? dl : l;
-	  memcpy (ss1, d, t);
-	  ++ss1[len2 > 7 ? 7 : len2 - 1];
-	  ss1 += t;
-	}
-    }
-  else
-    {
-      memset (s1, '0', len1);
-      memcpy (s1 + len1 - len2, s2, len2);
+      s1[l] = (rand () % 128) + 1;
     }
   s1[len1] = '\0';
 