From patchwork Wed Dec 21 23:06:01 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 708004
From: Richard Henderson
To: libc-alpha@sourceware.org
Subject: [PATCH v2 12/16] hppa: Add string-fzb.h and string-fzi.h
Date: Wed, 21 Dec 2016 15:06:01 -0800
Message-Id: <20161221230605.28638-13-rth@twiddle.net>
In-Reply-To: <20161221230605.28638-1-rth@twiddle.net>
References: <20161221230605.28638-1-rth@twiddle.net>

Use UXOR,SBZ to test for a zero byte within a word.  While we can
get semi-decent code out of asm-goto, we would do slightly better
with a compiler builtin.

For index_zero et al, sequential testing of bytes is less expensive than
any tricks that involve a count-leading-zeros insn that we don't have.

	* sysdeps/hppa/string-fza.h: New file.
	* sysdeps/hppa/string-fzb.h: New file.
	* sysdeps/hppa/string-fzi.h: New file.
---
 sysdeps/hppa/string-fza.h |   1 +
 sysdeps/hppa/string-fzb.h |  69 +++++++++++++++++++++++++
 sysdeps/hppa/string-fzi.h | 129 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 199 insertions(+)
 create mode 100644 sysdeps/hppa/string-fza.h
 create mode 100644 sysdeps/hppa/string-fzb.h
 create mode 100644 sysdeps/hppa/string-fzi.h

diff --git a/sysdeps/hppa/string-fza.h b/sysdeps/hppa/string-fza.h
new file mode 100644
index 0000000..92cfad7
--- /dev/null
+++ b/sysdeps/hppa/string-fza.h
@@ -0,0 +1 @@
+#error "string-fza.h is unused on HPPA"
diff --git a/sysdeps/hppa/string-fzb.h b/sysdeps/hppa/string-fzb.h
new file mode 100644
index 0000000..97f1b64
--- /dev/null
+++ b/sysdeps/hppa/string-fzb.h
@@ -0,0 +1,69 @@
+/* string-fzb.h -- zero byte detection, boolean.  HPPA version.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef STRING_FZB_H
+#define STRING_FZB_H 1
+
+#include
+
+/* Determine if any byte within X is zero.  This is a pure boolean test.  */
+
+static inline _Bool
+has_zero (op_t x)
+{
+  _Static_assert (sizeof (op_t) == 4, "64-bit not supported");
+
+  /* It's more useful to expose a control transfer to the compiler
+     than to expose a proper boolean result.  */
+  asm goto ("uxor,sbz %%r0,%0,%%r0\n\t"
+            "b,n %l1" : : "r"(x) : : nbz);
+  return 1;
+ nbz:
+  return 0;
+}
+
+/* Likewise, but for byte equality between X1 and X2.  */
+
+static inline _Bool
+has_eq (op_t x1, op_t x2)
+{
+  _Static_assert (sizeof (op_t) == 4, "64-bit not supported");
+
+  asm goto ("uxor,sbz %0,%1,%%r0\n\t"
+            "b,n %l2" : : "r"(x1), "r"(x2) : : nbz);
+  return 1;
+ nbz:
+  return 0;
+}
+
+/* Likewise, but for zeros in X1 and equal bytes between X1 and X2.  */
+
+static inline _Bool
+has_zero_eq (op_t x1, op_t x2)
+{
+  _Static_assert (sizeof (op_t) == 4, "64-bit not supported");
+
+  asm goto ("uxor,sbz %%r0,%0,%%r0\n\t"
+            "uxor,nbz %0,%1,%%r0\n\t"
+            "b,n %l2" : : "r"(x1), "r"(x2) : : sbz);
+  return 0;
+ sbz:
+  return 1;
+}
+
+#endif /* STRING_FZB_H */
diff --git a/sysdeps/hppa/string-fzi.h b/sysdeps/hppa/string-fzi.h
new file mode 100644
index 0000000..ae310d9
--- /dev/null
+++ b/sysdeps/hppa/string-fzi.h
@@ -0,0 +1,129 @@
+/* string-fzi.h -- zero byte detection; indexes.  HPPA version.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef STRING_FZI_H
+#define STRING_FZI_H 1
+
+#include
+
+/* Given a word X that is known to contain a zero byte, return the
+   index of the first such within the word in memory order.  */
+
+static inline unsigned int
+index_first_zero (op_t x)
+{
+  unsigned int ret;
+
+  _Static_assert (sizeof (op_t) == 4, "64-bit not supported");
+
+  /* Since we have no clz insn, direct tests of the bytes are faster
+     than loading up the constants to do the masking.  */
+  asm ("extrw,u,<> %1,23,8,%%r0\n\t"
+       "ldi 2,%0\n\t"
+       "extrw,u,<> %1,15,8,%%r0\n\t"
+       "ldi 1,%0\n\t"
+       "extrw,u,<> %1,7,8,%%r0\n\t"
+       "ldi 0,%0"
+       : "=r"(ret) : "r"(x), "0"(3));
+
+  return ret;
+}
+
+/* Similarly, but perform the search for byte equality between X1 and X2.  */
+
+static inline unsigned int
+index_first_eq (op_t x1, op_t x2)
+{
+  return index_first_zero (x1 ^ x2);
+}
+
+/* Similarly, but perform the search for zero within X1 or
+   equality between X1 and X2.  */
+
+static inline unsigned int
+index_first_zero_eq (op_t x1, op_t x2)
+{
+  unsigned int ret;
+
+  _Static_assert (sizeof (op_t) == 4, "64-bit not supported");
+
+  /* Since we have no clz insn, direct tests of the bytes are faster
+     than loading up the constants to do the masking.  */
+  asm ("extrw,u,= %1,23,8,%%r0\n\t"
+       "extrw,u,<> %2,23,8,%%r0\n\t"
+       "ldi 2,%0\n\t"
+       "extrw,u,= %1,15,8,%%r0\n\t"
+       "extrw,u,<> %2,15,8,%%r0\n\t"
+       "ldi 1,%0\n\t"
+       "extrw,u,= %1,7,8,%%r0\n\t"
+       "extrw,u,<> %2,7,8,%%r0\n\t"
+       "ldi 0,%0"
+       : "=r"(ret) : "r"(x1), "r"(x1 ^ x2), "0"(3));
+
+  return ret;
+}
+
+/* Similarly, but perform the search for zero within X1 or
+   inequality between X1 and X2.  */
+
+static inline unsigned int
+index_first_zero_ne (op_t x1, op_t x2)
+{
+  unsigned int ret;
+
+  _Static_assert (sizeof (op_t) == 4, "64-bit not supported");
+
+  /* Since we have no clz insn, direct tests of the bytes are faster
+     than loading up the constants to do the masking.  */
+  asm ("extrw,u,<> %2,23,8,%%r0\n\t"
+       "extrw,u,<> %1,23,8,%%r0\n\t"
+       "ldi 2,%0\n\t"
+       "extrw,u,<> %2,15,8,%%r0\n\t"
+       "extrw,u,<> %1,15,8,%%r0\n\t"
+       "ldi 1,%0\n\t"
+       "extrw,u,<> %2,7,8,%%r0\n\t"
+       "extrw,u,<> %1,7,8,%%r0\n\t"
+       "ldi 0,%0"
+       : "=r"(ret) : "r"(x1), "r"(x1 ^ x2), "0"(3));
+
+  return ret;
+}
+
+/* Similarly, but search for the last zero within X.  */
+
+static inline unsigned int
+index_last_zero (op_t x)
+{
+  unsigned int ret;
+
+  _Static_assert (sizeof (op_t) == 4, "64-bit not supported");
+
+  /* Since we have no ctz insn, direct tests of the bytes are faster
+     than loading up the constants to do the masking.  */
+  asm ("extrw,u,<> %1,15,8,%%r0\n\t"
+       "ldi 1,%0\n\t"
+       "extrw,u,<> %1,23,8,%%r0\n\t"
+       "ldi 2,%0\n\t"
+       "extrw,u,<> %1,31,8,%%r0\n\t"
+       "ldi 3,%0"
+       : "=r"(ret) : "r"(x), "0"(0));
+
+  return ret;
+}
+
+#endif /* STRING_FZI_H */
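
Reviewer note, not part of the patch: for readers without an HPPA toolchain, the results the helpers in string-fzb.h and string-fzi.h are meant to produce can be modelled in portable C. This is only a sketch under stated assumptions. ref_op_t and every ref_* name are hypothetical and do not exist in glibc; op_t is taken to be a 32-bit word, as the _Static_assert in the patch requires; and bytes are numbered in big-endian memory order (byte 0 is the most significant), matching HPPA.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for op_t; the HPPA headers assert it is 4 bytes.  */
typedef uint32_t ref_op_t;

/* Byte I of X in big-endian memory order: byte 0 is the most significant.  */
static inline unsigned int
ref_byte (ref_op_t x, int i)
{
  return (x >> (24 - 8 * i)) & 0xff;
}

/* Model of has_zero: true if any byte of X is zero.  */
static inline bool
ref_has_zero (ref_op_t x)
{
  for (int i = 0; i < 4; i++)
    if (ref_byte (x, i) == 0)
      return true;
  return false;
}

/* Model of has_eq: true if X1 and X2 agree in some byte position.  */
static inline bool
ref_has_eq (ref_op_t x1, ref_op_t x2)
{
  return ref_has_zero (x1 ^ x2);
}

/* Model of has_zero_eq: a zero byte in X1, or an equal byte pair.  */
static inline bool
ref_has_zero_eq (ref_op_t x1, ref_op_t x2)
{
  return ref_has_zero (x1) || ref_has_zero (x1 ^ x2);
}

/* Model of index_first_zero.  Like the asm, it starts from 3 and lets
   each earlier byte overwrite the result, so X must contain a zero.  */
static inline unsigned int
ref_index_first_zero (ref_op_t x)
{
  unsigned int ret = 3;
  for (int i = 2; i >= 0; i--)
    if (ref_byte (x, i) == 0)
      ret = i;
  return ret;
}

/* Model of index_first_zero_eq: first byte that is zero in X1 or
   equal between X1 and X2.  */
static inline unsigned int
ref_index_first_zero_eq (ref_op_t x1, ref_op_t x2)
{
  unsigned int ret = 3;
  for (int i = 2; i >= 0; i--)
    if (ref_byte (x1, i) == 0 || ref_byte (x1 ^ x2, i) == 0)
      ret = i;
  return ret;
}

/* Model of index_first_zero_ne: first byte that is zero in X1 or
   differs between X1 and X2.  */
static inline unsigned int
ref_index_first_zero_ne (ref_op_t x1, ref_op_t x2)
{
  unsigned int ret = 3;
  for (int i = 2; i >= 0; i--)
    if (ref_byte (x1, i) == 0 || ref_byte (x1 ^ x2, i) != 0)
      ret = i;
  return ret;
}

/* Model of index_last_zero: starts from 0 and lets each later byte
   overwrite the result, so X must contain a zero.  */
static inline unsigned int
ref_index_last_zero (ref_op_t x)
{
  unsigned int ret = 0;
  for (int i = 1; i <= 3; i++)
    if (ref_byte (x, i) == 0)
      ret = i;
  return ret;
}
```

A model along these lines could serve as an oracle when testing the asm versions on real hardware, for example by comparing has_zero (x) with ref_has_zero (x) over random words.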