From patchwork Wed Jul 26 15:23:30 2017
X-Patchwork-Submitter: Mike FABIAN
X-Patchwork-Id: 793980
From: Mike FABIAN
To: Zack Weinberg
Cc: GNU C Library
Subject: Re: RFC: locale-source validation script
Date: Wed, 26 Jul 2017 17:23:30 +0200
In-Reply-To: (Zack Weinberg's message of "Wed, 26 Jul 2017 10:49:26 -0400")

Zack Weinberg wrote:

> On Wed, Jul 26, 2017 at 8:44 AM, Mike FABIAN wrote:
>> Zack Weinberg wrote:
>>
>>> - The complaints about "inappropriate character '\t'" are all caused
>>> by _unintentional_ tabs inside strings. If you write
>>>
>>> message "xyz/
>>> abc"
>>>
>>> the whitespace on the second line gets included in the string, which
>>> is not what you want.
>>
>> Yes, at the moment we get for example:
>>
>> $ LC_ALL=et_EE.UTF-8 locale -k postal_fmt
>> postal_fmt="%a%N %f%N %d%N %b%N %s%t%h%t%e%t%r%N %C-%z %T%N %c%N"
>>
>> I’ll fix it like this, this is far more readable as well:
>
> Note that there's probably a bunch of similar cases where the
> undesirable whitespace is just space characters, no tabs - my script
> won't catch that. (I won't be working on it today, but this is on my
> list of things to fix.)
Just as a quick hack to find these cases, I added the following to your
script to find sequences of 2 or more spaces in strings:

--- check-localedef.py.~1~ 2017-07-26 08:14:27.052046435 +0200
+++ check-localedef.py 2017-07-26 16:32:50.625185637 +0200
@@ -369,6 +369,9 @@
                 if c != ' ' and not isgraph(c):
                     log.error(fp.lineno, "inappropriate character {!r} in {}",
                               c, "string" if end_char == '"' else "symbol")
+                if c == ' ' and tbuf[-1:] == [' ']:
+                    log.error(fp.lineno, "suspicious sequence of spaces {}", tbuf)
+
                 tbuf.append(c)
             else:
                 tbuf.append(c)

This found only 3 instances of space sequences. All of them appeared to
be errors, and I pushed a fix.
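
For illustration only, here is a minimal standalone sketch of the same idea, run against a string that has already been extracted rather than inside the tokenizer loop. The name find_space_runs and the sample string are hypothetical; none of this is part of check-localedef.py.

```python
import re

# Hypothetical helper, not part of check-localedef.py: matches any run of
# two or more consecutive space characters.
_SPACE_RUN = re.compile(r" {2,}")


def find_space_runs(value):
    """Return a list of (offset, length) pairs, one per run of 2+ spaces."""
    return [(m.start(), len(m.group())) for m in _SPACE_RUN.finditer(value)]


# Simulates a continuation line whose indentation leaked into the string.
leaked = "xyz" + " " * 8 + "abc"
for offset, length in find_space_runs(leaked):
    print("suspicious run of {} spaces at offset {}".format(length, offset))
```

Unlike the per-character check in the hack above, which logs once for every extra space, this sketch reports each run once, which may be easier to read when a whole indentation leaked into a string.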