From patchwork Sat Sep 22 06:52:21 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ian Lance Taylor
X-Patchwork-Id: 186099
From: Ian Lance Taylor
To: gcc-patches@gcc.gnu.org, gofrontend-dev@googlegroups.com
Subject: Go patch committed: Reject surrogate pairs converting int to string
Date: Fri, 21 Sep 2012 23:52:21 -0700
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux)

This patch to the Go frontend and libgo rejects surrogate code points
(0xd800 through 0xdfff) when converting an int to a string.  They are
not valid in UTF-8, so the frontend now warns about them, and both the
frontend and the runtime substitute the replacement character 0xfffd.
The patch also rejects a negative int.  Negative ints were already
rejected by the compiler, but not by the runtime.  Bootstrapped and ran
Go testsuite on x86_64-unknown-linux-gnu.  Committed to mainline and
4.7 branch.

Ian

diff -r f16ad4ccc868 go/lex.cc
--- a/go/lex.cc	Fri Sep 21 23:32:36 2012 -0700
+++ b/go/lex.cc	Fri Sep 21 23:42:31 2012 -0700
@@ -1312,6 +1312,12 @@
 	  // Turn it into the "replacement character".
 	  v = 0xfffd;
 	}
+      if (v >= 0xd800 && v < 0xe000)
+	{
+	  warning_at(location, 0,
+		     "unicode code point 0x%x is invalid surrogate pair", v);
+	  v = 0xfffd;
+	}
       if (v <= 0xffff)
 	{
 	  buf[0] = 0xe0 + (v >> 12);
diff -r f16ad4ccc868 libgo/runtime/go-int-to-string.c
--- a/libgo/runtime/go-int-to-string.c	Fri Sep 21 23:32:36 2012 -0700
+++ b/libgo/runtime/go-int-to-string.c	Fri Sep 21 23:42:31 2012 -0700
@@ -17,6 +17,11 @@
   unsigned char *retdata;
   struct __go_string ret;
 
+  /* A negative value is not valid UTF-8; turn it into the replacement
+     character.  */
+  if (v < 0)
+    v = 0xfffd;
+
   if (v <= 0x7f)
     {
       buf[0] = v;
@@ -34,6 +39,10 @@
 	 "replacement character".  */
       if (v > 0x10ffff)
 	v = 0xfffd;
+      /* If the value is a surrogate pair, which is invalid in UTF-8,
+	 turn it into the replacement character.  */
+      if (v >= 0xd800 && v < 0xe000)
+	v = 0xfffd;
       if (v <= 0xffff)
 	{
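
For readers who want to try the checks outside of libgo, here is a
minimal standalone sketch of the same idea.  It is not the code in
go-int-to-string.c: the name encode_code_point, its signature, and the
choice to fold the three checks into one condition are assumptions made
for illustration only.  The range tests and the 0xfffd substitution
follow the hunks above; the byte layout is ordinary UTF-8 encoding.

```c
#include <stddef.h>

/* Hypothetical helper, not the libgo function.  Encodes the code
   point V as UTF-8 into BUF, which must hold at least 4 bytes, and
   returns the number of bytes written.  */
size_t
encode_code_point (long v, unsigned char *buf)
{
  /* Negative values, values above 0x10ffff, and surrogate code
     points cannot be encoded as UTF-8; use the replacement
     character instead.  */
  if (v < 0 || v > 0x10ffff || (v >= 0xd800 && v < 0xe000))
    v = 0xfffd;

  if (v <= 0x7f)
    {
      /* One byte: plain ASCII.  */
      buf[0] = (unsigned char) v;
      return 1;
    }
  if (v <= 0x7ff)
    {
      /* Two bytes: 110xxxxx 10xxxxxx.  */
      buf[0] = (unsigned char) (0xc0 + (v >> 6));
      buf[1] = (unsigned char) (0x80 + (v & 0x3f));
      return 2;
    }
  if (v <= 0xffff)
    {
      /* Three bytes: 1110xxxx 10xxxxxx 10xxxxxx.  */
      buf[0] = (unsigned char) (0xe0 + (v >> 12));
      buf[1] = (unsigned char) (0x80 + ((v >> 6) & 0x3f));
      buf[2] = (unsigned char) (0x80 + (v & 0x3f));
      return 3;
    }
  /* Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.  */
  buf[0] = (unsigned char) (0xf0 + (v >> 18));
  buf[1] = (unsigned char) (0x80 + ((v >> 12) & 0x3f));
  buf[2] = (unsigned char) (0x80 + ((v >> 6) & 0x3f));
  buf[3] = (unsigned char) (0x80 + (v & 0x3f));
  return 4;
}
```

With this sketch, 0xd800, 0xdfff, -1, and 0x110000 all come out as the
three bytes ef bf bd (the encoding of 0xfffd), while 0xe000, the first
value past the surrogate range, is encoded normally as ee 80 80.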