From patchwork Mon Nov 4 10:32:16 2013
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 288149
Date: Mon, 4 Nov 2013 02:32:16 -0800
From: "H.J. Lu"
To: gcc-patches@gcc.gnu.org
Cc: law@redhat.com, rguenther@suse.de
Subject: PATCH: middle-end/58981: movmem/setmem use mode wider than Pmode for size
Message-ID: <20131104103216.GA13798@lucon.org>

emit_block_move_via_movmem and set_storage_via_setmem have

  for (mode = GET_CLASS_NARROWEST_MODE (MODE_INT); mode != VOIDmode;
       mode = GET_MODE_WIDER_MODE (mode))
    {
      enum insn_code code = direct_optab_handler (movmem_optab, mode);

      if (code != CODE_FOR_nothing
          /* We don't need MODE to be narrower than BITS_PER_HOST_WIDE_INT
             here because if SIZE is less than the mode mask, as it is
             returned by the macro, it will definitely be less than the
             actual mode mask.  */
          && ((CONST_INT_P (size)
               && ((unsigned HOST_WIDE_INT) INTVAL (size)
                   <= (GET_MODE_MASK (mode) >> 1)))
              || GET_MODE_BITSIZE (mode) >= BITS_PER_WORD))
        {

A backend may assume that the mode of size in the movmem and setmem
expanders is no wider than Pmode, since size is within the Pmode
address space.  The x86 backend's
expand_set_or_movmem_prologue_epilogue_by_misaligned has

  rtx saveddest = *destptr;

  gcc_assert (desired_align <= size);
  /* Align destptr up, place it to new register.  */
  *destptr = expand_simple_binop (GET_MODE (*destptr), PLUS, *destptr,
                                  GEN_INT (prolog_size),
                                  NULL_RTX, 1, OPTAB_DIRECT);
  *destptr = expand_simple_binop (GET_MODE (*destptr), AND, *destptr,
                                  GEN_INT (-desired_align),
                                  *destptr, 1, OPTAB_DIRECT);
  /* See how many bytes we skipped.  */
  saveddest = expand_simple_binop (GET_MODE (*destptr), MINUS, saveddest,
                                   *destptr,
                                   saveddest, 1, OPTAB_DIRECT);
  /* Adjust srcptr and count.  */
  if (!issetmem)
    *srcptr = expand_simple_binop (GET_MODE (*srcptr), MINUS, *srcptr,
                                   saveddest, *srcptr, 1, OPTAB_DIRECT);
  *count = expand_simple_binop (GET_MODE (*count), PLUS, *count,
                                saveddest, *count, 1, OPTAB_DIRECT);

saveddest is a negative number in Pmode and *count is in word_mode.
For x32, where Pmode is SImode and word_mode is DImode,
saveddest + *count overflows.  We could fix it by using the mode of
saveddest to compute saveddest + *count, but that leads to extra
conversions, and other backends may run into the same problem.  A
better fix is to limit the mode of size in the movmem and setmem
expanders to Pmode.  It generates better and correct memcpy and memset
for x32.

There is also a typo in the comments.  It should be BITS_PER_WORD, not
BITS_PER_HOST_WIDE_INT.

Tested on x32.  OK to install?

Thanks.

H.J.
---
gcc/

2013-11-04  H.J. Lu

        PR middle-end/58981
        * expr.c (emit_block_move_via_movmem): Don't use mode wider than
        Pmode for size.
        (set_storage_via_setmem): Likewise.

gcc/testsuite/

2013-11-04  H.J. Lu

        PR middle-end/58981
        * gcc.dg/pr58981.c: New test.

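For reviewers who want to see the arithmetic, the saveddest + *count
problem can be sketched in plain C.  This is only a model, not GCC
code: adjust_count_mixed and adjust_count_pmode are hypothetical names,
and the sketch assumes the 32-bit saveddest is zero-extended when it is
combined with a 64-bit count.  The exact conversion GCC performs is not
shown in this mail.

```c
#include <stdint.h>

/* Model of "count + saveddest" where saveddest is a negative 32-bit
   value (SImode on x32) and count is held in a 64-bit word (DImode).
   Both helpers are hypothetical and exist only for illustration.  */

/* Mixed-mode add: the 32-bit value is zero-extended to 64 bits first
   (assumed behavior), so the negative adjustment becomes a huge
   positive one.  */
uint64_t
adjust_count_mixed (uint64_t count, uint32_t saveddest)
{
  return count + (uint64_t) saveddest;
}

/* Same-mode add: both operands are 32 bits wide, so the wraparound
   cancels and the result is count minus the skipped bytes.  */
uint32_t
adjust_count_pmode (uint32_t count, uint32_t saveddest)
{
  return count + saveddest;
}
```

With count = 100 and saveddest = (uint32_t) -3, the same-mode add gives
97, while the mixed-mode add gives 4294967393.
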
diff --git a/gcc/expr.c b/gcc/expr.c
index 551a660..1a869650 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -1294,13 +1294,16 @@ emit_block_move_via_movmem (rtx x, rtx y, rtx size, unsigned int align,
       enum insn_code code = direct_optab_handler (movmem_optab, mode);
 
       if (code != CODE_FOR_nothing
-          /* We don't need MODE to be narrower than BITS_PER_HOST_WIDE_INT
-             here because if SIZE is less than the mode mask, as it is
+          /* We don't need MODE to be narrower than BITS_PER_WORD here
+             because if SIZE is less than the mode mask, as it is
              returned by the macro, it will definitely be less than the
-             actual mode mask.  */
+             actual mode mask unless MODE is wider than Pmode.  Since
+             SIZE is within the Pmode address space, we should use
+             Pmode in this case.  */
           && ((CONST_INT_P (size)
                && ((unsigned HOST_WIDE_INT) INTVAL (size)
                    <= (GET_MODE_MASK (mode) >> 1)))
+              || GET_MODE_BITSIZE (mode) >= GET_MODE_BITSIZE (Pmode)
               || GET_MODE_BITSIZE (mode) >= BITS_PER_WORD))
         {
           struct expand_operand ops[6];
@@ -2879,13 +2882,16 @@ set_storage_via_setmem (rtx object, rtx size, rtx val, unsigned int align,
       enum insn_code code = direct_optab_handler (setmem_optab, mode);
 
       if (code != CODE_FOR_nothing
-          /* We don't need MODE to be narrower than
-             BITS_PER_HOST_WIDE_INT here because if SIZE is less than
-             the mode mask, as it is returned by the macro, it will
-             definitely be less than the actual mode mask.  */
+          /* We don't need MODE to be narrower than BITS_PER_WORD here
+             because if SIZE is less than the mode mask, as it is
+             returned by the macro, it will definitely be less than the
+             actual mode mask unless MODE is wider than Pmode.  Since
+             SIZE is within the Pmode address space, we should use
+             Pmode in this case.  */
           && ((CONST_INT_P (size)
                && ((unsigned HOST_WIDE_INT) INTVAL (size)
                    <= (GET_MODE_MASK (mode) >> 1)))
+              || GET_MODE_BITSIZE (mode) >= GET_MODE_BITSIZE (Pmode)
               || GET_MODE_BITSIZE (mode) >= BITS_PER_WORD))
         {
           struct expand_operand ops[6];
diff --git a/gcc/testsuite/gcc.dg/pr58981.c b/gcc/testsuite/gcc.dg/pr58981.c
new file mode 100644
index 0000000..1c8293e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr58981.c
@@ -0,0 +1,55 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+/* { dg-additional-options "-minline-all-stringops" { target { i?86-*-* x86_64-*-* } } } */
+
+extern void abort (void);
+
+#define MAX_OFFSET (sizeof (long long))
+#define MAX_COPY (8 * sizeof (long long))
+#define MAX_EXTRA (sizeof (long long))
+
+#define MAX_LENGTH (MAX_OFFSET + MAX_COPY + MAX_EXTRA)
+
+static union {
+  char buf[MAX_LENGTH];
+  long long align_int;
+  long double align_fp;
+} u;
+
+char A[MAX_LENGTH];
+
+int
+main ()
+{
+  int off, len, i;
+  char *p, *q;
+
+  for (i = 0; i < MAX_LENGTH; i++)
+    A[i] = 'A';
+
+  for (off = 0; off < MAX_OFFSET; off++)
+    for (len = 1; len < MAX_COPY; len++)
+      {
+        for (i = 0; i < MAX_LENGTH; i++)
+          u.buf[i] = 'a';
+
+        p = __builtin_memcpy (u.buf + off, A, len);
+        if (p != u.buf + off)
+          abort ();
+
+        q = u.buf;
+        for (i = 0; i < off; i++, q++)
+          if (*q != 'a')
+            abort ();
+
+        for (i = 0; i < len; i++, q++)
+          if (*q != 'A')
+            abort ();
+
+        for (i = 0; i < MAX_EXTRA; i++, q++)
+          if (*q != 'a')
+            abort ();
+      }
+
+  return 0;
+}
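
To make the effect of the new condition in the two expr.c hunks easier
to check, here is a small standalone model of the mode selection.  It
is a sketch under assumptions, not GCC code: pick_size_mode_bits is a
hypothetical helper, modes are represented only by their bit widths
(8, 16, 32, 64), and every mode is assumed to have a movmem/setmem
handler, which a real target need not provide.

```c
#include <stdbool.h>
#include <stddef.h>

/* Integer mode widths in bits, narrowest first (stand-ins for
   QImode, HImode, SImode, DImode).  */
static const int mode_bits[] = { 8, 16, 32, 64 };

/* Return the width of the first mode accepted by the size check, or 0
   if none is.  SIZE_IS_CONST and SIZE stand in for CONST_INT_P (size)
   and INTVAL (size).  USE_PMODE_LIMIT selects the patched condition.
   Hypothetical helper: a handler is assumed to exist for every mode.  */
int
pick_size_mode_bits (bool size_is_const, unsigned long long size,
                     int pmode_bits, int word_bits, bool use_pmode_limit)
{
  for (size_t i = 0; i < sizeof mode_bits / sizeof mode_bits[0]; i++)
    {
      int bits = mode_bits[i];
      /* Models GET_MODE_MASK (mode) >> 1.  */
      unsigned long long mask = bits == 64 ? ~0ULL : (1ULL << bits) - 1;
      unsigned long long half_mask = mask >> 1;

      if ((size_is_const && size <= half_mask)
          || (use_pmode_limit && bits >= pmode_bits)
          || bits >= word_bits)
        return bits;
    }
  return 0;
}
```

In this model, with x32 parameters (pmode_bits = 32, word_bits = 64)
and a non-constant size, the old condition selects the 64-bit mode and
the patched condition selects the 32-bit one.  Small constant sizes
still select a narrower mode in both cases.
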