From patchwork Tue Nov 20 09:05:21 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eric Botcazou
X-Patchwork-Id: 1000313
From: Eric Botcazou
To: gcc-patches@gcc.gnu.org
Subject: Fix PR rtl-optimization/85925
Date: Tue, 20 Nov 2018 10:05:21 +0100
Message-ID: <2548654.RuPc4GTstv@polaris>

This is a regression present on all active branches: the combiner wrongly
optimizes away a zero-extension on the ARM because it rewrites a ZERO_EXTRACT
from SImode to HImode after having recorded that the upper bits of the results
are cleared for WORD_REGISTER_OPERATIONS architectures.

I tried 3 approaches to fix the bug (with the help of Segher to evaluate the
pessimization on various architectures):
 1. Disabling the WORD_REGISTER_OPERATIONS mechanism in the combiner,
 2. Preventing the ZERO_EXTRACT from being rewritten from SImode to HImode,
 3. Selectively disabling the WORD_REGISTER_OPERATIONS mechanism.

The 3 approaches pessimize (as expected) in the following order: 2 > 1 > 3.
The attached patch implements the 3rd approach, which seems a good compromise.

Tested on arm-elf and sparc-sun-solaris2.11, applied on all active branches.


2018-11-20  Eric Botcazou

	PR rtl-optimization/85925
	* rtl.h (word_register_operation_p): New predicate.
	* combine.c (record_dead_and_set_regs_1): Only apply specific handling
	for WORD_REGISTER_OPERATIONS targets to word_register_operation_p RTX.
	* rtlanal.c (nonzero_bits1): Likewise.  Adjust couple of comments.
	(num_sign_bit_copies1): Likewise.


2018-11-20  Eric Botcazou

	* gcc.c-torture/execute/20181120-1.c: New test.

Index: rtl.h
===================================================================
--- rtl.h	(revision 266178)
+++ rtl.h	(working copy)
@@ -4374,6 +4375,25 @@ strip_offset_and_add (rtx x, poly_int64_
   return x;
 }
 
+/* Return true if X is an operation that always operates on the full
+   registers for WORD_REGISTER_OPERATIONS architectures.  */
+
+inline bool
+word_register_operation_p (const_rtx x)
+{
+  switch (GET_CODE (x))
+    {
+    case ROTATE:
+    case ROTATERT:
+    case SIGN_EXTRACT:
+    case ZERO_EXTRACT:
+      return false;
+
+    default:
+      return true;
+    }
+}
+
 /* gtype-desc.c.  */
 extern void gt_ggc_mx (rtx &);
 extern void gt_pch_nx (rtx &);
Index: combine.c
===================================================================
--- combine.c	(revision 266178)
+++ combine.c	(working copy)
@@ -13331,6 +13331,7 @@ record_dead_and_set_regs_1 (rtx dest, co
                && subreg_lowpart_p (SET_DEST (setter)))
         record_value_for_reg (dest, record_dead_insn,
                               WORD_REGISTER_OPERATIONS
+                              && word_register_operation_p (SET_SRC (setter))
                               && paradoxical_subreg_p (SET_DEST (setter))
                               ? SET_SRC (setter)
                               : gen_lowpart (GET_MODE (dest),
Index: rtlanal.c
===================================================================
--- rtlanal.c	(revision 266178)
+++ rtlanal.c	(working copy)
@@ -4485,12 +4485,12 @@ nonzero_bits1 (const_rtx x, scalar_int_m
      might be nonzero in its own mode, taking into account the fact that, on
      CISC machines, accessing an object in a wider mode generally causes the
      high-order bits to become undefined, so they are not known to be zero.
-     We extend this reasoning to RISC machines for rotate operations since the
-     semantics of the operations in the larger mode is not well defined.  */
+     We extend this reasoning to RISC machines for operations that might not
+     operate on the full registers.  */
   if (mode_width > xmode_width
       && xmode_width <= BITS_PER_WORD
       && xmode_width <= HOST_BITS_PER_WIDE_INT
-      && (!WORD_REGISTER_OPERATIONS || code == ROTATE || code == ROTATERT))
+      && !(WORD_REGISTER_OPERATIONS && word_register_operation_p (x)))
     {
       nonzero &= cached_nonzero_bits (x, xmode,
                                       known_x, known_mode, known_ret);
@@ -4758,13 +4758,16 @@ nonzero_bits1 (const_rtx x, scalar_int_m
           nonzero &= cached_nonzero_bits (SUBREG_REG (x), mode,
                                           known_x, known_mode, known_ret);
 
-          /* On many CISC machines, accessing an object in a wider mode
+          /* On a typical CISC machine, accessing an object in a wider mode
              causes the high-order bits to become undefined.  So they are
-             not known to be zero.  */
+             not known to be zero.
+
+             On a typical RISC machine, we only have to worry about the way
+             loads are extended.  Otherwise, if we get a reload for the inner
+             part, it may be loaded from the stack, and then we may lose all
+             the zero bits that existed before the store to the stack.  */
           rtx_code extend_op;
           if ((!WORD_REGISTER_OPERATIONS
-               /* If this is a typical RISC machine, we only have to worry
-                  about the way loads are extended.  */
                || ((extend_op = load_extend_op (inner_mode)) == SIGN_EXTEND
                    ? val_signbit_known_set_p (inner_mode, nonzero)
                    : extend_op != ZERO_EXTEND)
@@ -5025,10 +5028,9 @@ num_sign_bit_copies1 (const_rtx x, scala
     {
       /* If this machine does not do all register operations on the entire
          register and MODE is wider than the mode of X, we can say nothing
-         at all about the high-order bits.  We extend this reasoning to every
-         machine for rotate operations since the semantics of the operations
-         in the larger mode is not well defined.  */
-      if (!WORD_REGISTER_OPERATIONS || code == ROTATE || code == ROTATERT)
+         at all about the high-order bits.  We extend this reasoning to RISC
+         machines for operations that might not operate on full registers.  */
+      if (!(WORD_REGISTER_OPERATIONS && word_register_operation_p (x)))
         return 1;
 
       /* Likewise on machines that do, if the mode of the object is smaller
@@ -5107,13 +5109,12 @@ num_sign_bit_copies1 (const_rtx x, scala
           /* For paradoxical SUBREGs on machines where all register operations
              affect the entire register, just look inside.  Note that we are
              passing MODE to the recursive call, so the number of sign bit
-             copies will remain relative to that mode, not the inner mode.  */
+             copies will remain relative to that mode, not the inner mode.
 
-          /* This works only if loads sign extend.  Otherwise, if we get a
+             This works only if loads sign extend.  Otherwise, if we get a
              reload for the inner part, it may be loaded from the stack, and
              then we lose all sign bit copies that existed before the store
              to the stack.  */
-
           if (WORD_REGISTER_OPERATIONS
               && load_extend_op (inner_mode) == SIGN_EXTEND
               && paradoxical_subreg_p (x)
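
For readers who want to see the selective approach in isolation, here is a minimal standalone C sketch of how the new predicate combines with the target flag. It is an illustration only, not GCC code: the enum, its `_SKETCH` constants and both `*_sketch*` functions are hypothetical stand-ins, and the target macro is modelled as a plain boolean parameter.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for a handful of RTX codes; the real compiler
   obtains the code with GET_CODE (x) on an rtx.  */
enum rtx_code_sketch
{
  PLUS_SKETCH,
  ROTATE_SKETCH,
  ROTATERT_SKETCH,
  SIGN_EXTRACT_SKETCH,
  ZERO_EXTRACT_SKETCH
};

/* Same shape as word_register_operation_p in the patch: rotates and
   bit-field extracts are the operations not assumed to act on the
   full register.  */
static bool
word_register_operation_sketch_p (enum rtx_code_sketch code)
{
  switch (code)
    {
    case ROTATE_SKETCH:
    case ROTATERT_SKETCH:
    case SIGN_EXTRACT_SKETCH:
    case ZERO_EXTRACT_SKETCH:
      return false;

    default:
      return true;
    }
}

/* Same shape as the new guard in nonzero_bits1 and
   num_sign_bit_copies1: returns true when nothing may be assumed
   about the bits above the operation's own mode.  The first parameter
   models the WORD_REGISTER_OPERATIONS target macro (an assumption of
   this sketch).  */
static bool
high_bits_unknown_sketch (bool word_register_operations,
                          enum rtx_code_sketch code)
{
  return !(word_register_operations
           && word_register_operation_sketch_p (code));
}
```

With the flag modelled as true, `high_bits_unknown_sketch` returns true for `ZERO_EXTRACT_SKETCH` and false for `PLUS_SKETCH`; with the flag false it returns true for every code, which corresponds to the behaviour that `!WORD_REGISTER_OPERATIONS` already had before the patch.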