From patchwork Wed Jan 4 15:39:36 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Georg-Johann Lay X-Patchwork-Id: 711034 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3ttw456w8Xz9t0v for ; Thu, 5 Jan 2017 02:40:05 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="HgFEhiug"; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:references:from:message-id:date:mime-version :in-reply-to:content-type; q=dns; s=default; b=ARk8EYeLGi16An7e5 WPp0+YZ08vh5jNevMgNd3XAC0Zca22vmTM7i2WWGS3rW8WMuD1OAVZrvwTdwIw+c otpQvj/YxHhVsbfuGRGtJjlX/3fA9Ad7x/Ta7KA5+jZ0oW9105XKmq9/Hn8kINhj kf5B8b0GvbW5TyuGT1TiWfYIA4= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:references:from:message-id:date:mime-version :in-reply-to:content-type; s=default; bh=E6wtNzlqW2MBQucJgF3mWQ9 Vo08=; b=HgFEhiug9+JEkMMzcxXbmGbkzFOdT0FGCgzqtd66b9wiE9ClT9N5eWo apINpwzItTsA08ke6u59FP4D25fq+CL/jmOLqdw6ladJnndcMFIin08yCGa24Y9i dKDIU8yzaIvURgV5QoT+ALG2OkquryHlhxFFebaW1LqpXz8/YRME= Received: (qmail 13078 invoked by alias); 4 Jan 2017 15:39:55 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 12283 invoked by uid 89); 4 Jan 2017 15:39:54 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=0.6 required=5.0 tests=AWL, BAYES_50, KAM_ASCII_DIVIDERS, RCVD_IN_DNSWL_LOW autolearn=no version=3.3.2 spammy=cure, r31, Lay, numbered X-HELO: mo4-p00-ob.smtp.rzone.de Received: from mo4-p00-ob.smtp.rzone.de (HELO mo4-p00-ob.smtp.rzone.de) (81.169.146.219) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 04 Jan 2017 15:39:44 +0000 X-RZG-AUTH: :LXoWVUeid/7A29J/hMvvT3ol15ykJcYwTPLBCxG2PQt7BpWLF2I= X-RZG-CLASS-ID: mo00 Received: from [192.168.0.123] (ip5f5871bb.dynamic.kabel-deutschland.de [95.88.113.187]) by smtp.strato.de (RZmta 39.11 DYNA|AUTH) with ESMTPSA id V0804ft04FdaKYd (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA (curve secp521r1 with 521 ECDH bits, eq. 15360 bits RSA)) (Client did not present a certificate); Wed, 4 Jan 2017 16:39:36 +0100 (CET) Subject: [patch,avr] PR78883: Implement a dummy scheduler To: vogt@linux.vnet.ibm.com, gcc-patches , Denis Chertykov , Segher Boessenkool , rdsandiford@googlemail.com References: <20170102145409.GA12313@linux.vnet.ibm.com> <877f6cw97e.fsf@googlemail.com> From: Georg-Johann Lay Message-ID: Date: Wed, 4 Jan 2017 16:39:36 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.6.0 MIME-Version: 1.0 In-Reply-To: <877f6cw97e.fsf@googlemail.com> X-IsSubscribed: yes On 03.01.2017 14:43, Richard Sandiford wrote: > Georg-Johann Lay writes: >> On 02.01.2017 15:54, Dominik Vogt wrote: >>> On Mon, Jan 02, 2017 at 03:47:43PM +0100, Georg-Johann Lay wrote: >>>> This fixes PR78883 which is a problem in reload revealed by a >>>> change to combine.c. The fix is as proposed by Segher: implement >>>> CANNOT_CHANGE_MODE_CLASS. >>>> >>>> Ok for trunk? >>>> >>>> Johann >>>> >>>> >>>> gcc/ >>>> PR target/78883 >>>> * config/avr/avr.h (CANNOT_CHANGE_MODE_CLASS): New define. >>>> * config/avr/avr-protos.h (avr_cannot_change_mode_class): New proto. >>>> * config/avr/avr.c (avr_cannot_change_mode_class): New function. >>>> >>>> gcc/testsuite/ >>>> PR target/78883 >>>> * gcc.c-torture/compile/pr78883.c: New test. >>> >>>> Index: config/avr/avr-protos.h >>>> =================================================================== >>>> --- config/avr/avr-protos.h (revision 244001) >>>> +++ config/avr/avr-protos.h (working copy) >>>> @@ -111,7 +111,7 @@ extern int _reg_unused_after (rtx_insn * >>>> extern int avr_jump_mode (rtx x, rtx_insn *insn); >>>> extern int test_hard_reg_class (enum reg_class rclass, rtx x); >>>> extern int jump_over_one_insn_p (rtx_insn *insn, rtx dest); >>>> - >>>> +extern int avr_cannot_change_mode_class (machine_mode, machine_mode, enum reg_class); >>>> extern int avr_hard_regno_mode_ok (int regno, machine_mode mode); >>>> extern void avr_final_prescan_insn (rtx_insn *insn, rtx *operand, >>>> int num_operands); >>>> Index: config/avr/avr.c >>>> =================================================================== >>>> --- config/avr/avr.c (revision 244001) >>>> +++ config/avr/avr.c (working copy) >>>> @@ -11833,6 +11833,21 @@ jump_over_one_insn_p (rtx_insn *insn, rt >>>> } >>>> >>>> >>>> +/* Worker function for `CANNOT_CHANGE_MODE_CLASS'. */ >>>> + >>>> +int >>>> +avr_cannot_change_mode_class (machine_mode from, machine_mode to, >>>> + enum reg_class /* rclass */) >>>> +{ >>>> + /* We cannot access a hard register in a wider mode, for example we >>>> + must not access (reg:QI 31) as (reg:HI 31). HARD_REGNO_MODE_OK >>>> + would avoid such hard regs, but reload would generate it anyway >>>> + from paradoxical subregs of mem, cf. PR78883. */ >>>> + >>>> + return GET_MODE_SIZE (to) > GET_MODE_SIZE (from); >>> >>> I understand how this fixes the ICE, but is it really necessary to >>> suppress conversions to a wider mode for lower numbered registers? >> >> If there is a better hook, I'll propose an according patch. >> >> My expectation was that HARD_REGNO_MODE_OK would be enough to keep >> reload from putting HI into odd registers (and in particular into R31). >> But this is obviously not the case... > > It should be enough in principle. It's just a case of whether you want > to fix reload, hack around it, or take the plunge and switch to LRA. Well, if it can be done in the back-end, then that's generally my strong preference. And the blocker for LRA is that it doesn't support cc0, hence LRA will require an almost complete rewrite of the avr back-end... > Having a (subreg (mem)) is probably key here. If it had been > (subreg (reg:HI X)) for some pseudo X then init_subregs_of_mode should > have realised that 31 isn't a valid choice for X. > > I think the reload fix would be to honour simplifiable_subregs when > reloading the (subreg (mem)). > >> And internals are not very helpful here. It only mentions modifying >> ordinary subregs of pseudos, but not paradoxical subreg of memory. >> >> What's also astonishing me is that this problem never popped up >> during the last > 15 years of avr back-end. > > FWIW, the current init_subregs_of_mode/simplifiable_subregs behaviour > is fairly recent (2014) and CANNOT_CHANGE_MODE_CLASS had been used in > the past to avoid errors like this. Using it that way was always a > hack though. > > An alternative would be to add a new macro to control this block in > general_operand: > > #ifdef INSN_SCHEDULING > /* On machines that have insn scheduling, we want all memory > reference to be explicit, so outlaw paradoxical SUBREGs. > However, we must allow them after reload so that they can > get cleaned up by cleanup_subreg_operands. */ > if (!reload_completed && MEM_P (sub) > && GET_MODE_SIZE (mode) > GET_MODE_SIZE (GET_MODE (sub))) > return 0; > #endif > > The default would still be INSN_SCHEDULING, but AVR could override it > to 1 and reject the same subregs. > > That would still be a hack, but at least it would be taking things in > a good direction. Having different rules depending on whether targets > define a scheduler is just a horrible wart that no-one's ever had chance > to fix. If using the code above works well on AVR then that'd be a big > step towards making the code unconditional. > > It'd definitely be worth checking how it affects code quality though. > (Although the same goes for the current patch, since C_C_M_C is a pretty > big hammer.) > > Thanks, > Richard For now, here is the 2nd approach implemented: Add a dummy scheduler. At least the test case passes with this change. Johann gcc/ PR78883 * config/avr/avr-dfa.md: New file. * config/avr/avr.md (avr-dfa.md): Include it. * common/config/avr/avr-common.c (avr_option_optimization_table) [OPT_LEVELS_ALL]: Disable -fschedule-insns. [OPT_LEVELS_ALL]: Disable -fschedule-insns2. * config/avr/avr-protos.h (avr_hard_regno_call_part_clobbered): Change argument #2 to not use machine_mode. * config/avr/avr.c (avr_hard_regno_call_part_clobbered): Same. * config/avr/avr.h (avr_hard_regno_call_part_clobbered): New proto. gcc/testsuite/ PR78883 * gcc.c-torture/compile/pr78883.c: New test. Index: common/config/avr/avr-common.c =================================================================== --- common/config/avr/avr-common.c (revision 244001) +++ common/config/avr/avr-common.c (working copy) @@ -28,9 +28,18 @@ static const struct default_options avr_option_optimization_table[] = { { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 }, + // The only effect of -fcaller-saves might be that it triggers // a frame without need when it tries to be smart around calls. { OPT_LEVELS_ALL, OPT_fcaller_saves, NULL, 0 }, + + // Currently, the only purpose of the insn scheduler is to work + // around PR78883, i.e. we only need the side effect of defining + // INSN_SCHEDULING. Even with a dummy scheduler we will see reodering + // of instructions, which is not wanted if not actually needed. + { OPT_LEVELS_ALL, OPT_fschedule_insns, NULL, 0 }, + { OPT_LEVELS_ALL, OPT_fschedule_insns2, NULL, 0 }, + { OPT_LEVELS_NONE, 0, NULL, 0 } }; Index: config/avr/avr-dfa.md =================================================================== --- config/avr/avr-dfa.md (nonexistent) +++ config/avr/avr-dfa.md (working copy) @@ -0,0 +1,48 @@ +;; Copyright (C) 1998-2017 Free Software Foundation, Inc. +;; +;; This file is part of GCC. +;; +;; GCC is free software; you can redistribute it and/or modify +;; it under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 3, or (at your option) +;; any later version. +;; +;; GCC is distributed in the hope that it will be useful, +;; but WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +;; GNU General Public License for more details. +;; +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; . */ + + +;; FIXME: The only purpose of this dummy scheduler is to work around PR78883 +;; as a less intrusive alternative to implementing CANNOT_CHANGE_MODE_CLASS, +;; cf. https://gcc.gnu.org/ml/gcc-patches/2017-01/msg00078.html +;; +;; Desired side effect is defining INSN_SCHEDULING, which in turn enables the +;; following chunk of code in recog.c::general_operand, which in turn +;; defeats the paradoxical subregs of mem which caused PR78883. +;; +;; #ifdef INSN_SCHEDULING +;; /* On machines that have insn scheduling, we want all memory +;; reference to be explicit, so outlaw paradoxical SUBREGs. +;; However, we must allow them after reload so that they can +;; get cleaned up by cleanup_subreg_operands. */ +;; if (!reload_completed && MEM_P (sub) +;; && GET_MODE_SIZE (mode) > GET_MODE_SIZE (GET_MODE (sub))) +;; return 0; +;; #endif +;; +;; Notice that we disable -fschedule-insns and -fschedule-insns2 in +;; avr-common.c::TARGET_OPTION_OPTIMIZATION_TABLE because some "random" +;; reordering of instructions is not wanted. + +(define_automaton "avr_dfa") + +(define_cpu_unit "avr_cpu" "avr_dfa") + +(define_insn_reservation "avr_cpu_reserve" 1 + (const_int 1) + "avr_cpu") Index: config/avr/avr-protos.h =================================================================== --- config/avr/avr-protos.h (revision 244001) +++ config/avr/avr-protos.h (working copy) @@ -31,6 +31,7 @@ extern int avr_hard_regno_rename_ok (uns extern rtx avr_return_addr_rtx (int count, rtx tem); extern void avr_register_target_pragmas (void); extern void avr_init_expanders (void); +extern int avr_hard_regno_call_part_clobbered (unsigned, int); #ifdef TREE_CODE extern void avr_asm_output_aligned_decl_common (FILE*, tree, const char*, unsigned HOST_WIDE_INT, unsigned int, bool); @@ -46,7 +47,6 @@ extern void avr_init_cumulative_args (CU #endif /* TREE_CODE */ #ifdef RTX_CODE -extern int avr_hard_regno_call_part_clobbered (unsigned, machine_mode); extern const char *output_movqi (rtx_insn *insn, rtx operands[], int *l); extern const char *output_movhi (rtx_insn *insn, rtx operands[], int *l); extern const char *output_movsisf (rtx_insn *insn, rtx operands[], int *l); Index: config/avr/avr.c =================================================================== --- config/avr/avr.c (revision 244001) +++ config/avr/avr.c (working copy) @@ -11873,8 +11873,10 @@ avr_hard_regno_mode_ok (int regno, machi /* Implement `HARD_REGNO_CALL_PART_CLOBBERED'. */ int -avr_hard_regno_call_part_clobbered (unsigned regno, machine_mode mode) +avr_hard_regno_call_part_clobbered (unsigned regno, int imode) { + machine_mode mode = (machine_mode) imode; + /* FIXME: This hook gets called with MODE:REGNO combinations that don't represent valid hard registers like, e.g. HI:29. Returning TRUE for such registers can lead to performance degradation as mentioned Index: config/avr/avr.h =================================================================== --- config/avr/avr.h (revision 244001) +++ config/avr/avr.h (working copy) @@ -285,8 +285,11 @@ enum reg_class { #define REGNO_OK_FOR_INDEX_P(NUM) 0 +/* Prototype needed because sched-deps.c does not include tm_p.h. */ +extern int avr_hard_regno_call_part_clobbered (unsigned, int); + #define HARD_REGNO_CALL_PART_CLOBBERED(REGNO, MODE) \ - avr_hard_regno_call_part_clobbered (REGNO, MODE) + avr_hard_regno_call_part_clobbered (REGNO, (int) (MODE)) #define TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P hook_bool_mode_true Index: config/avr/avr.md =================================================================== --- config/avr/avr.md (revision 244001) +++ config/avr/avr.md (working copy) @@ -6927,3 +6927,6 @@ (define_insn_and_split "*extract.subreg. ;; Operations on 64-bit registers (include "avr-dimode.md") + +;; A dummy DFA to cure PR78883 +(include "avr-dfa.md") Index: testsuite/gcc.c-torture/compile/pr78883.c =================================================================== --- testsuite/gcc.c-torture/compile/pr78883.c (nonexistent) +++ testsuite/gcc.c-torture/compile/pr78883.c (working copy) @@ -0,0 +1,12 @@ +/* { dg-do compile } */ + +int foo (int *p) +{ + int i; + for (i = 0; i < 5; i++) + { + if (p[i] & 1) + return i; + } + return -1; +}