From patchwork Wed Jan 18 15:03:47 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ramana Radhakrishnan X-Patchwork-Id: 136638 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id D2811B6EF1 for ; Thu, 19 Jan 2012 02:04:08 +1100 (EST) Comment: DKIM? See http://www.dkim.org DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=gcc.gnu.org; s=default; x=1327503849; h=Comment: DomainKey-Signature:Received:Received:Received:Received: MIME-Version:Received:Received:Date:Message-ID:Subject:From:To: Cc:Content-Type:Mailing-List:Precedence:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:Sender:Delivered-To; bh=E7yEQeS irDqSjRPTd90I5ov+lrE=; b=CvnwVnh9DJ7OggD09d3qFQnP41I+pVN0L4hha88 1CNRqRkwtmdvCvZIXayRr8VPXzf+qpioGNlzKhuN71VTWCW7SMUhjsHtm71r+zKF 9h0WfMbLO0l8We92wLDm9cTl3SV+p57GtXwjntM7Dp2I58kzIjrHNMc2uH2xF45L 1wzg= Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=default; d=gcc.gnu.org; h=Received:Received:X-SWARE-Spam-Status:X-Spam-Check-By:Received:Received:MIME-Version:Received:Received:Date:Message-ID:Subject:From:To:Cc:Content-Type:X-IsSubscribed:Mailing-List:Precedence:List-Id:List-Unsubscribe:List-Archive:List-Post:List-Help:Sender:Delivered-To; b=kRyggll4G3BInOj2LF6VsK8hPXr2MtarD7KG3nwX95tdKXLOvL1BN2M6V2zfbd HV5Y15aFzOYOG9bG6Bm/FqLudbd2EsO6gaWe3QQ/4wHBRD02OUkZHgMn8akZyGw0 vPTC8uKeKmWlj5GL3pGb4LSuVrEMo4azXICswwwbPVPkM=; Received: (qmail 5177 invoked by alias); 18 Jan 2012 15:04:04 -0000 Received: (qmail 5163 invoked by uid 22791); 18 Jan 2012 15:04:02 -0000 X-SWARE-Spam-Status: No, hits=-1.7 required=5.0 tests=AWL, BAYES_00, KAM_STOCKGEN, RCVD_IN_DNSWL_LOW, TW_OV X-Spam-Check-By: sourceware.org Received: from mail-gx0-f175.google.com (HELO mail-gx0-f175.google.com) (209.85.161.175) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Wed, 18 Jan 2012 15:03:48 +0000 Received: by ggno1 with SMTP id o1so4059381ggn.20 for ; Wed, 18 Jan 2012 07:03:47 -0800 (PST) MIME-Version: 1.0 Received: by 10.182.226.6 with SMTP id ro6mr19487624obc.3.1326899027665; Wed, 18 Jan 2012 07:03:47 -0800 (PST) Received: by 10.182.88.7 with HTTP; Wed, 18 Jan 2012 07:03:47 -0800 (PST) Date: Wed, 18 Jan 2012 15:03:47 +0000 Message-ID: Subject: [Patch ARM] Fix PR50313 and handle PIC addresses properly. From: Ramana Radhakrishnan To: gcc-patches Cc: Patch Tracking X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Hi , PR50313 is a case where having the 2 patterns "pic_load_addr_*" and "pic_add_dot_plus_eight/four" from expand time becomes a problem as discussed by Jakub in his comment in the audit trail for PR48308. There is a separate problem in combine as explained by my comment in the audit trail for PR48308 and my RFC patch for the same sometime last week. What I've done in this case is to take the existing patterns merged them into a UNIFIED form which means that we can hoist the entire computation out with any of the loop optimizers rather than piece-meal hoping to have each one being hoisted correctly. I've had to adjust the cost of this to be equivalent to that of a memory load ( it's in fact the cost of a memory load + the cost of an add instruction but it's good enough for the moment). It's reasonably close enough for arm_size_rtx_costs, and thumb1_rtx_costs pushes this high enough that it will warrant a hoist anyway. I've also been conservative in adjusting cannot_copy_insn_p to disallow copying such insns to prevent any cases where you might have duplicate labels being generated. Once this happens the original bug report in PR50313 appears to be fixed and I've had this survive a bootstrap and regression test on an A9 board and this has successfully cross-built a version of eglibc with an ARM v7-a configuration (which is currently undergoing testing). There are a few ways by which we can get better code - one is to replace the load from the constant pool with a movw / movt pair of the actual differences in which case we'd be able to get performant code in libraries for this case, the other which as RichardE suggested in a conversation was to allow the splitter to actually generate the internal labels in the pic_load_addr and pic_add_dot_plus_eight which could then mean that the label number is really not reserved until this complex pattern has been split. I don't think these enhancements are worth doing so late in stage4 and I do want to backport this patch as it stands into the 4.6 branch. OK (and to backport to 4.6 branch ?) if no regressions ? cheers Ramana 2012-01-18 Ramana Radhakrishnan PR target/50313 * config/arm/arm.c (arm_load_pic_register): Use gen_pic_load_addr_unified. Delete calls to gen_pic_load_addr_32bit , gen_pic_add_dot_plus_eight and gen_pic_add_dot_plus_four. (arm_pic_static_addr): Likewise. (arm_rtx_costs_1): Adjust cost for UNSPEC_PIC_UNIFIED. (arm_note_pic_base): Handle UNSPEC_PIC_UNIFIED. * config/arm/arm.md (UNSPEC_PIC_UNIFIED): Define. (pic_load_addr_unified): New. diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 4c310d4..240434f 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -5578,11 +5578,7 @@ arm_load_pic_register (unsigned long saved_regs ATTRIBUTE_UNUSED) if (TARGET_32BIT) { - emit_insn (gen_pic_load_addr_32bit (pic_reg, pic_rtx)); - if (TARGET_ARM) - emit_insn (gen_pic_add_dot_plus_eight (pic_reg, pic_reg, labelno)); - else - emit_insn (gen_pic_add_dot_plus_four (pic_reg, pic_reg, labelno)); + emit_insn (gen_pic_load_addr_unified (pic_reg, pic_rtx, labelno)); } else /* TARGET_THUMB1 */ { @@ -5595,10 +5591,10 @@ arm_load_pic_register (unsigned long saved_regs ATTRIBUTE_UNUSED) thumb_find_work_register (saved_regs)); emit_insn (gen_pic_load_addr_thumb1 (pic_tmp, pic_rtx)); emit_insn (gen_movsi (pic_offset_table_rtx, pic_tmp)); + emit_insn (gen_pic_add_dot_plus_four (pic_reg, pic_reg, labelno)); } else - emit_insn (gen_pic_load_addr_thumb1 (pic_reg, pic_rtx)); - emit_insn (gen_pic_add_dot_plus_four (pic_reg, pic_reg, labelno)); + emit_insn (gen_pic_load_addr_unified (pic_reg, pic_rtx, labelno)); } } @@ -5628,20 +5624,7 @@ arm_pic_static_addr (rtx orig, rtx reg) UNSPEC_SYMBOL_OFFSET); offset_rtx = gen_rtx_CONST (Pmode, offset_rtx); - if (TARGET_32BIT) - { - emit_insn (gen_pic_load_addr_32bit (reg, offset_rtx)); - if (TARGET_ARM) - insn = emit_insn (gen_pic_add_dot_plus_eight (reg, reg, labelno)); - else - insn = emit_insn (gen_pic_add_dot_plus_four (reg, reg, labelno)); - } - else /* TARGET_THUMB1 */ - { - emit_insn (gen_pic_load_addr_thumb1 (reg, offset_rtx)); - insn = emit_insn (gen_pic_add_dot_plus_four (reg, reg, labelno)); - } - + insn = emit_insn (gen_pic_load_addr_unified (reg, offset_rtx, labelno)); return insn; } @@ -7648,6 +7631,15 @@ arm_rtx_costs_1 (rtx x, enum rtx_code outer, int* total, bool speed) case SET: return false; + + case UNSPEC: + /* We cost this as high as our memory costs to allow this to + be hoisted from loops. */ + if (XINT (x, 1) == UNSPEC_PIC_UNIFIED) + { + *total = COSTS_N_INSNS (2 + ARM_NUM_REGS (mode)); + } + return true; default: *total = COSTS_N_INSNS (4); @@ -10008,7 +10000,8 @@ static int arm_note_pic_base (rtx *x, void *date ATTRIBUTE_UNUSED) { if (GET_CODE (*x) == UNSPEC - && XINT (*x, 1) == UNSPEC_PIC_BASE) + && (XINT (*x, 1) == UNSPEC_PIC_BASE + || XINT (*x, 1) == UNSPEC_PIC_UNIFIED)) return 1; return 0; } diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 5620d7d..8842846 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -116,6 +116,7 @@ ; unaligned locations, on architectures which support ; that. UNSPEC_UNALIGNED_STORE ; Same for str/strh. + UNSPEC_PIC_UNIFIED ; Create a common pic addressing form. ]) ;; UNSPEC_VOLATILE Usage: @@ -5601,7 +5602,7 @@ "flag_pic" ) -;; Split calculate_pic_address into pic_load_addr_* and a move. +;; Split calculate_pic_address into pic_load_addr_32bit/thumb1 and a move. (define_split [(set (match_operand:SI 0 "register_operand" "") (mem:SI (plus:SI (match_operand:SI 1 "register_operand" "") @@ -5613,6 +5614,29 @@ "operands[3] = can_create_pseudo_p () ? gen_reg_rtx (SImode) : operands[0];" ) +;; operand1 is the memory address to go into pic_load_addr_32bit. +;; operand2 is the PIC label to be emitted from pic_add_dot_plus_*. +;; We do this to allow hoisting of the entire c +(define_insn_and_split "pic_load_addr_unified" + [(set (match_operand:SI 0 "s_register_operand" "=r,r,l") + (unspec:SI [(match_operand:SI 1 "" "mX,mX,mX") + (match_operand:SI 2 "" "")] + UNSPEC_PIC_UNIFIED))] + "flag_pic" + "#" + "flag_pic" + [(set (match_dup 3) (unspec:SI [(match_dup 1)] UNSPEC_PIC_SYM)) + (set (match_dup 0) (unspec:SI [(match_dup 3) (match_dup 4) + (match_dup 2)] UNSPEC_PIC_BASE))] + "operands[3] = can_create_pseudo_p () ? gen_reg_rtx (SImode) : operands[0]; + operands[4] = TARGET_THUMB ? GEN_INT (4) : GEN_INT (8);" + [(set_attr "type" "load1,load1,load1") + (set_attr "pool_range" "4096,4096,1024") + (set_attr "neg_pool_range" "0,4084,0") + (set_attr "arch" "a,t2,t1") + (set_attr "length" "8,6,4")] +) + ;; The rather odd constraints on the following are to force reload to leave ;; the insn alone, and to force the minipool generation pass to then move ;; the GOT symbol to memory.