From patchwork Wed Nov 27 17:09:07 2013
X-Patchwork-Submitter: Tejas Belagod
X-Patchwork-Id: 294627
Message-ID: <52962733.7030005@arm.com>
Date: Wed, 27 Nov 2013 17:09:07 +0000
From: Tejas Belagod
To: Bill Schmidt, "gcc-patches@gcc.gnu.org", rdsandiford@googlemail.com
Subject: Re: [Patch, RTL] Eliminate redundant vec_select moves.
References: <527A4309.70209@arm.com> <8738n9sj8o.fsf@talisman.default>
 <527A5EF4.5090505@arm.com> <87y551r01p.fsf@talisman.default>
 <527A7612.2080406@arm.com> <877gcll9ht.fsf@talisman.default>
 <527BA073.30900@arm.com>
 <87zjpg1d5p.fsf@sandifor-thinkpad.stglab.manchester.uk.ibm.com>
 <527BD411.6060300@arm.com> <878uwwdnx0.fsf@talisman.default>
In-Reply-To: <878uwwdnx0.fsf@talisman.default>

Richard Sandiford wrote:
> Tejas Belagod writes:
>>> The problem is that one reg rtx can span several hard registers.
>>> E.g. (reg:V4SI 32) might represent one 64-bit register (no. 32),
>>> but it might instead represent two 32-bit registers (nos. 32 and 33).
>>> Obviously the latter's not very likely for vectors this small,
>>> but more likely for larger ones (including on NEON IIRC).
>>>
>>> So if we had 2 32-bit registers being treated as a V4HI, it would be:
>>>
>>>    <--32--><--33-->
>>>    msb          lsb
>>>    0000111122223333
>>>    VVVVVVVV
>>>    00001111
>>>    msb  lsb
>>>    <--32-->
>>>
>>> for big endian and:
>>>
>>>    <--33--><--32-->
>>>    msb          lsb
>>>    3333222211110000
>>>            VVVVVVVV
>>>            11110000
>>>            msb  lsb
>>>            <--32-->
>>>
>>> for little endian.
>> Ah, ok, that makes things clearer.
>> Thanks for that.
>>
>> I can't find any helper function that figures out if we're writing partial or
>> full result regs. Would something like
>>
>>    REGNO (src) == REGNO (dst) &&
>>    HARD_REGNO_NREGS (src) == HARD_REGNO_NREGS (dst) == 1
>>
>> be a sane check for partial result regs?
>
> Yeah, that should work.  I think a more general alternative would be:
>
>   simplify_subreg_regno (REGNO (src), GET_MODE (src),
>                          offset, GET_MODE (dst)) == (int) REGNO (dst)
>
> where:
>
>   offset = GET_MODE_UNIT_SIZE (GET_MODE (src)) * INTVAL (XVECEXP (sel, 0))
>
> That offset is the byte offset of the first selected element from the
> start of a vector in memory, which is also the way that SUBREG_BYTEs
> are counted.  For little-endian it gives the offset of the lsb of the
> slice, while for big-endian it gives the offset of the msb (which is
> also how SUBREG_BYTEs work).
>
> The simplify_subreg_regno should cope with both single-register vectors
> and multi-register vectors.

Sorry for the delayed response to this.

Thanks for the tip. Here's an improved patch that implements the
simplify_subreg_regno () method of eliminating redundant moves.

Regarding the test case, I failed to get the ppc back-end to generate the RTL
pattern that this patch checks for. I can easily write a test case for aarch64
(big and little endian) along these lines:

  typedef float float32x4_t __attribute__ ((__vector_size__ (16)));

  float foo_be (float32x4_t x)
  {
    return x[3];
  }

  float foo_le (float32x4_t x)
  {
    return x[0];
  }

where I know that the vector indexing will generate a vec_select on the same
src and dst regs that could be optimized away, and hence test it. But I'm
struggling to come up with a test case for which the ppc altivec back-end will
generate such a vec_select. I see that altivec does not define vec_extract, so
a simple indexing like this seems to happen via memory.
Also, I don't know enough about the ppc PCS or architecture to write a test
that will check for this optimization opportunity on the same src and dst
hard registers. Any hints?

This patch has been bootstrapped on x86_64 and regression-tested on
aarch64-none-elf and aarch64_be-none-elf.

Thanks for your patience,
Tejas.

diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index 0cd0c7e..ca25ce5 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -1180,6 +1180,22 @@ set_noop_p (const_rtx set)
       dst = SUBREG_REG (dst);
     }
 
+  /* It is a NOOP if destination overlaps with selected src vector
+     elements.  */
+  if (GET_CODE (src) == VEC_SELECT
+      && REG_P (XEXP (src, 0)) && REG_P (dst)
+      && HARD_REGISTER_P (XEXP (src, 0))
+      && HARD_REGISTER_P (dst))
+    {
+      rtx par = XEXP (src, 1);
+      rtx src0 = XEXP (src, 0);
+      HOST_WIDE_INT offset =
+        GET_MODE_UNIT_SIZE (GET_MODE (src0)) * INTVAL (XVECEXP (par, 0, 0));
+
+      return simplify_subreg_regno (REGNO (src0), GET_MODE (src0),
+                                    offset, GET_MODE (dst)) == (int)REGNO (dst);
+    }
+
   return (REG_P (src) && REG_P (dst)
           && REGNO (src) == REGNO (dst));
 }
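
For illustration only (not part of the patch): the check above can be modelled
outside GCC with a small stand-alone sketch. Everything named toy_* below is
hypothetical, and toy_subreg_regno () is a greatly simplified stand-in for
simplify_subreg_regno (). It assumes a register file of equally sized,
consecutively numbered hard registers and ignores all target-specific rules.
The byte offset is computed the same way as in the patch: element size times
the index of the first selected element.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for simplify_subreg_regno ().
   Returns the hard register holding DST_BYTES bytes at byte OFFSET
   (counted in memory order) of a value SRC_BYTES wide that starts in
   SRC_REGNO, or -1 if those bytes cannot be named as a register.  */
static int
toy_subreg_regno (int src_regno, int src_bytes, int offset,
                  int dst_bytes, int reg_bytes, bool big_endian)
{
  /* Assumption of this sketch: the source fills whole registers.  */
  assert (reg_bytes > 0 && src_bytes % reg_bytes == 0);

  if (offset < 0 || dst_bytes <= 0 || offset + dst_bytes > src_bytes)
    return -1;

  if (dst_bytes >= reg_bytes)
    {
      /* The slice must start on a register boundary and cover
         whole registers.  */
      if (offset % reg_bytes != 0 || dst_bytes % reg_bytes != 0)
        return -1;
      return src_regno + offset / reg_bytes;
    }

  /* A slice narrower than a register is only accepted when it is the
     low part of that register: byte 0 for little endian, the last
     DST_BYTES bytes for big endian.  */
  int lowpart = big_endian ? reg_bytes - dst_bytes : 0;
  if (offset % reg_bytes != lowpart)
    return -1;
  return src_regno + offset / reg_bytes;
}

/* Model of the new set_noop_p test: the move is a no-op when the
   selected elements already live in the destination register.  */
static bool
toy_vec_select_is_noop (int src_regno, int src_bytes, int unit_bytes,
                        int first_sel, int dst_regno, int dst_bytes,
                        int reg_bytes, bool big_endian)
{
  int offset = unit_bytes * first_sel;
  return toy_subreg_regno (src_regno, src_bytes, offset, dst_bytes,
                           reg_bytes, big_endian) == dst_regno;
}
```

With the V4HI example from the quoted diagram (two 32-bit registers 32 and 33),
selecting elements {0, 1} into register 32 is reported as a no-op, while
selecting elements {2, 3} into register 32 is not. For a 16-byte V4SF held in a
single register, element 0 is the no-op case in this model for little endian
and element 3 for big endian, matching foo_le and foo_be above.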