From patchwork Tue Mar 31 12:10:54 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alan Lawrence X-Patchwork-Id: 456596 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4010914007D for ; Tue, 31 Mar 2015 23:11:15 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass reason="1024-bit key; unprotected key" header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=kpSlveYT; dkim-adsp=none (unprotected policy); dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:references :in-reply-to:content-type:content-transfer-encoding; q=dns; s= default; b=MxZZhm6vIpz4NQRib/pMwHKouLoMdlsWH/CMcvuFbtItJtyIwQSnW aE8Z+heve1DVHATOYnpfsAILKwA9w3tFrnABTUe5G0n55jP0CZxQhc5bfXdMjaFR Pa0punb8MREM3KOrs/lqzuspDu9qF0e7TkGtXyQvSRJHAC2CuNpmD0= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:references :in-reply-to:content-type:content-transfer-encoding; s=default; bh=auV9mdfEQy1EvY0FxmLEQTBoXyA=; b=kpSlveYTC60QSoPr6VBa7BV7OF1a AjdvJcAAUsZNqPwywX2C2yzOD5L8wWSsCqWIzPLMl3Mlmrk0pLslhzC+IV35l+52 VRlBOuR9LyefzwKn7MYx4kai6Vs2h6B7UUt3GUaGMjoFModDbiNwRy+QQqFlDNbN mqO77MG8R7XPZVk= Received: (qmail 128834 invoked by alias); 31 Mar 2015 12:11:06 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 128806 invoked by uid 89); 31 Mar 2015 12:11:05 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=AWL, BAYES_00, SPF_PASS autolearn=ham version=3.3.2 X-HELO: eu-smtp-delivery-143.mimecast.com Received: from eu-smtp-delivery-143.mimecast.com (HELO eu-smtp-delivery-143.mimecast.com) (207.82.80.143) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 31 Mar 2015 12:11:03 +0000 Received: from cam-owa1.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.140]) by uk-mta-28.uk.mimecast.lan; Tue, 31 Mar 2015 13:10:55 +0100 Received: from [10.2.207.65] ([10.1.2.79]) by cam-owa1.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959); Tue, 31 Mar 2015 13:10:54 +0100 Message-ID: <551A8ECE.6040802@arm.com> Date: Tue, 31 Mar 2015 13:10:54 +0100 From: Alan Lawrence User-Agent: Thunderbird 2.0.0.24 (X11/20101213) MIME-Version: 1.0 To: Richard Earnshaw CC: Richard Biener , Richard Biener , "gcc-patches@gcc.gnu.org" , Marcus Shawcroft Subject: Re: New regression on ARM Linux References: <55193E77.3040401@arm.com> <55193EFB.6030009@arm.com> <55197DAE.1050004@arm.com> <16A8C5AA-994C-4FA3-BC25-8547166DC13C@suse.de> <551A6C46.7010305@foss.arm.com> <551A729F.3090200@foss.arm.com> <551A7981.30805@foss.arm.com> <551A7C29.2080105@foss.arm.com> In-Reply-To: <551A7C29.2080105@foss.arm.com> X-MC-Unique: ITJdx9S5Soapr4AhaJIyJg-1 X-IsSubscribed: yes Richard Earnshaw wrote: > On 31/03/15 11:45, Richard Biener wrote: >> On Tue, 31 Mar 2015, Richard Earnshaw wrote: >> >>> On 31/03/15 11:36, Richard Biener wrote: >>>> On Tue, 31 Mar 2015, Richard Earnshaw wrote: >>>> >>>>> On 31/03/15 11:00, Richard Biener wrote: >>>>>> On Tue, 31 Mar 2015, Richard Earnshaw wrote: >>>>>> >>>>>>> On 31/03/15 08:50, Richard Biener wrote: >>>>>>>> On Mon, Mar 30, 2015 at 10:13 PM, Richard Biener wrote: >>>>>>>>> On March 30, 2015 6:45:34 PM GMT+02:00, Alan Lawrence wrote: >>>>>>>>>> -O2 was what I first used; it also occurs at -O1. -fno-tree-sra fixes >>>>>>>>>> it. >>>>>>>>>> >>>>>>>>>> The problem appears to be in laying out arguments, specifically >>>>>>>>>> varargs. From >>>>>>>>>> the "good" -fdump-rtl-expand: >>>>>>>>>> >>>>>>>>>> (insn 18 17 19 2 (set (mem:SI (reg/f:SI 107 virtual-outgoing-args) [0 >>>>>>>>>> S4 A32]) >>>>>>>>>> (reg:SI 111 [ b1$16 ])) reduced.c:14 -1 >>>>>>>>>> (nil)) >>>>>>>>>> (insn 19 18 20 2 (set (reg:DF 2 r2) >>>>>>>>>> (reg:DF 112 [ b1$8 ])) reduced.c:14 -1 >>>>>>>>>> (nil)) >>>>>>>>>> (insn 20 19 21 2 (set (reg:SI 1 r1) >>>>>>>>>> (reg:SI 113 [ b1 ])) reduced.c:14 -1 >>>>>>>>>> (nil)) >>>>>>>>>> (insn 21 20 22 2 (set (reg:SI 0 r0) >>>>>>>>>> (reg:SI 118)) reduced.c:14 -1 >>>>>>>>>> (nil)) >>>>>>>>>> (call_insn 22 21 23 2 (parallel [ >>>>>>>>>> (set (reg:SI 0 r0) >>>>>>>>>> (call (mem:SI (symbol_ref:SI ("printf") [flags 0x41] >>>>>>>>>> ) [0 __builtin_printf S4 >>>>>>>>>> A32]) >>>>>>>>>> >>>>>>>>>> The struct members are >>>>>>>>>> reg:SI 113 => int a; >>>>>>>>>> reg:DF 112 => double b; >>>>>>>>>> reg:SI 111 => int c; >>>>>>>>>> >>>>>>>>>> r0 gets the format string; r1 gets int a; r2+r3 get double b; int c is >>>>>>>>>> pushed >>>>>>>>>> into virtual-outgoing-args. In contrast, post-change to >>>>>>>>>> build_ref_of_offset, we get: >>>>>>>>>> >>>>>>>>>> (insn 17 16 18 2 (set (reg:SI 118) >>>>>>>>>> (symbol_ref/v/f:SI ("*.LC1") [flags 0x82] >>>>>>>>> *.LC1>)) reduced.c:14 -1 >>>>>>>>>> (nil)) >>>>>>>>>> (insn 18 17 19 2 (set (mem:SI (plus:SI (reg/f:SI 107 >>>>>>>>>> virtual-outgoing-args) >>>>>>>>>> (const_int 8 [0x8])) [0 S4 A64]) >>>>>>>>>> (reg:SI 111 [ b1$16 ])) reduced.c:14 -1 >>>>>>>>>> (nil)) >>>>>>>>>> (insn 19 18 20 2 (set (mem:DF (reg/f:SI 107 virtual-outgoing-args) [0 >>>>>>>>>> S8 A64]) >>>>>>>>>> (reg:DF 112 [ b1$8 ])) reduced.c:14 -1 >>>>>>>>>> (nil)) >>>>>>>>>> (insn 20 19 21 2 (set (reg:SI 2 r2) >>>>>>>>>> (reg:SI 113 [ b1 ])) reduced.c:14 -1 >>>>>>>>>> (nil)) >>>>>>>>>> (insn 21 20 22 2 (set (reg:SI 0 r0) >>>>>>>>>> (reg:SI 118)) reduced.c:14 -1 >>>>>>>>>> (nil)) >>>>>>>>>> (call_insn 22 21 23 2 (parallel [ >>>>>>>>>> (set (reg:SI 0 r0) >>>>>>>>>> (call (mem:SI (symbol_ref:SI ("printf") [flags 0x41] >>>>>>>>>> ) [0 __builtin_printf S4 >>>>>>>>>> A32]) >>>>>>>>>> >>>>>>>>>> r0 still gets the format string, but 'int b1.a' now goes in r2, and the >>>>>>>>>> >>>>>>>>>> double+following int are all pushed into virtual-outgoing-args. This is >>>>>>>>>> because >>>>>>>>>> arm_function_arg is fed a 64-bit-aligned int as type of the second >>>>>>>>>> argument (the >>>>>>>>>> type constructed by build_ref_for_offset); it then executes >>>>>>>>>> (aapcs_layout_arg, >>>>>>>>>> arm.c line ~~5914) >>>>>>>>>> >>>>>>>>>> /* C3 - For double-word aligned arguments, round the NCRN up to the >>>>>>>>>> next even number. */ >>>>>>>>>> ncrn = pcum->aapcs_ncrn; >>>>>>>>>> if ((ncrn & 1) && arm_needs_doubleword_align (mode, type)) >>>>>>>>>> ncrn++; >>>>>>>>>> >>>>>>>>>> Which changes r1 to r2. Passing -fno-tree-sra, or removing from the >>>>>>>>>> testcase >>>>>>>>>> "*(cls_struct_16byte *)resp = b1", causes arm_function_arg to be fed a >>>>>>>>>> 32-bit-aligned int instead, which works as previously. >>>>>>>>>> >>>>>>>>>> Passing the same members of that struct in a non-vargs call, works ok - >>>>>>>>>> I think >>>>>>>>>> because these use the type of the declared parameters, rather than the >>>>>>>>>> provided >>>>>>>>>> arguments, and the former do not have the increased alignment from >>>>>>>>>> build_ref_for_offset. >>>>>>>>> It doesn't make sense to use the alignment of passed values. That looks like bs. >>>>>>>>> >>>>>>>>> This means that >>>>>>>>> >>>>>>>>> Int I __aligned__(8); >>>>>>>>> >>>>>>>>> Is passed differently than int. >>>>>>>>> >>>>>>>>> Arm_function_arg needs to be fixed. >>>>>>>> That is, >>>>>>>> >>>>>>>> typedef int myint __attribute__((aligned(8))); >>>>>>>> >>>>>>>> int main() >>>>>>>> { >>>>>>>> myint i = 1; >>>>>>>> int j = 2; >>>>>>>> __builtin_printf("%d %d\n", i, j); >>>>>>>> } >>>>>>>> >>>>>>>> or >>>>>>>> >>>>>>>> myint i; >>>>>>>> int j; >>>>>>>> myint *p = &i; >>>>>>>> int *q = &j; >>>>>>>> >>>>>>>> int main() >>>>>>>> { >>>>>>>> __builtin_printf("%d %d", *p, *q); >>>>>>>> } >>>>>>>> >>>>>>>> should behave the same. There isn't a printf modifier for an "aligned int" >>>>>>>> because that sort of thing doesn't make sense. Special-casing aligned vs. >>>>>>>> non-aligned values only makes sense for things passed by value on the stack. >>>>>>>> And then obviously only dependent on the functuion type signature, not >>>>>>>> on the type of the passed value. >>>>>>>> >>>>>>> I think the testcase is ill-formed. Just because printf doesn't have >>>>>>> such a modifier, doesn't mean that another variadic function might not >>>>>>> have a means to detect when an object in the variadic list needs to be >>>>>>> over-aligned. As such, the test should really be written as: >>>>>> A value doesn't have "alignment". A function may have alignment >>>>>> requirements on its arguments, clearly printf doesn't. >>>>>> >>>>> Values don't. But types do and variadic functions are special in that >>>>> they derive their types from the types of the actual parameters passed >>>>> not from the formals in the prototype. Any manipulation of the types >>>>> should be done in the front end, not in the back end. >>>> The following seems to help the testcase (by luck I'd say?). It >>>> makes us drop alignment information from the temporaries that >>>> SRA creates as memory replacement. >>>> >>>> But I find it odd that on ARM passing *((aligned_int *)p) as >>>> vararg (only as varargs?) changes calling conventions independent >>>> of the functions type signature. >>>> >>>> Richard. >>>> >>>> Index: gcc/tree-sra.c >>>> =================================================================== >>>> --- gcc/tree-sra.c (revision 221770) >>>> +++ gcc/tree-sra.c (working copy) >>>> @@ -2012,7 +2012,9 @@ create_access_replacement (struct access >>>> DECL_CONTEXT (repl) = current_function_decl; >>>> } >>>> else >>>> - repl = create_tmp_var (access->type, "SR"); >>>> + repl = create_tmp_var (build_qualified_type >>>> + (TYPE_MAIN_VARIANT (access->type), >>>> + TYPE_QUALS (access->type)), "SR"); >>>> if (TREE_CODE (access->type) == COMPLEX_TYPE >>>> || TREE_CODE (access->type) == VECTOR_TYPE) >>>> { >>>> >>> I guess we could do something similar (use TYPE_MAIN_VARIANT) in the >>> backend code - but then we'd always ignore any over-alignment >>> requirements (or under, for that matter). >> The question is whether those are even in the scope of AACPS... >> > > Technically, they aren't. The argument passing rules are all based on > the alignment rules for fundamental types (and derived rules for > composite types based on those fundamental types) written in the AAPCS > itself. There's no provision for over-aligning a data type, so all of > this is off-piste. > > R. FWIW, I've just tested arm-none-linux-gnueabi with: the right way to do it however is another question! --Alan diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 48342d0..37662f4 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -6020,7 +6020,7 @@ static bool arm_needs_doubleword_align (machine_mode mode, const_tree type) { return (GET_MODE_ALIGNMENT (mode) > PARM_BOUNDARY - || (type && TYPE_ALIGN (type) > PARM_BOUNDARY)); + || (type && TYPE_ALIGN (TYPE_MAIN_VARIANT (type)) > PARM_BOUNDARY)); } and found no regressions (and yes this fixes the original case). Whether this is