@@ -7073,7 +7073,7 @@ setup_incoming_varargs_64 (CUMULATIVE_ARGS *cum)
 
   /* FPR size of varargs save area.  We don't need it if we don't pass
      anything in SSE registers.  */
-  if (cum->sse_nregs && cfun->va_list_fpr_size)
+  if (TARGET_SSE && cfun->va_list_fpr_size)
     ix86_varargs_fpr_size = X86_64_SSE_REGPARM_MAX * 16;
   else
     ix86_varargs_fpr_size = 0;
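This first hunk sizes the FPR save area from TARGET_SSE rather than from the count of SSE argument registers the named parameters left unconsumed, so the area's size now depends only on the target and on cfun->va_list_fpr_size. A minimal sketch of the resulting rule, with hypothetical names standing in for the GCC internals:

/* Hypothetical stand-ins for the GCC internals; under the SysV
   x86-64 ABI the area is 8 * 16 = 128 bytes whenever SSE exists.  */
enum { SSE_REGPARM_MAX = 8, XMM_SLOT_BYTES = 16 };

static int
fpr_save_area_size (int target_has_sse, int va_list_fpr_size)
{
  if (target_has_sse && va_list_fpr_size)
    return SSE_REGPARM_MAX * XMM_SLOT_BYTES;  /* 128 bytes */
  return 0;
}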
@@ -7112,12 +7112,13 @@ setup_incoming_varargs_64 (CUMULATIVE_ARGS *cum)
       emit_jump_insn (gen_cbranchqi4 (test, XEXP (test, 0), XEXP (test, 1),
                                       label));
 
-      /* If we've determined that we're only loading scalars (and not
-         vector data) then we can store doubles instead.  */
-      if (crtl->stack_alignment_needed < 128)
-        smode = DFmode;
-      else
-        smode = V4SFmode;
+      /* ??? If !TARGET_SSE_TYPELESS_STORES, would we perform better if
+         we used movdqa (i.e. TImode) instead?  Perhaps even better would
+         be if we could determine the real mode of the data, via a hook
+         into pass_stdarg.  Ignore all that for now.  */
+      smode = V4SFmode;
+      if (crtl->stack_alignment_needed < GET_MODE_ALIGNMENT (smode))
+        crtl->stack_alignment_needed = GET_MODE_ALIGNMENT (smode);
 
       max = cum->sse_regno + cfun->va_list_fpr_size / 16;
       if (max > X86_64_SSE_REGPARM_MAX)
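Instead of degrading to DFmode stores when the incoming stack is under-aligned, this hunk keeps the 16-byte V4SFmode stores and raises crtl->stack_alignment_needed to the mode's alignment (128 bits). A worked example of the max computation and its clamp, under assumed values:

/* Assumed values: two named SSE arguments already assigned
   (cum->sse_regno == 2) and the full 128-byte cfun->va_list_fpr_size:

     max = 2 + 128 / 16 = 10   ->   clamped to X86_64_SSE_REGPARM_MAX (8)

   leaving XMM2..XMM7 to be dumped into the save area.  */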
@@ -7549,8 +7550,7 @@ ix86_gimplify_va_arg (tree valist, tree type, gimple_seq *pre_p,
     arg_boundary = MAX_SUPPORTED_STACK_ALIGNMENT;
 
   /* Care for on-stack alignment if needed.  */
-  if (arg_boundary <= 64
-      || integer_zerop (TYPE_SIZE (type)))
+  if (arg_boundary <= 64 || size == 0)
     t = ovf;
   else
     {
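For reference, the else-branch here is taken for arguments whose ABI boundary exceeds 64 bits; a hypothetical GNU C example that exercises it (not from the patch):

/* A 16-byte vector has an argument boundary of 128 bits, so the
   on-stack fallback path built in the next hunk must round the
   overflow pointer up before reading.  */
#include <stdarg.h>

typedef float v4sf __attribute__ ((vector_size (16)));

v4sf
first_vec (int n, ...)
{
  va_list ap;
  v4sf v;

  va_start (ap, n);
  v = va_arg (ap, v4sf);  /* may read from the rounded overflow area */
  va_end (ap);
  return v;
}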
@@ -7561,9 +7561,8 @@ ix86_gimplify_va_arg (tree valist, tree type, gimple_seq *pre_p,
       t = build2 (BIT_AND_EXPR, TREE_TYPE (t), t,
                   size_int (-align));
       t = fold_convert (TREE_TYPE (ovf), t);
-      if (crtl->stack_alignment_needed < arg_boundary)
-        crtl->stack_alignment_needed = arg_boundary;
     }
+
   gimplify_expr (&t, pre_p, NULL, is_gimple_val, fb_rvalue);
   gimplify_assign (addr, t, pre_p);
 
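The gimple assembled above (the pointer bump built just before this hunk plus the BIT_AND_EXPR here) is the usual round-up-to-boundary idiom, and the crtl->stack_alignment_needed bump is removed presumably because the overflow area lives in the caller's frame: rounding the pointer at runtime suffices, and the callee-side bump now happens once in setup_incoming_varargs_64. In plain C the rounding amounts to this hypothetical helper (power-of-two alignments only):

#include <stdint.h>

/* Round p up to the next multiple of align (a power of two);
   mirrors the pointer-bump-then-mask sequence built above.  */
static inline char *
align_up (char *p, uintptr_t align)
{
  return (char *) (((uintptr_t) p + align - 1) & -align);
}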