From patchwork Wed May  7 22:59:09 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wei Mi
X-Patchwork-Id: 346842
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Date: Wed, 7 May 2014 15:59:09 -0700
Subject: Re: [PATCH, PR58066] preferred_stack_boundary update for tls expanded call
From: Wei Mi
To: Uros Bizjak
Cc: GCC Patches, David Li, "H.J. Lu"

This is an updated version of pr58066-3.patch.

In pr58066-3.patch, calls were added to the templates of
tls_local_dynamic_base_32 and tls_global_dynamic_32 to prevent sched2
from moving the sp setting across the implicit tls calls. However,
those calls make it difficult for combine to merge UNSPEC_TLS_LD_BASE
and UNSPEC_DTPOFF, so the optimization in tls_local_dynamic_32_once,
which converts local_dynamic to global_dynamic mode for a single tls
reference, cannot take effect.

In the updated patch, I remove those calls from the insn templates and
add "reg:SI SP_REG" explicitly to the templates of UNSPEC_TLS_GD and
UNSPEC_TLS_LD_BASE. This solves both the sched2 and the combine
problems above, and the optimization in tls_local_dynamic_32_once now
works.

Bootstrapped OK on x86_64-linux-gnu. Regression testing is in progress.
Is it OK if the regression tests pass?

Thanks,
Wei.

ChangeLog:

gcc/
2014-05-07  Wei Mi

        * config/i386/i386.c (ix86_compute_frame_layout):
        preferred_stack_boundary updated for tls expanded call.
        * config/i386/i386.md: Set
        ix86_tls_descriptor_calls_expanded_in_cfun.

gcc/testsuite/
2014-05-07  Wei Mi

        * gcc.target/i386/pr58066.c: New test.

Index: testsuite/gcc.target/i386/pr58066.c
===================================================================
--- testsuite/gcc.target/i386/pr58066.c (revision 0)
+++ testsuite/gcc.target/i386/pr58066.c (revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-fPIC -O2" } */
+
+/* Check whether the stack frame starting addresses of tls expanded calls
+   in foo and goo are 16bytes aligned.  */
+static __thread char ccc1;
+void* foo()
+{
+  return &ccc1;
+}
+
+__thread char ccc2;
+void* goo()
+{
+  return &ccc2;
+}
+
+/* { dg-final { scan-assembler-times ".cfi_def_cfa_offset 16" 2 } } */
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c (revision 209979)
+++ config/i386/i386.c (working copy)
@@ -9485,20 +9485,30 @@ ix86_compute_frame_layout (struct ix86_f
   frame->nregs = ix86_nsaved_regs ();
   frame->nsseregs = ix86_nsaved_sseregs ();
 
-  stack_alignment_needed = crtl->stack_alignment_needed / BITS_PER_UNIT;
-  preferred_alignment = crtl->preferred_stack_boundary / BITS_PER_UNIT;
-
   /* 64-bit MS ABI seem to require stack alignment to be always 16 except for
      function prologues and leaf.  */
-  if ((TARGET_64BIT_MS_ABI && preferred_alignment < 16)
+  if ((TARGET_64BIT_MS_ABI && crtl->preferred_stack_boundary < 128)
       && (!crtl->is_leaf || cfun->calls_alloca != 0
           || ix86_current_function_calls_tls_descriptor))
     {
-      preferred_alignment = 16;
-      stack_alignment_needed = 16;
       crtl->preferred_stack_boundary = 128;
       crtl->stack_alignment_needed = 128;
     }
+  /* preferred_stack_boundary is never updated for call
+     expanded from tls descriptor. Update it here. We don't update it in
+     expand stage because according to the comments before
+     ix86_current_function_calls_tls_descriptor, tls calls may be optimized
+     away.  */
+  else if (ix86_current_function_calls_tls_descriptor
+           && crtl->preferred_stack_boundary < PREFERRED_STACK_BOUNDARY)
+    {
+      crtl->preferred_stack_boundary = PREFERRED_STACK_BOUNDARY;
+      if (crtl->stack_alignment_needed < PREFERRED_STACK_BOUNDARY)
+        crtl->stack_alignment_needed = PREFERRED_STACK_BOUNDARY;
+    }
+
+  stack_alignment_needed = crtl->stack_alignment_needed / BITS_PER_UNIT;
+  preferred_alignment = crtl->preferred_stack_boundary / BITS_PER_UNIT;
 
   gcc_assert (!size || stack_alignment_needed);
   gcc_assert (preferred_alignment >= STACK_BOUNDARY / BITS_PER_UNIT);
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md (revision 209979)
+++ config/i386/i386.md (working copy)
@@ -12530,7 +12530,8 @@
         (unspec:SI
          [(match_operand:SI 1 "register_operand" "b")
           (match_operand 2 "tls_symbolic_operand")
-          (match_operand 3 "constant_call_address_operand" "z")]
+          (match_operand 3 "constant_call_address_operand" "z")
+          (reg:SI SP_REG)]
          UNSPEC_TLS_GD))
    (clobber (match_scratch:SI 4 "=d"))
    (clobber (match_scratch:SI 5 "=c"))
@@ -12555,11 +12556,14 @@
     [(set (match_operand:SI 0 "register_operand")
           (unspec:SI [(match_operand:SI 2 "register_operand")
                       (match_operand 1 "tls_symbolic_operand")
-                      (match_operand 3 "constant_call_address_operand")]
+                      (match_operand 3 "constant_call_address_operand")
+                      (reg:SI SP_REG)]
                      UNSPEC_TLS_GD))
      (clobber (match_scratch:SI 4))
      (clobber (match_scratch:SI 5))
-     (clobber (reg:CC FLAGS_REG))])])
+     (clobber (reg:CC FLAGS_REG))])]
+  ""
+  "ix86_tls_descriptor_calls_expanded_in_cfun = true;")
 
 (define_insn "*tls_global_dynamic_64_<mode>"
   [(set (match_operand:P 0 "register_operand" "=a")
@@ -12614,13 +12618,15 @@
                  (const_int 0)))
      (unspec:P [(match_operand 1 "tls_symbolic_operand")]
                UNSPEC_TLS_GD)])]
-  "TARGET_64BIT")
+  "TARGET_64BIT"
+  "ix86_tls_descriptor_calls_expanded_in_cfun = true;")
 
 (define_insn "*tls_local_dynamic_base_32_gnu"
   [(set (match_operand:SI 0 "register_operand" "=a")
         (unspec:SI
          [(match_operand:SI 1 "register_operand" "b")
-          (match_operand 2 "constant_call_address_operand" "z")]
+          (match_operand 2 "constant_call_address_operand" "z")
+          (reg:SI SP_REG)]
          UNSPEC_TLS_LD_BASE))
    (clobber (match_scratch:SI 3 "=d"))
    (clobber (match_scratch:SI 4 "=c"))
@@ -12646,11 +12652,14 @@
      [(set (match_operand:SI 0 "register_operand")
            (unspec:SI
             [(match_operand:SI 1 "register_operand")
-             (match_operand 2 "constant_call_address_operand")]
+             (match_operand 2 "constant_call_address_operand")
+             (reg:SI SP_REG)]
             UNSPEC_TLS_LD_BASE))
       (clobber (match_scratch:SI 3))
       (clobber (match_scratch:SI 4))
-      (clobber (reg:CC FLAGS_REG))])])
+      (clobber (reg:CC FLAGS_REG))])]
+  ""
+  "ix86_tls_descriptor_calls_expanded_in_cfun = true;")
 
 (define_insn "*tls_local_dynamic_base_64_<mode>"
   [(set (match_operand:P 0 "register_operand" "=a")
@@ -12697,7 +12706,8 @@
             (mem:QI (match_operand 1))
             (const_int 0)))
      (unspec:P [(const_int 0)] UNSPEC_TLS_LD_BASE)])]
-  "TARGET_64BIT")
+  "TARGET_64BIT"
+  "ix86_tls_descriptor_calls_expanded_in_cfun = true;")
 
 ;; Local dynamic of a single variable is a lose.  Show combine how
 ;; to convert that back to global dynamic.
@@ -12706,7 +12716,8 @@
   [(set (match_operand:SI 0 "register_operand" "=a")
         (plus:SI
          (unspec:SI [(match_operand:SI 1 "register_operand" "b")
-                     (match_operand 2 "constant_call_address_operand" "z")]
+                     (match_operand 2 "constant_call_address_operand" "z")
+                     (reg:SI SP_REG)]
                     UNSPEC_TLS_LD_BASE)
          (const:SI (unspec:SI
                     [(match_operand 3 "tls_symbolic_operand")]
@@ -12719,7 +12730,8 @@
   ""
   [(parallel
      [(set (match_dup 0)
-           (unspec:SI [(match_dup 1) (match_dup 3) (match_dup 2)]
+           (unspec:SI [(match_dup 1) (match_dup 3) (match_dup 2)
+                       (reg:SI SP_REG)]
                       UNSPEC_TLS_GD))
       (clobber (match_dup 4))
       (clobber (match_dup 5))
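
For anyone following the i386.c hunk, here is a minimal standalone C
sketch of the boundary update it performs. It is only a model, not GCC
code: struct frame_model and model_preferred_alignment are hypothetical
names, the struct fields stand in for the crtl/cfun state used in the
patch, and the values chosen for BITS_PER_UNIT and
PREFERRED_STACK_BOUNDARY are assumptions (in GCC both are target
macros, and the latter depends on command-line options).

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for illustration; in GCC these are target macros.  */
enum { BITS_PER_UNIT = 8, PREFERRED_STACK_BOUNDARY = 128 };

/* Hypothetical stand-in for the state the hunk reads and writes.
   Boundaries are in bits, as in crtl.  */
struct frame_model
{
  unsigned preferred_stack_boundary;
  unsigned stack_alignment_needed;
  bool is_leaf;
  bool calls_alloca;
  bool calls_tls_descriptor;
  bool ms_abi_64;
};

/* Follows the order of operations in the patched hunk: adjust the
   boundaries in bits first, then derive the byte alignment from them.
   Returns the preferred alignment in bytes.  */
unsigned
model_preferred_alignment (struct frame_model *f)
{
  if (f->ms_abi_64 && f->preferred_stack_boundary < 128
      && (!f->is_leaf || f->calls_alloca || f->calls_tls_descriptor))
    {
      f->preferred_stack_boundary = 128;
      f->stack_alignment_needed = 128;
    }
  else if (f->calls_tls_descriptor
           && f->preferred_stack_boundary < PREFERRED_STACK_BOUNDARY)
    {
      /* The case the patch adds: a tls descriptor call raises the
         preferred boundary even in an otherwise leaf function.  */
      f->preferred_stack_boundary = PREFERRED_STACK_BOUNDARY;
      if (f->stack_alignment_needed < PREFERRED_STACK_BOUNDARY)
        f->stack_alignment_needed = PREFERRED_STACK_BOUNDARY;
    }

  assert (f->preferred_stack_boundary % BITS_PER_UNIT == 0);
  return f->preferred_stack_boundary / BITS_PER_UNIT;
}
```

In this model, a leaf function with a tls descriptor call that starts
with both boundaries at 32 bits ends up with a 16-byte preferred
alignment, while the same function without the tls call keeps 4 bytes.
Computing the byte values after the update is what lets the new
"else if" branch work on the crtl fields alone, without also patching
the local variables.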