From patchwork Thu Sep 20 15:13:44 2018
X-Patchwork-Submitter: Steve Ellcey
X-Patchwork-Id: 972493
From: Steve Ellcey
To: gcc-patches
Subject: [Patch 1/3][Aarch64] Implement Aarch64 SIMD ABI
Date: Thu, 20 Sep 2018 15:13:44 +0000
Message-ID: <1537456422.24844.12.camel@cavium.com>

Here is a new version of my patch to support the Aarch64 SIMD ABI in
GCC.  There is no functional change; I just removed the definition of
V23_REGNUM from aarch64.md.  It is no longer needed because another
patch that has since been checked in added it.  I am following up this
patch with two more: one to add the TARGET_SIMD_CLONE_* macros and
functions, and one to modify the code that checks register usage by
functions so that we can differentiate between regular functions and
SIMD functions on Aarch64.

This first patch has been tested with no regressions and should be
ready to check in if approved.  The other two are not fully tested but
are being submitted to get feedback.

Steve Ellcey
sellcey@cavium.com


2018-09-20  Steve Ellcey  <sellcey@cavium.com>

	* config/aarch64/aarch64-protos.h (aarch64_use_simple_return_insn_p):
	New prototype.
	(aarch64_epilogue_uses): Ditto.
	* config/aarch64/aarch64.c (aarch64_attribute_table): New array.
	(aarch64_simd_decl_p): New function.
	(aarch64_reg_save_mode): New function.
	(aarch64_is_simd_call_p): New function.
	(aarch64_function_ok_for_sibcall): Check for simd calls.
	(aarch64_layout_frame): Check for simd function.
	(aarch64_gen_storewb_pair): Handle E_TFmode.
	(aarch64_push_regs): Use aarch64_reg_save_mode to get mode.
	(aarch64_gen_loadwb_pair): Handle E_TFmode.
	(aarch64_pop_regs): Use aarch64_reg_save_mode to get mode.
	(aarch64_gen_store_pair): Handle E_TFmode.
	(aarch64_gen_load_pair): Ditto.
	(aarch64_save_callee_saves): Handle different mode sizes.
	(aarch64_restore_callee_saves): Ditto.
	(aarch64_components_for_bb): Check for simd function.
	(aarch64_epilogue_uses): New function.
	(aarch64_process_components): Check for simd function.
	(aarch64_expand_prologue): Ditto.
	(aarch64_expand_epilogue): Ditto.
	(aarch64_expand_call): Ditto.
	(TARGET_ATTRIBUTE_TABLE): New define.
	* config/aarch64/aarch64.h (EPILOGUE_USES): Redefine.
	(FP_SIMD_SAVED_REGNUM_P): New macro.
	* config/aarch64/aarch64.md (simple_return): New define_expand.
	(load_pair_dw_tftf): New instruction.
	(store_pair_dw_tftf): Ditto.
	(loadwb_pair_): Ditto.
	(storewb_pair_): Ditto.

Testsuite ChangeLog:

2018-09-20  Steve Ellcey  <sellcey@cavium.com>

	* gcc.target/aarch64/torture/aarch64-torture.exp: New file.
	* gcc.target/aarch64/torture/simd-abi-1.c: New test.
	* gcc.target/aarch64/torture/simd-abi-2.c: Ditto.
	* gcc.target/aarch64/torture/simd-abi-3.c: Ditto.
	* gcc.target/aarch64/torture/simd-abi-4.c: Ditto.

diff --git a/gcc/testsuite/gcc.target/aarch64/torture/aarch64-torture.exp b/gcc/testsuite/gcc.target/aarch64/torture/aarch64-torture.exp
index e69de29..22f08ff 100644
--- a/gcc/testsuite/gcc.target/aarch64/torture/aarch64-torture.exp
+++ b/gcc/testsuite/gcc.target/aarch64/torture/aarch64-torture.exp
@@ -0,0 +1,41 @@
+# Copyright (C) 2018 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# GCC testsuite that uses the `gcc-dg.exp' driver, looping over
+# optimization options.
+
+# Exit immediately if this isn't an Aarch64 target.
+if { ![istarget aarch64*-*-*] } then {
+  return
+}
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+# If a testcase doesn't have special options, use these.
+global DEFAULT_CFLAGS
+if ![info exists DEFAULT_CFLAGS] then {
+    set DEFAULT_CFLAGS " -ansi -pedantic-errors"
+}
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cS\]]] "" $DEFAULT_CFLAGS
+
+# All done.
+dg-finish
diff --git a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-1.c b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-1.c
index e69de29..249554e 100644
--- a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-1.c
+++ b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-1.c
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+
+void __attribute__ ((aarch64_vector_pcs))
+f (void)
+{
+  /* Clobber all fp/simd regs and verify that the correct ones are saved
+     and restored in the prologue and epilogue of a SIMD function.  */
+  __asm__ __volatile__ ("" ::: "q0", "q1", "q2", "q3");
+  __asm__ __volatile__ ("" ::: "q4", "q5", "q6", "q7");
+  __asm__ __volatile__ ("" ::: "q8", "q9", "q10", "q11");
+  __asm__ __volatile__ ("" ::: "q12", "q13", "q14", "q15");
+  __asm__ __volatile__ ("" ::: "q16", "q17", "q18", "q19");
+  __asm__ __volatile__ ("" ::: "q20", "q21", "q22", "q23");
+  __asm__ __volatile__ ("" ::: "q24", "q25", "q26", "q27");
+  __asm__ __volatile__ ("" ::: "q28", "q29", "q30", "q31");
+}
+
+/* { dg-final { scan-assembler {\sstp\tq8, q9} } } */
+/* { dg-final { scan-assembler {\sstp\tq10, q11} } } */
+/* { dg-final { scan-assembler {\sstp\tq12, q13} } } */
+/* { dg-final { scan-assembler {\sstp\tq14, q15} } } */
+/* { dg-final { scan-assembler {\sstp\tq16, q17} } } */
+/* { dg-final { scan-assembler {\sstp\tq18, q19} } } */
+/* { dg-final { scan-assembler {\sstp\tq20, q21} } } */
+/* { dg-final { scan-assembler {\sstp\tq22, q23} } } */
+/* { dg-final { scan-assembler {\sldp\tq8, q9} } } */
+/* { dg-final { scan-assembler {\sldp\tq10, q11} } } */
+/* { dg-final { scan-assembler {\sldp\tq12, q13} } } */
+/* { dg-final { scan-assembler {\sldp\tq14, q15} } } */
+/* { dg-final { scan-assembler {\sldp\tq16, q17} } } */
+/* { dg-final { scan-assembler {\sldp\tq18, q19} } } */
+/* { dg-final { scan-assembler {\sldp\tq20, q21} } } */
+/* { dg-final { scan-assembler {\sldp\tq22, q23} } } */
+/* { dg-final { scan-assembler-not {\sstp\tq[034567]} } } */
+/* { dg-final { scan-assembler-not {\sldp\tq[034567]} } } */
+/* { dg-final { scan-assembler-not {\sstp\tq2[456789]} } } */
+/* { dg-final { scan-assembler-not {\sldp\tq2[456789]} } } */
+/* { dg-final { scan-assembler-not {\sstp\td} } } */
+/* { dg-final { scan-assembler-not {\sldp\td} } } */
+/* { dg-final { scan-assembler-not {\sstr\t} } } */
+/* { dg-final { scan-assembler-not {\sldr\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-2.c b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-2.c
index e69de29..bf6e64a 100644
--- a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-2.c
+++ b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-2.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+
+void
+f (void)
+{
+  /* Clobber all fp/simd regs and verify that the correct ones are saved
+     and restored in the prologue and epilogue of a SIMD function.
+     */
+  __asm__ __volatile__ ("" ::: "q0", "q1", "q2", "q3");
+  __asm__ __volatile__ ("" ::: "q4", "q5", "q6", "q7");
+  __asm__ __volatile__ ("" ::: "q8", "q9", "q10", "q11");
+  __asm__ __volatile__ ("" ::: "q12", "q13", "q14", "q15");
+  __asm__ __volatile__ ("" ::: "q16", "q17", "q18", "q19");
+  __asm__ __volatile__ ("" ::: "q20", "q21", "q22", "q23");
+  __asm__ __volatile__ ("" ::: "q24", "q25", "q26", "q27");
+  __asm__ __volatile__ ("" ::: "q28", "q29", "q30", "q31");
+}
+
+/* { dg-final { scan-assembler {\sstp\td8, d9} } } */
+/* { dg-final { scan-assembler {\sstp\td10, d11} } } */
+/* { dg-final { scan-assembler {\sstp\td12, d13} } } */
+/* { dg-final { scan-assembler {\sstp\td14, d15} } } */
+/* { dg-final { scan-assembler {\sldp\td8, d9} } } */
+/* { dg-final { scan-assembler {\sldp\td10, d11} } } */
+/* { dg-final { scan-assembler {\sldp\td12, d13} } } */
+/* { dg-final { scan-assembler {\sldp\td14, d15} } } */
+/* { dg-final { scan-assembler-not {\sstp\tq} } } */
+/* { dg-final { scan-assembler-not {\sldp\tq} } } */
+/* { dg-final { scan-assembler-not {\sstp\tq[01234567]} } } */
+/* { dg-final { scan-assembler-not {\sldp\tq[01234567]} } } */
+/* { dg-final { scan-assembler-not {\sstp\tq1[6789]} } } */
+/* { dg-final { scan-assembler-not {\sldp\tq1[6789]} } } */
+/* { dg-final { scan-assembler-not {\sstr\t} } } */
+/* { dg-final { scan-assembler-not {\sldr\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-3.c b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-3.c
index e69de29..7d4f54f 100644
--- a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-3.c
+++ b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-3.c
@@ -0,0 +1,34 @@
+/* { dg-do compile } */
+
+extern void g (void);
+
+void __attribute__ ((aarch64_vector_pcs))
+f (void)
+{
+  g();
+}
+
+/* { dg-final { scan-assembler {\sstp\tq8, q9} } } */
+/* { dg-final { scan-assembler {\sstp\tq10, q11} } } */
+/* { dg-final { scan-assembler {\sstp\tq12, q13} } } */
+/* { dg-final { scan-assembler {\sstp\tq14, q15} } } */
+/* { dg-final { scan-assembler {\sstp\tq16, q17} } } */
+/* { dg-final { scan-assembler {\sstp\tq18, q19} } } */
+/* { dg-final { scan-assembler {\sstp\tq20, q21} } } */
+/* { dg-final { scan-assembler {\sstp\tq22, q23} } } */
+/* { dg-final { scan-assembler {\sldp\tq8, q9} } } */
+/* { dg-final { scan-assembler {\sldp\tq10, q11} } } */
+/* { dg-final { scan-assembler {\sldp\tq12, q13} } } */
+/* { dg-final { scan-assembler {\sldp\tq14, q15} } } */
+/* { dg-final { scan-assembler {\sldp\tq16, q17} } } */
+/* { dg-final { scan-assembler {\sldp\tq18, q19} } } */
+/* { dg-final { scan-assembler {\sldp\tq20, q21} } } */
+/* { dg-final { scan-assembler {\sldp\tq22, q23} } } */
+/* { dg-final { scan-assembler-not {\sstp\tq[034567]} } } */
+/* { dg-final { scan-assembler-not {\sldp\tq[034567]} } } */
+/* { dg-final { scan-assembler-not {\sstp\tq2[456789]} } } */
+/* { dg-final { scan-assembler-not {\sldp\tq2[456789]} } } */
+/* { dg-final { scan-assembler-not {\sstp\td} } } */
+/* { dg-final { scan-assembler-not {\sldp\td} } } */
+/* { dg-final { scan-assembler-not {\sstr\t} } } */
+/* { dg-final { scan-assembler-not {\sldr\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-4.c b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-4.c
index e69de29..e399690 100644
--- a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-4.c
+++ b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-4.c
@@ -0,0 +1,34 @@
+/* { dg-do run } */
+/* { dg-additional-options "-std=c99" } */
+
+/* There is nothing special about the calculations here; this is just
+   a test that can be compiled and run.
+   */
+
+extern void abort (void);
+
+__Float64x2_t __attribute__ ((noinline, aarch64_vector_pcs))
+foo (__Float64x2_t a, __Float64x2_t b, __Float64x2_t c,
+     __Float64x2_t d, __Float64x2_t e, __Float64x2_t f,
+     __Float64x2_t g, __Float64x2_t h, __Float64x2_t i)
+{
+  __Float64x2_t w, x, y, z;
+  w = a + b * c;
+  x = d + e * f;
+  y = g + h * i;
+  return w + x * y;
+}
+
+int main ()
+{
+  __Float64x2_t a, b, c, d;
+  a = (__Float64x2_t) { 1.0, 2.0 };
+  b = (__Float64x2_t) { 3.0, 4.0 };
+  c = (__Float64x2_t) { 5.0, 6.0 };
+  d = foo (a, b, c, (a+b), (b+c), (a+c), (a-b), (b-c), (a-c)) + a + b + c;
+  if (d[0] != 337.0 || d[1] != 554.0)
+    abort ();
+  return 0;
+}

From patchwork Thu Sep 20 15:15:09 2018
X-Patchwork-Submitter: Steve Ellcey
X-Patchwork-Id: 972494
From: Steve Ellcey
To: gcc-patches
Subject: [Patch 2/3][Aarch64] Implement Aarch64 SIMD ABI
Date: Thu, 20 Sep 2018 15:15:09 +0000
Message-ID: <1537456506.24844.14.camel@cavium.com>

This is the second of three Aarch64 patches for SIMD ABI support.  It
defines the TARGET_SIMD_CLONE_* macros so that GCC will recognize and
vectorize loops containing SIMD functions.  It requires that patch one
of the series be checked in first.

This patch has not been fully regression tested yet, but it is fairly
safe and I am posting it to see if there are any comments on it.

Steve Ellcey
sellcey@cavium.com


2018-09-20  Steve Ellcey  <sellcey@cavium.com>

	* config/aarch64/aarch64.c (cgraph.h): New include.
	(aarch64_simd_clone_compute_vecsize_and_simdlen): New function.
	(aarch64_simd_clone_adjust): Ditto.
	(aarch64_simd_clone_usable): Ditto.
	(TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN): New macro.
	(TARGET_SIMD_CLONE_ADJUST): Ditto.
	(TARGET_SIMD_CLONE_USABLE): Ditto.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 8cc738c..a86f32d 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -40,6 +40,7 @@
 #include "regs.h"
 #include "emit-rtl.h"
 #include "recog.h"
+#include "cgraph.h"
 #include "diagnostic.h"
 #include "insn-attr.h"
 #include "alias.h"
@@ -17472,6 +17473,131 @@ aarch64_speculation_safe_value (machine_mode mode,
   return result;
 }
 
+/* Set CLONEI->vecsize_mangle, CLONEI->mask_mode, CLONEI->vecsize_int,
+   CLONEI->vecsize_float and if CLONEI->simdlen is 0, also
+   CLONEI->simdlen.  Return 0 if SIMD clones shouldn't be emitted,
+   or the number of vecsize_mangle variants that should be emitted.  */
+
+static int
+aarch64_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node,
+						struct cgraph_simd_clone *clonei,
+						tree base_type,
+						int num ATTRIBUTE_UNUSED)
+{
+  int ret = 0;
+
+  if (clonei->simdlen
+      && (clonei->simdlen < 2
+	  || clonei->simdlen > 1024
+	  || (clonei->simdlen & (clonei->simdlen - 1)) != 0))
+    {
+      warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
+		  "unsupported simdlen %d", clonei->simdlen);
+      return 0;
+    }
+
+  tree ret_type = TREE_TYPE (TREE_TYPE (node->decl));
+  if (TREE_CODE (ret_type) != VOID_TYPE)
+    switch (TYPE_MODE (ret_type))
+      {
+      case E_QImode:
+      case E_HImode:
+      case E_SImode:
+      case E_DImode:
+      case E_SFmode:
+      case E_DFmode:
+      /* case E_SCmode: */
+      /* case E_DCmode: */
+	break;
+      default:
+	warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
+		    "unsupported return type %qT for simd\n", ret_type);
+	return 0;
+      }
+
+  tree t;
+  for (t = DECL_ARGUMENTS (node->decl); t; t = DECL_CHAIN (t))
+    /* FIXME: Shouldn't we allow such arguments if they are uniform?  */
+    switch (TYPE_MODE (TREE_TYPE (t)))
+      {
+      case E_QImode:
+      case E_HImode:
+      case E_SImode:
+      case E_DImode:
+      case E_SFmode:
+      case E_DFmode:
+      /* case E_SCmode: */
+      /* case E_DCmode: */
+	break;
+      default:
+	warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
+		    "unsupported argument type %qT for simd\n", TREE_TYPE (t));
+	return 0;
+      }
+
+  if (TARGET_SIMD)
+    {
+      clonei->vecsize_mangle = 'n';
+      clonei->mask_mode = VOIDmode;
+      clonei->vecsize_int = 128;
+      clonei->vecsize_float = 128;
+
+      if (clonei->simdlen == 0)
+	{
+	  if (SCALAR_INT_MODE_P (TYPE_MODE (base_type)))
+	    clonei->simdlen = clonei->vecsize_int;
+	  else
+	    clonei->simdlen = clonei->vecsize_float;
+	  clonei->simdlen /= GET_MODE_BITSIZE (SCALAR_TYPE_MODE (base_type));
+	}
+      else if (clonei->simdlen > 16)
+	{
+	  /* If it is possible for the given SIMDLEN to pass a CTYPE value
+	     in registers (v0-v7) accept that SIMDLEN, otherwise warn and
+	     don't emit the corresponding clone.  */
+	  int cnt = GET_MODE_BITSIZE (SCALAR_TYPE_MODE (base_type))
+		    * clonei->simdlen;
+	  if (SCALAR_INT_MODE_P (TYPE_MODE (base_type)))
+	    cnt /= clonei->vecsize_int;
+	  else
+	    cnt /= clonei->vecsize_float;
+	  if (cnt > 8)
+	    {
+	      warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
+			  "unsupported simdlen %d", clonei->simdlen);
+	      return 0;
+	    }
+	}
+      ret = 1;
+    }
+  return ret;
+}
+
+/* Add a target attribute to SIMD clone NODE if needed.  */
+
+static void
+aarch64_simd_clone_adjust (struct cgraph_node *node ATTRIBUTE_UNUSED)
+{
+}
+
+/* If SIMD clone NODE can't be used in a vectorized loop in the current
+   function, return -1, otherwise return a badness of using it (0 if it
+   is most desirable from the vecsize_mangle point of view, 1 slightly
+   less desirable, etc.).  */
+
+static int
+aarch64_simd_clone_usable (struct cgraph_node *node)
+{
+  switch (node->simdclone->vecsize_mangle)
+    {
+    case 'n':
+      if (!TARGET_SIMD)
+	return -1;
+      return 0;
+    default:
+      gcc_unreachable ();
+    }
+}
+
 /* Target-specific selftests.
    */
 
 #if CHECKING_P
@@ -17947,6 +18073,16 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_SPECULATION_SAFE_VALUE
 #define TARGET_SPECULATION_SAFE_VALUE aarch64_speculation_safe_value
 
+#undef TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN
+#define TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN \
+  aarch64_simd_clone_compute_vecsize_and_simdlen
+
+#undef TARGET_SIMD_CLONE_ADJUST
+#define TARGET_SIMD_CLONE_ADJUST aarch64_simd_clone_adjust
+
+#undef TARGET_SIMD_CLONE_USABLE
+#define TARGET_SIMD_CLONE_USABLE aarch64_simd_clone_usable
+
 #if CHECKING_P
 #undef TARGET_RUN_TARGET_SELFTESTS
 #define TARGET_RUN_TARGET_SELFTESTS selftest::aarch64_run_selftests

From patchwork Thu Sep 20 15:17:00 2018
X-Patchwork-Submitter: Steve Ellcey
X-Patchwork-Id: 972497
From: Steve Ellcey
To: gcc-patches
Subject: [Patch 3/3][Aarch64] Implement Aarch64 SIMD ABI
Date: Thu, 20 Sep 2018 15:17:00 +0000
Message-ID: <1537456618.24844.16.camel@cavium.com>

This is the third of three patches for Aarch64 SIMD ABI support.  This
patch is not fully tested yet, but I want to post it to get comments.

This is the only patch of the three that touches non-aarch64-specific
code.  The changes here are made to give GCC better information about
which registers are clobbered by functions.  With the new SIMD ABI on
Aarch64, the registers clobbered by a SIMD function are a subset of
the registers clobbered by a normal (non-SIMD) function.  This can
result in the caller saving and restoring more registers than
necessary.  This patch addresses that by passing information about the
call insn to various routines so that they can check what type of
function is being called and modify the clobbered register set based
on that information.
As an example, this code:

  __attribute__ ((__simd__ ("notinbranch"))) extern double sin (double __x);
  __attribute__ ((__simd__ ("notinbranch"))) extern double log (double __x);
  __attribute__ ((__simd__ ("notinbranch"))) extern double exp (double __x);

  double foo (double * __restrict__ x, double * __restrict__ y,
              double * __restrict__ z, int n)
  {
    int i;
    double a = 0.0;
    for (i = 0; i < n; i++)
      a = a + sin (x[i]) + log (y[i]) + exp (z[i]);
    return a;
  }

will generate stores inside the main vectorized loop to preserve
registers without this patch; after the patch, it will not do any
stores and will use registers it knows the vector sin/log/exp
functions do not clobber.

Comments?

Steve Ellcey
sellcey@cavium.com


2018-09-20  Steve Ellcey  <sellcey@cavium.com>

	* caller-save.c (setup_save_areas): Modify get_call_reg_set_usage
	arguments.
	(save_call_clobbered_regs): Ditto.
	* config/aarch64/aarch64.c (aarch64_simd_function_def): New function.
	(aarch64_simd_call_p): Ditto.
	(aarch64_hard_regno_call_part_clobbered): Check for simd calls.
	(aarch64_check_part_clobbered): New function.
	(aarch64_used_reg_set): New function.
	(TARGET_CHECK_PART_CLOBBERED): New macro.
	(TARGET_USED_REG_SET): New macro.
	* cselib.c (cselib_process_insn): Modify
	targetm.hard_regno_call_part_clobbered arguments.
	* df-scan.c (df_get_call_refs): Modify get_call_reg_set_usage
	arguments.
	* doc/tm.texi.in (TARGET_CHECK_PART_CLOBBERED): New hook.
	(TARGET_USED_REG_SET): New hook.
	* final.c (collect_fn_hard_reg_usage): Modify get_call_reg_set_usage
	arguments.
	(get_call_reg_set_usage): Update description and argument list,
	modify code to return proper register set.
	* hooks.c (hook_bool_uint_mode_false): Rename to
	hook_bool_insn_uint_mode_false.
	* hooks.h (hook_bool_uint_mode_false): Ditto.
	* ira-conflicts.c (ira_build_conflicts): Modify
	targetm.hard_regno_call_part_clobbered arguments.
	* ira-costs.c (ira_tune_allocno_costs): Ditto.
	* ira-lives.c (process_bb_node_lives): Modify get_call_reg_set_usage
	arguments.
	* lra-constraints.c (need_for_call_save_p): Add new argument.
	Modify return and update arguments to
	targetm.hard_regno_call_part_clobbered.
	(need_for_split_p): Add insn argument.  Pass argument to
	need_for_call_save_p.
	(split_if_necessary): Pass insn argument to need_for_split_p.
	(inherit_in_ebb): Pass curr_insn to need_for_split_p.
	* lra-int.h (struct lra_reg): Add check_part_clobbered field.
	* lra-lives.c (lra_setup_reload_pseudo_preferenced_hard_reg):
	Add insn argument.
	(check_pseudos_live_through_calls): Add check of flag_ipa_ra.
	(process_bb_lives): Pass curr_insn to
	check_pseudos_live_through_calls.  Modify get_call_reg_set_usage,
	targetm.check_part_clobbered, and check_pseudos_live_through_calls
	arguments.
	* lra.c (initialize_lra_reg_info_element): Initialize
	check_part_clobbered to false.
	* postreload.c (reload_combine): Modify get_call_reg_set_usage
	arguments.
	* regcprop.c (copyprop_hardreg_forward_1): Modify
	get_call_reg_set_usage and targetm.hard_regno_call_part_clobbered
	arguments.
	* reginfo.c (choose_hard_reg_mode): Modify
	targetm.hard_regno_call_part_clobbered arguments.
	* regrename.c (check_new_reg_p): Ditto.
	* regs.h (get_call_reg_set_usage): Update argument list.
	* reload.c (find_equiv_reg): Modify
	targetm.hard_regno_call_part_clobbered argument list.
	* reload1.c (emit_reload_insns): Ditto.
	* resource.c (mark_set_resources): Modify get_call_reg_set_usage
	argument list.
	* sched-deps.c (deps_analyze_insn): Modify
	targetm.hard_regno_call_part_clobbered argument list.
	* sel-sched.c (init_regs_for_mode): Ditto.
	(mark_unavailable_hard_regs): Ditto.
	* target.def (hard_regno_call_part_clobbered): Update description
	and argument list.
	(check_part_clobbered): New hook.
	(used_reg_set): New hook.
	* targhooks.c (default_dwarf_frame_reg_mode): Update
	targetm.hard_regno_call_part_clobbered argument list.
	(default_used_reg_set): New function.
	* targhooks.h (default_used_reg_set): New function declaration.
* var-tracking.c (dataflow_set_clear_at_call): Modify get_call_reg_set_usage argument list. diff --git a/gcc/caller-save.c b/gcc/caller-save.c index a7edbad..922b02d 100644 --- a/gcc/caller-save.c +++ b/gcc/caller-save.c @@ -442,7 +442,7 @@ setup_save_areas (void) freq = REG_FREQ_FROM_BB (BLOCK_FOR_INSN (insn)); REG_SET_TO_HARD_REG_SET (hard_regs_to_save, &chain->live_throughout); - get_call_reg_set_usage (insn, &used_regs, call_used_reg_set); + get_call_reg_set_usage (insn, &used_regs, true); /* Record all registers set in this call insn. These don't need to be saved. N.B. the call insn might set a subreg @@ -526,7 +526,7 @@ setup_save_areas (void) REG_SET_TO_HARD_REG_SET (hard_regs_to_save, &chain->live_throughout); - get_call_reg_set_usage (insn, &used_regs, call_used_reg_set); + get_call_reg_set_usage (insn, &used_regs, true); /* Record all registers set in this call insn. These don't need to be saved. N.B. the call insn might set a subreg @@ -855,8 +855,7 @@ save_call_clobbered_regs (void) AND_COMPL_HARD_REG_SET (hard_regs_to_save, call_fixed_reg_set); AND_COMPL_HARD_REG_SET (hard_regs_to_save, this_insn_sets); AND_COMPL_HARD_REG_SET (hard_regs_to_save, hard_regs_saved); - get_call_reg_set_usage (insn, &call_def_reg_set, - call_used_reg_set); + get_call_reg_set_usage (insn, &call_def_reg_set, true); AND_HARD_REG_SET (hard_regs_to_save, call_def_reg_set); for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++) diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 8cc738c..b101c7b 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -1383,16 +1383,87 @@ aarch64_hard_regno_mode_ok (unsigned regno, machine_mode mode) return false; } +/* Return true if this is a definition of a vectorized simdclone function. + We recognize this only by looking for a simd or aarch64_vector_pcs + attribute. 
*/ + +static bool +aarch64_simd_function_def (tree fndecl) +{ + if (lookup_attribute ("aarch64_vector_pcs", DECL_ATTRIBUTES (fndecl)) != NULL) + return true; + if (lookup_attribute ("simd", DECL_ATTRIBUTES (fndecl)) == NULL) + return false; + return (VECTOR_TYPE_P (TREE_TYPE (TREE_TYPE (fndecl)))); +} + +/* Return true if insn is a call to a simd function. */ + +static bool +aarch64_simd_call_p (rtx_insn *insn) +{ + rtx symbol; + rtx call; + tree fndecl; + + if (!insn) + return false; + call = get_call_rtx_from (insn); + if (!call) + return false; + symbol = XEXP (XEXP (call, 0), 0); + if (GET_CODE (symbol) != SYMBOL_REF) + return false; + fndecl = SYMBOL_REF_DECL (symbol); + if (!fndecl) + return false; + + return aarch64_simd_function_def (fndecl); +} + /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED. The callee only saves the lower 64 bits of a 128-bit register. Tell the compiler the callee clobbers the top 64 bits when restoring the bottom 64 bits. */ static bool -aarch64_hard_regno_call_part_clobbered (unsigned int regno, machine_mode mode) +aarch64_hard_regno_call_part_clobbered (rtx_insn *insn, unsigned int regno, machine_mode mode) { + if (aarch64_simd_call_p (insn)) + { + if (FP_SIMD_SAVED_REGNUM_P (regno)) + return false; + } return FP_REGNUM_P (regno) && maybe_gt (GET_MODE_SIZE (mode), 8); } +static bool +aarch64_check_part_clobbered (rtx_insn *insn) +{ + if (aarch64_simd_call_p (insn)) + return false; + return true; +} + +void +aarch64_used_reg_set (rtx_insn *insn, + HARD_REG_SET *return_set, + bool default_to_used) +{ + int regno; + + if (default_to_used) + COPY_HARD_REG_SET (*return_set, call_used_reg_set); + else + COPY_HARD_REG_SET (*return_set, regs_invalidated_by_call); + + if (aarch64_simd_call_p (insn)) + { + for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++) + if (FP_SIMD_SAVED_REGNUM_P (regno)) + CLEAR_HARD_REG_BIT (*return_set, regno); + } +} + /* Implement REGMODE_NATURAL_SIZE. 
*/ poly_uint64 aarch64_regmode_natural_size (machine_mode mode) @@ -17932,6 +18003,9 @@ aarch64_libgcc_floating_mode_supported_p #define TARGET_HARD_REGNO_CALL_PART_CLOBBERED \ aarch64_hard_regno_call_part_clobbered +#undef TARGET_CHECK_PART_CLOBBERED +#define TARGET_CHECK_PART_CLOBBERED aarch64_check_part_clobbered + #undef TARGET_CONSTANT_ALIGNMENT #define TARGET_CONSTANT_ALIGNMENT aarch64_constant_alignment @@ -17947,6 +18021,9 @@ aarch64_libgcc_floating_mode_supported_p #undef TARGET_SPECULATION_SAFE_VALUE #define TARGET_SPECULATION_SAFE_VALUE aarch64_speculation_safe_value +#undef TARGET_USED_REG_SET +#define TARGET_USED_REG_SET aarch64_used_reg_set + #if CHECKING_P #undef TARGET_RUN_TARGET_SELFTESTS #define TARGET_RUN_TARGET_SELFTESTS selftest::aarch64_run_selftests diff --git a/gcc/cselib.c b/gcc/cselib.c index 6d3a407..7d6f28e 100644 --- a/gcc/cselib.c +++ b/gcc/cselib.c @@ -2769,7 +2769,7 @@ cselib_process_insn (rtx_insn *insn) if (call_used_regs[i] || (REG_VALUES (i) && REG_VALUES (i)->elt && (targetm.hard_regno_call_part_clobbered - (i, GET_MODE (REG_VALUES (i)->elt->val_rtx))))) + (insn, i, GET_MODE (REG_VALUES (i)->elt->val_rtx))))) cselib_invalidate_regno (i, reg_raw_mode[i]); /* Since it is not clear how cselib is going to be used, be diff --git a/gcc/df-scan.c b/gcc/df-scan.c index 0b119f2..4d2751d 100644 --- a/gcc/df-scan.c +++ b/gcc/df-scan.c @@ -3097,8 +3097,7 @@ df_get_call_refs (struct df_collection_rec *collection_rec, CLEAR_HARD_REG_SET (defs_generated); df_find_hard_reg_defs (PATTERN (insn_info->insn), &defs_generated); is_sibling_call = SIBLING_CALL_P (insn_info->insn); - get_call_reg_set_usage (insn_info->insn, &fn_reg_set_usage, - regs_invalidated_by_call); + get_call_reg_set_usage (insn_info->insn, &fn_reg_set_usage, false); for (i = 0; i < FIRST_PSEUDO_REGISTER; i++) { diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index c509a9b..fb05da6 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -1693,6 +1693,10 @@ of 
@code{CALL_USED_REGISTERS}. @cindex call-saved register @hook TARGET_HARD_REGNO_CALL_PART_CLOBBERED +@hook TARGET_CHECK_PART_CLOBBERED + +@hook TARGET_USED_REG_SET + @findex fixed_regs @findex call_used_regs @findex global_regs diff --git a/gcc/final.c b/gcc/final.c index 6943c07..343dd00 100644 --- a/gcc/final.c +++ b/gcc/final.c @@ -5002,8 +5002,7 @@ collect_fn_hard_reg_usage (void) if (CALL_P (insn) && !self_recursive_call_p (insn)) { - if (!get_call_reg_set_usage (insn, &insn_used_regs, - call_used_reg_set)) + if (!get_call_reg_set_usage (insn, &insn_used_regs, true)) return; IOR_HARD_REG_SET (function_used_regs, insn_used_regs); @@ -5074,24 +5073,29 @@ get_call_cgraph_rtl_info (rtx_insn *insn) } /* Find hard registers used by function call instruction INSN, and return them - in REG_SET. Return DEFAULT_SET in REG_SET if not found. */ + in REG_SET. If not found, return call_used_reg_set in REG_SET when + default_to_used is true, or regs_invalidated_by_call when it is false. */ bool get_call_reg_set_usage (rtx_insn *insn, HARD_REG_SET *reg_set, - HARD_REG_SET default_set) + bool default_to_used) { + HARD_REG_SET default_set; + if (flag_ipa_ra) { struct cgraph_rtl_info *node = get_call_cgraph_rtl_info (insn); if (node != NULL && node->function_used_regs_valid) { + targetm.used_reg_set (insn, &default_set, default_to_used); COPY_HARD_REG_SET (*reg_set, node->function_used_regs); AND_HARD_REG_SET (*reg_set, default_set); return true; } } + targetm.used_reg_set (insn, &default_set, default_to_used); COPY_HARD_REG_SET (*reg_set, default_set); return false; } diff --git a/gcc/hooks.c b/gcc/hooks.c index 780cc1e..c412d69 100644 --- a/gcc/hooks.c +++ b/gcc/hooks.c @@ -140,9 +140,10 @@ hook_bool_puint64_puint64_true (poly_uint64, poly_uint64) return true; } -/* Generic hook that takes (unsigned int, machine_mode) and returns false. */ +/* Generic hook that takes (rtx_insn *, unsigned int, machine_mode) and + returns false. 
*/ bool -hook_bool_uint_mode_false (unsigned int, machine_mode) +hook_bool_insn_uint_mode_false (rtx_insn *, unsigned int, machine_mode) { return false; } diff --git a/gcc/hooks.h b/gcc/hooks.h index 0ed5b95..3ca3db2 100644 --- a/gcc/hooks.h +++ b/gcc/hooks.h @@ -40,7 +40,8 @@ extern bool hook_bool_const_rtx_insn_const_rtx_insn_true (const rtx_insn *, extern bool hook_bool_mode_uhwi_false (machine_mode, unsigned HOST_WIDE_INT); extern bool hook_bool_puint64_puint64_true (poly_uint64, poly_uint64); -extern bool hook_bool_uint_mode_false (unsigned int, machine_mode); +extern bool hook_bool_insn_uint_mode_false (rtx_insn *, unsigned int, + machine_mode); extern bool hook_bool_uint_mode_true (unsigned int, machine_mode); extern bool hook_bool_tree_false (tree); extern bool hook_bool_const_tree_false (const_tree); diff --git a/gcc/ira-conflicts.c b/gcc/ira-conflicts.c index eb85e77..8288485 100644 --- a/gcc/ira-conflicts.c +++ b/gcc/ira-conflicts.c @@ -808,7 +808,7 @@ ira_build_conflicts (void) regs must conflict with them. 
*/ for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++) if (!TEST_HARD_REG_BIT (call_used_reg_set, regno) - && targetm.hard_regno_call_part_clobbered (regno, + && targetm.hard_regno_call_part_clobbered (NULL, regno, obj_mode)) { SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno); diff --git a/gcc/ira-costs.c b/gcc/ira-costs.c index 6fa917a..6a2a0b4 100644 --- a/gcc/ira-costs.c +++ b/gcc/ira-costs.c @@ -2337,7 +2337,7 @@ ira_tune_allocno_costs (void) *crossed_calls_clobber_regs) && (ira_hard_reg_set_intersection_p (regno, mode, call_used_reg_set) - || targetm.hard_regno_call_part_clobbered (regno, + || targetm.hard_regno_call_part_clobbered (NULL, regno, mode))) cost += (ALLOCNO_CALL_FREQ (a) * (ira_memory_move_cost[mode][rclass][0] diff --git a/gcc/ira-lives.c b/gcc/ira-lives.c index b38d4a5..39ea82a 100644 --- a/gcc/ira-lives.c +++ b/gcc/ira-lives.c @@ -1202,8 +1202,7 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node) int num = ALLOCNO_NUM (a); HARD_REG_SET this_call_used_reg_set; - get_call_reg_set_usage (insn, &this_call_used_reg_set, - call_used_reg_set); + get_call_reg_set_usage (insn, &this_call_used_reg_set, true); /* Don't allocate allocnos that cross setjmps or any call, if this function receives a nonlocal diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c index 8be4d46..21ef0d8 100644 --- a/gcc/lra-constraints.c +++ b/gcc/lra-constraints.c @@ -5344,18 +5344,32 @@ inherit_reload_reg (bool def_p, int original_regno, /* Return true if we need a caller save/restore for pseudo REGNO which was assigned to a hard register. */ static inline bool -need_for_call_save_p (int regno) +need_for_call_save_p (int regno, rtx_insn *insn ATTRIBUTE_UNUSED) { + machine_mode pmode = PSEUDO_REGNO_MODE (regno); + int new_regno = reg_renumber[regno]; + lra_assert (regno >= FIRST_PSEUDO_REGISTER && reg_renumber[regno] >= 0); - return (usage_insns[regno].calls_num < calls_num - && (overlaps_hard_reg_set_p - ((flag_ipa_ra && - ! 
hard_reg_set_empty_p (lra_reg_info[regno].actual_call_used_reg_set)) - ? lra_reg_info[regno].actual_call_used_reg_set - : call_used_reg_set, - PSEUDO_REGNO_MODE (regno), reg_renumber[regno]) - || (targetm.hard_regno_call_part_clobbered - (reg_renumber[regno], PSEUDO_REGNO_MODE (regno)))); + if (usage_insns[regno].calls_num >= calls_num) + return false; + + /* If we are doing interprocedural register allocation, + targetm.hard_regno_call_part_clobbered was used to set + actual_call_used_reg_set and should not be checked + here. */ + + if (flag_ipa_ra + && !hard_reg_set_empty_p (lra_reg_info[regno].actual_call_used_reg_set)) + return (overlaps_hard_reg_set_p + (lra_reg_info[regno].actual_call_used_reg_set, + pmode, new_regno) + || (lra_reg_info[regno].check_part_clobbered + && targetm.hard_regno_call_part_clobbered + (NULL, new_regno, pmode))); + else + return (overlaps_hard_reg_set_p (call_used_reg_set, pmode, new_regno) + || targetm.hard_regno_call_part_clobbered + (NULL, new_regno, pmode)); } /* Global registers occurring in the current EBB. */ @@ -5374,7 +5388,8 @@ static bitmap_head ebb_global_regs; assignment pass because of too many generated moves which will be probably removed in the undo pass. */ static inline bool -need_for_split_p (HARD_REG_SET potential_reload_hard_regs, int regno) +need_for_split_p (HARD_REG_SET potential_reload_hard_regs, int regno, + rtx_insn *insn) { int hard_regno = regno < FIRST_PSEUDO_REGISTER ? 
regno : reg_renumber[regno]; @@ -5416,7 +5431,8 @@ need_for_split_p (HARD_REG_SET potential_reload_hard_regs, int regno) || (regno >= FIRST_PSEUDO_REGISTER && lra_reg_info[regno].nrefs > 3 && bitmap_bit_p (&ebb_global_regs, regno)))) - || (regno >= FIRST_PSEUDO_REGISTER && need_for_call_save_p (regno))); + || (regno >= FIRST_PSEUDO_REGISTER + && need_for_call_save_p (regno, insn))); } /* Return class for the split pseudo created from original pseudo with @@ -5536,7 +5552,7 @@ split_reg (bool before_p, int original_regno, rtx_insn *insn, nregs = hard_regno_nregs (hard_regno, mode); rclass = lra_get_allocno_class (original_regno); original_reg = regno_reg_rtx[original_regno]; - call_save_p = need_for_call_save_p (original_regno); + call_save_p = need_for_call_save_p (original_regno, insn); } lra_assert (hard_regno >= 0); if (lra_dump_file != NULL) @@ -5769,7 +5785,7 @@ split_if_necessary (int regno, machine_mode mode, && INSN_UID (next_usage_insns) < max_uid) || (GET_CODE (next_usage_insns) == INSN_LIST && (INSN_UID (XEXP (next_usage_insns, 0)) < max_uid))) - && need_for_split_p (potential_reload_hard_regs, regno + i) + && need_for_split_p (potential_reload_hard_regs, regno + i, insn) && split_reg (before_p, regno + i, insn, next_usage_insns, NULL)) res = true; return res; @@ -6539,7 +6555,8 @@ inherit_in_ebb (rtx_insn *head, rtx_insn *tail) && usage_insns[j].check == curr_usage_insns_check && (next_usage_insns = usage_insns[j].insns) != NULL_RTX) { - if (need_for_split_p (potential_reload_hard_regs, j)) + if (need_for_split_p (potential_reload_hard_regs, j, + curr_insn)) { if (lra_dump_file != NULL && head_p) { diff --git a/gcc/lra-int.h b/gcc/lra-int.h index 5267b53..e6aacd2 100644 --- a/gcc/lra-int.h +++ b/gcc/lra-int.h @@ -117,6 +117,8 @@ struct lra_reg /* This member is set up in lra-lives.c for subsequent assignments. */ lra_copy_t copies; + /* Whether or not the register is partially clobbered. 
*/ + bool check_part_clobbered; }; /* References to the common info about each register. */ diff --git a/gcc/lra-lives.c b/gcc/lra-lives.c index 565c68b..5d3eab9 100644 --- a/gcc/lra-lives.c +++ b/gcc/lra-lives.c @@ -568,7 +568,8 @@ lra_setup_reload_pseudo_preferenced_hard_reg (int regno, PSEUDOS_LIVE_THROUGH_CALLS and PSEUDOS_LIVE_THROUGH_SETJUMPS. */ static inline void check_pseudos_live_through_calls (int regno, - HARD_REG_SET last_call_used_reg_set) + HARD_REG_SET last_call_used_reg_set, + rtx_insn *insn ATTRIBUTE_UNUSED) { int hr; @@ -578,11 +579,12 @@ check_pseudos_live_through_calls (int regno, IOR_HARD_REG_SET (lra_reg_info[regno].conflict_hard_regs, last_call_used_reg_set); - for (hr = 0; hr < FIRST_PSEUDO_REGISTER; hr++) - if (targetm.hard_regno_call_part_clobbered (hr, - PSEUDO_REGNO_MODE (regno))) - add_to_hard_reg_set (&lra_reg_info[regno].conflict_hard_regs, - PSEUDO_REGNO_MODE (regno), hr); + if (!flag_ipa_ra) + for (hr = 0; hr < FIRST_PSEUDO_REGISTER; hr++) + if (targetm.hard_regno_call_part_clobbered (NULL, hr, + PSEUDO_REGNO_MODE (regno))) + add_to_hard_reg_set (&lra_reg_info[regno].conflict_hard_regs, + PSEUDO_REGNO_MODE (regno), hr); lra_reg_info[regno].call_p = true; if (! sparseset_bit_p (pseudos_live_through_setjumps, regno)) return; @@ -820,7 +822,8 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) |= mark_regno_live (reg->regno, reg->biggest_mode, curr_point); check_pseudos_live_through_calls (reg->regno, - last_call_used_reg_set); + last_call_used_reg_set, + curr_insn); } if (reg->regno >= FIRST_PSEUDO_REGISTER) @@ -872,8 +875,7 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) else { HARD_REG_SET this_call_used_reg_set; - get_call_reg_set_usage (curr_insn, &this_call_used_reg_set, - call_used_reg_set); + get_call_reg_set_usage (curr_insn, &this_call_used_reg_set, true); bool flush = (! hard_reg_set_empty_p (last_call_used_reg_set) && ! 
hard_reg_set_equal_p (last_call_used_reg_set, @@ -883,9 +885,13 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) { IOR_HARD_REG_SET (lra_reg_info[j].actual_call_used_reg_set, this_call_used_reg_set); + + if (targetm.check_part_clobbered (curr_insn)) + lra_reg_info[j].check_part_clobbered = true; + if (flush) check_pseudos_live_through_calls - (j, last_call_used_reg_set); + (j, last_call_used_reg_set, curr_insn); } COPY_HARD_REG_SET(last_call_used_reg_set, this_call_used_reg_set); } @@ -915,7 +921,8 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) |= mark_regno_live (reg->regno, reg->biggest_mode, curr_point); check_pseudos_live_through_calls (reg->regno, - last_call_used_reg_set); + last_call_used_reg_set, + curr_insn); } for (reg = curr_static_id->hard_regs; reg != NULL; reg = reg->next) @@ -1071,7 +1078,7 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) if (sparseset_cardinality (pseudos_live_through_calls) == 0) break; if (sparseset_bit_p (pseudos_live_through_calls, j)) - check_pseudos_live_through_calls (j, last_call_used_reg_set); + check_pseudos_live_through_calls (j, last_call_used_reg_set, NULL); } for (i = 0; i < FIRST_PSEUDO_REGISTER; ++i) diff --git a/gcc/lra.c b/gcc/lra.c index aa768fb..17cbf07 100644 --- a/gcc/lra.c +++ b/gcc/lra.c @@ -1344,6 +1344,7 @@ initialize_lra_reg_info_element (int i) lra_reg_info[i].val = get_new_reg_value (); lra_reg_info[i].offset = 0; lra_reg_info[i].copies = NULL; + lra_reg_info[i].check_part_clobbered = false; } /* Initialize common reg info and copies. 
*/ diff --git a/gcc/postreload.c b/gcc/postreload.c index 56cb14d..bca4e59 100644 --- a/gcc/postreload.c +++ b/gcc/postreload.c @@ -1332,7 +1332,7 @@ reload_combine (void) rtx link; HARD_REG_SET used_regs; - get_call_reg_set_usage (insn, &used_regs, call_used_reg_set); + get_call_reg_set_usage (insn, &used_regs, true); for (r = 0; r < FIRST_PSEUDO_REGISTER; r++) if (TEST_HARD_REG_BIT (used_regs, r)) diff --git a/gcc/regcprop.c b/gcc/regcprop.c index 1f80576..cdd5b90 100644 --- a/gcc/regcprop.c +++ b/gcc/regcprop.c @@ -1048,13 +1048,11 @@ copyprop_hardreg_forward_1 (basic_block bb, struct value_data *vd) } } - get_call_reg_set_usage (insn, - ®s_invalidated_by_this_call, - regs_invalidated_by_call); + get_call_reg_set_usage (insn, ®s_invalidated_by_this_call, false); for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++) if ((TEST_HARD_REG_BIT (regs_invalidated_by_this_call, regno) || (targetm.hard_regno_call_part_clobbered - (regno, vd->e[regno].mode))) + (insn, regno, vd->e[regno].mode))) && (regno < set_regno || regno >= set_regno + set_nregs)) kill_value_regno (regno, 1, vd); diff --git a/gcc/reginfo.c b/gcc/reginfo.c index 33befa5..df789b5 100644 --- a/gcc/reginfo.c +++ b/gcc/reginfo.c @@ -639,7 +639,7 @@ choose_hard_reg_mode (unsigned int regno ATTRIBUTE_UNUSED, if (hard_regno_nregs (regno, mode) == nregs && targetm.hard_regno_mode_ok (regno, mode) && (!call_saved - || !targetm.hard_regno_call_part_clobbered (regno, mode)) + || !targetm.hard_regno_call_part_clobbered (NULL, regno, mode)) && maybe_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (found_mode))) found_mode = mode; @@ -647,7 +647,7 @@ choose_hard_reg_mode (unsigned int regno ATTRIBUTE_UNUSED, if (hard_regno_nregs (regno, mode) == nregs && targetm.hard_regno_mode_ok (regno, mode) && (!call_saved - || !targetm.hard_regno_call_part_clobbered (regno, mode)) + || !targetm.hard_regno_call_part_clobbered (NULL, regno, mode)) && maybe_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (found_mode))) found_mode = mode; @@ 
-655,7 +655,7 @@ choose_hard_reg_mode (unsigned int regno ATTRIBUTE_UNUSED, if (hard_regno_nregs (regno, mode) == nregs && targetm.hard_regno_mode_ok (regno, mode) && (!call_saved - || !targetm.hard_regno_call_part_clobbered (regno, mode)) + || !targetm.hard_regno_call_part_clobbered (NULL, regno, mode)) && maybe_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (found_mode))) found_mode = mode; @@ -663,7 +663,7 @@ choose_hard_reg_mode (unsigned int regno ATTRIBUTE_UNUSED, if (hard_regno_nregs (regno, mode) == nregs && targetm.hard_regno_mode_ok (regno, mode) && (!call_saved - || !targetm.hard_regno_call_part_clobbered (regno, mode)) + || !targetm.hard_regno_call_part_clobbered (NULL, regno, mode)) && maybe_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (found_mode))) found_mode = mode; @@ -677,7 +677,7 @@ choose_hard_reg_mode (unsigned int regno ATTRIBUTE_UNUSED, if (hard_regno_nregs (regno, mode) == nregs && targetm.hard_regno_mode_ok (regno, mode) && (!call_saved - || !targetm.hard_regno_call_part_clobbered (regno, mode))) + || !targetm.hard_regno_call_part_clobbered (NULL, regno, mode))) return mode; } diff --git a/gcc/regrename.c b/gcc/regrename.c index 8424093..5bee9b7 100644 --- a/gcc/regrename.c +++ b/gcc/regrename.c @@ -339,9 +339,9 @@ check_new_reg_p (int reg ATTRIBUTE_UNUSED, int new_reg, && ! DEBUG_INSN_P (tmp->insn)) || (this_head->need_caller_save_reg && ! (targetm.hard_regno_call_part_clobbered - (reg, GET_MODE (*tmp->loc))) + (NULL, reg, GET_MODE (*tmp->loc))) && (targetm.hard_regno_call_part_clobbered - (new_reg, GET_MODE (*tmp->loc))))) + (NULL, new_reg, GET_MODE (*tmp->loc))))) return false; return true; diff --git a/gcc/regs.h b/gcc/regs.h index f143cbd..35cf969 100644 --- a/gcc/regs.h +++ b/gcc/regs.h @@ -385,6 +385,6 @@ range_in_hard_reg_set_p (const HARD_REG_SET set, unsigned regno, int nregs) /* Get registers used by given function call instruction. 
*/ extern bool get_call_reg_set_usage (rtx_insn *insn, HARD_REG_SET *reg_set, - HARD_REG_SET default_set); + bool default_to_used); #endif /* GCC_REGS_H */ diff --git a/gcc/reload.c b/gcc/reload.c index 88299a8..b26340c 100644 --- a/gcc/reload.c +++ b/gcc/reload.c @@ -6912,13 +6912,13 @@ find_equiv_reg (rtx goal, rtx_insn *insn, enum reg_class rclass, int other, if (regno >= 0 && regno < FIRST_PSEUDO_REGISTER) for (i = 0; i < nregs; ++i) if (call_used_regs[regno + i] - || targetm.hard_regno_call_part_clobbered (regno + i, mode)) + || targetm.hard_regno_call_part_clobbered (p, regno + i, mode)) return 0; if (valueno >= 0 && valueno < FIRST_PSEUDO_REGISTER) for (i = 0; i < valuenregs; ++i) if (call_used_regs[valueno + i] - || targetm.hard_regno_call_part_clobbered (valueno + i, + || targetm.hard_regno_call_part_clobbered (p, valueno + i, mode)) return 0; } diff --git a/gcc/reload1.c b/gcc/reload1.c index 3c0c9ff..f65e930 100644 --- a/gcc/reload1.c +++ b/gcc/reload1.c @@ -8289,7 +8289,8 @@ emit_reload_insns (struct insn_chain *chain) : out_regno + k); reg_reloaded_insn[regno + k] = insn; SET_HARD_REG_BIT (reg_reloaded_valid, regno + k); - if (targetm.hard_regno_call_part_clobbered (regno + k, + if (targetm.hard_regno_call_part_clobbered (insn, + regno + k, mode)) SET_HARD_REG_BIT (reg_reloaded_call_part_clobbered, regno + k); @@ -8369,7 +8370,8 @@ emit_reload_insns (struct insn_chain *chain) : in_regno + k); reg_reloaded_insn[regno + k] = insn; SET_HARD_REG_BIT (reg_reloaded_valid, regno + k); - if (targetm.hard_regno_call_part_clobbered (regno + k, + if (targetm.hard_regno_call_part_clobbered (insn, + regno + k, mode)) SET_HARD_REG_BIT (reg_reloaded_call_part_clobbered, regno + k); @@ -8485,7 +8487,7 @@ emit_reload_insns (struct insn_chain *chain) CLEAR_HARD_REG_BIT (reg_reloaded_dead, src_regno + k); SET_HARD_REG_BIT (reg_reloaded_valid, src_regno + k); if (targetm.hard_regno_call_part_clobbered - (src_regno + k, mode)) + (insn, src_regno + k, mode)) 
SET_HARD_REG_BIT (reg_reloaded_call_part_clobbered, src_regno + k); else diff --git a/gcc/resource.c b/gcc/resource.c index fdfab69..aea497f 100644 --- a/gcc/resource.c +++ b/gcc/resource.c @@ -669,7 +669,7 @@ mark_set_resources (rtx x, struct resources *res, int in_dest, res->cc = res->memory = 1; - get_call_reg_set_usage (call_insn, ®s, regs_invalidated_by_call); + get_call_reg_set_usage (call_insn, ®s, false); IOR_HARD_REG_SET (res->regs, regs); for (link = CALL_INSN_FUNCTION_USAGE (call_insn); @@ -1040,7 +1040,7 @@ mark_target_live_regs (rtx_insn *insns, rtx target_maybe_return, struct resource HARD_REG_SET regs_invalidated_by_this_call; get_call_reg_set_usage (real_insn, ®s_invalidated_by_this_call, - regs_invalidated_by_call); + false); /* CALL clobbers all call-used regs that aren't fixed except sp, ap, and fp. Do this before setting the result of the call live. */ diff --git a/gcc/sched-deps.c b/gcc/sched-deps.c index f89f282..fa4bdfe 100644 --- a/gcc/sched-deps.c +++ b/gcc/sched-deps.c @@ -3728,7 +3728,7 @@ deps_analyze_insn (struct deps_desc *deps, rtx_insn *insn) Since we only have a choice between 'might be clobbered' and 'definitely not clobbered', we must include all partly call-clobbered registers here. 
*/ - else if (targetm.hard_regno_call_part_clobbered (i, + else if (targetm.hard_regno_call_part_clobbered (insn, i, reg_raw_mode[i]) || TEST_HARD_REG_BIT (regs_invalidated_by_call, i)) SET_REGNO_REG_SET (reg_pending_clobbers, i); diff --git a/gcc/sel-sched.c b/gcc/sel-sched.c index 824f1ec..7b442ea 100644 --- a/gcc/sel-sched.c +++ b/gcc/sel-sched.c @@ -1103,7 +1103,7 @@ init_regs_for_mode (machine_mode mode) if (i >= 0) continue; - if (targetm.hard_regno_call_part_clobbered (cur_reg, mode)) + if (targetm.hard_regno_call_part_clobbered (NULL, cur_reg, mode)) SET_HARD_REG_BIT (sel_hrd.regs_for_call_clobbered[mode], cur_reg); @@ -1252,7 +1252,7 @@ mark_unavailable_hard_regs (def_t def, struct reg_rename *reg_rename_p, /* Exclude registers that are partially call clobbered. */ if (def->crosses_call - && !targetm.hard_regno_call_part_clobbered (regno, mode)) + && !targetm.hard_regno_call_part_clobbered (NULL, regno, mode)) AND_COMPL_HARD_REG_SET (reg_rename_p->available_for_renaming, sel_hrd.regs_for_call_clobbered[mode]); diff --git a/gcc/target.def b/gcc/target.def index 9e22423..e82fc30 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -5735,12 +5735,32 @@ DEFHOOK partly call-clobbered, and if a value of mode @var{mode} would be partly\n\ clobbered by a call. For example, if the low 32 bits of @var{regno} are\n\ preserved across a call but higher bits are clobbered, this hook should\n\ -return true for a 64-bit mode but false for a 32-bit mode.\n\ +return true for a 64-bit mode but false for a 32-bit mode. If @var{insn} is\n\ +not NULL then it is the call instruction being made. 
This allows the\n\ +function to return different values based on a function attribute or other\n\ +function-specific information.\n\ \n\ The default implementation returns false, which is correct\n\ for targets that don't have partly call-clobbered registers.", - bool, (unsigned int regno, machine_mode mode), - hook_bool_uint_mode_false) + bool, (rtx_insn *insn, unsigned int regno, machine_mode mode), + hook_bool_insn_uint_mode_false) + +DEFHOOK +( + check_part_clobbered, + "This hook should return true if the call instruction @var{insn} should obey\n\ + @code{TARGET_HARD_REGNO_CALL_PART_CLOBBERED}, and false if it should\n\ + ignore it.", + bool, (rtx_insn *insn), + hook_bool_rtx_insn_true) + +DEFHOOK +(used_reg_set, + "This hook should set @var{return_set} to @code{call_used_reg_set} if\n\ +@var{default_to_used} is true, and to @code{regs_invalidated_by_call} if it\n\ +is false.\n\ +The hook may look at @var{insn} to see if the default register set\n\ +should be modified due to attributes on the function being called.", + void, (rtx_insn *insn, HARD_REG_SET *return_set, bool default_to_used), + default_used_reg_set) /* Return the smallest number of different values for which it is best to use a jump-table instead of a tree of conditional branches. 
*/ diff --git a/gcc/targhooks.c b/gcc/targhooks.c index afd56f3..8242b2a 100644 --- a/gcc/targhooks.c +++ b/gcc/targhooks.c @@ -1928,7 +1928,7 @@ default_dwarf_frame_reg_mode (int regno) { machine_mode save_mode = reg_raw_mode[regno]; - if (targetm.hard_regno_call_part_clobbered (regno, save_mode)) + if (targetm.hard_regno_call_part_clobbered (NULL, regno, save_mode)) save_mode = choose_hard_reg_mode (regno, 1, true); return save_mode; } @@ -2370,4 +2370,15 @@ default_speculation_safe_value (machine_mode mode ATTRIBUTE_UNUSED, return result; } +void +default_used_reg_set (rtx_insn *insn ATTRIBUTE_UNUSED, + HARD_REG_SET *return_set, + bool default_to_used) +{ + if (default_to_used) + COPY_HARD_REG_SET (*return_set, call_used_reg_set); + else + COPY_HARD_REG_SET (*return_set, regs_invalidated_by_call); +} + #include "gt-targhooks.h" diff --git a/gcc/targhooks.h b/gcc/targhooks.h index f92ca5c..3f17efd 100644 --- a/gcc/targhooks.h +++ b/gcc/targhooks.h @@ -285,4 +285,6 @@ extern bool default_have_speculation_safe_value (bool); extern bool speculation_safe_value_not_needed (bool); extern rtx default_speculation_safe_value (machine_mode, rtx, rtx, rtx); +extern void default_used_reg_set (rtx_insn *, HARD_REG_SET *, bool); + #endif /* GCC_TARGHOOKS_H */ diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c index 5537fa6..81a052e 100644 --- a/gcc/var-tracking.c +++ b/gcc/var-tracking.c @@ -4901,8 +4901,7 @@ dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn) hard_reg_set_iterator hrsi; HARD_REG_SET invalidated_regs; - get_call_reg_set_usage (call_insn, &invalidated_regs, - regs_invalidated_by_call); + get_call_reg_set_usage (call_insn, &invalidated_regs, false); EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi) var_regno_delete (set, r);