From patchwork Fri Aug 5 07:18:43 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Pinski
X-Patchwork-Id: 656049
From: Andrew Pinski
Date: Fri, 5 Aug 2016 00:18:43 -0700
Subject: [PATCH/AARCH64] Improve ThunderX code generation slightly with load/store pair
To: GCC Patches

Hi,
  On ThunderX, a load (or store) pair that accesses two 32-bit words
is slower in some cases than two separate loads (or stores).  On some
internal benchmarks, avoiding the pair gives a 2-5% improvement.

This patch disables forming load/store pairs for SImode when we are
tuning for ThunderX, unless we are optimizing for size.
I used the tuning flags route so it can be overridden later if needed,
or reused by anyone who wants the same behavior for their core.

OK?  Bootstrapped and tested on aarch64-linux-gnu with no regressions.

Thanks,
Andrew Pinski

ChangeLog:
* config/aarch64/aarch64-tuning-flags.def (slow_ldpw): New tuning option.
* config/aarch64/aarch64.c (thunderx_tunings): Enable
AARCH64_EXTRA_TUNE_SLOW_LDPW.
(aarch64_operands_ok_for_ldpstp): Return false for SImode when
AARCH64_EXTRA_TUNE_SLOW_LDPW is set and we are not optimizing for size.
(aarch64_operands_adjust_ok_for_ldpstp): Likewise.

Index: gcc/config/aarch64/aarch64-tuning-flags.def
===================================================================
--- gcc/config/aarch64/aarch64-tuning-flags.def	(revision 239150)
+++ gcc/config/aarch64/aarch64-tuning-flags.def	(working copy)
@@ -29,3 +29,4 @@
      AARCH64_TUNE_ to give an enum name. */
 
 AARCH64_EXTRA_TUNING_OPTION ("rename_fma_regs", RENAME_FMA_REGS)
+AARCH64_EXTRA_TUNING_OPTION ("slow_ldpw", SLOW_LDPW)
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c	(revision 239150)
+++ gcc/config/aarch64/aarch64.c	(working copy)
@@ -712,7 +712,7 @@
   0,	/* max_case_values.  */
   0,	/* cache_line_size.  */
   tune_params::AUTOPREFETCHER_OFF,	/* autoprefetcher_model.  */
-  (AARCH64_EXTRA_TUNE_NONE)	/* tune_flags.  */
+  (AARCH64_EXTRA_TUNE_SLOW_LDPW)	/* tune_flags.  */
 };
 
 static const struct tune_params xgene1_tunings =
@@ -13574,6 +13574,11 @@
   enum reg_class rclass_1, rclass_2;
   rtx mem_1, mem_2, reg_1, reg_2, base_1, base_2, offset_1, offset_2;
 
+  if (mode == SImode
+      && (aarch64_tune_params.extra_tuning_flags & AARCH64_EXTRA_TUNE_SLOW_LDPW)
+      && !optimize_size)
+    return false;
+
   if (load)
     {
       mem_1 = operands[1];
@@ -13673,6 +13678,11 @@
   rtx mem_1, mem_2, mem_3, mem_4, reg_1, reg_2, reg_3, reg_4;
   rtx base_1, base_2, base_3, base_4, offset_1, offset_2, offset_3, offset_4;
 
+  if (mode == SImode
+      && (aarch64_tune_params.extra_tuning_flags & AARCH64_EXTRA_TUNE_SLOW_LDPW)
+      && !optimize_size)
+    return false;
+
   if (load)
     {
       reg_1 = operands[0];
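
For anyone who has not used the tuning-flags route before, here is a
small standalone sketch in C of the general pattern.  It is only an
illustration under assumptions, not GCC's code: the option list is
written inline instead of living in a .def file, and every name in it
(EXTRA_TUNING_OPTIONS, TUNE_*, tune_params_sketch, pair_ok,
thunderx_like, generic_like) is hypothetical.  It shows each option
becoming a one-bit mask and the per-core mask being tested with `&`.
The enumerator alone is a nonzero constant, so testing it without the
mask would be true for every core.

```c
#include <stdbool.h>

/* Hypothetical inline stand-in for a .def file: one entry per option.  */
#define EXTRA_TUNING_OPTIONS(OPTION) \
  OPTION (RENAME_FMA_REGS)           \
  OPTION (SLOW_LDPW)

/* First expansion: give every option a consecutive index.  */
#define AS_INDEX(name) TUNE_##name##_index,
enum tune_index
{
  EXTRA_TUNING_OPTIONS (AS_INDEX)
  TUNE_index_END
};
#undef AS_INDEX

/* Second expansion: turn each index into a single-bit mask.  */
#define AS_FLAG(name) TUNE_##name = 1 << TUNE_##name##_index,
enum tune_flag
{
  TUNE_NONE = 0,
  EXTRA_TUNING_OPTIONS (AS_FLAG)
};
#undef AS_FLAG

/* Hypothetical per-core tuning record holding the enabled flags.  */
struct tune_params_sketch
{
  unsigned int extra_tuning_flags;
};

static const struct tune_params_sketch generic_like = { TUNE_NONE };
static const struct tune_params_sketch thunderx_like = { TUNE_SLOW_LDPW };

/* Hypothetical access widths: 32-bit (SI) and 64-bit (DI).  */
enum access_mode { MODE_SI, MODE_DI };

/* Return false when a 32-bit pair should not be formed for this core.
   The flag is tested against the core's mask, and the check is skipped
   when optimizing for size.  */
static bool
pair_ok (const struct tune_params_sketch *tune, enum access_mode mode,
         bool optimize_size)
{
  if (mode == MODE_SI
      && (tune->extra_tuning_flags & TUNE_SLOW_LDPW)
      && !optimize_size)
    return false;
  return true;
}
```

With these definitions, pair_ok (&thunderx_like, MODE_SI, false) rejects
the pair, while the same call with generic_like, with MODE_DI, or with
optimize_size set accepts it.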