From patchwork Fri Oct 25 17:39:09 2019
X-Patchwork-Submitter: Craig Blackmore
X-Patchwork-Id: 1184353
From: Craig Blackmore
To: gcc-patches@gcc.gnu.org
Cc: jimw@sifive.com, Ofer.Shinaar@wdc.com, Nidal.Faour@wdc.com,
 kito.cheng@gmail.com, law@redhat.com, Craig Blackmore
Subject: [PATCH v2 0/2] RISC-V: Allow more load/stores to be compressed
Date: Fri, 25 Oct 2019 18:39:09 +0100
Message-Id: <1572025151-22783-1-git-send-email-craig.blackmore@embecosm.com>

Hi Kito,

Thank you for taking the time to review my patch. I am posting an updated
patchset that takes your comments into account.

On 18/09/2019 11:01, Kito Cheng wrote:
> Hi Craig:
>
> Some general review comments:
> - Split the new pass into a new file.
> - Add a new option to enable/disable this pass.
> - Could you extend this patch to support lw/sw/ld/sd/flw/fsw/fld/fsd?
>   I think there is a lot of logic in common across the compressed
>   load/store instructions, and I would like to see them all supported
>   at once.

I agree the patch could be extended to the other load/store instructions,
but unfortunately I don't have the time to do this at the moment. Can the
lw/sw support be merged and the others added later?

> - Do you have experimental data about doing this after register
>   allocation/reload? I'd prefer doing such an optimization after RA,
>   because we can accurately estimate how many bytes we gain. I guess
>   the problem is that RA didn't assign a suitable src/dest reg or base
>   reg, and that is what increases code size?

I don't think it is feasible to move the pass after reload, because the
pass requires a new register to be allocated for the new base. Before
reload, we do not know whether the base reg will be a compressed register
or not.
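To make the constraint concrete, here is a small hand-written example
(hypothetical - the type and function names are made up, and this is not
code taken from the benchmarks) of the pattern the pass targets:

    struct state
    {
      int pad[64];   /* Puts a[] at byte offset 256, beyond the 0..124
                        offset range addressable by c.lw/c.sw.  */
      int a[4];
    };

    int
    sum4 (struct state *s)
    {
      /* Four loads off the same base register at offsets 256..268, so
         each one needs the 4-byte lw encoding.  The pass rewrites them
         as 'new base = s + 256' plus offsets 0..12; each load can then
         use the 2-byte c.lw encoding, provided register allocation
         places the new base and the destinations in compressed
         registers (x8-x15).  */
      return s->a[0] + s->a[1] + s->a[2] + s->a[3];
    }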
> On Fri, Sep 13, 2019 at 12:20 AM Craig Blackmore wrote:
>>
>> This patch aims to allow more load/store instructions to be compressed
>> by replacing a load/store of 'base register + large offset' with a new
>> load/store of 'new base + small offset'. If the new base gets stored in
>> a compressed register, then the new load/store can be compressed. Since
>> there is an overhead in creating the new base, this change is only
>> attempted when 'base register' is referenced in at least 4 load/stores
>> in a basic block.
>>
>> The optimization is implemented in a new RISC-V specific pass called
>> shorten_memrefs which is enabled for RVC targets. It has been developed
>> for the 32-bit lw/sw instructions but could also be extended to 64-bit
>> ld/sd in future.
>>
>> The patch saves 164 bytes (0.3%) on a proprietary application, which
>> shrinks from 59450 bytes to 59286 bytes when compiled for rv32imc bare
>> metal with -Os. On the Embench benchmark suite
>> (https://www.embench.org/) we see code size reductions of up to
>> 18 bytes (0.7%), and only two cases where code size increases slightly,
>> by 2 bytes each:
>>
>> Embench results (.text size in bytes, excluding .rodata)
>>
>> Benchmark      Without patch  With patch  Diff
>> aha-mont64              1052        1052     0
>> crc32                    232         232     0
>> cubic                   2446        2448     2
>> edn                     1454        1450    -4
>> huffbench               1642        1642     0
>> matmult-int              420         420     0
>> minver                  1056        1056     0
>> nbody                    714         714     0
>> nettle-aes              2888        2884    -4
>> nettle-sha256           5566        5564    -2
>> nsichneu               15052       15052     0
>> picojpeg                8078        8078     0
>> qrduino                 6140        6140     0
>> sglib-combined          2444        2444     0
>> slre                    2438        2420   -18
>> st                       880         880     0
>> statemate               3842        3842     0
>> ud                       702         702     0
>> wikisort                4278        4280     2
>> ----------------------------------------------
>> Total                  61324       61300   -24
>>
>> The patch has been tested on the following bare metal targets using
>> QEMU and there were no regressions:
>>
>>   rv32i
>>   rv32iac
>>   rv32im
>>   rv32imac
>>   rv32imafc
>>   rv64imac
>>   rv64imafdc
>>
>> We noticed that sched2 undoes some of the address changes made by this
>> optimization and consequently increases code size, so this patch adds a
>> check in sched-deps.c to avoid replacements that are expected to
>> increase code size when not optimizing for speed. Since this change
>> touches target-independent code, the patch has been bootstrapped and
>> tested on x86 with no regressions.
>>
>> diff --git a/gcc/sched-deps.c b/gcc/sched-deps.c
>> index 52db3cc..92a0893 100644
>> --- a/gcc/sched-deps.c
>> +++ b/gcc/sched-deps.c
>> @@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "sched-int.h"
>>  #include "params.h"
>>  #include "cselib.h"
>> +#include "predict.h"
>>
>>  #ifdef INSN_SCHEDULING
>>
>> @@ -4707,6 +4708,15 @@ attempt_change (struct mem_inc_info *mii, rtx new_addr)
>>    rtx mem = *mii->mem_loc;
>>    rtx new_mem;
>>
>> +  /* When not optimizing for speed, avoid changes that are expected to
>> +     make code size larger.  */
>> +  addr_space_t as = MEM_ADDR_SPACE (mem);
>> +  bool speed = optimize_bb_for_speed_p (BLOCK_FOR_INSN (mii->mem_insn));
>> +  int old_cost = address_cost (XEXP (mem, 0), GET_MODE (mem), as, speed);
>> +  int new_cost = address_cost (new_addr, GET_MODE (mem), as, speed);
>> +  if (new_cost > old_cost && !speed)

> I think !speed should not be needed here - it would mean address_cost is
> incorrect if worse code is generated. But this change will affect all
> other targets, so I think it would be better to split it into another
> patch and CC the related reviewers.

I have removed !speed in the updated patch and CC'd Jeff Law. Jeff - please
could you review my change to sched-deps.c in patch 2/2?

Thanks,
Craig

---

Craig Blackmore (2):
  RISC-V: Add shorten_memrefs pass
  sched-deps.c: Avoid replacing address if it increases address cost

 gcc/config.gcc                           |   2 +-
 gcc/config/riscv/riscv-passes.def        |  20 +++
 gcc/config/riscv/riscv-protos.h          |   2 +
 gcc/config/riscv/riscv-shorten-memrefs.c | 188 +++++++++++++++++++++++
 gcc/config/riscv/riscv.c                 |  86 ++++++++++-
 gcc/config/riscv/riscv.h                 |   5 +
 gcc/config/riscv/riscv.opt               |   6 +
 gcc/config/riscv/t-riscv                 |   5 +
 gcc/doc/invoke.texi                      |  10 ++
 gcc/sched-deps.c                         |   9 ++
 10 files changed, 327 insertions(+), 6 deletions(-)
 create mode 100644 gcc/config/riscv/riscv-passes.def
 create mode 100644 gcc/config/riscv/riscv-shorten-memrefs.c
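P.S. For reference, here is what the guard in attempt_change reduces to
once !speed is dropped - a sketch reconstructed for this cover letter;
patch 2/2 has the authoritative version:

    addr_space_t as = MEM_ADDR_SPACE (mem);
    bool speed = optimize_bb_for_speed_p (BLOCK_FOR_INSN (mii->mem_insn));
    int old_cost = address_cost (XEXP (mem, 0), GET_MODE (mem), as, speed);
    int new_cost = address_cost (new_addr, GET_MODE (mem), as, speed);
    /* Reject the replacement whenever the target's cost model rates the
       new address as strictly more expensive than the old one.  */
    if (new_cost > old_cost)
      return NULL_RTX;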