From patchwork Wed May 30 09:35:38 2018
X-Patchwork-Submitter: Juerg Haefliger
X-Patchwork-Id: 922662
From: Juerg Haefliger
To: kernel-team@lists.ubuntu.com
Subject: [SRU][Xenial][PATCH 1/2] powerpc/64s: Improve RFI L1-D cache flush fallback
Date: Wed, 30 May 2018 11:35:38 +0200
Message-Id: <20180530093539.11917-2-juergh@canonical.com>
In-Reply-To: <20180530093539.11917-1-juergh@canonical.com>
References: <20180530093539.11917-1-juergh@canonical.com>
List-Id: Kernel team discussions

From: Nicholas Piggin

BugLink: https://bugs.launchpad.net/bugs/1744173

The fallback RFI flush is used when firmware does not provide a way to
flush the cache.
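For orientation, the flush pattern described in the paragraphs below can be sketched in plain C. This is a hypothetical user-space illustration, not the kernel's assembly: the buffer, the `displacement_flush` name, and the 64 KiB size in the usage note are assumptions; the real code runs with interrupts-off scratch registers and also staggers each load by 8 bytes within the line, which this sketch omits.

```c
#include <stddef.h>
#include <stdint.h>

#define L1D_LINE 128 /* POWER8/9 L1-D cache line size, mandatory per the patch */

/*
 * "Displacement flush": read one word from every cache line of a buffer
 * that covers the L1-D, so resident data is evicted. Like the patched
 * assembly, the loop is a linear pattern of independent loads, unrolled
 * 8x, so iterations = size / (128 * 8), i.e. size >> (7 + 3).
 */
static uint64_t displacement_flush(const unsigned char *buf, size_t l1d_size)
{
    volatile const unsigned char *p = buf;
    size_t iters = l1d_size >> (7 + 3); /* matches srdi r11,r11,(7 + 3) */
    uint64_t sink = 0;

    for (size_t i = 0; i < iters; i++) {
        for (int j = 0; j < 8; j++)
            sink += p[(size_t)j * L1D_LINE]; /* independent loads */
        p += 8 * L1D_LINE;
    }
    return sink; /* returned only so the loads are not optimized away */
}
```

For a hypothetical 64 KiB L1-D this performs 64 iterations of 8 loads, one per 128-byte line, with no load-to-load dependency chain, which is what buys the speedup the commit message measures.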
It's a "displacement flush" that evicts useful data by displacing it
with an uninteresting buffer.

The flush has to take care to work with implementation specific cache
replacement policies, so the recipe has been in flux. The initial slow
but conservative approach is to touch all lines of a congruence class,
with dependencies between each load. It has since been determined that
a linear pattern of loads without dependencies is sufficient, and is
significantly faster.

Measuring the speed of a null syscall with RFI fallback flush enabled
gives the relative improvement:

P8 - 1.83x
P9 - 1.75x

The flush also becomes simpler and more adaptable to different cache
geometries.

Signed-off-by: Nicholas Piggin
Signed-off-by: Michael Ellerman
(backported from bdcb1aefc5b3f7d0f1dc8b02673602bca2ff7a4b)
Signed-off-by: Gustavo Walbon
Signed-off-by: Juerg Haefliger
---
 arch/powerpc/include/asm/paca.h      |  3 +-
 arch/powerpc/kernel/asm-offsets.c    |  3 +-
 arch/powerpc/kernel/exceptions-64s.S | 76 +++++++++++++---------------
 arch/powerpc/kernel/setup_64.c       | 13 +----
 4 files changed, 39 insertions(+), 56 deletions(-)

diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 9e76e27d96c7..08ea3b49cfed 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -208,8 +208,7 @@ struct paca_struct {
 	 */
 	u64 exrfi[13] __aligned(0x80);
 	void *rfi_flush_fallback_area;
-	u64 l1d_flush_congruence;
-	u64 l1d_flush_sets;
+	u64 l1d_flush_size;
 #endif
 };
 
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 46acc17dfed1..ec3ec682072c 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -245,8 +245,7 @@ int main(void)
 	DEFINE(PACA_IN_MCE, offsetof(struct paca_struct, in_mce));
 	DEFINE(PACA_RFI_FLUSH_FALLBACK_AREA, offsetof(struct paca_struct, rfi_flush_fallback_area));
 	DEFINE(PACA_EXRFI, offsetof(struct paca_struct, exrfi));
-	DEFINE(PACA_L1D_FLUSH_CONGRUENCE, offsetof(struct paca_struct, l1d_flush_congruence));
-	DEFINE(PACA_L1D_FLUSH_SETS, offsetof(struct paca_struct, l1d_flush_sets));
+	DEFINE(PACA_L1D_FLUSH_SIZE, offsetof(struct paca_struct, l1d_flush_size));
 #endif
 	DEFINE(PACAHWCPUID, offsetof(struct paca_struct, hw_cpu_id));
 	DEFINE(PACAKEXECSTATE, offsetof(struct paca_struct, kexec_state));
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 218bff6ea243..088c930f5554 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1608,39 +1608,37 @@ rfi_flush_fallback:
 	std	r9,PACA_EXRFI+EX_R9(r13)
 	std	r10,PACA_EXRFI+EX_R10(r13)
 	std	r11,PACA_EXRFI+EX_R11(r13)
-	std	r12,PACA_EXRFI+EX_R12(r13)
-	std	r8,PACA_EXRFI+EX_R13(r13)
 	mfctr	r9
 	ld	r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
-	ld	r11,PACA_L1D_FLUSH_SETS(r13)
-	ld	r12,PACA_L1D_FLUSH_CONGRUENCE(r13)
-	/*
-	 * The load adresses are at staggered offsets within cachelines,
-	 * which suits some pipelines better (on others it should not
-	 * hurt).
-	 */
-	addi	r12,r12,8
+	ld	r11,PACA_L1D_FLUSH_SIZE(r13)
+	srdi	r11,r11,(7 + 3) /* 128 byte lines, unrolled 8x */
 	mtctr	r11
 	DCBT_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */
 
 	/* order ld/st prior to dcbt stop all streams with flushing */
 	sync
-1:	li	r8,0
-	.rept	8 /* 8-way set associative */
-	ldx	r11,r10,r8
-	add	r8,r8,r12
-	xor	r11,r11,r11	// Ensure r11 is 0 even if fallback area is not
-	add	r8,r8,r11	// Add 0, this creates a dependency on the ldx
-	.endr
-	addi	r10,r10,128 /* 128 byte cache line */
+
+	/*
+	 * The load adresses are at staggered offsets within cachelines,
+	 * which suits some pipelines better (on others it should not
+	 * hurt).
+	 */
+1:
+	ld	r11,(0x80 + 8)*0(r10)
+	ld	r11,(0x80 + 8)*1(r10)
+	ld	r11,(0x80 + 8)*2(r10)
+	ld	r11,(0x80 + 8)*3(r10)
+	ld	r11,(0x80 + 8)*4(r10)
+	ld	r11,(0x80 + 8)*5(r10)
+	ld	r11,(0x80 + 8)*6(r10)
+	ld	r11,(0x80 + 8)*7(r10)
+	addi	r10,r10,0x80*8
 	bdnz	1b
 
 	mtctr	r9
 	ld	r9,PACA_EXRFI+EX_R9(r13)
 	ld	r10,PACA_EXRFI+EX_R10(r13)
 	ld	r11,PACA_EXRFI+EX_R11(r13)
-	ld	r12,PACA_EXRFI+EX_R12(r13)
-	ld	r8,PACA_EXRFI+EX_R13(r13)
 	GET_SCRATCH0(r13);
 	rfid
@@ -1651,39 +1649,37 @@ hrfi_flush_fallback:
 	std	r9,PACA_EXRFI+EX_R9(r13)
 	std	r10,PACA_EXRFI+EX_R10(r13)
 	std	r11,PACA_EXRFI+EX_R11(r13)
-	std	r12,PACA_EXRFI+EX_R12(r13)
-	std	r8,PACA_EXRFI+EX_R13(r13)
 	mfctr	r9
 	ld	r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
-	ld	r11,PACA_L1D_FLUSH_SETS(r13)
-	ld	r12,PACA_L1D_FLUSH_CONGRUENCE(r13)
-	/*
-	 * The load adresses are at staggered offsets within cachelines,
-	 * which suits some pipelines better (on others it should not
-	 * hurt).
-	 */
-	addi	r12,r12,8
+	ld	r11,PACA_L1D_FLUSH_SIZE(r13)
+	srdi	r11,r11,(7 + 3) /* 128 byte lines, unrolled 8x */
 	mtctr	r11
 	DCBT_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */
 
 	/* order ld/st prior to dcbt stop all streams with flushing */
 	sync
-1:	li	r8,0
-	.rept	8 /* 8-way set associative */
-	ldx	r11,r10,r8
-	add	r8,r8,r12
-	xor	r11,r11,r11	// Ensure r11 is 0 even if fallback area is not
-	add	r8,r8,r11	// Add 0, this creates a dependency on the ldx
-	.endr
-	addi	r10,r10,128 /* 128 byte cache line */
+
+	/*
+	 * The load adresses are at staggered offsets within cachelines,
+	 * which suits some pipelines better (on others it should not
+	 * hurt).
+	 */
+1:
+	ld	r11,(0x80 + 8)*0(r10)
+	ld	r11,(0x80 + 8)*1(r10)
+	ld	r11,(0x80 + 8)*2(r10)
+	ld	r11,(0x80 + 8)*3(r10)
+	ld	r11,(0x80 + 8)*4(r10)
+	ld	r11,(0x80 + 8)*5(r10)
+	ld	r11,(0x80 + 8)*6(r10)
+	ld	r11,(0x80 + 8)*7(r10)
+	addi	r10,r10,0x80*8
 	bdnz	1b
 
 	mtctr	r9
 	ld	r9,PACA_EXRFI+EX_R9(r13)
 	ld	r10,PACA_EXRFI+EX_R10(r13)
 	ld	r11,PACA_EXRFI+EX_R11(r13)
-	ld	r12,PACA_EXRFI+EX_R12(r13)
-	ld	r8,PACA_EXRFI+EX_R13(r13)
 	GET_SCRATCH0(r13);
 	hrfid
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 8a5d92f12d1a..70dfe49868e1 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -982,19 +982,8 @@ static void init_fallback_flush(void)
 	memset(l1d_flush_fallback_area, 0, l1d_size * 2);
 
 	for_each_possible_cpu(cpu) {
-		/*
-		 * The fallback flush is currently coded for 8-way
-		 * associativity. Different associativity is possible, but it
-		 * will be treated as 8-way and may not evict the lines as
-		 * effectively.
-		 *
-		 * 128 byte lines are mandatory.
-		 */
-		u64 c = l1d_size / 8;
-
 		paca[cpu].rfi_flush_fallback_area = l1d_flush_fallback_area;
-		paca[cpu].l1d_flush_congruence = c;
-		paca[cpu].l1d_flush_sets = c / 128;
+		paca[cpu].l1d_flush_size = l1d_size;
 	}
 }

From patchwork Wed May 30 09:35:39 2018
X-Patchwork-Submitter: Juerg Haefliger
X-Patchwork-Id: 922663
From: Juerg Haefliger
To: kernel-team@lists.ubuntu.com
Subject: [SRU][Xenial][PATCH 2/2] UBUNTU: SAUCE: rfi-flush: Make it possible to call setup_rfi_flush() again
Date: Wed, 30 May 2018 11:35:39 +0200
Message-Id: <20180530093539.11917-3-juergh@canonical.com>
In-Reply-To: <20180530093539.11917-1-juergh@canonical.com>
References: <20180530093539.11917-1-juergh@canonical.com>
List-Id: Kernel team discussions

From: Michael Ellerman

BugLink: https://bugs.launchpad.net/bugs/1744173

For PowerVM migration we want to be able to call setup_rfi_flush()
again after we've migrated the partition.

To support that we need to check that we're not trying to allocate the
fallback flush area after memblock has gone away. If so, we just fail;
we don't support migrating from a patched to an unpatched machine. Or
rather, we do support it, but there will be no RFI flush enabled on
the destination.

Signed-off-by: Michael Ellerman
Signed-off-by: Breno Leitao
Signed-off-by: Juerg Haefliger
---
 arch/powerpc/kernel/setup_64.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 70dfe49868e1..efc6371d62b3 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -961,14 +961,22 @@ void setup_stf_barrier(void)
 	stf_barrier_enable(enable);
 }
 
-static void init_fallback_flush(void)
+static bool init_fallback_flush(void)
 {
 	u64 l1d_size, limit;
 	int cpu;
 
 	/* Only allocate the fallback flush area once (at boot time). */
 	if (l1d_flush_fallback_area)
-		return;
+		return true;
+
+	/*
+	 * Once the slab allocator is up it's too late to allocate the fallback
+	 * flush area, so return an error. This could happen if we migrated from
+	 * a patched machine to an unpatched machine.
+	 */
+	if (slab_is_available())
+		return false;
 
 	l1d_size = ppc64_caches.dsize;
 	limit = min(safe_stack_limit(), ppc64_rma_size);
@@ -985,13 +993,19 @@ static void init_fallback_flush(void)
 		paca[cpu].rfi_flush_fallback_area = l1d_flush_fallback_area;
 		paca[cpu].l1d_flush_size = l1d_size;
 	}
+
+	return true;
 }
 
 void setup_rfi_flush(enum l1d_flush_type types, bool enable)
 {
 	if (types & L1D_FLUSH_FALLBACK) {
-		pr_info("rfi-flush: fallback displacement flush available\n");
-		init_fallback_flush();
+		if (init_fallback_flush())
+			pr_info("rfi-flush: Using fallback displacement flush\n");
+		else {
+			pr_warn("rfi-flush: Error unable to use fallback displacement flush!\n");
+			types &= ~L1D_FLUSH_FALLBACK;
+		}
 	}
 
 	if (types & L1D_FLUSH_ORI)
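The control flow this patch introduces can be summarized in a stand-alone C sketch. Everything here is a hypothetical stand-in, not kernel code: `slab_up` models slab_is_available(), the allocation is faked with a sentinel pointer, and `setup_rfi_flush_fallback` condenses just the L1D_FLUSH_FALLBACK branch of setup_rfi_flush().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-ins for kernel state (assumptions for illustration only). */
static void *l1d_flush_fallback_area;
static bool slab_up;

static bool init_fallback_flush(void)
{
    if (l1d_flush_fallback_area) /* already allocated at boot time */
        return true;
    if (slab_up)                 /* memblock is gone; too late to allocate */
        return false;
    l1d_flush_fallback_area = (void *)1; /* pretend the memblock alloc worked */
    return true;
}

/* Returns true if the fallback displacement flush ends up enabled. */
static bool setup_rfi_flush_fallback(void)
{
    if (init_fallback_flush()) {
        puts("rfi-flush: Using fallback displacement flush");
        return true;
    }
    puts("rfi-flush: Error unable to use fallback displacement flush!");
    return false;
}
```

The key property, per the commit message, is that a second call after migration succeeds as long as the area was allocated at boot; only a machine that never allocated it (migrated from an unpatched source) hits the failure path.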