From patchwork Fri Feb 3 07:18:33 2023
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 1736725
From: Nicholas Piggin
To: Andrew Morton
Cc: linux-arch@vger.kernel.org, Rik van Riel, Will Deacon, Peter Zijlstra,
 Linus Torvalds, Dave Hansen, linuxppc-dev@lists.ozlabs.org,
 Nicholas Piggin, linux-mm@kvack.org, Andy Lutomirski, Catalin Marinas,
 Nadav Amit
Subject: [PATCH v7 1/5] kthread: simplify kthread_use_mm refcounting
Date: Fri, 3 Feb 2023 17:18:33 +1000
Message-Id: <20230203071837.1136453-2-npiggin@gmail.com>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20230203071837.1136453-1-npiggin@gmail.com>
References: <20230203071837.1136453-1-npiggin@gmail.com>
List-Id: Linux on PowerPC Developers Mail List

Remove the special case avoiding refcounting when the mm to be used is
the same as the kernel thread's active (lazy tlb) mm. kthread_use_mm()
should not be such a performance critical path that this matters much.
This simplifies a later change to lazy tlb mm refcounting.
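
To illustrate why the special case is safe to drop, here is a minimal
userspace sketch (not kernel code). The names mm_model, task_model,
use_mm_old(), use_mm_new() and final_count() are hypothetical stand-ins
invented for this illustration, and the model ignores locking, IRQ
state and memory barriers entirely. It only shows that always grabbing
the new mm and always dropping the old active_mm leaves the same
reference count as the special-cased version.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model: only the fields needed for the illustration. */
struct mm_model {
	int mm_count;
};

struct task_model {
	struct mm_model *mm;
	struct mm_model *active_mm;
};

static void grab(struct mm_model *mm) { mm->mm_count++; }
static void drop(struct mm_model *mm) { mm->mm_count--; }

/* Before the patch: skip refcounting when mm is already the lazy active_mm. */
static void use_mm_old(struct task_model *tsk, struct mm_model *mm)
{
	struct mm_model *active_mm = tsk->active_mm;

	if (active_mm != mm) {
		grab(mm);
		tsk->active_mm = mm;
	}
	tsk->mm = mm;
	if (active_mm != mm)
		drop(active_mm);
}

/* After the patch: always grab the new mm, always drop the old active_mm. */
static void use_mm_new(struct task_model *tsk, struct mm_model *mm)
{
	struct mm_model *active_mm = tsk->active_mm;

	grab(mm);
	tsk->active_mm = mm;
	tsk->mm = mm;
	drop(active_mm);
}

/*
 * Run one use_mm call and return the target mm's final mm_count.
 * same_mm != 0 means the target is already the task's lazy active_mm.
 */
static int final_count(int use_new, int same_mm)
{
	struct mm_model lazy = { .mm_count = 1 };	/* held via active_mm */
	struct mm_model other = { .mm_count = 0 };
	struct mm_model *target = same_mm ? &lazy : &other;
	struct task_model tsk = { .mm = NULL, .active_mm = &lazy };

	grab(target);		/* the caller's own reference on the target */
	if (use_new)
		use_mm_new(&tsk, target);
	else
		use_mm_old(&tsk, target);
	return target->mm_count;
}
```

In this model final_count() returns 2 for all four combinations of
old/new scheme and same/different mm: one reference held by the caller
and one held through ->active_mm. This is consistent with the claim
above that the special case only avoided a grab/drop pair.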

Acked-by: Linus Torvalds
Signed-off-by: Nicholas Piggin
---
 kernel/kthread.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index f97fd01a2932..7424a1839e9a 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -1410,14 +1410,13 @@ void kthread_use_mm(struct mm_struct *mm)
 	WARN_ON_ONCE(!(tsk->flags & PF_KTHREAD));
 	WARN_ON_ONCE(tsk->mm);
 
+	mmgrab(mm);
+
 	task_lock(tsk);
 	/* Hold off tlb flush IPIs while switching mm's */
 	local_irq_disable();
 	active_mm = tsk->active_mm;
-	if (active_mm != mm) {
-		mmgrab(mm);
-		tsk->active_mm = mm;
-	}
+	tsk->active_mm = mm;
 	tsk->mm = mm;
 	membarrier_update_current_mm(mm);
 	switch_mm_irqs_off(active_mm, mm, tsk);
@@ -1434,12 +1433,9 @@ void kthread_use_mm(struct mm_struct *mm)
 	 * memory barrier after storing to tsk->mm, before accessing
 	 * user-space memory. A full memory barrier for membarrier
 	 * {PRIVATE,GLOBAL}_EXPEDITED is implicitly provided by
-	 * mmdrop(), or explicitly with smp_mb().
+	 * mmdrop().
 	 */
-	if (active_mm != mm)
-		mmdrop(active_mm);
-	else
-		smp_mb();
+	mmdrop(active_mm);
 }
 EXPORT_SYMBOL_GPL(kthread_use_mm);
 

From patchwork Fri Feb 3 07:18:34 2023
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 1736727
From: Nicholas Piggin
To: Andrew Morton
Subject: [PATCH v7 2/5] lazy tlb: introduce lazy tlb mm refcount helper
 functions
Date: Fri, 3 Feb 2023 17:18:34 +1000
Message-Id: <20230203071837.1136453-3-npiggin@gmail.com>
In-Reply-To: <20230203071837.1136453-1-npiggin@gmail.com>
References: <20230203071837.1136453-1-npiggin@gmail.com>

Add explicit _lazy_tlb annotated functions for lazy tlb mm refcounting.
This makes the lazy tlb mm references more obvious, and allows the
refcounting scheme to be modified in later changes. There is no
functional change with this patch.

Acked-by: Linus Torvalds
Signed-off-by: Nicholas Piggin
---
 arch/arm/mach-rpc/ecard.c            |  2 +-
 arch/powerpc/kernel/smp.c            |  2 +-
 arch/powerpc/mm/book3s64/radix_tlb.c |  4 ++--
 fs/exec.c                            |  2 +-
 include/linux/sched/mm.h             | 16 ++++++++++++++++
 kernel/cpu.c                         |  2 +-
 kernel/exit.c                        |  2 +-
 kernel/kthread.c                     | 12 ++++++++++--
 kernel/sched/core.c                  | 15 ++++++++-------
 9 files changed, 41 insertions(+), 16 deletions(-)

diff --git a/arch/arm/mach-rpc/ecard.c b/arch/arm/mach-rpc/ecard.c
index 53813f9464a2..c30df1097c52 100644
--- a/arch/arm/mach-rpc/ecard.c
+++ b/arch/arm/mach-rpc/ecard.c
@@ -253,7 +253,7 @@ static int ecard_init_mm(void)
 	current->mm = mm;
 	current->active_mm = mm;
 	activate_mm(active_mm, mm);
-	mmdrop(active_mm);
+	mmdrop_lazy_tlb(active_mm);
 	ecard_init_pgtables(mm);
 	return 0;
 }
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 6b90f10a6c81..7db6b3faea65 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1611,7 +1611,7 @@ void start_secondary(void *unused)
 	if (IS_ENABLED(CONFIG_PPC32))
 		setup_kup();
 
-	mmgrab(&init_mm);
+	mmgrab_lazy_tlb(&init_mm);
 	current->active_mm = &init_mm;
 
 	smp_store_cpu_info(cpu);
diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index 4e29b619578c..282359ab525b 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -794,10 +794,10 @@ void exit_lazy_flush_tlb(struct mm_struct *mm, bool always_flush)
 	if (current->active_mm == mm) {
 		WARN_ON_ONCE(current->mm != NULL);
 		/* Is a kernel thread and is using mm as the lazy tlb */
-		mmgrab(&init_mm);
+		mmgrab_lazy_tlb(&init_mm);
 		current->active_mm = &init_mm;
 		switch_mm_irqs_off(mm, &init_mm, current);
-		mmdrop(mm);
+		mmdrop_lazy_tlb(mm);
 	}
 
 	/*
diff --git a/fs/exec.c b/fs/exec.c
index ab913243a367..1a32a88db173 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1033,7 +1033,7 @@ static int exec_mmap(struct mm_struct *mm)
 		mmput(old_mm);
 		return 0;
 	}
-	mmdrop(active_mm);
+	mmdrop_lazy_tlb(active_mm);
 	return 0;
 }
 
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 2a243616f222..5376caf6fcf3 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -79,6 +79,22 @@ static inline void mmdrop_sched(struct mm_struct *mm)
 }
 #endif
 
+/* Helpers for lazy TLB mm refcounting */
+static inline void mmgrab_lazy_tlb(struct mm_struct *mm)
+{
+	mmgrab(mm);
+}
+
+static inline void mmdrop_lazy_tlb(struct mm_struct *mm)
+{
+	mmdrop(mm);
+}
+
+static inline void mmdrop_lazy_tlb_sched(struct mm_struct *mm)
+{
+	mmdrop_sched(mm);
+}
+
 /**
  * mmget() - Pin the address space associated with a &struct mm_struct.
  * @mm: The address space to pin.
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 6c0a92ca6bb5..189895288d9d 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -623,7 +623,7 @@ static int finish_cpu(unsigned int cpu)
 	 */
 	if (mm != &init_mm)
 		idle->active_mm = &init_mm;
-	mmdrop(mm);
+	mmdrop_lazy_tlb(mm);
 	return 0;
 }
 
diff --git a/kernel/exit.c b/kernel/exit.c
index 15dc2ec80c46..1a4608d765e4 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -537,7 +537,7 @@ static void exit_mm(void)
 		return;
 	sync_mm_rss(mm);
 	mmap_read_lock(mm);
-	mmgrab(mm);
+	mmgrab_lazy_tlb(mm);
 	BUG_ON(mm != current->active_mm);
 	/* more a memory barrier than a real lock */
 	task_lock(current);
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 7424a1839e9a..e4bc32a88866 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -1410,6 +1410,11 @@ void kthread_use_mm(struct mm_struct *mm)
 	WARN_ON_ONCE(!(tsk->flags & PF_KTHREAD));
 	WARN_ON_ONCE(tsk->mm);
 
+	/*
+	 * It is possible for mm to be the same as tsk->active_mm, but
+	 * we must still mmgrab(mm) and mmdrop_lazy_tlb(active_mm),
+	 * because these references are not equivalent.
+	 */
 	mmgrab(mm);
 
 	task_lock(tsk);
@@ -1433,9 +1438,9 @@ void kthread_use_mm(struct mm_struct *mm)
 	 * memory barrier after storing to tsk->mm, before accessing
 	 * user-space memory. A full memory barrier for membarrier
 	 * {PRIVATE,GLOBAL}_EXPEDITED is implicitly provided by
-	 * mmdrop().
+	 * mmdrop_lazy_tlb().
 	 */
-	mmdrop(active_mm);
+	mmdrop_lazy_tlb(active_mm);
 }
 EXPORT_SYMBOL_GPL(kthread_use_mm);
 
@@ -1463,10 +1468,13 @@ void kthread_unuse_mm(struct mm_struct *mm)
 	local_irq_disable();
 	tsk->mm = NULL;
 	membarrier_update_current_mm(NULL);
+	mmgrab_lazy_tlb(mm);
 	/* active_mm is still 'mm' */
 	enter_lazy_tlb(mm, tsk);
 	local_irq_enable();
 	task_unlock(tsk);
+
+	mmdrop(mm);
 }
 EXPORT_SYMBOL_GPL(kthread_unuse_mm);
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e838feb6adc5..495f9a021de9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5189,13 +5189,14 @@ static struct rq *finish_task_switch(struct task_struct *prev)
 	 * rq->curr, before returning to userspace, so provide them here:
 	 *
 	 * - a full memory barrier for {PRIVATE,GLOBAL}_EXPEDITED, implicitly
-	 *   provided by mmdrop(),
+	 *   provided by mmdrop_lazy_tlb(),
 	 * - a sync_core for SYNC_CORE.
 	 */
 	if (mm) {
 		membarrier_mm_sync_core_before_usermode(mm);
-		mmdrop_sched(mm);
+		mmdrop_lazy_tlb_sched(mm);
 	}
+
 	if (unlikely(prev_state == TASK_DEAD)) {
 		if (prev->sched_class->task_dead)
 			prev->sched_class->task_dead(prev);
@@ -5252,9 +5253,9 @@ context_switch(struct rq *rq, struct task_struct *prev,
 
 	/*
 	 * kernel -> kernel   lazy + transfer active
-	 *   user -> kernel   lazy + mmgrab() active
+	 *   user -> kernel   lazy + mmgrab_lazy_tlb() active
 	 *
-	 * kernel ->   user   switch + mmdrop() active
+	 * kernel ->   user   switch + mmdrop_lazy_tlb() active
 	 *   user ->   user   switch
 	 */
 	if (!next->mm) {                                // to kernel
@@ -5262,7 +5263,7 @@ context_switch(struct rq *rq, struct task_struct *prev,
 
 		next->active_mm = prev->active_mm;
 		if (prev->mm)                           // from user
-			mmgrab(prev->active_mm);
+			mmgrab_lazy_tlb(prev->active_mm);
 		else
 			prev->active_mm = NULL;
 	} else {                                        // to user
@@ -5279,7 +5280,7 @@ context_switch(struct rq *rq, struct task_struct *prev,
 		lru_gen_use_mm(next->mm);
 
 		if (!prev->mm) {                        // from kernel
-			/* will mmdrop() in finish_task_switch(). */
+			/* will mmdrop_lazy_tlb() in finish_task_switch(). */
 			rq->prev_mm = prev->active_mm;
 			prev->active_mm = NULL;
 		}
@@ -9916,7 +9917,7 @@ void __init sched_init(void)
 	/*
 	 * The boot idle thread does lazy MMU switching as well:
 	 */
-	mmgrab(&init_mm);
+	mmgrab_lazy_tlb(&init_mm);
 	enter_lazy_tlb(&init_mm, current);
 
 	/*

From patchwork Fri Feb 3 07:18:35 2023
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 1736730
From: Nicholas Piggin
To: Andrew Morton
Subject: [PATCH v7 3/5] lazy tlb: allow lazy tlb mm refcounting to be
 configurable
Date: Fri, 3 Feb 2023 17:18:35 +1000
Message-Id: <20230203071837.1136453-4-npiggin@gmail.com>
In-Reply-To: <20230203071837.1136453-1-npiggin@gmail.com>
References: <20230203071837.1136453-1-npiggin@gmail.com>

Add CONFIG_MMU_LAZY_TLB_REFCOUNT which enables refcounting of the lazy
tlb mm when it is context switched.
This can be disabled by architectures that don't require this
refcounting if they clean up lazy tlb mms when the last refcount is
dropped. Currently this is always enabled, so the patch introduces no
functional change.

Acked-by: Linus Torvalds
Signed-off-by: Nicholas Piggin
---
 Documentation/mm/active_mm.rst |  6 ++++++
 arch/Kconfig                   | 17 +++++++++++++++++
 include/linux/sched/mm.h       | 18 +++++++++++++++---
 3 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/Documentation/mm/active_mm.rst b/Documentation/mm/active_mm.rst
index 6f8269c284ed..0114d80d406a 100644
--- a/Documentation/mm/active_mm.rst
+++ b/Documentation/mm/active_mm.rst
@@ -4,6 +4,12 @@
 Active MM
 =========
 
+Note, the mm_count refcount may no longer include the "lazy" users
+(running tasks with ->active_mm == mm && ->mm == NULL) on kernels
+with CONFIG_MMU_LAZY_TLB_REFCOUNT=n. Taking and releasing these lazy
+references must be done with mmgrab_lazy_tlb() and mmdrop_lazy_tlb()
+helpers, which abstract this config option.
+
 ::
 
  List:       linux-kernel
diff --git a/arch/Kconfig b/arch/Kconfig
index 12e3ddabac9d..11e8915c0652 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -465,6 +465,23 @@ config ARCH_WANT_IRQS_OFF_ACTIVATE_MM
 	  irqs disabled over activate_mm. Architectures that do IPI based TLB
 	  shootdowns should enable this.
 
+# Use normal mm refcounting for MMU_LAZY_TLB kernel thread references.
+# MMU_LAZY_TLB_REFCOUNT=n can improve the scalability of context switching
+# to/from kernel threads when the same mm is running on a lot of CPUs (a large
+# multi-threaded application), by reducing contention on the mm refcount.
+#
+# This can be disabled if the architecture ensures no CPUs are using an mm as a
+# "lazy tlb" beyond its final refcount (i.e., by the time __mmdrop frees the mm
+# or its kernel page tables). This could be arranged by arch_exit_mmap(), or
+# final exit(2) TLB flush, for example.
+#
+# To implement this, an arch *must*:
+# Ensure the _lazy_tlb variants of mmgrab/mmdrop are used when manipulating
+# the lazy tlb reference of a kthread's ->active_mm (non-arch code has been
+# converted already).
+config MMU_LAZY_TLB_REFCOUNT
+	def_bool y
+
 config ARCH_HAVE_NMI_SAFE_CMPXCHG
 	bool
 
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 5376caf6fcf3..689dbe812563 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -82,17 +82,29 @@ static inline void mmdrop_sched(struct mm_struct *mm)
 /* Helpers for lazy TLB mm refcounting */
 static inline void mmgrab_lazy_tlb(struct mm_struct *mm)
 {
-	mmgrab(mm);
+	if (IS_ENABLED(CONFIG_MMU_LAZY_TLB_REFCOUNT))
+		mmgrab(mm);
 }
 
 static inline void mmdrop_lazy_tlb(struct mm_struct *mm)
 {
-	mmdrop(mm);
+	if (IS_ENABLED(CONFIG_MMU_LAZY_TLB_REFCOUNT)) {
+		mmdrop(mm);
+	} else {
+		/*
+		 * mmdrop_lazy_tlb must provide a full memory barrier, see the
+		 * membarrier comment in finish_task_switch which relies on this.
+		 */
+		smp_mb();
+	}
 }
 
 static inline void mmdrop_lazy_tlb_sched(struct mm_struct *mm)
 {
-	mmdrop_sched(mm);
+	if (IS_ENABLED(CONFIG_MMU_LAZY_TLB_REFCOUNT))
+		mmdrop_sched(mm);
+	else
+		smp_mb(); /* see mmdrop_lazy_tlb() above */
 }
 
 /**

From patchwork Fri Feb 3 07:18:36 2023
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 1736731
From: Nicholas Piggin
To: Andrew Morton
Subject: [PATCH v7 4/5] lazy tlb: shoot lazies, non-refcounting lazy tlb mm
 reference handling scheme
Date: Fri, 3 Feb 2023 17:18:36 +1000
Message-Id: <20230203071837.1136453-5-npiggin@gmail.com>
In-Reply-To: <20230203071837.1136453-1-npiggin@gmail.com>
References: <20230203071837.1136453-1-npiggin@gmail.com>

On big systems, the mm refcount can become highly contended when doing
a lot of context switching with threaded applications. user<->idle
switch is one of the important cases. Abandoning lazy tlb entirely
slows this switching down quite a bit in the common uncontended case,
so that is not viable.

Implement a scheme where lazy tlb mm references do not contribute to
the refcount; instead, they get explicitly removed when the refcount
reaches zero.

The final mmdrop() sends IPIs to all CPUs in the mm_cpumask and they
switch away from this mm to init_mm if it was being used as the lazy
tlb mm. Enabling the shoot lazies option therefore requires that the
arch ensures that mm_cpumask contains all CPUs that could possibly be
using mm. A DEBUG_VM option IPIs every CPU in the system after this
to ensure there are no references remaining before the mm is freed.

The cost of shootdown IPIs could be an issue, but it has not been
observed to be a serious problem with this scheme, because short-lived
processes tend not to migrate CPUs much, therefore they don't get much
chance to leave lazy tlb mm references on remote CPUs. There are a lot
of options to reduce the IPIs if necessary, described in comments.

The near-worst-case can be benchmarked with will-it-scale:

  context_switch1_threads -t $(($(nproc) / 2))

This will create nproc threads (nproc / 2 switching pairs) all sharing
the same mm that spread over all CPUs so each CPU does
thread->idle->thread switching.

[ Rik came up with basically the same idea a few years ago, so credit
  to him for that. ]

Link: https://lore.kernel.org/linux-mm/20230118080011.2258375-1-npiggin@gmail.com/
Link: https://lore.kernel.org/all/20180728215357.3249-11-riel@surriel.com/
Acked-by: Linus Torvalds
Signed-off-by: Nicholas Piggin
---
 arch/Kconfig      | 15 +++++++++++
 kernel/fork.c     | 65 +++++++++++++++++++++++++++++++++++++++++++++++
 lib/Kconfig.debug | 10 ++++++++
 3 files changed, 90 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index 11e8915c0652..0d2021aed57e 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -481,6 +481,21 @@ config ARCH_WANT_IRQS_OFF_ACTIVATE_MM
 # converted already).
 config MMU_LAZY_TLB_REFCOUNT
 	def_bool y
+	depends on !MMU_LAZY_TLB_SHOOTDOWN
+
+# This option allows MMU_LAZY_TLB_REFCOUNT=n. It ensures no CPUs are using an
+# mm as a lazy tlb beyond its last reference count, by shooting down these
+# users before the mm is deallocated. __mmdrop() first IPIs all CPUs that may
+# be using the mm as a lazy tlb, so that they may switch themselves to using
+# init_mm for their active mm. mm_cpumask(mm) is used to determine which CPUs
+# may be using mm as a lazy tlb mm.
+#
+# To implement this, an arch *must*:
+# - At the time of the final mmdrop of the mm, ensure mm_cpumask(mm) contains
+#   at least all possible CPUs in which the mm is lazy.
+# - It must meet the requirements for MMU_LAZY_TLB_REFCOUNT=n (see above).
+config MMU_LAZY_TLB_SHOOTDOWN
+	bool
 
 config ARCH_HAVE_NMI_SAFE_CMPXCHG
 	bool
diff --git a/kernel/fork.c b/kernel/fork.c
index 9f7fe3541897..e7d81db7e885 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -780,6 +780,67 @@ static void check_mm(struct mm_struct *mm)
 #define allocate_mm()	(kmem_cache_alloc(mm_cachep, GFP_KERNEL))
 #define free_mm(mm)	(kmem_cache_free(mm_cachep, (mm)))
 
+static void do_check_lazy_tlb(void *arg)
+{
+	struct mm_struct *mm = arg;
+
+	WARN_ON_ONCE(current->active_mm == mm);
+}
+
+static void do_shoot_lazy_tlb(void *arg)
+{
+	struct mm_struct *mm = arg;
+
+	if (current->active_mm == mm) {
+		WARN_ON_ONCE(current->mm);
+		current->active_mm = &init_mm;
+		switch_mm(mm, &init_mm, current);
+	}
+}
+
+static void cleanup_lazy_tlbs(struct mm_struct *mm)
+{
+	if (!IS_ENABLED(CONFIG_MMU_LAZY_TLB_SHOOTDOWN)) {
+		/*
+		 * In this case, lazy tlb mms are refcounted and would not reach
+		 * __mmdrop until all CPUs have switched away and mmdrop()ed.
+		 */
+		return;
+	}
+
+	/*
+	 * Lazy mm shootdown does not refcount "lazy tlb mm" usage, rather it
+	 * requires lazy mm users to switch to another mm when the refcount
+	 * drops to zero, before the mm is freed. This requires IPIs here to
+	 * switch kernel threads to init_mm.
+	 *
+	 * archs that use IPIs to flush TLBs can piggy-back that lazy tlb mm
+	 * switch with the final userspace teardown TLB flush which leaves the
+	 * mm lazy on this CPU but no others, reducing the need for additional
+	 * IPIs here. There are cases where a final IPI is still required here,
+	 * such as the final mmdrop being performed on a different CPU than the
+	 * one exiting, or kernel threads using the mm when userspace exits.
+	 *
+	 * IPI overheads have not been found to be expensive, but they could be
+	 * reduced in a number of possible ways, for example (roughly
+	 * increasing order of complexity):
+	 * - The last lazy reference created by exit_mm() could instead switch
+	 *   to init_mm, however it's probable this will run on the same CPU
+	 *   immediately afterwards, so this may not reduce IPIs much.
+	 * - A batch of mms requiring IPIs could be gathered and freed at once.
+	 * - CPUs store active_mm where it can be remotely checked without a
+	 *   lock, to filter out false-positives in the cpumask.
+	 * - After mm_users or mm_count reaches zero, switching away from the
+	 *   mm could clear mm_cpumask to reduce some IPIs, perhaps together
+	 *   with some batching or delaying of the final IPIs.
+	 * - A delayed freeing and RCU-like quiescing sequence based on mm
+	 *   switching to avoid IPIs completely.
+	 */
+	on_each_cpu_mask(mm_cpumask(mm), do_shoot_lazy_tlb, (void *)mm, 1);
+	if (IS_ENABLED(CONFIG_DEBUG_VM_SHOOT_LAZIES))
+		on_each_cpu(do_check_lazy_tlb, (void *)mm, 1);
+}
+
 /*
  * Called when the last reference to the mm
  * is dropped: either by a lazy thread or by
@@ -791,6 +852,10 @@ void __mmdrop(struct mm_struct *mm)
 
 	BUG_ON(mm == &init_mm);
 	WARN_ON_ONCE(mm == current->mm);
+
+	/* Ensure no CPUs are using this as their lazy tlb mm */
+	cleanup_lazy_tlbs(mm);
+
 	WARN_ON_ONCE(mm == current->active_mm);
 	mm_free_pgd(mm);
 	destroy_context(mm);
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 61a9425a311f..1a5849f9f414 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -852,6 +852,16 @@ config DEBUG_VM
 
 	  If unsure, say N.
 
+config DEBUG_VM_SHOOT_LAZIES
+	bool "Debug MMU_LAZY_TLB_SHOOTDOWN implementation"
+	depends on DEBUG_VM
+	depends on MMU_LAZY_TLB_SHOOTDOWN
+	help
+	  Enable additional IPIs that ensure lazy tlb mm references are removed
+	  before the mm is freed.
+
+	  If unsure, say N.
+
 config DEBUG_VM_MAPLE_TREE
 	bool "Debug VM maple trees"
 	depends on DEBUG_VM

From patchwork Fri Feb 3 07:18:37 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 1736733
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@legolas.ozlabs.org
Authentication-Results: legolas.ozlabs.org;
 spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org
 (client-ip=112.213.38.117; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=)
Authentication-Results: legolas.ozlabs.org;
 dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=h7nAnXYf;
 dkim-atps=neutral
Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature ECDSA (P-384))
 (No client certificate requested)
 by legolas.ozlabs.org (Postfix) with ESMTPS id 4P7RwS3kCHz23hn
 for ; Fri, 3 Feb 2023 18:23:52 +1100 (AEDT)
Received: from boromir.ozlabs.org (localhost [IPv6:::1])
 by lists.ozlabs.org (Postfix) with ESMTP id 4P7RwS2bVLz3f8s
 for ; Fri, 3 Feb 2023 18:23:52 +1100 (AEDT)
Authentication-Results: lists.ozlabs.org;
 dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=h7nAnXYf;
 dkim-atps=neutral
X-Original-To: linuxppc-dev@lists.ozlabs.org
Delivered-To: linuxppc-dev@lists.ozlabs.org
Authentication-Results: lists.ozlabs.org;
 spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com
 (client-ip=2607:f8b0:4864:20::1035; helo=mail-pj1-x1035.google.com; envelope-from=npiggin@gmail.com; receiver=)
Authentication-Results: lists.ozlabs.org;
 dkim=pass (2048-bit key; unprotected) header.d=gmail.com
 header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=h7nAnXYf;
 dkim-atps=neutral
Received: from mail-pj1-x1035.google.com (mail-pj1-x1035.google.com [IPv6:2607:f8b0:4864:20::1035])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by lists.ozlabs.org (Postfix) with ESMTPS id 4P7RqJ2lRMz3f6X
 for ; Fri, 3 Feb 2023 18:19:24 +1100 (AEDT)
Received: by mail-pj1-x1035.google.com with SMTP id d2so524576pjd.5
 for ; Thu, 02 Feb 2023 23:19:24 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20210112;
 h=content-transfer-encoding:mime-version:references:in-reply-to
 :message-id:date:subject:cc:to:from:from:to:cc:subject:date
 :message-id:reply-to;
 bh=x14kRrZZe2kdSbgJ1PJqUMQQfgWflSMZWL5LFkwvByE=;
 b=h7nAnXYffJkXAeKtJ4+Pb47ZmdkwzoIxpzRgCBuRqFSEuDy+TP7MF7IO/iVxeE9h3P
 aG7Esd2gb18pAjYT42bsVp7oBg2AItWYJWILKrT1Rv1wljyPlzKVe3HEajToWCRcZVtL
 zy4qIqHGZ3aoBW3DgCSA/Vp7o+H66URuFiIddPmq0wydnig1tvUkwDog28Na9VCQmXCY
 VaSnJ2jOszZ6JD6S3Lt/kqQ/A4667tSOoYJbLAZ3Dg/8Mt+fnJ7poqcB8fdgDCdZgr9h
 DFZ/UDXahTH+Bf05/C7Y+Y3gm5kj6MbI54iJLvql0E7GMYErZm9nVGNUKj5tyaG4S5jq
 3YEA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=content-transfer-encoding:mime-version:references:in-reply-to
 :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc
 :subject:date:message-id:reply-to;
 bh=x14kRrZZe2kdSbgJ1PJqUMQQfgWflSMZWL5LFkwvByE=;
 b=Yrx8gSXsNS+BUqfvP02afTfJqDgMvK7UuAlAafU/A70uX7t30Pf5Z7ZdW7tgtjguEr
 1aULPY81kDk7rljKEgIG4fHBR9poKZJpTeWvnpxaK7U+3+mp5d6nYGzOA5lUGvQEwJLn
 /SdioA5fOfnnVNcTWZbplTH91sgcU5nJYoo5vYWBt57qE7WSqbi0OVDVPgRa2jZ1Z3ix
 2D/MKctMjhoJ0BJjgI7NT7WNIu7pq1DwJTn0FQPKe4ySYAMJpQgxvN4avmZLGSE1FczS
 loQ3CiKygq4V5O8yS0ymzP+9AKO58HkoG859LcHtirDf9m1Txo6mXAKb+WdmEB5ibdcK
 nbTA==
X-Gm-Message-State: AO0yUKVJVkZeuV7t3IprWqOENVg+zYIdy3LdEyivKN8Vfu5e67liBieQ
 S3o+Zt+tb4NhEIt+0tOTFmU=
X-Google-Smtp-Source:
 AK7set/pN1qkE5vW6sEUJPLmtT3NU/AerHL7q01jFwtmVHRhEkUBx1PKvOIigiSi96c+JcvQpUaLgA==
X-Received: by 2002:a17:90a:11:b0:22c:8dfe:d6a6 with SMTP id 17-20020a17090a001100b0022c8dfed6a6mr9294792pja.4.1675408762194;
 Thu, 02 Feb 2023 23:19:22 -0800 (PST)
Received: from bobo.ibm.com (193-116-117-77.tpgi.com.au. [193.116.117.77])
 by smtp.gmail.com with ESMTPSA id f20-20020a637554000000b004df4ba1ebfesm877558pgn.66.2023.02.02.23.19.16
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Thu, 02 Feb 2023 23:19:21 -0800 (PST)
From: Nicholas Piggin
To: Andrew Morton
Subject: [PATCH v7 5/5] powerpc/64s: enable MMU_LAZY_TLB_SHOOTDOWN
Date: Fri, 3 Feb 2023 17:18:37 +1000
Message-Id: <20230203071837.1136453-6-npiggin@gmail.com>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20230203071837.1136453-1-npiggin@gmail.com>
References: <20230203071837.1136453-1-npiggin@gmail.com>
MIME-Version: 1.0
X-BeenThere: linuxppc-dev@lists.ozlabs.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Linux on PowerPC Developers Mail List
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: linux-arch@vger.kernel.org, Rik van Riel , Will Deacon , Peter Zijlstra ,
 Linus Torvalds , Dave Hansen , linuxppc-dev@lists.ozlabs.org,
 Nicholas Piggin , linux-mm@kvack.org, Andy Lutomirski , Catalin Marinas ,
 Nadav Amit
Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org
Sender: "Linuxppc-dev"

On a 16-socket 192-core POWER8 system running the context_switch1_threads
benchmark from will-it-scale (see earlier changelog), upstream achieves
only about 1 million context switches per second, due to contention on
the mm refcount.

powerpc/64s meets the prerequisites for CONFIG_MMU_LAZY_TLB_SHOOTDOWN,
so enable the option. This increases the above benchmark to 118 million
context switches per second.

This generates 314 additional IPIs on a 144 CPU system doing a kernel
compile, which is in the noise in terms of kernel cycles.

Acked-by: Linus Torvalds
Signed-off-by: Nicholas Piggin
---
 arch/powerpc/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index b8c4ac56bddc..600ace5a7f1a 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -265,6 +265,7 @@ config PPC
 	select MMU_GATHER_PAGE_SIZE
 	select MMU_GATHER_RCU_TABLE_FREE
 	select MMU_GATHER_MERGE_VMAS
+	select MMU_LAZY_TLB_SHOOTDOWN		if PPC_BOOK3S_64
 	select MODULES_USE_ELF_RELA
 	select NEED_DMA_MAP_STATE		if PPC64 || NOT_COHERENT_CACHE
 	select NEED_PER_CPU_EMBED_FIRST_CHUNK	if PPC64