From patchwork Wed Sep 21 03:19:20 2022
X-Patchwork-Submitter: AceLan Kao
X-Patchwork-Id: 1680493
From: AceLan Kao
To: kernel-team@lists.ubuntu.com
Subject: [PATCH 4/8][SRU][OEM-5.17] mm/memcg: protect per-CPU counter by disabling preemption on PREEMPT_RT where needed.
Date: Wed, 21 Sep 2022 11:19:20 +0800
Message-Id: <20220921031924.2354693-5-acelan.kao@canonical.com>
In-Reply-To: <20220921031924.2354693-1-acelan.kao@canonical.com>
References: <20220921031924.2354693-1-acelan.kao@canonical.com>

From: Sebastian Andrzej Siewior

BugLink: https://launchpad.net/bugs/1990330

The per-CPU counters are modified with the non-atomic modifier. Consistency
is ensured by disabling interrupts for the update. On non-PREEMPT_RT
configurations this works because acquiring a spinlock_t typed lock with the
_irq() suffix disables interrupts. On PREEMPT_RT configurations the RMW
operation can be interrupted.

Another problem is that mem_cgroup_swapout() expects to be invoked with
disabled interrupts because the caller has to acquire a spinlock_t, which is
acquired with disabled interrupts. Since spinlock_t never disables interrupts
on PREEMPT_RT, the interrupts are never disabled at this point.

The code is never called from in_irq() context on PREEMPT_RT, therefore
disabling preemption during the update is sufficient on PREEMPT_RT. The
sections which explicitly disable interrupts can remain on PREEMPT_RT because
the sections remain short and they don't involve sleeping locks
(memcg_check_events() is doing nothing on PREEMPT_RT).

Disable preemption during the update of the per-CPU variables which do not
explicitly disable interrupts.
Link: https://lkml.kernel.org/r/20220226204144.1008339-4-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior
Acked-by: Roman Gushchin
Reviewed-by: Shakeel Butt
Cc: kernel test robot
Cc: Michal Hocko
Cc: Michal Hocko
Cc: Michal Koutný
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Vladimir Davydov
Cc: Waiman Long
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
(cherry picked from commit be3e67b54b437123e6144da31cf312ddcaa5aef2)
Signed-off-by: Chia-Lin Kao (AceLan)
---
 mm/memcontrol.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 55 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 39b980ae80a7..2179c581a6a8 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -632,6 +632,35 @@ static u64 flush_next_time;
 
 #define FLUSH_TIME (2UL*HZ)
 
+/*
+ * Accessors to ensure that preemption is disabled on PREEMPT_RT because it can
+ * not rely on this as part of an acquired spinlock_t lock. These functions are
+ * never used in hardirq context on PREEMPT_RT and therefore disabling
+ * preemption is sufficient.
+ */
+static void memcg_stats_lock(void)
+{
+#ifdef CONFIG_PREEMPT_RT
+	preempt_disable();
+#else
+	VM_BUG_ON(!irqs_disabled());
+#endif
+}
+
+static void __memcg_stats_lock(void)
+{
+#ifdef CONFIG_PREEMPT_RT
+	preempt_disable();
+#endif
+}
+
+static void memcg_stats_unlock(void)
+{
+#ifdef CONFIG_PREEMPT_RT
+	preempt_enable();
+#endif
+}
+
 static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val)
 {
 	unsigned int x;
@@ -715,6 +744,27 @@ void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
 	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
 	memcg = pn->memcg;
 
+	/*
+	 * The callers from rmap rely on disabled preemption because they never
+	 * update their counter from in-interrupt context. For these counters
+	 * we check that the update is never performed from an interrupt
+	 * context while other callers need to have disabled interrupts.
+	 */
+	__memcg_stats_lock();
+	if (IS_ENABLED(CONFIG_DEBUG_VM) && !IS_ENABLED(CONFIG_PREEMPT_RT)) {
+		switch (idx) {
+		case NR_ANON_MAPPED:
+		case NR_FILE_MAPPED:
+		case NR_ANON_THPS:
+		case NR_SHMEM_PMDMAPPED:
+		case NR_FILE_PMDMAPPED:
+			WARN_ON_ONCE(!in_task());
+			break;
+		default:
+			WARN_ON_ONCE(!irqs_disabled());
+		}
+	}
+
 	/* Update memcg */
 	__this_cpu_add(memcg->vmstats_percpu->state[idx], val);
 
@@ -722,6 +772,7 @@ void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
 	__this_cpu_add(pn->lruvec_stats_percpu->state[idx], val);
 
 	memcg_rstat_updated(memcg, val);
+	memcg_stats_unlock();
 }
 
 /**
@@ -804,8 +855,10 @@ void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx,
 	if (mem_cgroup_disabled())
 		return;
 
+	memcg_stats_lock();
 	__this_cpu_add(memcg->vmstats_percpu->events[idx], count);
 	memcg_rstat_updated(memcg, count);
+	memcg_stats_unlock();
 }
 
 static unsigned long memcg_events(struct mem_cgroup *memcg, int event)
@@ -7174,8 +7227,9 @@ void mem_cgroup_swapout(struct page *page, swp_entry_t entry)
 	 * important here to have the interrupts disabled because it is the
 	 * only synchronisation we have for updating the per-CPU variables.
 	 */
-	VM_BUG_ON(!irqs_disabled());
+	memcg_stats_lock();
 	mem_cgroup_charge_statistics(memcg, -nr_entries);
+	memcg_stats_unlock();
 	memcg_check_events(memcg, page_to_nid(page));
 
 	css_put(&memcg->css);