From patchwork Wed Jun 23 13:13:52 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Pisati
X-Patchwork-Id: 1496114
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org;
 spf=none (no SPF record) smtp.mailfrom=lists.ubuntu.com
 (client-ip=91.189.94.19; helo=huckleberry.canonical.com;
 envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=)
Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by ozlabs.org (Postfix) with ESMTPS id 4G93cq61NKz9sVm;
 Wed, 23 Jun 2021 23:14:03 +1000 (AEST)
Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com)
 by huckleberry.canonical.com with esmtp (Exim 4.86_2)
 (envelope-from ) id 1lw2hX-0000a8-7n;
 Wed, 23 Jun 2021 13:13:55 +0000
Received: from youngberry.canonical.com ([91.189.89.112])
 by huckleberry.canonical.com with esmtps
 (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2)
 (envelope-from ) id 1lw2hV-0000a2-7a
 for kernel-team@lists.ubuntu.com; Wed, 23 Jun 2021 13:13:53 +0000
Received: from 1.general.ppisati.uk.vpn ([10.172.193.134] helo=canonical.com)
 by youngberry.canonical.com with esmtpsa (TLS1.2)
 tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.93)
 (envelope-from ) id 1lw2hU-0003pG-Tr
 for kernel-team@lists.ubuntu.com; Wed, 23 Jun 2021 13:13:53 +0000
From: Paolo Pisati
To: kernel-team@lists.ubuntu.com
Subject: [PATCH] [act] UBUNTU: SAUCE: ubuntu_kernel_selftests: disable memory-hotplug
Date: Wed, 23 Jun 2021 15:13:52 +0200
Message-Id: <20210623131352.51873-1-paolo.pisati@canonical.com>
X-Mailer: git-send-email 2.25.1
MIME-Version: 1.0
X-BeenThere: kernel-team@lists.ubuntu.com
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Kernel team discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: kernel-team-bounces@lists.ubuntu.com
Sender: "kernel-team"

The memory-hotplug test has been intermittently timing out (or trashing
the test VM, see below) on Impish/Hirsute ppc64el and x86-64 for quite
some time now.

Upon further investigation, we found that memory-hotplug has a tendency
to spam the system logs (kernel.log, syslog and the systemd-journal) with
thousands and thousands (up to several GBs) of dump_page() entries like
this:

...
[ 898.286185] migrating pfn 11c462 failed ret:1
[ 898.286186] page:00000000491a3636 refcount:3 mapcount:0 mapping:00000000e646cbed index:0xc00066 pfn:0x11c462
[ 898.286188] memcg:ffff947290991000
[ 898.286188] aops:def_blk_aops ino:800002
[ 898.286191] flags: 0x17ffffc0002022(referenced|active|private|node=0|zone=2|lastcpupid=0x1fffff)
[ 898.286193] raw: 0017ffffc0002022 ffffb3618ba03ba8 ffffb3618ba03ba8 ffff947287522ab0
[ 898.286195] raw: 0000000000c00066 ffff947281729340 00000003ffffffff ffff947290991000
[ 898.286196] page dumped because: migration failure
...

At this point, two things can happen:

a) the constant flow of printk() slows down the VM to the point that a
timeout triggers (either the autotest timeout or the kernel selftests
timeout, it doesn't matter), memory-hotplug is terminated and the VM
resumes processing the remaining ubuntu_kernel_selftests jobs

or

b) the filesystem fills up to 100%, memory-hotplug fails, and so does
every remaining test job, since the VM is in an unusable state at this
point

Given we already disable memory-hotplug for arm* and cloud kernels, and
to avoid having our test session trashed by this single test, I propose
to disable it entirely, or at least until a ratelimit solution is put in
place.

If you want to reproduce this issue, just provision an OpenStack instance
(small, medium or large - size doesn't matter) and you will always end up
in scenario "b".
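
To gauge how bad the spam is on a given run, the entries shown in the
excerpt above can be counted per pfn. What follows is only a minimal
sketch and not part of this patch: the function names and the default log
path (/var/log/kern.log) are assumptions for illustration, and the regular
expression is derived from the single excerpt above, so other kernels may
format the line differently.

```python
import re
from collections import Counter

# Matches the first line of each entry in the excerpt above, e.g.
# "[ 898.286185] migrating pfn 11c462 failed ret:1".
MIGRATE_FAIL_RE = re.compile(r"migrating pfn ([0-9a-f]+) failed ret:(-?\d+)")


def summarize_migration_failures(lines):
    """Count "migrating pfn ... failed" entries per pfn in an iterable of log lines."""
    per_pfn = Counter()
    for line in lines:
        match = MIGRATE_FAIL_RE.search(line)
        if match:
            per_pfn[match.group(1)] += 1
    return per_pfn


def report(path="/var/log/kern.log"):
    """Print a one-line summary for a log file (the default path is an assumption)."""
    with open(path, errors="replace") as log:
        per_pfn = summarize_migration_failures(log)
    total = sum(per_pfn.values())
    print(f"{total} migration failures across {len(per_pfn)} distinct pfns")
```

Calling report() on the test VM after a memory-hotplug run should give a
rough idea of how many dump_page() entries were logged, since each failed
migration produces one such entry.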
Signed-off-by: Paolo Pisati
---
 ubuntu_kernel_selftests/control | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ubuntu_kernel_selftests/control b/ubuntu_kernel_selftests/control
index e2874196..53c5584a 100644
--- a/ubuntu_kernel_selftests/control
+++ b/ubuntu_kernel_selftests/control
@@ -12,7 +12,7 @@ DOC = ""
 
 name = 'ubuntu_kernel_selftests'
 
-tests = [ 'setup','breakpoints','cpu-hotplug','efivarfs','memfd','memory-hotplug','mount','net','ptrace','seccomp','timers','powerpc','user','ftrace' ]
+tests = [ 'setup','breakpoints','cpu-hotplug','efivarfs','memfd','mount','net','ptrace','seccomp','timers','powerpc','user','ftrace' ]
 
 #
 # The seccomp tests on 4.19+ on non-x86 are known to be fail and