From patchwork Fri Sep 23 00:54:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Lu, Davina" X-Patchwork-Id: 1681357 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=ozlabs.org (client-ip=2404:9400:2221:ea00::3; helo=gandalf.ozlabs.org; envelope-from=srs0=a4ld=z2=vger.kernel.org=linux-ext4-owner@ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=amazon.com header.i=@amazon.com header.a=rsa-sha256 header.s=amazon201209 header.b=Obuy8krZ; dkim-atps=neutral Received: from gandalf.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4MYYbX0MjRz1yqM for ; Fri, 23 Sep 2022 10:55:20 +1000 (AEST) Received: from gandalf.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by gandalf.ozlabs.org (Postfix) with ESMTP id 4MYYbX08wJz4x1L for ; Fri, 23 Sep 2022 10:55:20 +1000 (AEST) Received: by gandalf.ozlabs.org (Postfix) id 4MYYbX06tdz4xGN; Fri, 23 Sep 2022 10:55:20 +1000 (AEST) Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: gandalf.ozlabs.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: gandalf.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=2620:137:e000::1:20; helo=out1.vger.email; envelope-from=linux-ext4-owner@vger.kernel.org; receiver=) Authentication-Results: gandalf.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=amazon.com header.i=@amazon.com header.a=rsa-sha256 header.s=amazon201209 header.b=Obuy8krZ; dkim-atps=neutral Received: from out1.vger.email (out1.vger.email [IPv6:2620:137:e000::1:20]) by gandalf.ozlabs.org (Postfix) with ESMTP id 4MYYbW5xcFz4x1L for ; Fri, 23 Sep 2022 10:55:19 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231580AbiIWAzS (ORCPT ); Thu, 22 Sep 2022 20:55:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39970 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231556AbiIWAzQ (ORCPT ); Thu, 22 Sep 2022 20:55:16 -0400 Received: from smtp-fw-33001.amazon.com (smtp-fw-33001.amazon.com [207.171.190.10]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C7F3578222; Thu, 22 Sep 2022 17:55:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1663894517; x=1695430517; h=from:to:cc:subject:date:message-id:mime-version; bh=YgcEuKjEWfOKVyJG17iTn15cj0QAKaMPA6iEniYStng=; b=Obuy8krZLvurCU9RDI8ntBhdW6dsivW3+feM8sohxwcd2gPNfPWkYDEA fymfY4AYjAVmfQT4GO2taFa8WJC6t850IfX7rWhMu5WddUpBwfUkB9DiJ HzqTPFxTWKEcP0j+LIwXT4FaPbLBbBWYjGZXRP0ogRww5oul1DVdL+dCF M=; X-Amazon-filename: ftest_write.sh, allow more ext4-rsv-conversion workqueue.patch X-IronPort-AV: E=Sophos;i="5.93,337,1654560000"; d="sh'?scan'208,223";a="227962107" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO email-inbound-relay-iad-1a-828bd003.us-east-1.amazon.com) ([10.43.8.6]) by smtp-border-fw-33001.sea14.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Sep 2022 00:55:02 +0000 Received: from EX13MTAUWB001.ant.amazon.com (iad12-ws-svc-p26-lb9-vlan2.iad.amazon.com [10.40.163.34]) by email-inbound-relay-iad-1a-828bd003.us-east-1.amazon.com (Postfix) with ESMTPS id 9CAAF80E93; Fri, 23 Sep 2022 00:55:00 +0000 (UTC) Received: from EX13D23UWC002.ant.amazon.com (10.43.162.22) by EX13MTAUWB001.ant.amazon.com (10.43.161.249) with Microsoft SMTP Server (TLS) id 15.0.1497.38; Fri, 23 Sep 2022 00:55:00 +0000 Received: from EX19D017UWC003.ant.amazon.com (10.13.139.227) by EX13D23UWC002.ant.amazon.com (10.43.162.22) with Microsoft SMTP Server (TLS) id 15.0.1497.38; Fri, 23 Sep 2022 00:54:59 +0000 Received: from EX19D017UWC003.ant.amazon.com ([fe80::78e9:1d67:81fd:68c5]) by EX19D017UWC003.ant.amazon.com ([fe80::78e9:1d67:81fd:68c5%6]) with mapi id 15.02.1118.012; Fri, 23 Sep 2022 00:54:59 +0000 From: "Lu, Davina" To: "linux-ext4@vger.kernel.org" , "linux-kernel@vger.kernel.org" CC: "Kiselev, Oleg" , "Liu, Frank" Subject: significant drop fio IOPS performance on v5.10 Thread-Topic: significant drop fio IOPS performance on v5.10 Thread-Index: AdjO5vTzPItE0he4TsCHXHwBRGK6iA== Date: Fri, 23 Sep 2022 00:54:59 +0000 Message-ID: <357ace228adf4e859df5e9f3f4f18b49@amazon.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: yes X-MS-TNEF-Correlator: x-originating-ip: [10.43.160.104] MIME-Version: 1.0 X-Spam-Status: No, score=-11.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org Hello, I was profiling the 5.10 kernel and comparing it to 4.14. On a system with 64 virtual CPUs and 256 GiB of RAM, I am observing a significant drop in IO performance. Using the following FIO with the script "sudo ftest_write.sh " in attachment, I saw FIO iops result drop from 22K to less than 1K. The script simply does: mount a the EXT4 16GiB volume with max IOPS 64000K, mounting option is " -o noatime,nodiratime,data=ordered", then run fio with 2048 fio wring thread with 28800000 file size with { --name=16kb_rand_write_only_2048_jobs --directory=/rdsdbdata1 --rw=randwrite --ioengine=sync --buffered=1 --bs=16k --max-jobs=2048 --numjobs=2048 --runtime=60 --time_based --thread --filesize=28800000 --fsync=1 --group_reporting }. My analyzing is that the degradation is introduce by commit {244adf6426ee31a83f397b700d964cff12a247d3} and the issue is the contention on rsv_conversion_wq. The simplest option is to increase the journal size, but that introduces more operational complexity. Another option is to add the following change in attachment "allow more ext4-rsv-conversion workqueue.patch" From 27e1b0e14275a281b3529f6a60c7b23a81356751 Mon Sep 17 00:00:00 2001 From: davinalu Date: Fri, 23 Sep 2022 00:43:53 +0000 Subject: [PATCH] allow more ext4-rsv-conversion workqueue to speedup fio writing --- fs/ext4/super.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) My thought is: If the max_active is 1, it means the "__WQ_ORDERED" combined with WQ_UNBOUND setting, based on alloc_workqueue(). So I added it . I am not sure should we need "__WQ_ORDERED" or not? without "__WQ_ORDERED" it looks also work at my testbed, but I added since not much fio TP difference on my testbed result with/out "__WQ_ORDERED". From My understanding and observation: with dioread_unlock and delay_alloc both enabled, the bio_endio() and ext4_writepages() will trigger this work queue to ext4_do_flush_completed_IO(). Looks like the work queue is an one-by-one updating: at EXT4 extend.c io_end->list_vec list only have one io_end_vec each time. So if the BIO has high performance, and we have only one thread to do EXT4 flush will be an bottleneck here. The "ext4-rsv-conversion" this workqueue is mainly for update the EXT4_IO_END_UNWRITTEN extend block(only exist on dioread_unlock and delay_alloc options are set) and extend status if I understand correctly here. Am I correct? This works on my test system and passes xfstests, but will this cause any corruption on ext4 extends blocks updates, not even sure about the journal transaction updates either? Can you tell me what I will break if this change is made? Thanks Davina diff --git a/fs/ext4/super.c b/fs/ext4/super.c index a0af833f7da7..6b34298cdc3b 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -4963,7 +4963,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) * concurrency isn't really necessary. Limit it to 1. */ EXT4_SB(sb)->rsv_conversion_wq = - alloc_workqueue("ext4-rsv-conversion", WQ_MEM_RECLAIM | WQ_UNBOUND, 1); + alloc_workqueue("ext4-rsv-conversion", WQ_MEM_RECLAIM | WQ_UNBOUND | __WQ_ORDERED, 0); if (!EXT4_SB(sb)->rsv_conversion_wq) { printk(KERN_ERR "EXT4-fs: failed to create workqueue\n"); ret = -ENOMEM;