From patchwork Thu Apr 6 13:28:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Baokun Li X-Patchwork-Id: 1766072 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=ozlabs.org (client-ip=2404:9400:2221:ea00::3; helo=gandalf.ozlabs.org; envelope-from=srs0=vn0o=75=vger.kernel.org=linux-ext4-owner@ozlabs.org; receiver=) Received: from gandalf.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Psj5d0X2Xz1yZF for ; Thu, 6 Apr 2023 23:29:25 +1000 (AEST) Received: from gandalf.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by gandalf.ozlabs.org (Postfix) with ESMTP id 4Psj5d05xFz4x5c for ; Thu, 6 Apr 2023 23:29:25 +1000 (AEST) Received: by gandalf.ozlabs.org (Postfix) id 4Psj5d03rfz4xDt; Thu, 6 Apr 2023 23:29:25 +1000 (AEST) Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: gandalf.ozlabs.org; dmarc=fail (p=quarantine dis=none) header.from=huawei.com Authentication-Results: gandalf.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=2620:137:e000::1:20; helo=out1.vger.email; envelope-from=linux-ext4-owner@vger.kernel.org; receiver=) Received: from out1.vger.email (out1.vger.email [IPv6:2620:137:e000::1:20]) by gandalf.ozlabs.org (Postfix) with ESMTP id 4Psj5d00m4z4x5c for ; Thu, 6 Apr 2023 23:29:24 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238518AbjDFN3W (ORCPT ); Thu, 6 Apr 2023 09:29:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58874 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238523AbjDFN3N (ORCPT ); Thu, 6 Apr 2023 09:29:13 -0400 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 72DED65A9; Thu, 6 Apr 2023 06:29:11 -0700 (PDT) Received: from dggpeml500021.china.huawei.com (unknown [172.30.72.57]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4Psj2L16N8zKx5m; Thu, 6 Apr 2023 21:26:34 +0800 (CST) Received: from huawei.com (10.175.127.227) by dggpeml500021.china.huawei.com (7.185.36.21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 6 Apr 2023 21:29:03 +0800 From: Baokun Li To: CC: , , , , , , , , Subject: [PATCH v2 1/2] ext4: only update i_reserved_data_blocks on successful block allocation Date: Thu, 6 Apr 2023 21:28:33 +0800 Message-ID: <20230406132834.1669710-2-libaokun1@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230406132834.1669710-1-libaokun1@huawei.com> References: <20230406132834.1669710-1-libaokun1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To dggpeml500021.china.huawei.com (7.185.36.21) X-CFilter-Loop: Reflected X-Spam-Status: No, score=-2.3 required=5.0 tests=RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org In our fault injection test, we create an ext4 file, migrate it to non-extent based file, then punch a hole and finally trigger a WARN_ON in the ext4_da_update_reserve_space(): EXT4-fs warning (device sda): ext4_da_update_reserve_space:369: ino 14, used 11 with only 10 reserved data blocks When writing back a non-extent based file, if we enable delalloc, the number of reserved blocks will be subtracted from the number of blocks mapped by ext4_ind_map_blocks(), and the extent status tree will be updated. We update the extent status tree by first removing the old extent_status and then inserting the new extent_status. If the block range we remove happens to be in an extent, then we need to allocate another extent_status with ext4_es_alloc_extent(). use old to remove to add new |----------|------------|------------| old extent_status The problem is that the allocation of a new extent_status failed due to a fault injection, and __es_shrink() did not get free memory, resulting in a return of -ENOMEM. Then do_writepages() retries after receiving -ENOMEM, we map to the same extent again, and the number of reserved blocks is again subtracted from the number of blocks in that extent. Since the blocks in the same extent are subtracted twice, we end up triggering WARN_ON at ext4_da_update_reserve_space() because used > ei->i_reserved_data_blocks. For non-extent based file, we update the number of reserved blocks after ext4_ind_map_blocks() is executed, which causes a problem that when we call ext4_ind_map_blocks() to create a block, it doesn't always create a block, but we always reduce the number of reserved blocks. So we move the logic for updating reserved blocks to ext4_ind_map_blocks() to ensure that the number of reserved blocks is updated only after we do succeed in allocating some new blocks. Fixes: 5f634d064c70 ("ext4: Fix quota accounting error with fallocate") Signed-off-by: Baokun Li --- V1->V2: Modify the patch description and add the Fixes tag. fs/ext4/indirect.c | 8 ++++++++ fs/ext4/inode.c | 10 ---------- 2 files changed, 8 insertions(+), 10 deletions(-) diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c index c68bebe7ff4b..9acab70ddf5e 100644 --- a/fs/ext4/indirect.c +++ b/fs/ext4/indirect.c @@ -651,6 +651,14 @@ int ext4_ind_map_blocks(handle_t *handle, struct inode *inode, ext4_update_inode_fsync_trans(handle, inode, 1); count = ar.len; + + /* + * Update reserved blocks/metadata blocks after successful block + * allocation which had been deferred till now. + */ + if ((count > 0) && (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)) + ext4_da_update_reserve_space(inode, count, 1); + got_it: map->m_flags |= EXT4_MAP_MAPPED; map->m_pblk = le32_to_cpu(chain[depth-1].key); diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 97eb728cb958..33ae92f0ddfb 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -659,16 +659,6 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode, */ ext4_clear_inode_state(inode, EXT4_STATE_EXT_MIGRATE); } - - /* - * Update reserved blocks/metadata blocks after successful - * block allocation which had been deferred till now. We don't - * support fallocate for non extent files. So we can update - * reserve space here. - */ - if ((retval > 0) && - (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)) - ext4_da_update_reserve_space(inode, retval, 1); } if (retval > 0) { From patchwork Thu Apr 6 13:28:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Baokun Li X-Patchwork-Id: 1766070 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=ozlabs.org (client-ip=2404:9400:2221:ea00::3; helo=gandalf.ozlabs.org; envelope-from=srs0=vn0o=75=vger.kernel.org=linux-ext4-owner@ozlabs.org; receiver=) Received: from gandalf.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Psj5Z1z5Jz1yZF for ; Thu, 6 Apr 2023 23:29:21 +1000 (AEST) Received: from gandalf.ozlabs.org (gandalf.ozlabs.org [150.107.74.76]) by gandalf.ozlabs.org (Postfix) with ESMTP id 4Psj5X6RHYz4x1d for ; Thu, 6 Apr 2023 23:29:20 +1000 (AEST) Received: by gandalf.ozlabs.org (Postfix) id 4Psj5X6P20z4x5c; Thu, 6 Apr 2023 23:29:20 +1000 (AEST) Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: gandalf.ozlabs.org; dmarc=fail (p=quarantine dis=none) header.from=huawei.com Authentication-Results: gandalf.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=2620:137:e000::1:20; helo=out1.vger.email; envelope-from=linux-ext4-owner@vger.kernel.org; receiver=) Received: from out1.vger.email (out1.vger.email [IPv6:2620:137:e000::1:20]) by gandalf.ozlabs.org (Postfix) with ESMTP id 4Psj5X1czfz4x1d for ; Thu, 6 Apr 2023 23:29:20 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237316AbjDFN3T (ORCPT ); Thu, 6 Apr 2023 09:29:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59128 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238519AbjDFN3N (ORCPT ); Thu, 6 Apr 2023 09:29:13 -0400 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D2AB5659E; Thu, 6 Apr 2023 06:29:11 -0700 (PDT) Received: from dggpeml500021.china.huawei.com (unknown [172.30.72.57]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4Psj2L4nyDzKwxp; Thu, 6 Apr 2023 21:26:34 +0800 (CST) Received: from huawei.com (10.175.127.227) by dggpeml500021.china.huawei.com (7.185.36.21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 6 Apr 2023 21:29:04 +0800 From: Baokun Li To: CC: , , , , , , , , Subject: [PATCH v2 2/2] ext4: use __GFP_NOFAIL if allocating extents_status cannot fail Date: Thu, 6 Apr 2023 21:28:34 +0800 Message-ID: <20230406132834.1669710-3-libaokun1@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230406132834.1669710-1-libaokun1@huawei.com> References: <20230406132834.1669710-1-libaokun1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To dggpeml500021.china.huawei.com (7.185.36.21) X-CFilter-Loop: Reflected X-Spam-Status: No, score=-2.3 required=5.0 tests=RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org If extent status tree update fails, we have inconsistency between what is stored in the extent status tree and what is stored on disk. And that can cause even data corruption issues in some cases. In the extent status tree, we have extents which we can just drop without issues and extents we must not drop - this depends on the extent's status - currently ext4_es_is_delayed() extents must stay, others may be dropped. For extents that cannot be dropped we use __GFP_NOFAIL to allocate memory. A helper function is also added to help determine if the current extent can be dropped, although only ext4_es_is_delayed() extents cannot be dropped currently. In addition, with the above logic, the undo operation in __es_remove_extent that may cause inconsistency if the split extent fails is unnecessary, so we remove it as well. Suggested-by: Jan Kara Signed-off-by: Baokun Li --- V1->V2: Add the patch 2 as suggested by Jan Kara. fs/ext4/extents_status.c | 36 +++++++++++++++++++++++++++++------- 1 file changed, 29 insertions(+), 7 deletions(-) diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c index 7bc221038c6c..8eed17f35b11 100644 --- a/fs/ext4/extents_status.c +++ b/fs/ext4/extents_status.c @@ -448,12 +448,29 @@ static void ext4_es_list_del(struct inode *inode) spin_unlock(&sbi->s_es_lock); } +/* + * Helper function to help determine if memory allocation for this + * extent_status is allowed to fail. + */ +static inline bool ext4_es_alloc_should_nofail(struct extent_status *es) +{ + if (ext4_es_is_delayed(es)) + return true; + + return false; +} + static struct extent_status * ext4_es_alloc_extent(struct inode *inode, ext4_lblk_t lblk, ext4_lblk_t len, - ext4_fsblk_t pblk) + ext4_fsblk_t pblk, int nofail) { struct extent_status *es; - es = kmem_cache_alloc(ext4_es_cachep, GFP_ATOMIC); + gfp_t gfp_flags = GFP_ATOMIC; + + if (nofail) + gfp_flags |= __GFP_NOFAIL; + + es = kmem_cache_alloc(ext4_es_cachep, gfp_flags); if (es == NULL) return NULL; es->es_lblk = lblk; @@ -792,9 +809,16 @@ static int __es_insert_extent(struct inode *inode, struct extent_status *newes) } es = ext4_es_alloc_extent(inode, newes->es_lblk, newes->es_len, - newes->es_pblk); - if (!es) - return -ENOMEM; + newes->es_pblk, 0); + if (!es) { + /* Use GFP_NOFAIL if the allocation cannot fail. */ + if (ext4_es_alloc_should_nofail(newes)) + es = ext4_es_alloc_extent(inode, newes->es_lblk, + newes->es_len, newes->es_pblk, 1); + else + return -ENOMEM; + } + rb_link_node(&es->rb_node, parent, p); rb_insert_color(&es->rb_node, &tree->root); @@ -1349,8 +1373,6 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk, ext4_es_status(&orig_es)); err = __es_insert_extent(inode, &newes); if (err) { - es->es_lblk = orig_es.es_lblk; - es->es_len = orig_es.es_len; if ((err == -ENOMEM) && __es_shrink(EXT4_SB(inode->i_sb), 128, EXT4_I(inode)))