From patchwork Wed Nov 18 01:34:32 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daeho Jeong
X-Patchwork-Id: 545836
From: Daeho Jeong
To: tytso@mit.edu, linux-ext4@vger.kernel.org, daeho.jeong@samsung.com
Subject: [PATCH 1/3] ext4: handle unwritten or delalloc buffers before enabling per-file data journaling
Date: Wed, 18 Nov 2015 10:34:32 +0900
Message-id: <1447810474-14840-1-git-send-email-daeho.jeong@samsung.com>
X-Mailer: git-send-email 1.7.9.5
Sender: linux-ext4-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-ext4@vger.kernel.org

We already allocate delalloc blocks before changing the inode mode into
"per-file data journal" mode to prevent delalloc blocks from
remaining not allocated, but another issue concerning the "BH_Unwritten"
status still exists. For example, fallocate() changes the status of
several buffers into "BH_Unwritten", but these buffers cannot be
processed by ext4_alloc_da_blocks(). So, they still remain in unwritten
status after per-file data journaling is enabled and they cannot be
changed into written status any more. If they are journaled and
eventually checkpointed, these unwritten buffers will cause a kernel
panic at the BUG_ON() below in submit_bh_wbc() when they are submitted
during checkpointing.

static int submit_bh_wbc(int rw, struct buffer_head *bh,...
{
	...
	BUG_ON(buffer_unwritten(bh));

Moreover, when the "dioread_nolock" option is enabled, the status of a
buffer is changed into "BH_Unwritten" after write_begin() completes and
the "BH_Unwritten" status is cleared only after the I/O is done.
Therefore, if a buffer's status is changed into unwritten but the
buffer's I/O is not submitted and completed, it can cause the same
problem after enabling per-file data journaling. You can easily
reproduce this bug by executing the following command.

./kvm-xfstests -C 10000 -m nodelalloc,dioread_nolock generic/269

To resolve these problems and define a boundary between the previous
mode and per-file data journaling mode, we need to flush and wait for
all the I/O of buffers of a file before enabling per-file data
journaling of the file.

Signed-off-by: Daeho Jeong
Reviewed-by: Jan Kara
---
 fs/ext4/inode.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 612fbcf..1f9458e 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5168,9 +5168,14 @@ int ext4_change_inode_journal_flag(struct inode *inode, int val)
 	 * be allocated any more. even more truncate on delalloc blocks
 	 * could trigger BUG by flushing delalloc blocks in journal.
 	 * There is no delalloc block in non-journal data mode.
+	 * We also have to handle unwritten buffers generated by
+	 * fallocate() and dioread_nolock option. Once per-file data
+	 * journaling is enabled, unwritten buffers will remain in
+	 * unwritten status forever and they will be the seeds of
+	 * kernel panic when they are checkpointed.
 	 */
-	if (val && test_opt(inode->i_sb, DELALLOC)) {
-		err = ext4_alloc_da_blocks(inode);
+	if (val) {
+		err = filemap_write_and_wait(inode->i_mapping);
 		if (err < 0)
 			return err;
 	}
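
For readers who want to see the fallocate() case from the commit message
from user space, here is a minimal sketch. It is not part of the patch.
The helper names with_data_journal() and prealloc_then_journal() are
hypothetical, and the sketch assumes a Linux system whose <linux/fs.h>
provides FS_IOC_GETFLAGS, FS_IOC_SETFLAGS and FS_JOURNAL_DATA_FL. It
preallocates a range (which ext4 would back with unwritten extents) and
then requests per-file data journaling, roughly what "chattr +j" does.
Setting the flag usually requires privilege and fails on filesystems
without support, and whether an unpatched kernel actually reaches the
BUG_ON() depends on write and checkpoint timing, so treat this only as
an illustration of the call sequence.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>      /* open(), fallocate() */
#include <linux/fs.h>   /* FS_IOC_GETFLAGS, FS_IOC_SETFLAGS, FS_JOURNAL_DATA_FL */
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: pure flag arithmetic, no filesystem access. */
static int with_data_journal(int flags, int enable)
{
	return enable ? (flags | FS_JOURNAL_DATA_FL)
		      : (flags & ~FS_JOURNAL_DATA_FL);
}

/*
 * Hypothetical helper: preallocate len bytes at offset 0, then ask the
 * filesystem to journal this file's data. Returns 0 on success or a
 * negative errno value on the first failing step.
 */
static int prealloc_then_journal(const char *path, off_t len)
{
	int fd, flags = 0, ret = 0;

	fd = open(path, O_RDWR | O_CREAT, 0644);
	if (fd < 0)
		return -errno;

	/* Mode 0 preallocates without writing any data. */
	if (fallocate(fd, 0, 0, len) < 0 ||
	    ioctl(fd, FS_IOC_GETFLAGS, &flags) < 0) {
		ret = -errno;
	} else {
		flags = with_data_journal(flags, 1);
		if (ioctl(fd, FS_IOC_SETFLAGS, &flags) < 0)
			ret = -errno;
	}

	close(fd);
	return ret;
}
```

A caller on a scratch ext4 mount could invoke
prealloc_then_journal("/mnt/scratch/testfile", 1 << 20), where the path
is only an example, and inspect the return value. With the patch
applied, the flag change goes through filemap_write_and_wait() first,
which is the flush-and-wait step the commit message describes.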