From patchwork Tue Mar 8 12:24:41 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Henriques X-Patchwork-Id: 594063 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) by ozlabs.org (Postfix) with ESMTP id B87B0140BB4; Tue, 8 Mar 2016 23:24:57 +1100 (AEDT) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from ) id 1adGh4-0006Xe-Nw; Tue, 08 Mar 2016 12:24:54 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.76) (envelope-from ) id 1adGgs-0006RB-Ig for kernel-team@lists.ubuntu.com; Tue, 08 Mar 2016 12:24:42 +0000 Received: from av-217-129-142-138.netvisao.pt ([217.129.142.138] helo=localhost) by youngberry.canonical.com with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.76) (envelope-from ) id 1adGgs-0005OI-2n; Tue, 08 Mar 2016 12:24:42 +0000 From: Luis Henriques To: Eryu Guan Subject: [3.16.y-ckt stable] Patch "ext4: don't read blocks from disk after extents being swapped" has been added to the 3.16.y-ckt tree Date: Tue, 8 Mar 2016 12:24:41 +0000 Message-Id: <1457439881-19010-1-git-send-email-luis.henriques@canonical.com> X-Extended-Stable: 3.16 Cc: kernel-team@lists.ubuntu.com, Theodore Ts'o X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.14 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: kernel-team-bounces@lists.ubuntu.com This is a note to let you know that I have just added a patch titled ext4: don't read blocks from disk after extents being swapped to the linux-3.16.y-queue branch of the 3.16.y-ckt extended stable tree which can be found at: http://kernel.ubuntu.com/git/ubuntu/linux.git/log/?h=linux-3.16.y-queue This patch is scheduled to be released in version 3.16.7-ckt26. If you, or anyone else, feels it should not be added to this tree, please reply to this email. For more information about the 3.16.y-ckt tree, see https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable Thanks. -Luis ---8<------------------------------------------------------------ From ce6288ccdf8e0c8c24bf3fc0375608f37de19c7e Mon Sep 17 00:00:00 2001 From: Eryu Guan Date: Fri, 12 Feb 2016 01:20:43 -0500 Subject: ext4: don't read blocks from disk after extents being swapped commit bcff24887d00bce102e0857d7b0a8c44a40f53d1 upstream. I notice ext4/307 fails occasionally on ppc64 host, reporting md5 checksum mismatch after moving data from original file to donor file. The reason is that move_extent_per_page() calls __block_write_begin() and block_commit_write() to write saved data from original inode blocks to donor inode blocks, but __block_write_begin() not only maps buffer heads but also reads block content from disk if the size is not block size aligned. At this time the physical block number in mapped buffer head is pointing to the donor file not the original file, and that results in reading wrong data to page, which get written to disk in following block_commit_write call. This also can be reproduced by the following script on 1k block size ext4 on x86_64 host: mnt=/mnt/ext4 donorfile=$mnt/donor testfile=$mnt/testfile e4compact=~/xfstests/src/e4compact rm -f $donorfile $testfile # reserve space for donor file, written by 0xaa and sync to disk to # avoid EBUSY on EXT4_IOC_MOVE_EXT xfs_io -fc "pwrite -S 0xaa 0 1m" -c "fsync" $donorfile # create test file written by 0xbb xfs_io -fc "pwrite -S 0xbb 0 1023" -c "fsync" $testfile # compute initial md5sum md5sum $testfile | tee md5sum.txt # drop cache, force e4compact to read data from disk echo 3 > /proc/sys/vm/drop_caches # test defrag echo "$testfile" | $e4compact -i -v -f $donorfile # check md5sum md5sum -c md5sum.txt Fix it by creating & mapping buffer heads only but not reading blocks from disk, because all the data in page is guaranteed to be up-to-date in mext_page_mkuptodate(). Signed-off-by: Eryu Guan Signed-off-by: Theodore Ts'o [ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques --- fs/ext4/move_extent.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index 2484c7ec6a72..146f8cd627b2 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -915,10 +915,11 @@ move_extent_per_page(struct file *o_filp, struct inode *donor_inode, unsigned long blocksize = orig_inode->i_sb->s_blocksize; unsigned int w_flags = 0; unsigned int tmp_data_size, data_size, replaced_size; - int err2, jblocks, retries = 0; + int i, err2, jblocks, retries = 0; int replaced_count = 0; int from = data_offset_in_page << orig_inode->i_blkbits; int blocks_per_page = PAGE_CACHE_SIZE >> orig_inode->i_blkbits; + struct buffer_head *bh = NULL; /* * It needs twice the amount of ordinary journal buffers because @@ -1027,8 +1028,16 @@ data_copy: } /* Perform all necessary steps similar write_begin()/write_end() * but keeping in mind that i_size will not change */ - *err = __block_write_begin(pagep[0], from, replaced_size, - ext4_get_block); + if (!page_has_buffers(pagep[0])) + create_empty_buffers(pagep[0], 1 << orig_inode->i_blkbits, 0); + bh = page_buffers(pagep[0]); + for (i = 0; i < data_offset_in_page; i++) + bh = bh->b_this_page; + for (i = 0; i < block_len_in_page; i++) { + *err = ext4_get_block(orig_inode, orig_blk_offset + i, bh, 0); + if (*err < 0) + break; + } if (!*err) *err = block_commit_write(pagep[0], from, from + replaced_size);