Message ID: 1503830683-21455-1-git-send-email-amir73il@gmail.com
State: Not Applicable, archived
Series: [RFC] fstest: regression test for ext4 crash consistency bug
On 2017/08/27 18:44, Amir Goldstein wrote: > This test is motivated by a bug found in ext4 during random crash > consistency tests. > > This test uses device mapper flakey target to demonstrate the bug > found using device mapper log-writes target. > > Signed-off-by: Amir Goldstein <amir73il@gmail.com> > --- > > Ted, > > While working on crash consistency xfstests [1], I stubmled on what > appeared to be an ext4 crash consistency bug. > > The tests I used rely on the log-writes dm target code written > by Josef Bacik, which had little exposure to the wide community > as far as I know. I wanted to prove to myself that the found > inconsistency was not due to a test bug, so I bisected the failed > test to the minimal operations that trigger the failure and wrote > a small independent test to reproduce the issue using dm flakey target. > > The following fsck error is reliably reproduced by replaying some fsx ops > on overlapping file regions, then emulating a crash, followed by mount, > umount and fsck -nf: > > ./ltp/fsx -d --replay-ops /tmp/8995.fsxops /mnt/scratch/testfile > 1 write 0x137dd thru 0x21445 (0xdc69 bytes) > 2 falloc from 0xb531 to 0x16ade (0xb5ad bytes) > 3 collapse from 0x1c000 to 0x20000, (0x4000 bytes) > 4 write 0x3e5ec thru 0x3ffff (0x1a14 bytes) > 5 zero from 0x20fac to 0x27d48, (0x6d9c bytes) > 6 mapwrite 0x216ad thru 0x23dfb (0x274f bytes) > All 7 operations completed A-OK! > _check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is inconsistent > *** fsck.ext4 output *** > fsck from util-linux 2.27.1 > e2fsck 1.42.13 (17-May-2015) > Pass 1: Checking inodes, blocks, and sizes > Inode 12, end of extent exceeds allowed value > (logical block 33, physical block 33441, len 7) > Clear? no > Inode 12, i_blocks is 184, should be 128. Fix? no Hi Amir, I always get the following output when running your xfstests test case 501. --------------------------------------------------------------------------- e2fsck 1.42.9 (28-Dec-2013) Pass 1: Checking inodes, blocks, and sizes Inode 12, i_size is 147456, should be 163840. Fix? no --------------------------------------------------------------------------- Could you tell me how to get the expected output as you reported? Thanks, Xiao Yang > Note that the inconsistency is "applied" by journal replay during mount. > fsck -nf before mount does not report any errors. > > I did not intend for this test to be merged as is, but rather to be used > by ext4 developers to analyze the problem and then re-write the test with > more comments and less arbitrary offset/length values. > > P.S.: crash consistency tests also reliably reproduce a btrfs fsck error. > a detailed report with I/O recording was sent to Josef. > P.S.2: crash consistency tests report file data checksum errors on xfs > after fsync+crash, but I still need to prove the reliability of > these reports. > > [1] https://github.com/amir73il/xfstests/commits/dm-log-writes > > tests/generic/501 | 80 +++++++++++++++++++++++++++++++++++++++++++++++++++ > tests/generic/501.out | 2 ++ > tests/generic/group | 1 + > 3 files changed, 83 insertions(+) > create mode 100755 tests/generic/501 > create mode 100644 tests/generic/501.out > > diff --git a/tests/generic/501 b/tests/generic/501 > new file mode 100755 > index 0000000..ccb513d > --- /dev/null > +++ b/tests/generic/501 > @@ -0,0 +1,80 @@ > +#! /bin/bash > +# FS QA Test No. 501 > +# > +# This test is motivated by a bug found in ext4 during random crash > +# consistency tests. 
> +# > +#----------------------------------------------------------------------- > +# Copyright (C) 2017 CTERA Networks. All Rights Reserved. > +# Author: Amir Goldstein <amir73il@gmail.com> > +# > +# This program is free software; you can redistribute it and/or > +# modify it under the terms of the GNU General Public License as > +# published by the Free Software Foundation. > +# > +# This program is distributed in the hope that it would be useful, > +# but WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +# GNU General Public License for more details. > +# > +# You should have received a copy of the GNU General Public License > +# along with this program; if not, write the Free Software Foundation, > +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA > +#----------------------------------------------------------------------- > +# > + > +seq=`basename $0` > +seqres=$RESULT_DIR/$seq > +echo "QA output created by $seq" > + > +here=`pwd` > +tmp=/tmp/$$ > +status=1 # failure is the default! > + > +_cleanup() > +{ > + _cleanup_flakey > + cd / > + rm -f $tmp.* > +} > +trap "_cleanup; exit \$status" 0 1 2 3 15 > + > +# get standard environment, filters and checks > +. ./common/rc > +. ./common/filter > +. ./common/dmflakey > + > +# real QA test starts here > +_supported_fs generic > +_supported_os Linux > +_require_scratch > +_require_dm_target flakey > +_require_metadata_journaling $SCRATCH_DEV > + > +rm -f $seqres.full > + > +_scratch_mkfs >> $seqres.full 2>&1 > + > +_init_flakey > +_mount_flakey > + > +fsxops=$tmp.fsxops > +cat <<EOF > $fsxops > +write 0x137dd 0xdc69 0x0 > +fallocate 0xb531 0xb5ad 0x21446 > +collapse_range 0x1c000 0x4000 0x21446 > +write 0x3e5ec 0x1a14 0x21446 > +zero_range 0x20fac 0x6d9c 0x40000 keep_size > +mapwrite 0x216ad 0x274f 0x40000 > +EOF > +run_check $here/ltp/fsx -d --replay-ops $fsxops $SCRATCH_MNT/testfile > + > +_flakey_drop_and_remount > +_unmount_flakey > +_cleanup_flakey > +_check_scratch_fs > + > +echo "Silence is golden" > + > +status=0 > +exit > diff --git a/tests/generic/501.out b/tests/generic/501.out > new file mode 100644 > index 0000000..00133b6 > --- /dev/null > +++ b/tests/generic/501.out > @@ -0,0 +1,2 @@ > +QA output created by 501 > +Silence is golden > diff --git a/tests/generic/group b/tests/generic/group > index 2396b72..bb870f2 100644 > --- a/tests/generic/group > +++ b/tests/generic/group > @@ -454,3 +454,4 @@ > 449 auto quick acl enospc > 450 auto quick rw > 500 auto log replay > +501 auto quick metadata
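In plain shell, the sequence the test automates looks roughly like the following. This is an untested sketch, not the test itself: /dev/sdb1, /mnt/scratch and the fsx path are placeholders for the local setup, and the dm-flakey table lines mirror what the common/dmflakey helpers build, so the exact helper behaviour may differ slightly.

#!/bin/bash
# Rough sketch of what generic/501 does with the dmflakey helpers.
dev=/dev/sdb1          # assumed scratch device
mnt=/mnt/scratch       # assumed scratch mount point
size=$(blockdev --getsz "$dev")

mkfs.ext4 -F "$dev"
# Stack a dm-flakey target over the scratch device, always "up" for now.
dmsetup create flakey-scratch --table "0 $size flakey $dev 0 180 0"
mount /dev/mapper/flakey-scratch "$mnt"

cat > /tmp/ops.fsxops <<EOF
write 0x137dd 0xdc69 0x0
fallocate 0xb531 0xb5ad 0x21446
collapse_range 0x1c000 0x4000 0x21446
write 0x3e5ec 0x1a14 0x21446
zero_range 0x20fac 0x6d9c 0x40000 keep_size
mapwrite 0x216ad 0x274f 0x40000
EOF
./ltp/fsx -d --replay-ops /tmp/ops.fsxops "$mnt/testfile"

# "Crash": reload the flakey table so that all new writes are silently
# dropped, then unmount.  Whatever the unmount writes never reaches disk.
dmsetup suspend flakey-scratch
dmsetup load flakey-scratch --table "0 $size flakey $dev 0 0 180 1 drop_writes"
dmsetup resume flakey-scratch
umount "$mnt"

# Bring the device back, mount once (journal replay happens here),
# unmount, tear down the target and check the filesystem.
dmsetup suspend flakey-scratch
dmsetup load flakey-scratch --table "0 $size flakey $dev 0 180 0"
dmsetup resume flakey-scratch
mount /dev/mapper/flakey-scratch "$mnt"
umount "$mnt"
dmsetup remove flakey-scratch
fsck.ext4 -nf "$dev"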
On Mon, Sep 25, 2017 at 12:49 PM, Xiao Yang <yangx.jy@cn.fujitsu.com> wrote: > On 2017/08/27 18:44, Amir Goldstein wrote: >> This test is motivated by a bug found in ext4 during random crash >> consistency tests. >> >> This test uses device mapper flakey target to demonstrate the bug >> found using device mapper log-writes target. >> >> Signed-off-by: Amir Goldstein <amir73il@gmail.com> >> --- >> >> Ted, >> >> While working on crash consistency xfstests [1], I stubmled on what >> appeared to be an ext4 crash consistency bug. >> >> The tests I used rely on the log-writes dm target code written >> by Josef Bacik, which had little exposure to the wide community >> as far as I know. I wanted to prove to myself that the found >> inconsistency was not due to a test bug, so I bisected the failed >> test to the minimal operations that trigger the failure and wrote >> a small independent test to reproduce the issue using dm flakey target. >> >> The following fsck error is reliably reproduced by replaying some fsx ops >> on overlapping file regions, then emulating a crash, followed by mount, >> umount and fsck -nf: >> >> ./ltp/fsx -d --replay-ops /tmp/8995.fsxops /mnt/scratch/testfile >> 1 write 0x137dd thru 0x21445 (0xdc69 bytes) >> 2 falloc from 0xb531 to 0x16ade (0xb5ad bytes) >> 3 collapse from 0x1c000 to 0x20000, (0x4000 bytes) >> 4 write 0x3e5ec thru 0x3ffff (0x1a14 bytes) >> 5 zero from 0x20fac to 0x27d48, (0x6d9c bytes) >> 6 mapwrite 0x216ad thru 0x23dfb (0x274f bytes) >> All 7 operations completed A-OK! >> _check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is inconsistent >> *** fsck.ext4 output *** >> fsck from util-linux 2.27.1 >> e2fsck 1.42.13 (17-May-2015) >> Pass 1: Checking inodes, blocks, and sizes >> Inode 12, end of extent exceeds allowed value >> (logical block 33, physical block 33441, len 7) >> Clear? no >> Inode 12, i_blocks is 184, should be 128. Fix? no > Hi Amir, > > I always get the following output when running your xfstests test case 501. Now merged as test generic/456 > --------------------------------------------------------------------------- > e2fsck 1.42.9 (28-Dec-2013) > Pass 1: Checking inodes, blocks, and sizes > Inode 12, i_size is 147456, should be 163840. Fix? no > --------------------------------------------------------------------------- > > Could you tell me how to get the expected output as you reported? I can't say I am doing anything special, but I can say that I get the same output as you did when running the test inside kvm-xfstests. Actually, I could not reproduce ANY of the the crash consistency bugs inside kvm-xfstests. Must be something to do with different timing of IO with KVM+virtio disks?? When running on my laptop (Ubuntu 16.04 with latest kernel) on a 10G SSD volume, I always get the error reported above. I just re-verified with latest stable e2fsprogs (1.43.6). Amir.
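Since the symptom clearly differs from one setup to another, collecting the same few data points on every machine that runs the test may help narrow down the common factor. Something along these lines, with generic/456 being the merged test and SCRATCH_DEV standing in for whatever the local xfstests config points at:

cd xfstests
./check generic/456                    # run the merged test once
e2fsck -nf $SCRATCH_DEV                # repeat the read-only check by hand
e2fsck -V                              # e2fsprogs version
uname -r                               # kernel version
lsblk -o NAME,SIZE,ROTA $SCRATCH_DEV   # rotational disk vs SSD backing store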
On 2017/09/25 18:53, Amir Goldstein wrote: > On Mon, Sep 25, 2017 at 12:49 PM, Xiao Yang<yangx.jy@cn.fujitsu.com> wrote: >> On 2017/08/27 18:44, Amir Goldstein wrote: >>> This test is motivated by a bug found in ext4 during random crash >>> consistency tests. >>> >>> This test uses device mapper flakey target to demonstrate the bug >>> found using device mapper log-writes target. >>> >>> Signed-off-by: Amir Goldstein<amir73il@gmail.com> >>> --- >>> >>> Ted, >>> >>> While working on crash consistency xfstests [1], I stubmled on what >>> appeared to be an ext4 crash consistency bug. >>> >>> The tests I used rely on the log-writes dm target code written >>> by Josef Bacik, which had little exposure to the wide community >>> as far as I know. I wanted to prove to myself that the found >>> inconsistency was not due to a test bug, so I bisected the failed >>> test to the minimal operations that trigger the failure and wrote >>> a small independent test to reproduce the issue using dm flakey target. >>> >>> The following fsck error is reliably reproduced by replaying some fsx ops >>> on overlapping file regions, then emulating a crash, followed by mount, >>> umount and fsck -nf: >>> >>> ./ltp/fsx -d --replay-ops /tmp/8995.fsxops /mnt/scratch/testfile >>> 1 write 0x137dd thru 0x21445 (0xdc69 bytes) >>> 2 falloc from 0xb531 to 0x16ade (0xb5ad bytes) >>> 3 collapse from 0x1c000 to 0x20000, (0x4000 bytes) >>> 4 write 0x3e5ec thru 0x3ffff (0x1a14 bytes) >>> 5 zero from 0x20fac to 0x27d48, (0x6d9c bytes) >>> 6 mapwrite 0x216ad thru 0x23dfb (0x274f bytes) >>> All 7 operations completed A-OK! >>> _check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is inconsistent >>> *** fsck.ext4 output *** >>> fsck from util-linux 2.27.1 >>> e2fsck 1.42.13 (17-May-2015) >>> Pass 1: Checking inodes, blocks, and sizes >>> Inode 12, end of extent exceeds allowed value >>> (logical block 33, physical block 33441, len 7) >>> Clear? no >>> Inode 12, i_blocks is 184, should be 128. Fix? no >> Hi Amir, >> >> I always get the following output when running your xfstests test case 501. > Now merged as test generic/456 > >> --------------------------------------------------------------------------- >> e2fsck 1.42.9 (28-Dec-2013) >> Pass 1: Checking inodes, blocks, and sizes >> Inode 12, i_size is 147456, should be 163840. Fix? no >> --------------------------------------------------------------------------- >> >> Could you tell me how to get the expected output as you reported? > I can't say I am doing anything special, but I can say that I get the > same output as you did when running the test inside kvm-xfstests. > Actually, I could not reproduce ANY of the the crash consistency bugs > inside kvm-xfstests. Must be something to do with different timing of > IO with KVM+virtio disks?? > > When running on my laptop (Ubuntu 16.04 with latest kernel) > on a 10G SSD volume, I always get the error reported above. > I just re-verified with latest stable e2fsprogs (1.43.6). Hi Amir, I tested generic/456 with KVM+virtio disks and SATA volumes on some kernels (including v3.10.0, the latest kernel), but i still got the same output as i reported. Could you determine whether the two different outputs are caused by the same bug or not ? Thanks, Xiao Yang. > Amir. > -- > To unsubscribe from this list: send the line "unsubscribe fstests" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > . >
On Tue, Sep 26, 2017 at 1:45 PM, Xiao Yang <yangx.jy@cn.fujitsu.com> wrote: > On 2017/09/25 18:53, Amir Goldstein wrote: >> >> On Mon, Sep 25, 2017 at 12:49 PM, Xiao Yang<yangx.jy@cn.fujitsu.com> >> wrote: >>> >>> On 2017/08/27 18:44, Amir Goldstein wrote: >>>> >>>> This test is motivated by a bug found in ext4 during random crash >>>> consistency tests. >>>> >>>> This test uses device mapper flakey target to demonstrate the bug >>>> found using device mapper log-writes target. >>>> >>>> Signed-off-by: Amir Goldstein<amir73il@gmail.com> >>>> --- >>>> >>>> Ted, >>>> >>>> While working on crash consistency xfstests [1], I stubmled on what >>>> appeared to be an ext4 crash consistency bug. >>>> >>>> The tests I used rely on the log-writes dm target code written >>>> by Josef Bacik, which had little exposure to the wide community >>>> as far as I know. I wanted to prove to myself that the found >>>> inconsistency was not due to a test bug, so I bisected the failed >>>> test to the minimal operations that trigger the failure and wrote >>>> a small independent test to reproduce the issue using dm flakey target. >>>> >>>> The following fsck error is reliably reproduced by replaying some fsx >>>> ops >>>> on overlapping file regions, then emulating a crash, followed by mount, >>>> umount and fsck -nf: >>>> >>>> ./ltp/fsx -d --replay-ops /tmp/8995.fsxops /mnt/scratch/testfile >>>> 1 write 0x137dd thru 0x21445 (0xdc69 bytes) >>>> 2 falloc from 0xb531 to 0x16ade (0xb5ad bytes) >>>> 3 collapse from 0x1c000 to 0x20000, (0x4000 bytes) >>>> 4 write 0x3e5ec thru 0x3ffff (0x1a14 bytes) >>>> 5 zero from 0x20fac to 0x27d48, (0x6d9c bytes) >>>> 6 mapwrite 0x216ad thru 0x23dfb (0x274f bytes) >>>> All 7 operations completed A-OK! >>>> _check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is >>>> inconsistent >>>> *** fsck.ext4 output *** >>>> fsck from util-linux 2.27.1 >>>> e2fsck 1.42.13 (17-May-2015) >>>> Pass 1: Checking inodes, blocks, and sizes >>>> Inode 12, end of extent exceeds allowed value >>>> (logical block 33, physical block 33441, len 7) >>>> Clear? no >>>> Inode 12, i_blocks is 184, should be 128. Fix? no >>> >>> Hi Amir, >>> >>> I always get the following output when running your xfstests test case >>> 501. >> >> Now merged as test generic/456 >> >>> >>> --------------------------------------------------------------------------- >>> e2fsck 1.42.9 (28-Dec-2013) >>> Pass 1: Checking inodes, blocks, and sizes >>> Inode 12, i_size is 147456, should be 163840. Fix? no >>> >>> --------------------------------------------------------------------------- >>> >>> Could you tell me how to get the expected output as you reported? >> >> I can't say I am doing anything special, but I can say that I get the >> same output as you did when running the test inside kvm-xfstests. >> Actually, I could not reproduce ANY of the the crash consistency bugs >> inside kvm-xfstests. Must be something to do with different timing of >> IO with KVM+virtio disks?? >> >> When running on my laptop (Ubuntu 16.04 with latest kernel) >> on a 10G SSD volume, I always get the error reported above. >> I just re-verified with latest stable e2fsprogs (1.43.6). > > Hi Amir, > > I tested generic/456 with KVM+virtio disks and SATA volumes on some kernels I don't understand. Did you also test without KVM? Otherwise I suggest that you test without KVM/virtio. > (including > v3.10.0, the latest kernel), but i still got the same output as i reported. 
> Could you determine whether the two different outputs are caused by the same
> bug or not?

No idea if those are 2 symptoms of the same bug or 2 different bugs;
I did not investigate the root cause.

Amir.
Hi Xiao, I am a student at the University of Texas at Austin. Some researchers in the computer science department at UT, myself included, have recently been working to develop a file system crash consistency test harness called CrashMonkey [1][2]. I have been working on the CrashMonkey project since it was started late last year. With CrashMonkey we have also been able to reproduce the incorrect i_size error you noted but we have not been able to reproduce the other output that Amir found. CrashMonkey works by logging and replaying operations for a workload, so it should not be sensitive to differences in timing that could be caused by things like KVM+virtio. I also did a few experiments with Amir's new xfstests test 456 (both with and without KVM and virtio) and I was unable to reproduce the output noted in the xfstest. I have not spent a lot of time looking into the cause of the bug that Amir found and it is rather unfortunate that I was unable to reproduce it with either xfstests or CrashMonkey. At any rate, CrashMonkey is still under development, so it does have some caveats. First, we are running with a fixed random seed in our default RandomPermuter (used to generate crash states) to aid development. Second, the branch with the reproduction of this ext4 regression bug in CrashMonkey [3] will yield a few false positives due to the way CrashMonkey works and how fsx runs. These false positives are due to CrashMonkey generating crash states where the directories for files used for the test have not be fsync-ed in the file system. The top of the README in the CrashMonkey branch with this bug reproduction outlines how we determined these were false positives [1] https://github.com/utsaslab/crashmonkey [2] https://www.usenix.org/conference/hotstorage17/program/presentation/martinez [3] https://github.com/utsaslab/crashmonkey/tree/ext4_regression_bug On Mon, Sep 25, 2017 at 5:53 AM, Amir Goldstein <amir73il@gmail.com> wrote: > On Mon, Sep 25, 2017 at 12:49 PM, Xiao Yang <yangx.jy@cn.fujitsu.com> wrote: >> On 2017/08/27 18:44, Amir Goldstein wrote: >>> This test is motivated by a bug found in ext4 during random crash >>> consistency tests. >>> >>> This test uses device mapper flakey target to demonstrate the bug >>> found using device mapper log-writes target. >>> >>> Signed-off-by: Amir Goldstein <amir73il@gmail.com> >>> --- >>> >>> Ted, >>> >>> While working on crash consistency xfstests [1], I stubmled on what >>> appeared to be an ext4 crash consistency bug. >>> >>> The tests I used rely on the log-writes dm target code written >>> by Josef Bacik, which had little exposure to the wide community >>> as far as I know. I wanted to prove to myself that the found >>> inconsistency was not due to a test bug, so I bisected the failed >>> test to the minimal operations that trigger the failure and wrote >>> a small independent test to reproduce the issue using dm flakey target. >>> >>> The following fsck error is reliably reproduced by replaying some fsx ops >>> on overlapping file regions, then emulating a crash, followed by mount, >>> umount and fsck -nf: >>> >>> ./ltp/fsx -d --replay-ops /tmp/8995.fsxops /mnt/scratch/testfile >>> 1 write 0x137dd thru 0x21445 (0xdc69 bytes) >>> 2 falloc from 0xb531 to 0x16ade (0xb5ad bytes) >>> 3 collapse from 0x1c000 to 0x20000, (0x4000 bytes) >>> 4 write 0x3e5ec thru 0x3ffff (0x1a14 bytes) >>> 5 zero from 0x20fac to 0x27d48, (0x6d9c bytes) >>> 6 mapwrite 0x216ad thru 0x23dfb (0x274f bytes) >>> All 7 operations completed A-OK! 
>>> _check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is inconsistent >>> *** fsck.ext4 output *** >>> fsck from util-linux 2.27.1 >>> e2fsck 1.42.13 (17-May-2015) >>> Pass 1: Checking inodes, blocks, and sizes >>> Inode 12, end of extent exceeds allowed value >>> (logical block 33, physical block 33441, len 7) >>> Clear? no >>> Inode 12, i_blocks is 184, should be 128. Fix? no >> Hi Amir, >> >> I always get the following output when running your xfstests test case 501. > > Now merged as test generic/456 > >> --------------------------------------------------------------------------- >> e2fsck 1.42.9 (28-Dec-2013) >> Pass 1: Checking inodes, blocks, and sizes >> Inode 12, i_size is 147456, should be 163840. Fix? no >> --------------------------------------------------------------------------- >> >> Could you tell me how to get the expected output as you reported? > > I can't say I am doing anything special, but I can say that I get the > same output as you did when running the test inside kvm-xfstests. > Actually, I could not reproduce ANY of the the crash consistency bugs > inside kvm-xfstests. Must be something to do with different timing of > IO with KVM+virtio disks?? > > When running on my laptop (Ubuntu 16.04 with latest kernel) > on a 10G SSD volume, I always get the error reported above. > I just re-verified with latest stable e2fsprogs (1.43.6). > > Amir.
On 2017/09/30 22:15, Ashlie Martinez wrote: > Hi Xiao, > > I am a student at the University of Texas at Austin. Some researchers > in the computer science department at UT, myself included, have > recently been working to develop a file system crash consistency test > harness called CrashMonkey [1][2]. I have been working on the > CrashMonkey project since it was started late last year. With > CrashMonkey we have also been able to reproduce the incorrect i_size > error you noted but we have not been able to reproduce the other > output that Amir found. CrashMonkey works by logging and replaying > operations for a workload, so it should not be sensitive to > differences in timing that could be caused by things like KVM+virtio. > I also did a few experiments with Amir's new xfstests test 456 (both > with and without KVM and virtio) and I was unable to reproduce the > output noted in the xfstest. I have not spent a lot of time looking > into the cause of the bug that Amir found and it is rather unfortunate > that I was unable to reproduce it with either xfstests or CrashMonkey. Hi Ashlie, Thanks for your detailed comments. 1) Do you think the output that Amir noted in xfstests is a false positive? 2) About the output that both i and you reproduced, did you look into it and find its root cause? Thanks, Xiao Yang > At any rate, CrashMonkey is still under development, so it does have > some caveats. First, we are running with a fixed random seed in our > default RandomPermuter (used to generate crash states) to aid > development. Second, the branch with the reproduction of this ext4 > regression bug in CrashMonkey [3] will yield a few false positives due > to the way CrashMonkey works and how fsx runs. These false positives > are due to CrashMonkey generating crash states where the directories > for files used for the test have not be fsync-ed in the file system. > The top of the README in the CrashMonkey branch with this bug > reproduction outlines how we determined these were false positives > > [1] https://github.com/utsaslab/crashmonkey > [2] https://www.usenix.org/conference/hotstorage17/program/presentation/martinez > [3] https://github.com/utsaslab/crashmonkey/tree/ext4_regression_bug > > > On Mon, Sep 25, 2017 at 5:53 AM, Amir Goldstein<amir73il@gmail.com> wrote: >> On Mon, Sep 25, 2017 at 12:49 PM, Xiao Yang<yangx.jy@cn.fujitsu.com> wrote: >>> On 2017/08/27 18:44, Amir Goldstein wrote: >>>> This test is motivated by a bug found in ext4 during random crash >>>> consistency tests. >>>> >>>> This test uses device mapper flakey target to demonstrate the bug >>>> found using device mapper log-writes target. >>>> >>>> Signed-off-by: Amir Goldstein<amir73il@gmail.com> >>>> --- >>>> >>>> Ted, >>>> >>>> While working on crash consistency xfstests [1], I stubmled on what >>>> appeared to be an ext4 crash consistency bug. >>>> >>>> The tests I used rely on the log-writes dm target code written >>>> by Josef Bacik, which had little exposure to the wide community >>>> as far as I know. I wanted to prove to myself that the found >>>> inconsistency was not due to a test bug, so I bisected the failed >>>> test to the minimal operations that trigger the failure and wrote >>>> a small independent test to reproduce the issue using dm flakey target. 
>>>> >>>> The following fsck error is reliably reproduced by replaying some fsx ops >>>> on overlapping file regions, then emulating a crash, followed by mount, >>>> umount and fsck -nf: >>>> >>>> ./ltp/fsx -d --replay-ops /tmp/8995.fsxops /mnt/scratch/testfile >>>> 1 write 0x137dd thru 0x21445 (0xdc69 bytes) >>>> 2 falloc from 0xb531 to 0x16ade (0xb5ad bytes) >>>> 3 collapse from 0x1c000 to 0x20000, (0x4000 bytes) >>>> 4 write 0x3e5ec thru 0x3ffff (0x1a14 bytes) >>>> 5 zero from 0x20fac to 0x27d48, (0x6d9c bytes) >>>> 6 mapwrite 0x216ad thru 0x23dfb (0x274f bytes) >>>> All 7 operations completed A-OK! >>>> _check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is inconsistent >>>> *** fsck.ext4 output *** >>>> fsck from util-linux 2.27.1 >>>> e2fsck 1.42.13 (17-May-2015) >>>> Pass 1: Checking inodes, blocks, and sizes >>>> Inode 12, end of extent exceeds allowed value >>>> (logical block 33, physical block 33441, len 7) >>>> Clear? no >>>> Inode 12, i_blocks is 184, should be 128. Fix? no >>> Hi Amir, >>> >>> I always get the following output when running your xfstests test case 501. >> Now merged as test generic/456 >> >>> --------------------------------------------------------------------------- >>> e2fsck 1.42.9 (28-Dec-2013) >>> Pass 1: Checking inodes, blocks, and sizes >>> Inode 12, i_size is 147456, should be 163840. Fix? no >>> --------------------------------------------------------------------------- >>> >>> Could you tell me how to get the expected output as you reported? >> I can't say I am doing anything special, but I can say that I get the >> same output as you did when running the test inside kvm-xfstests. >> Actually, I could not reproduce ANY of the the crash consistency bugs >> inside kvm-xfstests. Must be something to do with different timing of >> IO with KVM+virtio disks?? >> >> When running on my laptop (Ubuntu 16.04 with latest kernel) >> on a 10G SSD volume, I always get the error reported above. >> I just re-verified with latest stable e2fsprogs (1.43.6). >> >> Amir. > > . >
On Thu, Oct 5, 2017 at 2:27 AM, Xiao Yang <yangx.jy@cn.fujitsu.com> wrote: > On 2017/09/30 22:15, Ashlie Martinez wrote: >> >> Hi Xiao, >> >> I am a student at the University of Texas at Austin. Some researchers >> in the computer science department at UT, myself included, have >> recently been working to develop a file system crash consistency test >> harness called CrashMonkey [1][2]. I have been working on the >> CrashMonkey project since it was started late last year. With >> CrashMonkey we have also been able to reproduce the incorrect i_size >> error you noted but we have not been able to reproduce the other >> output that Amir found. CrashMonkey works by logging and replaying >> operations for a workload, so it should not be sensitive to >> differences in timing that could be caused by things like KVM+virtio. >> I also did a few experiments with Amir's new xfstests test 456 (both >> with and without KVM and virtio) and I was unable to reproduce the >> output noted in the xfstest. I have not spent a lot of time looking >> into the cause of the bug that Amir found and it is rather unfortunate >> that I was unable to reproduce it with either xfstests or CrashMonkey. > > Hi Ashlie, > > Thanks for your detailed comments. > > 1) Do you think the output that Amir noted in xfstests is a false positive? It almost seems that way, though to be honest, I don't think I know enough about 1. the setup used by Amir and 2. all the internal working of KVM+virtio to say for sure. One thing I have identified as being a potential source of false positives is code in the kernel to remap the block number sent to device drives to something relative to the start of a block device, not the start of a partition. I'm thinking this could cause trouble if 1. a partition higher than the first partition is monitored by dm-log-writes, 2. the block numbers are recorded verbatim in dm-log-writes, and 3. the dm-log-writes log is replayed on a different device with different partitions (or possibly the same device, but a different partition). I know some other undergrad students at UT on the CrashMonkey team are looking into this right now, but I have not had time to test this myself. The offending code in the kernel is in the completely misnamed `generic_make_request_checks` function which has given the CrashMonkey team trouble in the past. Links to that function and the remapping function it calls are below. http://elixir.free-electrons.com/linux/latest/source/block/blk-core.c#L2041 http://elixir.free-electrons.com/linux/latest/source/block/blk-core.c#L1902 > > 2) About the output that both i and you reproduced, did you look into it > and find its root cause? I have not looked to find the root cause of the issue that we both reliably reproduce. > > Thanks, > Xiao Yang >> >> At any rate, CrashMonkey is still under development, so it does have >> some caveats. First, we are running with a fixed random seed in our >> default RandomPermuter (used to generate crash states) to aid >> development. Second, the branch with the reproduction of this ext4 >> regression bug in CrashMonkey [3] will yield a few false positives due >> to the way CrashMonkey works and how fsx runs. These false positives >> are due to CrashMonkey generating crash states where the directories >> for files used for the test have not be fsync-ed in the file system. 
>> The top of the README in the CrashMonkey branch with this bug >> reproduction outlines how we determined these were false positives >> >> [1] https://github.com/utsaslab/crashmonkey >> [2] >> https://www.usenix.org/conference/hotstorage17/program/presentation/martinez >> [3] https://github.com/utsaslab/crashmonkey/tree/ext4_regression_bug >> >> >> On Mon, Sep 25, 2017 at 5:53 AM, Amir Goldstein<amir73il@gmail.com> >> wrote: >>> >>> On Mon, Sep 25, 2017 at 12:49 PM, Xiao Yang<yangx.jy@cn.fujitsu.com> >>> wrote: >>>> >>>> On 2017/08/27 18:44, Amir Goldstein wrote: >>>>> >>>>> This test is motivated by a bug found in ext4 during random crash >>>>> consistency tests. >>>>> >>>>> This test uses device mapper flakey target to demonstrate the bug >>>>> found using device mapper log-writes target. >>>>> >>>>> Signed-off-by: Amir Goldstein<amir73il@gmail.com> >>>>> --- >>>>> >>>>> Ted, >>>>> >>>>> While working on crash consistency xfstests [1], I stubmled on what >>>>> appeared to be an ext4 crash consistency bug. >>>>> >>>>> The tests I used rely on the log-writes dm target code written >>>>> by Josef Bacik, which had little exposure to the wide community >>>>> as far as I know. I wanted to prove to myself that the found >>>>> inconsistency was not due to a test bug, so I bisected the failed >>>>> test to the minimal operations that trigger the failure and wrote >>>>> a small independent test to reproduce the issue using dm flakey target. >>>>> >>>>> The following fsck error is reliably reproduced by replaying some fsx >>>>> ops >>>>> on overlapping file regions, then emulating a crash, followed by mount, >>>>> umount and fsck -nf: >>>>> >>>>> ./ltp/fsx -d --replay-ops /tmp/8995.fsxops /mnt/scratch/testfile >>>>> 1 write 0x137dd thru 0x21445 (0xdc69 bytes) >>>>> 2 falloc from 0xb531 to 0x16ade (0xb5ad bytes) >>>>> 3 collapse from 0x1c000 to 0x20000, (0x4000 bytes) >>>>> 4 write 0x3e5ec thru 0x3ffff (0x1a14 bytes) >>>>> 5 zero from 0x20fac to 0x27d48, (0x6d9c bytes) >>>>> 6 mapwrite 0x216ad thru 0x23dfb (0x274f bytes) >>>>> All 7 operations completed A-OK! >>>>> _check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is >>>>> inconsistent >>>>> *** fsck.ext4 output *** >>>>> fsck from util-linux 2.27.1 >>>>> e2fsck 1.42.13 (17-May-2015) >>>>> Pass 1: Checking inodes, blocks, and sizes >>>>> Inode 12, end of extent exceeds allowed value >>>>> (logical block 33, physical block 33441, len 7) >>>>> Clear? no >>>>> Inode 12, i_blocks is 184, should be 128. Fix? no >>>> >>>> Hi Amir, >>>> >>>> I always get the following output when running your xfstests test case >>>> 501. >>> >>> Now merged as test generic/456 >>> >>>> >>>> --------------------------------------------------------------------------- >>>> e2fsck 1.42.9 (28-Dec-2013) >>>> Pass 1: Checking inodes, blocks, and sizes >>>> Inode 12, i_size is 147456, should be 163840. Fix? no >>>> >>>> --------------------------------------------------------------------------- >>>> >>>> Could you tell me how to get the expected output as you reported? >>> >>> I can't say I am doing anything special, but I can say that I get the >>> same output as you did when running the test inside kvm-xfstests. >>> Actually, I could not reproduce ANY of the the crash consistency bugs >>> inside kvm-xfstests. Must be something to do with different timing of >>> IO with KVM+virtio disks?? >>> >>> When running on my laptop (Ubuntu 16.04 with latest kernel) >>> on a 10G SSD volume, I always get the error reported above. 
>>> I just re-verified with latest stable e2fsprogs (1.43.6).
>>>
>>> Amir.
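If the partition-remapping theory is worth chasing, the offset in question is simply the partition's start sector, which can be read back and compared against the sector numbers that end up in a dm-log-writes log. The device names below are only placeholders:

# Start sector of a partition relative to its parent disk.  Bios submitted
# against the partition have this added to their sector when the kernel
# remaps them to the whole disk.
cat /sys/class/block/sdb2/start
blockdev --report /dev/sdb2     # the StartSec column shows the same offset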
On Thu, Oct 5, 2017 at 6:04 PM, Ashlie Martinez <ashmrtn@utexas.edu> wrote: > On Thu, Oct 5, 2017 at 2:27 AM, Xiao Yang <yangx.jy@cn.fujitsu.com> wrote: >> On 2017/09/30 22:15, Ashlie Martinez wrote: >>> >>> Hi Xiao, >>> >>> I am a student at the University of Texas at Austin. Some researchers >>> in the computer science department at UT, myself included, have >>> recently been working to develop a file system crash consistency test >>> harness called CrashMonkey [1][2]. I have been working on the >>> CrashMonkey project since it was started late last year. With >>> CrashMonkey we have also been able to reproduce the incorrect i_size >>> error you noted but we have not been able to reproduce the other >>> output that Amir found. CrashMonkey works by logging and replaying >>> operations for a workload, so it should not be sensitive to >>> differences in timing that could be caused by things like KVM+virtio. >>> I also did a few experiments with Amir's new xfstests test 456 (both >>> with and without KVM and virtio) and I was unable to reproduce the >>> output noted in the xfstest. I have not spent a lot of time looking >>> into the cause of the bug that Amir found and it is rather unfortunate >>> that I was unable to reproduce it with either xfstests or CrashMonkey. >> >> Hi Ashlie, >> >> Thanks for your detailed comments. >> >> 1) Do you think the output that Amir noted in xfstests is a false positive? > > It almost seems that way, though to be honest, I don't think I know > enough about 1. the setup used by Amir and 2. all the internal working > of KVM+virtio to say for sure. I believe you misread my email. The problem was NOT reproduced on KVM+virtio. The problem *is* reproduced on a 10G LVM volume over SSD on Ubuntu 16.04 with latest kernel and latest e2fsprogs. There is no use in speculating why the problem doesn't happen on your systems. If the issue reproduces on A system (2 systems actually that I tested) that it is a problem. Attached is an e2image dump of the inconsistent file system following test generic/456 on my system. I would have attached the image sooner, but since on my system problem reproduces 100% on the time, I did not think that was a need. Without an ext4 developer looking into this image, I don't really see the point in further discussion. I would be interesting to get more test samples from people running the test on other systems. If only some people see the error "end of extent exceeds allowed value" it would be interesting to find the commonality between those setups. > One thing I have identified as being a > potential source of false positives is code in the kernel to remap the > block number sent to device drives to something relative to the start > of a block device, not the start of a partition. I'm thinking this > could cause trouble if 1. a partition higher than the first partition > is monitored by dm-log-writes, 2. the block numbers are recorded > verbatim in dm-log-writes, and 3. the dm-log-writes log is replayed on > a different device with different partitions (or possibly the same > device, but a different partition). You do realize that the test generic/456 is not using dm-log-writes at all. I intentionally took it out of the equation and used the even dumber dm-flakey target to demonstrate the crash inconsistency. > I know some other undergrad > students at UT on the CrashMonkey team are looking into this right > now, but I have not had time to test this myself. 
> The offending code in the kernel is in the completely misnamed
> `generic_make_request_checks` function, which has given the CrashMonkey
> team trouble in the past. Links to that function and the remapping
> function it calls are below.
>
> http://elixir.free-electrons.com/linux/latest/source/block/blk-core.c#L2041
> http://elixir.free-electrons.com/linux/latest/source/block/blk-core.c#L1902
>
>> 2) About the output that both i and you reproduced, did you look into it
>> and find its root cause?
>
> I have not looked to find the root cause of the issue that we both
> reliably reproduce.

Now you have a broken file system image and the exact set of operations
that led to it. That should be a pretty big lead for investigation.

Amir.
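As a starting point for comparing the two symptoms, the damaged inode can be inspected straight from the attached dump. A sketch, assuming the attachment is a standard e2image metadata image (the file name here is made up) and that inode 12 is the fsx test file:

dumpe2fs -i -h ext4-scratch.e2i                       # superblock / feature summary
debugfs -i -R "stat <12>" ext4-scratch.e2i            # i_size and i_blocks of inode 12
debugfs -i -R "dump_extents <12>" ext4-scratch.e2i    # its extent tree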
On Thu, Oct 5, 2017 at 2:10 PM, Amir Goldstein <amir73il@gmail.com> wrote: > On Thu, Oct 5, 2017 at 6:04 PM, Ashlie Martinez <ashmrtn@utexas.edu> wrote: >> On Thu, Oct 5, 2017 at 2:27 AM, Xiao Yang <yangx.jy@cn.fujitsu.com> wrote: >>> On 2017/09/30 22:15, Ashlie Martinez wrote: >>>> >>>> Hi Xiao, >>>> >>>> I am a student at the University of Texas at Austin. Some researchers >>>> in the computer science department at UT, myself included, have >>>> recently been working to develop a file system crash consistency test >>>> harness called CrashMonkey [1][2]. I have been working on the >>>> CrashMonkey project since it was started late last year. With >>>> CrashMonkey we have also been able to reproduce the incorrect i_size >>>> error you noted but we have not been able to reproduce the other >>>> output that Amir found. CrashMonkey works by logging and replaying >>>> operations for a workload, so it should not be sensitive to >>>> differences in timing that could be caused by things like KVM+virtio. >>>> I also did a few experiments with Amir's new xfstests test 456 (both >>>> with and without KVM and virtio) and I was unable to reproduce the >>>> output noted in the xfstest. I have not spent a lot of time looking >>>> into the cause of the bug that Amir found and it is rather unfortunate >>>> that I was unable to reproduce it with either xfstests or CrashMonkey. >>> >>> Hi Ashlie, >>> >>> Thanks for your detailed comments. >>> >>> 1) Do you think the output that Amir noted in xfstests is a false positive? >> >> It almost seems that way, though to be honest, I don't think I know >> enough about 1. the setup used by Amir and 2. all the internal working >> of KVM+virtio to say for sure. > > I believe you misread my email. > The problem was NOT reproduced on KVM+virtio. > The problem *is* reproduced on a 10G LVM volume over SSD on > Ubuntu 16.04 with latest kernel and latest e2fsprogs. I have also tried running generic/456 on non-KVM, non-virtio machines and I was unable to reproduce it. Without information on test setups individuals are using, it is very hard to tell where I, or other people, are going wrong in testing other than maybe using a virtual machine. As an aside, it does not appear to be simply a timing issue due to KVM+virtio. I would hope that CrashMonkey would still be able to find the extent error you saw, even on a KVM VM since it is not dependent on timing. > > There is no use in speculating why the problem doesn't happen on your > systems. If the issue reproduces on A system (2 systems actually that I tested) > that it is a problem. > > Attached is an e2image dump of the inconsistent file system following test > generic/456 on my system. I would have attached the image sooner, > but since on my system problem reproduces 100% on the time, I did not think > that was a need. Without an ext4 developer looking into this image, I don't > really see the point in further discussion. > > I would be interesting to get more test samples from people running the test > on other systems. If only some people see the error > "end of extent exceeds allowed value" > it would be interesting to find the commonality between those setups. > I agree it would be interesting to see why this error appears only for some people. >> One thing I have identified as being a >> potential source of false positives is code in the kernel to remap the >> block number sent to device drives to something relative to the start >> of a block device, not the start of a partition. 
I'm thinking this >> could cause trouble if 1. a partition higher than the first partition >> is monitored by dm-log-writes, 2. the block numbers are recorded >> verbatim in dm-log-writes, and 3. the dm-log-writes log is replayed on >> a different device with different partitions (or possibly the same >> device, but a different partition). > > You do realize that the test generic/456 is not using dm-log-writes at all. > I intentionally took it out of the equation and used the even dumber dm-flakey > target to demonstrate the crash inconsistency. Oh, yes, I recall that now. The past few weeks have been very busy for me with all of my other school work, so getting all of my i's dotted and t's crossed has been a challenge. At any rate, I don't think it is a real minus to mention the potential remapping of blocks in the cited function. If people are going to use any sort of block device module (be dm-log-writes or otherwise) then I think it would be good for developers to be aware of. > >> I know some other undergrad >> students at UT on the CrashMonkey team are looking into this right >> now, but I have not had time to test this myself. The offending code >> in the kernel is in the completely misnamed >> `generic_make_request_checks` function which has given the CrashMonkey >> team trouble in the past. Links to that function and the remapping >> function it calls are below. >> >> http://elixir.free-electrons.com/linux/latest/source/block/blk-core.c#L2041 >> http://elixir.free-electrons.com/linux/latest/source/block/blk-core.c#L1902 >> >>> >>> 2) About the output that both i and you reproduced, did you look into it >>> and find its root cause? >> >> I have not looked to find the root cause of the issue that we both >> reliably reproduce. >> > > Now you have a broken file system image and the exact set of operations > that led to it. That's should be a pretty big lead for investigation. > > Amir.
diff --git a/tests/generic/501 b/tests/generic/501
new file mode 100755
index 0000000..ccb513d
--- /dev/null
+++ b/tests/generic/501
@@ -0,0 +1,80 @@
+#! /bin/bash
+# FS QA Test No. 501
+#
+# This test is motivated by a bug found in ext4 during random crash
+# consistency tests.
+#
+#-----------------------------------------------------------------------
+# Copyright (C) 2017 CTERA Networks. All Rights Reserved.
+# Author: Amir Goldstein <amir73il@gmail.com>
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+
+_cleanup()
+{
+	_cleanup_flakey
+	cd /
+	rm -f $tmp.*
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/dmflakey
+
+# real QA test starts here
+_supported_fs generic
+_supported_os Linux
+_require_scratch
+_require_dm_target flakey
+_require_metadata_journaling $SCRATCH_DEV
+
+rm -f $seqres.full
+
+_scratch_mkfs >> $seqres.full 2>&1
+
+_init_flakey
+_mount_flakey
+
+fsxops=$tmp.fsxops
+cat <<EOF > $fsxops
+write 0x137dd 0xdc69 0x0
+fallocate 0xb531 0xb5ad 0x21446
+collapse_range 0x1c000 0x4000 0x21446
+write 0x3e5ec 0x1a14 0x21446
+zero_range 0x20fac 0x6d9c 0x40000 keep_size
+mapwrite 0x216ad 0x274f 0x40000
+EOF
+run_check $here/ltp/fsx -d --replay-ops $fsxops $SCRATCH_MNT/testfile
+
+_flakey_drop_and_remount
+_unmount_flakey
+_cleanup_flakey
+_check_scratch_fs
+
+echo "Silence is golden"
+
+status=0
+exit
diff --git a/tests/generic/501.out b/tests/generic/501.out
new file mode 100644
index 0000000..00133b6
--- /dev/null
+++ b/tests/generic/501.out
@@ -0,0 +1,2 @@
+QA output created by 501
+Silence is golden
diff --git a/tests/generic/group b/tests/generic/group
index 2396b72..bb870f2 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -454,3 +454,4 @@
 449 auto quick acl enospc
 450 auto quick rw
 500 auto log replay
+501 auto quick metadata
This test is motivated by a bug found in ext4 during random crash
consistency tests.

This test uses the device mapper flakey target to demonstrate the bug
found using the device mapper log-writes target.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---

Ted,

While working on crash consistency xfstests [1], I stumbled on what
appeared to be an ext4 crash consistency bug.

The tests I used rely on the log-writes dm target code written
by Josef Bacik, which had little exposure to the wider community
as far as I know. I wanted to prove to myself that the found
inconsistency was not due to a test bug, so I bisected the failed
test to the minimal operations that trigger the failure and wrote
a small independent test to reproduce the issue using the dm flakey target.

The following fsck error is reliably reproduced by replaying some fsx ops
on overlapping file regions, then emulating a crash, followed by mount,
umount and fsck -nf:

./ltp/fsx -d --replay-ops /tmp/8995.fsxops /mnt/scratch/testfile
1 write     0x137dd thru 0x21445 (0xdc69 bytes)
2 falloc    from 0xb531 to 0x16ade (0xb5ad bytes)
3 collapse  from 0x1c000 to 0x20000, (0x4000 bytes)
4 write     0x3e5ec thru 0x3ffff (0x1a14 bytes)
5 zero      from 0x20fac to 0x27d48, (0x6d9c bytes)
6 mapwrite  0x216ad thru 0x23dfb (0x274f bytes)
All 7 operations completed A-OK!
_check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is inconsistent
*** fsck.ext4 output ***
fsck from util-linux 2.27.1
e2fsck 1.42.13 (17-May-2015)
Pass 1: Checking inodes, blocks, and sizes
Inode 12, end of extent exceeds allowed value
	(logical block 33, physical block 33441, len 7)
Clear? no
Inode 12, i_blocks is 184, should be 128.  Fix? no

Note that the inconsistency is "applied" by journal replay during mount.
fsck -nf before mount does not report any errors.

I did not intend for this test to be merged as is, but rather to be used
by ext4 developers to analyze the problem and then re-write the test with
more comments and less arbitrary offset/length values.

P.S.: crash consistency tests also reliably reproduce a btrfs fsck error.
A detailed report with I/O recording was sent to Josef.
P.S.2: crash consistency tests report file data checksum errors on xfs
after fsync+crash, but I still need to prove the reliability of
these reports.

[1] https://github.com/amir73il/xfstests/commits/dm-log-writes

 tests/generic/501     | 80 +++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/501.out |  2 ++
 tests/generic/group   |  1 +
 3 files changed, 83 insertions(+)
 create mode 100755 tests/generic/501
 create mode 100644 tests/generic/501.out