Message ID | 1281087376-2411-1-git-send-email-stefan.bader@canonical.com |
---|---|
State | Accepted |
Headers | show |
On 08/06/2010 03:36 AM, Stefan Bader wrote: > SRU Justification: > > Impact: Changes to ext4 which are part of 2.6.32.17 and we took in advance to > fix other issues are causing a lockup regression when working with lvm snap- > shots. > > Fix: This patch, which is slowly making its way upstream, was verified to fix > this problem and is also small and contained enough to be reasonably save. > Under normal circumstances we would wait for this patch to appear upstream > but as this is a more serious regression and we are planning to do a limited > change upload to Lucid to address some priority problems I think we should > consider this patch even if it does not make it upstream in time. > For Maverick the same problem exists but it is probably enough time to wait > and this is marked to be reviewed before beta as well. > Should the patch go upstream before doing the Lucid upload, references will > be updated. > > Testcase: > Use lvm to create a LV, mount it and have it actively doing IO, then try > to create a snapshot will hang without further IO being possible. > > -Stefan > > From 437f88cc031ffe7f37f3e705367f4fe1f4be8b0f Mon Sep 17 00:00:00 2001 > From: Eric Sandeen<sandeen@sandeen.net> > Date: Sun, 1 Aug 2010 17:33:29 -0400 > Subject: [PATCH] (pre-stable) ext4: fix freeze deadlock under IO > > BugLink: http://bugs.launchpad.net/bugs/595489 > > Commit 6b0310fbf087ad6 caused a regression resulting in deadlocks > when freezing a filesystem which had active IO; the vfs_check_frozen > level (SB_FREEZE_WRITE) did not let the freeze-related IO syncing > through. Duh. > > Changing the test to FREEZE_TRANS should let the normal freeze > syncing get through the fs, but still block any transactions from > starting once the fs is completely frozen. > > I tested this by running fsstress in the background while periodically > snapshotting the fs and running fsck on the result. I ran into > occasional deadlocks, but different ones. I think this is a > fine fix for the problem at hand, and the other deadlocky things > will need more investigation. > > Reported-by: Phillip Susi<psusi@cfl.rr.com> > Signed-off-by: Eric Sandeen<sandeen@redhat.com> > Signed-off-by: "Theodore Ts'o"<tytso@mit.edu> > (cherry-picked from commit 437f88cc031ffe7f37f3e705367f4fe1f4be8b0f linux-next) > Signed-off-by: Stefan Bader<stefan.bader@canonical.com> > --- > fs/ext4/super.c | 4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > index e046eba..282a270 100644 > --- a/fs/ext4/super.c > +++ b/fs/ext4/super.c > @@ -241,7 +241,7 @@ handle_t *ext4_journal_start_sb(struct super_block *sb, int nblocks) > if (sb->s_flags& MS_RDONLY) > return ERR_PTR(-EROFS); > > - vfs_check_frozen(sb, SB_FREEZE_WRITE); > + vfs_check_frozen(sb, SB_FREEZE_TRANS); > /* Special case here: if the journal has aborted behind our > * backs (eg. EIO in the commit thread), then we still need to > * take the FS itself readonly cleanly. */ > @@ -3608,7 +3608,7 @@ int ext4_force_commit(struct super_block *sb) > > journal = EXT4_SB(sb)->s_journal; > if (journal) { > - vfs_check_frozen(sb, SB_FREEZE_WRITE); > + vfs_check_frozen(sb, SB_FREEZE_TRANS); > ret = ext4_journal_force_commit(journal); > } > Acked-by: Tim Gardner <tim.gardner@canonical;.com>
On Fri, 2010-08-06 at 11:36 +0200, Stefan Bader wrote: > SRU Justification: > > Impact: Changes to ext4 which are part of 2.6.32.17 and we took in advance to > fix other issues are causing a lockup regression when working with lvm snap- > shots. > > Fix: This patch, which is slowly making its way upstream, was verified to fix > this problem and is also small and contained enough to be reasonably save. > Under normal circumstances we would wait for this patch to appear upstream > but as this is a more serious regression and we are planning to do a limited > change upload to Lucid to address some priority problems I think we should > consider this patch even if it does not make it upstream in time. > For Maverick the same problem exists but it is probably enough time to wait > and this is marked to be reviewed before beta as well. > Should the patch go upstream before doing the Lucid upload, references will > be updated. > > Testcase: > Use lvm to create a LV, mount it and have it actively doing IO, then try > to create a snapshot will hang without further IO being possible. > > -Stefan > > From 437f88cc031ffe7f37f3e705367f4fe1f4be8b0f Mon Sep 17 00:00:00 2001 > From: Eric Sandeen <sandeen@sandeen.net> > Date: Sun, 1 Aug 2010 17:33:29 -0400 > Subject: [PATCH] (pre-stable) ext4: fix freeze deadlock under IO > > BugLink: http://bugs.launchpad.net/bugs/595489 > > Commit 6b0310fbf087ad6 caused a regression resulting in deadlocks > when freezing a filesystem which had active IO; the vfs_check_frozen > level (SB_FREEZE_WRITE) did not let the freeze-related IO syncing > through. Duh. > > Changing the test to FREEZE_TRANS should let the normal freeze > syncing get through the fs, but still block any transactions from > starting once the fs is completely frozen. > > I tested this by running fsstress in the background while periodically > snapshotting the fs and running fsck on the result. I ran into > occasional deadlocks, but different ones. I think this is a > fine fix for the problem at hand, and the other deadlocky things > will need more investigation. > > Reported-by: Phillip Susi <psusi@cfl.rr.com> > Signed-off-by: Eric Sandeen <sandeen@redhat.com> > Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> > (cherry-picked from commit 437f88cc031ffe7f37f3e705367f4fe1f4be8b0f linux-next) > Signed-off-by: Stefan Bader <stefan.bader@canonical.com> > --- > fs/ext4/super.c | 4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > index e046eba..282a270 100644 > --- a/fs/ext4/super.c > +++ b/fs/ext4/super.c > @@ -241,7 +241,7 @@ handle_t *ext4_journal_start_sb(struct super_block *sb, int nblocks) > if (sb->s_flags & MS_RDONLY) > return ERR_PTR(-EROFS); > > - vfs_check_frozen(sb, SB_FREEZE_WRITE); > + vfs_check_frozen(sb, SB_FREEZE_TRANS); > /* Special case here: if the journal has aborted behind our > * backs (eg. EIO in the commit thread), then we still need to > * take the FS itself readonly cleanly. */ > @@ -3608,7 +3608,7 @@ int ext4_force_commit(struct super_block *sb) > > journal = EXT4_SB(sb)->s_journal; > if (journal) { > - vfs_check_frozen(sb, SB_FREEZE_WRITE); > + vfs_check_frozen(sb, SB_FREEZE_TRANS); > ret = ext4_journal_force_commit(journal); > } > > -- > 1.7.0.4 > > Acked-by: Steve Conklin <sconklin@canonical.com>
Steve, though we got the acks, I would hold back with applying the patch itself. I think we won't be stitching together the next proposed upload before Monday. So this gives a little more time to potentially let the patch go upstream. Stefan
On Fri, Aug 06, 2010 at 11:36:16AM +0200, Stefan Bader wrote: > SRU Justification: > > Impact: Changes to ext4 which are part of 2.6.32.17 and we took in advance to > fix other issues are causing a lockup regression when working with lvm snap- > shots. > > Fix: This patch, which is slowly making its way upstream, was verified to fix > this problem and is also small and contained enough to be reasonably save. > Under normal circumstances we would wait for this patch to appear upstream > but as this is a more serious regression and we are planning to do a limited > change upload to Lucid to address some priority problems I think we should > consider this patch even if it does not make it upstream in time. > For Maverick the same problem exists but it is probably enough time to wait > and this is marked to be reviewed before beta as well. > Should the patch go upstream before doing the Lucid upload, references will > be updated. > > Testcase: > Use lvm to create a LV, mount it and have it actively doing IO, then try > to create a snapshot will hang without further IO being possible. > > -Stefan > > From 437f88cc031ffe7f37f3e705367f4fe1f4be8b0f Mon Sep 17 00:00:00 2001 > From: Eric Sandeen <sandeen@sandeen.net> > Date: Sun, 1 Aug 2010 17:33:29 -0400 > Subject: [PATCH] (pre-stable) ext4: fix freeze deadlock under IO > > BugLink: http://bugs.launchpad.net/bugs/595489 > > Commit 6b0310fbf087ad6 caused a regression resulting in deadlocks > when freezing a filesystem which had active IO; the vfs_check_frozen > level (SB_FREEZE_WRITE) did not let the freeze-related IO syncing > through. Duh. > > Changing the test to FREEZE_TRANS should let the normal freeze > syncing get through the fs, but still block any transactions from > starting once the fs is completely frozen. > > I tested this by running fsstress in the background while periodically > snapshotting the fs and running fsck on the result. I ran into > occasional deadlocks, but different ones. I think this is a > fine fix for the problem at hand, and the other deadlocky things > will need more investigation. > > Reported-by: Phillip Susi <psusi@cfl.rr.com> > Signed-off-by: Eric Sandeen <sandeen@redhat.com> > Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> > (cherry-picked from commit 437f88cc031ffe7f37f3e705367f4fe1f4be8b0f linux-next) > Signed-off-by: Stefan Bader <stefan.bader@canonical.com> > --- > fs/ext4/super.c | 4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > index e046eba..282a270 100644 > --- a/fs/ext4/super.c > +++ b/fs/ext4/super.c > @@ -241,7 +241,7 @@ handle_t *ext4_journal_start_sb(struct super_block *sb, int nblocks) > if (sb->s_flags & MS_RDONLY) > return ERR_PTR(-EROFS); > > - vfs_check_frozen(sb, SB_FREEZE_WRITE); > + vfs_check_frozen(sb, SB_FREEZE_TRANS); > /* Special case here: if the journal has aborted behind our > * backs (eg. EIO in the commit thread), then we still need to > * take the FS itself readonly cleanly. */ > @@ -3608,7 +3608,7 @@ int ext4_force_commit(struct super_block *sb) > > journal = EXT4_SB(sb)->s_journal; > if (journal) { > - vfs_check_frozen(sb, SB_FREEZE_WRITE); > + vfs_check_frozen(sb, SB_FREEZE_TRANS); > ret = ext4_journal_force_commit(journal); > } > Looks good. The change is very clear. Acked-by: Andy Whitcroft <apw@canonical.com> -apw
Applied to Lucid master. Patch went upstream as of today (same SHA1).
diff --git a/fs/ext4/super.c b/fs/ext4/super.c index e046eba..282a270 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -241,7 +241,7 @@ handle_t *ext4_journal_start_sb(struct super_block *sb, int nblocks) if (sb->s_flags & MS_RDONLY) return ERR_PTR(-EROFS); - vfs_check_frozen(sb, SB_FREEZE_WRITE); + vfs_check_frozen(sb, SB_FREEZE_TRANS); /* Special case here: if the journal has aborted behind our * backs (eg. EIO in the commit thread), then we still need to * take the FS itself readonly cleanly. */ @@ -3608,7 +3608,7 @@ int ext4_force_commit(struct super_block *sb) journal = EXT4_SB(sb)->s_journal; if (journal) { - vfs_check_frozen(sb, SB_FREEZE_WRITE); + vfs_check_frozen(sb, SB_FREEZE_TRANS); ret = ext4_journal_force_commit(journal); }