Message ID | 20200701112643.726986-1-chengzhihao1@huawei.com |
---|---|
State | Changes Requested |
Headers | show |
Series | ubifs: Fix a potential space leak problem while linking tmpfile | expand |
On Wed, Jul 1, 2020 at 1:28 PM Zhihao Cheng <chengzhihao1@huawei.com> wrote: > > There is a potential space leak problem while linking tmpfile, in which > case, inode node (with nlink=0) is valid in tnc (on flash), which leads > to space leak. Meanwhile, the corresponding data nodes won't be released > from tnc. For example, (A reproducer can be found in Link): > > $ mount UBIFS > [process A] [process B] [TNC] [orphan area] > > ubifs_tmpfile inode_A (nlink=0) inode_A > do_commit inode_A (nlink=0) inode_A > ↑ > (comment: It makes sure not replay inode_A in next mount) > ubifs_link inode_A (nlink=0) inode_A > ubifs_delete_orphan inode_A (nlink=0) > do_commit inode_A (nlink=0) > ---> POWERCUT <--- > (ubifs_jnl_update) > > $ mount UBIFS > inode_A will neither be replayed in ubifs_replay_journal() nor > ubifs_mount_orphans(). inode_A (nlink=0) with its data nodes will > always on tnc, it occupy space but is non-visable for users. > > Commit ee1438ce5dc4d ("ubifs: Check link count of inodes when killing > orphans.") handles problem in mistakenly deleting relinked tmpfile > while replaying orphan area. Since that, tmpfile inode should always > live in orphan area even it is linked. Fix it by reverting commit > 32fe905c17f001 ("ubifs: Fix O_TMPFILE corner case in ubifs_link()"). Well, by reverting this commit you re-introduce the issue it fixes. ;-\
在 2020/7/7 19:26, Richard Weinberger 写道: > On Wed, Jul 1, 2020 at 1:28 PM Zhihao Cheng <chengzhihao1@huawei.com> wrote: >> There is a potential space leak problem while linking tmpfile, in which >> case, inode node (with nlink=0) is valid in tnc (on flash), which leads >> to space leak. Meanwhile, the corresponding data nodes won't be released >> from tnc. For example, (A reproducer can be found in Link): >> >> $ mount UBIFS >> [process A] [process B] [TNC] [orphan area] >> >> ubifs_tmpfile inode_A (nlink=0) inode_A >> do_commit inode_A (nlink=0) inode_A >> ↑ >> (comment: It makes sure not replay inode_A in next mount) >> ubifs_link inode_A (nlink=0) inode_A >> ubifs_delete_orphan inode_A (nlink=0) >> do_commit inode_A (nlink=0) >> ---> POWERCUT <--- >> (ubifs_jnl_update) >> >> $ mount UBIFS >> inode_A will neither be replayed in ubifs_replay_journal() nor >> ubifs_mount_orphans(). inode_A (nlink=0) with its data nodes will >> always on tnc, it occupy space but is non-visable for users. >> >> Commit ee1438ce5dc4d ("ubifs: Check link count of inodes when killing >> orphans.") handles problem in mistakenly deleting relinked tmpfile >> while replaying orphan area. Since that, tmpfile inode should always >> live in orphan area even it is linked. Fix it by reverting commit >> 32fe905c17f001 ("ubifs: Fix O_TMPFILE corner case in ubifs_link()"). > Well, by reverting this commit you re-introduce the issue it fixes. ;-\ > Perhaps I misunderstood what commit 32fe905c17f001 ("ubifs: Fix O_TMPFILE corner case in ubifs_link()") wanted to fix. I think orphan area is used to remind filesystem don't forget to delete inodes (whose nlink is 0) in next unclean rebooting. Generally, the file system is not corrupted caused by replaying orphan nodes. Ralph reported a filesystem corruption in combination with overlayfs. Can you tell me the details about that problem? Thanks.
----- Ursprüngliche Mail ----- > Perhaps I misunderstood what commit 32fe905c17f001 ("ubifs: Fix > O_TMPFILE corner case in ubifs_link()") wanted to fix. > I think orphan area is used to remind filesystem don't forget to delete > inodes (whose nlink is 0) in next unclean rebooting. Generally, the file > system is not corrupted caused by replaying orphan nodes. > Ralph reported a filesystem corruption in combination with overlayfs. > Can you tell me the details about that problem? Thanks. On my test bed I didn't see a fs corruption, what I saw was a failing orphan self test while playing with O_TMPFILE and linkat(). When you create a tmpfile it has a link count of 0 and an orphan is installed. Such that the tmpfile is gone after a reboot but you can still use it prior to that. By using linkat() you can raise the link counter to 1 again. Thus, the orphan needs to be removed. This is pattern overlayfs uses a lot. Since UBIFS never supported raising the link counter from 0 to 1 we have many corner cases and fixing all these turned out into a nightmare. ...as you can see from the amount broken patches from me :-(. Thanks, //richard
在 2020/7/7 20:09, Richard Weinberger 写道: > ----- Ursprüngliche Mail ----- >> Perhaps I misunderstood what commit 32fe905c17f001 ("ubifs: Fix >> O_TMPFILE corner case in ubifs_link()") wanted to fix. >> I think orphan area is used to remind filesystem don't forget to delete >> inodes (whose nlink is 0) in next unclean rebooting. Generally, the file >> system is not corrupted caused by replaying orphan nodes. >> Ralph reported a filesystem corruption in combination with overlayfs. >> Can you tell me the details about that problem? Thanks. > On my test bed I didn't see a fs corruption, what I saw was a failing orphan > self test while playing with O_TMPFILE and linkat(). Do we have a reproducer, or can I get the fail testcase? Is it a xfstest case? > > When you create a tmpfile it has a link count of 0 and an orphan is > installed. Such that the tmpfile is gone after a reboot but you can > still use it prior to that. > By using linkat() you can raise the link counter to 1 again. > Thus, the orphan needs to be removed. > This is pattern overlayfs uses a lot. > > Since UBIFS never supported raising the link counter from 0 to 1 > we have many corner cases and fixing all these turned out into a nightmare. > ...as you can see from the amount broken patches from me :-(. > > Thanks, > //richard > > .
----- Ursprüngliche Mail ----- >>> Perhaps I misunderstood what commit 32fe905c17f001 ("ubifs: Fix >>> O_TMPFILE corner case in ubifs_link()") wanted to fix. >>> I think orphan area is used to remind filesystem don't forget to delete >>> inodes (whose nlink is 0) in next unclean rebooting. Generally, the file >>> system is not corrupted caused by replaying orphan nodes. >>> Ralph reported a filesystem corruption in combination with overlayfs. >>> Can you tell me the details about that problem? Thanks. >> On my test bed I didn't see a fs corruption, what I saw was a failing orphan >> self test while playing with O_TMPFILE and linkat(). > Do we have a reproducer, or can I get the fail testcase? Is it a xfstest > case? I think xfstests triggered it, yes. Later today I can check. :) Thanks, //richard
在 2020/7/7 20:47, Richard Weinberger 写道: > ----- Ursprüngliche Mail ----- >>>> Perhaps I misunderstood what commit 32fe905c17f001 ("ubifs: Fix >>>> O_TMPFILE corner case in ubifs_link()") wanted to fix. >>>> I think orphan area is used to remind filesystem don't forget to delete >>>> inodes (whose nlink is 0) in next unclean rebooting. Generally, the file >>>> system is not corrupted caused by replaying orphan nodes. >>>> Ralph reported a filesystem corruption in combination with overlayfs. >>>> Can you tell me the details about that problem? Thanks. >>> On my test bed I didn't see a fs corruption, what I saw was a failing orphan >>> self test while playing with O_TMPFILE and linkat(). >> Do we have a reproducer, or can I get the fail testcase? Is it a xfstest >> case? > I think xfstests triggered it, yes. > Later today I can check. :) > > Thanks, > //richard > > . I think I have found the testcases, overlay/006 and overlay/041. The 'mv' and 'rm' operations will put lowertestfile into orphan list twice, so we must reseve the orphan deletion operation in ubifs_link(), otherwise the testcase fails and we will see the following msg: overlay/006 2s ... - output mismatch (see /root/git/xfstests-dev/results//overlay/006.out.bad) --- tests/overlay/006.out 2020-07-07 21:42:57.737000000 +0800 +++ /root/git/xfstests-dev/results//overlay/006.out.bad 2020-07-11 14:31:55.340000000 +0800 @@ -1,2 +1,4 @@ QA output created by 006 Silence is golden +rm: cannot remove '/tmp/scratch/ovl-mnt/uppertestdir/lowertestfile': Invalid argument +lowertestfile ... [ 382.258210] UBIFS error (ubi0:1 pid 11896): orphan_add [ubifs]: orphaned twice [ 382.352535] UBIFS error (ubi0:1 pid 11930): free_orphans [ubifs]: orphan list not empty at unmount So, how about moving ubifs_delete_orphan() after ubifs_jnl_update() in function ubifs_link(). Following modifications applied in linux-5.8 has been tested by overlay/041, overlay/006 and other tmpfile cases (generic/531, generic/530, generic/509, generic/389, generic/004). diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c index ef85ec167a84..fd4443a5e8c6 100644 --- a/fs/ubifs/dir.c +++ b/fs/ubifs/dir.c @@ -722,11 +722,6 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir, goto out_fname; lock_2_inodes(dir, inode); - - /* Handle O_TMPFILE corner case, it is allowed to link a O_TMPFILE. */ - if (inode->i_nlink == 0) - ubifs_delete_orphan(c, inode->i_ino); - inc_nlink(inode); ihold(inode); inode->i_ctime = current_time(inode); @@ -736,6 +731,11 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir, err = ubifs_jnl_update(c, dir, &nm, inode, 0, 0); if (err) goto out_cancel; + + /* Handle O_TMPFILE corner case, it is allowed to link a O_TMPFILE. */ + if (inode->i_nlink == 1) + ubifs_delete_orphan(c, inode->i_ino); + unlock_2_inodes(dir, inode); ubifs_release_budget(c, &req); @@ -747,8 +747,6 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir, dir->i_size -= sz_change; dir_ui->ui_size = dir->i_size; drop_nlink(inode); - if (inode->i_nlink == 0) - ubifs_add_orphan(c, inode->i_ino); unlock_2_inodes(dir, inode); ubifs_release_budget(c, &req); iput(inode); --
在 2020/7/11 14:37, Zhihao Cheng 写道: > 在 2020/7/7 20:47, Richard Weinberger 写道: >> ----- Ursprüngliche Mail ----- >>>>> Perhaps I misunderstood what commit 32fe905c17f001 ("ubifs: Fix >>>>> O_TMPFILE corner case in ubifs_link()") wanted to fix. >>>>> I think orphan area is used to remind filesystem don't forget to >>>>> delete >>>>> inodes (whose nlink is 0) in next unclean rebooting. Generally, >>>>> the file >>>>> system is not corrupted caused by replaying orphan nodes. >>>>> Ralph reported a filesystem corruption in combination with overlayfs. >>>>> Can you tell me the details about that problem? Thanks. >>>> On my test bed I didn't see a fs corruption, what I saw was a >>>> failing orphan >>>> self test while playing with O_TMPFILE and linkat(). >>> Do we have a reproducer, or can I get the fail testcase? Is it a >>> xfstest >>> case? >> I think xfstests triggered it, yes. >> Later today I can check. :) >> >> Thanks, >> //richard >> >> . > > I think I have found the testcases, overlay/006 and overlay/041. > > The 'mv' and 'rm' operations will put lowertestfile into orphan list > twice, so we must reseve the orphan deletion operation in > ubifs_link(), otherwise the testcase fails and we will see the > following msg: > > overlay/006 2s ... - output mismatch (see > /root/git/xfstests-dev/results//overlay/006.out.bad) > --- tests/overlay/006.out 2020-07-07 21:42:57.737000000 +0800 > +++ /root/git/xfstests-dev/results//overlay/006.out.bad 2020-07-11 > 14:31:55.340000000 +0800 > @@ -1,2 +1,4 @@ > QA output created by 006 > Silence is golden > +rm: cannot remove > '/tmp/scratch/ovl-mnt/uppertestdir/lowertestfile': Invalid argument > +lowertestfile > ... > > [ 382.258210] UBIFS error (ubi0:1 pid 11896): orphan_add [ubifs]: > orphaned twice > [ 382.352535] UBIFS error (ubi0:1 pid 11930): free_orphans [ubifs]: > orphan list not empty at unmount > > > So, how about moving ubifs_delete_orphan() after ubifs_jnl_update() in > function ubifs_link(). Following modifications applied in linux-5.8 > has been tested by overlay/041, overlay/006 and other tmpfile cases > (generic/531, generic/530, generic/509, generic/389, generic/004). > Results for testcases generic/530, generic/509, generic/389 and generic/004 are still "not run". > diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c > index ef85ec167a84..fd4443a5e8c6 100644 > --- a/fs/ubifs/dir.c > +++ b/fs/ubifs/dir.c > @@ -722,11 +722,6 @@ static int ubifs_link(struct dentry *old_dentry, > struct inode *dir, > goto out_fname; > > lock_2_inodes(dir, inode); > - > - /* Handle O_TMPFILE corner case, it is allowed to link a > O_TMPFILE. */ > - if (inode->i_nlink == 0) > - ubifs_delete_orphan(c, inode->i_ino); > - > inc_nlink(inode); > ihold(inode); > inode->i_ctime = current_time(inode); > @@ -736,6 +731,11 @@ static int ubifs_link(struct dentry *old_dentry, > struct inode *dir, > err = ubifs_jnl_update(c, dir, &nm, inode, 0, 0); > if (err) > goto out_cancel; > + > + /* Handle O_TMPFILE corner case, it is allowed to link a > O_TMPFILE. */ > + if (inode->i_nlink == 1) > + ubifs_delete_orphan(c, inode->i_ino); > + > unlock_2_inodes(dir, inode); > > ubifs_release_budget(c, &req); > @@ -747,8 +747,6 @@ static int ubifs_link(struct dentry *old_dentry, > struct inode *dir, > dir->i_size -= sz_change; > dir_ui->ui_size = dir->i_size; > drop_nlink(inode); > - if (inode->i_nlink == 0) > - ubifs_add_orphan(c, inode->i_ino); > unlock_2_inodes(dir, inode); > ubifs_release_budget(c, &req); > iput(inode); > -- >
在 2020/7/11 14:37, Zhihao Cheng 写道: > 在 2020/7/7 20:47, Richard Weinberger 写道: >> ----- Ursprüngliche Mail ----- >>>>> Perhaps I misunderstood what commit 32fe905c17f001 ("ubifs: Fix >>>>> O_TMPFILE corner case in ubifs_link()") wanted to fix. >>>>> I think orphan area is used to remind filesystem don't forget to >>>>> delete >>>>> inodes (whose nlink is 0) in next unclean rebooting. Generally, >>>>> the file >>>>> system is not corrupted caused by replaying orphan nodes. >>>>> Ralph reported a filesystem corruption in combination with overlayfs. >>>>> Can you tell me the details about that problem? Thanks. >>>> On my test bed I didn't see a fs corruption, what I saw was a >>>> failing orphan >>>> self test while playing with O_TMPFILE and linkat(). >>> Do we have a reproducer, or can I get the fail testcase? Is it a >>> xfstest >>> case? >> I think xfstests triggered it, yes. >> Later today I can check. :) >> >> Thanks, >> //richard >> >> . > > I think I have found the testcases, overlay/006 and overlay/041. > > The 'mv' and 'rm' operations will put lowertestfile into orphan list > twice, so we must reseve the orphan deletion operation in > ubifs_link(), otherwise the testcase fails and we will see the > following msg: Sorry, not lowertestfile, it's tempfile which is generated by ovl copy-up (mv operation). The tempfile is linked after copy-up finished. The tempfile is then unlinked by 'rm' operation. > > overlay/006 2s ... - output mismatch (see > /root/git/xfstests-dev/results//overlay/006.out.bad) > --- tests/overlay/006.out 2020-07-07 21:42:57.737000000 +0800 > +++ /root/git/xfstests-dev/results//overlay/006.out.bad 2020-07-11 > 14:31:55.340000000 +0800 > @@ -1,2 +1,4 @@ > QA output created by 006 > Silence is golden > +rm: cannot remove > '/tmp/scratch/ovl-mnt/uppertestdir/lowertestfile': Invalid argument > +lowertestfile > ... > > [ 382.258210] UBIFS error (ubi0:1 pid 11896): orphan_add [ubifs]: > orphaned twice > [ 382.352535] UBIFS error (ubi0:1 pid 11930): free_orphans [ubifs]: > orphan list not empty at unmount > > > So, how about moving ubifs_delete_orphan() after ubifs_jnl_update() in > function ubifs_link(). Following modifications applied in linux-5.8 > has been tested by overlay/041, overlay/006 and other tmpfile cases > (generic/531, generic/530, generic/509, generic/389, generic/004). > > diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c > index ef85ec167a84..fd4443a5e8c6 100644 > --- a/fs/ubifs/dir.c > +++ b/fs/ubifs/dir.c > @@ -722,11 +722,6 @@ static int ubifs_link(struct dentry *old_dentry, > struct inode *dir, > goto out_fname; > > lock_2_inodes(dir, inode); > - > - /* Handle O_TMPFILE corner case, it is allowed to link a > O_TMPFILE. */ > - if (inode->i_nlink == 0) > - ubifs_delete_orphan(c, inode->i_ino); > - > inc_nlink(inode); > ihold(inode); > inode->i_ctime = current_time(inode); > @@ -736,6 +731,11 @@ static int ubifs_link(struct dentry *old_dentry, > struct inode *dir, > err = ubifs_jnl_update(c, dir, &nm, inode, 0, 0); > if (err) > goto out_cancel; > + > + /* Handle O_TMPFILE corner case, it is allowed to link a > O_TMPFILE. */ > + if (inode->i_nlink == 1) > + ubifs_delete_orphan(c, inode->i_ino); > + > unlock_2_inodes(dir, inode); > > ubifs_release_budget(c, &req); > @@ -747,8 +747,6 @@ static int ubifs_link(struct dentry *old_dentry, > struct inode *dir, > dir->i_size -= sz_change; > dir_ui->ui_size = dir->i_size; > drop_nlink(inode); > - if (inode->i_nlink == 0) > - ubifs_add_orphan(c, inode->i_ino); > unlock_2_inodes(dir, inode); > ubifs_release_budget(c, &req); > iput(inode); > -- > > > > ______________________________________________________ > Linux MTD discussion mailing list > http://lists.infradead.org/mailman/listinfo/linux-mtd/
diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c index ef85ec167a84..9534c4bb598f 100644 --- a/fs/ubifs/dir.c +++ b/fs/ubifs/dir.c @@ -722,11 +722,6 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir, goto out_fname; lock_2_inodes(dir, inode); - - /* Handle O_TMPFILE corner case, it is allowed to link a O_TMPFILE. */ - if (inode->i_nlink == 0) - ubifs_delete_orphan(c, inode->i_ino); - inc_nlink(inode); ihold(inode); inode->i_ctime = current_time(inode); @@ -747,8 +742,6 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir, dir->i_size -= sz_change; dir_ui->ui_size = dir->i_size; drop_nlink(inode); - if (inode->i_nlink == 0) - ubifs_add_orphan(c, inode->i_ino); unlock_2_inodes(dir, inode); ubifs_release_budget(c, &req); iput(inode);
There is a potential space leak problem while linking tmpfile, in which case, inode node (with nlink=0) is valid in tnc (on flash), which leads to space leak. Meanwhile, the corresponding data nodes won't be released from tnc. For example, (A reproducer can be found in Link): $ mount UBIFS [process A] [process B] [TNC] [orphan area] ubifs_tmpfile inode_A (nlink=0) inode_A do_commit inode_A (nlink=0) inode_A ↑ (comment: It makes sure not replay inode_A in next mount) ubifs_link inode_A (nlink=0) inode_A ubifs_delete_orphan inode_A (nlink=0) do_commit inode_A (nlink=0) ---> POWERCUT <--- (ubifs_jnl_update) $ mount UBIFS inode_A will neither be replayed in ubifs_replay_journal() nor ubifs_mount_orphans(). inode_A (nlink=0) with its data nodes will always on tnc, it occupy space but is non-visable for users. Commit ee1438ce5dc4d ("ubifs: Check link count of inodes when killing orphans.") handles problem in mistakenly deleting relinked tmpfile while replaying orphan area. Since that, tmpfile inode should always live in orphan area even it is linked. Fix it by reverting commit 32fe905c17f001 ("ubifs: Fix O_TMPFILE corner case in ubifs_link()"). Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Cc: <stable@vger.kernel.org> # v5.3+ Fixes: 32fe905c17f001 ("ubifs: Fix O_TMPFILE corner case in ubifs_link()") Link: https://bugzilla.kernel.org/show_bug.cgi?id=208405 --- fs/ubifs/dir.c | 7 ------- 1 file changed, 7 deletions(-)