
[v3,0/6] ext4: improve commit path performance for fast commit

Message ID 20220419173143.3564144-1-harshads@google.com

harshad shirwadkar April 19, 2022, 5:31 p.m. UTC
ext4: improve commit path performance for fast commit

This patch series supersedes the patch "ext4: remove journal barrier
during fast commit" sent in Feb 2022.

This patch series reworks the fast commit path to improve its overall
performance. The following optimizations have been added in this
series:

* Avoid having to lock the journal throughout the fast commit.
* Remove tracking of open handles per inode.

With the changes introduced in this patch series, the commit path for
fast commits is now as follows:

 [1] Lock the journal by calling jbd2_journal_lock_updates. This
     ensures that all the existing handles finish and no new handles
     can start.
 [2] Mark all the fast commit eligible inodes as undergoing fast commit
     by setting "EXT4_STATE_FC_COMMITTING" state.
 [3] Unlock the journal by calling jbd2_journal_unlock_updates. This
     allows new handles to start. If a new handle tries to start an
     update on any of the inodes that are being committed,
     ext4_fc_track_inode() will block until those inodes have finished
     the fast commit.
 [4] Submit data buffers of all the committing inodes.
 [5] Wait for [4] to complete.
 [6] Commit all the directory entry updates in the fast commit space.
 [7] Commit all the changed inodes in the fast commit space and clear
     "EXT4_STATE_FC_COMMITTING" for all the inodes.
 [8] Write tail tag to ensure atomicity of commits.

(The above flow has been documented in the code as well)
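
As a rough illustration of the steps above, here is a minimal
single-threaded userspace model in C. It is only a sketch: the
fc_model_* types and functions are hypothetical and do not exist in the
kernel, the booleans stand in for the real journal lock and for
EXT4_STATE_FC_COMMITTING, and a caller that would block in the kernel
is simply reported as "cannot update" here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these types do not exist in the kernel. */
struct fc_model_inode {
	bool fc_committing;  /* stands in for EXT4_STATE_FC_COMMITTING */
	bool data_written;   /* data buffers submitted and waited on */
	bool inode_logged;   /* inode written to the fast commit area */
};

struct fc_model_journal {
	bool updates_locked; /* between lock_updates and unlock_updates */
	int dentries_logged; /* directory entry updates written */
	bool tail_written;   /* tail tag written: commit is complete */
};

/* Would a new handle be allowed to update this inode right now? */
bool fc_model_can_update(const struct fc_model_journal *j,
			 const struct fc_model_inode *ino)
{
	return !j->updates_locked && !ino->fc_committing;
}

/* Steps [1]-[3]: the only part that runs with the journal locked. */
void fc_model_begin(struct fc_model_journal *j,
		    struct fc_model_inode *inodes, size_t n)
{
	j->updates_locked = true;                 /* [1] */
	for (size_t i = 0; i < n; i++)
		inodes[i].fc_committing = true;   /* [2] */
	j->updates_locked = false;                /* [3] */
}

/* Steps [4]-[8]: run with the journal unlocked. */
void fc_model_finish(struct fc_model_journal *j,
		     struct fc_model_inode *inodes, size_t n,
		     int pending_dentries)
{
	assert(!j->updates_locked);
	for (size_t i = 0; i < n; i++)
		inodes[i].data_written = true;    /* [4] and [5] */
	j->dentries_logged = pending_dentries;    /* [6] */
	for (size_t i = 0; i < n; i++) {          /* [7] */
		inodes[i].inode_logged = true;
		inodes[i].fc_committing = false;
	}
	j->tail_written = true;                   /* [8] */
}
```

In this model, between fc_model_begin() and fc_model_finish() an update
to an inode outside the commit set is allowed, while an update to a
committing inode is refused until step [7] clears the flag.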

Instead of calling ext4_fc_track_inode() in ext4_journal_start() as I
originally proposed in the code review of [PATCH V2 2/5] "ext4: for
committing inode, make ext4_fc_track_inode wait" [1], in this version
I changed the behavior of ext4_reserve_inode_write() to also call
ext4_fc_track_inode(). Let's call this approach 1.

I also evaluated another approach (approach 2) where
ext4_reserve_inode_write() acts just as an assertion to ensure that
the inode is on the fast commit list and the actual call to
ext4_fc_track_inode() is done by ext4_journal_start(). However, this
results in too many stray ext4_fc_track_inode() calls. Approach 1
reduces the number of stray ext4_fc_track_inode() calls and thus makes
the code more maintainable.

However, approach 1 introduces a potential deadlock: the caller can
hang if it grabs i_data_sem before calling ext4_fc_track_inode(). To
handle that, I have added explicit calls to ext4_fc_track_inode() in
such places. Eventually, when we migrate to using the extent status
tree for logical to physical mapping lookup, we can get rid of this
ordering requirement and also remove these calls to
ext4_fc_track_inode(). Even with these added calls, approach 1 has
fewer stray calls to ext4_fc_track_inode() than approach 2.
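
The ordering requirement can be illustrated with a small userspace
model in C. This is a sketch under stated assumptions, not the kernel
code: the lock_model_* names are hypothetical, and the model assumes
that the commit path itself needs i_data_sem to finish committing an
inode, so a caller that sleeps in the tracking call while holding that
semaphore can never be woken.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one inode; field names are illustrative only. */
struct lock_model_inode {
	bool fc_committing; /* stands in for EXT4_STATE_FC_COMMITTING */
	bool data_sem_held; /* caller currently holds i_data_sem */
};

enum lock_model_result {
	TRACK_OK,             /* inode tracked without waiting */
	TRACK_WOULD_WAIT,     /* caller sleeps until the commit finishes */
	TRACK_WOULD_DEADLOCK, /* caller sleeps holding i_data_sem */
};

/* Model of the tracking call: it must wait while the inode commits. */
enum lock_model_result lock_model_track(const struct lock_model_inode *ino)
{
	if (!ino->fc_committing)
		return TRACK_OK;
	/* Assumption: the commit cannot finish without i_data_sem. */
	return ino->data_sem_held ? TRACK_WOULD_DEADLOCK : TRACK_WOULD_WAIT;
}

/* Safe ordering: track the inode first, then take the semaphore. */
enum lock_model_result lock_model_track_first(struct lock_model_inode *ino)
{
	enum lock_model_result r;

	assert(!ino->data_sem_held);
	r = lock_model_track(ino);
	if (r != TRACK_OK)
		return r; /* real code would sleep here, then continue */
	ino->data_sem_held = true;  /* take i_data_sem */
	/* ... modify the inode's mapping ... */
	ino->data_sem_held = false; /* release i_data_sem */
	return TRACK_OK;
}

/* Unsafe ordering: take the semaphore, then track the inode. */
enum lock_model_result lock_model_sem_first(struct lock_model_inode *ino)
{
	enum lock_model_result r;

	ino->data_sem_held = true;  /* take i_data_sem */
	r = lock_model_track(ino);  /* may need to wait for the commit */
	ino->data_sem_held = false; /* release i_data_sem */
	return r;
}
```

For an inode that is not being committed, both orderings succeed. For a
committing inode, tracking first only waits, while taking the semaphore
first is reported as a deadlock.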

I verified that the patch series introduces no regressions in the
"quick" and "log" groups when the "fast_commit" feature is enabled.

[1] https://www.spinics.net/lists/linux-ext4/msg82019.html

Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>

Harshad Shirwadkar (6):
  ext4: convert i_fc_lock to spinlock
  ext4: for committing inode, make ext4_fc_track_inode wait
  ext4: mark inode dirty before grabbing i_data_sem in ext4_setattr
  ext4: rework fast commit commit path
  ext4: drop i_fc_updates from inode fc info
  ext4: update code documentation

 fs/ext4/ext4.h        |  12 +--
 fs/ext4/fast_commit.c | 240 +++++++++++++++++++++---------------------
 fs/ext4/inline.c      |   3 +
 fs/ext4/inode.c       |  10 +-
 fs/ext4/super.c       |   2 +-
 fs/jbd2/journal.c     |   2 -
 6 files changed, 136 insertions(+), 133 deletions(-)