From: Eric Whitney
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, Eric Whitney
Subject: [RFC PATCH 6/6] ext4: fix reserved cluster accounting at page invalidation time
Date: Wed, 22 Aug 2018 22:27:07 -0400
Message-Id: <20180823022707.14593-7-enwlinux@gmail.com>
In-Reply-To: <20180823022707.14593-1-enwlinux@gmail.com>
References: <20180823022707.14593-1-enwlinux@gmail.com>

Add new code to count canceled pending cluster reservations on bigalloc
file systems and to reduce the cluster reservation count on all file
systems using delayed allocation.  This replaces old code in
ext4_da_page_release_reservation that was incorrect.

Signed-off-by: Eric Whitney
---
 fs/ext4/ext4.h           |  1 +
 fs/ext4/extents_status.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/ext4/extents_status.h |  2 ++
 fs/ext4/inode.c          | 23 +++----------
 4 files changed, 97 insertions(+), 19 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 77ff2a522315..7ee2a72ba9dd 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2477,6 +2477,7 @@ extern int ext4_page_mkwrite(struct vm_fault *vmf);
 extern int ext4_filemap_fault(struct vm_fault *vmf);
 extern qsize_t *ext4_get_reserved_space(struct inode *inode);
 extern int ext4_get_projid(struct inode *inode, kprojid_t *projid);
+extern void ext4_da_release_space(struct inode *inode, int to_free);
 extern void ext4_da_update_reserve_space(struct inode *inode,
 					int used, int quota_claim);
 extern int ext4_issue_zeroout(struct inode *inode, ext4_lblk_t lblk,
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index ff39944a8225..55cd495b70f8 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -1788,3 +1788,93 @@ void ext4_make_pending(struct inode *inode, ext4_lblk_t lblk, ext4_lblk_t len)
 
 	write_unlock(&EXT4_I(inode)->i_es_lock);
 }
+
+/*
+ * ext4_es_remove_blks - remove block range from extents status tree and
+ *                       reduce reservation count or cancel pending
+ *                       reservation as needed
+ *
+ * @inode - file containing range
+ * @lblk - first block in range
+ * @len - number of blocks to remove
+ *
+ */
+void ext4_es_remove_blks(struct inode *inode, ext4_lblk_t lblk,
+			 ext4_lblk_t len)
+{
+	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
+	unsigned int clu_size, reserved = 0;
+	ext4_lblk_t last_lclu, first, length, remainder, last;
+	bool delunwrit;
+	int err = 0;
+	struct pending_reservation *pr;
+	struct ext4_pending_tree *tree;
+
+	/*
+	 * Process cluster by cluster for bigalloc - there may be up to
+	 * two clusters in a 4k page with a 1k block size and two blocks
+	 * per cluster.  Also necessary for systems with larger page sizes
+	 * and potentially larger block sizes.
+	 */
+	clu_size = sbi->s_cluster_ratio;
+	last_lclu = EXT4_B2C(sbi, lblk + len - 1);
+
+	write_lock(&EXT4_I(inode)->i_es_lock);
+
+	for (first = lblk, remainder = len;
+	     remainder > 0;
+	     first += length, remainder -= length) {
+
+		if (EXT4_B2C(sbi, first) == last_lclu)
+			length = remainder;
+		else
+			length = clu_size - EXT4_LBLK_COFF(sbi, first);
+
+		/*
+		 * The BH_Delay flag, which triggers calls to this function,
+		 * and the contents of the extents status tree can be
+		 * inconsistent due to writepages activity. So, verify that
+		 * the blocks to be removed belong to an extent with delayed
+		 * and unwritten status.
+		 */
+		delunwrit = __es_scan_clu(inode, &ext4_es_is_delunwrit, first);
+
+		/*
+		 * because of the writepages effect, written and unwritten
+		 * blocks could be removed here
+		 */
+		last = first + length - 1;
+		err = __es_remove_extent(inode, first, last);
+		if (err)
+			ext4_warning(inode->i_sb,
+				     "%s: couldn't remove page (err = %d)",
+				     __func__, err);
+
+		/* non-bigalloc case: simply count the cluster for release */
+		if (sbi->s_cluster_ratio == 1 && delunwrit) {
+			reserved++;
+			continue;
+		}
+
+		/*
+		 * bigalloc case: if all delayed allocated blocks have just
+		 * been removed from a cluster, either cancel a pending
+		 * reservation if it exists or count a cluster for release
+		 */
+		if (delunwrit &&
+		    !__es_scan_clu(inode, &ext4_es_is_delayed, first)) {
+			pr = __get_pending(inode, EXT4_B2C(sbi, first));
+			if (pr != NULL) {
+				tree = &EXT4_I(inode)->i_pending_tree;
+				rb_erase(&pr->rb_node, &tree->root);
+				kmem_cache_free(ext4_pending_cachep, pr);
+			} else {
+				reserved++;
+			}
+		}
+	}
+
+	write_unlock(&EXT4_I(inode)->i_es_lock);
+
+	ext4_da_release_space(inode, reserved);
+}
diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
index 34c6032a4246..5f04387c3985 100644
--- a/fs/ext4/extents_status.h
+++ b/fs/ext4/extents_status.h
@@ -251,5 +251,7 @@ extern void ext4_cancel_pending(struct inode *inode, ext4_lblk_t lblk,
 				ext4_lblk_t len);
 extern void ext4_make_pending(struct inode *inode, ext4_lblk_t lblk,
 			      ext4_lblk_t len);
+extern void ext4_es_remove_blks(struct inode *inode, ext4_lblk_t lblk,
+				ext4_lblk_t len);
 
 #endif /* _EXT4_EXTENTS_STATUS_H */
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index f83fbbb1d297..8bcf84f5b4af 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1595,7 +1595,7 @@ static int ext4_da_reserve_space(struct inode *inode)
 	return 0;       /* success */
 }
 
-static void ext4_da_release_space(struct inode *inode, int to_free)
+void ext4_da_release_space(struct inode *inode, int to_free)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	struct ext4_inode_info *ei = EXT4_I(inode);
@@ -1634,13 +1634,11 @@ static void ext4_da_page_release_reservation(struct page *page,
 					     unsigned int offset,
 					     unsigned int length)
 {
-	int to_release = 0, contiguous_blks = 0;
+	int contiguous_blks = 0;
 	struct buffer_head *head, *bh;
 	unsigned int curr_off = 0;
 	struct inode *inode = page->mapping->host;
-	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	unsigned int stop = offset + length;
-	int num_clusters;
 	ext4_fsblk_t lblk;
 
 	BUG_ON(stop > PAGE_SIZE || stop < length);
@@ -1654,7 +1652,6 @@ static void ext4_da_page_release_reservation(struct page *page,
 			break;
 
 		if ((offset <= curr_off) && (buffer_delay(bh))) {
-			to_release++;
 			contiguous_blks++;
 			clear_buffer_delay(bh);
 		} else if (contiguous_blks) {
@@ -1662,7 +1659,7 @@ static void ext4_da_page_release_reservation(struct page *page,
 				(PAGE_SHIFT - inode->i_blkbits);
 			lblk += (curr_off >> inode->i_blkbits) -
 				contiguous_blks;
-			ext4_es_remove_extent(inode, lblk, contiguous_blks);
+			ext4_es_remove_blks(inode, lblk, contiguous_blks);
 			contiguous_blks = 0;
 		}
 		curr_off = next_off;
@@ -1671,21 +1668,9 @@ static void ext4_da_page_release_reservation(struct page *page,
 	if (contiguous_blks) {
 		lblk = page->index << (PAGE_SHIFT - inode->i_blkbits);
 		lblk += (curr_off >> inode->i_blkbits) - contiguous_blks;
-		ext4_es_remove_extent(inode, lblk, contiguous_blks);
+		ext4_es_remove_blks(inode, lblk, contiguous_blks);
 	}
 
-	/* If we have released all the blocks belonging to a cluster, then we
-	 * need to release the reserved space for that cluster. */
-	num_clusters = EXT4_NUM_B2C(sbi, to_release);
-	while (num_clusters > 0) {
-		lblk = (page->index << (PAGE_SHIFT - inode->i_blkbits)) +
-			((num_clusters - 1) << sbi->s_cluster_bits);
-		if (sbi->s_cluster_ratio == 1 ||
-		    !ext4_es_scan_clu(inode, &ext4_es_is_delayed, lblk))
-			ext4_da_release_space(inode, 1);
-
-		num_clusters--;
-	}
 }
 
 /*
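
Note on the cluster-by-cluster walk in ext4_es_remove_blks(): the loop splits the block range so that each piece lies within a single cluster. The fragment below is a minimal userspace sketch of that splitting step only, not kernel code. The names split_by_cluster() and struct clu_piece are hypothetical and do not appear in the patch. The shift and mask stand in for EXT4_B2C() and EXT4_LBLK_COFF(), on the assumption that the cluster ratio is a power of two.

```c
#include <assert.h>
#include <stddef.h>

/* One piece of a block range that lies entirely inside a single cluster. */
struct clu_piece {
	unsigned int first;	/* first logical block of the piece */
	unsigned int length;	/* number of blocks in the piece */
};

/*
 * Hypothetical helper (not part of the patch): split [lblk, lblk + len)
 * into per-cluster pieces in the same order the loop in
 * ext4_es_remove_blks() visits them.  cluster_bits is log2 of the number
 * of blocks per cluster (0 for non-bigalloc).
 * Returns the number of pieces written to out (at most max).
 */
static size_t split_by_cluster(unsigned int lblk, unsigned int len,
			       unsigned int cluster_bits,
			       struct clu_piece *out, size_t max)
{
	unsigned int clu_size = 1U << cluster_bits;
	unsigned int last_lclu, first, length, remainder;
	size_t n = 0;

	assert(len > 0);
	/* cluster holding the last block; stands in for EXT4_B2C() */
	last_lclu = (lblk + len - 1) >> cluster_bits;

	for (first = lblk, remainder = len;
	     remainder > 0 && n < max;
	     first += length, remainder -= length) {
		if ((first >> cluster_bits) == last_lclu)
			length = remainder;	/* final cluster: take the rest */
		else				/* run to the end of this cluster */
			length = clu_size - (first & (clu_size - 1));
		out[n].first = first;
		out[n].length = length;
		n++;
	}
	return n;
}
```

For the case named in the patch comment (a 4k page with 1k blocks and two blocks per cluster), blocks 4 through 7 with cluster_bits = 1 split into two pieces, (4, 2) and (6, 2). With cluster_bits = 0 every block becomes its own piece, which matches the non-bigalloc path where each delayed block counts as one cluster.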