From patchwork Fri Sep 7 20:45:29 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eric Whitney
X-Patchwork-Id: 967506
From: Eric Whitney
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, Eric Whitney
Subject: [PATCH 6/6] ext4: fix reserved cluster accounting at page invalidation time
Date: Fri, 7 Sep 2018 16:45:29 -0400
Message-Id: <20180907204529.1662-7-enwlinux@gmail.com>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20180907204529.1662-1-enwlinux@gmail.com>
References: <20180907204529.1662-1-enwlinux@gmail.com>
Sender: linux-ext4-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-ext4@vger.kernel.org

Add new code to count canceled pending cluster reservations on bigalloc
file systems and to reduce the cluster reservation count on all file
systems using delayed allocation.  This replaces old code in
ext4_da_page_release_reservation() that was incorrect.
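
For illustration only, the per-cluster walk that ext4_es_remove_blks()
performs over a block range can be modeled in userspace roughly as
below.  This is a sketch under stated assumptions, not kernel code:
b2c() and lblk_coff() are hypothetical stand-ins for EXT4_B2C() and
EXT4_LBLK_COFF(), struct piece and split_by_cluster() are invented
names, and the cluster ratio is assumed to be a power of two.

```c
/*
 * Userspace model of the cluster-by-cluster walk; all names here are
 * hypothetical and exist only for this illustration.
 */

/* models EXT4_B2C(): logical block number -> logical cluster number */
static unsigned int b2c(unsigned int ratio, unsigned int lblk)
{
	return lblk / ratio;
}

/* models EXT4_LBLK_COFF(): offset of a block within its cluster */
static unsigned int lblk_coff(unsigned int ratio, unsigned int lblk)
{
	return lblk & (ratio - 1);	/* assumes ratio is a power of two */
}

struct piece {
	unsigned int first;	/* first block of the piece */
	unsigned int last;	/* last block of the piece */
};

/*
 * Split [lblk, lblk + len - 1] into pieces that never cross a cluster
 * boundary.  Returns the number of pieces written to out[].
 */
static int split_by_cluster(unsigned int ratio, unsigned int lblk,
			    unsigned int len, struct piece *out, int max)
{
	unsigned int last_lclu, first, length, remainder;
	int n = 0;

	if (len == 0)
		return 0;

	last_lclu = b2c(ratio, lblk + len - 1);

	for (first = lblk, remainder = len;
	     remainder > 0 && n < max;
	     first += length, remainder -= length) {
		if (b2c(ratio, first) == last_lclu)
			length = remainder;	/* final cluster: take the rest */
		else
			length = ratio - lblk_coff(ratio, first);

		out[n].first = first;
		out[n].last = first + length - 1;
		n++;
	}
	return n;
}
```

With two blocks per cluster, a four-block range starting at block 5
is split into the pieces 5-5, 6-7 and 8-8, so each piece can be
checked against the extents status tree and counted on its own.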

Signed-off-by: Eric Whitney
---
 fs/ext4/ext4.h           |  1 +
 fs/ext4/extents_status.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/ext4/extents_status.h |  2 ++
 fs/ext4/inode.c          | 23 +++----------
 4 files changed, 97 insertions(+), 19 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index c11dcc4cc2d1..0d731af489bf 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2477,6 +2477,7 @@ extern int ext4_page_mkwrite(struct vm_fault *vmf);
 extern int ext4_filemap_fault(struct vm_fault *vmf);
 extern qsize_t *ext4_get_reserved_space(struct inode *inode);
 extern int ext4_get_projid(struct inode *inode, kprojid_t *projid);
+extern void ext4_da_release_space(struct inode *inode, int to_free);
 extern void ext4_da_update_reserve_space(struct inode *inode,
 					int used, int quota_claim);
 extern int ext4_issue_zeroout(struct inode *inode, ext4_lblk_t lblk,
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index a2eb68f866bb..af03a87dbd84 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -1788,3 +1788,93 @@ void ext4_make_pending(struct inode *inode, ext4_lblk_t lblk, ext4_lblk_t len)
 
 	write_unlock(&EXT4_I(inode)->i_es_lock);
 }
+
+/*
+ * ext4_es_remove_blks - remove block range from extents status tree and
+ *                       reduce reservation count or cancel pending
+ *                       reservation as needed
+ *
+ * @inode - file containing range
+ * @lblk - first block in range
+ * @len - number of blocks to remove
+ *
+ */
+void ext4_es_remove_blks(struct inode *inode, ext4_lblk_t lblk,
+			 ext4_lblk_t len)
+{
+	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
+	unsigned int clu_size, reserved = 0;
+	ext4_lblk_t last_lclu, first, length, remainder, last;
+	bool delunwrit;
+	int err = 0;
+	struct pending_reservation *pr;
+	struct ext4_pending_tree *tree;
+
+	/*
+	 * Process cluster by cluster for bigalloc - there may be up to
+	 * two clusters in a 4k page with a 1k block size and two blocks
+	 * per cluster.  Also necessary for systems with larger page sizes
+	 * and potentially larger block sizes.
+	 */
+	clu_size = sbi->s_cluster_ratio;
+	last_lclu = EXT4_B2C(sbi, lblk + len - 1);
+
+	write_lock(&EXT4_I(inode)->i_es_lock);
+
+	for (first = lblk, remainder = len;
+	     remainder > 0;
+	     first += length, remainder -= length) {
+
+		if (EXT4_B2C(sbi, first) == last_lclu)
+			length = remainder;
+		else
+			length = clu_size - EXT4_LBLK_COFF(sbi, first);
+
+		/*
+		 * The BH_Delay flag, which triggers calls to this function,
+		 * and the contents of the extents status tree can be
+		 * inconsistent due to writepages activity. So, verify that
+		 * the blocks to be removed belong to an extent with delayed
+		 * and unwritten status.
+		 */
+		delunwrit = __es_scan_clu(inode, &ext4_es_is_delunwrit, first);
+
+		/*
+		 * because of the writepages effect, written and unwritten
+		 * blocks could be removed here
+		 */
+		last = first + length - 1;
+		err = __es_remove_extent(inode, first, last);
+		if (err)
+			ext4_warning(inode->i_sb,
+				     "%s: couldn't remove page (err = %d)",
+				     __func__, err);
+
+		/* non-bigalloc case: simply count the cluster for release */
+		if (sbi->s_cluster_ratio == 1 && delunwrit) {
+			reserved++;
+			continue;
+		}
+
+		/*
+		 * bigalloc case: if all delayed allocated blocks have just
+		 * been removed from a cluster, either cancel a pending
+		 * reservation if it exists or count a cluster for release
+		 */
+		if (delunwrit &&
+		    !__es_scan_clu(inode, &ext4_es_is_delayed, first)) {
+			pr = __get_pending(inode, EXT4_B2C(sbi, first));
+			if (pr != NULL) {
+				tree = &EXT4_I(inode)->i_pending_tree;
+				rb_erase(&pr->rb_node, &tree->root);
+				kmem_cache_free(ext4_pending_cachep, pr);
+			} else {
+				reserved++;
+			}
+		}
+	}
+
+	write_unlock(&EXT4_I(inode)->i_es_lock);
+
+	ext4_da_release_space(inode, reserved);
+}
diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
index 46e41ef3be0a..73152fa676d2 100644
--- a/fs/ext4/extents_status.h
+++ b/fs/ext4/extents_status.h
@@ -250,5 +250,7 @@ extern void ext4_cancel_pending(struct inode *inode, ext4_lblk_t lblk,
 				ext4_lblk_t len);
 extern void ext4_make_pending(struct inode *inode, ext4_lblk_t lblk,
 			      ext4_lblk_t len);
+extern void ext4_es_remove_blks(struct inode *inode, ext4_lblk_t lblk,
+				ext4_lblk_t len);
 
 #endif /* _EXT4_EXTENTS_STATUS_H */
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 802bbd0f99e5..2dd984f97d91 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1595,7 +1595,7 @@ static int ext4_da_reserve_space(struct inode *inode)
 	return 0;       /* success */
 }
 
-static void ext4_da_release_space(struct inode *inode, int to_free)
+void ext4_da_release_space(struct inode *inode, int to_free)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	struct ext4_inode_info *ei = EXT4_I(inode);
@@ -1634,13 +1634,11 @@ static void ext4_da_page_release_reservation(struct page *page,
 					     unsigned int offset,
 					     unsigned int length)
 {
-	int to_release = 0, contiguous_blks = 0;
+	int contiguous_blks = 0;
 	struct buffer_head *head, *bh;
 	unsigned int curr_off = 0;
 	struct inode *inode = page->mapping->host;
-	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	unsigned int stop = offset + length;
-	int num_clusters;
 	ext4_fsblk_t lblk;
 
 	BUG_ON(stop > PAGE_SIZE || stop < length);
@@ -1654,7 +1652,6 @@ static void ext4_da_page_release_reservation(struct page *page,
 			break;
 
 		if ((offset <= curr_off) && (buffer_delay(bh))) {
-			to_release++;
 			contiguous_blks++;
 			clear_buffer_delay(bh);
 		} else if (contiguous_blks) {
@@ -1662,7 +1659,7 @@ static void ext4_da_page_release_reservation(struct page *page,
 				(PAGE_SHIFT - inode->i_blkbits);
 			lblk += (curr_off >> inode->i_blkbits) -
 				contiguous_blks;
-			ext4_es_remove_extent(inode, lblk, contiguous_blks);
+			ext4_es_remove_blks(inode, lblk, contiguous_blks);
 			contiguous_blks = 0;
 		}
 		curr_off = next_off;
@@ -1671,21 +1668,9 @@ static void ext4_da_page_release_reservation(struct page *page,
 	if (contiguous_blks) {
 		lblk = page->index << (PAGE_SHIFT - inode->i_blkbits);
 		lblk += (curr_off >> inode->i_blkbits) - contiguous_blks;
-		ext4_es_remove_extent(inode, lblk, contiguous_blks);
+		ext4_es_remove_blks(inode, lblk, contiguous_blks);
 	}
 
-	/* If we have released all the blocks belonging to a cluster, then we
-	 * need to release the reserved space for that cluster. */
-	num_clusters = EXT4_NUM_B2C(sbi, to_release);
-	while (num_clusters > 0) {
-		lblk = (page->index << (PAGE_SHIFT - inode->i_blkbits)) +
-			((num_clusters - 1) << sbi->s_cluster_bits);
-		if (sbi->s_cluster_ratio == 1 ||
-		    !ext4_es_scan_clu(inode, &ext4_es_is_delayed, lblk))
-			ext4_da_release_space(inode, 1);
-
-		num_clusters--;
-	}
 }
 
 /*