From patchwork Tue Oct 11 07:01:57 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Dmitry Monakhov X-Patchwork-Id: 118863 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id BF689B6F70 for ; Tue, 11 Oct 2011 18:02:05 +1100 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1750872Ab1JKHCD (ORCPT ); Tue, 11 Oct 2011 03:02:03 -0400 Received: from mail-bw0-f46.google.com ([209.85.214.46]:45956 "EHLO mail-bw0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750712Ab1JKHCC (ORCPT ); Tue, 11 Oct 2011 03:02:02 -0400 Received: by bkbzt4 with SMTP id zt4so9410983bkb.19 for ; Tue, 11 Oct 2011 00:02:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=sender:from:to:cc:subject:in-reply-to:references:user-agent:date :message-id:mime-version:content-type; bh=dN0cggUoLK7R5nd4BrSLtl716qgjrBGm2QJDjfeFHpI=; b=XcjMxPa6AtwmUe5lAd9m83/PYhNNwFbgBEdPaR0GZgJXCLxkqy4JzQVGvKEzLnNZBJ ujwvENq/WAOsD7Nk1maADKEJmZoQZHLpmxTmy2nNdVf+aIPON5Iug6QS+rN76ZZs944m zKZ7XorGlsBmAWSrqbom8F16Gn+bP9LnrKdaA= Received: by 10.204.135.13 with SMTP id l13mr8594185bkt.74.1318316521193; Tue, 11 Oct 2011 00:02:01 -0700 (PDT) Received: from smtp.gmail.com (swsoft-msk-nat.sw.ru. [195.214.232.10]) by mx.google.com with ESMTPS id ex8sm20020580bkc.2.2011.10.11.00.01.58 (version=TLSv1/SSLv3 cipher=OTHER); Tue, 11 Oct 2011 00:01:59 -0700 (PDT) From: Dmitry Monakhov To: Curt Wohlgemuth , Lukas Czerner Cc: tytso@mit.edu, adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org Subject: Re: [PATCH] ext4: handle NULL p_ext in ext4_ext_next_allocated_block() In-Reply-To: References: <1318122074-16056-1-git-send-email-curtw@google.com> User-Agent: Notmuch/0.5-69-g3e4a9d6 (http://notmuchmail.org) Emacs/23.1.1 (i486-pc-linux-gnu) Date: Tue, 11 Oct 2011 11:01:57 +0400 Message-ID: <87ipnwnh96.fsf@dmbot.sw.ru> MIME-Version: 1.0 Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org On Mon, 10 Oct 2011 08:28:23 -0700, Curt Wohlgemuth wrote: > Hi Lukas: > > On Mon, Oct 10, 2011 at 12:19 AM, Lukas Czerner wrote: > > On Sat, 8 Oct 2011, Curt Wohlgemuth wrote: > > > >> In ext4_ext_next_allocated_block(), the path[depth] might > >> have a p_ext that is NULL -- see ext4_ext_binsearch().  In > >> such a case, dereferencing it will crash the machine. > >> > >> This patch checks for p_ext == NULL in > >> ext4_ext_next_allocated_block() before dereferencinging it. > >> > >> Tested using a hand-crafted an inode with eh_entries == 0 in > >> an extent block, verified that running FIEMAP on it crashes > >> without this patch, works fine with it. > > > > Hi Curt, > > > > It seems to me that even that the patch fixes the NULL dereference, it > > is not a complete solution. Is it possible that in "normal" case p_ext > > would be NULL ? I think that this is only possible in extent split/add > > case (as noted in ext4_ext_binsearch()) which should be atomic to the > > other operations (locked i_mutex?). > > Yes, unfortunately, it is possible in "normal" cases for p_ext to be NULL. > > We've seen this problem during what appears to be a race between an > inode growth (or truncate?) and another task doing a FIEMAP ioctl. > The real problem is that FIEMAP handing in ext4 is just, well, buggy? Wow, IMHO it not just buggy, it is obviously incorrect, IMHO it is more fair just return -ENOTSUPP, at least it is much safer. Yes calling FIEMAP and truncate/write in parallel is stupid, but not prohibited. > > ext4_ext_walk_space() will get the i_data_sem, construct the path > array, then release the semaphore. But then it does a bazillion > accesses on the extent/header/index pointers in the path array, with > no protection against truncate, growth, or any other changes. As far > as I can tell, this is the only use of a path array retrieved from > ext4_ext_find_extent() that isn't completely covered by i_data_sem. In that case i_data sem protects us from nothing. Path collected can simply disappear under us. And in fact i don't understand the reason why we drop i_data_sem too soon. Are any reason to do that? Seems like following patch should fix the issue. From 12cd56ccd86c4d132f186034a9c11b0a2441a19f Mon Sep 17 00:00:00 2001 From: Dmitry Monakhov Date: Tue, 11 Oct 2011 10:44:31 +0400 Subject: [PATCH] ext4: do not drop i_data_sem too soon Path returned from ext4_find_extent is valid only while we hold i_data_sem, so we can drop it only after we nolonger need it. Signed-off-by: Dmitry Monakhov --- fs/ext4/extents.c | 20 +++++++++++--------- 1 files changed, 11 insertions(+), 9 deletions(-) diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index da4583f..c716a1f 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -1847,23 +1847,32 @@ static int ext4_ext_walk_space(struct inode *inode, ext4_lblk_t block, num = last - block; /* find extent for this block */ down_read(&EXT4_I(inode)->i_data_sem); + if (ext_depth(inode) != depth) { + /* depth was changed. we have to realloc path */ + kfree(path); + path = NULL; + } + path = ext4_ext_find_extent(inode, block, path); - up_read(&EXT4_I(inode)->i_data_sem); if (IS_ERR(path)) { err = PTR_ERR(path); + up_read(&EXT4_I(inode)->i_data_sem); path = NULL; break; } depth = ext_depth(inode); if (unlikely(path[depth].p_hdr == NULL)) { + up_read(&EXT4_I(inode)->i_data_sem); EXT4_ERROR_INODE(inode, "path[%d].p_hdr == NULL", depth); err = -EIO; break; } + ex = path[depth].p_ext; next = ext4_ext_next_allocated_block(path); - + up_read(&EXT4_I(inode)->i_data_sem); + ext4_ext_drop_refs(path); exists = 0; if (!ex) { /* there is no extent yet, so try to allocate @@ -1915,7 +1924,6 @@ static int ext4_ext_walk_space(struct inode *inode, ext4_lblk_t block, break; } err = func(inode, next, &cbex, ex, cbdata); - ext4_ext_drop_refs(path); if (err < 0) break; @@ -1927,12 +1935,6 @@ static int ext4_ext_walk_space(struct inode *inode, ext4_lblk_t block, break; } - if (ext_depth(inode) != depth) { - /* depth was changed. we have to realloc path */ - kfree(path); - path = NULL; - } - block = cbex.ec_block + cbex.ec_len; }