From patchwork Thu Sep 11 22:49:07 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: TR Reardon X-Patchwork-Id: 388418 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 286C714013B for ; Fri, 12 Sep 2014 08:49:46 +1000 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752160AbaIKWtK (ORCPT ); Thu, 11 Sep 2014 18:49:10 -0400 Received: from bay004-omc3s27.hotmail.com ([65.54.190.165]:57404 "EHLO BAY004-OMC3S27.hotmail.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751105AbaIKWtI convert rfc822-to-8bit (ORCPT ); Thu, 11 Sep 2014 18:49:08 -0400 Received: from BAY179-W1 ([65.54.190.187]) by BAY004-OMC3S27.hotmail.com with Microsoft SMTPSVC(7.5.7601.22724); Thu, 11 Sep 2014 15:49:08 -0700 X-TMN: [Kq7EF85qZIJpEojf3MRPuCM5eZ087ycc] X-Originating-Email: [thomas_reardon@hotmail.com] Message-ID: From: TR Reardon To: Andreas Dilger CC: "Darrick J. Wong" , "linux-ext4@vger.kernel.org" Subject: RE: possible different donor file naming in e4defrag Date: Thu, 11 Sep 2014 18:49:07 -0400 Importance: Normal In-Reply-To: <25905DD3-CD3E-42F2-A101-715E7C205CEB@dilger.ca> References: <20140815203909.GM2808@birch.djwong.org> , <4DF4149D-9995-475D-B25E-DAE799DE6100@dilger.ca> , <25905DD3-CD3E-42F2-A101-715E7C205CEB@dilger.ca> MIME-Version: 1.0 X-OriginalArrivalTime: 11 Sep 2014 22:49:08.0383 (UTC) FILETIME=[98B5FAF0:01CFCE12] Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org > On Sep 11, 2014, at 1:48 PM, TR Reardon wrote: >> Picking this back up. How would O_TMPFILE avoid races? It definitely avoids the unwanted mtime/atime update, but then the existing ".defrag" pseudo-lock file would no longer be available. How could you use O_TMPFILE and still avoid multiple defrag? If this isn't possible, then resetting the parent times on unlink(tmpfile), as you suggest, is the simplest way out of this. > > Looking at this the opposite way - what are the chances that there > will be concurrent defrags on the same file? Basically no chance at > all. So long as it doesn't explode (the kernel would need to protect > against this anyway to avoid malicious apps), the worst case is that > there will be some extra defragmentation done in a very rare case. > > Conversely, creating a temporary filename and then resetting the > parent directory timestamp is extra work for every file defragmented, > and is racy because e4defrag may "reset" the time to before the temp > file was created, but clobber a legitimate timestamp update in the > directory from some other concurrent update. That timestamp update > is always going to be racy, even if e4defrag tries to be careful. > > Cheers, Andreas Thanks, well described. So a simple attempt with O_TMPFILE first, then revert to original behavior, like below? -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/misc/e4defrag.c b/misc/e4defrag.c index d0eac60..8001182 100644 --- a/misc/e4defrag.c +++ b/misc/e4defrag.c @@ -40,6 +40,7 @@  #include  #include  #include +#include    /* A relatively new ioctl interface ... */  #ifndef EXT4_IOC_MOVE_EXT @@ -1526,31 +1527,36 @@ static int file_defrag(const char *file, const struct stat64 *buf,     /* Create donor inode */   memset(tmp_inode_name, 0, PATH_MAX + 8); - sprintf(tmp_inode_name, "%.*s.defrag", - (int)strnlen(file, PATH_MAX), file); - donor_fd = open64(tmp_inode_name, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR); + /* Try O_TMPFILE first, to avoid changing directory mtime */ + sprintf(tmp_inode_name, "%.*s", (int)strnlen(file, PATH_MAX), file); + donor_fd = open64(dirname(tmp_inode_name), O_TMPFILE | O_WRONLY | O_EXCL, S_IRUSR | S_IWUSR );   if (donor_fd < 0) { - if (mode_flag & DETAIL) { - PRINT_FILE_NAME(file); - if (errno == EEXIST) - PRINT_ERR_MSG_WITH_ERRNO( - "File is being defraged by other program"); - else - PRINT_ERR_MSG_WITH_ERRNO(NGMSG_FILE_OPEN); + sprintf(tmp_inode_name, "%.*s.defrag", + (int)strnlen(file, PATH_MAX), file); + donor_fd = open64(tmp_inode_name, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR); + if (donor_fd < 0) { + if (mode_flag & DETAIL) { + PRINT_FILE_NAME(file); + if (errno == EEXIST) + PRINT_ERR_MSG_WITH_ERRNO( + "File is being defraged by other program"); + else + PRINT_ERR_MSG_WITH_ERRNO(NGMSG_FILE_OPEN); + } + goto out;   } - goto out; - }   - /* Unlink donor inode */ - ret = unlink(tmp_inode_name); - if (ret < 0) { - if (mode_flag & DETAIL) { - PRINT_FILE_NAME(file); - PRINT_ERR_MSG_WITH_ERRNO("Failed to unlink"); + /* Unlink donor inode */ + ret = unlink(tmp_inode_name); + if (ret < 0) { + if (mode_flag & DETAIL) { + PRINT_FILE_NAME(file); + PRINT_ERR_MSG_WITH_ERRNO("Failed to unlink"); + } + goto out;   } - goto out;   } - +   /* Allocate space for donor inode */   orig_group_tmp = orig_group_head;   do {