From patchwork Fri Oct 3 17:08:05 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrea Arcangeli
X-Patchwork-Id: 396446
From: Andrea Arcangeli
To: qemu-devel@nongnu.org, kvm@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-api@vger.kernel.org
Date: Fri, 3 Oct 2014 19:08:05 +0200
Message-Id: <1412356087-16115-16-git-send-email-aarcange@redhat.com>
In-Reply-To: <1412356087-16115-1-git-send-email-aarcange@redhat.com>
References: <1412356087-16115-1-git-send-email-aarcange@redhat.com>
Cc: Robert Love, Dave Hansen, Jan Kara, Neil Brown, Stefan Hajnoczi,
 Andrew Jones, KOSAKI Motohiro, Michel Lespinasse, Taras Glek,
 Juan Quintela, Hugh Dickins, Isaku Yamahata, Mel Gorman, Sasha Levin,
 Android Kernel Team, "Dr. David Alan Gilbert", "Huangpeng (Peter)",
 Andres Lagar-Cavilla, Christopher Covington, Anthony Liguori,
 Paolo Bonzini, Keith Packard, Wenchao Xia, Andy Lutomirski,
 Minchan Kim, Dmitry Adamushko, Johannes Weiner, Mike Hommey,
 Andrew Morton, Linus Torvalds, Peter Feiner
Subject: [Qemu-devel] [PATCH 15/17] userfaultfd: make userfaultfd_write non
 blocking

It is generally inefficient to request the wakeup of a userfault range
in which no userfault address was previously read through
userfaultfd_read and is, in turn, waiting for a wakeup. However, it may
come in handy to wake up the same userfault range twice when multiple
threads fault on the same address. We should still return an error, so
that an application which assumes this can never happen will know it
hit a bug. So just return -ENOENT instead of blocking.

Signed-off-by: Andrea Arcangeli
---
 fs/userfaultfd.c | 34 +++++-----------------------------
 1 file changed, 5 insertions(+), 29 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 62b827e..2667d0d 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -458,9 +458,7 @@ static ssize_t userfaultfd_write(struct file *file, const char __user *buf,
 				 size_t count, loff_t *ppos)
 {
 	struct userfaultfd_ctx *ctx = file->private_data;
-	ssize_t res;
 	__u64 range[2];
-	DECLARE_WAITQUEUE(wait, current);
 
 	if (ctx->state == USERFAULTFD_STATE_ASK_PROTOCOL) {
 		__u64 protocol;
@@ -488,34 +486,12 @@ static ssize_t userfaultfd_write(struct file *file, const char __user *buf,
 	if (range[0] >= range[1])
 		return -ERANGE;
 
-	spin_lock(&ctx->fd_wqh.lock);
-	__add_wait_queue(&ctx->fd_wqh, &wait);
-	for (;;) {
-		set_current_state(TASK_INTERRUPTIBLE);
-		/* always take the fd_wqh lock before the fault_wqh lock */
-		if (find_userfault(ctx, NULL, POLLOUT)) {
-			if (!wake_userfault(ctx, range)) {
-				res = sizeof(range);
-				break;
-			}
-		}
-		if (signal_pending(current)) {
-			res = -ERESTARTSYS;
-			break;
-		}
-		if (file->f_flags & O_NONBLOCK) {
-			res = -EAGAIN;
-			break;
-		}
-		spin_unlock(&ctx->fd_wqh.lock);
-		schedule();
-		spin_lock(&ctx->fd_wqh.lock);
-	}
-	__remove_wait_queue(&ctx->fd_wqh, &wait);
-	__set_current_state(TASK_RUNNING);
-	spin_unlock(&ctx->fd_wqh.lock);
+	/* always take the fd_wqh lock before the fault_wqh lock */
+	if (find_userfault(ctx, NULL, POLLOUT))
+		if (!wake_userfault(ctx, range))
+			return sizeof(range);
 
-	return res;
+	return -ENOENT;
 }
 
 #ifdef CONFIG_PROC_FS
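
For illustration only, here is a minimal userspace sketch of how a caller
might react to the new return values. It assumes, based on the diff above,
that a wakeup request is a write() of two __u64 values (start, end) that
returns sizeof(range) on success, and that failures surface as errno ENOENT
(no waiter in the range) or ERANGE (start >= end). The names
uffd_wake_range, enum wake_result, write_fn and fake_write_no_waiter are
hypothetical and not part of this patch; the write function is passed as a
parameter only so the ENOENT path can be exercised without a real
userfaultfd.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical outcome codes for a wakeup request (not part of the patch). */
enum wake_result {
	WAKE_OK,         /* write returned sizeof(range): waiters were woken */
	WAKE_NO_WAITER,  /* ENOENT: nothing was waiting in the range */
	WAKE_BAD_RANGE,  /* ERANGE: start >= end */
	WAKE_ERROR       /* any other failure, including a short write */
};

/* Same signature as write(2), so the real call or a stub can be passed in. */
typedef ssize_t (*write_fn)(int fd, const void *buf, size_t count);

/*
 * Ask for a wakeup of [start, end) and classify the result. With the
 * non-blocking behaviour described above, the call is assumed to return
 * immediately whether or not a thread is waiting in the range.
 */
static enum wake_result uffd_wake_range(write_fn do_write, int ufd,
					uint64_t start, uint64_t end)
{
	uint64_t range[2] = { start, end };
	ssize_t n;

	assert(do_write != NULL);

	n = do_write(ufd, range, sizeof(range));
	if (n == (ssize_t)sizeof(range))
		return WAKE_OK;
	if (n < 0 && errno == ENOENT)
		return WAKE_NO_WAITER;
	if (n < 0 && errno == ERANGE)
		return WAKE_BAD_RANGE;
	return WAKE_ERROR;
}

/* Stub standing in for the kernel when no thread waits in the range. */
static ssize_t fake_write_no_waiter(int fd, const void *buf, size_t count)
{
	(void)fd;
	(void)buf;
	(void)count;
	errno = ENOENT;
	return -1;
}
```

An application that wakes the same range twice on purpose (several threads
faulting on one address) could treat WAKE_NO_WAITER as benign, while one
that believes this can never happen could treat it as a bug, which is the
distinction the changelog draws. A real caller would pass write itself as
do_write.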