From patchwork Thu Mar 4 17:41:14 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michal Schmidt
X-Patchwork-Id: 46964
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by ozlabs.org (Postfix) with ESMTP id 5EF3AB7CE8
	for ; Fri, 5 Mar 2010 04:41:29 +1100 (EST)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756385Ab0CDRlU (ORCPT ); Thu, 4 Mar 2010 12:41:20 -0500
Received: from mx1.redhat.com ([209.132.183.28]:6340 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752504Ab0CDRlS (ORCPT ); Thu, 4 Mar 2010 12:41:18 -0500
Received: from int-mx03.intmail.prod.int.phx2.redhat.com
	(int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.16])
	by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id o24HfGfu009320
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK);
	Thu, 4 Mar 2010 12:41:16 -0500
Received: from leela (leela.englab.brq.redhat.com [10.34.32.196])
	by int-mx03.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP
	id o24HfFg6022422; Thu, 4 Mar 2010 12:41:15 -0500
Date: Thu, 4 Mar 2010 18:41:14 +0100
From: Michal Schmidt
To: David Miller
Cc: netdev@vger.kernel.org
Subject: uninterruptible sleep in unix_dgram_recvmsg
Message-ID: <20100304184114.62881b21@leela>
Organization: Red Hat
Mime-Version: 1.0
X-Scanned-By: MIMEDefang 2.67 on 10.5.11.16
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

Hello David.

When multiple tasks call recv() on an AF_UNIX/SOCK_DGRAM socket to which
no one sends anything, only the first one sleeps interruptibly. The
others are in uninterruptible sleep, which artificially inflates
loadavg. After two minutes, the hung task watchdog triggers and prints
ugly warnings.

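To make the scenario concrete, here is a minimal userspace sketch. It is an illustration only, not the reproducer attached to the bug report mentioned below. The helper name count_timed_out_receivers() is hypothetical, and the SO_RCVTIMEO receive timeout is an addition of the sketch so that it terminates instead of blocking forever. It forks several tasks that all call recv() on the same idle AF_UNIX/SOCK_DGRAM socket and counts how many of them see the receive time out.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the kernel or the bug report): fork
 * `ntasks` children that all recv() on one idle AF_UNIX/SOCK_DGRAM
 * socket. Returns the number of children whose recv() timed out,
 * or -1 if the socket could not be set up.
 */
int count_timed_out_receivers(int ntasks, long timeout_ms)
{
	int sv[2];
	int started = 0, timed_out = 0;
	struct timeval tv = {
		.tv_sec = timeout_ms / 1000,
		.tv_usec = (timeout_ms % 1000) * 1000,
	};

	assert(ntasks > 0 && timeout_ms > 0);

	/* sv[1] stays open but nothing is ever written to it. */
	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;

	/* Assumption of this sketch: a timeout so every recv() returns. */
	if (setsockopt(sv[0], SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}

	for (int i = 0; i < ntasks; i++) {
		pid_t pid = fork();

		if (pid < 0)
			break;
		if (pid == 0) {
			char buf[16];
			ssize_t n = recv(sv[0], buf, sizeof(buf), 0);

			/* Exit status 0 means "recv() timed out". */
			_exit(n < 0 && (errno == EAGAIN ||
					errno == EWOULDBLOCK) ? 0 : 1);
		}
		started++;
	}

	/* Reap every child that was started and count the timeouts. */
	for (int i = 0; i < started; i++) {
		int status;

		if (wait(&status) > 0 && WIFEXITED(status) &&
		    WEXITSTATUS(status) == 0)
			timed_out++;
	}

	close(sv[0]);
	close(sv[1]);
	return timed_out;
}
```

Calling count_timed_out_receivers(3, 100) starts three receivers with a 100 ms timeout each. On a kernel that behaves as described above, I would expect the later receivers to sleep uninterruptibly until the earlier ones give up, so the total run time should grow with the number of tasks; that is an expectation, not something measured here.
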
The bug is reported here (with a reproducer attached):
https://bugzilla.redhat.com/show_bug.cgi?id=529202

While the first task waits for a packet to arrive in
skb_recv_datagram(), it holds the u->readlock mutex, on which the other
tasks are waiting.

My first idea was to simply replace mutex_lock with
mutex_lock_interruptible. This solves the problem, but one issue
remains: the receive timeout (SO_RCVTIMEO) would start ticking only
after the process got the mutex and entered skb_recv_datagram().

So instead I started to think about why u->readlock is held across
skb_recv_datagram() at all. I found that it was added in 2.6.10 by your
patch "[AF_UNIX]: Serialize dgram read using semaphore just like stream",
which apparently fixed an exploitable race condition (CAN-2004-1068).

I don't know what exactly u->readlock protects here. IOW, what race
would this patch cause?

---
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index f255119..01387da 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1660,9 +1660,9 @@ static int unix_dgram_recvmsg(struct kiocb *iocb, struct socket *sock,
 
 	msg->msg_namelen = 0;
 
+	skb = skb_recv_datagram(sk, flags, noblock, &err);
 	mutex_lock(&u->readlock);
 
-	skb = skb_recv_datagram(sk, flags, noblock, &err);
 	if (!skb) {
 		unix_state_lock(sk);
 		/* Signal EOF on disconnected non-blocking SEQPACKET socket. */