uninterruptible sleep in unix_dgram_recvmsg

Message ID	20100304184114.62881b21@leela
State	Changes Requested, archived
Delegated to:	David Miller
Headers
Return-Path: <netdev-owner@vger.kernel.org>
Date: Thu, 4 Mar 2010 18:41:14 +0100
From: Michal Schmidt <mschmidt@redhat.com>
To: David Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Subject: uninterruptible sleep in unix_dgram_recvmsg
Message-ID: <20100304184114.62881b21@leela>
Organization: Red Hat
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: netdev-owner@vger.kernel.org
Precedence: bulk

Commit Message

Michal Schmidt March 4, 2010, 5:41 p.m. UTC

Hello David.

When multiple tasks call recv() on an AF_UNIX/SOCK_DGRAM socket where
no one sends anything, only the first one will sleep interruptibly. The
others are in uninterruptible sleep, causing an artificial increase of
loadavg. After two minutes, the hung task watchdog triggers and prints
ugly warnings.
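
A rough sketch of the scenario, for illustration only; this is not the
reproducer attached to the bug report. The helper name run_readers, the
16-reader cap, and the use of SO_RCVTIMEO (so that the example terminates)
are all assumptions. On an affected kernel the readers would presumably time
out one after another rather than together.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork nreaders children that all block in recv() on
 * the same AF_UNIX/SOCK_DGRAM socket, on which nobody ever sends.
 * Returns the number of children whose recv() timed out, or -1 on a
 * setup error.
 */
int run_readers(int nreaders, int timeout_ms)
{
    int sv[2], i, status, started = 0, count = 0;
    pid_t pids[16];
    struct timeval tv;

    if (nreaders < 1 || nreaders > 16 || timeout_ms <= 0)
        return -1;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;
    /* The receive timeout is only here so that every child eventually exits. */
    if (setsockopt(sv[0], SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    for (i = 0; i < nreaders; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            char buf[64];
            /* Blocks until the timeout expires, since no datagram arrives. */
            ssize_t n = recv(sv[0], buf, sizeof(buf), 0);
            _exit(n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK) ? 0 : 1);
        }
        pids[started++] = pid;
    }

    /* Count the children that reported a timeout via exit status 0. */
    for (i = 0; i < started; i++) {
        if (waitpid(pids[i], &status, 0) == pids[i] &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            count++;
    }
    close(sv[0]);
    close(sv[1]);
    return count;
}
```

Running it with a longer timeout and inspecting the children with ps should
show which sleep state each reader is in.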

The bug is reported here (with a reproducer attached):
https://bugzilla.redhat.com/show_bug.cgi?id=529202

While the first task awaits the arrival of a packet in
skb_recv_datagram(), it holds the u->readlock mutex, on which the
other tasks will be waiting.

My first idea was to simply replace mutex_lock with
mutex_lock_interruptible. This solves the problem, but one
issue still remains - the receiving timeout (SO_RCVTIMEO) would start
ticking only after the process got the mutex and entered into
skb_recv_datagram().
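
The timeout concern can be made concrete with a little arithmetic. The
following is a model, not kernel code: it assumes no datagram ever arrives,
no signal is delivered, and every reader uses the same SO_RCVTIMEO value.
The function names and the 1-based queue position are made up for
illustration.

```c
/*
 * Model only. If the timeout clock starts after the mutex is acquired,
 * the (position - 1) readers ahead in the queue each hold the mutex for
 * one full timeout before this reader's own timeout begins to run.
 */
long wait_with_late_timeout_ms(int position, long rcvtimeo_ms)
{
    return (long)(position - 1) * rcvtimeo_ms + rcvtimeo_ms;
}

/* What a caller would expect from SO_RCVTIMEO: the clock starts at recv(). */
long wait_expected_ms(int position, long rcvtimeo_ms)
{
    (void)position; /* queue position should not matter */
    return rcvtimeo_ms;
}
```

Under these assumptions, with a 5-second timeout the third reader would
return after roughly 15 seconds instead of 5.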

So instead of that I started to think about why u->readlock is held
across skb_recv_datagram() anyway. I found that it was added in 2.6.10
by your patch "[AF_UNIX]: Serialize dgram read using semaphore just
like stream" which apparently fixed an exploitable race condition
(CAN-2004-1068).

I don't know what exactly u->readlock protects here.
IOW, what race would this patch cause?:

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

David Miller March 8, 2010, 8:48 p.m. UTC | #1

From: Michal Schmidt <mschmidt@redhat.com>
Date: Thu, 4 Mar 2010 18:41:14 +0100

> So instead of that I started to think about why u->readlock is held
> across skb_recv_datagram() anyway. I found that it was added in 2.6.10
> by your patch "[AF_UNIX]: Serialize dgram read using semaphore just
> like stream" which apparently fixed an exploitable race condition
> (CAN-2004-1068).
> 
> I don't know what exactly u->readlock protects here.
> IOW, what race would this patch cause?:

Unfortunately I can't find any discussions about that change
and I can't find my own personal email archives from that far
back.

This is what irks me about handling security issues off-list
and in private.

In any event, I'm pretty sure we need to protect the dequeue
of SKBs from the datagram recv_queue with that mutex.  So
I'm wary of applying your patch.

Can you find a way to fix this without moving the SKB
dequeue outside of the lock?

Thanks.


Patch

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index f255119..01387da 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1660,9 +1660,9 @@  static int unix_dgram_recvmsg(struct kiocb *iocb, struct socket *sock,
 
 	msg->msg_namelen = 0;
 
+	skb = skb_recv_datagram(sk, flags, noblock, &err);
 	mutex_lock(&u->readlock);
 
-	skb = skb_recv_datagram(sk, flags, noblock, &err);
 	if (!skb) {
 		unix_state_lock(sk);
 		/* Signal EOF on disconnected non-blocking SEQPACKET socket. */
