From patchwork Fri Dec 22 19:47:55 2017
X-Patchwork-Submitter: Tom Herbert
X-Patchwork-Id: 852551
X-Patchwork-Delegate: davem@davemloft.net
From: Tom Herbert
To: davem@davemloft.net
Cc: netdev@vger.kernel.org, dvyukov@google.com, john.fastabend@gmail.com,
    rohit@quantonium.net, Tom Herbert
Subject: [PATCH net-next 4/4] kcm: Address deadlock between TX and RX paths
Date: Fri, 22 Dec 2017 11:47:55 -0800
Message-Id: <20171222194755.8544-5-tom@quantonium.net>
In-Reply-To: <20171222194755.8544-1-tom@quantonium.net>
References: <20171222194755.8544-1-tom@quantonium.net>

Both the transmit and receive paths of KCM need to take both the KCM
socket lock and the psock socket lock; however, they take the locks in
opposite order, which can lead to deadlock. This patch changes the
transmit path (kcm_write_msgs, to be specific) so that the locks are
taken in the proper order.
try_lock_sock is first used to try to take the lower socket lock; if
that succeeds, sending data can proceed without dropping the KCM lock.
If try_lock_sock fails, the KCM lock is released, lock_sock is done on
the lower socket, and then lock_sock is done on the KCM sock. A
doing_write_msgs flag has been added to the kcm structure to prevent
multiple threads doing write_msgs while the KCM lock is dropped.
kernel_sendpage_locked is now called to send the data with the lock
already held.

Signed-off-by: Tom Herbert
---
 include/net/kcm.h |  1 +
 net/kcm/kcmsock.c | 64 ++++++++++++++++++++++++++++++++++++++++---------------
 2 files changed, 48 insertions(+), 17 deletions(-)

diff --git a/include/net/kcm.h b/include/net/kcm.h
index 2a8965819db0..22bd7dd3eedb 100644
--- a/include/net/kcm.h
+++ b/include/net/kcm.h
@@ -78,6 +78,7 @@ struct kcm_sock {
 	/* Don't use bit fields here, these are set under different locks */
 	bool tx_wait;
 	bool tx_wait_more;
+	bool doing_write_msgs;
 
 	/* Receive */
 	struct kcm_psock *rx_psock;
diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index d4e98f20fc2a..3eb3179b96b3 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -574,13 +574,19 @@ static void kcm_report_tx_retry(struct kcm_sock *kcm)
 static int kcm_write_msgs(struct kcm_sock *kcm)
 {
 	struct sock *sk = &kcm->sk;
-	struct kcm_psock *psock;
-	struct sk_buff *skb, *head;
-	struct kcm_tx_msg *txm;
+	struct sk_buff *head = skb_peek(&sk->sk_write_queue);
 	unsigned short fragidx, frag_offset;
 	unsigned int sent, total_sent = 0;
+	struct kcm_psock *psock;
+	struct kcm_tx_msg *txm;
+	struct sk_buff *skb;
 	int ret = 0;
 
+	if (unlikely(kcm->doing_write_msgs))
+		return 0;
+
+	kcm->doing_write_msgs = true;
+
 	kcm->tx_wait_more = false;
 	psock = kcm->tx_psock;
 	if (unlikely(psock && psock->tx_stopped)) {
@@ -598,15 +604,36 @@ static int kcm_write_msgs(struct kcm_sock *kcm)
 		return 0;
 	}
 
+try_again:
+	psock = reserve_psock(kcm);
+	if (!psock)
+		goto out_no_release_psock;
+
+	/* Get lock for lower sock */
+	if (!try_lock_sock(psock->sk)) {
+		/* Someone else is holding the lower sock lock. We need to
+		 * release the KCM lock and get the psock lock first. This is
+		 * needed since the receive path obtains the locks in reverse
+		 * order and we want to avoid deadlock. Note that
+		 * write_msgs can't be reentered when we drop the KCM lock
+		 * since doing_write_msgs is set.
+		 */
+		release_sock(&kcm->sk);
+
+		/* Take locks in order that receive path does */
+		lock_sock(psock->sk);
+		lock_sock(&kcm->sk);
+	}
+
+	/* At this point we have a reserved psock and its lower socket is
+	 * locked.
+	 */
+
 	head = skb_peek(&sk->sk_write_queue);
 	txm = kcm_tx_msg(head);
 
 	if (txm->sent) {
 		/* Send of first skbuff in queue already in progress */
-		if (WARN_ON(!psock)) {
-			ret = -EINVAL;
-			goto out;
-		}
 		sent = txm->sent;
 		frag_offset = txm->frag_offset;
 		fragidx = txm->fragidx;
@@ -615,11 +642,6 @@ static int kcm_write_msgs(struct kcm_sock *kcm)
 		goto do_frag;
 	}
 
-try_again:
-	psock = reserve_psock(kcm);
-	if (!psock)
-		goto out;
-
 	do {
 		skb = head;
 		txm = kcm_tx_msg(head);
@@ -643,11 +665,12 @@ static int kcm_write_msgs(struct kcm_sock *kcm)
 				goto out;
 			}
 
-			ret = kernel_sendpage(psock->sk->sk_socket,
-					      frag->page.p,
-					      frag->page_offset + frag_offset,
-					      frag->size - frag_offset,
-					      MSG_DONTWAIT);
+			ret = kernel_sendpage_locked(psock->sk, frag->page.p,
+						     frag->page_offset +
+								frag_offset,
+						     frag->size -
+								frag_offset,
+						     MSG_DONTWAIT);
 			if (ret <= 0) {
 				if (ret == -EAGAIN) {
 					/* Save state to try again when there's
@@ -669,6 +692,7 @@ static int kcm_write_msgs(struct kcm_sock *kcm)
 				 */
 				kcm_abort_tx_psock(psock, ret ? -ret : EPIPE,
 						   true);
+				release_sock(psock->sk);
 				unreserve_psock(kcm);
 
 				txm->sent = 0;
@@ -704,7 +728,13 @@ static int kcm_write_msgs(struct kcm_sock *kcm)
 		total_sent += sent;
 		KCM_STATS_INCR(psock->stats.tx_msgs);
 	} while ((head = skb_peek(&sk->sk_write_queue)));
+
 out:
+	release_sock(psock->sk);
+
+out_no_release_psock:
+	kcm->doing_write_msgs = false;
+
 	if (!head) {
 		/* Done with all queued messages. */
 		WARN_ON(!skb_queue_empty(&sk->sk_write_queue));