diff mbox

[v2] tcp: perform DMA to userspace only if there is a task waiting for it

Message ID alpine.LNX.2.00.1207272237470.30412@pobox.suse.cz
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Jiri Kosina July 27, 2012, 8:38 p.m. UTC
Back in 2006, commit 1a2449a87b ("[I/OAT]: TCP recv offload to I/OAT")
added support for receive offloading to IOAT dma engine if available.

The code in tcp_rcv_established() tries to perform early DMA copy if
applicable. It however does so without checking whether the userspace
task is actually expecting the data in the buffer.

This is not a problem under normal circumstances, but there is a corner
case where this doesn't work -- and that's when MSG_TRUNC flag to
recvmsg() is used.

If the IOAT dma engine is not used, the code properly checks whether
there is a valid ucopy.task and the socket is owned by userspace, but
misses the check in the dmaengine case.

This problem can be observed in real trivially -- for example 'tbench' is a
good reproducer, as it makes a heavy use of MSG_TRUNC. On systems utilizing
IOAT, you will soon find tbench waiting indefinitely in sk_wait_data(), as they
have been already early-copied in tcp_rcv_established() using dma engine.

This patch introduces the same check we are performing in the simple
iovec copy case to the IOAT case as well. It fixes the indefinite
recvmsg(MSG_TRUNC) hangs.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
 net/ipv4/tcp_input.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

Comments

David Miller July 27, 2012, 8:48 p.m. UTC | #1
From: Jiri Kosina <jkosina@suse.cz>
Date: Fri, 27 Jul 2012 22:38:50 +0200 (CEST)

> Back in 2006, commit 1a2449a87b ("[I/OAT]: TCP recv offload to I/OAT")
> added support for receive offloading to IOAT dma engine if available.
> 
> The code in tcp_rcv_established() tries to perform early DMA copy if
> applicable. It however does so without checking whether the userspace
> task is actually expecting the data in the buffer.
> 
> This is not a problem under normal circumstances, but there is a corner
> case where this doesn't work -- and that's when MSG_TRUNC flag to
> recvmsg() is used.
> 
> If the IOAT dma engine is not used, the code properly checks whether
> there is a valid ucopy.task and the socket is owned by userspace, but
> misses the check in the dmaengine case.
> 
> This problem can be observed in real trivially -- for example 'tbench' is a
> good reproducer, as it makes a heavy use of MSG_TRUNC. On systems utilizing
> IOAT, you will soon find tbench waiting indefinitely in sk_wait_data(), as they
> have been already early-copied in tcp_rcv_established() using dma engine.
> 
> This patch introduces the same check we are performing in the simple
> iovec copy case to the IOAT case as well. It fixes the indefinite
> recvmsg(MSG_TRUNC) hangs.
> 
> Signed-off-by: Jiri Kosina <jkosina@suse.cz>

Applied.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 3e07a64..76e3bf6 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5475,7 +5475,9 @@  int tcp_rcv_established(struct sock *sk, struct sk_buff *skb,
 			if (tp->copied_seq == tp->rcv_nxt &&
 			    len - tcp_header_len <= tp->ucopy.len) {
 #ifdef CONFIG_NET_DMA
-				if (tcp_dma_try_early_copy(sk, skb, tcp_header_len)) {
+				if (tp->ucopy.task == current &&
+				    sock_owned_by_user(sk) &&
+				    tcp_dma_try_early_copy(sk, skb, tcp_header_len)) {
 					copied_early = 1;
 					eaten = 1;
 				}