[net-next] tcp: improve recv_skip_hint for tcp_zerocopy_receive

Message ID 20191011032702.59998-1-soheil.kdev@gmail.com
State Accepted
Delegated to: David Miller

Commit Message

Soheil Hassas Yeganeh Oct. 11, 2019, 3:27 a.m. UTC
From: Soheil Hassas Yeganeh <soheil@google.com>

tcp_zerocopy_receive() rounds down zc->length to a multiple of
PAGE_SIZE. This results in two issues:
- tcp_zerocopy_receive sets recv_skip_hint to the length of the
  receive queue if the zc->length input is smaller than the
  PAGE_SIZE, even though the data in receive queue could be
  zerocopied.
- tcp_zerocopy_receive sets recv_skip_hint to 0 in cases
  where we have a little bit of data after the perfectly-sized
  packets.

To fix these issues, do not store the rounded down value in
zc->length. Round down only the length passed to zap_page_range(),
and set recv_skip_hint to min(inq, zc->length) when the zap length is 0.

Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
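
The two cases described in the commit message can be checked with a
small userspace model of the lines touched by the second hunk. This is
only a sketch: old_hint(), new_hint(), struct hint_result and the 4096
byte PAGE_SIZE are hypothetical names and values chosen for
illustration, not kernel code, and the model covers only the initial
length and recv_skip_hint assignment, not the page-mapping loop that
follows it.

```c
#include <stdint.h>

/* Assumed page size, for illustration only. */
#define PAGE_SIZE 4096u

/* Hypothetical stand-in for the two fields of interest in
 * struct tcp_zerocopy_receive. */
struct hint_result {
	uint32_t length;
	uint32_t recv_skip_hint;
};

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Before the patch: the rounded-down value is stored back into length,
 * and the hint falls back to the whole receive queue (inq). */
static struct hint_result old_hint(uint32_t req_len, uint32_t inq)
{
	struct hint_result r;

	r.length = min_u32(req_len, inq);
	r.length &= ~(PAGE_SIZE - 1);
	r.recv_skip_hint = r.length ? 0 : inq;
	return r;
}

/* After the patch: only the zap length is rounded down, and the hint
 * is capped at the clamped request length. */
static struct hint_result new_hint(uint32_t req_len, uint32_t inq)
{
	struct hint_result r;
	uint32_t zap_len;

	r.length = min_u32(req_len, inq);
	zap_len = r.length & ~(PAGE_SIZE - 1);
	r.recv_skip_hint = zap_len ? 0 : r.length;
	return r;
}
```

Under these assumptions, a 100 byte request against a 10000 byte
receive queue gets a hint of 10000 from old_hint() but 100 from
new_hint(). A request of 8292 bytes with 8292 bytes queued keeps
length at 8292 in new_hint(), where old_hint() truncates it to 8192.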

Comments

David Miller Oct. 13, 2019, 6:17 p.m. UTC | #1
From: Soheil Hassas Yeganeh <soheil.kdev@gmail.com>
Date: Thu, 10 Oct 2019 23:27:02 -0400

> From: Soheil Hassas Yeganeh <soheil@google.com>
> 
> tcp_zerocopy_receive() rounds down zc->length to a multiple of
> PAGE_SIZE. This results in two issues:
> - tcp_zerocopy_receive sets recv_skip_hint to the length of the
>   receive queue if the zc->length input is smaller than the
>   PAGE_SIZE, even though the data in receive queue could be
>   zerocopied.
> - tcp_zerocopy_receive sets recv_skip_hint to 0 in cases
>   where we have a little bit of data after the perfectly-sized
>   packets.
> 
> To fix these issues, do not store the rounded down value in
> zc->length. Round down only the length passed to zap_page_range(),
> and set recv_skip_hint to min(inq, zc->length) when the zap length is 0.
> 
> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thank you.

Patch

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index f98a1882e537..9f41a76c1c54 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1739,8 +1739,8 @@  static int tcp_zerocopy_receive(struct sock *sk,
 				struct tcp_zerocopy_receive *zc)
 {
 	unsigned long address = (unsigned long)zc->address;
+	u32 length = 0, seq, offset, zap_len;
 	const skb_frag_t *frags = NULL;
-	u32 length = 0, seq, offset;
 	struct vm_area_struct *vma;
 	struct sk_buff *skb = NULL;
 	struct tcp_sock *tp;
@@ -1767,12 +1767,12 @@  static int tcp_zerocopy_receive(struct sock *sk,
 	seq = tp->copied_seq;
 	inq = tcp_inq(sk);
 	zc->length = min_t(u32, zc->length, inq);
-	zc->length &= ~(PAGE_SIZE - 1);
-	if (zc->length) {
-		zap_page_range(vma, address, zc->length);
+	zap_len = zc->length & ~(PAGE_SIZE - 1);
+	if (zap_len) {
+		zap_page_range(vma, address, zap_len);
 		zc->recv_skip_hint = 0;
 	} else {
-		zc->recv_skip_hint = inq;
+		zc->recv_skip_hint = zc->length;
 	}
 	ret = 0;
 	while (length + PAGE_SIZE <= zc->length) {