[net-next,1/2] tcp: return EPOLLOUT from tcp_poll only when notsent_bytes is half the limit

Message ID 20200914215210.2288109-1-soheil.kdev@gmail.com
State Accepted
Delegated to: David Miller

Commit Message

Soheil Hassas Yeganeh Sept. 14, 2020, 9:52 p.m. UTC
From: Soheil Hassas Yeganeh <soheil@google.com>

If there was any event available on the TCP socket, tcp_poll()
will be called to retrieve all the events.  In tcp_poll(), we call
sk_stream_is_writeable(), which returns true as long as we are at least
one byte below notsent_lowat.  This results in quite a few
spurious EPOLLOUT events and frequent tiny sendmsg() calls.

Similar to sk_stream_write_space(), use __sk_stream_is_writeable()
with a wake value of 1, so that we set EPOLLOUT only if half the
space is available for write.

Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
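
For illustration, the effect of the wake argument can be modelled in
userspace.  This is only a sketch: it assumes the not-sent check has the
form (notsent_bytes << wake) < notsent_lowat, and the helper name
model_notsent_below_limit() is hypothetical, not a kernel symbol.  It
models only the not-sent-bytes part of the writeability test and ignores
any send-buffer space check.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace model of the not-sent-bytes check (hypothetical helper).
 * Assumption: the kernel compares (notsent_bytes << wake) against
 * notsent_lowat.
 *
 *   wake == 0: true while notsent_bytes is below the limit.
 *   wake == 1: true only while notsent_bytes is below half the limit.
 */
bool model_notsent_below_limit(uint32_t notsent_bytes,
			       uint32_t notsent_lowat, int wake)
{
	/* Widen before shifting so the model cannot overflow 32 bits. */
	return ((uint64_t)notsent_bytes << wake) < notsent_lowat;
}
```

With a limit of 16384 bytes and 16383 bytes not yet sent, the model
returns true for wake 0 and false for wake 1.  With wake 1 it only
returns true once fewer than 8192 bytes remain unsent.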

Comments

David Miller Sept. 14, 2020, 11:58 p.m. UTC | #1
From: Soheil Hassas Yeganeh <soheil.kdev@gmail.com>
Date: Mon, 14 Sep 2020 17:52:09 -0400

> From: Soheil Hassas Yeganeh <soheil@google.com>
> 
> If there was any event available on the TCP socket, tcp_poll()
> will be called to retrieve all the events.  In tcp_poll(), we call
> sk_stream_is_writeable(), which returns true as long as we are at least
> one byte below notsent_lowat.  This results in quite a few
> spurious EPOLLOUT events and frequent tiny sendmsg() calls.
> 
> Similar to sk_stream_write_space(), use __sk_stream_is_writeable
> with a wake value of 1, so that we set EPOLLOUT only if half the
> space is available for write.
> 
> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied.

Patch

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index d3781b6087cb..48c351804efc 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -564,7 +564,7 @@  __poll_t tcp_poll(struct file *file, struct socket *sock, poll_table *wait)
 			mask |= EPOLLIN | EPOLLRDNORM;
 
 		if (!(sk->sk_shutdown & SEND_SHUTDOWN)) {
-			if (sk_stream_is_writeable(sk)) {
+			if (__sk_stream_is_writeable(sk, 1)) {
 				mask |= EPOLLOUT | EPOLLWRNORM;
 			} else {  /* send SIGIO later */
 				sk_set_bit(SOCKWQ_ASYNC_NOSPACE, sk);
@@ -576,7 +576,7 @@  __poll_t tcp_poll(struct file *file, struct socket *sock, poll_table *wait)
 				 * pairs with the input side.
 				 */
 				smp_mb__after_atomic();
-				if (sk_stream_is_writeable(sk))
+				if (__sk_stream_is_writeable(sk, 1))
 					mask |= EPOLLOUT | EPOLLWRNORM;
 			}
 		} else