diff mbox

[net-next,v2] tcp: provide timestamps for partial writes

Message ID 20170104161934.26849-1-soheil.kdev@gmail.com
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Soheil Hassas Yeganeh Jan. 4, 2017, 4:19 p.m. UTC
From: Soheil Hassas Yeganeh <soheil@google.com>

For TCP sockets, TX timestamps are only captured when the user data
is successfully and fully written to the socket. In many cases,
however, TCP writes can be partial for which no timestamp is
collected.

Collect timestamps whenever any user data is (fully or partially)
copied into the socket. Pass tcp_write_queue_tail to tcp_tx_timestamp
instead of the local skb pointer since it can be set to NULL on
the error path.

Note that tcp_write_queue_tail can be NULL, even if bytes have been
copied to the socket. This is because acknowledgements are being
processed in tcp_sendmsg(), and by the time tcp_tx_timestamp is
called tcp_write_queue_tail can be NULL. For such cases, this patch
does not collect any timestamps (i.e., it is best-effort).

This patch is written with suggestions from Willem de Bruijn and
Eric Dumazet.

Change-log V1 -> V2:
	- Use sockc.tsflags instead of sk->sk_tsflags.
	- Use the same code path for normal writes and errors.

Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Martin KaFai Lau <kafai@fb.com>
---
 net/ipv4/tcp.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

Comments

Willem de Bruijn Jan. 4, 2017, 10:22 p.m. UTC | #1
On Wed, Jan 4, 2017 at 11:19 AM, Soheil Hassas Yeganeh
<soheil.kdev@gmail.com> wrote:
> From: Soheil Hassas Yeganeh <soheil@google.com>
>
> For TCP sockets, TX timestamps are only captured when the user data
> is successfully and fully written to the socket. In many cases,
> however, TCP writes can be partial for which no timestamp is
> collected.
>
> Collect timestamps whenever any user data is (fully or partially)
> copied into the socket. Pass tcp_write_queue_tail to tcp_tx_timestamp
> instead of the local skb pointer since it can be set to NULL on
> the error path.
>
> Note that tcp_write_queue_tail can be NULL, even if bytes have been
> copied to the socket. This is because acknowledgements are being
> processed in tcp_sendmsg(), and by the time tcp_tx_timestamp is
> called tcp_write_queue_tail can be NULL. For such cases, this patch
> does not collect any timestamps (i.e., it is best-effort).
>
> This patch is written with suggestions from Willem de Bruijn and
> Eric Dumazet.
>
> Change-log V1 -> V2:
>         - Use sockc.tsflags instead of sk->sk_tsflags.
>         - Use the same code path for normal writes and errors.
>
> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
> Acked-by: Yuchung Cheng <ycheng@google.com>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> Cc: Martin KaFai Lau <kafai@fb.com>

Acked-by: Willem de Bruijn <willemb@google.com>
David Miller Jan. 5, 2017, 7:56 p.m. UTC | #2
From: Soheil Hassas Yeganeh <soheil.kdev@gmail.com>
Date: Wed,  4 Jan 2017 11:19:34 -0500

> From: Soheil Hassas Yeganeh <soheil@google.com>
> 
> For TCP sockets, TX timestamps are only captured when the user data
> is successfully and fully written to the socket. In many cases,
> however, TCP writes can be partial for which no timestamp is
> collected.
> 
> Collect timestamps whenever any user data is (fully or partially)
> copied into the socket. Pass tcp_write_queue_tail to tcp_tx_timestamp
> instead of the local skb pointer since it can be set to NULL on
> the error path.
> 
> Note that tcp_write_queue_tail can be NULL, even if bytes have been
> copied to the socket. This is because acknowledgements are being
> processed in tcp_sendmsg(), and by the time tcp_tx_timestamp is
> called tcp_write_queue_tail can be NULL. For such cases, this patch
> does not collect any timestamps (i.e., it is best-effort).
> 
> This patch is written with suggestions from Willem de Bruijn and
> Eric Dumazet.
> 
> Change-log V1 -> V2:
> 	- Use sockc.tsflags instead of sk->sk_tsflags.
> 	- Use the same code path for normal writes and errors.
> 
> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
> Acked-by: Yuchung Cheng <ycheng@google.com>

Applied, thanks.
diff mbox

Patch

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 2e3807d8eba8..ec97e4b4a62f 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -429,7 +429,7 @@  EXPORT_SYMBOL(tcp_init_sock);
 
 static void tcp_tx_timestamp(struct sock *sk, u16 tsflags, struct sk_buff *skb)
 {
-	if (tsflags) {
+	if (tsflags && skb) {
 		struct skb_shared_info *shinfo = skb_shinfo(skb);
 		struct tcp_skb_cb *tcb = TCP_SKB_CB(skb);
 
@@ -958,10 +958,8 @@  static ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
 		copied += copy;
 		offset += copy;
 		size -= copy;
-		if (!size) {
-			tcp_tx_timestamp(sk, sk->sk_tsflags, skb);
+		if (!size)
 			goto out;
-		}
 
 		if (skb->len < size_goal || (flags & MSG_OOB))
 			continue;
@@ -987,8 +985,11 @@  static ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
 	}
 
 out:
-	if (copied && !(flags & MSG_SENDPAGE_NOTLAST))
-		tcp_push(sk, flags, mss_now, tp->nonagle, size_goal);
+	if (copied) {
+		tcp_tx_timestamp(sk, sk->sk_tsflags, tcp_write_queue_tail(sk));
+		if (!(flags & MSG_SENDPAGE_NOTLAST))
+			tcp_push(sk, flags, mss_now, tp->nonagle, size_goal);
+	}
 	return copied;
 
 do_error:
@@ -1281,7 +1282,6 @@  int tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 
 		copied += copy;
 		if (!msg_data_left(msg)) {
-			tcp_tx_timestamp(sk, sockc.tsflags, skb);
 			if (unlikely(flags & MSG_EOR))
 				TCP_SKB_CB(skb)->eor = 1;
 			goto out;
@@ -1312,8 +1312,10 @@  int tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	}
 
 out:
-	if (copied)
+	if (copied) {
+		tcp_tx_timestamp(sk, sockc.tsflags, tcp_write_queue_tail(sk));
 		tcp_push(sk, flags, mss_now, tp->nonagle, size_goal);
+	}
 out_nopush:
 	release_sock(sk);
 	return copied + copied_syn;