
[net-next,v3] tcp: propagate gso_segs to the new skb built in tcp collapse

Message ID 1535804465-11795-1-git-send-email-laoar.shao@gmail.com
State Rejected, archived
Delegated to: David Miller

Commit Message

Yafang Shao Sept. 1, 2018, 12:21 p.m. UTC
The gso_segs of the newly built SKB in tcp collapse is initialized to 0,
which makes it hard to know the accurate number of segments in this new SKB.
We'd better propagate the gso_segs of the collapsed SKB to the newly built
one, so that when this SKB is dropped (for example during tcp prune)
sk_drops is incremented by the correct value.

If the collapsed SKB is fully copied to the newly built one, we just add its
gso_segs to the new SKB.
If the collapsed SKB is only partially copied to the newly built SKB,
we have to calculate how many segments are copied.
And when doing the calculation we must make sure one SKB holds the same
gso_segs.
Furthermore, we have to reset the gso_segs of this SKB if it is partially
copied, so that in the next round, when the remaining segments are copied,
the correct value is propagated.

The gso_size will never exceed 65536 as the max size of the newly built SKB
is 4K.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 net/ipv4/tcp_input.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)
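
The accounting described in the commit message can be modelled outside the kernel. The following is a minimal user-space sketch, not the kernel code: `struct seg_info`, `div_round_up` and `propagate_segs` are hypothetical names standing in for the two `skb_shared_info` fields and the logic of the inner copy loop in the patch below. It assumes the source's `gso_size` is non-zero whenever a partial copy happens, which the patch itself does not check.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two skb_shared_info fields the patch touches. */
struct seg_info {
	uint16_t gso_size;	/* payload bytes per segment, 0 if unknown */
	uint16_t gso_segs;	/* number of segments, 0 if never set */
};

/* User-space equivalent of the kernel's DIV_ROUND_UP() for this sketch. */
static uint16_t div_round_up(uint32_t n, uint32_t d)
{
	assert(d > 0);	/* assumption of this sketch: the divisor is known */
	return (uint16_t)((n + d - 1) / d);
}

/*
 * Model one pass of the inner copy loop.
 *   size: bytes still unread in the source buffer
 *   copy: room left in the new buffer
 *   len:  total capacity of the new buffer
 * Returns the number of bytes that would be copied.
 */
static uint32_t propagate_segs(struct seg_info *dst, struct seg_info *src,
			       uint32_t size, uint32_t copy, uint32_t len)
{
	if (copy >= size) {
		/* Full copy: add the source's count, at least one segment. */
		uint16_t add = src->gso_segs ? src->gso_segs : 1;

		dst->gso_segs = (uint16_t)(dst->gso_segs + add);
		return size;
	}

	/* Partial copy: re-estimate both counts from the source's gso_size. */
	dst->gso_size = src->gso_size;
	dst->gso_segs = div_round_up(len, dst->gso_size);
	src->gso_segs = div_round_up(size - copy, src->gso_size);
	return copy;
}
```

In this model, fully copying a 3000-byte source with 3 segments into an empty 4096-byte buffer leaves the new count at 3. Partially copying a second such source into the remaining 1096 bytes then overwrites the count with an estimate derived from the buffer length and the second source's `gso_size`.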

Comments

Eric Dumazet Sept. 2, 2018, 8:34 p.m. UTC | #1
On 09/01/2018 05:21 AM, Yafang Shao wrote:
> The gso_segs of the newly built SKB in tcp collapse is initialized to 0,
> which makes it hard to know the accurate number of segments in this new SKB.
> We'd better propagate the gso_segs of the collapsed SKB to the newly built
> one, so that when this SKB is dropped (for example during tcp prune)
> sk_drops is incremented by the correct value.
> 
> If the collapsed SKB is fully copied to the newly built one, we just add its
> gso_segs to the new SKB.
> If the collapsed SKB is only partially copied to the newly built SKB,
> we have to calculate how many segments are copied.
> And when doing the calculation we must make sure one SKB holds the same
> gso_segs.
> Furthermore, we have to reset the gso_segs of this SKB if it is partially
> copied, so that in the next round, when the remaining segments are copied,
> the correct value is propagated.
> 
> The gso_size will never exceed 65536 as the max size of the newly built SKB
> is 4K.
> 
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> ---
>  net/ipv4/tcp_input.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 62508a2..6dc8e2f 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -4910,6 +4910,7 @@ void tcp_rbtree_insert(struct rb_root *root, struct sk_buff *skb)
>  	while (before(start, end)) {
>  		int copy = min_t(int, SKB_MAX_ORDER(0, 0), end - start);
>  		struct sk_buff *nskb;
> +		int len = copy;
>  
>  		nskb = alloc_skb(copy, GFP_ATOMIC);
>  		if (!nskb)
> @@ -4928,12 +4929,24 @@ void tcp_rbtree_insert(struct rb_root *root, struct sk_buff *skb)
>  
>  		/* Copy data, releasing collapsed skbs. */
>  		while (copy > 0) {
> -			int offset = start - TCP_SKB_CB(skb)->seq;
>  			int size = TCP_SKB_CB(skb)->end_seq - start;
> +			int offset = start - TCP_SKB_CB(skb)->seq;
>  
>  			BUG_ON(offset < 0);
>  			if (size > 0) {
> -				size = min(copy, size);
> +				if (copy >= size) {
> +					skb_shinfo(nskb)->gso_segs +=
> +						max_t(u16, 1, skb_shinfo(skb)->gso_segs);
> +				} else {
> +					skb_shinfo(nskb)->gso_size =
> +						skb_shinfo(skb)->gso_size;
> +					skb_shinfo(nskb)->gso_segs =
> +						DIV_ROUND_UP(len, skb_shinfo(nskb)->gso_size);
> +					skb_shinfo(skb)->gso_segs =
> +						DIV_ROUND_UP(size - copy, skb_shinfo(skb)->gso_size);
> +					size = copy;
> +				}
> +
>  				if (skb_copy_bits(skb, offset, skb_put(nskb, size), size))
>  					BUG();
>  				TCP_SKB_CB(nskb)->end_seq += size;
> 

Please stop sending these patches.

1) There is no guarantee that a TCP flow receives segments of the same size.
2) There is no guarantee that an skb cooked by collapse contains an integral number of segments.

So really this is bloat, and for something that is not accurate anyway.
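
The two objections above can be illustrated with a little arithmetic. The following user-space sketch uses hypothetical helper names (`sum_bytes`, `estimate_segs`, `trailing_partial_bytes`) and made-up segment sizes; it only shows how a byte-count-based estimate can diverge from the number of segments actually received, and is not taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Total payload bytes across a list of received segment sizes. */
static uint32_t sum_bytes(const uint32_t *sizes, size_t n)
{
	uint32_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += sizes[i];
	return total;
}

/* Estimate a segment count the way a DIV_ROUND_UP(bytes, gso_size) would. */
static uint32_t estimate_segs(uint32_t bytes, uint32_t gso_size)
{
	assert(gso_size > 0);	/* assumption of this sketch */
	return (bytes + gso_size - 1) / gso_size;
}

/* Bytes of an incomplete segment at the end of a buffer of len bytes. */
static uint32_t trailing_partial_bytes(uint32_t len, uint32_t seg_size)
{
	assert(seg_size > 0);	/* assumption of this sketch */
	return len % seg_size;
}
```

For example, take four received segments of 1448, 100, 100 and 100 bytes, 1748 bytes in total. Estimating with a segment size of 1448 yields 2, and estimating with 100 yields 18, while the real count is 4. Likewise, a 4096-byte collapsed buffer filled with 1448-byte segments ends with 1200 bytes of a partial segment, so it does not hold a whole number of segments.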

Patch

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 62508a2..6dc8e2f 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4910,6 +4910,7 @@  void tcp_rbtree_insert(struct rb_root *root, struct sk_buff *skb)
 	while (before(start, end)) {
 		int copy = min_t(int, SKB_MAX_ORDER(0, 0), end - start);
 		struct sk_buff *nskb;
+		int len = copy;
 
 		nskb = alloc_skb(copy, GFP_ATOMIC);
 		if (!nskb)
@@ -4928,12 +4929,24 @@  void tcp_rbtree_insert(struct rb_root *root, struct sk_buff *skb)
 
 		/* Copy data, releasing collapsed skbs. */
 		while (copy > 0) {
-			int offset = start - TCP_SKB_CB(skb)->seq;
 			int size = TCP_SKB_CB(skb)->end_seq - start;
+			int offset = start - TCP_SKB_CB(skb)->seq;
 
 			BUG_ON(offset < 0);
 			if (size > 0) {
-				size = min(copy, size);
+				if (copy >= size) {
+					skb_shinfo(nskb)->gso_segs +=
+						max_t(u16, 1, skb_shinfo(skb)->gso_segs);
+				} else {
+					skb_shinfo(nskb)->gso_size =
+						skb_shinfo(skb)->gso_size;
+					skb_shinfo(nskb)->gso_segs =
+						DIV_ROUND_UP(len, skb_shinfo(nskb)->gso_size);
+					skb_shinfo(skb)->gso_segs =
+						DIV_ROUND_UP(size - copy, skb_shinfo(skb)->gso_size);
+					size = copy;
+				}
+
 				if (skb_copy_bits(skb, offset, skb_put(nskb, size), size))
 					BUG();
 				TCP_SKB_CB(nskb)->end_seq += size;