diff mbox

[net-next] LRO: improve aggregation in case of zero TSecr packets

Message ID 200908270208.31581.opurdila@ixiacom.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Octavian Purdila Aug. 26, 2009, 11:08 p.m. UTC
This fixes a temporary performance issue we noticed in back to back
TSO - LRO tests when such tests are run within five minutes after
boot.

The TSval field of TCP packets is filled in based on the current
jiffie, which is initialized at -300*HZ. That means that in 5 minutes
after reboot it will wrap to zero.

While the jiffie value is 0 on the LRO side, TCP packets will be
send out with TSVal zero as well. The TSO side will respond with 
packets in which TSecr is set to zero. 

At this point LRO will avoid aggregating the packets, suddenly 
putting a lot of pressure on the stack which will result in drops and 
retransmission for a short while.

There are cases where aggregating zero TSecr is better and cases in
which is not:

1. non zero TSecr packets, zero TSecr packet

   Better not to aggregate, otherwise we miss a valid TSecr.

2. zero TSecr packets (amplified by TSO)

   Better to aggregate.

3. zero TSecr packet, non zero TSecr packets

   OK to aggregate.

4. non zero TSecr packets, zero TSecr packets, non zero TSecr packet

   OK to aggregate, but not a big overhead if we aggregate in 2
   segments instead of one.

This patch allows aggregation for cases 2 and 3 as well as aggregation
in 2 segments for case 4, while denying aggregation in case 1.

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
---
 net/ipv4/inet_lro.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

Comments

David Miller Aug. 31, 2009, 5:11 a.m. UTC | #1
From: Octavian Purdila <opurdila@ixiacom.com>
Date: Thu, 27 Aug 2009 02:08:31 +0300

> This fixes a temporary performance issue we noticed in back to back
> TSO - LRO tests when such tests are run within five minutes after
> boot.
> 
> The TSval field of TCP packets is filled in based on the current
> jiffie, which is initialized at -300*HZ. That means that in 5 minutes
> after reboot it will wrap to zero.

RFC1323 says we absolutely must ignore zero TSecr values.

It is a bug that the stack emits a zero value when it means to give a
real TSecr value that will be used.

Probably we can do something like emit '1' when we would emit '0'
based upon jiffies.

And this would be an improvement from now in that having a off-by-one
TSecr in this situation is better than emitting one which we can
guarentee will be ignored.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Octavian Purdila Aug. 31, 2009, 5:15 p.m. UTC | #2
On Monday 31 August 2009 08:11:05 David Miller wrote:

> From: Octavian Purdila <opurdila@ixiacom.com>
> Date: Thu, 27 Aug 2009 02:08:31 +0300
>
> > This fixes a temporary performance issue we noticed in back to back
> > TSO - LRO tests when such tests are run within five minutes after
> > boot.
> >
> > The TSval field of TCP packets is filled in based on the current
> > jiffie, which is initialized at -300*HZ. That means that in 5 minutes
> > after reboot it will wrap to zero.
>
> RFC1323 says we absolutely must ignore zero TSecr values.
>
> It is a bug that the stack emits a zero value when it means to give a
> real TSecr value that will be used.
>
> Probably we can do something like emit '1' when we would emit '0'
> based upon jiffies.
>
> And this would be an improvement from now in that having a off-by-one
> TSecr in this situation is better than emitting one which we can
> guarentee will be ignored.

Right, why did I thought that the LRO TSecr issue can happen even when 
emitting a right TSval ? :-/

I'll follow with a patch which takes this approach.

Thanks,
tavi
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Herbert Xu Sept. 25, 2009, 9:33 p.m. UTC | #3
Octavian Purdila <opurdila@ixiacom.com> wrote:
> 
> This fixes a temporary performance issue we noticed in back to back
> TSO - LRO tests when such tests are run within five minutes after
> boot.

Sorry for responding to old emails again but just for the record,
please keep in mind that LRO is going away so you really should
run your tests with GRO (and update the driver if necessary) as
otherwise your efforts may end up being wasted.

In this particular instance, you're trying to fix an issue that
GRO already fixes.

Cheers,
diff mbox

Patch

diff --git a/net/ipv4/inet_lro.c b/net/ipv4/inet_lro.c
index 6a667da..d16c9d5 100644
--- a/net/ipv4/inet_lro.c
+++ b/net/ipv4/inet_lro.c
@@ -90,9 +90,9 @@  static int lro_tcp_ip_check(struct iphdr *iph, struct tcphdr *tcph,
 				      ntohl(*topt)))
 			return -1;
 
-		/* timestamp reply should not be zero */
+		/* aggregate if we are sure we won't miss a valid TSecr */
 		topt++;
-		if (*topt == 0)
+		if (*topt == 0 && lro_desc->tcp_rcv_tsecr != 0)
 			return -1;
 	}