[v5,net-next,3/5] tcp: add TCP support for low latency receive poll.

Message ID 20130527074421.29882.73968.stgit@ladj378.jer.intel.com
State Changes Requested, archived
Delegated to: David Miller

Commit Message

Eliezer Tamir May 27, 2013, 7:44 a.m. UTC
adds busy-poll support for TCP.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
---

 net/ipv4/tcp.c       |    5 +++++
 net/ipv4/tcp_input.c |    1 +
 net/ipv4/tcp_ipv4.c  |    2 ++
 3 files changed, 8 insertions(+), 0 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Eric Dumazet May 28, 2013, 12:36 a.m. UTC | #1
On Mon, 2013-05-27 at 10:44 +0300, Eliezer Tamir wrote:
> adds busy-poll support for TCP.
> 

Really, this is a small changelog for such an addition :(

How are poll()/epoll() supported?

> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Tested-by: Willem de Bruijn <willemb@google.com>
> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
> ---
> 
>  net/ipv4/tcp.c       |    5 +++++
>  net/ipv4/tcp_input.c |    1 +
>  net/ipv4/tcp_ipv4.c  |    2 ++
>  3 files changed, 8 insertions(+), 0 deletions(-)
> 
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index d87ce72..652c75a 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -279,6 +279,7 @@
>  
>  #include <asm/uaccess.h>
>  #include <asm/ioctls.h>
> +#include <net/ll_poll.h>
>  
>  int sysctl_tcp_fin_timeout __read_mostly = TCP_FIN_TIMEOUT;
>  
> @@ -1551,6 +1552,10 @@ int tcp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
>  	struct sk_buff *skb;
>  	u32 urg_hole = 0;
>  
> +	if (sk_valid_ll(sk) && skb_queue_empty(&sk->sk_receive_queue)
> +	    && (sk->sk_state == TCP_ESTABLISHED))
> +		sk_poll_ll(sk, nonblock);
> +
>  	lock_sock(sk);
>  
>  	err = -ENOTCONN;
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 9579e1a..4d82939 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -74,6 +74,7 @@
>  #include <linux/ipsec.h>
>  #include <asm/unaligned.h>
>  #include <net/netdma.h>
> +#include <net/ll_poll.h>
>  


Not sure why this include is needed in this file?

You added nothing else but this line.

>  int sysctl_tcp_timestamps __read_mostly = 1;
>  int sysctl_tcp_window_scaling __read_mostly = 1;
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index d20ede0..35fd8bc 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -75,6 +75,7 @@
>  #include <net/netdma.h>
>  #include <net/secure_seq.h>
>  #include <net/tcp_memcontrol.h>
> +#include <net/ll_poll.h>
>  
>  #include <linux/inet.h>
>  #include <linux/ipv6.h>
> @@ -2011,6 +2012,7 @@ process:
>  	if (sk_filter(sk, skb))
>  		goto discard_and_relse;
>  
> +	sk_mark_ll(sk, skb);
>  	skb->dev = NULL;
>  
>  	bh_lock_sock_nested(sk);

How is IPv6 handled?



Eliezer Tamir May 28, 2013, 8:26 a.m. UTC | #2
On 28/05/2013 03:36, Eric Dumazet wrote:
> On Mon, 2013-05-27 at 10:44 +0300, Eliezer Tamir wrote:
>> adds busy-poll support for TCP.
>>
>
> Really, this is a small changelog for such an addition :(

OK


> How are poll()/epoll() supported?

poll()/select() are handled by the code added to fs/select.c in patch 2/5.
epoll() is not yet supported.

>> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
>> index 9579e1a..4d82939 100644
>> --- a/net/ipv4/tcp_input.c
>> +++ b/net/ipv4/tcp_input.c
>> @@ -74,6 +74,7 @@
>>   #include <linux/ipsec.h>
>>   #include <asm/unaligned.h>
>>   #include <net/netdma.h>
>> +#include <net/ll_poll.h>
>>
>
> Not sure why this include is needed in this file?
>
> You added nothing else but this line.

This is a mistake: it is a remnant of an earlier version, in which
sk_mark_ll() was called where we copy data to the socket.
I will remove it.
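
For readers unfamiliar with the helper names, the marking step can be pictured with a small userspace model. This is only a sketch under stated assumptions: it assumes sk_mark_ll() records on the socket an identifier of the receive context the packet arrived on, and that sk_valid_ll() checks that such an identifier is present. The struct layouts and the names model_skb, model_sock, rx_id and ll_id are hypothetical and are not taken from net/ll_poll.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; field names are assumptions, not kernel definitions. */
struct model_skb {
	unsigned int rx_id;	/* id of the receive context the packet arrived on */
};

struct model_sock {
	unsigned int ll_id;	/* 0 means "nothing to busy-poll yet" */
};

/* Models the marking step: remember where the latest packet came from. */
static void model_mark_ll(struct model_sock *sk, const struct model_skb *skb)
{
	assert(sk != NULL && skb != NULL);
	sk->ll_id = skb->rx_id;
}

/* Models the validity check: polling only makes sense once the socket is marked. */
static int model_valid_ll(const struct model_sock *sk)
{
	assert(sk != NULL);
	return sk->ll_id != 0;
}
```

In this model the mark has to happen on the per-packet receive path (tcp_v4_rcv() in the patch), since that is the only place where both the packet and its socket are known.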

>>   #include <net/netdma.h>
>>   #include <net/secure_seq.h>
>>   #include <net/tcp_memcontrol.h>
>> +#include <net/ll_poll.h>
>>
>>   #include <linux/inet.h>
>>   #include <linux/ipv6.h>
>> @@ -2011,6 +2012,7 @@ process:
>>   	if (sk_filter(sk, skb))
>>   		goto discard_and_relse;
>>
>> +	sk_mark_ll(sk, skb);
>>   	skb->dev = NULL;
>>
>>   	bh_lock_sock_nested(sk);
>
> How is IPv6 handled?

IPv6 is currently not supported (it was not supported in any version of
this patch set; the proof-of-concept code was in fact hard-coded for
UDPv4/TCPv4).

If there is interest I will add it; I do not think it will be complicated.
However, I would prefer to defer that to a second stage.

My main concern is that adding IPv6 will significantly increase my
testing effort, which already takes 90% of my time.

IMHO epoll() and more robust support for select()/poll() should have
a higher priority, but I'm open to suggestions.

I would like to get what we have so far applied so more people can try
it, then work on all of the other things that we need.

Dave, I would like to hear your opinion on this, please.

-Eliezer
Eliezer Tamir May 28, 2013, 12:15 p.m. UTC | #3
On 28/05/2013 11:26, Eliezer Tamir wrote:
> On 28/05/2013 03:36, Eric Dumazet wrote:
>> On Mon, 2013-05-27 at 10:44 +0300, Eliezer Tamir wrote:
>>>   #include <net/netdma.h>
>>>   #include <net/secure_seq.h>
>>>   #include <net/tcp_memcontrol.h>
>>> +#include <net/ll_poll.h>
>>>
>>>   #include <linux/inet.h>
>>>   #include <linux/ipv6.h>
>>> @@ -2011,6 +2012,7 @@ process:
>>>       if (sk_filter(sk, skb))
>>>           goto discard_and_relse;
>>>
>>> +    sk_mark_ll(sk, skb);
>>>       skb->dev = NULL;
>>>
>>>       bh_lock_sock_nested(sk);
>>
>> How is IPv6 handled?

It turns out that adding TCPv6/UDPv6 is very simple.
I will add them, with a warning that I only did very limited testing.

-Eliezer
Eric Dumazet May 28, 2013, 1:44 p.m. UTC | #4
On Tue, 2013-05-28 at 15:15 +0300, Eliezer Tamir wrote:

> >> How is IPv6 handled?
> 
> It turns out that adding TCPv6/UDPv6 is very simple.

Yep, I was about to send you the needed lines after my breakfast ;)



Patch

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index d87ce72..652c75a 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -279,6 +279,7 @@ 
 
 #include <asm/uaccess.h>
 #include <asm/ioctls.h>
+#include <net/ll_poll.h>
 
 int sysctl_tcp_fin_timeout __read_mostly = TCP_FIN_TIMEOUT;
 
@@ -1551,6 +1552,10 @@  int tcp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 	struct sk_buff *skb;
 	u32 urg_hole = 0;
 
+	if (sk_valid_ll(sk) && skb_queue_empty(&sk->sk_receive_queue)
+	    && (sk->sk_state == TCP_ESTABLISHED))
+		sk_poll_ll(sk, nonblock);
+
 	lock_sock(sk);
 
 	err = -ENOTCONN;
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 9579e1a..4d82939 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -74,6 +74,7 @@ 
 #include <linux/ipsec.h>
 #include <asm/unaligned.h>
 #include <net/netdma.h>
+#include <net/ll_poll.h>
 
 int sysctl_tcp_timestamps __read_mostly = 1;
 int sysctl_tcp_window_scaling __read_mostly = 1;
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index d20ede0..35fd8bc 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -75,6 +75,7 @@ 
 #include <net/netdma.h>
 #include <net/secure_seq.h>
 #include <net/tcp_memcontrol.h>
+#include <net/ll_poll.h>
 
 #include <linux/inet.h>
 #include <linux/ipv6.h>
@@ -2011,6 +2012,7 @@  process:
 	if (sk_filter(sk, skb))
 		goto discard_and_relse;
 
+	sk_mark_ll(sk, skb);
 	skb->dev = NULL;
 
 	bh_lock_sock_nested(sk);
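
The condition added to tcp_recvmsg() in the patch above can be modelled in userspace as follows. This is a minimal sketch, not kernel code: struct model_sock, its fields and model_should_busy_poll() are hypothetical stand-ins for struct sock, sk_valid_ll(), skb_queue_empty() and the sk_state test. It only shows that busy polling is attempted when all three conditions hold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's TCP states or struct sock. */
enum model_tcp_state {
	MODEL_TCP_ESTABLISHED = 1,
	MODEL_TCP_CLOSE_WAIT = 8
};

struct model_sock {
	bool ll_enabled;		/* stands in for sk_valid_ll(sk) */
	size_t rx_queue_len;		/* stands in for the sk_receive_queue length */
	enum model_tcp_state state;	/* stands in for sk->sk_state */
};

/*
 * Mirrors the test in the patch: busy-poll only if low-latency polling is
 * usable, nothing is already queued, and the connection is established.
 */
static bool model_should_busy_poll(const struct model_sock *sk)
{
	assert(sk != NULL);
	return sk->ll_enabled &&
	       sk->rx_queue_len == 0 &&
	       sk->state == MODEL_TCP_ESTABLISHED;
}
```

Checking the queue first means a reader that already has data pending skips the poll and goes straight to lock_sock(), as in the patch.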