mbox series

[v3,net-next,0/5] ipv6: avoid taking refcnt on dst during route lookup

Message ID 20190621003641.168591-1-tracywwnj@gmail.com
Headers show
Series ipv6: avoid taking refcnt on dst during route lookup | expand

Message

Wei Wang June 21, 2019, 12:36 a.m. UTC
From: Wei Wang <weiwan@google.com>

Ipv6 route lookup code always grabs refcnt on the dst for the caller.
But for certain cases, grabbing refcnt is not always necessary if the
call path is rcu protected and the caller does not cache the dst.
Another issue in the route lookup logic is:
When there are multiple custom rules, we have to do the lookup into
each table associated to each rule individually. And when we can't
find the route in one table, we grab and release refcnt on
net->ipv6.ip6_null_entry before going to the next table.
This operation is completely redundant, and causes false issue because
net->ipv6.ip6_null_entry is a shared object.

This patch set introduces a new flag RT6_LOOKUP_F_DST_NOREF for route
lookup callers to set, to avoid any manipulation on the dst refcnt. And
it converts the major input and output path to use it.

The performance gain is noticable.
I ran synflood tests between 2 hosts under the same switch. Both hosts
have 20G mlx NIC, and 8 tx/rx queues.
Sender sends pure SYN flood with random src IPs and ports using trafgen.
Receiver has a simple TCP listener on the target port.
Both hosts have multiple custom rules:
- For incoming packets, only local table is traversed.
- For outgoing packets, 3 tables are traversed to find the route.
The packet processing rate on the receiver is as follows:
- Before the fix: 3.78Mpps
- After the fix:  5.50Mpps

v2->v3:
- Handled fib6_rule_lookup() when CONFIG_IPV6_MULTIPLE_TABLES is not
  configured in patch 03 (suggested by David Ahern)
- Removed the renaming of l3mdev_link_scope_lookup() in patch 05
  (suggested by David Ahern)
- Moved definition of ip6_route_output_flags() from an inline function
  in /net/ipv6/route.c to net/ipv6/route.c in order to address kbuild
  error in patch 05

v1->v2:
- Added a helper ip6_rt_put_flags() in patch 3 suggested by David Miller


Wei Wang (5):
  ipv6: introduce RT6_LOOKUP_F_DST_NOREF flag in ip6_pol_route()
  ipv6: initialize rt6->rt6i_uncached in all pre-allocated dst entries
  ipv6: honor RT6_LOOKUP_F_DST_NOREF in rule lookup logic
  ipv6: convert rx data path to not take refcnt on dst
  ipv6: convert major tx path to use RT6_LOOKUP_F_DST_NOREF

 drivers/net/vrf.c       |   5 +-
 include/net/ip6_route.h |  15 ++++++
 net/ipv6/fib6_rules.c   |  12 +++--
 net/ipv6/ip6_fib.c      |   5 +-
 net/ipv6/route.c        | 112 +++++++++++++++++++++++-----------------
 net/l3mdev/l3mdev.c     |   7 ++-
 6 files changed, 95 insertions(+), 61 deletions(-)

Comments

David Ahern June 21, 2019, 12:50 a.m. UTC | #1
On 6/20/19 6:36 PM, Wei Wang wrote:
> From: Wei Wang <weiwan@google.com>
> 
> Ipv6 route lookup code always grabs refcnt on the dst for the caller.
> But for certain cases, grabbing refcnt is not always necessary if the
> call path is rcu protected and the caller does not cache the dst.
> Another issue in the route lookup logic is:
> When there are multiple custom rules, we have to do the lookup into
> each table associated to each rule individually. And when we can't
> find the route in one table, we grab and release refcnt on
> net->ipv6.ip6_null_entry before going to the next table.
> This operation is completely redundant, and causes false issue because
> net->ipv6.ip6_null_entry is a shared object.
> 
> This patch set introduces a new flag RT6_LOOKUP_F_DST_NOREF for route
> lookup callers to set, to avoid any manipulation on the dst refcnt. And
> it converts the major input and output path to use it.
> 
> The performance gain is noticable.
> I ran synflood tests between 2 hosts under the same switch. Both hosts
> have 20G mlx NIC, and 8 tx/rx queues.
> Sender sends pure SYN flood with random src IPs and ports using trafgen.
> Receiver has a simple TCP listener on the target port.
> Both hosts have multiple custom rules:
> - For incoming packets, only local table is traversed.
> - For outgoing packets, 3 tables are traversed to find the route.
> The packet processing rate on the receiver is as follows:
> - Before the fix: 3.78Mpps
> - After the fix:  5.50Mpps
> 

LGTM. Thanks for doing this - big improvement.

Reviewed-by: David Ahern <dsahern@gmail.com>
Eric Dumazet June 21, 2019, 12:33 p.m. UTC | #2
On Thu, Jun 20, 2019 at 8:50 PM David Ahern <dsahern@gmail.com> wrote:
>
> On 6/20/19 6:36 PM, Wei Wang wrote:
> > From: Wei Wang <weiwan@google.com>
> >
> > Ipv6 route lookup code always grabs refcnt on the dst for the caller.
> > But for certain cases, grabbing refcnt is not always necessary if the
> > call path is rcu protected and the caller does not cache the dst.
> > Another issue in the route lookup logic is:
> > When there are multiple custom rules, we have to do the lookup into
> > each table associated to each rule individually. And when we can't
> > find the route in one table, we grab and release refcnt on
> > net->ipv6.ip6_null_entry before going to the next table.
> > This operation is completely redundant, and causes false issue because
> > net->ipv6.ip6_null_entry is a shared object.
> >
> > This patch set introduces a new flag RT6_LOOKUP_F_DST_NOREF for route
> > lookup callers to set, to avoid any manipulation on the dst refcnt. And
> > it converts the major input and output path to use it.
> >
> > The performance gain is noticable.
> > I ran synflood tests between 2 hosts under the same switch. Both hosts
> > have 20G mlx NIC, and 8 tx/rx queues.
> > Sender sends pure SYN flood with random src IPs and ports using trafgen.
> > Receiver has a simple TCP listener on the target port.
> > Both hosts have multiple custom rules:
> > - For incoming packets, only local table is traversed.
> > - For outgoing packets, 3 tables are traversed to find the route.
> > The packet processing rate on the receiver is as follows:
> > - Before the fix: 3.78Mpps
> > - After the fix:  5.50Mpps
> >
>
> LGTM. Thanks for doing this - big improvement.
>
> Reviewed-by: David Ahern <dsahern@gmail.com>

Reviewed-by: Eric Dumazet <edumazet@google.com>
David Miller June 23, 2019, 6:27 p.m. UTC | #3
From: Wei Wang <tracywwnj@gmail.com>
Date: Thu, 20 Jun 2019 17:36:36 -0700

> v2->v3:
> - Handled fib6_rule_lookup() when CONFIG_IPV6_MULTIPLE_TABLES is not
>   configured in patch 03 (suggested by David Ahern)
> - Removed the renaming of l3mdev_link_scope_lookup() in patch 05
>   (suggested by David Ahern)
> - Moved definition of ip6_route_output_flags() from an inline function
>   in /net/ipv6/route.c to net/ipv6/route.c in order to address kbuild
>   error in patch 05

I'll give David A. a chance to review this before applying.
David Ahern June 23, 2019, 7:29 p.m. UTC | #4
On 6/23/19 12:27 PM, David Miller wrote:
> From: Wei Wang <tracywwnj@gmail.com>
> Date: Thu, 20 Jun 2019 17:36:36 -0700
> 
>> v2->v3:
>> - Handled fib6_rule_lookup() when CONFIG_IPV6_MULTIPLE_TABLES is not
>>   configured in patch 03 (suggested by David Ahern)
>> - Removed the renaming of l3mdev_link_scope_lookup() in patch 05
>>   (suggested by David Ahern)
>> - Moved definition of ip6_route_output_flags() from an inline function
>>   in /net/ipv6/route.c to net/ipv6/route.c in order to address kbuild
>>   error in patch 05
> 
> I'll give David A. a chance to review this before applying.
> 

Hey Dave: I responded to the cover-letter on Friday.
David Miller June 23, 2019, 8:24 p.m. UTC | #5
From: David Ahern <dsahern@gmail.com>
Date: Sun, 23 Jun 2019 13:29:27 -0600

> On 6/23/19 12:27 PM, David Miller wrote:
>> From: Wei Wang <tracywwnj@gmail.com>
>> Date: Thu, 20 Jun 2019 17:36:36 -0700
>> 
>>> v2->v3:
>>> - Handled fib6_rule_lookup() when CONFIG_IPV6_MULTIPLE_TABLES is not
>>>   configured in patch 03 (suggested by David Ahern)
>>> - Removed the renaming of l3mdev_link_scope_lookup() in patch 05
>>>   (suggested by David Ahern)
>>> - Moved definition of ip6_route_output_flags() from an inline function
>>>   in /net/ipv6/route.c to net/ipv6/route.c in order to address kbuild
>>>   error in patch 05
>> 
>> I'll give David A. a chance to review this before applying.
>> 
> 
> Hey Dave: I responded to the cover-letter on Friday.

Indeed, and so did tractor man.

Series applied, thanks everyone.