mbox series

[net,0/5] nexthops: Fix 2 fundamental flaws with nexthop groups

Message ID 20200526150114.41687-1-dsahern@kernel.org
Headers show
Series nexthops: Fix 2 fundamental flaws with nexthop groups | expand

Message

David Ahern May 26, 2020, 3:01 p.m. UTC
From: David Ahern <dahern@digitalocean.com>

Nik's torture tests have exposed 2 fundamental mistakes with the initial
nexthop code for groups. First, the nexthops entries and num_nh in the
nh_grp struct should not be modified once the struct is set under rcu.
Doing so has major affects on the datapath seeing valid nexthop entries.

Second, the helpers in the header file were convenient for not repeating
code, but they cause datapath walks to potentially see 2 different group
structs after an rcu replace, disrupting a walk of the path objects.
This second problem applies solely to IPv4 as I re-used too much of the
existing code in walking legs of a multipath route.

Patches 1 is refactoring change to simplify the overhead of reviewing and
understanding the change in patch 2 which fixes the update of nexthop
groups when a compnent leg is removed.

Patches 3-5 address the second problem. Patch 3 inlines the multipath
check such that the mpath lookup and subsequent calls all use the same
nh_grp struct. Patches 4 and 5 fix datapath uses of fib_info_num_path
with iterative calls to fib_info_nhc.

fib_info_num_path can be used in control plane path in a 'for loop' with
subsequent fib_info_nhc calls to get each leg since the nh_grp struct is
only changed while holding the rtnl; the combination can not be used in
the data plane with external nexthops as it involves repeated dereferences
of nh_grp struct which can change between calls.

Similarly, nexthop_is_multipath can be used for branching decisions in
the datapath since the nexthop type can not be changed (a group can not
be converted to standalone and vice versa).

Patch set developed in coordination with Nikolay Aleksandrov. He did a
lot of work creating a good reproducer, discussing options to fix it
and testing iterations.

I have adapted Nik's commands into additional tests in the nexthops
selftest script which I will send against -next.

David Ahern (4):
  nexthops: Move code from remove_nexthop_from_groups to
    remove_nh_grp_entry
  nexthop: Expand nexthop_is_multipath in a few places
  ipv4: Refactor nhc evaluation in fib_table_lookup
  ipv4: nexthop version of fib_info_nh_uses_dev

Nikolay Aleksandrov (1):
  nexthops: don't modify published nexthop groups

 include/net/ip_fib.h    |  12 +++++
 include/net/nexthop.h   | 100 ++++++++++++++++++++++++++++++++-------
 net/ipv4/fib_frontend.c |  19 ++++----
 net/ipv4/fib_trie.c     |  51 ++++++++++++++------
 net/ipv4/nexthop.c      | 102 +++++++++++++++++++++++++---------------
 5 files changed, 205 insertions(+), 79 deletions(-)

Comments

David Miller May 26, 2020, 10:28 p.m. UTC | #1
From: David Ahern <dsahern@kernel.org>
Date: Tue, 26 May 2020 09:01:09 -0600

> From: David Ahern <dahern@digitalocean.com>
> 
> Nik's torture tests have exposed 2 fundamental mistakes with the initial
> nexthop code for groups. First, the nexthops entries and num_nh in the
> nh_grp struct should not be modified once the struct is set under rcu.
> Doing so has major affects on the datapath seeing valid nexthop entries.
> 
> Second, the helpers in the header file were convenient for not repeating
> code, but they cause datapath walks to potentially see 2 different group
> structs after an rcu replace, disrupting a walk of the path objects.
> This second problem applies solely to IPv4 as I re-used too much of the
> existing code in walking legs of a multipath route.
> 
> Patches 1 is refactoring change to simplify the overhead of reviewing and
> understanding the change in patch 2 which fixes the update of nexthop
> groups when a compnent leg is removed.
> 
> Patches 3-5 address the second problem. Patch 3 inlines the multipath
> check such that the mpath lookup and subsequent calls all use the same
> nh_grp struct. Patches 4 and 5 fix datapath uses of fib_info_num_path
> with iterative calls to fib_info_nhc.
> 
> fib_info_num_path can be used in control plane path in a 'for loop' with
> subsequent fib_info_nhc calls to get each leg since the nh_grp struct is
> only changed while holding the rtnl; the combination can not be used in
> the data plane with external nexthops as it involves repeated dereferences
> of nh_grp struct which can change between calls.
> 
> Similarly, nexthop_is_multipath can be used for branching decisions in
> the datapath since the nexthop type can not be changed (a group can not
> be converted to standalone and vice versa).
> 
> Patch set developed in coordination with Nikolay Aleksandrov. He did a
> lot of work creating a good reproducer, discussing options to fix it
> and testing iterations.
> 
> I have adapted Nik's commands into additional tests in the nexthops
> selftest script which I will send against -next.

Series applied and queued up for -stable, thanks David.
Nikolay Aleksandrov May 26, 2020, 10:35 p.m. UTC | #2
On 5/27/20 1:28 AM, David Miller wrote:
> From: David Ahern <dsahern@kernel.org>
> Date: Tue, 26 May 2020 09:01:09 -0600
> 
>> From: David Ahern <dahern@digitalocean.com>
>>
>> Nik's torture tests have exposed 2 fundamental mistakes with the initial
>> nexthop code for groups. First, the nexthops entries and num_nh in the
>> nh_grp struct should not be modified once the struct is set under rcu.
>> Doing so has major affects on the datapath seeing valid nexthop entries.
>>
>> Second, the helpers in the header file were convenient for not repeating
>> code, but they cause datapath walks to potentially see 2 different group
>> structs after an rcu replace, disrupting a walk of the path objects.
>> This second problem applies solely to IPv4 as I re-used too much of the
>> existing code in walking legs of a multipath route.
>>
>> Patches 1 is refactoring change to simplify the overhead of reviewing and
>> understanding the change in patch 2 which fixes the update of nexthop
>> groups when a compnent leg is removed.
>>
>> Patches 3-5 address the second problem. Patch 3 inlines the multipath
>> check such that the mpath lookup and subsequent calls all use the same
>> nh_grp struct. Patches 4 and 5 fix datapath uses of fib_info_num_path
>> with iterative calls to fib_info_nhc.
>>
>> fib_info_num_path can be used in control plane path in a 'for loop' with
>> subsequent fib_info_nhc calls to get each leg since the nh_grp struct is
>> only changed while holding the rtnl; the combination can not be used in
>> the data plane with external nexthops as it involves repeated dereferences
>> of nh_grp struct which can change between calls.
>>
>> Similarly, nexthop_is_multipath can be used for branching decisions in
>> the datapath since the nexthop type can not be changed (a group can not
>> be converted to standalone and vice versa).
>>
>> Patch set developed in coordination with Nikolay Aleksandrov. He did a
>> lot of work creating a good reproducer, discussing options to fix it
>> and testing iterations.
>>
>> I have adapted Nik's commands into additional tests in the nexthops
>> selftest script which I will send against -next.
> 
> Series applied and queued up for -stable, thanks David.
> 

Dave this was v1, there were some whitespace errors which were fixed
in v2: http://patchwork.ozlabs.org/project/netdev/list/?series=179428
Nikolay Aleksandrov May 26, 2020, 10:37 p.m. UTC | #3
On 5/27/20 1:35 AM, Nikolay Aleksandrov wrote:
> On 5/27/20 1:28 AM, David Miller wrote:
>> From: David Ahern <dsahern@kernel.org>
>> Date: Tue, 26 May 2020 09:01:09 -0600
>>
>>> From: David Ahern <dahern@digitalocean.com>
>>>
>>> Nik's torture tests have exposed 2 fundamental mistakes with the initial
>>> nexthop code for groups. First, the nexthops entries and num_nh in the
>>> nh_grp struct should not be modified once the struct is set under rcu.
>>> Doing so has major affects on the datapath seeing valid nexthop entries.
>>>
>>> Second, the helpers in the header file were convenient for not repeating
>>> code, but they cause datapath walks to potentially see 2 different group
>>> structs after an rcu replace, disrupting a walk of the path objects.
>>> This second problem applies solely to IPv4 as I re-used too much of the
>>> existing code in walking legs of a multipath route.
>>>
>>> Patches 1 is refactoring change to simplify the overhead of reviewing and
>>> understanding the change in patch 2 which fixes the update of nexthop
>>> groups when a compnent leg is removed.
>>>
>>> Patches 3-5 address the second problem. Patch 3 inlines the multipath
>>> check such that the mpath lookup and subsequent calls all use the same
>>> nh_grp struct. Patches 4 and 5 fix datapath uses of fib_info_num_path
>>> with iterative calls to fib_info_nhc.
>>>
>>> fib_info_num_path can be used in control plane path in a 'for loop' with
>>> subsequent fib_info_nhc calls to get each leg since the nh_grp struct is
>>> only changed while holding the rtnl; the combination can not be used in
>>> the data plane with external nexthops as it involves repeated dereferences
>>> of nh_grp struct which can change between calls.
>>>
>>> Similarly, nexthop_is_multipath can be used for branching decisions in
>>> the datapath since the nexthop type can not be changed (a group can not
>>> be converted to standalone and vice versa).
>>>
>>> Patch set developed in coordination with Nikolay Aleksandrov. He did a
>>> lot of work creating a good reproducer, discussing options to fix it
>>> and testing iterations.
>>>
>>> I have adapted Nik's commands into additional tests in the nexthops
>>> selftest script which I will send against -next.
>>
>> Series applied and queued up for -stable, thanks David.
>>
> 
> Dave this was v1, there were some whitespace errors which were fixed
> in v2: http://patchwork.ozlabs.org/project/netdev/list/?series=179428
> 

We can send a simple incremental against this set to fix 'em up. If David
doesn't me beat to it, I'll send a fix tomorrow.

Cheers
David Ahern May 26, 2020, 10:39 p.m. UTC | #4
On 5/26/20 4:37 PM, Nikolay Aleksandrov wrote:
> We can send a simple incremental against this set to fix 'em up. If David
> doesn't me beat to it, I'll send a fix tomorrow.

Leave -net as is; I think there were 4 lines with spaces instead of tabs.

I have another feature coming for the nexthop code; I can look at fixing
the whitespace issues in -next with it.