mbox series

[net-next,0/6] net: Allow FIB notifiers to fail add and replace

Message ID 20180328012200.15175-1-dsa@cumulusnetworks.com
Headers show
Series net: Allow FIB notifiers to fail add and replace | expand

Message

David Ahern March 28, 2018, 1:21 a.m. UTC
I wanted to revisit how resource overload is handled for hardware offload
of FIB entries and rules. At the moment, the in-kernel fib notifier can
tell a driver about a route or rule add, replace, and delete, but the
notifier can not affect the action. Specifically, in the case of mlxsw
if a route or rule add is going to overflow the ASIC resources the only
recourse is to abort hardware offload. Aborting offload is akin to taking
down the switch as the path from data plane to the control plane simply
can not support the traffic bandwidth of the front panel ports. Further,
the current state of FIB notifiers is inconsistent with other resources
where a driver can affect a user request - e.g., enslavement of a port
into a bridge or a VRF.

As a result of the work done over the past 3+ years, I believe we are
at a point where we can bring consistency to the stack and offloads,
and reliably allow the FIB notifiers to fail a request, pushing an error
along with a suitable error message back to the user. Rather than
aborting offload when the switch is out of resources, userspace is simply
prevented from adding more routes and has a clear indication of why.

This set does not resolve the corner case where rules or routes not
supported by the device are installed prior to the driver getting loaded
and registering for FIB notifications. In that case, hardware offload has
not been established and it can refuse to offload anything, sending
errors back to userspace via extack. Since conceptually the driver owns
the netdevices associated with its asic, this corner case mainly applies
to unsupported rules and any races during the bringup phase.

Patch 1 fixes call_fib_notifiers to extract the errno from the encoded
response from handlers.

Patches 2-5 allow the call to call_fib_notifiers to fail the add or
replace of a route or rule.

Patch 6 adds a simple resource controller to netdevsim to illustrate
how a FIB resource controller can limit the number of route entries.

Changes since RFC
- correct return code for call_fib_notifier
- dropped patch 6 exporting devlink symbols
- limited example resource controller to init_net only
- updated Kconfig for netdevsim to use MAY_USE_DEVLINK
- updated cover letter regarding startup case noted by Ido

David Ahern (6):
  net: Fix fib notifer to return errno
  net: Move call_fib_rule_notifiers up in fib_nl_newrule
  net/ipv4: Move call_fib_entry_notifiers up for new routes
  net/ipv4: Allow notifier to fail route replace
  net/ipv6: Move call_fib6_entry_notifiers up for route adds
  netdevsim: Add simple FIB resource controller via devlink

 drivers/net/Kconfig               |   1 +
 drivers/net/netdevsim/Makefile    |   4 +
 drivers/net/netdevsim/devlink.c   | 294 ++++++++++++++++++++++++++++++++++++++
 drivers/net/netdevsim/fib.c       | 263 ++++++++++++++++++++++++++++++++++
 drivers/net/netdevsim/netdev.c    |  12 +-
 drivers/net/netdevsim/netdevsim.h |  43 ++++++
 net/core/fib_notifier.c           |  10 +-
 net/core/fib_rules.c              |   6 +-
 net/ipv4/fib_trie.c               |  27 +++-
 net/ipv6/ip6_fib.c                |  16 ++-
 10 files changed, 664 insertions(+), 12 deletions(-)
 create mode 100644 drivers/net/netdevsim/devlink.c
 create mode 100644 drivers/net/netdevsim/fib.c

Comments

Ido Schimmel March 29, 2018, 8:29 a.m. UTC | #1
On Tue, Mar 27, 2018 at 06:21:54PM -0700, David Ahern wrote:
> I wanted to revisit how resource overload is handled for hardware offload
> of FIB entries and rules. At the moment, the in-kernel fib notifier can
> tell a driver about a route or rule add, replace, and delete, but the
> notifier can not affect the action. Specifically, in the case of mlxsw
> if a route or rule add is going to overflow the ASIC resources the only
> recourse is to abort hardware offload. Aborting offload is akin to taking
> down the switch as the path from data plane to the control plane simply
> can not support the traffic bandwidth of the front panel ports. Further,
> the current state of FIB notifiers is inconsistent with other resources
> where a driver can affect a user request - e.g., enslavement of a port
> into a bridge or a VRF.
> 
> As a result of the work done over the past 3+ years, I believe we are
> at a point where we can bring consistency to the stack and offloads,
> and reliably allow the FIB notifiers to fail a request, pushing an error
> along with a suitable error message back to the user. Rather than
> aborting offload when the switch is out of resources, userspace is simply
> prevented from adding more routes and has a clear indication of why.

Nice work, David. Ran various tests and didn't see any regressions.

I know you already know this, but for the record, we plan to add
accounting to KVD hash resources which will eventually allow us to
return errors when resources are exceeded.
David Miller March 29, 2018, 6:11 p.m. UTC | #2
From: David Ahern <dsa@cumulusnetworks.com>
Date: Tue, 27 Mar 2018 18:21:54 -0700

> I wanted to revisit how resource overload is handled for hardware offload
> of FIB entries and rules. At the moment, the in-kernel fib notifier can
> tell a driver about a route or rule add, replace, and delete, but the
> notifier can not affect the action. Specifically, in the case of mlxsw
> if a route or rule add is going to overflow the ASIC resources the only
> recourse is to abort hardware offload. Aborting offload is akin to taking
> down the switch as the path from data plane to the control plane simply
> can not support the traffic bandwidth of the front panel ports. Further,
> the current state of FIB notifiers is inconsistent with other resources
> where a driver can affect a user request - e.g., enslavement of a port
> into a bridge or a VRF.
> 
> As a result of the work done over the past 3+ years, I believe we are
> at a point where we can bring consistency to the stack and offloads,
> and reliably allow the FIB notifiers to fail a request, pushing an error
> along with a suitable error message back to the user. Rather than
> aborting offload when the switch is out of resources, userspace is simply
> prevented from adding more routes and has a clear indication of why.
> 
> This set does not resolve the corner case where rules or routes not
> supported by the device are installed prior to the driver getting loaded
> and registering for FIB notifications. In that case, hardware offload has
> not been established and it can refuse to offload anything, sending
> errors back to userspace via extack. Since conceptually the driver owns
> the netdevices associated with its asic, this corner case mainly applies
> to unsupported rules and any races during the bringup phase.
> 
> Patch 1 fixes call_fib_notifiers to extract the errno from the encoded
> response from handlers.
> 
> Patches 2-5 allow the call to call_fib_notifiers to fail the add or
> replace of a route or rule.
> 
> Patch 6 adds a simple resource controller to netdevsim to illustrate
> how a FIB resource controller can limit the number of route entries.

Series applied, thanks David.
David Ahern March 29, 2018, 8 p.m. UTC | #3
On 3/29/18 2:29 AM, Ido Schimmel wrote:
> Nice work, David. Ran various tests and didn't see any regressions.
> 
> I know you already know this, but for the record, we plan to add
> accounting to KVD hash resources which will eventually allow us to
> return errors when resources are exceeded.

Thanks for running the tests, and looking forward to the mlxsw updates
for resource management.