Message ID | 20220804174050.1461741-1-vladbu@nvidia.com |
---|---|
State | Accepted |
Commit | 399185865e59a16e2ddc51afeb9b23f62864baae |
Headers | show |
Series | [ovs-dev,branch-2.17,RESEND] netdev-linux: set correct action for packets that passed policer | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/intel-ovs-compilation | fail | test: fail |
On Thu, Aug 04, 2022 at 07:40:50PM +0200, Vlad Buslov wrote: > Referenced commit changed policer action type from TC_ACT_UNSPEC (continue) > to TC_ACT_PIPE. However, since neither TC hardware offload layer nor mlx5 > driver at the time validated action type and always assumed 'continue', the > breakage wasn't caught until later validation code was added. The change > also broke valid configuration when sending from offload-capable device to > non-offload capable. For example, when sending from mlx5 VF to OvS bridge > netdevice the traffic that passed matchall classifier with policer could no > longer match the following flower rule in software: > > filter protocol all pref 1 matchall chain 0 > filter protocol all pref 1 matchall chain 0 handle 0x1 > in_hw (rule hit 7863) > action order 1: police 0x1 rate 32Mbit burst 1000Kb mtu 64Kb action drop/pipe overhead 0b > ref 1 bind 1 installed 17 sec firstused 17 sec > Action statistics: > Sent 152199634 bytes 102550 pkt (dropped 1315, overlimits 1315 requeues 0) > Sent software 74612172 bytes 51275 pkt > Sent hardware 77587462 bytes 51275 pkt > backlog 0b 0p requeues 0 > used_hw_stats delayed > > filter protocol ip pref 3 flower chain 0 > filter protocol ip pref 3 flower chain 0 handle 0x1 > dst_mac aa:94:1f:f2:f8:44 > src_mac e4:00:01:08:00:02 > eth_type ipv4 > ip_flags nofrag > not_in_hw > action order 1: skbedit ptype host pipe > index 1 ref 1 bind 1 installed 6 sec used 6 sec > Action statistics: > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > backlog 0b 0p requeues 0 > > action order 2: mirred (Ingress Redirect to device br-ovs) stolen > index 1 ref 1 bind 1 installed 6 sec used 6 sec > Action statistics: > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > backlog 0b 0p requeues 0 > cookie 401a9c8b3d403c62240d3eb5e21c1604 > no_percpu > > Fix the issue by restoring policer action type to 'continue'. > > Fixes: c2567e533f8a ("add port-based ingress policing based packet-per-second rate-limiting") > Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Thanks Vlad, this looks good to me and I'm reasonably confident it won't regress any use-cases that I am aware of. I plan to apply this to branch-2.17 and branch-2.16.
On Fri, Aug 05, 2022 at 11:57:28AM +0100, Simon Horman wrote: > On Thu, Aug 04, 2022 at 07:40:50PM +0200, Vlad Buslov wrote: > > Referenced commit changed policer action type from TC_ACT_UNSPEC (continue) > > to TC_ACT_PIPE. However, since neither TC hardware offload layer nor mlx5 > > driver at the time validated action type and always assumed 'continue', the > > breakage wasn't caught until later validation code was added. The change > > also broke valid configuration when sending from offload-capable device to > > non-offload capable. For example, when sending from mlx5 VF to OvS bridge > > netdevice the traffic that passed matchall classifier with policer could no > > longer match the following flower rule in software: > > > > filter protocol all pref 1 matchall chain 0 > > filter protocol all pref 1 matchall chain 0 handle 0x1 > > in_hw (rule hit 7863) > > action order 1: police 0x1 rate 32Mbit burst 1000Kb mtu 64Kb action drop/pipe overhead 0b > > ref 1 bind 1 installed 17 sec firstused 17 sec > > Action statistics: > > Sent 152199634 bytes 102550 pkt (dropped 1315, overlimits 1315 requeues 0) > > Sent software 74612172 bytes 51275 pkt > > Sent hardware 77587462 bytes 51275 pkt > > backlog 0b 0p requeues 0 > > used_hw_stats delayed > > > > filter protocol ip pref 3 flower chain 0 > > filter protocol ip pref 3 flower chain 0 handle 0x1 > > dst_mac aa:94:1f:f2:f8:44 > > src_mac e4:00:01:08:00:02 > > eth_type ipv4 > > ip_flags nofrag > > not_in_hw > > action order 1: skbedit ptype host pipe > > index 1 ref 1 bind 1 installed 6 sec used 6 sec > > Action statistics: > > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > > backlog 0b 0p requeues 0 > > > > action order 2: mirred (Ingress Redirect to device br-ovs) stolen > > index 1 ref 1 bind 1 installed 6 sec used 6 sec > > Action statistics: > > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > > backlog 0b 0p requeues 0 > > cookie 401a9c8b3d403c62240d3eb5e21c1604 > > no_percpu > > > > Fix the issue by restoring policer action type to 'continue'. > > > > Fixes: c2567e533f8a ("add port-based ingress policing based packet-per-second rate-limiting") > > Signed-off-by: Vlad Buslov <vladbu@nvidia.com> > > Thanks Vlad, > > this looks good to me and I'm reasonably confident it won't regress > any use-cases that I am aware of. > > I plan to apply this to branch-2.17 and branch-2.16. Thanks again, applied: * branch-2.17 399185865e59 ("netdev-linux: set correct action for packets that passed policer") https://github.com/openvswitch/ovs/commit/399185865e59a16e2ddc51afeb9b23f62864baae * branch-2.16 e7792039e6db ("netdev-linux: set correct action for packets that passed policer") https://github.com/openvswitch/ovs/commit/e7792039e6db23b733f73c29672e0409ed0bc3df
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c index c9cf8c7892f1..067e0175612b 100644 --- a/lib/netdev-linux.c +++ b/lib/netdev-linux.c @@ -2626,7 +2626,7 @@ static void nl_msg_act_police_end_nest(struct ofpbuf *request, size_t offset, size_t act_offset) { - nl_msg_put_u32(request, TCA_POLICE_RESULT, TC_ACT_PIPE); + nl_msg_put_u32(request, TCA_POLICE_RESULT, TC_ACT_UNSPEC); nl_msg_end_nested(request, offset); nl_msg_end_nested(request, act_offset); }
Referenced commit changed policer action type from TC_ACT_UNSPEC (continue) to TC_ACT_PIPE. However, since neither TC hardware offload layer nor mlx5 driver at the time validated action type and always assumed 'continue', the breakage wasn't caught until later validation code was added. The change also broke valid configuration when sending from offload-capable device to non-offload capable. For example, when sending from mlx5 VF to OvS bridge netdevice the traffic that passed matchall classifier with policer could no longer match the following flower rule in software: filter protocol all pref 1 matchall chain 0 filter protocol all pref 1 matchall chain 0 handle 0x1 in_hw (rule hit 7863) action order 1: police 0x1 rate 32Mbit burst 1000Kb mtu 64Kb action drop/pipe overhead 0b ref 1 bind 1 installed 17 sec firstused 17 sec Action statistics: Sent 152199634 bytes 102550 pkt (dropped 1315, overlimits 1315 requeues 0) Sent software 74612172 bytes 51275 pkt Sent hardware 77587462 bytes 51275 pkt backlog 0b 0p requeues 0 used_hw_stats delayed filter protocol ip pref 3 flower chain 0 filter protocol ip pref 3 flower chain 0 handle 0x1 dst_mac aa:94:1f:f2:f8:44 src_mac e4:00:01:08:00:02 eth_type ipv4 ip_flags nofrag not_in_hw action order 1: skbedit ptype host pipe index 1 ref 1 bind 1 installed 6 sec used 6 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 action order 2: mirred (Ingress Redirect to device br-ovs) stolen index 1 ref 1 bind 1 installed 6 sec used 6 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 cookie 401a9c8b3d403c62240d3eb5e21c1604 no_percpu Fix the issue by restoring policer action type to 'continue'. Fixes: c2567e533f8a ("add port-based ingress policing based packet-per-second rate-limiting") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> --- lib/netdev-linux.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)