Message ID | 20200611103635.53367-1-xiangxia.m.yue@gmail.com |
---|---|
State | Rejected |
Headers | show |
Series | [ovs-dev,ovs-dev,v1] netdev-offload: Flush offloaded rules when ports removed | expand |
Bleep bloop. Greetings Tonghao Zhang, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. checkpatch: ERROR: Co-author Wengang Hou <houwengang@didiglobal.com> needs to sign off. Lines checked: 48, Warnings: 0, Errors: 1 Please check this out. If you feel there has been an error, please email aconole@redhat.com Thanks, 0-day Robot
On 2020-06-11 1:36 PM, xiangxia.m.yue@gmail.com wrote: > From: Tonghao Zhang <xiangxia.m.yue@gmail.com> > > When ports were removed from bridge, we should flush the > offload rules on the ports. The main reason is two factors: > > * The ports removed from bridge, will be managed by OvS, so > flush the rules which installed by OvS. > * If using the TC flower offload, for example, tc rules still > are on the ports. And the information still are maintained by > OvS, such as the mapping for tc and ufid. > Then if adding the port to bridge and installing the rules > to it again, *del_filter_and_ufid_mapping will be invoked, > and delete the tc rule using tc handle which may not exist > (offload init api flushed them.) on kernel or is used by other > previous rules (if so, that rules will be deleted that is not > we expected.). > > Cc: Simon Horman <simon.horman@netronome.com> > Cc: Paul Blakey <paulb@mellanox.com> > Cc: Roi Dayan <roid@mellanox.com> > Cc: Ben Pfaff <blp@ovn.org> > Cc: William Tu <u9012063@gmail.com> > Cc: Ilya Maximets <i.maximets@ovn.org> > Tested-at: https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftravis-ci.com%2Fgithub%2Fovn-open-virtual-networks%2Fovs%2Fbuilds%2F170832624&data=02%7C01%7Croid%40mellanox.com%7Cf0bc38aa88074693105108d80df36979%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C637274686378147436&sdata=Dlq4jnSclj0vXQoldO7Oq2Wwx%2BGEHjK%2FftPUDHnJcKM%3D&reserved=0 > Co-authored-by: Wengang Hou <houwengang@didiglobal.com> > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> > --- > lib/netdev-offload.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c > index ab97a292ebac..964566caab1e 100644 > --- a/lib/netdev-offload.c > +++ b/lib/netdev-offload.c > @@ -593,6 +593,7 @@ netdev_ports_remove(odp_port_t port_no, const struct dpif_class *dpif_class) > data = netdev_ports_lookup(port_no, dpif_class); > if (data) { > dpif_port_destroy(&data->dpif_port); > + netdev_flow_flush(data->netdev); /* flush offloaded rules. */ > netdev_close(data->netdev); /* unref and possibly close */ > hmap_remove(&port_to_netdev, &data->portno_node); > hmap_remove(&ifindex_to_port, &data->ifindex_node); > it looks ok but I wonder if it's redundant? I tested and when I removed a port with tc rules on it, all tc rules got deleted using the flow api flow del, we get into netdev_tc_flow_del() for each flow. So you shouldn't have a tc rule left or ufid mapping. How did you test that you saw tc rules were still on the port?
On Sun, Jun 14, 2020 at 1:31 PM Roi Dayan <roid@mellanox.com> wrote: > > > > On 2020-06-11 1:36 PM, xiangxia.m.yue@gmail.com wrote: > > From: Tonghao Zhang <xiangxia.m.yue@gmail.com> > > > > When ports were removed from bridge, we should flush the > > offload rules on the ports. The main reason is two factors: > > > > * The ports removed from bridge, will be managed by OvS, so > > flush the rules which installed by OvS. > > * If using the TC flower offload, for example, tc rules still > > are on the ports. And the information still are maintained by > > OvS, such as the mapping for tc and ufid. > > Then if adding the port to bridge and installing the rules > > to it again, *del_filter_and_ufid_mapping will be invoked, > > and delete the tc rule using tc handle which may not exist > > (offload init api flushed them.) on kernel or is used by other > > previous rules (if so, that rules will be deleted that is not > > we expected.). > > > > Cc: Simon Horman <simon.horman@netronome.com> > > Cc: Paul Blakey <paulb@mellanox.com> > > Cc: Roi Dayan <roid@mellanox.com> > > Cc: Ben Pfaff <blp@ovn.org> > > Cc: William Tu <u9012063@gmail.com> > > Cc: Ilya Maximets <i.maximets@ovn.org> > > Tested-at: https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftravis-ci.com%2Fgithub%2Fovn-open-virtual-networks%2Fovs%2Fbuilds%2F170832624&data=02%7C01%7Croid%40mellanox.com%7Cf0bc38aa88074693105108d80df36979%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C637274686378147436&sdata=Dlq4jnSclj0vXQoldO7Oq2Wwx%2BGEHjK%2FftPUDHnJcKM%3D&reserved=0 > > Co-authored-by: Wengang Hou <houwengang@didiglobal.com> > > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> > > --- > > lib/netdev-offload.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c > > index ab97a292ebac..964566caab1e 100644 > > --- a/lib/netdev-offload.c > > +++ b/lib/netdev-offload.c > > @@ -593,6 +593,7 @@ netdev_ports_remove(odp_port_t port_no, const struct dpif_class *dpif_class) > > data = netdev_ports_lookup(port_no, dpif_class); > > if (data) { > > dpif_port_destroy(&data->dpif_port); > > + netdev_flow_flush(data->netdev); /* flush offloaded rules. */ > > netdev_close(data->netdev); /* unref and possibly close */ > > hmap_remove(&port_to_netdev, &data->portno_node); > > hmap_remove(&ifindex_to_port, &data->ifindex_node); > > > Hi Roi Thanks for your review. The case is reproduced as below: # ovs-appctl dpctl/show system@ovs-system: lookups: hit:0 missed:0 lost:0 flows: 0 masks: hit:0 total:0 hit/pkt:0.00 port 0: ovs-system (internal) port 1: br-int (internal) port 2: enp130s0f0_0 port 3: enp130s0f0_1 port 4: vxlan_sys_4789 (vxlan: packet_type=ptap) # cat /tmp/tmp-ovs-debug.sh set -x ovs-appctl dpctl/add-flow 'recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200)' 3 ovs-appctl dpctl/dump-flows ovs-vsctl del-port br-int enp130s0f0_0 tc filter show dev enp130s0f0_0 ingress ovs-appctl dpctl/dump-flows # sh /tmp/tmp-ovs-debug.sh + ovs-appctl dpctl/add-flow 'recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200)' 3 + ovs-appctl dpctl/dump-flows recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200), packets:0, bytes:0, used:0.020s, actions:3 + ovs-vsctl del-port br-int enp130s0f0_0 + tc filter show dev enp130s0f0_0 ingress filter protocol ip pref 2 flower chain 0 filter protocol ip pref 2 flower chain 0 handle 0x1 dst_mac 00:11:22:33:44:66 eth_type ipv4 dst_ip 1.1.1.200 in_hw in_hw_count 1 action order 1: mirred (Egress Redirect to device enp130s0f0_1) stolen index 1 ref 1 bind 1 cookie 40ca20750c4add5a9cc1c880ccb3d0e8 no_percpu used_hw_stats delayed + ovs-appctl dpctl/dump-flows The tc rules on port enp130s0f0_0 are not deleted, we can use the tc command to show them. > it looks ok but I wonder if it's redundant? > I tested and when I removed a port with tc rules on it, > all tc rules got deleted using the flow api flow del, > we get into netdev_tc_flow_del() for each flow. I didn't find the netdev_tc_flow_del was invoked while the port was deleting. The debug step is: 1. run the /tmp/tmp-ovs-debug.sh 2. and while gdb the ovs-vswitchd: gdb /root/local/openvswitch-master-dpdk/sbin/ovs-vswitchd -ex 'b netdev_tc_flow_del' -ex c --pid [ovs-vswitchd pid] the breakpoint was not triggered. > So you shouldn't have a tc rule left or ufid mapping. As I debug, the netdev_tc_flow_del was not invoked, and the ufids of flow still are in the mapping. When we add the port back and the install the rules again, the new rules(e.g. tc handle 10) may be deleted by the later new rules(which ufid mapping not deleted and the handle may be 10), via del_filter_and_ufid_mapping. > How did you test that you saw tc rules were still on the port? Yes, I test it using the tc command. -- Best regards, Tonghao
On 6/14/20 1:40 PM, Tonghao Zhang wrote: > On Sun, Jun 14, 2020 at 1:31 PM Roi Dayan <roid@mellanox.com> wrote: >> >> >> >> On 2020-06-11 1:36 PM, xiangxia.m.yue@gmail.com wrote: >>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com> >>> >>> When ports were removed from bridge, we should flush the >>> offload rules on the ports. The main reason is two factors: >>> >>> * The ports removed from bridge, will be managed by OvS, so >>> flush the rules which installed by OvS. >>> * If using the TC flower offload, for example, tc rules still >>> are on the ports. And the information still are maintained by >>> OvS, such as the mapping for tc and ufid. >>> Then if adding the port to bridge and installing the rules >>> to it again, *del_filter_and_ufid_mapping will be invoked, >>> and delete the tc rule using tc handle which may not exist >>> (offload init api flushed them.) on kernel or is used by other >>> previous rules (if so, that rules will be deleted that is not >>> we expected.). >>> >>> Cc: Simon Horman <simon.horman@netronome.com> >>> Cc: Paul Blakey <paulb@mellanox.com> >>> Cc: Roi Dayan <roid@mellanox.com> >>> Cc: Ben Pfaff <blp@ovn.org> >>> Cc: William Tu <u9012063@gmail.com> >>> Cc: Ilya Maximets <i.maximets@ovn.org> >>> Tested-at: https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftravis-ci.com%2Fgithub%2Fovn-open-virtual-networks%2Fovs%2Fbuilds%2F170832624&data=02%7C01%7Croid%40mellanox.com%7Cf0bc38aa88074693105108d80df36979%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C637274686378147436&sdata=Dlq4jnSclj0vXQoldO7Oq2Wwx%2BGEHjK%2FftPUDHnJcKM%3D&reserved=0 >>> Co-authored-by: Wengang Hou <houwengang@didiglobal.com> >>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> >>> --- >>> lib/netdev-offload.c | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c >>> index ab97a292ebac..964566caab1e 100644 >>> --- a/lib/netdev-offload.c >>> +++ b/lib/netdev-offload.c >>> @@ -593,6 +593,7 @@ netdev_ports_remove(odp_port_t port_no, const struct dpif_class *dpif_class) >>> data = netdev_ports_lookup(port_no, dpif_class); >>> if (data) { >>> dpif_port_destroy(&data->dpif_port); >>> + netdev_flow_flush(data->netdev); /* flush offloaded rules. */ >>> netdev_close(data->netdev); /* unref and possibly close */ >>> hmap_remove(&port_to_netdev, &data->portno_node); >>> hmap_remove(&ifindex_to_port, &data->ifindex_node); >>> >> > Hi Roi > Thanks for your review. The case is reproduced as below: > # ovs-appctl dpctl/show > system@ovs-system: > lookups: hit:0 missed:0 lost:0 > flows: 0 > masks: hit:0 total:0 hit/pkt:0.00 > port 0: ovs-system (internal) > port 1: br-int (internal) > port 2: enp130s0f0_0 > port 3: enp130s0f0_1 > port 4: vxlan_sys_4789 (vxlan: packet_type=ptap) > > # cat /tmp/tmp-ovs-debug.sh > set -x > ovs-appctl dpctl/add-flow > 'recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200)' > 3 > ovs-appctl dpctl/dump-flows > ovs-vsctl del-port br-int enp130s0f0_0 > tc filter show dev enp130s0f0_0 ingress > ovs-appctl dpctl/dump-flows > > # sh /tmp/tmp-ovs-debug.sh > + ovs-appctl dpctl/add-flow > 'recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200)' > 3 > + ovs-appctl dpctl/dump-flows > recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200), > packets:0, bytes:0, used:0.020s, actions:3 > + ovs-vsctl del-port br-int enp130s0f0_0 > + tc filter show dev enp130s0f0_0 ingress > filter protocol ip pref 2 flower chain 0 > filter protocol ip pref 2 flower chain 0 handle 0x1 > dst_mac 00:11:22:33:44:66 > eth_type ipv4 > dst_ip 1.1.1.200 > in_hw in_hw_count 1 > action order 1: mirred (Egress Redirect to device enp130s0f0_1) stolen > index 1 ref 1 bind 1 > cookie 40ca20750c4add5a9cc1c880ccb3d0e8 > no_percpu > used_hw_stats delayed > > + ovs-appctl dpctl/dump-flows > > The tc rules on port enp130s0f0_0 are not deleted, we can use the tc > command to show them. > >> it looks ok but I wonder if it's redundant? >> I tested and when I removed a port with tc rules on it, >> all tc rules got deleted using the flow api flow del, >> we get into netdev_tc_flow_del() for each flow. > I didn't find the netdev_tc_flow_del was invoked while the port was deleting. > The debug step is: > 1. run the /tmp/tmp-ovs-debug.sh > 2. and while gdb the ovs-vswitchd: > gdb /root/local/openvswitch-master-dpdk/sbin/ovs-vswitchd -ex 'b > netdev_tc_flow_del' -ex c --pid [ovs-vswitchd pid] > > the breakpoint was not triggered. >> So you shouldn't have a tc rule left or ufid mapping. > As I debug, the netdev_tc_flow_del was not invoked, and the ufids of > flow still are in the mapping. > When we add the port back and the install the rules again, the new > rules(e.g. tc handle 10) may be deleted by the later new rules(which > ufid mapping not > deleted and the handle may be 10), via del_filter_and_ufid_mapping. >> How did you test that you saw tc rules were still on the port? > Yes, I test it using the tc command. Hi. {add/del/mod}-flow dpctl commands are for *debugging purposes only*. And this is clearly stated in documentation: "DATAPATH FLOW TABLE DEBUGGING COMMANDS ... Do not use commands to add or remove or modify datapath flows if ovs-vswitchd is running because it interferes with ovs-vswitchd's own datapath flow management. Use ovs-ofctl(8), instead, to work with OpenFlow flow entries." As you're injecting flows directly to datapath you can't expect higher layers to handle them correctly. The issue here is that you're using OVS in a way it's not intended to be used. Please, configure OVS rules via OpenFlow or be sure that you're removing flows by hands from the datapath as you're adding them. In case above scenario leaves ufids in OVS that cannot be removed, I'd suggest reverting of the previous patch that allowed offloading via add-flow since this is effectively a memory leak and even debugging commands should not produce a memory leak. --- BTW, IMHO, it's a very confusing design that we're installing rules directly to tc. The issue here is that we have to maintain list of ports we offloaded to with ufids completely in userspace and we can not re-create these mappings after OVS restart like we do in case of kernel datapath. The main issue is that we do not know which ports are ours. Datapath is actually split between kernel and userspace with all the metadata in ovs-vswitchd and flows in tc in kernel. This leads to big number of issues. If we could install tc rules from the inside of the usual kernel datapath that might solve a lot of configuration issues. Might create some new, but anyway, I think, common architecture would be much more clear. For example, in userspace datapath we're installing flows to usual userspace flow table AND to offload provider. This allows us to manage flows in more natural way. And even if we will lost some flows inside offload provider implementation we will still clean up all the hanging data while removing flows from the usual userspace datapath flow tables. So, offloading is actually joined part of userspace datapath and not a side thing like tc for kernel datapath. So, what is this part about: kernel datapath has persistent ports and flows (persistent in terms of ovs-vswitchd restarts), but tc offloading has persistent flows and non-persistent ports (i.e. list of ports we're affloading to) and non-persistent flows metadata (ufids of flows we offlaoded). That introduces a lot of complications. Best regards, Ilya Maximets.
On Mon, Jun 15, 2020 at 7:13 PM Ilya Maximets <i.maximets@ovn.org> wrote: > > On 6/14/20 1:40 PM, Tonghao Zhang wrote: > > On Sun, Jun 14, 2020 at 1:31 PM Roi Dayan <roid@mellanox.com> wrote: > >> > >> > >> > >> On 2020-06-11 1:36 PM, xiangxia.m.yue@gmail.com wrote: > >>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com> > >>> > >>> When ports were removed from bridge, we should flush the > >>> offload rules on the ports. The main reason is two factors: > >>> > >>> * The ports removed from bridge, will be managed by OvS, so > >>> flush the rules which installed by OvS. > >>> * If using the TC flower offload, for example, tc rules still > >>> are on the ports. And the information still are maintained by > >>> OvS, such as the mapping for tc and ufid. > >>> Then if adding the port to bridge and installing the rules > >>> to it again, *del_filter_and_ufid_mapping will be invoked, > >>> and delete the tc rule using tc handle which may not exist > >>> (offload init api flushed them.) on kernel or is used by other > >>> previous rules (if so, that rules will be deleted that is not > >>> we expected.). > >>> > >>> Cc: Simon Horman <simon.horman@netronome.com> > >>> Cc: Paul Blakey <paulb@mellanox.com> > >>> Cc: Roi Dayan <roid@mellanox.com> > >>> Cc: Ben Pfaff <blp@ovn.org> > >>> Cc: William Tu <u9012063@gmail.com> > >>> Cc: Ilya Maximets <i.maximets@ovn.org> > >>> Tested-at: https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftravis-ci.com%2Fgithub%2Fovn-open-virtual-networks%2Fovs%2Fbuilds%2F170832624&data=02%7C01%7Croid%40mellanox.com%7Cf0bc38aa88074693105108d80df36979%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C637274686378147436&sdata=Dlq4jnSclj0vXQoldO7Oq2Wwx%2BGEHjK%2FftPUDHnJcKM%3D&reserved=0 > >>> Co-authored-by: Wengang Hou <houwengang@didiglobal.com> > >>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> > >>> --- > >>> lib/netdev-offload.c | 1 + > >>> 1 file changed, 1 insertion(+) > >>> > >>> diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c > >>> index ab97a292ebac..964566caab1e 100644 > >>> --- a/lib/netdev-offload.c > >>> +++ b/lib/netdev-offload.c > >>> @@ -593,6 +593,7 @@ netdev_ports_remove(odp_port_t port_no, const struct dpif_class *dpif_class) > >>> data = netdev_ports_lookup(port_no, dpif_class); > >>> if (data) { > >>> dpif_port_destroy(&data->dpif_port); > >>> + netdev_flow_flush(data->netdev); /* flush offloaded rules. */ > >>> netdev_close(data->netdev); /* unref and possibly close */ > >>> hmap_remove(&port_to_netdev, &data->portno_node); > >>> hmap_remove(&ifindex_to_port, &data->ifindex_node); > >>> > >> > > Hi Roi > > Thanks for your review. The case is reproduced as below: > > # ovs-appctl dpctl/show > > system@ovs-system: > > lookups: hit:0 missed:0 lost:0 > > flows: 0 > > masks: hit:0 total:0 hit/pkt:0.00 > > port 0: ovs-system (internal) > > port 1: br-int (internal) > > port 2: enp130s0f0_0 > > port 3: enp130s0f0_1 > > port 4: vxlan_sys_4789 (vxlan: packet_type=ptap) > > > > # cat /tmp/tmp-ovs-debug.sh > > set -x > > ovs-appctl dpctl/add-flow > > 'recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200)' > > 3 > > ovs-appctl dpctl/dump-flows > > ovs-vsctl del-port br-int enp130s0f0_0 > > tc filter show dev enp130s0f0_0 ingress > > ovs-appctl dpctl/dump-flows > > > > # sh /tmp/tmp-ovs-debug.sh > > + ovs-appctl dpctl/add-flow > > 'recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200)' > > 3 > > + ovs-appctl dpctl/dump-flows > > recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200), > > packets:0, bytes:0, used:0.020s, actions:3 > > + ovs-vsctl del-port br-int enp130s0f0_0 > > + tc filter show dev enp130s0f0_0 ingress > > filter protocol ip pref 2 flower chain 0 > > filter protocol ip pref 2 flower chain 0 handle 0x1 > > dst_mac 00:11:22:33:44:66 > > eth_type ipv4 > > dst_ip 1.1.1.200 > > in_hw in_hw_count 1 > > action order 1: mirred (Egress Redirect to device enp130s0f0_1) stolen > > index 1 ref 1 bind 1 > > cookie 40ca20750c4add5a9cc1c880ccb3d0e8 > > no_percpu > > used_hw_stats delayed > > > > + ovs-appctl dpctl/dump-flows > > > > The tc rules on port enp130s0f0_0 are not deleted, we can use the tc > > command to show them. > > > >> it looks ok but I wonder if it's redundant? > >> I tested and when I removed a port with tc rules on it, > >> all tc rules got deleted using the flow api flow del, > >> we get into netdev_tc_flow_del() for each flow. > > I didn't find the netdev_tc_flow_del was invoked while the port was deleting. > > The debug step is: > > 1. run the /tmp/tmp-ovs-debug.sh > > 2. and while gdb the ovs-vswitchd: > > gdb /root/local/openvswitch-master-dpdk/sbin/ovs-vswitchd -ex 'b > > netdev_tc_flow_del' -ex c --pid [ovs-vswitchd pid] > > > > the breakpoint was not triggered. > >> So you shouldn't have a tc rule left or ufid mapping. > > As I debug, the netdev_tc_flow_del was not invoked, and the ufids of > > flow still are in the mapping. > > When we add the port back and the install the rules again, the new > > rules(e.g. tc handle 10) may be deleted by the later new rules(which > > ufid mapping not > > deleted and the handle may be 10), via del_filter_and_ufid_mapping. > >> How did you test that you saw tc rules were still on the port? > > Yes, I test it using the tc command. > > Hi. > > {add/del/mod}-flow dpctl commands are for *debugging purposes only*. And this > is clearly stated in documentation: > > "DATAPATH FLOW TABLE DEBUGGING COMMANDS > ... > Do not use commands to add or remove or modify datapath flows if ovs-vswitchd > is running because it interferes with ovs-vswitchd's own datapath flow > management. Use ovs-ofctl(8), instead, to work with OpenFlow flow entries." > > As you're injecting flows directly to datapath you can't expect higher layers > to handle them correctly. The issue here is that you're using OVS in a way > it's not intended to be used. Please, configure OVS rules via OpenFlow or > be sure that you're removing flows by hands from the datapath as you're adding > them. Hi Thanks for your review. I test it with openflow, It works fine because the *revalidator will clean up the flows(which may be offloaded). > In case above scenario leaves ufids in OVS that cannot be removed, I'd suggest > reverting of the previous patch that allowed offloading via add-flow since this > is effectively a memory leak and even debugging commands should not produce > a memory leak. yes, we can revert the previous patch. But I think the command is useful for us to debug offload(rte flow offload and tc offload), and test the offload function. For this issue, can we use this little patch to fix it ? or other better way ? > --- > > BTW, IMHO, it's a very confusing design that we're installing rules directly to > tc. The issue here is that we have to maintain list of ports we offloaded to > with ufids completely in userspace and we can not re-create these mappings after > OVS restart like we do in case of kernel datapath. The main issue is that we > do not know which ports are ours. Datapath is actually split between kernel > and userspace with all the metadata in ovs-vswitchd and flows in tc in kernel. > This leads to big number of issues. If we could install tc rules from the > inside of the usual kernel datapath that might solve a lot of configuration issues. > Might create some new, but anyway, I think, common architecture would be much > more clear. > For example, in userspace datapath we're installing flows to usual userspace > flow table AND to offload provider. This allows us to manage flows in more > natural way. And even if we will lost some flows inside offload provider > implementation we will still clean up all the hanging data while removing flows > from the usual userspace datapath flow tables. So, offloading is actually > joined part of userspace datapath and not a side thing like tc for kernel datapath. > So, what is this part about: kernel datapath has persistent ports and flows > (persistent in terms of ovs-vswitchd restarts), but tc offloading has persistent > flows and non-persistent ports (i.e. list of ports we're affloading to) and > non-persistent flows metadata (ufids of flows we offlaoded). That introduces > a lot of complications. > > > Best regards, Ilya Maximets.
On Tue, Jun 16, 2020 at 04:25:21PM +0800, Tonghao Zhang wrote: > On Mon, Jun 15, 2020 at 7:13 PM Ilya Maximets <i.maximets@ovn.org> wrote: > > > > On 6/14/20 1:40 PM, Tonghao Zhang wrote: > > > On Sun, Jun 14, 2020 at 1:31 PM Roi Dayan <roid@mellanox.com> wrote: > > >> > > >> > > >> > > >> On 2020-06-11 1:36 PM, xiangxia.m.yue@gmail.com wrote: > > >>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com> > > >>> > > >>> When ports were removed from bridge, we should flush the > > >>> offload rules on the ports. The main reason is two factors: > > >>> > > >>> * The ports removed from bridge, will be managed by OvS, so > > >>> flush the rules which installed by OvS. > > >>> * If using the TC flower offload, for example, tc rules still > > >>> are on the ports. And the information still are maintained by > > >>> OvS, such as the mapping for tc and ufid. > > >>> Then if adding the port to bridge and installing the rules > > >>> to it again, *del_filter_and_ufid_mapping will be invoked, > > >>> and delete the tc rule using tc handle which may not exist > > >>> (offload init api flushed them.) on kernel or is used by other > > >>> previous rules (if so, that rules will be deleted that is not > > >>> we expected.). > > >>> > > >>> Cc: Simon Horman <simon.horman@netronome.com> > > >>> Cc: Paul Blakey <paulb@mellanox.com> > > >>> Cc: Roi Dayan <roid@mellanox.com> > > >>> Cc: Ben Pfaff <blp@ovn.org> > > >>> Cc: William Tu <u9012063@gmail.com> > > >>> Cc: Ilya Maximets <i.maximets@ovn.org> > > >>> Tested-at: https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftravis-ci.com%2Fgithub%2Fovn-open-virtual-networks%2Fovs%2Fbuilds%2F170832624&data=02%7C01%7Croid%40mellanox.com%7Cf0bc38aa88074693105108d80df36979%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C637274686378147436&sdata=Dlq4jnSclj0vXQoldO7Oq2Wwx%2BGEHjK%2FftPUDHnJcKM%3D&reserved=0 > > >>> Co-authored-by: Wengang Hou <houwengang@didiglobal.com> > > >>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> > > >>> --- > > >>> lib/netdev-offload.c | 1 + > > >>> 1 file changed, 1 insertion(+) > > >>> > > >>> diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c > > >>> index ab97a292ebac..964566caab1e 100644 > > >>> --- a/lib/netdev-offload.c > > >>> +++ b/lib/netdev-offload.c > > >>> @@ -593,6 +593,7 @@ netdev_ports_remove(odp_port_t port_no, const struct dpif_class *dpif_class) > > >>> data = netdev_ports_lookup(port_no, dpif_class); > > >>> if (data) { > > >>> dpif_port_destroy(&data->dpif_port); > > >>> + netdev_flow_flush(data->netdev); /* flush offloaded rules. */ > > >>> netdev_close(data->netdev); /* unref and possibly close */ > > >>> hmap_remove(&port_to_netdev, &data->portno_node); > > >>> hmap_remove(&ifindex_to_port, &data->ifindex_node); > > >>> > > >> > > > Hi Roi > > > Thanks for your review. The case is reproduced as below: > > > # ovs-appctl dpctl/show > > > system@ovs-system: > > > lookups: hit:0 missed:0 lost:0 > > > flows: 0 > > > masks: hit:0 total:0 hit/pkt:0.00 > > > port 0: ovs-system (internal) > > > port 1: br-int (internal) > > > port 2: enp130s0f0_0 > > > port 3: enp130s0f0_1 > > > port 4: vxlan_sys_4789 (vxlan: packet_type=ptap) > > > > > > # cat /tmp/tmp-ovs-debug.sh > > > set -x > > > ovs-appctl dpctl/add-flow > > > 'recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200)' > > > 3 > > > ovs-appctl dpctl/dump-flows > > > ovs-vsctl del-port br-int enp130s0f0_0 > > > tc filter show dev enp130s0f0_0 ingress > > > ovs-appctl dpctl/dump-flows > > > > > > # sh /tmp/tmp-ovs-debug.sh > > > + ovs-appctl dpctl/add-flow > > > 'recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200)' > > > 3 > > > + ovs-appctl dpctl/dump-flows > > > recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200), > > > packets:0, bytes:0, used:0.020s, actions:3 > > > + ovs-vsctl del-port br-int enp130s0f0_0 > > > + tc filter show dev enp130s0f0_0 ingress > > > filter protocol ip pref 2 flower chain 0 > > > filter protocol ip pref 2 flower chain 0 handle 0x1 > > > dst_mac 00:11:22:33:44:66 > > > eth_type ipv4 > > > dst_ip 1.1.1.200 > > > in_hw in_hw_count 1 > > > action order 1: mirred (Egress Redirect to device enp130s0f0_1) stolen > > > index 1 ref 1 bind 1 > > > cookie 40ca20750c4add5a9cc1c880ccb3d0e8 > > > no_percpu > > > used_hw_stats delayed > > > > > > + ovs-appctl dpctl/dump-flows > > > > > > The tc rules on port enp130s0f0_0 are not deleted, we can use the tc > > > command to show them. > > > > > >> it looks ok but I wonder if it's redundant? > > >> I tested and when I removed a port with tc rules on it, > > >> all tc rules got deleted using the flow api flow del, > > >> we get into netdev_tc_flow_del() for each flow. > > > I didn't find the netdev_tc_flow_del was invoked while the port was deleting. > > > The debug step is: > > > 1. run the /tmp/tmp-ovs-debug.sh > > > 2. and while gdb the ovs-vswitchd: > > > gdb /root/local/openvswitch-master-dpdk/sbin/ovs-vswitchd -ex 'b > > > netdev_tc_flow_del' -ex c --pid [ovs-vswitchd pid] > > > > > > the breakpoint was not triggered. > > >> So you shouldn't have a tc rule left or ufid mapping. > > > As I debug, the netdev_tc_flow_del was not invoked, and the ufids of > > > flow still are in the mapping. > > > When we add the port back and the install the rules again, the new > > > rules(e.g. tc handle 10) may be deleted by the later new rules(which > > > ufid mapping not > > > deleted and the handle may be 10), via del_filter_and_ufid_mapping. > > >> How did you test that you saw tc rules were still on the port? > > > Yes, I test it using the tc command. > > > > Hi. > > > > {add/del/mod}-flow dpctl commands are for *debugging purposes only*. And this > > is clearly stated in documentation: > > > > "DATAPATH FLOW TABLE DEBUGGING COMMANDS > > ... > > Do not use commands to add or remove or modify datapath flows if ovs-vswitchd > > is running because it interferes with ovs-vswitchd's own datapath flow > > management. Use ovs-ofctl(8), instead, to work with OpenFlow flow entries." > > > > As you're injecting flows directly to datapath you can't expect higher layers > > to handle them correctly. The issue here is that you're using OVS in a way > > it's not intended to be used. Please, configure OVS rules via OpenFlow or > > be sure that you're removing flows by hands from the datapath as you're adding > > them. > Hi > Thanks for your review. I test it with openflow, It works fine because > the *revalidator > will clean up the flows(which may be offloaded). > > In case above scenario leaves ufids in OVS that cannot be removed, I'd suggest > > reverting of the previous patch that allowed offloading via add-flow since this > > is effectively a memory leak and even debugging commands should not produce > > a memory leak. > yes, we can revert the previous patch. But I think the command is > useful for us to > debug offload(rte flow offload and tc offload), and test the offload function. > For this issue, can we use this little patch to fix it ? or other better way ? I am confused. Which little patch? > > --- > > > > BTW, IMHO, it's a very confusing design that we're installing rules directly to > > tc. The issue here is that we have to maintain list of ports we offloaded to > > with ufids completely in userspace and we can not re-create these mappings after > > OVS restart like we do in case of kernel datapath. The main issue is that we > > do not know which ports are ours. Datapath is actually split between kernel > > and userspace with all the metadata in ovs-vswitchd and flows in tc in kernel. > > This leads to big number of issues. If we could install tc rules from the > > inside of the usual kernel datapath that might solve a lot of configuration issues. > > Might create some new, but anyway, I think, common architecture would be much > > more clear. > > For example, in userspace datapath we're installing flows to usual userspace > > flow table AND to offload provider. This allows us to manage flows in more > > natural way. And even if we will lost some flows inside offload provider > > implementation we will still clean up all the hanging data while removing flows > > from the usual userspace datapath flow tables. So, offloading is actually > > joined part of userspace datapath and not a side thing like tc for kernel datapath. > > So, what is this part about: kernel datapath has persistent ports and flows > > (persistent in terms of ovs-vswitchd restarts), but tc offloading has persistent > > flows and non-persistent ports (i.e. list of ports we're affloading to) and > > non-persistent flows metadata (ufids of flows we offlaoded). That introduces > > a lot of complications. > > > > > > Best regards, Ilya Maximets. > > > > -- > Best regards, Tonghao
On Thu, Jun 18, 2020 at 7:18 PM Simon Horman <simon.horman@netronome.com> wrote: > > On Tue, Jun 16, 2020 at 04:25:21PM +0800, Tonghao Zhang wrote: > > On Mon, Jun 15, 2020 at 7:13 PM Ilya Maximets <i.maximets@ovn.org> wrote: > > > > > > On 6/14/20 1:40 PM, Tonghao Zhang wrote: > > > > On Sun, Jun 14, 2020 at 1:31 PM Roi Dayan <roid@mellanox.com> wrote: > > > >> > > > >> > > > >> > > > >> On 2020-06-11 1:36 PM, xiangxia.m.yue@gmail.com wrote: > > > >>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com> > > > >>> > > > >>> When ports were removed from bridge, we should flush the > > > >>> offload rules on the ports. The main reason is two factors: > > > >>> > > > >>> * The ports removed from bridge, will be managed by OvS, so > > > >>> flush the rules which installed by OvS. > > > >>> * If using the TC flower offload, for example, tc rules still > > > >>> are on the ports. And the information still are maintained by > > > >>> OvS, such as the mapping for tc and ufid. > > > >>> Then if adding the port to bridge and installing the rules > > > >>> to it again, *del_filter_and_ufid_mapping will be invoked, > > > >>> and delete the tc rule using tc handle which may not exist > > > >>> (offload init api flushed them.) on kernel or is used by other > > > >>> previous rules (if so, that rules will be deleted that is not > > > >>> we expected.). > > > >>> > > > >>> Cc: Simon Horman <simon.horman@netronome.com> > > > >>> Cc: Paul Blakey <paulb@mellanox.com> > > > >>> Cc: Roi Dayan <roid@mellanox.com> > > > >>> Cc: Ben Pfaff <blp@ovn.org> > > > >>> Cc: William Tu <u9012063@gmail.com> > > > >>> Cc: Ilya Maximets <i.maximets@ovn.org> > > > >>> Tested-at: https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftravis-ci.com%2Fgithub%2Fovn-open-virtual-networks%2Fovs%2Fbuilds%2F170832624&data=02%7C01%7Croid%40mellanox.com%7Cf0bc38aa88074693105108d80df36979%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C637274686378147436&sdata=Dlq4jnSclj0vXQoldO7Oq2Wwx%2BGEHjK%2FftPUDHnJcKM%3D&reserved=0 > > > >>> Co-authored-by: Wengang Hou <houwengang@didiglobal.com> > > > >>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> > > > >>> --- > > > >>> lib/netdev-offload.c | 1 + > > > >>> 1 file changed, 1 insertion(+) > > > >>> > > > >>> diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c > > > >>> index ab97a292ebac..964566caab1e 100644 > > > >>> --- a/lib/netdev-offload.c > > > >>> +++ b/lib/netdev-offload.c > > > >>> @@ -593,6 +593,7 @@ netdev_ports_remove(odp_port_t port_no, const struct dpif_class *dpif_class) > > > >>> data = netdev_ports_lookup(port_no, dpif_class); > > > >>> if (data) { > > > >>> dpif_port_destroy(&data->dpif_port); > > > >>> + netdev_flow_flush(data->netdev); /* flush offloaded rules. */ > > > >>> netdev_close(data->netdev); /* unref and possibly close */ > > > >>> hmap_remove(&port_to_netdev, &data->portno_node); > > > >>> hmap_remove(&ifindex_to_port, &data->ifindex_node); > > > >>> > > > >> > > > > Hi Roi > > > > Thanks for your review. The case is reproduced as below: > > > > # ovs-appctl dpctl/show > > > > system@ovs-system: > > > > lookups: hit:0 missed:0 lost:0 > > > > flows: 0 > > > > masks: hit:0 total:0 hit/pkt:0.00 > > > > port 0: ovs-system (internal) > > > > port 1: br-int (internal) > > > > port 2: enp130s0f0_0 > > > > port 3: enp130s0f0_1 > > > > port 4: vxlan_sys_4789 (vxlan: packet_type=ptap) > > > > > > > > # cat /tmp/tmp-ovs-debug.sh > > > > set -x > > > > ovs-appctl dpctl/add-flow > > > > 'recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200)' > > > > 3 > > > > ovs-appctl dpctl/dump-flows > > > > ovs-vsctl del-port br-int enp130s0f0_0 > > > > tc filter show dev enp130s0f0_0 ingress > > > > ovs-appctl dpctl/dump-flows > > > > > > > > # sh /tmp/tmp-ovs-debug.sh > > > > + ovs-appctl dpctl/add-flow > > > > 'recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200)' > > > > 3 > > > > + ovs-appctl dpctl/dump-flows > > > > recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200), > > > > packets:0, bytes:0, used:0.020s, actions:3 > > > > + ovs-vsctl del-port br-int enp130s0f0_0 > > > > + tc filter show dev enp130s0f0_0 ingress > > > > filter protocol ip pref 2 flower chain 0 > > > > filter protocol ip pref 2 flower chain 0 handle 0x1 > > > > dst_mac 00:11:22:33:44:66 > > > > eth_type ipv4 > > > > dst_ip 1.1.1.200 > > > > in_hw in_hw_count 1 > > > > action order 1: mirred (Egress Redirect to device enp130s0f0_1) stolen > > > > index 1 ref 1 bind 1 > > > > cookie 40ca20750c4add5a9cc1c880ccb3d0e8 > > > > no_percpu > > > > used_hw_stats delayed > > > > > > > > + ovs-appctl dpctl/dump-flows > > > > > > > > The tc rules on port enp130s0f0_0 are not deleted, we can use the tc > > > > command to show them. > > > > > > > >> it looks ok but I wonder if it's redundant? > > > >> I tested and when I removed a port with tc rules on it, > > > >> all tc rules got deleted using the flow api flow del, > > > >> we get into netdev_tc_flow_del() for each flow. > > > > I didn't find the netdev_tc_flow_del was invoked while the port was deleting. > > > > The debug step is: > > > > 1. run the /tmp/tmp-ovs-debug.sh > > > > 2. and while gdb the ovs-vswitchd: > > > > gdb /root/local/openvswitch-master-dpdk/sbin/ovs-vswitchd -ex 'b > > > > netdev_tc_flow_del' -ex c --pid [ovs-vswitchd pid] > > > > > > > > the breakpoint was not triggered. > > > >> So you shouldn't have a tc rule left or ufid mapping. > > > > As I debug, the netdev_tc_flow_del was not invoked, and the ufids of > > > > flow still are in the mapping. > > > > When we add the port back and the install the rules again, the new > > > > rules(e.g. tc handle 10) may be deleted by the later new rules(which > > > > ufid mapping not > > > > deleted and the handle may be 10), via del_filter_and_ufid_mapping. > > > >> How did you test that you saw tc rules were still on the port? > > > > Yes, I test it using the tc command. > > > > > > Hi. > > > > > > {add/del/mod}-flow dpctl commands are for *debugging purposes only*. And this > > > is clearly stated in documentation: > > > > > > "DATAPATH FLOW TABLE DEBUGGING COMMANDS > > > ... > > > Do not use commands to add or remove or modify datapath flows if ovs-vswitchd > > > is running because it interferes with ovs-vswitchd's own datapath flow > > > management. Use ovs-ofctl(8), instead, to work with OpenFlow flow entries." > > > > > > As you're injecting flows directly to datapath you can't expect higher layers > > > to handle them correctly. The issue here is that you're using OVS in a way > > > it's not intended to be used. Please, configure OVS rules via OpenFlow or > > > be sure that you're removing flows by hands from the datapath as you're adding > > > them. > > Hi > > Thanks for your review. I test it with openflow, It works fine because > > the *revalidator > > will clean up the flows(which may be offloaded). > > > In case above scenario leaves ufids in OVS that cannot be removed, I'd suggest > > > reverting of the previous patch that allowed offloading via add-flow since this > > > is effectively a memory leak and even debugging commands should not produce > > > a memory leak. > > yes, we can revert the previous patch. But I think the command is > > useful for us to > > debug offload(rte flow offload and tc offload), and test the offload function. > > For this issue, can we use this little patch to fix it ? or other better way ? > > I am confused. Which little patch? Hi Simon This patch [1]: [1] - http://patchwork.ozlabs.org/project/openvswitch/patch/20200611103635.53367-1-xiangxia.m.yue@gmail.com/ If we revert the patch which commit id is e61984e781e6c7d621568428788cb87c11be8f1f and we can't debug or install the flows with "ovs-app dpctl/add-flow", for tc offload. I think that tools are useful for us. > > > --- > > > > > > BTW, IMHO, it's a very confusing design that we're installing rules directly to > > > tc. The issue here is that we have to maintain list of ports we offloaded to > > > with ufids completely in userspace and we can not re-create these mappings after > > > OVS restart like we do in case of kernel datapath. The main issue is that we > > > do not know which ports are ours. Datapath is actually split between kernel > > > and userspace with all the metadata in ovs-vswitchd and flows in tc in kernel. > > > This leads to big number of issues. If we could install tc rules from the > > > inside of the usual kernel datapath that might solve a lot of configuration issues. > > > Might create some new, but anyway, I think, common architecture would be much > > > more clear. > > > For example, in userspace datapath we're installing flows to usual userspace > > > flow table AND to offload provider. This allows us to manage flows in more > > > natural way. And even if we will lost some flows inside offload provider > > > implementation we will still clean up all the hanging data while removing flows > > > from the usual userspace datapath flow tables. So, offloading is actually > > > joined part of userspace datapath and not a side thing like tc for kernel datapath. > > > So, what is this part about: kernel datapath has persistent ports and flows > > > (persistent in terms of ovs-vswitchd restarts), but tc offloading has persistent > > > flows and non-persistent ports (i.e. list of ports we're affloading to) and > > > non-persistent flows metadata (ufids of flows we offlaoded). That introduces > > > a lot of complications. > > > > > > > > > Best regards, Ilya Maximets. > > > > > > > > -- > > Best regards, Tonghao
On Thu, Jun 18, 2020 at 7:41 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote: > > On Thu, Jun 18, 2020 at 7:18 PM Simon Horman <simon.horman@netronome.com> wrote: > > > > On Tue, Jun 16, 2020 at 04:25:21PM +0800, Tonghao Zhang wrote: > > > On Mon, Jun 15, 2020 at 7:13 PM Ilya Maximets <i.maximets@ovn.org> wrote: > > > > > > > > On 6/14/20 1:40 PM, Tonghao Zhang wrote: > > > > > On Sun, Jun 14, 2020 at 1:31 PM Roi Dayan <roid@mellanox.com> wrote: > > > > >> > > > > >> > > > > >> > > > > >> On 2020-06-11 1:36 PM, xiangxia.m.yue@gmail.com wrote: > > > > >>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com> > > > > >>> > > > > >>> When ports were removed from bridge, we should flush the > > > > >>> offload rules on the ports. The main reason is two factors: > > > > >>> > > > > >>> * The ports removed from bridge, will be managed by OvS, so > > > > >>> flush the rules which installed by OvS. > > > > >>> * If using the TC flower offload, for example, tc rules still > > > > >>> are on the ports. And the information still are maintained by > > > > >>> OvS, such as the mapping for tc and ufid. > > > > >>> Then if adding the port to bridge and installing the rules > > > > >>> to it again, *del_filter_and_ufid_mapping will be invoked, > > > > >>> and delete the tc rule using tc handle which may not exist > > > > >>> (offload init api flushed them.) on kernel or is used by other > > > > >>> previous rules (if so, that rules will be deleted that is not > > > > >>> we expected.). > > > > >>> > > > > >>> Cc: Simon Horman <simon.horman@netronome.com> > > > > >>> Cc: Paul Blakey <paulb@mellanox.com> > > > > >>> Cc: Roi Dayan <roid@mellanox.com> > > > > >>> Cc: Ben Pfaff <blp@ovn.org> > > > > >>> Cc: William Tu <u9012063@gmail.com> > > > > >>> Cc: Ilya Maximets <i.maximets@ovn.org> > > > > >>> Tested-at: https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftravis-ci.com%2Fgithub%2Fovn-open-virtual-networks%2Fovs%2Fbuilds%2F170832624&data=02%7C01%7Croid%40mellanox.com%7Cf0bc38aa88074693105108d80df36979%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C637274686378147436&sdata=Dlq4jnSclj0vXQoldO7Oq2Wwx%2BGEHjK%2FftPUDHnJcKM%3D&reserved=0 > > > > >>> Co-authored-by: Wengang Hou <houwengang@didiglobal.com> > > > > >>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> > > > > >>> --- > > > > >>> lib/netdev-offload.c | 1 + > > > > >>> 1 file changed, 1 insertion(+) > > > > >>> > > > > >>> diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c > > > > >>> index ab97a292ebac..964566caab1e 100644 > > > > >>> --- a/lib/netdev-offload.c > > > > >>> +++ b/lib/netdev-offload.c > > > > >>> @@ -593,6 +593,7 @@ netdev_ports_remove(odp_port_t port_no, const struct dpif_class *dpif_class) > > > > >>> data = netdev_ports_lookup(port_no, dpif_class); > > > > >>> if (data) { > > > > >>> dpif_port_destroy(&data->dpif_port); > > > > >>> + netdev_flow_flush(data->netdev); /* flush offloaded rules. */ > > > > >>> netdev_close(data->netdev); /* unref and possibly close */ > > > > >>> hmap_remove(&port_to_netdev, &data->portno_node); > > > > >>> hmap_remove(&ifindex_to_port, &data->ifindex_node); > > > > >>> > > > > >> > > > > > Hi Roi > > > > > Thanks for your review. The case is reproduced as below: > > > > > # ovs-appctl dpctl/show > > > > > system@ovs-system: > > > > > lookups: hit:0 missed:0 lost:0 > > > > > flows: 0 > > > > > masks: hit:0 total:0 hit/pkt:0.00 > > > > > port 0: ovs-system (internal) > > > > > port 1: br-int (internal) > > > > > port 2: enp130s0f0_0 > > > > > port 3: enp130s0f0_1 > > > > > port 4: vxlan_sys_4789 (vxlan: packet_type=ptap) > > > > > > > > > > # cat /tmp/tmp-ovs-debug.sh > > > > > set -x > > > > > ovs-appctl dpctl/add-flow > > > > > 'recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200)' > > > > > 3 > > > > > ovs-appctl dpctl/dump-flows > > > > > ovs-vsctl del-port br-int enp130s0f0_0 > > > > > tc filter show dev enp130s0f0_0 ingress > > > > > ovs-appctl dpctl/dump-flows > > > > > > > > > > # sh /tmp/tmp-ovs-debug.sh > > > > > + ovs-appctl dpctl/add-flow > > > > > 'recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200)' > > > > > 3 > > > > > + ovs-appctl dpctl/dump-flows > > > > > recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200), > > > > > packets:0, bytes:0, used:0.020s, actions:3 > > > > > + ovs-vsctl del-port br-int enp130s0f0_0 > > > > > + tc filter show dev enp130s0f0_0 ingress > > > > > filter protocol ip pref 2 flower chain 0 > > > > > filter protocol ip pref 2 flower chain 0 handle 0x1 > > > > > dst_mac 00:11:22:33:44:66 > > > > > eth_type ipv4 > > > > > dst_ip 1.1.1.200 > > > > > in_hw in_hw_count 1 > > > > > action order 1: mirred (Egress Redirect to device enp130s0f0_1) stolen > > > > > index 1 ref 1 bind 1 > > > > > cookie 40ca20750c4add5a9cc1c880ccb3d0e8 > > > > > no_percpu > > > > > used_hw_stats delayed > > > > > > > > > > + ovs-appctl dpctl/dump-flows > > > > > > > > > > The tc rules on port enp130s0f0_0 are not deleted, we can use the tc > > > > > command to show them. > > > > > > > > > >> it looks ok but I wonder if it's redundant? > > > > >> I tested and when I removed a port with tc rules on it, > > > > >> all tc rules got deleted using the flow api flow del, > > > > >> we get into netdev_tc_flow_del() for each flow. > > > > > I didn't find the netdev_tc_flow_del was invoked while the port was deleting. > > > > > The debug step is: > > > > > 1. run the /tmp/tmp-ovs-debug.sh > > > > > 2. and while gdb the ovs-vswitchd: > > > > > gdb /root/local/openvswitch-master-dpdk/sbin/ovs-vswitchd -ex 'b > > > > > netdev_tc_flow_del' -ex c --pid [ovs-vswitchd pid] > > > > > > > > > > the breakpoint was not triggered. > > > > >> So you shouldn't have a tc rule left or ufid mapping. > > > > > As I debug, the netdev_tc_flow_del was not invoked, and the ufids of > > > > > flow still are in the mapping. > > > > > When we add the port back and the install the rules again, the new > > > > > rules(e.g. tc handle 10) may be deleted by the later new rules(which > > > > > ufid mapping not > > > > > deleted and the handle may be 10), via del_filter_and_ufid_mapping. > > > > >> How did you test that you saw tc rules were still on the port? > > > > > Yes, I test it using the tc command. > > > > > > > > Hi. > > > > > > > > {add/del/mod}-flow dpctl commands are for *debugging purposes only*. And this > > > > is clearly stated in documentation: > > > > > > > > "DATAPATH FLOW TABLE DEBUGGING COMMANDS > > > > ... > > > > Do not use commands to add or remove or modify datapath flows if ovs-vswitchd > > > > is running because it interferes with ovs-vswitchd's own datapath flow > > > > management. Use ovs-ofctl(8), instead, to work with OpenFlow flow entries." > > > > > > > > As you're injecting flows directly to datapath you can't expect higher layers > > > > to handle them correctly. The issue here is that you're using OVS in a way > > > > it's not intended to be used. Please, configure OVS rules via OpenFlow or > > > > be sure that you're removing flows by hands from the datapath as you're adding > > > > them. > > > Hi > > > Thanks for your review. I test it with openflow, It works fine because > > > the *revalidator > > > will clean up the flows(which may be offloaded). > > > > In case above scenario leaves ufids in OVS that cannot be removed, I'd suggest > > > > reverting of the previous patch that allowed offloading via add-flow since this > > > > is effectively a memory leak and even debugging commands should not produce > > > > a memory leak. > > > yes, we can revert the previous patch. But I think the command is > > > useful for us to > > > debug offload(rte flow offload and tc offload), and test the offload function. > > > For this issue, can we use this little patch to fix it ? or other better way ? > > > > I am confused. Which little patch? > Hi Simon > This patch [1]: > [1] - http://patchwork.ozlabs.org/project/openvswitch/patch/20200611103635.53367-1-xiangxia.m.yue@gmail.com/ > > If we revert the patch which commit id is > e61984e781e6c7d621568428788cb87c11be8f1f > and we can't debug or install the flows with "ovs-app dpctl/add-flow", > for tc offload. > I think that tools are useful for us. Hi Simon, Ilya Do we have plan to apply this patch, or drop this patch ? > > > > --- > > > > > > > > BTW, IMHO, it's a very confusing design that we're installing rules directly to > > > > tc. The issue here is that we have to maintain list of ports we offloaded to > > > > with ufids completely in userspace and we can not re-create these mappings after > > > > OVS restart like we do in case of kernel datapath. The main issue is that we > > > > do not know which ports are ours. Datapath is actually split between kernel > > > > and userspace with all the metadata in ovs-vswitchd and flows in tc in kernel. > > > > This leads to big number of issues. If we could install tc rules from the > > > > inside of the usual kernel datapath that might solve a lot of configuration issues. > > > > Might create some new, but anyway, I think, common architecture would be much > > > > more clear. > > > > For example, in userspace datapath we're installing flows to usual userspace > > > > flow table AND to offload provider. This allows us to manage flows in more > > > > natural way. And even if we will lost some flows inside offload provider > > > > implementation we will still clean up all the hanging data while removing flows > > > > from the usual userspace datapath flow tables. So, offloading is actually > > > > joined part of userspace datapath and not a side thing like tc for kernel datapath. > > > > So, what is this part about: kernel datapath has persistent ports and flows > > > > (persistent in terms of ovs-vswitchd restarts), but tc offloading has persistent > > > > flows and non-persistent ports (i.e. list of ports we're affloading to) and > > > > non-persistent flows metadata (ufids of flows we offlaoded). That introduces > > > > a lot of complications. > > > > > > > > > > > > Best regards, Ilya Maximets. > > > > > > > > > > > > -- > > > Best regards, Tonghao > > > > -- > Best regards, Tonghao
> > > > Hi > > > > Thanks for your review. I test it with openflow, It works fine because > > > > the *revalidator > > > > will clean up the flows(which may be offloaded). > > > > > In case above scenario leaves ufids in OVS that cannot be removed, I'd suggest > > > > > reverting of the previous patch that allowed offloading via add-flow since this > > > > > is effectively a memory leak and even debugging commands should not produce > > > > > a memory leak. > > > > yes, we can revert the previous patch. But I think the command is > > > > useful for us to > > > > debug offload(rte flow offload and tc offload), and test the offload function. > > > > For this issue, can we use this little patch to fix it ? or other better way ? > > > > > > I am confused. Which little patch? > > Hi Simon > > This patch [1]: > > [1] - http://patchwork.ozlabs.org/project/openvswitch/patch/20200611103635.53367-1-xiangxia.m.yue@gmail.com/ > > > > If we revert the patch which commit id is > > e61984e781e6c7d621568428788cb87c11be8f1f > > and we can't debug or install the flows with "ovs-app dpctl/add-flow", > > for tc offload. > > I think that tools are useful for us. > Hi Simon, Ilya > Do we have plan to apply this patch, or drop this patch ? My understanding from reading the discussion is that using ovs-appctl dpclt/add-flow is just for debugging purposes. When using the right way, the ovs-ofctl add-flow, there is no such problem. I don't have any strong opinion on drop or apply. William
diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c index ab97a292ebac..964566caab1e 100644 --- a/lib/netdev-offload.c +++ b/lib/netdev-offload.c @@ -593,6 +593,7 @@ netdev_ports_remove(odp_port_t port_no, const struct dpif_class *dpif_class) data = netdev_ports_lookup(port_no, dpif_class); if (data) { dpif_port_destroy(&data->dpif_port); + netdev_flow_flush(data->netdev); /* flush offloaded rules. */ netdev_close(data->netdev); /* unref and possibly close */ hmap_remove(&port_to_netdev, &data->portno_node); hmap_remove(&ifindex_to_port, &data->ifindex_node);