diff mbox series

[ovs-dev] dpif-netlink-rtnl: Fix ovs_geneve probing after restart

Message ID 1509046263-23324-1-git-send-email-u9012063@gmail.com
State Deferred
Headers show
Series [ovs-dev] dpif-netlink-rtnl: Fix ovs_geneve probing after restart | expand

Commit Message

William Tu Oct. 26, 2017, 7:31 p.m. UTC
When using the out-of-tree (openvswitch compat) geneve module,
the first time oot tunnel probing returns true (correct).
Without unloading the geneve module, if the userspace ovs-vswitchd
restarts, because the 'geneve_sys_6081' still exists, the probing
incorrectly returns false and loads the in-tree (upstream kernel)
geneve module.  The issue also exists the other way around, where
existing geneve module is in-tree and ovs-vswitchd restarts.

The patch fixes it by unconditionally removing the tunnel device
before the probing.  To reproduce the issue, start the ovs
> /etc/init.d/openvswitch-switch start
> creat a bridge and attach a geneve port using out-of-tree geneve
> /etc/init.d/openvswitch-switch restart

Fixes: 921c370a9df5 ("dpif-netlink: Probe for out-of-tree tunnels, decides used interface")
Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Eric Garver <e@erig.me>
Cc: Gurucharan Shetty <guru@ovn.org>
---
 lib/dpif-netlink-rtnl.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Eric Garver Oct. 26, 2017, 8:20 p.m. UTC | #1
On Thu, Oct 26, 2017 at 12:31:03PM -0700, William Tu wrote:
> When using the out-of-tree (openvswitch compat) geneve module,
> the first time oot tunnel probing returns true (correct).
> Without unloading the geneve module, if the userspace ovs-vswitchd
> restarts, because the 'geneve_sys_6081' still exists, the probing
> incorrectly returns false and loads the in-tree (upstream kernel)
> geneve module.  The issue also exists the other way around, where
> existing geneve module is in-tree and ovs-vswitchd restarts.
> 
> The patch fixes it by unconditionally removing the tunnel device
> before the probing.  To reproduce the issue, start the ovs
> > /etc/init.d/openvswitch-switch start
> > creat a bridge and attach a geneve port using out-of-tree geneve
> > /etc/init.d/openvswitch-switch restart
> 
> Fixes: 921c370a9df5 ("dpif-netlink: Probe for out-of-tree tunnels, decides used interface")
> Signed-off-by: William Tu <u9012063@gmail.com>
> Cc: Eric Garver <e@erig.me>
> Cc: Gurucharan Shetty <guru@ovn.org>
> ---
>  lib/dpif-netlink-rtnl.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c
> index 0c32e7d8ccb4..148ce5ff3a3d 100644
> --- a/lib/dpif-netlink-rtnl.c
> +++ b/lib/dpif-netlink-rtnl.c
> @@ -448,6 +448,8 @@ dpif_netlink_rtnl_probe_oot_tunnels(void)
>          }
>  
>          name = netdev_vport_get_dpif_port(netdev, namebuf, sizeof namebuf);
> +        dpif_netlink_rtnl_destroy(name);
> +
>          error = dpif_netlink_rtnl_create(tnl_cfg, name, OVS_VPORT_TYPE_GENEVE,
>                                           "ovs_geneve",
>                                           (NLM_F_REQUEST | NLM_F_ACK
> -- 
> 2.7.4

Thanks for the fix!

Acked-by: Eric Garver <e@erig.me>
William Tu Oct. 30, 2017, 5:36 p.m. UTC | #2
Thanks for the review.

Guru and I had some offline discussion. We have concern about possible
packet lost when unconditionally remove the interface and bring it
again. Currently if the userspace ovs-vswitchd dies, since the kernel
datapath module still remains, packets can still go through.
If restarting ovs-vswitch restarts the geneve device, then we might
see more packets lost. So I'm thinking about another way to fix it
without deleting the device.

Thanks
William

On Thu, Oct 26, 2017 at 1:20 PM, Eric Garver <e@erig.me> wrote:
> On Thu, Oct 26, 2017 at 12:31:03PM -0700, William Tu wrote:
>> When using the out-of-tree (openvswitch compat) geneve module,
>> the first time oot tunnel probing returns true (correct).
>> Without unloading the geneve module, if the userspace ovs-vswitchd
>> restarts, because the 'geneve_sys_6081' still exists, the probing
>> incorrectly returns false and loads the in-tree (upstream kernel)
>> geneve module.  The issue also exists the other way around, where
>> existing geneve module is in-tree and ovs-vswitchd restarts.
>>
>> The patch fixes it by unconditionally removing the tunnel device
>> before the probing.  To reproduce the issue, start the ovs
>> > /etc/init.d/openvswitch-switch start
>> > creat a bridge and attach a geneve port using out-of-tree geneve
>> > /etc/init.d/openvswitch-switch restart
>>
>> Fixes: 921c370a9df5 ("dpif-netlink: Probe for out-of-tree tunnels, decides used interface")
>> Signed-off-by: William Tu <u9012063@gmail.com>
>> Cc: Eric Garver <e@erig.me>
>> Cc: Gurucharan Shetty <guru@ovn.org>
>> ---
>>  lib/dpif-netlink-rtnl.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c
>> index 0c32e7d8ccb4..148ce5ff3a3d 100644
>> --- a/lib/dpif-netlink-rtnl.c
>> +++ b/lib/dpif-netlink-rtnl.c
>> @@ -448,6 +448,8 @@ dpif_netlink_rtnl_probe_oot_tunnels(void)
>>          }
>>
>>          name = netdev_vport_get_dpif_port(netdev, namebuf, sizeof namebuf);
>> +        dpif_netlink_rtnl_destroy(name);
>> +
>>          error = dpif_netlink_rtnl_create(tnl_cfg, name, OVS_VPORT_TYPE_GENEVE,
>>                                           "ovs_geneve",
>>                                           (NLM_F_REQUEST | NLM_F_ACK
>> --
>> 2.7.4
>
> Thanks for the fix!
>
> Acked-by: Eric Garver <e@erig.me>
Eric Garver Nov. 1, 2017, 3:29 p.m. UTC | #3
On Mon, Oct 30, 2017 at 10:36:29AM -0700, William Tu wrote:
> Thanks for the review.
> 
> Guru and I had some offline discussion. We have concern about possible
> packet lost when unconditionally remove the interface and bring it
> again. Currently if the userspace ovs-vswitchd dies, since the kernel
> datapath module still remains, packets can still go through.
> If restarting ovs-vswitch restarts the geneve device, then we might
> see more packets lost. So I'm thinking about another way to fix it
> without deleting the device.

You can use dpif_netlink_rtnl_getlink() if the device already exists.
Then check IFLA_INFO_KIND == "ovs_geneve".

> 
> Thanks
> William
> 
> On Thu, Oct 26, 2017 at 1:20 PM, Eric Garver <e@erig.me> wrote:
> > On Thu, Oct 26, 2017 at 12:31:03PM -0700, William Tu wrote:
> >> When using the out-of-tree (openvswitch compat) geneve module,
> >> the first time oot tunnel probing returns true (correct).
> >> Without unloading the geneve module, if the userspace ovs-vswitchd
> >> restarts, because the 'geneve_sys_6081' still exists, the probing
> >> incorrectly returns false and loads the in-tree (upstream kernel)
> >> geneve module.  The issue also exists the other way around, where
> >> existing geneve module is in-tree and ovs-vswitchd restarts.
> >>
> >> The patch fixes it by unconditionally removing the tunnel device
> >> before the probing.  To reproduce the issue, start the ovs
> >> > /etc/init.d/openvswitch-switch start
> >> > creat a bridge and attach a geneve port using out-of-tree geneve
> >> > /etc/init.d/openvswitch-switch restart
> >>
> >> Fixes: 921c370a9df5 ("dpif-netlink: Probe for out-of-tree tunnels, decides used interface")
> >> Signed-off-by: William Tu <u9012063@gmail.com>
> >> Cc: Eric Garver <e@erig.me>
> >> Cc: Gurucharan Shetty <guru@ovn.org>
> >> ---
> >>  lib/dpif-netlink-rtnl.c | 2 ++
> >>  1 file changed, 2 insertions(+)
> >>
> >> diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c
> >> index 0c32e7d8ccb4..148ce5ff3a3d 100644
> >> --- a/lib/dpif-netlink-rtnl.c
> >> +++ b/lib/dpif-netlink-rtnl.c
> >> @@ -448,6 +448,8 @@ dpif_netlink_rtnl_probe_oot_tunnels(void)
> >>          }
> >>
> >>          name = netdev_vport_get_dpif_port(netdev, namebuf, sizeof namebuf);
> >> +        dpif_netlink_rtnl_destroy(name);
> >> +
> >>          error = dpif_netlink_rtnl_create(tnl_cfg, name, OVS_VPORT_TYPE_GENEVE,
> >>                                           "ovs_geneve",
> >>                                           (NLM_F_REQUEST | NLM_F_ACK
> >> --
> >> 2.7.4
> >
> > Thanks for the fix!
> >
> > Acked-by: Eric Garver <e@erig.me>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
William Tu Nov. 2, 2017, 3:49 p.m. UTC | #4
On Wed, Nov 1, 2017 at 8:29 AM, Eric Garver <e@erig.me> wrote:
> On Mon, Oct 30, 2017 at 10:36:29AM -0700, William Tu wrote:
>> Thanks for the review.
>>
>> Guru and I had some offline discussion. We have concern about possible
>> packet lost when unconditionally remove the interface and bring it
>> again. Currently if the userspace ovs-vswitchd dies, since the kernel
>> datapath module still remains, packets can still go through.
>> If restarting ovs-vswitch restarts the geneve device, then we might
>> see more packets lost. So I'm thinking about another way to fix it
>> without deleting the device.
>
> You can use dpif_netlink_rtnl_getlink() if the device already exists.
> Then check IFLA_INFO_KIND == "ovs_geneve".
>
Thanks Eric,
I followed your suggestion and submit v2 here:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-November/340357.html

William
diff mbox series

Patch

diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c
index 0c32e7d8ccb4..148ce5ff3a3d 100644
--- a/lib/dpif-netlink-rtnl.c
+++ b/lib/dpif-netlink-rtnl.c
@@ -448,6 +448,8 @@  dpif_netlink_rtnl_probe_oot_tunnels(void)
         }
 
         name = netdev_vport_get_dpif_port(netdev, namebuf, sizeof namebuf);
+        dpif_netlink_rtnl_destroy(name);
+
         error = dpif_netlink_rtnl_create(tnl_cfg, name, OVS_VPORT_TYPE_GENEVE,
                                          "ovs_geneve",
                                          (NLM_F_REQUEST | NLM_F_ACK