diff mbox series

[ovs-dev] dpif-netlink-rtnl: Fix ovs_geneve probing after restart

Message ID 1508881411-36992-1-git-send-email-u9012063@gmail.com
State Changes Requested
Headers show
Series [ovs-dev] dpif-netlink-rtnl: Fix ovs_geneve probing after restart | expand

Commit Message

William Tu Oct. 24, 2017, 9:43 p.m. UTC
When using the out-of-tree (openvswitch compat) geneve module,
the first time oot tunnel probing returns true (correct).
Without unloading the geneve module, if the userspace ovs-vswitchd
restarts, because the 'geneve_sys_6081' still exists, the probing
incorrectly returns false and loads the in-tree (upstream kernel)
geneve module.

The patch fixes it by adding NLM_F_EXCL flags, so if geneve already
exists, the dpif_netlink_rtnl_create return EEXIST and the oot tunnel
probing returns true.  To reproduce the issue, start the ovs
> /etc/init.d/openvswitch-switch start
> creat a bridge and attach a geneve port using out-of-tree geneve
> /etc/init.d/openvswitch-switch restart

Fixes: 921c370a9df5 ("dpif-netlink: Probe for out-of-tree tunnels, decides used interface")
Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Eric Garver <e@erig.me>
Cc: Gurucharan Shetty <guru@ovn.org>
---
 lib/dpif-netlink-rtnl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Eric Garver Oct. 25, 2017, 5:39 p.m. UTC | #1
On Tue, Oct 24, 2017 at 02:43:31PM -0700, William Tu wrote:

Hi William,
Thanks for taking a look at this.

> When using the out-of-tree (openvswitch compat) geneve module,
> the first time oot tunnel probing returns true (correct).
> Without unloading the geneve module, if the userspace ovs-vswitchd
> restarts, because the 'geneve_sys_6081' still exists, the probing
> incorrectly returns false and loads the in-tree (upstream kernel)
> geneve module.

The reason for this is because rtnl sees the existing device and tries
to call ->changelink, but since the out-of-tree tunnel has no handler it
returns EOPNOTSUPP.

> The patch fixes it by adding NLM_F_EXCL flags, so if geneve already
> exists, the dpif_netlink_rtnl_create return EEXIST and the oot tunnel
> probing returns true.  To reproduce the issue, start the ovs

While this fixes this scenario, it breaks another; an existing device
in-tree/kernel geneve tunnel. We'll end up with the opposite occurring -
it returns true when it should return false.

So in addition to adding NLM_F_EXCL, this needs to also try to delete
the existing device in the case of EEXISTS, then restart the probe.

> > /etc/init.d/openvswitch-switch start
> > creat a bridge and attach a geneve port using out-of-tree geneve
> > /etc/init.d/openvswitch-switch restart
> 
> Fixes: 921c370a9df5 ("dpif-netlink: Probe for out-of-tree tunnels, decides used interface")
> Signed-off-by: William Tu <u9012063@gmail.com>
> Cc: Eric Garver <e@erig.me>
> Cc: Gurucharan Shetty <guru@ovn.org>
> ---
>  lib/dpif-netlink-rtnl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c
> index 0c32e7d8ccb4..ab9737b8cc4b 100644
> --- a/lib/dpif-netlink-rtnl.c
> +++ b/lib/dpif-netlink-rtnl.c
> @@ -451,7 +451,7 @@ dpif_netlink_rtnl_probe_oot_tunnels(void)
>          error = dpif_netlink_rtnl_create(tnl_cfg, name, OVS_VPORT_TYPE_GENEVE,
>                                           "ovs_geneve",
>                                           (NLM_F_REQUEST | NLM_F_ACK
> -                                          | NLM_F_CREATE));
> +                                          | NLM_F_CREATE | NLM_F_EXCL));
>          if (error != EOPNOTSUPP) {
>              if (!error) {
>                  dpif_netlink_rtnl_destroy(name);
> -- 
> 2.7.4
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
William Tu Oct. 26, 2017, 4:12 p.m. UTC | #2
On Wed, Oct 25, 2017 at 10:39 AM, Eric Garver <e@erig.me> wrote:
> On Tue, Oct 24, 2017 at 02:43:31PM -0700, William Tu wrote:
>
> Hi William,
> Thanks for taking a look at this.
>
>> When using the out-of-tree (openvswitch compat) geneve module,
>> the first time oot tunnel probing returns true (correct).
>> Without unloading the geneve module, if the userspace ovs-vswitchd
>> restarts, because the 'geneve_sys_6081' still exists, the probing
>> incorrectly returns false and loads the in-tree (upstream kernel)
>> geneve module.
>
> The reason for this is because rtnl sees the existing device and tries
> to call ->changelink, but since the out-of-tree tunnel has no handler it
> returns EOPNOTSUPP.
>
>> The patch fixes it by adding NLM_F_EXCL flags, so if geneve already
>> exists, the dpif_netlink_rtnl_create return EEXIST and the oot tunnel
>> probing returns true.  To reproduce the issue, start the ovs
>
> While this fixes this scenario, it breaks another; an existing device
> in-tree/kernel geneve tunnel. We'll end up with the opposite occurring -
> it returns true when it should return false.
>
> So in addition to adding NLM_F_EXCL, this needs to also try to delete
> the existing device in the case of EEXISTS, then restart the probe.
>

You're right, thanks!
How about we unconditionally delete the device before the probe?. I'm
simply adding one line without using NLM_F_EXCL:

+       dpif_netlink_rtnl_destroy(name);
         error = dpif_netlink_rtnl_create(tnl_cfg, name, OVS_VPORT_TYPE_GENEVE,
                                          "ovs_geneve",
                                          (NLM_F_REQUEST | NLM_F_ACK
                                           | NLM_F_CREATE));

I'm testing using in-tree or out-of-tree and the result is consistent.

Regards,
William
Eric Garver Oct. 26, 2017, 6:15 p.m. UTC | #3
On Thu, Oct 26, 2017 at 09:12:14AM -0700, William Tu wrote:
> On Wed, Oct 25, 2017 at 10:39 AM, Eric Garver <e@erig.me> wrote:
> > On Tue, Oct 24, 2017 at 02:43:31PM -0700, William Tu wrote:
> >
> > Hi William,
> > Thanks for taking a look at this.
> >
> >> When using the out-of-tree (openvswitch compat) geneve module,
> >> the first time oot tunnel probing returns true (correct).
> >> Without unloading the geneve module, if the userspace ovs-vswitchd
> >> restarts, because the 'geneve_sys_6081' still exists, the probing
> >> incorrectly returns false and loads the in-tree (upstream kernel)
> >> geneve module.
> >
> > The reason for this is because rtnl sees the existing device and tries
> > to call ->changelink, but since the out-of-tree tunnel has no handler it
> > returns EOPNOTSUPP.
> >
> >> The patch fixes it by adding NLM_F_EXCL flags, so if geneve already
> >> exists, the dpif_netlink_rtnl_create return EEXIST and the oot tunnel
> >> probing returns true.  To reproduce the issue, start the ovs
> >
> > While this fixes this scenario, it breaks another; an existing device
> > in-tree/kernel geneve tunnel. We'll end up with the opposite occurring -
> > it returns true when it should return false.
> >
> > So in addition to adding NLM_F_EXCL, this needs to also try to delete
> > the existing device in the case of EEXISTS, then restart the probe.
> >
> 
> You're right, thanks!
> How about we unconditionally delete the device before the probe?. I'm
> simply adding one line without using NLM_F_EXCL:
> 
> +       dpif_netlink_rtnl_destroy(name);
>          error = dpif_netlink_rtnl_create(tnl_cfg, name, OVS_VPORT_TYPE_GENEVE,
>                                           "ovs_geneve",
>                                           (NLM_F_REQUEST | NLM_F_ACK
>                                            | NLM_F_CREATE));
> 
> I'm testing using in-tree or out-of-tree and the result is consistent.
> 

I think that will work.
William Tu Oct. 26, 2017, 6:29 p.m. UTC | #4
On Thu, Oct 26, 2017 at 11:15 AM, Eric Garver <e@erig.me> wrote:
> On Thu, Oct 26, 2017 at 09:12:14AM -0700, William Tu wrote:
>> On Wed, Oct 25, 2017 at 10:39 AM, Eric Garver <e@erig.me> wrote:
>> > On Tue, Oct 24, 2017 at 02:43:31PM -0700, William Tu wrote:
>> >
>> > Hi William,
>> > Thanks for taking a look at this.
>> >
>> >> When using the out-of-tree (openvswitch compat) geneve module,
>> >> the first time oot tunnel probing returns true (correct).
>> >> Without unloading the geneve module, if the userspace ovs-vswitchd
>> >> restarts, because the 'geneve_sys_6081' still exists, the probing
>> >> incorrectly returns false and loads the in-tree (upstream kernel)
>> >> geneve module.
>> >
>> > The reason for this is because rtnl sees the existing device and tries
>> > to call ->changelink, but since the out-of-tree tunnel has no handler it
>> > returns EOPNOTSUPP.
>> >
>> >> The patch fixes it by adding NLM_F_EXCL flags, so if geneve already
>> >> exists, the dpif_netlink_rtnl_create return EEXIST and the oot tunnel
>> >> probing returns true.  To reproduce the issue, start the ovs
>> >
>> > While this fixes this scenario, it breaks another; an existing device
>> > in-tree/kernel geneve tunnel. We'll end up with the opposite occurring -
>> > it returns true when it should return false.
>> >
>> > So in addition to adding NLM_F_EXCL, this needs to also try to delete
>> > the existing device in the case of EEXISTS, then restart the probe.
>> >
>>
>> You're right, thanks!
>> How about we unconditionally delete the device before the probe?. I'm
>> simply adding one line without using NLM_F_EXCL:
>>
>> +       dpif_netlink_rtnl_destroy(name);
>>          error = dpif_netlink_rtnl_create(tnl_cfg, name, OVS_VPORT_TYPE_GENEVE,
>>                                           "ovs_geneve",
>>                                           (NLM_F_REQUEST | NLM_F_ACK
>>                                            | NLM_F_CREATE));
>>
>> I'm testing using in-tree or out-of-tree and the result is consistent.
>>
>
> I think that will work.
thanks, I will send out v2.
diff mbox series

Patch

diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c
index 0c32e7d8ccb4..ab9737b8cc4b 100644
--- a/lib/dpif-netlink-rtnl.c
+++ b/lib/dpif-netlink-rtnl.c
@@ -451,7 +451,7 @@  dpif_netlink_rtnl_probe_oot_tunnels(void)
         error = dpif_netlink_rtnl_create(tnl_cfg, name, OVS_VPORT_TYPE_GENEVE,
                                          "ovs_geneve",
                                          (NLM_F_REQUEST | NLM_F_ACK
-                                          | NLM_F_CREATE));
+                                          | NLM_F_CREATE | NLM_F_EXCL));
         if (error != EOPNOTSUPP) {
             if (!error) {
                 dpif_netlink_rtnl_destroy(name);