Message ID | 1508881411-36992-1-git-send-email-u9012063@gmail.com |
---|---|
State | Changes Requested |
Headers | show |
Series | [ovs-dev] dpif-netlink-rtnl: Fix ovs_geneve probing after restart | expand |
On Tue, Oct 24, 2017 at 02:43:31PM -0700, William Tu wrote: Hi William, Thanks for taking a look at this. > When using the out-of-tree (openvswitch compat) geneve module, > the first time oot tunnel probing returns true (correct). > Without unloading the geneve module, if the userspace ovs-vswitchd > restarts, because the 'geneve_sys_6081' still exists, the probing > incorrectly returns false and loads the in-tree (upstream kernel) > geneve module. The reason for this is because rtnl sees the existing device and tries to call ->changelink, but since the out-of-tree tunnel has no handler it returns EOPNOTSUPP. > The patch fixes it by adding NLM_F_EXCL flags, so if geneve already > exists, the dpif_netlink_rtnl_create return EEXIST and the oot tunnel > probing returns true. To reproduce the issue, start the ovs While this fixes this scenario, it breaks another; an existing device in-tree/kernel geneve tunnel. We'll end up with the opposite occurring - it returns true when it should return false. So in addition to adding NLM_F_EXCL, this needs to also try to delete the existing device in the case of EEXISTS, then restart the probe. > > /etc/init.d/openvswitch-switch start > > creat a bridge and attach a geneve port using out-of-tree geneve > > /etc/init.d/openvswitch-switch restart > > Fixes: 921c370a9df5 ("dpif-netlink: Probe for out-of-tree tunnels, decides used interface") > Signed-off-by: William Tu <u9012063@gmail.com> > Cc: Eric Garver <e@erig.me> > Cc: Gurucharan Shetty <guru@ovn.org> > --- > lib/dpif-netlink-rtnl.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c > index 0c32e7d8ccb4..ab9737b8cc4b 100644 > --- a/lib/dpif-netlink-rtnl.c > +++ b/lib/dpif-netlink-rtnl.c > @@ -451,7 +451,7 @@ dpif_netlink_rtnl_probe_oot_tunnels(void) > error = dpif_netlink_rtnl_create(tnl_cfg, name, OVS_VPORT_TYPE_GENEVE, > "ovs_geneve", > (NLM_F_REQUEST | NLM_F_ACK > - | NLM_F_CREATE)); > + | NLM_F_CREATE | NLM_F_EXCL)); > if (error != EOPNOTSUPP) { > if (!error) { > dpif_netlink_rtnl_destroy(name); > -- > 2.7.4 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
On Wed, Oct 25, 2017 at 10:39 AM, Eric Garver <e@erig.me> wrote: > On Tue, Oct 24, 2017 at 02:43:31PM -0700, William Tu wrote: > > Hi William, > Thanks for taking a look at this. > >> When using the out-of-tree (openvswitch compat) geneve module, >> the first time oot tunnel probing returns true (correct). >> Without unloading the geneve module, if the userspace ovs-vswitchd >> restarts, because the 'geneve_sys_6081' still exists, the probing >> incorrectly returns false and loads the in-tree (upstream kernel) >> geneve module. > > The reason for this is because rtnl sees the existing device and tries > to call ->changelink, but since the out-of-tree tunnel has no handler it > returns EOPNOTSUPP. > >> The patch fixes it by adding NLM_F_EXCL flags, so if geneve already >> exists, the dpif_netlink_rtnl_create return EEXIST and the oot tunnel >> probing returns true. To reproduce the issue, start the ovs > > While this fixes this scenario, it breaks another; an existing device > in-tree/kernel geneve tunnel. We'll end up with the opposite occurring - > it returns true when it should return false. > > So in addition to adding NLM_F_EXCL, this needs to also try to delete > the existing device in the case of EEXISTS, then restart the probe. > You're right, thanks! How about we unconditionally delete the device before the probe?. I'm simply adding one line without using NLM_F_EXCL: + dpif_netlink_rtnl_destroy(name); error = dpif_netlink_rtnl_create(tnl_cfg, name, OVS_VPORT_TYPE_GENEVE, "ovs_geneve", (NLM_F_REQUEST | NLM_F_ACK | NLM_F_CREATE)); I'm testing using in-tree or out-of-tree and the result is consistent. Regards, William
On Thu, Oct 26, 2017 at 09:12:14AM -0700, William Tu wrote: > On Wed, Oct 25, 2017 at 10:39 AM, Eric Garver <e@erig.me> wrote: > > On Tue, Oct 24, 2017 at 02:43:31PM -0700, William Tu wrote: > > > > Hi William, > > Thanks for taking a look at this. > > > >> When using the out-of-tree (openvswitch compat) geneve module, > >> the first time oot tunnel probing returns true (correct). > >> Without unloading the geneve module, if the userspace ovs-vswitchd > >> restarts, because the 'geneve_sys_6081' still exists, the probing > >> incorrectly returns false and loads the in-tree (upstream kernel) > >> geneve module. > > > > The reason for this is because rtnl sees the existing device and tries > > to call ->changelink, but since the out-of-tree tunnel has no handler it > > returns EOPNOTSUPP. > > > >> The patch fixes it by adding NLM_F_EXCL flags, so if geneve already > >> exists, the dpif_netlink_rtnl_create return EEXIST and the oot tunnel > >> probing returns true. To reproduce the issue, start the ovs > > > > While this fixes this scenario, it breaks another; an existing device > > in-tree/kernel geneve tunnel. We'll end up with the opposite occurring - > > it returns true when it should return false. > > > > So in addition to adding NLM_F_EXCL, this needs to also try to delete > > the existing device in the case of EEXISTS, then restart the probe. > > > > You're right, thanks! > How about we unconditionally delete the device before the probe?. I'm > simply adding one line without using NLM_F_EXCL: > > + dpif_netlink_rtnl_destroy(name); > error = dpif_netlink_rtnl_create(tnl_cfg, name, OVS_VPORT_TYPE_GENEVE, > "ovs_geneve", > (NLM_F_REQUEST | NLM_F_ACK > | NLM_F_CREATE)); > > I'm testing using in-tree or out-of-tree and the result is consistent. > I think that will work.
On Thu, Oct 26, 2017 at 11:15 AM, Eric Garver <e@erig.me> wrote: > On Thu, Oct 26, 2017 at 09:12:14AM -0700, William Tu wrote: >> On Wed, Oct 25, 2017 at 10:39 AM, Eric Garver <e@erig.me> wrote: >> > On Tue, Oct 24, 2017 at 02:43:31PM -0700, William Tu wrote: >> > >> > Hi William, >> > Thanks for taking a look at this. >> > >> >> When using the out-of-tree (openvswitch compat) geneve module, >> >> the first time oot tunnel probing returns true (correct). >> >> Without unloading the geneve module, if the userspace ovs-vswitchd >> >> restarts, because the 'geneve_sys_6081' still exists, the probing >> >> incorrectly returns false and loads the in-tree (upstream kernel) >> >> geneve module. >> > >> > The reason for this is because rtnl sees the existing device and tries >> > to call ->changelink, but since the out-of-tree tunnel has no handler it >> > returns EOPNOTSUPP. >> > >> >> The patch fixes it by adding NLM_F_EXCL flags, so if geneve already >> >> exists, the dpif_netlink_rtnl_create return EEXIST and the oot tunnel >> >> probing returns true. To reproduce the issue, start the ovs >> > >> > While this fixes this scenario, it breaks another; an existing device >> > in-tree/kernel geneve tunnel. We'll end up with the opposite occurring - >> > it returns true when it should return false. >> > >> > So in addition to adding NLM_F_EXCL, this needs to also try to delete >> > the existing device in the case of EEXISTS, then restart the probe. >> > >> >> You're right, thanks! >> How about we unconditionally delete the device before the probe?. I'm >> simply adding one line without using NLM_F_EXCL: >> >> + dpif_netlink_rtnl_destroy(name); >> error = dpif_netlink_rtnl_create(tnl_cfg, name, OVS_VPORT_TYPE_GENEVE, >> "ovs_geneve", >> (NLM_F_REQUEST | NLM_F_ACK >> | NLM_F_CREATE)); >> >> I'm testing using in-tree or out-of-tree and the result is consistent. >> > > I think that will work. thanks, I will send out v2.
diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c index 0c32e7d8ccb4..ab9737b8cc4b 100644 --- a/lib/dpif-netlink-rtnl.c +++ b/lib/dpif-netlink-rtnl.c @@ -451,7 +451,7 @@ dpif_netlink_rtnl_probe_oot_tunnels(void) error = dpif_netlink_rtnl_create(tnl_cfg, name, OVS_VPORT_TYPE_GENEVE, "ovs_geneve", (NLM_F_REQUEST | NLM_F_ACK - | NLM_F_CREATE)); + | NLM_F_CREATE | NLM_F_EXCL)); if (error != EOPNOTSUPP) { if (!error) { dpif_netlink_rtnl_destroy(name);