Message ID | 1575525907-63828-1-git-send-email-u9012063@gmail.com |
---|---|
State | Rejected |
Headers | show |
Series | [ovs-dev] netdev-afxdp: Add delay when reconfiguring xsk. | expand |
On Wed, Dec 4, 2019 at 10:06 PM William Tu <u9012063@gmail.com> wrote: > > The patch works around an error when reconfigure the netdev-afxdp > device into different modes. Currently, when OVS destroy xsk, the > xsk_destruct() in linux kernel calls xdp_put_umem() and defers > calling xdp_umem_release_deferred(). > > This creates an -EBUSY error when xsk_socket__create() calls > xsk_bind() > xdp_umem_assign_dev() > xdp_get_umem_from_qid() > > And this is because xdp_clear_umem_at_qid() is deferred so not > clearing umem yet. The issue can be reproduced by just changing > xdp-mode multiple times, ex: > ovs-vsctl -- set interface afxdp-p0 \ > options:n_rxq=1 type="afxdp" options:xdp-mode=best-effort > ovs-vsctl -- set interface afxdp-p0 \ > options:n_rxq=1 type="afxdp" options:xdp-mode=generic > > The patch fixes it by adding a delay, hopefully the umem has > been properly cleanup. > > Signed-off-by: William Tu <u9012063@gmail.com> > --- > lib/netdev-afxdp.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c > index ca2dfd005b9f..84897f1fd025 100644 > --- a/lib/netdev-afxdp.c > +++ b/lib/netdev-afxdp.c > @@ -661,6 +661,9 @@ netdev_afxdp_reconfigure(struct netdev *netdev) > struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY}; > int err = 0; > > + /* Time for umem to be released in kernel. */ > + xnanosleep(500000); > + > ovs_mutex_lock(&dev->mutex); > > if (netdev->n_rxq == dev->requested_n_rxq > -- > 2.7.4 > Now I think it's a bad solution. Let me work on other way to fix it. Regards, William
On Thu, Dec 5, 2019 at 5:43 AM William Tu <u9012063@gmail.com> wrote: > > On Wed, Dec 4, 2019 at 10:06 PM William Tu <u9012063@gmail.com> wrote: > > > > The patch works around an error when reconfigure the netdev-afxdp > > device into different modes. Currently, when OVS destroy xsk, the > > xsk_destruct() in linux kernel calls xdp_put_umem() and defers > > calling xdp_umem_release_deferred(). > > > > This creates an -EBUSY error when xsk_socket__create() calls > > xsk_bind() > > xdp_umem_assign_dev() > > xdp_get_umem_from_qid() > > > > And this is because xdp_clear_umem_at_qid() is deferred so not > > clearing umem yet. The issue can be reproduced by just changing > > xdp-mode multiple times, ex: > > ovs-vsctl -- set interface afxdp-p0 \ > > options:n_rxq=1 type="afxdp" options:xdp-mode=best-effort > > ovs-vsctl -- set interface afxdp-p0 \ > > options:n_rxq=1 type="afxdp" options:xdp-mode=generic > > > > The patch fixes it by adding a delay, hopefully the umem has > > been properly cleanup. > > > > Signed-off-by: William Tu <u9012063@gmail.com> > > --- > > lib/netdev-afxdp.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c > > index ca2dfd005b9f..84897f1fd025 100644 > > --- a/lib/netdev-afxdp.c > > +++ b/lib/netdev-afxdp.c > > @@ -661,6 +661,9 @@ netdev_afxdp_reconfigure(struct netdev *netdev) > > struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY}; > > int err = 0; > > > > + /* Time for umem to be released in kernel. */ > > + xnanosleep(500000); > > + > > ovs_mutex_lock(&dev->mutex); > > > > if (netdev->n_rxq == dev->requested_n_rxq > > -- > > 2.7.4 > > > Now I think it's a bad solution. Let me work on other way > to fix it. > This is an issue due to XSK async cleanup. We can wait for fix later. https://lore.kernel.org/netdev/CALDO+SbciaNy5EReV5YHvciOSJmdMRPBQdF6XbhxfBF6gvPFDw@mail.gmail.com/T/#m10cd49235090496dab98c24837dd1a2fbcb2e187 William
On 6 Dec 2019, at 14:55, William Tu wrote: > On Thu, Dec 5, 2019 at 5:43 AM William Tu <u9012063@gmail.com> wrote: >> >> On Wed, Dec 4, 2019 at 10:06 PM William Tu <u9012063@gmail.com> >> wrote: >>> >>> The patch works around an error when reconfigure the netdev-afxdp >>> device into different modes. Currently, when OVS destroy xsk, the >>> xsk_destruct() in linux kernel calls xdp_put_umem() and defers >>> calling xdp_umem_release_deferred(). >>> >>> This creates an -EBUSY error when xsk_socket__create() calls >>> xsk_bind() >>> xdp_umem_assign_dev() >>> xdp_get_umem_from_qid() >>> >>> And this is because xdp_clear_umem_at_qid() is deferred so not >>> clearing umem yet. The issue can be reproduced by just changing >>> xdp-mode multiple times, ex: >>> ovs-vsctl -- set interface afxdp-p0 \ >>> options:n_rxq=1 type="afxdp" options:xdp-mode=best-effort >>> ovs-vsctl -- set interface afxdp-p0 \ >>> options:n_rxq=1 type="afxdp" options:xdp-mode=generic >>> >>> The patch fixes it by adding a delay, hopefully the umem has >>> been properly cleanup. >>> >>> Signed-off-by: William Tu <u9012063@gmail.com> >>> --- >>> lib/netdev-afxdp.c | 3 +++ >>> 1 file changed, 3 insertions(+) >>> >>> diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c >>> index ca2dfd005b9f..84897f1fd025 100644 >>> --- a/lib/netdev-afxdp.c >>> +++ b/lib/netdev-afxdp.c >>> @@ -661,6 +661,9 @@ netdev_afxdp_reconfigure(struct netdev *netdev) >>> struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY}; >>> int err = 0; >>> >>> + /* Time for umem to be released in kernel. */ >>> + xnanosleep(500000); >>> + >>> ovs_mutex_lock(&dev->mutex); >>> >>> if (netdev->n_rxq == dev->requested_n_rxq >>> -- >>> 2.7.4 >>> >> Now I think it's a bad solution. Let me work on other way >> to fix it. >> > This is an issue due to XSK async cleanup. > We can wait for fix later. > https://lore.kernel.org/netdev/CALDO+SbciaNy5EReV5YHvciOSJmdMRPBQdF6XbhxfBF6gvPFDw@mail.gmail.com/T/#m10cd49235090496dab98c24837dd1a2fbcb2e187 Waiting for the sync cleanup might make sense. Adding a xnanosleep() delay is not the way forward.
diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c index ca2dfd005b9f..84897f1fd025 100644 --- a/lib/netdev-afxdp.c +++ b/lib/netdev-afxdp.c @@ -661,6 +661,9 @@ netdev_afxdp_reconfigure(struct netdev *netdev) struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY}; int err = 0; + /* Time for umem to be released in kernel. */ + xnanosleep(500000); + ovs_mutex_lock(&dev->mutex); if (netdev->n_rxq == dev->requested_n_rxq
The patch works around an error when reconfigure the netdev-afxdp device into different modes. Currently, when OVS destroy xsk, the xsk_destruct() in linux kernel calls xdp_put_umem() and defers calling xdp_umem_release_deferred(). This creates an -EBUSY error when xsk_socket__create() calls xsk_bind() xdp_umem_assign_dev() xdp_get_umem_from_qid() And this is because xdp_clear_umem_at_qid() is deferred so not clearing umem yet. The issue can be reproduced by just changing xdp-mode multiple times, ex: ovs-vsctl -- set interface afxdp-p0 \ options:n_rxq=1 type="afxdp" options:xdp-mode=best-effort ovs-vsctl -- set interface afxdp-p0 \ options:n_rxq=1 type="afxdp" options:xdp-mode=generic The patch fixes it by adding a delay, hopefully the umem has been properly cleanup. Signed-off-by: William Tu <u9012063@gmail.com> --- lib/netdev-afxdp.c | 3 +++ 1 file changed, 3 insertions(+)