Message ID | 20240612143255.63973-2-ktraynor@redhat.com |
---|---|
State | Accepted |
Commit | 639fcf200517e757fbf651f870080c01629cbd74 |
Delegated to: | Kevin Traynor |
Headers | show |
Series | netdev-dpdk: Check pending reset when adding device. | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
On Wed, Jun 12, 2024 at 4:33 PM Kevin Traynor <ktraynor@redhat.com> wrote: > > When a device reset interrupt event (RTE_ETH_EVENT_INTR_RESET) > is detected for a DPDK device added to OVS, a device reset is > performed. > > If a device reset interrupt event is detected for a device before > it is added to OVS, device reset is not called. > > If that device is later attempted to be added to OVS, it may fail > while being configured if it is still pending a reset as pending > reset is not checked when adding a device. > > A simple way to force a reset event from the ice driver for an > iavf device is to set the mac address after binding iavf dev to > vfio but before adding to OVS. (note: should not be set like this > in normal case). e.g. > > $ echo 2 > /sys/class/net/ens3f0/device/sriov_numvfs > $ ./devbind.py -b vfio-pci 0000:d8:01.1 > $ ip link set ens3f0 vf 1 mac 26:ab:e6:6f:79:4d > $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk \ > options:dpdk-devargs=0000:d8:01.1 > > |dpdk|ERR|Port1 dev_configure = -1 > |netdev_dpdk|WARN|Interface dpdk0 eth_dev setup error > Operation not permitted > |netdev_dpdk|ERR|Interface dpdk0(rxq:1 txq:5 lsc interrupt mode:false) > configure error: Operation not permitted > |dpif_netdev|ERR|Failed to set interface dpdk0 new configuration > > Add a check if there was any previous device reset interrupt events > when a device is added to OVS. If there was, perform the reset > before continuing with the rest of the configuration. > > netdev_dpdk_pending_reset[] already tracks device reset interrupt > events for all devices, so it can be reused to check if there is a > reset needed during configuration of newly added devices. By extending > it's usage, dev->reset_needed is no longer needed. > Maybe add a Fixes: and I think we should backport this. > Signed-off-by: Kevin Traynor <ktraynor@redhat.com> > --- > lib/netdev-dpdk.c | 20 ++++++++++++-------- > 1 file changed, 12 insertions(+), 8 deletions(-) > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c > index 0fa37d514..9ceb0d972 100644 > --- a/lib/netdev-dpdk.c > +++ b/lib/netdev-dpdk.c > @@ -465,7 +465,6 @@ struct netdev_dpdk { > /* If true, rte_eth_dev_start() was successfully called */ > bool started; > - bool reset_needed; > - /* 1 pad byte here. */ > struct eth_addr hwaddr; > + /* 2 pad byte here. */ Nit: bytes? > int mtu; > int socket_id; > @@ -1532,5 +1531,4 @@ common_construct(struct netdev *netdev, dpdk_port_t port_no, > dev->attached = false; > dev->started = false; > - dev->reset_needed = false; > > ovsrcu_init(&dev->qos_conf, NULL); > @@ -2155,5 +2153,4 @@ netdev_dpdk_run(const struct netdev_class *netdev_class OVS_UNUSED) > continue; > } > - atomic_store_relaxed(&netdev_dpdk_pending_reset[port_id], false); > > ovs_mutex_lock(&dpdk_mutex); > @@ -2161,5 +2158,4 @@ netdev_dpdk_run(const struct netdev_class *netdev_class OVS_UNUSED) > if (dev) { > ovs_mutex_lock(&dev->mutex); > - dev->reset_needed = true; > netdev_request_reconfigure(&dev->up); > VLOG_DBG_RL(&rl, "%s: Device reset requested.", > @@ -6073,4 +6069,5 @@ netdev_dpdk_reconfigure(struct netdev *netdev) > { > struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); > + bool pending_reset; > bool try_rx_steer; > int err = 0; > @@ -6084,4 +6081,7 @@ netdev_dpdk_reconfigure(struct netdev *netdev) > } > > + atomic_read_relaxed(&netdev_dpdk_pending_reset[dev->port_id], > + &pending_reset); > + > if (netdev->n_txq == dev->requested_n_txq > && netdev->n_rxq == dev->requested_n_rxq > @@ -6093,5 +6093,5 @@ netdev_dpdk_reconfigure(struct netdev *netdev) > && eth_addr_equals(dev->hwaddr, dev->requested_hwaddr) > && dev->socket_id == dev->requested_socket_id > - && dev->started && !dev->reset_needed) { > + && dev->started && !pending_reset) { > /* Reconfiguration is unnecessary */ > > @@ -6102,8 +6102,12 @@ retry: > dpdk_rx_steer_unconfigure(dev); > > - if (dev->reset_needed) { > + if (pending_reset) { > + /* > + * Set false before reset to avoid missing a new reset interrupt event > + * in a race with event callback. > + */ > + atomic_store_relaxed(&netdev_dpdk_pending_reset[dev->port_id], false); > rte_eth_dev_reset(dev->port_id); > if_notifier_manual_report(); > - dev->reset_needed = false; > } else { > rte_eth_dev_stop(dev->port_id); The patch looks good to me. Reviewed-by: David Marchand <david.marchand@redhat.com>
On 6/13/24 11:40, David Marchand wrote: > On Wed, Jun 12, 2024 at 4:33 PM Kevin Traynor <ktraynor@redhat.com> wrote: >> >> When a device reset interrupt event (RTE_ETH_EVENT_INTR_RESET) >> is detected for a DPDK device added to OVS, a device reset is >> performed. >> >> If a device reset interrupt event is detected for a device before >> it is added to OVS, device reset is not called. >> >> If that device is later attempted to be added to OVS, it may fail >> while being configured if it is still pending a reset as pending >> reset is not checked when adding a device. >> >> A simple way to force a reset event from the ice driver for an >> iavf device is to set the mac address after binding iavf dev to >> vfio but before adding to OVS. (note: should not be set like this >> in normal case). e.g. >> >> $ echo 2 > /sys/class/net/ens3f0/device/sriov_numvfs >> $ ./devbind.py -b vfio-pci 0000:d8:01.1 >> $ ip link set ens3f0 vf 1 mac 26:ab:e6:6f:79:4d >> $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk \ >> options:dpdk-devargs=0000:d8:01.1 >> >> |dpdk|ERR|Port1 dev_configure = -1 >> |netdev_dpdk|WARN|Interface dpdk0 eth_dev setup error >> Operation not permitted >> |netdev_dpdk|ERR|Interface dpdk0(rxq:1 txq:5 lsc interrupt mode:false) >> configure error: Operation not permitted >> |dpif_netdev|ERR|Failed to set interface dpdk0 new configuration >> >> Add a check if there was any previous device reset interrupt events >> when a device is added to OVS. If there was, perform the reset >> before continuing with the rest of the configuration. >> >> netdev_dpdk_pending_reset[] already tracks device reset interrupt >> events for all devices, so it can be reused to check if there is a >> reset needed during configuration of newly added devices. By extending >> it's usage, dev->reset_needed is no longer needed. >> > > Maybe add a Fixes: and I think we should backport this. > >> Signed-off-by: Kevin Traynor <ktraynor@redhat.com> >> --- >> lib/netdev-dpdk.c | 20 ++++++++++++-------- >> 1 file changed, 12 insertions(+), 8 deletions(-) > > The patch looks good to me. > Reviewed-by: David Marchand <david.marchand@redhat.com> Thanks, Kevin and David! I fixed the nits and applied the change. Also backported to branch-3.3. Best regards, Ilya Maximets.
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 0fa37d514..9ceb0d972 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -465,7 +465,6 @@ struct netdev_dpdk { /* If true, rte_eth_dev_start() was successfully called */ bool started; - bool reset_needed; - /* 1 pad byte here. */ struct eth_addr hwaddr; + /* 2 pad byte here. */ int mtu; int socket_id; @@ -1532,5 +1531,4 @@ common_construct(struct netdev *netdev, dpdk_port_t port_no, dev->attached = false; dev->started = false; - dev->reset_needed = false; ovsrcu_init(&dev->qos_conf, NULL); @@ -2155,5 +2153,4 @@ netdev_dpdk_run(const struct netdev_class *netdev_class OVS_UNUSED) continue; } - atomic_store_relaxed(&netdev_dpdk_pending_reset[port_id], false); ovs_mutex_lock(&dpdk_mutex); @@ -2161,5 +2158,4 @@ netdev_dpdk_run(const struct netdev_class *netdev_class OVS_UNUSED) if (dev) { ovs_mutex_lock(&dev->mutex); - dev->reset_needed = true; netdev_request_reconfigure(&dev->up); VLOG_DBG_RL(&rl, "%s: Device reset requested.", @@ -6073,4 +6069,5 @@ netdev_dpdk_reconfigure(struct netdev *netdev) { struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); + bool pending_reset; bool try_rx_steer; int err = 0; @@ -6084,4 +6081,7 @@ netdev_dpdk_reconfigure(struct netdev *netdev) } + atomic_read_relaxed(&netdev_dpdk_pending_reset[dev->port_id], + &pending_reset); + if (netdev->n_txq == dev->requested_n_txq && netdev->n_rxq == dev->requested_n_rxq @@ -6093,5 +6093,5 @@ netdev_dpdk_reconfigure(struct netdev *netdev) && eth_addr_equals(dev->hwaddr, dev->requested_hwaddr) && dev->socket_id == dev->requested_socket_id - && dev->started && !dev->reset_needed) { + && dev->started && !pending_reset) { /* Reconfiguration is unnecessary */ @@ -6102,8 +6102,12 @@ retry: dpdk_rx_steer_unconfigure(dev); - if (dev->reset_needed) { + if (pending_reset) { + /* + * Set false before reset to avoid missing a new reset interrupt event + * in a race with event callback. + */ + atomic_store_relaxed(&netdev_dpdk_pending_reset[dev->port_id], false); rte_eth_dev_reset(dev->port_id); if_notifier_manual_report(); - dev->reset_needed = false; } else { rte_eth_dev_stop(dev->port_id);
When a device reset interrupt event (RTE_ETH_EVENT_INTR_RESET) is detected for a DPDK device added to OVS, a device reset is performed. If a device reset interrupt event is detected for a device before it is added to OVS, device reset is not called. If that device is later attempted to be added to OVS, it may fail while being configured if it is still pending a reset as pending reset is not checked when adding a device. A simple way to force a reset event from the ice driver for an iavf device is to set the mac address after binding iavf dev to vfio but before adding to OVS. (note: should not be set like this in normal case). e.g. $ echo 2 > /sys/class/net/ens3f0/device/sriov_numvfs $ ./devbind.py -b vfio-pci 0000:d8:01.1 $ ip link set ens3f0 vf 1 mac 26:ab:e6:6f:79:4d $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk \ options:dpdk-devargs=0000:d8:01.1 |dpdk|ERR|Port1 dev_configure = -1 |netdev_dpdk|WARN|Interface dpdk0 eth_dev setup error Operation not permitted |netdev_dpdk|ERR|Interface dpdk0(rxq:1 txq:5 lsc interrupt mode:false) configure error: Operation not permitted |dpif_netdev|ERR|Failed to set interface dpdk0 new configuration Add a check if there was any previous device reset interrupt events when a device is added to OVS. If there was, perform the reset before continuing with the rest of the configuration. netdev_dpdk_pending_reset[] already tracks device reset interrupt events for all devices, so it can be reused to check if there is a reset needed during configuration of newly added devices. By extending it's usage, dev->reset_needed is no longer needed. Signed-off-by: Kevin Traynor <ktraynor@redhat.com> --- lib/netdev-dpdk.c | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-)