Message ID | 20200902094145.12216-4-luobin9@huawei.com |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Series | hinic: BugFixes |
On 9/2/20 2:41 AM, Luo bin wrote:
> When calling hinic_close in hinic_set_channels, netif_carrier_off
> and netif_tx_disable are executed, and TX host resources are freed
> after that. Core may call hinic_xmit_frame to send pkt after
> netif_tx_disable within a short time, so we should judge whether
> carrier is on before sending pkt otherwise the resources that
> have already been freed in hinic_close may be accessed.
>
> Fixes: 2eed5a8b614b ("hinic: add set_channels ethtool_ops support")
> Signed-off-by: Luo bin <luobin9@huawei.com>
> ---
>  drivers/net/ethernet/huawei/hinic/hinic_tx.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/net/ethernet/huawei/hinic/hinic_tx.c b/drivers/net/ethernet/huawei/hinic/hinic_tx.c
> index a97498ee6914..a0662552a39c 100644
> --- a/drivers/net/ethernet/huawei/hinic/hinic_tx.c
> +++ b/drivers/net/ethernet/huawei/hinic/hinic_tx.c
> @@ -531,6 +531,11 @@ netdev_tx_t hinic_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
>  	struct hinic_txq *txq;
>  	struct hinic_qp *qp;
>
> +	if (unlikely(!netif_carrier_ok(netdev))) {
> +		dev_kfree_skb_any(skb);
> +		return NETDEV_TX_OK;
> +	}
> +
>  	txq = &nic_dev->txqs[q_id];
>  	qp = container_of(txq->sq, struct hinic_qp, sq);
>

Adding this kind of test in the fast path seems like a big hammer to me.

See https://marc.info/?l=linux-netdev&m=159903844423389&w=2 for a similar problem.

Normally, after the hinic_close() operation, no packet should be sent by the core networking stack.

Trying to work around some core networking issue in each driver is a dead end.
From: Luo bin <luobin9@huawei.com>
Date: Wed, 2 Sep 2020 17:41:45 +0800

> @@ -531,6 +531,11 @@ netdev_tx_t hinic_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
>  	struct hinic_txq *txq;
>  	struct hinic_qp *qp;
>
> +	if (unlikely(!netif_carrier_ok(netdev))) {
> +		dev_kfree_skb_any(skb);
> +		return NETDEV_TX_OK;
> +	}

As Eric said, these kinds of tests should not be placed in the fast path
of the driver.

If you invoke close and the core networking still sends packets to the
driver, that's a bug that needs to be fixed in the core networking.
On 2020/9/2 18:16, Eric Dumazet wrote:
> On 9/2/20 2:41 AM, Luo bin wrote:
>> When calling hinic_close in hinic_set_channels, netif_carrier_off
>> and netif_tx_disable are executed, and TX host resources are freed
>> after that. Core may call hinic_xmit_frame to send pkt after
>> netif_tx_disable within a short time, so we should judge whether
>> carrier is on before sending pkt otherwise the resources that
>> have already been freed in hinic_close may be accessed.
>>
>> Fixes: 2eed5a8b614b ("hinic: add set_channels ethtool_ops support")
>> Signed-off-by: Luo bin <luobin9@huawei.com>
>> ---
>>  drivers/net/ethernet/huawei/hinic/hinic_tx.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/huawei/hinic/hinic_tx.c b/drivers/net/ethernet/huawei/hinic/hinic_tx.c
>> index a97498ee6914..a0662552a39c 100644
>> --- a/drivers/net/ethernet/huawei/hinic/hinic_tx.c
>> +++ b/drivers/net/ethernet/huawei/hinic/hinic_tx.c
>> @@ -531,6 +531,11 @@ netdev_tx_t hinic_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
>>  	struct hinic_txq *txq;
>>  	struct hinic_qp *qp;
>>
>> +	if (unlikely(!netif_carrier_ok(netdev))) {
>> +		dev_kfree_skb_any(skb);
>> +		return NETDEV_TX_OK;
>> +	}
>> +
>>  	txq = &nic_dev->txqs[q_id];
>>  	qp = container_of(txq->sq, struct hinic_qp, sq);
>>
>
> Adding this kind of test in the fast path seems like a big hammer to me.
>
> See https://marc.info/?l=linux-netdev&m=159903844423389&w=2 for a similar problem.
>
> Normally, after the hinic_close() operation, no packet should be sent by the core networking stack.
>
> Trying to work around some core networking issue in each driver is a dead end.

Thanks for your review. I agree with what you said. In theory, the core
cannot call ndo_start_xmit to send a packet after hinic_close has called
netif_tx_disable, because the __QUEUE_STATE_DRV_XOFF bit is set and that
bit is protected by __netif_tx_lock. However, my debug messages show that
hinic_xmit_frame is still called after netif_tx_disable.

I'll try to figure out why and fix it. It seems that the patch from
https://marc.info/?l=linux-netdev&m=159903844423389&w=2 does not fix this
problem.
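The contract described in the reply above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: the names `model_txq`, `model_tx_disable`, `model_close` and `model_core_xmit` are hypothetical, `drv_xoff` stands in for `__QUEUE_STATE_DRV_XOFF`, and a C11 `atomic_flag` spinlock stands in for `__netif_tx_lock`. It assumes, as the reply states, that the core tests the stop bit while holding the queue lock before handing a packet to the driver.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for one TX queue; not a kernel structure. */
struct model_txq {
	atomic_flag lock;      /* models the per-queue xmit lock */
	bool drv_xoff;         /* models __QUEUE_STATE_DRV_XOFF */
	bool resources_valid;  /* models the driver's TX host resources */
	unsigned long sent;    /* packets accepted by the model driver */
};

/* Queue starts unlocked, running, with valid resources. */
#define MODEL_TXQ_INIT { ATOMIC_FLAG_INIT, false, true, 0 }

static void model_tx_lock(struct model_txq *q)
{
	/* Spin until the flag was previously clear, i.e. we own the lock. */
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		;
}

static void model_tx_unlock(struct model_txq *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Models disabling TX on one queue: set the stop bit under the lock. */
static void model_tx_disable(struct model_txq *q)
{
	model_tx_lock(q);
	q->drv_xoff = true;
	model_tx_unlock(q);
}

/* Models the close path: stop the queue first, then free TX resources. */
static void model_close(struct model_txq *q)
{
	model_tx_disable(q);
	q->resources_valid = false;
}

/*
 * Models the core's transmit step. The stop bit is tested under the same
 * lock that model_tx_disable() takes, so once model_tx_disable() has
 * returned, the model driver is never entered again.
 * Returns true if the packet was handed to the model driver.
 */
static bool model_core_xmit(struct model_txq *q)
{
	bool handed = false;

	model_tx_lock(q);
	if (!q->drv_xoff) {
		/* Reaching the driver with freed resources would be the bug. */
		assert(q->resources_valid);
		q->sent++;
		handed = true;
	}
	model_tx_unlock(q);
	return handed;
}
```

In this model `model_core_xmit()` succeeds before `model_close()` and is refused afterwards, which is the behaviour the thread expects from the real stack. The debug output reported above suggests the real sequence deviates from this somewhere; the model does not attempt to explain where.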
On 2020/9/3 3:52, David Miller wrote:
> From: Luo bin <luobin9@huawei.com>
> Date: Wed, 2 Sep 2020 17:41:45 +0800
>
>> @@ -531,6 +531,11 @@ netdev_tx_t hinic_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
>>  	struct hinic_txq *txq;
>>  	struct hinic_qp *qp;
>>
>> +	if (unlikely(!netif_carrier_ok(netdev))) {
>> +		dev_kfree_skb_any(skb);
>> +		return NETDEV_TX_OK;
>> +	}
>
> As Eric said, these kinds of tests should not be placed in the fast path
> of the driver.
>
> If you invoke close and the core networking still sends packets to the
> driver, that's a bug that needs to be fixed in the core networking.

Okay. I'm trying to figure out why the core networking can still call
ndo_start_xmit after netif_tx_disable, so that the problem can be fixed
at its root. I'll withdraw this patch for now.
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_tx.c b/drivers/net/ethernet/huawei/hinic/hinic_tx.c
index a97498ee6914..a0662552a39c 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_tx.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_tx.c
@@ -531,6 +531,11 @@ netdev_tx_t hinic_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
 	struct hinic_txq *txq;
 	struct hinic_qp *qp;
 
+	if (unlikely(!netif_carrier_ok(netdev))) {
+		dev_kfree_skb_any(skb);
+		return NETDEV_TX_OK;
+	}
+
 	txq = &nic_dev->txqs[q_id];
 	qp = container_of(txq->sq, struct hinic_qp, sq);
When calling hinic_close in hinic_set_channels, netif_carrier_off
and netif_tx_disable are executed, and TX host resources are freed
after that. Core may call hinic_xmit_frame to send pkt after
netif_tx_disable within a short time, so we should judge whether
carrier is on before sending pkt otherwise the resources that
have already been freed in hinic_close may be accessed.

Fixes: 2eed5a8b614b ("hinic: add set_channels ethtool_ops support")
Signed-off-by: Luo bin <luobin9@huawei.com>
---
 drivers/net/ethernet/huawei/hinic/hinic_tx.c | 5 +++++
 1 file changed, 5 insertions(+)