Message ID | 20200825172641.806912-1-drt@linux.ibm.com |
---|---|
State | Accepted |
Delegated to: | David Miller |
Series | [net,v3] ibmvnic fix NULL tx_pools and rx_tools issue at do_reset |
From: Dany Madden <drt@linux.ibm.com>
Date: Tue, 25 Aug 2020 13:26:41 -0400

> From: Mingming Cao <mmc@linux.vnet.ibm.com>
>
> At the time of do_reset, ibmvnic tries to re-initialize the tx_pools
> and rx_pools to avoid re-allocating the long term buffer. However,
> there is a window inside do_reset in which the tx_pools and
> rx_pools have been freed but not yet re-initialized, making it possible
> to dereference null pointers.
>
> This patch fixes the issue by always checking that the tx_pool
> and rx_pool are not NULL after ibmvnic_login. If either is NULL, the
> pools are re-allocated. This avoids calling reset_tx/rx_pools with a
> NULL adapter tx_pools/rx_pools pointer. Also add a null pointer check in
> reset_tx_pools and reset_rx_pools to safely handle the NULL pointer case.
>
> Signed-off-by: Mingming Cao <mmc@linux.vnet.ibm.com>
> Signed-off-by: Dany Madden <drt@linux.ibm.com>

Applied, but:

> +	if (!adapter->rx_pool)
> +		return -1;
> +

This driver has poor error code usage: it's a random mix of hypervisor
error codes, normal error codes like -EINVAL, and internal error codes,
sometimes all used in the same function.

For example:

static int ibmvnic_send_crq(struct ibmvnic_adapter *adapter,
			    union ibmvnic_crq *crq)
 ...
	if (!adapter->crq.active &&
	    crq->generic.first != IBMVNIC_CRQ_INIT_CMD) {
		dev_warn(dev, "Invalid request detected while CRQ is inactive, possible device state change during reset\n");
		return -EINVAL;
	}
 ...
	rc = plpar_hcall_norets(H_SEND_CRQ, ua,
				cpu_to_be64(u64_crq[0]),
				cpu_to_be64(u64_crq[1]));

	if (rc) {
		if (rc == H_CLOSED) {
 ...
		return rc;

So obviously this function returns a mix of negative error codes
and hypervisor codes such as H_CLOSED.

And stuff like:

	rc = __ibmvnic_open(netdev);
	if (rc)
		return IBMVNIC_OPEN_FAILED;
> On Aug 25, 2020, at 5:31 PM, David Miller <davem@davemloft.net> wrote:
>
> From: Dany Madden <drt@linux.ibm.com>
> Date: Tue, 25 Aug 2020 13:26:41 -0400
>
>> From: Mingming Cao <mmc@linux.vnet.ibm.com>
>>
>> At the time of do_reset, ibmvnic tries to re-initialize the tx_pools
>> and rx_pools to avoid re-allocating the long term buffer. However,
>> there is a window inside do_reset in which the tx_pools and
>> rx_pools have been freed but not yet re-initialized, making it possible
>> to dereference null pointers.
>>
>> This patch fixes the issue by always checking that the tx_pool
>> and rx_pool are not NULL after ibmvnic_login. If either is NULL, the
>> pools are re-allocated. This avoids calling reset_tx/rx_pools with a
>> NULL adapter tx_pools/rx_pools pointer. Also add a null pointer check in
>> reset_tx_pools and reset_rx_pools to safely handle the NULL pointer case.
>>
>> Signed-off-by: Mingming Cao <mmc@linux.vnet.ibm.com>
>> Signed-off-by: Dany Madden <drt@linux.ibm.com>
>
> Applied, but:
>
>> +	if (!adapter->rx_pool)
>> +		return -1;
>> +
>
> This driver has poor error code usage: it's a random mix of hypervisor
> error codes, normal error codes like -EINVAL, and internal error codes,
> sometimes all used in the same function.
>

Agreed, this needs to improve. For this fix, -1 was chosen to match the
other parts of the driver that check for a NULL pointer and return -1.
We should go through all of the -1 cases and replace them with proper
error codes. That should be a separate patch.

> For example:
>
> static int ibmvnic_send_crq(struct ibmvnic_adapter *adapter,
>			    union ibmvnic_crq *crq)
> ...
>	if (!adapter->crq.active &&
>	    crq->generic.first != IBMVNIC_CRQ_INIT_CMD) {
>		dev_warn(dev, "Invalid request detected while CRQ is inactive, possible device state change during reset\n");
>		return -EINVAL;
>	}
> ...
>	rc = plpar_hcall_norets(H_SEND_CRQ, ua,
>				cpu_to_be64(u64_crq[0]),
>				cpu_to_be64(u64_crq[1]));
>
>	if (rc) {
>		if (rc == H_CLOSED) {
> ...
>		return rc;
>
> So obviously this function returns a mix of negative error codes
> and hypervisor codes such as H_CLOSED.
>
> And stuff like:
>
>	rc = __ibmvnic_open(netdev);
>	if (rc)
>		return IBMVNIC_OPEN_FAILED;

Agree.

Mingming
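The cleanup discussed in this thread (one error convention per function) could be sketched roughly as follows. This is an illustration only, not driver code: every `demo_`/`DEMO_` name is hypothetical, the numeric status values are stand-ins rather than the real hypervisor constants, and the errno values chosen for each case are assumptions.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in status values for illustration; NOT the real hypervisor codes. */
enum demo_hcall_rc {
	DEMO_H_SUCCESS = 0,
	DEMO_H_CLOSED = 2,
	DEMO_H_OTHER = 99,
};

/* Hypothetical helper: translate a hypervisor status into 0 or -errno. */
static int demo_hcall_to_errno(long hrc)
{
	switch (hrc) {
	case DEMO_H_SUCCESS:
		return 0;
	case DEMO_H_CLOSED:
		return -ENOTCONN;	/* assumed mapping for a closed peer */
	default:
		return -EIO;		/* assumed catch-all */
	}
}

/*
 * Hypothetical sender modelled on the shape of the quoted function:
 * a state check that fails with -EINVAL, then a hypervisor call whose
 * status is translated before it is returned, so callers only ever
 * see 0 or a negative errno.
 */
static int demo_send(int crq_active, long hrc_from_hypervisor)
{
	if (!crq_active)
		return -EINVAL;

	return demo_hcall_to_errno(hrc_from_hypervisor);
}
```

With a single convention, a caller can test `rc < 0` without knowing which layer produced the failure.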
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 5afb3c9c52d2..d3a774331afc 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -479,6 +479,9 @@ static int reset_rx_pools(struct ibmvnic_adapter *adapter)
 	int i, j, rc;
 	u64 *size_array;
 
+	if (!adapter->rx_pool)
+		return -1;
+
 	size_array = (u64 *)((u8 *)(adapter->login_rsp_buf) +
 		be32_to_cpu(adapter->login_rsp_buf->off_rxadd_buff_size));
 
@@ -649,6 +652,9 @@ static int reset_tx_pools(struct ibmvnic_adapter *adapter)
 	int tx_scrqs;
 	int i, rc;
 
+	if (!adapter->tx_pool)
+		return -1;
+
 	tx_scrqs = be32_to_cpu(adapter->login_rsp_buf->num_txsubm_subcrqs);
 	for (i = 0; i < tx_scrqs; i++) {
 		rc = reset_one_tx_pool(adapter, &adapter->tso_pool[i]);
@@ -2011,7 +2017,10 @@ static int do_reset(struct ibmvnic_adapter *adapter,
 		    adapter->req_rx_add_entries_per_subcrq !=
 		    old_num_rx_slots ||
 		    adapter->req_tx_entries_per_subcrq !=
-		    old_num_tx_slots) {
+		    old_num_tx_slots ||
+		    !adapter->rx_pool ||
+		    !adapter->tso_pool ||
+		    !adapter->tx_pool) {
 			release_rx_pools(adapter);
 			release_tx_pools(adapter);
 			release_napi(adapter);
@@ -2024,10 +2033,14 @@ static int do_reset(struct ibmvnic_adapter *adapter,
 		} else {
 			rc = reset_tx_pools(adapter);
 			if (rc)
+				netdev_dbg(adapter->netdev, "reset tx pools failed (%d)\n",
+						rc);
 				goto out;
 
 			rc = reset_rx_pools(adapter);
 			if (rc)
+				netdev_dbg(adapter->netdev, "reset rx pools failed (%d)\n",
+						rc);
 				goto out;
 		}
 		ibmvnic_disable_irqs(adapter);
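The control flow the patch aims for can be modelled in a small stand-alone sketch: take the re-allocation path whenever any pool pointer is NULL, and otherwise reset the pools in place. The struct, the `demo_` functions and the `-EINVAL` return value are all hypothetical simplifications (the patch itself returns -1), and the real driver does considerably more work on each path. The sketch braces each error path so that the message and the `goto` stay together, since in an unbraced `if` only the first statement is conditional.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, stripped-down adapter: only what the sketch needs. */
struct demo_adapter {
	int *rx_pool, *tx_pool, *tso_pool;
	int rx_storage, tx_storage, tso_storage;	/* backing store */
	int reinit_count;				/* slow-path runs */
};

static int demo_reset_rx_pools(struct demo_adapter *a)
{
	if (!a->rx_pool)
		return -EINVAL;	/* guard against a freed pool */
	*a->rx_pool = 0;
	return 0;
}

static int demo_reset_tx_pools(struct demo_adapter *a)
{
	if (!a->tx_pool)
		return -EINVAL;
	*a->tx_pool = 0;
	return 0;
}

/* Stand-in for the release-and-reallocate path. */
static int demo_reinit_pools(struct demo_adapter *a)
{
	a->rx_pool = &a->rx_storage;
	a->tx_pool = &a->tx_storage;
	a->tso_pool = &a->tso_storage;
	a->reinit_count++;
	return 0;
}

static int demo_do_reset(struct demo_adapter *a, int sizes_changed)
{
	int rc;

	if (sizes_changed || !a->rx_pool || !a->tso_pool || !a->tx_pool) {
		rc = demo_reinit_pools(a);
	} else {
		rc = demo_reset_tx_pools(a);
		if (rc) {
			fprintf(stderr, "reset tx pools failed (%d)\n", rc);
			goto out;
		}

		rc = demo_reset_rx_pools(a);
		if (rc) {
			fprintf(stderr, "reset rx pools failed (%d)\n", rc);
			goto out;
		}
	}
out:
	return rc;
}
```

Calling `demo_do_reset` on a zero-initialized adapter takes the re-allocation path once; later calls with unchanged sizes reuse the existing pools.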