Message ID | 20231130075247.3078931-1-quan@os.amperecomputing.com |
---|---|
State | New |
Headers | show |
Series | mctp i2c: Requeue the packet when arbitration is lost | expand |
Hi Quan, > If arbitration is lost, __i2c_transfer() returns -EAGAIN and the > packet should be resent. > > Requeue the packet and increase collisions count on this case. Are you sure you want to re-queue the packet here? The i2c core would have already retried on arbitration loss: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/i2c-core-base.c#n2223 With this change, we would be disregarding the limits in adap->retries and/or adap->timeout. Cheers, Jeremy
On 30/11/2023 15:03, Jeremy Kerr wrote: > Hi Quan, > >> If arbitration is lost, __i2c_transfer() returns -EAGAIN and the >> packet should be resent. >> >> Requeue the packet and increase collisions count on this case. > > Are you sure you want to re-queue the packet here? The i2c core would > have already retried on arbitration loss: > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/i2c-core-base.c#n2223 > > With this change, we would be disregarding the limits in adap->retries > and/or adap->timeout. > Yes, the __i2c_transfer() does try its best to send the packet but if it still fail with -EAGAIN, that means it still unable to win the arbitration. Without this patch we usually see the error below. The consequence is it stop and need the PLDM layer to retry when timed out, which takes time. "[ 61.616210] i2c i2c-3: __i2c_transfer failed -11" With this commit, we all see the packets go through peacefully just by requeueing the packets at MCTP layer and eliminated the need to retry in PLDM layer which would need more time. The result in our test is as below (Note: would needs some addition debug print code). I also think it is worth noting that in our setup there are multiple Masters. [73183.511487] i2c i2c-3: Arbitration loss, re-queue the packet [73192.550519] i2c i2c-3: __i2c_transfer failed -11 [73192.555734] i2c i2c-3: Arbitration loss, re-queue the packet [73207.250036] i2c i2c-3: __i2c_transfer failed -11 [73207.255247] i2c i2c-3: Arbitration loss, re-queue the packet [73429.499875] i2c i2c-3: __i2c_transfer failed -11 [73429.505116] i2c i2c-3: Arbitration loss, re-queue the packet [73504.672065] i2c i2c-3: __i2c_transfer failed -11 [73504.677317] i2c i2c-3: Arbitration loss, re-queue the packet [73540.762984] i2c i2c-3: __i2c_transfer failed -11 [73540.768242] i2c i2c-3: Arbitration loss, re-queue the packet [73631.040049] i2c i2c-3: __i2c_transfer failed -11 [73631.045333] i2c i2c-3: Arbitration loss, re-queue the packet [73648.538967] i2c i2c-3: __i2c_transfer failed -11 Thanks, - Quan
Hi Quan, > With this commit, we all see the packets go through peacefully just > by requeueing the packets at MCTP layer and eliminated the need to > retry in PLDM layer which would need more time. It's certainly possible that this tries harder to send MCTP packets, but it's just duplicating the retry mechanism already present in the i2c core. Why do we need another retry mechanism here? How is the i2c core retry mechanism not sufficient? It would seem that you can get the same behaviour by adjusting the retries/timeout limits in the core code, which has the advantage of applying to all arbitration loss events on the i2c bus, not just for MCTP tx. Cheers, Jeremy
On 30/11/2023 16:40, Jeremy Kerr wrote: > Hi Quan, > >> With this commit, we all see the packets go through peacefully just >> by requeueing the packets at MCTP layer and eliminated the need to >> retry in PLDM layer which would need more time. > > It's certainly possible that this tries harder to send MCTP packets, > but it's just duplicating the retry mechanism already present in the > i2c core. > > Why do we need another retry mechanism here? How is the i2c core retry > mechanism not sufficient? > > It would seem that you can get the same behaviour by adjusting the > retries/timeout limits in the core code, which has the advantage of > applying to all arbitration loss events on the i2c bus, not just for > MCTP tx. > Hi Jeremy, As per [1], __i2c_transfer() will retry for adap->retries times consecutively (without any wait) within the amount of time specified by adap->timeout. So as per my observation, once it loses the arbitration, the next retry is most likely still lost as another controller who win the arbitration may still be using the bus. Especially for upper layer protocol message like PLDM or SPDM, which size is far bigger than SMBus, usually ends up to queue several MCTP packets at a time. But if to requeue the packet, it must wait to acquire the lock before actually queueing that packet, and that is more likely to increase the chance to win the arbitration than to retry it right away as on the i2c core. Another reason is that, as i2c is widely used for many other applications, fixing the retry algorithm within the i2c core seems impossible. The other fact is that the initial default value of these two parameters depends on each type of controller; I'm not sure why yet. + i2c-aspeed: retries=0 timeout=1 sec [2] + i2c-cadence: retries=3 timeout=1 sec [3] + i2c-designware: retries=3 timeout=1 sec [4], [5] + i2c-emev2: retries=5 timeout=1 sec [6] + ... Unfortunately, in our case, we use i2c-aspeed, and there is only one try (see [2]), and that means we have only one single shot. I'm not sure why i2c-aspeed chose to set retries=0 by default, but I guess there must be a reason behind it. And yes, I agree, as per [7], these two parameters could be adjusted via ioctl() from userspace if the user wishes to change them. But, honestly, forcing users to change these parameters is not a good way, as I might have to say. To avoid that, requeueing the packet in the MCTP layer was kind of way better choice, and it was confirmed via our case. Thanks, - Quan [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/i2c-core-base.c#n2223 [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-aspeed.c#n1030 [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-cadence.c#n1322 [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-designware-master.c#n1030 [5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-designware-slave.c#n258 [6] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-emev2.c#n385 [7] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/i2c-dev.c#n478
Hi Quan, > As per [1], __i2c_transfer() will retry for adap->retries times > consecutively (without any wait) within the amount of time specified > by adap->timeout. > > So as per my observation, once it loses the arbitration, the next > retry is most likely still lost as another controller who win the > arbitration may still be using the bus. In general (and specifically with your hardware setup), the controller should be waiting for a bus-idle state before attempting the retransmission. You may well hit another arbitration loss after that, but it won't be from the same bus activity. > Especially for upper layer protocol message like PLDM or SPDM, which > size is far bigger than SMBus, usually ends up to queue several MCTP > packets at a time. But if to requeue the packet, it must wait to > acquire the lock ... you're relying on the delay of acquiring a spinlock? The only contention on that lock is from local packets being sent to the device (and, in heavy TX backlogs, the netif queue will be stopped, so that lock will be uncontended). That sounds fairly fragile, and somewhat disconnected from the goal of waiting for a bus idle state. > before actually queueing that packet, and that is > more likely to increase the chance to win the arbitration than to > retry it right away as on the i2c core. > > Another reason is that, as i2c is widely used for many other > applications, fixing the retry algorithm within the i2c core seems > impossible. What needs fixing there? You haven't identified any issue with it. > The other fact is that the initial default value of these two > parameters > depends on each type of controller; I'm not sure why yet. > > + i2c-aspeed: retries=0 timeout=1 sec [2] > + i2c-cadence: retries=3 timeout=1 sec [3] > + i2c-designware: retries=3 timeout=1 sec [4], [5] > + i2c-emev2: retries=5 timeout=1 sec [6] > + ... > > Unfortunately, in our case, we use i2c-aspeed, and there is only one > try (see [2]), and that means we have only one single shot. I'm not > sure why i2c-aspeed chose to set retries=0 by default, but I guess > there must be a reason behind it. I would suggest that the actual fix you want here is to increase that retry count, rather than working-around your "not sure" points above with a duplication of the common retry mechanism. > And yes, I agree, as per [7], these two parameters could be adjusted > via ioctl() from userspace if the user wishes to change them. But, > honestly, forcing users to change these parameters is not a good way, > as I might have to say. But now you're forcing users to use your infinite-retry mechanism instead. We already have a retry mechanism, which is user-configurable, and we can set per-controller defaults. If you believe the defaults (present in the aspeed driver) are not suitable, and it's too onerous for users to adjust, then I would suggest proposing a change to the default. Your requeue approach has a problem, in that there is no mechanism for recovery on repeated packet contention. In a scenario where a specific packet always causes contention (say, a faulty device on the bus attempts to respond to that packet too early), then the packet is never dequeued; other packets already queued will be blocked behind this one packet. The only way to make forward progress from there is to fill the TX queue completely. You could address that by limiting the retries and/or having a timeout, but then you may as well just use the existing retry mechanism that we already have, which already does that. > To avoid that, requeueing the packet in the MCTP layer was kind of > way better choice, and it was confirmed via our case. Your earlier examples showed a max of one retry was needed for recovery. I would suggest bumping the i2c-aspeed driver default to suit the other in-tree controllers would be a much more appropriate fix. Cheers, Jeremy
On 01/12/2023 15:38, Jeremy Kerr wrote: > Hi Quan, > >> As per [1], __i2c_transfer() will retry for adap->retries times >> consecutively (without any wait) within the amount of time specified >> by adap->timeout. >> >> So as per my observation, once it loses the arbitration, the next >> retry is most likely still lost as another controller who win the >> arbitration may still be using the bus. > > In general (and specifically with your hardware setup), the controller > should be waiting for a bus-idle state before attempting the > retransmission. You may well hit another arbitration loss after that, > but it won't be from the same bus activity. > Yes, that is the correct behaviour and I think that is why __i2c_transfer() return -EAGAIN when arbitration loss. One example I could find in kernel tree is here [a], and, it seems not to totally rely on the retry mechanism implemented in i2c core. [a] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/video/fbdev/omap2/omapfb/displays/connector-dvi.c#n157 Anyway, it's been great to discuss on this topic. To answer your questions I did have to dig around and have learn a lot. :-) My sincere thanks to you, - Quan
diff --git a/drivers/net/mctp/mctp-i2c.c b/drivers/net/mctp/mctp-i2c.c index b37a9e4bade4..132023306052 100644 --- a/drivers/net/mctp/mctp-i2c.c +++ b/drivers/net/mctp/mctp-i2c.c @@ -442,14 +442,14 @@ static void mctp_i2c_unlock_reset(struct mctp_i2c_dev *midev) i2c_unlock_bus(midev->adapter, I2C_LOCK_SEGMENT); } -static void mctp_i2c_xmit(struct mctp_i2c_dev *midev, struct sk_buff *skb) +static int mctp_i2c_xmit(struct mctp_i2c_dev *midev, struct sk_buff *skb) { struct net_device_stats *stats = &midev->ndev->stats; enum mctp_i2c_flow_state fs; struct mctp_i2c_hdr *hdr; struct i2c_msg msg = {0}; u8 *pecp; - int rc; + int rc = 0; fs = mctp_i2c_get_tx_flow_state(midev, skb); @@ -461,17 +461,11 @@ static void mctp_i2c_xmit(struct mctp_i2c_dev *midev, struct sk_buff *skb) dev_warn_ratelimited(&midev->adapter->dev, "Bad tx length %d vs skb %u\n", hdr->byte_count + 3, skb->len); - return; + return -EINVAL; } - if (skb_tailroom(skb) >= 1) { - /* Linear case with space, we can just append the PEC */ - skb_put(skb, 1); - } else { - /* Otherwise need to copy the buffer */ - skb_copy_bits(skb, 0, midev->tx_scratch, skb->len); - hdr = (void *)midev->tx_scratch; - } + skb_copy_bits(skb, 0, midev->tx_scratch, skb->len); + hdr = (void *)midev->tx_scratch; pecp = (void *)&hdr->source_slave + hdr->byte_count; *pecp = i2c_smbus_pec(0, (u8 *)hdr, hdr->byte_count + 3); @@ -503,17 +497,20 @@ static void mctp_i2c_xmit(struct mctp_i2c_dev *midev, struct sk_buff *skb) break; case MCTP_I2C_TX_FLOW_INVALID: - return; + return rc; } if (rc < 0) { dev_warn_ratelimited(&midev->adapter->dev, "__i2c_transfer failed %d\n", rc); stats->tx_errors++; + if (rc == -EAGAIN) + stats->collisions++; } else { stats->tx_bytes += skb->len; stats->tx_packets++; } + return rc; } static void mctp_i2c_flow_release(struct mctp_i2c_dev *midev) @@ -568,6 +565,7 @@ static int mctp_i2c_tx_thread(void *data) struct mctp_i2c_dev *midev = data; struct sk_buff *skb; unsigned long flags; + int rc; for (;;) { if (kthread_should_stop()) @@ -583,8 +581,17 @@ static int mctp_i2c_tx_thread(void *data) mctp_i2c_flow_release(midev); } else if (skb) { - mctp_i2c_xmit(midev, skb); - kfree_skb(skb); + rc = mctp_i2c_xmit(midev, skb); + if (rc == -EAGAIN) { + spin_lock_irqsave(&midev->tx_queue.lock, flags); + if (skb_queue_len(&midev->tx_queue) >= MCTP_I2C_TX_WORK_LEN) + kfree_skb(skb); + else + __skb_queue_head(&midev->tx_queue, skb); + spin_unlock_irqrestore(&midev->tx_queue.lock, flags); + } else { + kfree_skb(skb); + } } else { wait_event_idle(midev->tx_wq,