diff mbox series

mctp i2c: Requeue the packet when arbitration is lost

Message ID 20231130075247.3078931-1-quan@os.amperecomputing.com
State New
Headers show
Series mctp i2c: Requeue the packet when arbitration is lost | expand

Commit Message

Quan Nguyen Nov. 30, 2023, 7:52 a.m. UTC
If arbitration is lost, __i2c_transfer() returns -EAGAIN and the
packet should be resent.

Requeue the packet and increase collisions count on this case.

Co-developed-by: Dung Cao <dung@os.amperecomputing.com>
Signed-off-by: Dung Cao <dung@os.amperecomputing.com>
Signed-off-by: Quan Nguyen <quan@os.amperecomputing.com>
---
 drivers/net/mctp/mctp-i2c.c | 35 +++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 14 deletions(-)

Comments

Jeremy Kerr Nov. 30, 2023, 8:03 a.m. UTC | #1
Hi Quan,

> If arbitration is lost, __i2c_transfer() returns -EAGAIN and the
> packet should be resent.
> 
> Requeue the packet and increase collisions count on this case.

Are you sure you want to re-queue the packet here? The i2c core would
have already retried on arbitration loss:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/i2c-core-base.c#n2223

With this change, we would be disregarding the limits in adap->retries
and/or adap->timeout.

Cheers,


Jeremy
Quan Nguyen Nov. 30, 2023, 8:39 a.m. UTC | #2
On 30/11/2023 15:03, Jeremy Kerr wrote:
> Hi Quan,
> 
>> If arbitration is lost, __i2c_transfer() returns -EAGAIN and the
>> packet should be resent.
>>
>> Requeue the packet and increase collisions count on this case.
> 
> Are you sure you want to re-queue the packet here? The i2c core would
> have already retried on arbitration loss:
> 
>    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/i2c-core-base.c#n2223
> 
> With this change, we would be disregarding the limits in adap->retries
> and/or adap->timeout.
> 

Yes, the __i2c_transfer() does try its best to send the packet but if it 
still fail with -EAGAIN, that means it still unable to win the arbitration.

Without this patch we usually see the error below. The consequence is it 
stop and need the PLDM layer to retry when timed out, which takes time.

"[   61.616210] i2c i2c-3: __i2c_transfer failed -11"

With this commit, we all see the packets go through peacefully just by 
requeueing the packets at MCTP layer and eliminated the need to retry in 
PLDM layer which would need more time.

The result in our test is as below (Note: would needs some addition 
debug print code). I also think it is worth noting that in our setup 
there are multiple Masters.

[73183.511487] i2c i2c-3: Arbitration loss, re-queue the packet
[73192.550519] i2c i2c-3: __i2c_transfer failed -11
[73192.555734] i2c i2c-3: Arbitration loss, re-queue the packet
[73207.250036] i2c i2c-3: __i2c_transfer failed -11
[73207.255247] i2c i2c-3: Arbitration loss, re-queue the packet
[73429.499875] i2c i2c-3: __i2c_transfer failed -11
[73429.505116] i2c i2c-3: Arbitration loss, re-queue the packet
[73504.672065] i2c i2c-3: __i2c_transfer failed -11
[73504.677317] i2c i2c-3: Arbitration loss, re-queue the packet
[73540.762984] i2c i2c-3: __i2c_transfer failed -11
[73540.768242] i2c i2c-3: Arbitration loss, re-queue the packet
[73631.040049] i2c i2c-3: __i2c_transfer failed -11
[73631.045333] i2c i2c-3: Arbitration loss, re-queue the packet
[73648.538967] i2c i2c-3: __i2c_transfer failed -11

Thanks,
- Quan
Jeremy Kerr Nov. 30, 2023, 9:40 a.m. UTC | #3
Hi Quan,

> With this commit, we all see the packets go through peacefully just
> by requeueing the packets at MCTP layer and eliminated the need to
> retry in PLDM layer which would need more time.

It's certainly possible that this tries harder to send MCTP packets,
but it's just duplicating the retry mechanism already present in the
i2c core.

Why do we need another retry mechanism here? How is the i2c core retry
mechanism not sufficient?

It would seem that you can get the same behaviour by adjusting the
retries/timeout limits in the core code, which has the advantage of
applying to all arbitration loss events on the i2c bus, not just for
MCTP tx.

Cheers,


Jeremy
Quan Nguyen Dec. 1, 2023, 6:31 a.m. UTC | #4
On 30/11/2023 16:40, Jeremy Kerr wrote:
> Hi Quan,
> 
>> With this commit, we all see the packets go through peacefully just
>> by requeueing the packets at MCTP layer and eliminated the need to
>> retry in PLDM layer which would need more time.
> 
> It's certainly possible that this tries harder to send MCTP packets,
> but it's just duplicating the retry mechanism already present in the
> i2c core.
> 
> Why do we need another retry mechanism here? How is the i2c core retry
> mechanism not sufficient?
> 
> It would seem that you can get the same behaviour by adjusting the
> retries/timeout limits in the core code, which has the advantage of
> applying to all arbitration loss events on the i2c bus, not just for
> MCTP tx.
> 

Hi Jeremy,

As per [1], __i2c_transfer() will retry for adap->retries times 
consecutively (without any wait) within the amount of time specified by 
adap->timeout.

So as per my observation, once it loses the arbitration, the next retry 
is most likely still lost as another controller who win the arbitration 
may still be using the bus. Especially for upper layer protocol message 
like PLDM or SPDM, which size is far bigger than SMBus, usually ends up 
to queue several MCTP packets at a time. But if to requeue the packet, 
it must wait to acquire the lock before actually queueing that packet, 
and that is more likely to increase the chance to win the arbitration 
than to retry it right away as on the i2c core.

Another reason is that, as i2c is widely used for many other 
applications, fixing the retry algorithm within the i2c core seems 
impossible.

The other fact is that the initial default value of these two parameters 
depends on each type of controller; I'm not sure why yet.

+ i2c-aspeed:     retries=0 timeout=1 sec [2]
+ i2c-cadence:    retries=3 timeout=1 sec [3]
+ i2c-designware: retries=3 timeout=1 sec [4], [5]
+ i2c-emev2:      retries=5 timeout=1 sec [6]
+ ...

Unfortunately, in our case, we use i2c-aspeed, and there is only one try 
(see [2]), and that means we have only one single shot. I'm not sure why 
i2c-aspeed chose to set retries=0 by default, but I guess there must be 
a reason behind it.

And yes, I agree, as per [7], these two parameters could be adjusted via 
ioctl() from userspace if the user wishes to change them. But, honestly, 
forcing users to change these parameters is not a good way, as I might 
have to say.

To avoid that, requeueing the packet in the MCTP layer was kind of way 
better choice, and it was confirmed via our case.

Thanks,
- Quan


[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/i2c-core-base.c#n2223
[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-aspeed.c#n1030
[3] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-cadence.c#n1322
[4] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-designware-master.c#n1030
[5] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-designware-slave.c#n258
[6] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-emev2.c#n385
[7] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/i2c-dev.c#n478
Jeremy Kerr Dec. 1, 2023, 8:38 a.m. UTC | #5
Hi Quan,

> As per [1], __i2c_transfer() will retry for adap->retries times 
> consecutively (without any wait) within the amount of time specified
> by adap->timeout.
> 
> So as per my observation, once it loses the arbitration, the next
> retry is most likely still lost as another controller who win the
> arbitration may still be using the bus.

In general (and specifically with your hardware setup), the controller
should be waiting for a bus-idle state before attempting the
retransmission. You may well hit another arbitration loss after that,
but it won't be from the same bus activity.

> Especially for upper layer protocol message like PLDM or SPDM, which
> size is far bigger than SMBus, usually ends up to queue several MCTP
> packets at a time. But if to requeue the packet, it must wait to
> acquire the lock

... you're relying on the delay of acquiring a spinlock? The only
contention on that lock is from local packets being sent to the device
(and, in heavy TX backlogs, the netif queue will be stopped, so that
lock will be uncontended).

That sounds fairly fragile, and somewhat disconnected from the goal of
waiting for a bus idle state.

> before actually queueing that packet, and that is
> more likely to increase the chance to win the arbitration than to
> retry it right away as on the i2c core.
> 
> Another reason is that, as i2c is widely used for many other 
> applications, fixing the retry algorithm within the i2c core seems 
> impossible.

What needs fixing there? You haven't identified any issue with it.

> The other fact is that the initial default value of these two
> parameters 
> depends on each type of controller; I'm not sure why yet.
> 
> + i2c-aspeed:     retries=0 timeout=1 sec [2]
> + i2c-cadence:    retries=3 timeout=1 sec [3]
> + i2c-designware: retries=3 timeout=1 sec [4], [5]
> + i2c-emev2:      retries=5 timeout=1 sec [6]
> + ...
> 
> Unfortunately, in our case, we use i2c-aspeed, and there is only one
> try (see [2]), and that means we have only one single shot. I'm not
> sure why i2c-aspeed chose to set retries=0 by default, but I guess
> there must be a reason behind it.

I would suggest that the actual fix you want here is to increase that
retry count, rather than working-around your "not sure" points above
with a duplication of the common retry mechanism.

> And yes, I agree, as per [7], these two parameters could be adjusted
> via ioctl() from userspace if the user wishes to change them. But,
> honestly, forcing users to change these parameters is not a good way,
> as I might have to say.

But now you're forcing users to use your infinite-retry mechanism
instead.

We already have a retry mechanism, which is user-configurable, and we
can set per-controller defaults. If you believe the defaults (present in
the aspeed driver) are not suitable, and it's too onerous for users to
adjust, then I would suggest proposing a change to the default.

Your requeue approach has a problem, in that there is no mechanism for
recovery on repeated packet contention. In a scenario where a specific
packet always causes contention (say, a faulty device on the bus
attempts to respond to that packet too early), then the packet is never
dequeued; other packets already queued will be blocked behind this one
packet. The only way to make forward progress from there is to fill the
TX queue completely.

You could address that by limiting the retries and/or having a timeout,
but then you may as well just use the existing retry mechanism that we
already have, which already does that.

> To avoid that, requeueing the packet in the MCTP layer was kind of
> way better choice, and it was confirmed via our case.

Your earlier examples showed a max of one retry was needed for recovery.
I would suggest bumping the i2c-aspeed driver default to suit the other
in-tree controllers would be a much more appropriate fix.

Cheers,


Jeremy
Quan Nguyen Dec. 1, 2023, 11:47 a.m. UTC | #6
On 01/12/2023 15:38, Jeremy Kerr wrote:
> Hi Quan,
> 
>> As per [1], __i2c_transfer() will retry for adap->retries times
>> consecutively (without any wait) within the amount of time specified
>> by adap->timeout.
>>
>> So as per my observation, once it loses the arbitration, the next
>> retry is most likely still lost as another controller who win the
>> arbitration may still be using the bus.
> 
> In general (and specifically with your hardware setup), the controller
> should be waiting for a bus-idle state before attempting the
> retransmission. You may well hit another arbitration loss after that,
> but it won't be from the same bus activity.
> 

Yes, that is the correct behaviour and I think that is why 
__i2c_transfer() return -EAGAIN when arbitration loss.

One example I could find in kernel tree is here [a], and, it seems not 
to totally rely on the retry mechanism implemented in i2c core.

[a] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/video/fbdev/omap2/omapfb/displays/connector-dvi.c#n157

Anyway, it's been great to discuss on this topic. To answer your 
questions I did have to dig around and have learn a lot. :-)

My sincere thanks to you,
- Quan
diff mbox series

Patch

diff --git a/drivers/net/mctp/mctp-i2c.c b/drivers/net/mctp/mctp-i2c.c
index b37a9e4bade4..132023306052 100644
--- a/drivers/net/mctp/mctp-i2c.c
+++ b/drivers/net/mctp/mctp-i2c.c
@@ -442,14 +442,14 @@  static void mctp_i2c_unlock_reset(struct mctp_i2c_dev *midev)
 		i2c_unlock_bus(midev->adapter, I2C_LOCK_SEGMENT);
 }
 
-static void mctp_i2c_xmit(struct mctp_i2c_dev *midev, struct sk_buff *skb)
+static int mctp_i2c_xmit(struct mctp_i2c_dev *midev, struct sk_buff *skb)
 {
 	struct net_device_stats *stats = &midev->ndev->stats;
 	enum mctp_i2c_flow_state fs;
 	struct mctp_i2c_hdr *hdr;
 	struct i2c_msg msg = {0};
 	u8 *pecp;
-	int rc;
+	int rc = 0;
 
 	fs = mctp_i2c_get_tx_flow_state(midev, skb);
 
@@ -461,17 +461,11 @@  static void mctp_i2c_xmit(struct mctp_i2c_dev *midev, struct sk_buff *skb)
 		dev_warn_ratelimited(&midev->adapter->dev,
 				     "Bad tx length %d vs skb %u\n",
 				     hdr->byte_count + 3, skb->len);
-		return;
+		return -EINVAL;
 	}
 
-	if (skb_tailroom(skb) >= 1) {
-		/* Linear case with space, we can just append the PEC */
-		skb_put(skb, 1);
-	} else {
-		/* Otherwise need to copy the buffer */
-		skb_copy_bits(skb, 0, midev->tx_scratch, skb->len);
-		hdr = (void *)midev->tx_scratch;
-	}
+	skb_copy_bits(skb, 0, midev->tx_scratch, skb->len);
+	hdr = (void *)midev->tx_scratch;
 
 	pecp = (void *)&hdr->source_slave + hdr->byte_count;
 	*pecp = i2c_smbus_pec(0, (u8 *)hdr, hdr->byte_count + 3);
@@ -503,17 +497,20 @@  static void mctp_i2c_xmit(struct mctp_i2c_dev *midev, struct sk_buff *skb)
 		break;
 
 	case MCTP_I2C_TX_FLOW_INVALID:
-		return;
+		return rc;
 	}
 
 	if (rc < 0) {
 		dev_warn_ratelimited(&midev->adapter->dev,
 				     "__i2c_transfer failed %d\n", rc);
 		stats->tx_errors++;
+		if (rc == -EAGAIN)
+			stats->collisions++;
 	} else {
 		stats->tx_bytes += skb->len;
 		stats->tx_packets++;
 	}
+	return rc;
 }
 
 static void mctp_i2c_flow_release(struct mctp_i2c_dev *midev)
@@ -568,6 +565,7 @@  static int mctp_i2c_tx_thread(void *data)
 	struct mctp_i2c_dev *midev = data;
 	struct sk_buff *skb;
 	unsigned long flags;
+	int rc;
 
 	for (;;) {
 		if (kthread_should_stop())
@@ -583,8 +581,17 @@  static int mctp_i2c_tx_thread(void *data)
 			mctp_i2c_flow_release(midev);
 
 		} else if (skb) {
-			mctp_i2c_xmit(midev, skb);
-			kfree_skb(skb);
+			rc = mctp_i2c_xmit(midev, skb);
+			if (rc == -EAGAIN) {
+				spin_lock_irqsave(&midev->tx_queue.lock, flags);
+				if (skb_queue_len(&midev->tx_queue) >= MCTP_I2C_TX_WORK_LEN)
+					kfree_skb(skb);
+				else
+					__skb_queue_head(&midev->tx_queue, skb);
+				spin_unlock_irqrestore(&midev->tx_queue.lock, flags);
+			} else {
+				kfree_skb(skb);
+			}
 
 		} else {
 			wait_event_idle(midev->tx_wq,