diff mbox series

[v4,2/2] i2c: aspeed: Acknowledge Tx done with and without ACK irq late

Message ID 20231211102217.2436294-3-quan@os.amperecomputing.com
State New
Headers show
Series i2c: aspeed: Late ack Tx done irqs and handle coalesced start with stop conditions | expand

Commit Message

Quan Nguyen Dec. 11, 2023, 10:22 a.m. UTC
Commit 2be6b47211e1 ("i2c: aspeed: Acknowledge most interrupts early in
interrupt handler") acknowledges most interrupts early before the slave
irq handler is executed, except for the "Receive Done Interrupt status"
which is acknowledged late in the interrupt.
However, it has been observed that the early acknowledgment of "Transmit
Done Interrupt Status" (with ACK or NACK) often causes the interrupt to
be raised in READ REQUEST state, that shows the
"Unexpected ACK on read request." complaint messages.

Assuming that the "Transmit Done" interrupt should only be acknowledged
once it is truly processed, this commit fixes that issue by acknowledging
interrupts for both ACK and NACK cases late in the interrupt handler.

Fixes: 2be6b47211e1 ("i2c: aspeed: Acknowledge most interrupts early in interrupt handler")
Signed-off-by: Quan Nguyen <quan@os.amperecomputing.com>
---
v4:
  + Switch to use define macro instead of variable             [Andrew]
  + Make the early ack conditionally to avoid unnecessary
writel()/readl()                                                 [Quan]

v3:
  + Fix the unconditinal write when ack the irqs               [Andrew]
  + Refactor the code to enhance code readability                [Quan]
  + Fix grammar in commit message                                [Quan]

v2:
  + Split to separate series                                     [Joel]
  + Added the Fixes line                                         [Joel]
  + Fixed multiline comment                                      [Joel]
  + Refactor irq clearing code                          [Joel, Guenter]
  + Revised commit message                                       [Joel]
  + Revised commit message                                       [Quan]
  + About a note to remind why the readl() should immediately follow the
writel() to fix the race condition when clearing irq status from commit
c926c87b8e36 ("i2c: aspeed: Avoid i2c interrupt status clear race
condition"), I think it looks straight forward in this patch and decided
not to add that note.                                            [Joel]

v1:
  + First introduced in
https://lore.kernel.org/all/20210519074934.20712-1-quan@os.amperecomputing.com/
---
 drivers/i2c/busses/i2c-aspeed.c | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

Comments

Andrew Jeffery Dec. 11, 2023, 11:03 p.m. UTC | #1
On Mon, 2023-12-11 at 17:22 +0700, Quan Nguyen wrote:
> Commit 2be6b47211e1 ("i2c: aspeed: Acknowledge most interrupts early in
> interrupt handler") acknowledges most interrupts early before the slave
> irq handler is executed, except for the "Receive Done Interrupt status"
> which is acknowledged late in the interrupt.
> However, it has been observed that the early acknowledgment of "Transmit
> Done Interrupt Status" (with ACK or NACK) often causes the interrupt to
> be raised in READ REQUEST state, that shows the
> "Unexpected ACK on read request." complaint messages.
> 
> Assuming that the "Transmit Done" interrupt should only be acknowledged
> once it is truly processed, this commit fixes that issue by acknowledging
> interrupts for both ACK and NACK cases late in the interrupt handler.
> 
> Fixes: 2be6b47211e1 ("i2c: aspeed: Acknowledge most interrupts early in interrupt handler")
> Signed-off-by: Quan Nguyen <quan@os.amperecomputing.com>

Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au>

Thanks Quan!
Andrew Jeffery Dec. 14, 2023, 10:21 p.m. UTC | #2
> On Mon, 2023-12-11 at 17:22 +0700, Quan Nguyen wrote:
> > Commit 2be6b47211e1 ("i2c: aspeed: Acknowledge most interrupts early in
> > interrupt handler") acknowledges most interrupts early before the slave
> > irq handler is executed, except for the "Receive Done Interrupt status"
> > which is acknowledged late in the interrupt.
> > However, it has been observed that the early acknowledgment of "Transmit
> > Done Interrupt Status" (with ACK or NACK) often causes the interrupt to
> > be raised in READ REQUEST state, that shows the
> > "Unexpected ACK on read request." complaint messages.
> > 
> > Assuming that the "Transmit Done" interrupt should only be acknowledged
> > once it is truly processed, this commit fixes that issue by acknowledging
> > interrupts for both ACK and NACK cases late in the interrupt handler.
> > 
> > Fixes: 2be6b47211e1 ("i2c: aspeed: Acknowledge most interrupts early in interrupt handler")
> > Signed-off-by: Quan Nguyen <quan@os.amperecomputing.com>
> 
> Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au>

So I just booted this series on v6.7-rc5 under qemu v8.2.0-rc4 and
found this:

```
$ qemu-system-arm \
	-M ast2600-evb \
	-kernel build.aspeed_g5/arch/arm/boot/zImage \
	-dtb build.aspeed_g5/arch/arm/boot/dts/aspeed/aspeed-ast2600-evb.dtb \
	-initrd ~/src/buildroot.org/buildroot/output/images/rootfs.cpio.xz \
	-nographic 2>&1 \
	| ts -s
...
00:00:03 [    1.089187] Freeing initrd memory: 3308K
00:00:05 smbus: error: Unexpected send start condition in state 1
00:00:05 smbus: error: Unexpected write in state -1
00:00:06 [    3.685731] aspeed-i2c-bus 1e78a400.i2c-bus: i2c bus 7 registered, irq 48
00:00:06 [    3.688918] aspeed-i2c-bus 1e78a480.i2c-bus: i2c bus 8 registered, irq 49
00:00:06 [    3.692326] aspeed-i2c-bus 1e78a500.i2c-bus: i2c bus 9 registered, irq 50
00:00:06 [    3.693757] aspeed-i2c-bus 1e78a680.i2c-bus: i2c bus 12 registered, irq 51
00:00:06 [    3.695070] aspeed-i2c-bus 1e78a700.i2c-bus: i2c bus 13 registered, irq 52
00:00:06 [    3.696184] aspeed-i2c-bus 1e78a780.i2c-bus: i2c bus 14 registered, irq 53
00:00:06 [    3.697144] aspeed-i2c-bus 1e78a800.i2c-bus: i2c bus 15 registered, irq 54
00:00:06 [    3.699061] aspeed-video 1e700000.video: irq 55
00:00:06 [    3.699254] aspeed-video 1e700000.video: assigned reserved memory node video
00:00:06 [    3.702755] aspeed-video 1e700000.video: alloc mem size(24576) at 0xbc000000 for jpeg header
00:00:06 [    3.706139] Driver for 1-wire Dallas network protocol.
00:00:07 smbus: error: Unexpected send start condition in state -1
00:00:07 smbus: error: Unexpected write in state -1
00:00:10 smbus: error: Unexpected send start condition in state -1
00:00:10 smbus: error: Unexpected write in state -1
00:00:12 smbus: error: Unexpected send start condition in state -1
00:00:12 smbus: error: Unexpected write in state -1
00:00:14 smbus: error: Unexpected send start condition in state -1
00:00:14 smbus: error: Unexpected write in state -1
00:00:17 smbus: error: Unexpected send start condition in state -1
00:00:17 smbus: error: Unexpected write in state -1
00:00:18 [   14.080141] adt7475 7-002e: Error configuring attenuator bypass
00:00:19 smbus: error: Unexpected send start condition in state -1
00:00:19 smbus: error: Unexpected write in state -1
00:00:21 smbus: error: Unexpected send start condition in state -1
00:00:21 smbus: error: Unexpected write in state -1
00:00:24 smbus: error: Unexpected send start condition in state -1
00:00:24 smbus: error: Unexpected write in state -1
```

The smbus errors do not occur if I revert this patch.

Can you look into qemu to see if it's a bug in the aspeed i2c
controller model's state machine?

Cheers,

Andrew
Andi Shyti Dec. 15, 2023, 1:45 a.m. UTC | #3
Hi Quan,

On Mon, Dec 11, 2023 at 05:22:17PM +0700, Quan Nguyen wrote:
> Commit 2be6b47211e1 ("i2c: aspeed: Acknowledge most interrupts early in
> interrupt handler") acknowledges most interrupts early before the slave
> irq handler is executed, except for the "Receive Done Interrupt status"
> which is acknowledged late in the interrupt.
> However, it has been observed that the early acknowledgment of "Transmit
> Done Interrupt Status" (with ACK or NACK) often causes the interrupt to
> be raised in READ REQUEST state, that shows the
> "Unexpected ACK on read request." complaint messages.
> 
> Assuming that the "Transmit Done" interrupt should only be acknowledged
> once it is truly processed, this commit fixes that issue by acknowledging
> interrupts for both ACK and NACK cases late in the interrupt handler.
> 
> Fixes: 2be6b47211e1 ("i2c: aspeed: Acknowledge most interrupts early in interrupt handler")
> Signed-off-by: Quan Nguyen <quan@os.amperecomputing.com>

Reviewed-by: Andi Shyti <andi.shyti@kernel.org>

Thanks,
Andi
Quan Nguyen Dec. 18, 2023, 8:45 a.m. UTC | #4
On 15/12/2023 05:21, Andrew Jeffery wrote:
>> On Mon, 2023-12-11 at 17:22 +0700, Quan Nguyen wrote:
>>> Commit 2be6b47211e1 ("i2c: aspeed: Acknowledge most interrupts early in
>>> interrupt handler") acknowledges most interrupts early before the slave
>>> irq handler is executed, except for the "Receive Done Interrupt status"
>>> which is acknowledged late in the interrupt.
>>> However, it has been observed that the early acknowledgment of "Transmit
>>> Done Interrupt Status" (with ACK or NACK) often causes the interrupt to
>>> be raised in READ REQUEST state, that shows the
>>> "Unexpected ACK on read request." complaint messages.
>>>
>>> Assuming that the "Transmit Done" interrupt should only be acknowledged
>>> once it is truly processed, this commit fixes that issue by acknowledging
>>> interrupts for both ACK and NACK cases late in the interrupt handler.
>>>
>>> Fixes: 2be6b47211e1 ("i2c: aspeed: Acknowledge most interrupts early in interrupt handler")
>>> Signed-off-by: Quan Nguyen <quan@os.amperecomputing.com>
>>
>> Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au>
> 
> So I just booted this series on v6.7-rc5 under qemu v8.2.0-rc4 and
> found this:
> 
> ```
> $ qemu-system-arm \
> 	-M ast2600-evb \
> 	-kernel build.aspeed_g5/arch/arm/boot/zImage \
> 	-dtb build.aspeed_g5/arch/arm/boot/dts/aspeed/aspeed-ast2600-evb.dtb \
> 	-initrd ~/src/buildroot.org/buildroot/output/images/rootfs.cpio.xz \
> 	-nographic 2>&1 \
> 	| ts -s
> ...
> 00:00:03 [    1.089187] Freeing initrd memory: 3308K
> 00:00:05 smbus: error: Unexpected send start condition in state 1
> 00:00:05 smbus: error: Unexpected write in state -1
> 00:00:06 [    3.685731] aspeed-i2c-bus 1e78a400.i2c-bus: i2c bus 7 registered, irq 48
> 00:00:06 [    3.688918] aspeed-i2c-bus 1e78a480.i2c-bus: i2c bus 8 registered, irq 49
> 00:00:06 [    3.692326] aspeed-i2c-bus 1e78a500.i2c-bus: i2c bus 9 registered, irq 50
> 00:00:06 [    3.693757] aspeed-i2c-bus 1e78a680.i2c-bus: i2c bus 12 registered, irq 51
> 00:00:06 [    3.695070] aspeed-i2c-bus 1e78a700.i2c-bus: i2c bus 13 registered, irq 52
> 00:00:06 [    3.696184] aspeed-i2c-bus 1e78a780.i2c-bus: i2c bus 14 registered, irq 53
> 00:00:06 [    3.697144] aspeed-i2c-bus 1e78a800.i2c-bus: i2c bus 15 registered, irq 54
> 00:00:06 [    3.699061] aspeed-video 1e700000.video: irq 55
> 00:00:06 [    3.699254] aspeed-video 1e700000.video: assigned reserved memory node video
> 00:00:06 [    3.702755] aspeed-video 1e700000.video: alloc mem size(24576) at 0xbc000000 for jpeg header
> 00:00:06 [    3.706139] Driver for 1-wire Dallas network protocol.
> 00:00:07 smbus: error: Unexpected send start condition in state -1
> 00:00:07 smbus: error: Unexpected write in state -1
> 00:00:10 smbus: error: Unexpected send start condition in state -1
> 00:00:10 smbus: error: Unexpected write in state -1
> 00:00:12 smbus: error: Unexpected send start condition in state -1
> 00:00:12 smbus: error: Unexpected write in state -1
> 00:00:14 smbus: error: Unexpected send start condition in state -1
> 00:00:14 smbus: error: Unexpected write in state -1
> 00:00:17 smbus: error: Unexpected send start condition in state -1
> 00:00:17 smbus: error: Unexpected write in state -1
> 00:00:18 [   14.080141] adt7475 7-002e: Error configuring attenuator bypass
> 00:00:19 smbus: error: Unexpected send start condition in state -1
> 00:00:19 smbus: error: Unexpected write in state -1
> 00:00:21 smbus: error: Unexpected send start condition in state -1
> 00:00:21 smbus: error: Unexpected write in state -1
> 00:00:24 smbus: error: Unexpected send start condition in state -1
> 00:00:24 smbus: error: Unexpected write in state -1
> ```
> 
> The smbus errors do not occur if I revert this patch.
> 
> Can you look into qemu to see if it's a bug in the aspeed i2c
> controller model's state machine?
> 

Thanks, Andrew, for testing these patches on qemu.

I'll try to look into it to see if anything can be improved, but I have 
to admit that I'm not so familiar with it. This is my first time trying 
it on qemu. Just did these tests on real HW with waveform captured 
sometimes.

So far I could be able to reproduce the issue and start playing around 
trying to understand the model.

Thanks,
- Quan
Andrew Jeffery Dec. 18, 2023, 11:02 p.m. UTC | #5
On Mon, 2023-12-18 at 15:45 +0700, Quan Nguyen wrote:
> 
> On 15/12/2023 05:21, Andrew Jeffery wrote:
> > 
> > ```
> > $ qemu-system-arm \
> > 	-M ast2600-evb \
> > 	-kernel build.aspeed_g5/arch/arm/boot/zImage \
> > 	-dtb build.aspeed_g5/arch/arm/boot/dts/aspeed/aspeed-ast2600-evb.dtb \
> > 	-initrd ~/src/buildroot.org/buildroot/output/images/rootfs.cpio.xz \
> > 	-nographic 2>&1 \
> > 	| ts -s
> > ...
> > 00:00:03 [    1.089187] Freeing initrd memory: 3308K
> > 00:00:05 smbus: error: Unexpected send start condition in state 1
> > 00:00:05 smbus: error: Unexpected write in state -1
> > 00:00:06 [    3.685731] aspeed-i2c-bus 1e78a400.i2c-bus: i2c bus 7 registered, irq 48
> > 00:00:06 [    3.688918] aspeed-i2c-bus 1e78a480.i2c-bus: i2c bus 8 registered, irq 49
> > 00:00:06 [    3.692326] aspeed-i2c-bus 1e78a500.i2c-bus: i2c bus 9 registered, irq 50
> > 00:00:06 [    3.693757] aspeed-i2c-bus 1e78a680.i2c-bus: i2c bus 12 registered, irq 51
> > 00:00:06 [    3.695070] aspeed-i2c-bus 1e78a700.i2c-bus: i2c bus 13 registered, irq 52
> > 00:00:06 [    3.696184] aspeed-i2c-bus 1e78a780.i2c-bus: i2c bus 14 registered, irq 53
> > 00:00:06 [    3.697144] aspeed-i2c-bus 1e78a800.i2c-bus: i2c bus 15 registered, irq 54
> > 00:00:06 [    3.699061] aspeed-video 1e700000.video: irq 55
> > 00:00:06 [    3.699254] aspeed-video 1e700000.video: assigned reserved memory node video
> > 00:00:06 [    3.702755] aspeed-video 1e700000.video: alloc mem size(24576) at 0xbc000000 for jpeg header
> > 00:00:06 [    3.706139] Driver for 1-wire Dallas network protocol.
> > 00:00:07 smbus: error: Unexpected send start condition in state -1
> > 00:00:07 smbus: error: Unexpected write in state -1
> > 00:00:10 smbus: error: Unexpected send start condition in state -1
> > 00:00:10 smbus: error: Unexpected write in state -1
> > 00:00:12 smbus: error: Unexpected send start condition in state -1
> > 00:00:12 smbus: error: Unexpected write in state -1
> > 00:00:14 smbus: error: Unexpected send start condition in state -1
> > 00:00:14 smbus: error: Unexpected write in state -1
> > 00:00:17 smbus: error: Unexpected send start condition in state -1
> > 00:00:17 smbus: error: Unexpected write in state -1
> > 00:00:18 [   14.080141] adt7475 7-002e: Error configuring attenuator bypass
> > 00:00:19 smbus: error: Unexpected send start condition in state -1
> > 00:00:19 smbus: error: Unexpected write in state -1
> > 00:00:21 smbus: error: Unexpected send start condition in state -1
> > 00:00:21 smbus: error: Unexpected write in state -1
> > 00:00:24 smbus: error: Unexpected send start condition in state -1
> > 00:00:24 smbus: error: Unexpected write in state -1
> > ```
> > 
> > The smbus errors do not occur if I revert this patch.
> > 
> > Can you look into qemu to see if it's a bug in the aspeed i2c
> > controller model's state machine?
> > 
> 
> Thanks, Andrew, for testing these patches on qemu.
> 
> I'll try to look into it to see if anything can be improved, but I have 
> to admit that I'm not so familiar with it. This is my first time trying 
> it on qemu. Just did these tests on real HW with waveform captured 
> sometimes.
> 
> So far I could be able to reproduce the issue and start playing around 
> trying to understand the model.
> 

So `$ git grep -lF 'Unexpected write in state'` leads us to
hw/i2c/smbus_slave.c:193. From the switch statement there and the log
output above dev->mode must be SMBUS_CONFUSED:


https://gitlab.com/qemu-project/qemu/-/blob/039afc5ef7367fbc8fb475580c291c2655e856cb/hw/i2c/smbus_slave.c#L35-L41

The prior log message was:

```
00:00:05 smbus: error: Unexpected send start condition in state 1
```

So we entered SMBUS_CONFUSED from SMBUS_WRITE_DATA. Given the log
output above it suggests the master model is failing to send an
I2C_FINISH prior to I2C_START_SEND, as that log message is emitted from
`dev->mode != SMBUS_IDLE` when the slave sees an `I2C_START_SEND`.

Perhaps the M_STOP_CMD handling needs to go above the M_START_CMD
handling in aspeed_i2c_bus_handle_cmd()?

https://gitlab.com/qemu-project/qemu/-/blob/039afc5ef7367fbc8fb475580c291c2655e856cb/hw/i2c/aspeed_i2c.c#L450

Andrew
diff mbox series

Patch

diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
index 5511fd46a65e..0f67218cf04a 100644
--- a/drivers/i2c/busses/i2c-aspeed.c
+++ b/drivers/i2c/busses/i2c-aspeed.c
@@ -93,6 +93,10 @@ 
 		 ASPEED_I2CD_INTR_RX_DONE |				       \
 		 ASPEED_I2CD_INTR_TX_NAK |				       \
 		 ASPEED_I2CD_INTR_TX_ACK)
+#define ASPEED_I2CD_INTR_ACK_RX_TX					       \
+		(ASPEED_I2CD_INTR_RX_DONE |				       \
+		 ASPEED_I2CD_INTR_TX_ACK |				       \
+		 ASPEED_I2CD_INTR_TX_NAK)
 
 /* 0x14 : I2CD Command/Status Register   */
 #define ASPEED_I2CD_SCL_LINE_STS			BIT(18)
@@ -622,10 +626,19 @@  static irqreturn_t aspeed_i2c_bus_irq(int irq, void *dev_id)
 
 	spin_lock(&bus->lock);
 	irq_received = readl(bus->base + ASPEED_I2C_INTR_STS_REG);
-	/* Ack all interrupts except for Rx done */
-	writel(irq_received & ~ASPEED_I2CD_INTR_RX_DONE,
-	       bus->base + ASPEED_I2C_INTR_STS_REG);
-	readl(bus->base + ASPEED_I2C_INTR_STS_REG);
+
+	/*
+	 * Early acking of INTR_RX_DONE and INTR_TX_[ACK|NAK] would indicate HW to
+	 * start receiving or sending new data, and this may cause a race condition
+	 * as the irq handler has not yet handled these irqs but is being acked.
+	 * Let's ack them late at the end of the irq handler when those are truly processed.
+	 */
+	if (irq_received & ~ASPEED_I2CD_INTR_ACK_RX_TX) {
+		writel(irq_received & ~ASPEED_I2CD_INTR_ACK_RX_TX,
+		       bus->base + ASPEED_I2C_INTR_STS_REG);
+		readl(bus->base + ASPEED_I2C_INTR_STS_REG);
+	}
+
 	irq_received &= ASPEED_I2CD_INTR_RECV_MASK;
 	irq_remaining = irq_received;
 
@@ -668,12 +681,12 @@  static irqreturn_t aspeed_i2c_bus_irq(int irq, void *dev_id)
 			"irq handled != irq. expected 0x%08x, but was 0x%08x\n",
 			irq_received, irq_handled);
 
-	/* Ack Rx done */
-	if (irq_received & ASPEED_I2CD_INTR_RX_DONE) {
-		writel(ASPEED_I2CD_INTR_RX_DONE,
+	if (irq_received & ASPEED_I2CD_INTR_ACK_RX_TX) {
+		writel(irq_received & ASPEED_I2CD_INTR_ACK_RX_TX,
 		       bus->base + ASPEED_I2C_INTR_STS_REG);
 		readl(bus->base + ASPEED_I2C_INTR_STS_REG);
 	}
+
 	spin_unlock(&bus->lock);
 	return irq_remaining ? IRQ_NONE : IRQ_HANDLED;
 }