diff mbox series

[v3,2/3] usb: dwc3: fix dcache flush range calculation

Message ID 20241002-u-boot-dwc3-gadget-dcache-fixup-v3-2-5398088ef93c@linaro.org
State New
Delegated to: Bin Meng
Headers show
Series dwc3: gadget: properly fix cache operations | expand

Commit Message

Neil Armstrong Oct. 2, 2024, 2:39 p.m. UTC
The current flush operation will omit doing a flush/invalidate on
the first and last bytes if the base address and size are not aligned
with DMA_MINALIGN.

This causes operation failures Qualcomm platforms.

Take in account the alignment and size of the buffer and also
flush the previous and last cacheline.

Remove CACHELINE_SIZE which was the same as DMA_MINALIGN.

Reviewed-by: Mattijs Korpershoek <mkorpershoek@baylibre.com>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
---
 drivers/usb/dwc3/io.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Marek Vasut Oct. 2, 2024, 2:55 p.m. UTC | #1
On 10/2/24 4:39 PM, Neil Armstrong wrote:
> The current flush operation will omit doing a flush/invalidate on
> the first and last bytes if the base address and size are not aligned
> with DMA_MINALIGN.
> 
> This causes operation failures Qualcomm platforms.
> 
> Take in account the alignment and size of the buffer and also
> flush the previous and last cacheline.
> 
> Remove CACHELINE_SIZE which was the same as DMA_MINALIGN.

It isn't the same, CACHELINE_SIZE was set to CONFIG_SYS_CACHELINE_SIZE 
(CPU L1 cache cacheline length) while ARCH_DMA_MINALIGN is DMA engine 
alignment requirement (from times where there used to be one DMA engine 
on most devices). You likely want a max(CONFIG_SYS_CACHELINE_SIZE, 
dwc3-buffer-alignment-requirement) to really correctly align the buffer.

> Reviewed-by: Mattijs Korpershoek <mkorpershoek@baylibre.com>
> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
> ---
>   drivers/usb/dwc3/io.h | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/usb/dwc3/io.h b/drivers/usb/dwc3/io.h
> index 04791d4c9be..a6c2bb0f47d 100644
> --- a/drivers/usb/dwc3/io.h
> +++ b/drivers/usb/dwc3/io.h
> @@ -20,7 +20,6 @@
>   #include <cpu_func.h>
>   #include <asm/io.h>
>   
> -#define	CACHELINE_SIZE		CONFIG_SYS_CACHELINE_SIZE
>   static inline u32 dwc3_readl(void __iomem *base, u32 offset)
>   {
>   	unsigned long offs = offset - DWC3_GLOBALS_REGS_START;
> @@ -50,6 +49,9 @@ static inline void dwc3_writel(void __iomem *base, u32 offset, u32 value)
>   
>   static inline void dwc3_flush_cache(uintptr_t addr, int length)
>   {
> -	flush_dcache_range(addr, addr + ROUND(length, CACHELINE_SIZE));
> +	uintptr_t start_addr = (uintptr_t)addr & ~(ARCH_DMA_MINALIGN - 1);
> +	uintptr_t end_addr = ALIGN((uintptr_t)addr + length, ARCH_DMA_MINALIGN);
> +
> +	flush_dcache_range((unsigned long)start_addr, (unsigned long)end_addr);
>   }
Neil Armstrong Oct. 3, 2024, 12:49 p.m. UTC | #2
On 02/10/2024 16:55, Marek Vasut wrote:
> On 10/2/24 4:39 PM, Neil Armstrong wrote:
>> The current flush operation will omit doing a flush/invalidate on
>> the first and last bytes if the base address and size are not aligned
>> with DMA_MINALIGN.
>>
>> This causes operation failures Qualcomm platforms.
>>
>> Take in account the alignment and size of the buffer and also
>> flush the previous and last cacheline.
>>
>> Remove CACHELINE_SIZE which was the same as DMA_MINALIGN.
> 
> It isn't the same, CACHELINE_SIZE was set to CONFIG_SYS_CACHELINE_SIZE (CPU L1 cache cacheline length) while ARCH_DMA_MINALIGN is DMA engine alignment requirement (from times where there used to be one DMA engine on most devices). You likely want a max(CONFIG_SYS_CACHELINE_SIZE, dwc3-buffer-alignment-requirement) to really correctly align the buffer.

It is definitely true for platforms declaring dma_alloc_coherent() (arm, riscv, x86)
except nios2 but there's 0 chance dwc3 appears on a nios2 platform.

Neil

> 
>> Reviewed-by: Mattijs Korpershoek <mkorpershoek@baylibre.com>
>> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
>> ---
>>   drivers/usb/dwc3/io.h | 6 ++++--
>>   1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/usb/dwc3/io.h b/drivers/usb/dwc3/io.h
>> index 04791d4c9be..a6c2bb0f47d 100644
>> --- a/drivers/usb/dwc3/io.h
>> +++ b/drivers/usb/dwc3/io.h
>> @@ -20,7 +20,6 @@
>>   #include <cpu_func.h>
>>   #include <asm/io.h>
>> -#define    CACHELINE_SIZE        CONFIG_SYS_CACHELINE_SIZE
>>   static inline u32 dwc3_readl(void __iomem *base, u32 offset)
>>   {
>>       unsigned long offs = offset - DWC3_GLOBALS_REGS_START;
>> @@ -50,6 +49,9 @@ static inline void dwc3_writel(void __iomem *base, u32 offset, u32 value)
>>   static inline void dwc3_flush_cache(uintptr_t addr, int length)
>>   {
>> -    flush_dcache_range(addr, addr + ROUND(length, CACHELINE_SIZE));
>> +    uintptr_t start_addr = (uintptr_t)addr & ~(ARCH_DMA_MINALIGN - 1);
>> +    uintptr_t end_addr = ALIGN((uintptr_t)addr + length, ARCH_DMA_MINALIGN);
>> +
>> +    flush_dcache_range((unsigned long)start_addr, (unsigned long)end_addr);
>>   }
>
Marek Vasut Oct. 3, 2024, 1:19 p.m. UTC | #3
On 10/3/24 2:49 PM, Neil Armstrong wrote:
> On 02/10/2024 16:55, Marek Vasut wrote:
>> On 10/2/24 4:39 PM, Neil Armstrong wrote:
>>> The current flush operation will omit doing a flush/invalidate on
>>> the first and last bytes if the base address and size are not aligned
>>> with DMA_MINALIGN.
>>>
>>> This causes operation failures Qualcomm platforms.
>>>
>>> Take in account the alignment and size of the buffer and also
>>> flush the previous and last cacheline.
>>>
>>> Remove CACHELINE_SIZE which was the same as DMA_MINALIGN.
>>
>> It isn't the same, CACHELINE_SIZE was set to CONFIG_SYS_CACHELINE_SIZE 
>> (CPU L1 cache cacheline length) while ARCH_DMA_MINALIGN is DMA engine 
>> alignment requirement (from times where there used to be one DMA 
>> engine on most devices). You likely want a 
>> max(CONFIG_SYS_CACHELINE_SIZE, dwc3-buffer-alignment-requirement) to 
>> really correctly align the buffer.
> 
> It is definitely true for platforms declaring dma_alloc_coherent() (arm, 
> riscv, x86)
> except nios2 but there's 0 chance dwc3 appears on a nios2 platform.
There is real chance of that, because on modern SoCFPGA platforms 
(Agilex) you can have the FPGA content access the SoC peripherals, and 
one of the SoC peripherals is DWC3 controller. If anyone would actually 
synthesize it is another question ... but it is an FPGA, so that option 
exists.
Neil Armstrong Oct. 4, 2024, 7:16 a.m. UTC | #4
On 03/10/2024 15:19, Marek Vasut wrote:
> On 10/3/24 2:49 PM, Neil Armstrong wrote:
>> On 02/10/2024 16:55, Marek Vasut wrote:
>>> On 10/2/24 4:39 PM, Neil Armstrong wrote:
>>>> The current flush operation will omit doing a flush/invalidate on
>>>> the first and last bytes if the base address and size are not aligned
>>>> with DMA_MINALIGN.
>>>>
>>>> This causes operation failures Qualcomm platforms.
>>>>
>>>> Take in account the alignment and size of the buffer and also
>>>> flush the previous and last cacheline.
>>>>
>>>> Remove CACHELINE_SIZE which was the same as DMA_MINALIGN.
>>>
>>> It isn't the same, CACHELINE_SIZE was set to CONFIG_SYS_CACHELINE_SIZE (CPU L1 cache cacheline length) while ARCH_DMA_MINALIGN is DMA engine alignment requirement (from times where there used to be one DMA engine on most devices). You likely want a max(CONFIG_SYS_CACHELINE_SIZE, dwc3-buffer-alignment-requirement) to really correctly align the buffer.
>>
>> It is definitely true for platforms declaring dma_alloc_coherent() (arm, riscv, x86)
>> except nios2 but there's 0 chance dwc3 appears on a nios2 platform.
> There is real chance of that, because on modern SoCFPGA platforms (Agilex) you can have the FPGA content access the SoC peripherals, and one of the SoC peripherals is DWC3 controller. If anyone would actually synthesize it is another question ... but it is an FPGA, so that option exists.

Guess I'll switch to CACHELINE_SIZE instead of DMA_MINALIGN for nios2.

Neil
Marek Vasut Oct. 4, 2024, 11:19 a.m. UTC | #5
On 10/4/24 9:16 AM, neil.armstrong@linaro.org wrote:
> On 03/10/2024 15:19, Marek Vasut wrote:
>> On 10/3/24 2:49 PM, Neil Armstrong wrote:
>>> On 02/10/2024 16:55, Marek Vasut wrote:
>>>> On 10/2/24 4:39 PM, Neil Armstrong wrote:
>>>>> The current flush operation will omit doing a flush/invalidate on
>>>>> the first and last bytes if the base address and size are not aligned
>>>>> with DMA_MINALIGN.
>>>>>
>>>>> This causes operation failures Qualcomm platforms.
>>>>>
>>>>> Take in account the alignment and size of the buffer and also
>>>>> flush the previous and last cacheline.
>>>>>
>>>>> Remove CACHELINE_SIZE which was the same as DMA_MINALIGN.
>>>>
>>>> It isn't the same, CACHELINE_SIZE was set to 
>>>> CONFIG_SYS_CACHELINE_SIZE (CPU L1 cache cacheline length) while 
>>>> ARCH_DMA_MINALIGN is DMA engine alignment requirement (from times 
>>>> where there used to be one DMA engine on most devices). You likely 
>>>> want a max(CONFIG_SYS_CACHELINE_SIZE, dwc3-buffer-alignment- 
>>>> requirement) to really correctly align the buffer.
>>>
>>> It is definitely true for platforms declaring dma_alloc_coherent() 
>>> (arm, riscv, x86)
>>> except nios2 but there's 0 chance dwc3 appears on a nios2 platform.
>> There is real chance of that, because on modern SoCFPGA platforms 
>> (Agilex) you can have the FPGA content access the SoC peripherals, and 
>> one of the SoC peripherals is DWC3 controller. If anyone would 
>> actually synthesize it is another question ... but it is an FPGA, so 
>> that option exists.
> 
> Guess I'll switch to CACHELINE_SIZE instead of DMA_MINALIGN for nios2.
I think max(CONFIG_SYS_CACHELINE_SIZE, ARCH_DMA_MINALIGN) should cover 
all the cases ?
diff mbox series

Patch

diff --git a/drivers/usb/dwc3/io.h b/drivers/usb/dwc3/io.h
index 04791d4c9be..a6c2bb0f47d 100644
--- a/drivers/usb/dwc3/io.h
+++ b/drivers/usb/dwc3/io.h
@@ -20,7 +20,6 @@ 
 #include <cpu_func.h>
 #include <asm/io.h>
 
-#define	CACHELINE_SIZE		CONFIG_SYS_CACHELINE_SIZE
 static inline u32 dwc3_readl(void __iomem *base, u32 offset)
 {
 	unsigned long offs = offset - DWC3_GLOBALS_REGS_START;
@@ -50,6 +49,9 @@  static inline void dwc3_writel(void __iomem *base, u32 offset, u32 value)
 
 static inline void dwc3_flush_cache(uintptr_t addr, int length)
 {
-	flush_dcache_range(addr, addr + ROUND(length, CACHELINE_SIZE));
+	uintptr_t start_addr = (uintptr_t)addr & ~(ARCH_DMA_MINALIGN - 1);
+	uintptr_t end_addr = ALIGN((uintptr_t)addr + length, ARCH_DMA_MINALIGN);
+
+	flush_dcache_range((unsigned long)start_addr, (unsigned long)end_addr);
 }
 #endif /* __DRIVERS_USB_DWC3_IO_H */