diff mbox

[U-Boot,PATH,V2] usb: dwc2: invalidate the dcache before starting the DMA

Message ID 20170406020328.25191-1-eddie.cai.linux@gmail.com
State Superseded
Delegated to: Marek Vasut
Headers show

Commit Message

Eddie Cai April 6, 2017, 2:03 a.m. UTC
We should invalidate the dcache before starting the DMA. In case there are
any dirty lines from the DMA buffer in the cache, subsequent cache-line
replacements may corrupt the buffer in memory while the DMA is still going on.
Cache-line replacement can happen if the CPU tries to bring some other memory
locations into the cache while the DMA is going on.

Signed-off-by: Eddie Cai <eddie.cai.linux@gmail.com>
Reviewed-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
---
 drivers/usb/host/dwc2.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

Comments

Marek Vasut April 6, 2017, 2:14 a.m. UTC | #1
On 04/06/2017 04:03 AM, Eddie Cai wrote:
> We should invalidate the dcache before starting the DMA. In case there are
> any dirty lines from the DMA buffer in the cache, subsequent cache-line
> replacements may corrupt the buffer in memory while the DMA is still going on.
> Cache-line replacement can happen if the CPU tries to bring some other memory
> locations into the cache while the DMA is going on.
> 
> Signed-off-by: Eddie Cai <eddie.cai.linux@gmail.com>
> Reviewed-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
> ---
>  drivers/usb/host/dwc2.c | 19 +++++++++++++------
>  1 file changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/usb/host/dwc2.c b/drivers/usb/host/dwc2.c
> index 5ac602e..e3a5d7d 100644
> --- a/drivers/usb/host/dwc2.c
> +++ b/drivers/usb/host/dwc2.c
> @@ -811,12 +811,19 @@ static int transfer_chunk(struct dwc2_hc_regs *hc_regs, void *aligned_buffer,
>  	       (*pid << DWC2_HCTSIZ_PID_OFFSET),
>  	       &hc_regs->hctsiz);
>  
> -	if (!in && xfer_len) {
> -		memcpy(aligned_buffer, buffer, xfer_len);
> -
> -		flush_dcache_range((unsigned long)aligned_buffer,
> -				   (unsigned long)aligned_buffer +
> -				   roundup(xfer_len, ARCH_DMA_MINALIGN));
> +	if (xfer_len) {
> +		if (in) {
> +			invalidate_dcache_range(
> +					(unsigned long)aligned_buffer,
> +					(unsigned long)aligned_buffer +

Should be uintptr_t all over the place to deal with 32/64 bit systems,
otherwise great, thanks.

> +					roundup(xfer_len, ARCH_DMA_MINALIGN));
> +		} else {
> +			memcpy(aligned_buffer, buffer, xfer_len);
> +			flush_dcache_range(
> +					(unsigned long)aligned_buffer,
> +					(unsigned long)aligned_buffer +
> +					roundup(xfer_len, ARCH_DMA_MINALIGN));
> +		}
>  	}
>  
>  	writel(phys_to_bus((unsigned long)aligned_buffer), &hc_regs->hcdma);
>
Kever Yang April 6, 2017, 8:34 a.m. UTC | #2
Hi Eddie,


On 04/06/2017 10:14 AM, Marek Vasut wrote:
> On 04/06/2017 04:03 AM, Eddie Cai wrote:
>> We should invalidate the dcache before starting the DMA. In case there are
>> any dirty lines from the DMA buffer in the cache, subsequent cache-line
>> replacements may corrupt the buffer in memory while the DMA is still going on.
>> Cache-line replacement can happen if the CPU tries to bring some other memory
>> locations into the cache while the DMA is going on.
>>
>> Signed-off-by: Eddie Cai <eddie.cai.linux@gmail.com>
>> Reviewed-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
>> ---
>>   drivers/usb/host/dwc2.c | 19 +++++++++++++------
>>   1 file changed, 13 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/usb/host/dwc2.c b/drivers/usb/host/dwc2.c
>> index 5ac602e..e3a5d7d 100644
>> --- a/drivers/usb/host/dwc2.c
>> +++ b/drivers/usb/host/dwc2.c
>> @@ -811,12 +811,19 @@ static int transfer_chunk(struct dwc2_hc_regs *hc_regs, void *aligned_buffer,
>>   	       (*pid << DWC2_HCTSIZ_PID_OFFSET),
>>   	       &hc_regs->hctsiz);
>>   
>> -	if (!in && xfer_len) {
>> -		memcpy(aligned_buffer, buffer, xfer_len);
>> -
>> -		flush_dcache_range((unsigned long)aligned_buffer,
>> -				   (unsigned long)aligned_buffer +
>> -				   roundup(xfer_len, ARCH_DMA_MINALIGN));
>> +	if (xfer_len) {
>> +		if (in) {
>> +			invalidate_dcache_range(
>> +					(unsigned long)aligned_buffer,
>> +					(unsigned long)aligned_buffer +
> Should be uintptr_t all over the place to deal with 32/64 bit systems,
> otherwise great, thanks.
>
>> +					roundup(xfer_len, ARCH_DMA_MINALIGN));
>> +		} else {
>> +			memcpy(aligned_buffer, buffer, xfer_len);
>> +			flush_dcache_range(
>> +					(unsigned long)aligned_buffer,
>> +					(unsigned long)aligned_buffer +
>> +					roundup(xfer_len, ARCH_DMA_MINALIGN));
>> +		}
>>   	}
>>   
>>   	writel(phys_to_bus((unsigned long)aligned_buffer), &hc_regs->hcdma);
>>

Did you test with remove another invalidate_dcache_range() in this function?
I still think no need to have 2 invalidate_dcache in this function.

Thanks,
- Kever
>
Marek Vasut April 6, 2017, 10:20 a.m. UTC | #3
On 04/06/2017 10:34 AM, Kever Yang wrote:
> Hi Eddie,
> 
> 
> On 04/06/2017 10:14 AM, Marek Vasut wrote:
>> On 04/06/2017 04:03 AM, Eddie Cai wrote:
>>> We should invalidate the dcache before starting the DMA. In case
>>> there are
>>> any dirty lines from the DMA buffer in the cache, subsequent cache-line
>>> replacements may corrupt the buffer in memory while the DMA is still
>>> going on.
>>> Cache-line replacement can happen if the CPU tries to bring some
>>> other memory
>>> locations into the cache while the DMA is going on.
>>>
>>> Signed-off-by: Eddie Cai <eddie.cai.linux@gmail.com>
>>> Reviewed-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
>>> ---
>>>   drivers/usb/host/dwc2.c | 19 +++++++++++++------
>>>   1 file changed, 13 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/usb/host/dwc2.c b/drivers/usb/host/dwc2.c
>>> index 5ac602e..e3a5d7d 100644
>>> --- a/drivers/usb/host/dwc2.c
>>> +++ b/drivers/usb/host/dwc2.c
>>> @@ -811,12 +811,19 @@ static int transfer_chunk(struct dwc2_hc_regs
>>> *hc_regs, void *aligned_buffer,
>>>              (*pid << DWC2_HCTSIZ_PID_OFFSET),
>>>              &hc_regs->hctsiz);
>>>   -    if (!in && xfer_len) {
>>> -        memcpy(aligned_buffer, buffer, xfer_len);
>>> -
>>> -        flush_dcache_range((unsigned long)aligned_buffer,
>>> -                   (unsigned long)aligned_buffer +
>>> -                   roundup(xfer_len, ARCH_DMA_MINALIGN));
>>> +    if (xfer_len) {
>>> +        if (in) {
>>> +            invalidate_dcache_range(
>>> +                    (unsigned long)aligned_buffer,
>>> +                    (unsigned long)aligned_buffer +
>> Should be uintptr_t all over the place to deal with 32/64 bit systems,
>> otherwise great, thanks.
>>
>>> +                    roundup(xfer_len, ARCH_DMA_MINALIGN));
>>> +        } else {
>>> +            memcpy(aligned_buffer, buffer, xfer_len);
>>> +            flush_dcache_range(
>>> +                    (unsigned long)aligned_buffer,
>>> +                    (unsigned long)aligned_buffer +
>>> +                    roundup(xfer_len, ARCH_DMA_MINALIGN));
>>> +        }
>>>       }
>>>         writel(phys_to_bus((unsigned long)aligned_buffer),
>>> &hc_regs->hcdma);
>>>
> 
> Did you test with remove another invalidate_dcache_range() in this
> function?
> I still think no need to have 2 invalidate_dcache in this function.

You need the first invalidate to prevent having data stuck in the cache
and accidentally being evicted into the RAM while the USB DMA is also
running , thus creating a total mess of the result.
Kever Yang April 18, 2017, 3:55 a.m. UTC | #4
Hi Marek,

On 04/06/2017 06:20 PM, Marek Vasut wrote:
> On 04/06/2017 10:34 AM, Kever Yang wrote:
>> Hi Eddie,
>>
>>
>> On 04/06/2017 10:14 AM, Marek Vasut wrote:
>>> On 04/06/2017 04:03 AM, Eddie Cai wrote:
>>>> We should invalidate the dcache before starting the DMA. In case
>>>> there are
>>>> any dirty lines from the DMA buffer in the cache, subsequent cache-line
>>>> replacements may corrupt the buffer in memory while the DMA is still
>>>> going on.
>>>> Cache-line replacement can happen if the CPU tries to bring some
>>>> other memory
>>>> locations into the cache while the DMA is going on.
>>>>
>>>> Signed-off-by: Eddie Cai <eddie.cai.linux@gmail.com>
>>>> Reviewed-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
>>>> ---
>>>>    drivers/usb/host/dwc2.c | 19 +++++++++++++------
>>>>    1 file changed, 13 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/drivers/usb/host/dwc2.c b/drivers/usb/host/dwc2.c
>>>> index 5ac602e..e3a5d7d 100644
>>>> --- a/drivers/usb/host/dwc2.c
>>>> +++ b/drivers/usb/host/dwc2.c
>>>> @@ -811,12 +811,19 @@ static int transfer_chunk(struct dwc2_hc_regs
>>>> *hc_regs, void *aligned_buffer,
>>>>               (*pid << DWC2_HCTSIZ_PID_OFFSET),
>>>>               &hc_regs->hctsiz);
>>>>    -    if (!in && xfer_len) {
>>>> -        memcpy(aligned_buffer, buffer, xfer_len);
>>>> -
>>>> -        flush_dcache_range((unsigned long)aligned_buffer,
>>>> -                   (unsigned long)aligned_buffer +
>>>> -                   roundup(xfer_len, ARCH_DMA_MINALIGN));
>>>> +    if (xfer_len) {
>>>> +        if (in) {
>>>> +            invalidate_dcache_range(
>>>> +                    (unsigned long)aligned_buffer,
>>>> +                    (unsigned long)aligned_buffer +
>>> Should be uintptr_t all over the place to deal with 32/64 bit systems,
>>> otherwise great, thanks.
>>>
>>>> +                    roundup(xfer_len, ARCH_DMA_MINALIGN));
>>>> +        } else {
>>>> +            memcpy(aligned_buffer, buffer, xfer_len);
>>>> +            flush_dcache_range(
>>>> +                    (unsigned long)aligned_buffer,
>>>> +                    (unsigned long)aligned_buffer +
>>>> +                    roundup(xfer_len, ARCH_DMA_MINALIGN));
>>>> +        }
>>>>        }
>>>>          writel(phys_to_bus((unsigned long)aligned_buffer),
>>>> &hc_regs->hcdma);
>>>>
>> Did you test with remove another invalidate_dcache_range() in this
>> function?
>> I still think no need to have 2 invalidate_dcache in this function.
> You need the first invalidate to prevent having data stuck in the cache
> and accidentally being evicted into the RAM while the USB DMA is also
> running , thus creating a total mess of the result.
>
>

Yes, I'm very clear about what happen here, because it's me report
the origin bug.
What I mean is the fix should be _move_ the invalidate_dcache_range() 
operation
from "after start DMA transfer" to "before DMA transter" for in 
transfer, but not
_add_ a invalidate_dcache_range() operation.

If you apply this patch, then there will be two 
invalidate_dcache_range() operation
for in transfer, and the second one is no need, because CPU should not 
have any
access to the buffer during dma start and done.

Thanks,
- Kever
Eddie Cai April 18, 2017, 10:56 a.m. UTC | #5
Hi

2017-04-18 11:55 GMT+08:00 Kever Yang <kever.yang@rock-chips.com>:

> Hi Marek,
>
>
> On 04/06/2017 06:20 PM, Marek Vasut wrote:
>
>> On 04/06/2017 10:34 AM, Kever Yang wrote:
>>
>>> Hi Eddie,
>>>
>>>
>>> On 04/06/2017 10:14 AM, Marek Vasut wrote:
>>>
>>>> On 04/06/2017 04:03 AM, Eddie Cai wrote:
>>>>
>>>>> We should invalidate the dcache before starting the DMA. In case
>>>>> there are
>>>>> any dirty lines from the DMA buffer in the cache, subsequent cache-line
>>>>> replacements may corrupt the buffer in memory while the DMA is still
>>>>> going on.
>>>>> Cache-line replacement can happen if the CPU tries to bring some
>>>>> other memory
>>>>> locations into the cache while the DMA is going on.
>>>>>
>>>>> Signed-off-by: Eddie Cai <eddie.cai.linux@gmail.com>
>>>>> Reviewed-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
>>>>> ---
>>>>>    drivers/usb/host/dwc2.c | 19 +++++++++++++------
>>>>>    1 file changed, 13 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/drivers/usb/host/dwc2.c b/drivers/usb/host/dwc2.c
>>>>> index 5ac602e..e3a5d7d 100644
>>>>> --- a/drivers/usb/host/dwc2.c
>>>>> +++ b/drivers/usb/host/dwc2.c
>>>>> @@ -811,12 +811,19 @@ static int transfer_chunk(struct dwc2_hc_regs
>>>>> *hc_regs, void *aligned_buffer,
>>>>>               (*pid << DWC2_HCTSIZ_PID_OFFSET),
>>>>>               &hc_regs->hctsiz);
>>>>>    -    if (!in && xfer_len) {
>>>>> -        memcpy(aligned_buffer, buffer, xfer_len);
>>>>> -
>>>>> -        flush_dcache_range((unsigned long)aligned_buffer,
>>>>> -                   (unsigned long)aligned_buffer +
>>>>> -                   roundup(xfer_len, ARCH_DMA_MINALIGN));
>>>>> +    if (xfer_len) {
>>>>> +        if (in) {
>>>>> +            invalidate_dcache_range(
>>>>> +                    (unsigned long)aligned_buffer,
>>>>> +                    (unsigned long)aligned_buffer +
>>>>>
>>>> Should be uintptr_t all over the place to deal with 32/64 bit systems,
>>>> otherwise great, thanks.
>>>>
>>>> +                    roundup(xfer_len, ARCH_DMA_MINALIGN));
>>>>> +        } else {
>>>>> +            memcpy(aligned_buffer, buffer, xfer_len);
>>>>> +            flush_dcache_range(
>>>>> +                    (unsigned long)aligned_buffer,
>>>>> +                    (unsigned long)aligned_buffer +
>>>>> +                    roundup(xfer_len, ARCH_DMA_MINALIGN));
>>>>> +        }
>>>>>        }
>>>>>          writel(phys_to_bus((unsigned long)aligned_buffer),
>>>>> &hc_regs->hcdma);
>>>>>
>>>>> Did you test with remove another invalidate_dcache_range() in this
>>> function?
>>> I still think no need to have 2 invalidate_dcache in this function.
>>>
>> You need the first invalidate to prevent having data stuck in the cache
>> and accidentally being evicted into the RAM while the USB DMA is also
>> running , thus creating a total mess of the result.
>>
>>
>>
> Yes, I'm very clear about what happen here, because it's me report
> the origin bug.
> What I mean is the fix should be _move_ the invalidate_dcache_range()
> operation
> from "after start DMA transfer" to "before DMA transter" for in transfer,
> but not
> _add_ a invalidate_dcache_range() operation.
>

I follow doc/README.arm-caches. I says
Peripheral to Memory DMA:

- Invalidate the buffer before starting the DMA. In case there are any
dirty
  lines from the DMA buffer in the cache, subsequent cache-line
replacements
  may corrupt the buffer in memory while the DMA is still going on.
Cache-line
  replacement can happen if the CPU tries to bring some other memory
locations
  into the cache while the DMA is going on.

- Invalidate the buffer after the DMA is complete and before the MPU reads

  it. This may be needed in addition to the invalidation before the DMA

  mentioned above, because in some processors memory contents can
spontaneously
  come to the cache due to speculative memory access by the CPU. If this

  happens with the DMA buffer while DMA is going on we have a coherency
problem.
I guess that means we need to Invalidate the buffer before and after the DMA
Thanks
Eddie



> If you apply this patch, then there will be two invalidate_dcache_range()
> operation
> for in transfer, and the second one is no need, because CPU should not
> have any
> access to the buffer during dma start and done.
>
> Thanks,
> - Kever
>
>
diff mbox

Patch

diff --git a/drivers/usb/host/dwc2.c b/drivers/usb/host/dwc2.c
index 5ac602e..e3a5d7d 100644
--- a/drivers/usb/host/dwc2.c
+++ b/drivers/usb/host/dwc2.c
@@ -811,12 +811,19 @@  static int transfer_chunk(struct dwc2_hc_regs *hc_regs, void *aligned_buffer,
 	       (*pid << DWC2_HCTSIZ_PID_OFFSET),
 	       &hc_regs->hctsiz);
 
-	if (!in && xfer_len) {
-		memcpy(aligned_buffer, buffer, xfer_len);
-
-		flush_dcache_range((unsigned long)aligned_buffer,
-				   (unsigned long)aligned_buffer +
-				   roundup(xfer_len, ARCH_DMA_MINALIGN));
+	if (xfer_len) {
+		if (in) {
+			invalidate_dcache_range(
+					(unsigned long)aligned_buffer,
+					(unsigned long)aligned_buffer +
+					roundup(xfer_len, ARCH_DMA_MINALIGN));
+		} else {
+			memcpy(aligned_buffer, buffer, xfer_len);
+			flush_dcache_range(
+					(unsigned long)aligned_buffer,
+					(unsigned long)aligned_buffer +
+					roundup(xfer_len, ARCH_DMA_MINALIGN));
+		}
 	}
 
 	writel(phys_to_bus((unsigned long)aligned_buffer), &hc_regs->hcdma);