
[net-next,2/4] net: stmmac: use correct barrier between coherent memory and MMIO

Message ID 20180226214709.4359-3-niklas.cassel@axis.com
State Accepted, archived
Delegated to: David Miller
Series stmmac barrier fixes and cleanup

Commit Message

Niklas Cassel Feb. 26, 2018, 9:47 p.m. UTC
The last memory barrier in stmmac_xmit()/stmmac_tso_xmit() is placed
between a coherent memory write and a MMIO write:

The own bit is written in First Desc (TSO: MSS desc or First Desc).
<barrier>
The DMA engine is started by a write to the tx desc tail pointer/
enable dma transmission register, i.e. a MMIO write.

This barrier cannot be a simple dma_wmb(), since a dma_wmb() is only
used to guarantee the ordering, with respect to other writes,
to cache coherent DMA memory.

To guarantee that the cache coherent memory writes have completed
before we attempt to write to the cache incoherent MMIO region,
we need to use the more heavyweight barrier wmb().

Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Pavel Machek March 2, 2018, 9:20 a.m. UTC | #1
Hi!

Thanks for doing the detective work!

> This barrier cannot be a simple dma_wmb(), since a dma_wmb() is only
> used to guarantee the ordering, with respect to other writes,
> to cache coherent DMA memory.

Could you explain this a bit more (and perhaps in code comment)?

Ensuring other writes are done before writing the "GO!" bit should be
enough, no?

(If it is not, do we need heavier barriers in other places, too?)

Best regards,
							Pavel
David Miller March 2, 2018, 2:54 p.m. UTC | #2
From: Pavel Machek <pavel@ucw.cz>
Date: Fri, 2 Mar 2018 10:20:00 +0100

>> This barrier cannot be a simple dma_wmb(), since a dma_wmb() is only
>> used to guarantee the ordering, with respect to other writes,
>> to cache coherent DMA memory.
> 
> Could you explain this a bit more (and perhaps in code comment)?
> 
> Ensuring other writes are done before writing the "GO!" bit should be
> enough, no?

Indeed, the chip should never look at the descriptor contents unless
the GO bit is set.

If there are ways that it can, this must be explained and documented
since it is quite unusual compared to other hardware.

> (If it is not, do we need heavier barriers in other places, too?)

Right.
Niklas Cassel March 2, 2018, 11:28 p.m. UTC | #3
On Fri, Mar 02, 2018 at 09:54:11AM -0500, David Miller wrote:
> From: Pavel Machek <pavel@ucw.cz>
> Date: Fri, 2 Mar 2018 10:20:00 +0100
>

Hello Pavel, David

> >> This barrier cannot be a simple dma_wmb(), since a dma_wmb() is only
> >> used to guarantee the ordering, with respect to other writes,
> >> to cache coherent DMA memory.
> > 
> > Could you explain this a bit more (and perhaps in code comment)?

Have a look at:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/memory-barriers.txt?h=v4.16-rc1#n1913

AFAICT, a dma_wmb() can only be used to guarantee that the
writes to cache coherent memory (e.g. memory allocated with
dma_alloc_coherent()) before the dma_wmb() will be performed
before the writes to cache coherent memory after the dma_wmb().

Since most of our writes are simply writing new buffer addresses
and sizes to TDES0/TDES1/TDES2/TDES3, and since these TX DMA
descriptors have been allocated with dma_alloc_coherent(),
a dma_wmb() should be enough to e.g. make sure that TDES3
(which contains the OWN bit), is written after the writes to
TDES0/TDES1/TDES2.

However, the last write we do is "DMA start transmission",
this is a register in the IP, i.e. it is a write to the cache
incoherent MMIO region (rather than a write to cache coherent memory).
To ensure that all writes to cache coherent memory have
completed before we start the DMA, we have to use the barrier
wmb() (which performs a more extensive flush compared to
dma_wmb()).

So the only place where we have to use a wmb() instead
of a dma_wmb() is where we have a write to coherent memory,
followed by a write to cache incoherent MMIO.
The only obvious place where we have this situation is
where we write the OWN bit immediately followed by a write
to the "DMA start transmission" register.

Note that this also matches how it's done in other drivers:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/amd/xgbe/xgbe-dev.c?h=v4.16-rc1#n1638

There is already a comment describing the barrier in
stmmac_xmit() and stmmac_tso_xmit() that says:
/* The own bit must be the latest setting done when prepare the
 * descriptor and then barrier is needed to make sure that
 * all is coherent before granting the DMA engine.
 */
However, if you want, we could mention wmb() explicitly in this comment.

> > 
> > Ensuring other writes are done before writing the "GO!" bit should be
> > enough, no?
> 
> Indeed, the chip should never look at the descriptor contents unless
> the GO bit is set.
> 
> If there are ways that it can, this must be explained and documented
> since it is quite unusual compared to other hardware.
> 
> > (If it is not, do we need heavier barriers in other places, too?)
> 
> Right.

I hope that my explanation above has cleared up any potential confusion.


Best regards,
Niklas
David Miller March 7, 2018, 3:32 p.m. UTC | #4
From: Niklas Cassel <niklas.cassel@axis.com>
Date: Sat, 3 Mar 2018 00:28:53 +0100

> However, the last write we do is "DMA start transmission",
> this is a register in the IP, i.e. it is a write to the cache
> incoherent MMIO region (rather than a write to cache coherent memory).
> To ensure that all writes to cache coherent memory have
> completed before we start the DMA, we have to use the barrier
> wmb() (which performs a more extensive flush compared to
> dma_wmb()).

There is an implicit memory barrier between physical memory writes
and those to MMIO register space.

So as long as you place the dma_wmb() to ensure the correct
ordering within the descriptor words, you need nothing else
after the last descriptor word write.
Niklas Cassel March 7, 2018, 5:21 p.m. UTC | #5
On Wed, Mar 07, 2018 at 10:32:26AM -0500, David Miller wrote:
> From: Niklas Cassel <niklas.cassel@axis.com>
> Date: Sat, 3 Mar 2018 00:28:53 +0100
> 
> > However, the last write we do is "DMA start transmission",
> > this is a register in the IP, i.e. it is a write to the cache
> > incoherent MMIO region (rather than a write to cache coherent memory).
> > To ensure that all writes to cache coherent memory have
> > completed before we start the DMA, we have to use the barrier
> > wmb() (which performs a more extensive flush compared to
> > dma_wmb()).
> 
> There is an implicit memory barrier between physical memory writes
> and those to MMIO register space.
> 
> So as long as you place the dma_wmb() to ensure the correct
> ordering within the descriptor words, you need nothing else
> after the last descriptor word write.

Hello David,

Looking at writel() in e.g. arch/arm/include/asm/io.h:
#define writel(v,c)             ({ __iowmb(); writel_relaxed(v,c); })
it indeed has a __iowmb() (which is defined as a wmb()) in its definition.

Is it safe to assume that this is true for all archs?

If so, perhaps the example at:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/memory-barriers.txt?h=v4.16-rc1#n1913
should be updated.

Considering this, you can drop/revert:
95eb930a40a0 ("net: stmmac: use correct barrier between coherent memory and MMIO")
or perhaps you want me to send a revert?

After reverting 95eb930a40a0, we will still have a dma_wmb() _after_ the
last descriptor word write. You just explained that nothing else is needed
after the last descriptor word write, so I actually think that this last
barrier is superfluous.


Best regards,
Niklas
David Miller March 7, 2018, 5:42 p.m. UTC | #6
From: Niklas Cassel <niklas.cassel@axis.com>
Date: Wed, 7 Mar 2018 18:21:57 +0100

> Considering this, you can drop/revert:
> 95eb930a40a0 ("net: stmmac: use correct barrier between coherent memory and MMIO")
> or perhaps you want me to send a revert?

You must submit explicit patches to do a revert or any other change.

> After reverting 95eb930a40a0, we will still have a dma_wmb() _after_ the
> last descriptor word write. You just explained that nothing else is needed
> after the last descriptor word write, so I actually think that this last
> barrier is superfluous.

You don't need one after the last descriptor write.

Look, you're only concerned with ordering within the descriptor writes.

So it's only about:

	desc->a = x;

	/* Write to 'a' must be visible to the hardware before 'b'. */
	dma_wmb();
	desc->b = y;

	writel();

That's all that you need.
Niklas Cassel March 7, 2018, 6:09 p.m. UTC | #7
On Wed, Mar 07, 2018 at 12:42:49PM -0500, David Miller wrote:
> From: Niklas Cassel <niklas.cassel@axis.com>
> Date: Wed, 7 Mar 2018 18:21:57 +0100
> 
> > Considering this, you can drop/revert:
> > 95eb930a40a0 ("net: stmmac: use correct barrier between coherent memory and MMIO")
> > or perhaps you want me to send a revert?
> 
> You must submit explicit patches to do a revert or any other change.
> 
> > After reverting 95eb930a40a0, we will still have a dma_wmb() _after_ the
> > last descriptor word write. You just explained that nothing else is needed
> > after the last descriptor word write, so I actually think that this last
> > barrier is superfluous.
> 
> You don't need one after the last descriptor write.
> 
> Look, you're only concerned with ordering within the descriptor writes.
> 
> So it's only about:
> 
> 	desc->a = x;
> 
> 	/* Write to 'a' must be visible to the hardware before 'b'. */
> 	dma_wmb();
> 	desc->b = y;
> 
> 	writel();
> 
> That's all that you need.

It seems like the first commit that added a wmb()
after set_tx_owner() on the first desc (which is the absolute last memory write)
was the following commit:

commit 8e83989106562326bfd6aaf92174fe138efd026b
Author: Deepak Sikri <deepak.sikri@st.com>
Date:   Sun Jul 8 21:14:45 2012 +0000

    stmmac: Fix for nfs hang on multiple reboot
    
    It was observed that during multiple reboots nfs hangs. The status of
    receive descriptors shows that all the descriptors were in control of
    CPU, and none were assigned to DMA.
    Also the DMA status register confirmed that the Rx buffer is
    unavailable.
    
    This patch adds the fix for the same by adding the memory barriers to
    ascertain that the all instructions before enabling the Rx or Tx DMA are
    completed which involves the proper setting of the ownership bit in DMA
    descriptors.
    
    Signed-off-by: Deepak Sikri <deepak.sikri@st.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 51b3b68528ee..ea3003edde18 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1212,6 +1212,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
                priv->hw->desc->prepare_tx_desc(desc, 0, len, csum_insertion);
                wmb();
                priv->hw->desc->set_tx_owner(desc);
+               wmb();
        }
 
        /* Interrupt on completition only for the latest segment */
@@ -1227,6 +1228,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 
        /* To avoid raise condition */
        priv->hw->desc->set_tx_owner(first);
+       wmb();
 
        priv->cur_tx++;
 
@@ -1290,6 +1292,7 @@ static inline void stmmac_rx_refill(struct stmmac_priv *priv)
                }
                wmb();
                priv->hw->desc->set_rx_owner(p + entry);
+               wmb();
        }
 }


The first wmb() is bogus, since we don't need any wmb() before
or after setting the own bit on fragments.

The second wmb() is performed after setting the own bit on the first desc
(something that is always done last). This also seems bogus, since there
already was a wmb() just before set_tx_owner(first), and a writel() is
performed shortly after.

The last wmb(), after set_rx_owner(), might actually be needed,
since the commit message refered to problems with RX, and I don't
see any writel() being performed after this.

It is worth noting that the last barrier was changed to a dma_wmb()
by Pavel in: ad688cdbb076 ("stmmac: fix memory barriers").


TL;DR:
I will send a patch that removes the barriers performed after the
last descriptor write for TX, since a writel() is performed
shortly after.

However for RX, since this barrier is performed after setting
the own bit, and there is no writel() performed shortly after,
I don't know if this should be a dma_wmb() or has to be a wmb().


Best regards,
Niklas
Pavel Machek March 8, 2018, 9:05 a.m. UTC | #8
On Wed 2018-03-07 12:42:49, David Miller wrote:
> From: Niklas Cassel <niklas.cassel@axis.com>
> Date: Wed, 7 Mar 2018 18:21:57 +0100
> 
> > Considering this, you can drop/revert:
> > 95eb930a40a0 ("net: stmmac: use correct barrier between coherent memory and MMIO")
> > or perhaps you want me to send a revert?
> 
> You must submit explicit patches to do a revert or any other change.
> 
> > After reverting 95eb930a40a0, we will still have a dma_wmb() _after_ the
> > last descriptor word write. You just explained that nothing else is needed
> > after the last descriptor word write, so I actually think that this last
> > barrier is superfluous.
> 
> You don't need one after the last descriptor write.
> 
> Look, you're only concerned with ordering within the descriptor writes.
> 
> So it's only about:
> 
> 	desc->a = x;
> 
> 	/* Write to 'a' must be visible to the hardware before 'b'. */
> 	dma_wmb();
> 	desc->b = y;
> 
> 	writel();
> 
> That's all that you need.

We may need to fix the docs then, there's wmb() there in the docs:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/memory-barriers.txt?h=v4.16-rc1#n1913

									Pavel

Patch

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 3b5e7b06e796..6dd04f237b2a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2997,7 +2997,7 @@  static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
 	 * descriptor and then barrier is needed to make sure that
 	 * all is coherent before granting the DMA engine.
 	 */
-	dma_wmb();
+	wmb();
 
 	if (netif_msg_pktdata(priv)) {
 		pr_info("%s: curr=%d dirty=%d f=%d, e=%d, f_p=%p, nfrags %d\n",
@@ -3221,7 +3221,7 @@  static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 		 * descriptor and then barrier is needed to make sure that
 		 * all is coherent before granting the DMA engine.
 		 */
-		dma_wmb();
+		wmb();
 	}
 
 	netdev_tx_sent_queue(netdev_get_tx_queue(dev, queue), skb->len);