Message ID: 20180226214709.4359-3-niklas.cassel@axis.com
State: Accepted, archived
Delegated to: David Miller
Series: stmmac barrier fixes and cleanup
Hi!

Thanks for doing the detective work!

> This barrier cannot be a simple dma_wmb(), since a dma_wmb() is only
> used to guarantee the ordering, with respect to other writes,
> to cache coherent DMA memory.

Could you explain this a bit more (and perhaps in a code comment)?

Ensuring other writes are done before writing the "GO!" bit should be
enough, no?

(If it is not, do we need heavier barriers in other places, too?)

Best regards,
Pavel
From: Pavel Machek <pavel@ucw.cz>
Date: Fri, 2 Mar 2018 10:20:00 +0100

>> This barrier cannot be a simple dma_wmb(), since a dma_wmb() is only
>> used to guarantee the ordering, with respect to other writes,
>> to cache coherent DMA memory.
>
> Could you explain this a bit more (and perhaps in code comment)?
>
> Ensuring other writes are done before writing the "GO!" bit should be
> enough, no?

Indeed, the chip should never look at the descriptor contents unless
the GO bit is set.

If there are ways that it can, this must be explained and documented
since it is quite unusual compared to other hardware.

> (If it is not, do we need heavier barriers in other places, too?)

Right.
On Fri, Mar 02, 2018 at 09:54:11AM -0500, David Miller wrote:
> From: Pavel Machek <pavel@ucw.cz>
> Date: Fri, 2 Mar 2018 10:20:00 +0100

Hello Pavel, David

> >> This barrier cannot be a simple dma_wmb(), since a dma_wmb() is only
> >> used to guarantee the ordering, with respect to other writes,
> >> to cache coherent DMA memory.
> >
> > Could you explain this a bit more (and perhaps in code comment)?

Have a look at:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/memory-barriers.txt?h=v4.16-rc1#n1913

AFAICT, a dma_wmb() can only be used to guarantee that the writes to
cache coherent memory (e.g. memory allocated with dma_alloc_coherent())
before the dma_wmb() will be performed before the writes to cache
coherent memory after the dma_wmb().

Since most of our writes are simply writing new buffer addresses and
sizes to TDES0/TDES1/TDES2/TDES3, and since these TX DMA descriptors
have been allocated with dma_alloc_coherent(), a dma_wmb() should be
enough to e.g. make sure that TDES3 (which contains the OWN bit) is
written after the writes to TDES0/TDES1/TDES2.

However, the last write we do is "DMA start transmission"; this is a
register in the IP, i.e. it is a write to the cache incoherent MMIO
region (rather than a write to cache coherent memory). To ensure that
all writes to cache coherent memory have completed before we start the
DMA, we have to use the barrier wmb() (which performs a more extensive
flush compared to dma_wmb()).

So the only place where we have to use a wmb() instead of a dma_wmb()
is where we have a write to coherent memory followed by a write to
cache incoherent MMIO. The only obvious place where we have this
situation is where we write the OWN bit, immediately followed by a
write to the "DMA start transmission" register.
Note that this also matches how it's done in other drivers:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/amd/xgbe/xgbe-dev.c?h=v4.16-rc1#n1638

There is already a comment describing the barrier in stmmac_xmit() and
stmmac_tso_xmit() that says:

/* The own bit must be the latest setting done when prepare the
 * descriptor and then barrier is needed to make sure that
 * all is coherent before granting the DMA engine.
 */

However, if you want, we could mention wmb() explicitly in this comment.

> > Ensuring other writes are done before writing the "GO!" bit should be
> > enough, no?
>
> Indeed, the chip should never look at the descriptor contents unless
> the GO bit is set.
>
> If there are ways that it can, this must be explained and documented
> since it is quite unusual compared to other hardware.
>
> > (If it is not, do we need heavier barriers in other places, too?)
>
> Right.

I hope that my explanation above has cleared up any potential confusion.

Best regards,
Niklas
From: Niklas Cassel <niklas.cassel@axis.com>
Date: Sat, 3 Mar 2018 00:28:53 +0100

> However, the last write we do is "DMA start transmission",
> this is a register in the IP, i.e. it is a write to the cache
> incoherent MMIO region (rather than a write to cache coherent memory).
> To ensure that all writes to cache coherent memory have
> completed before we start the DMA, we have to use the barrier
> wmb() (which performs a more extensive flush compared to
> dma_wmb()).

There is an implicit memory barrier between physical memory writes
and those to MMIO register space.

So as long as you place the dma_wmb() to ensure the correct
ordering within the descriptor words, you need nothing else
after the last descriptor word write.
On Wed, Mar 07, 2018 at 10:32:26AM -0500, David Miller wrote:
> From: Niklas Cassel <niklas.cassel@axis.com>
> Date: Sat, 3 Mar 2018 00:28:53 +0100
>
> > However, the last write we do is "DMA start transmission",
> > this is a register in the IP, i.e. it is a write to the cache
> > incoherent MMIO region (rather than a write to cache coherent memory).
> > To ensure that all writes to cache coherent memory have
> > completed before we start the DMA, we have to use the barrier
> > wmb() (which performs a more extensive flush compared to
> > dma_wmb()).
>
> There is an implicit memory barrier between physical memory writes
> and those to MMIO register space.
>
> So as long as you place the dma_wmb() to ensure the correct
> ordering within the descriptor words, you need nothing else
> after the last descriptor word write.

Hello David,

Looking at writel() in e.g. arch/arm/include/asm/io.h:

#define writel(v,c) ({ __iowmb(); writel_relaxed(v,c); })

it indeed has an __iowmb() (which is defined as a wmb()) in its
definition.

Is it safe to assume that this is true for all archs? If so, perhaps
the example at:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/memory-barriers.txt?h=v4.16-rc1#n1913
should be updated.

Considering this, you can drop/revert:
95eb930a40a0 ("net: stmmac: use correct barrier between coherent memory and MMIO")
or perhaps you want me to send a revert?

After reverting 95eb930a40a0, we will still have a dma_wmb() _after_
the last descriptor word write. You just explained that nothing else
is needed after the last descriptor word write, so I actually think
that this last barrier is superfluous.

Best regards,
Niklas
From: Niklas Cassel <niklas.cassel@axis.com>
Date: Wed, 7 Mar 2018 18:21:57 +0100

> Considering this, you can drop/revert:
> 95eb930a40a0 ("net: stmmac: use correct barrier between coherent memory and MMIO")
> or perhaps you want me to send a revert?

You must submit explicit patches to do a revert or any other change.

> After reverting 95eb930a40a0, we will still have a dma_wmb() _after_ the
> last descriptor word write. You just explained that nothing else is needed
> after the last descriptor word write, so I actually think that this last
> barrier is superfluous.

You don't need one after the last descriptor write.

Look, you're only concerned with ordering within the descriptor writes.

So it's only about:

	desc->a = x;

	/* Write to 'a' must be visible to the hardware before 'b'. */
	dma_wmb();
	desc->b = y;

	writel();

That's all that you need.
On Wed, Mar 07, 2018 at 12:42:49PM -0500, David Miller wrote:
> From: Niklas Cassel <niklas.cassel@axis.com>
> Date: Wed, 7 Mar 2018 18:21:57 +0100
>
> > Considering this, you can drop/revert:
> > 95eb930a40a0 ("net: stmmac: use correct barrier between coherent memory and MMIO")
> > or perhaps you want me to send a revert?
>
> You must submit explicit patches to do a revert or any other change.
>
> > After reverting 95eb930a40a0, we will still have a dma_wmb() _after_ the
> > last descriptor word write. You just explained that nothing else is needed
> > after the last descriptor word write, so I actually think that this last
> > barrier is superfluous.
>
> You don't need one after the last descriptor write.
>
> Look, you're only concerned with ordering within the descriptor writes.
>
> So it's only about:
>
> 	desc->a = x;
>
> 	/* Write to 'a' must be visible to the hardware before 'b'. */
> 	dma_wmb();
> 	desc->b = y;
>
> 	writel();
>
> That's all that you need.

It seems like the first commit that added a wmb() after set_tx_owner()
on the first desc (which is the absolute last mem write) was the
following commit:

commit 8e83989106562326bfd6aaf92174fe138efd026b
Author: Deepak Sikri <deepak.sikri@st.com>
Date:   Sun Jul 8 21:14:45 2012 +0000

    stmmac: Fix for nfs hang on multiple reboot

    It was observed that during multiple reboots nfs hangs. The status
    of receive descriptors shows that all the descriptors were in
    control of CPU, and none were assigned to DMA. Also the DMA status
    register confirmed that the Rx buffer is unavailable.

    This patch adds the fix for the same by adding the memory barriers
    to ascertain that the all instructions before enabling the Rx or Tx
    DMA are completed which involves the proper setting of the
    ownership bit in DMA descriptors.

    Signed-off-by: Deepak Sikri <deepak.sikri@st.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 51b3b68528ee..ea3003edde18 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1212,6 +1212,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 		priv->hw->desc->prepare_tx_desc(desc, 0, len, csum_insertion);
 		wmb();
 		priv->hw->desc->set_tx_owner(desc);
+		wmb();
 	}
 
 	/* Interrupt on completition only for the latest segment */
@@ -1227,6 +1228,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	/* To avoid raise condition */
 	priv->hw->desc->set_tx_owner(first);
+	wmb();
 
 	priv->cur_tx++;
 
@@ -1290,6 +1292,7 @@ static inline void stmmac_rx_refill(struct stmmac_priv *priv)
 		}
 		wmb();
 		priv->hw->desc->set_rx_owner(p + entry);
+		wmb();
 	}
 }

The first wmb() is bogus, since we don't need any wmb() before or
after setting the own bit on fragments.

The second wmb() is performed after setting the own bit on the first
desc (something that is always done last). This also seems bogus,
since there already was a wmb() just before set_tx_owner(first), and a
writel() is performed shortly after.

The last wmb(), after set_rx_owner(), might actually be needed, since
the commit message referred to problems with RX, and I don't see any
writel() being performed after this. It is worth noting that the last
barrier was changed to a dma_wmb() by Pavel in:
ad688cdbb076 ("stmmac: fix memory barriers").

TL;DR: I will send a patch that removes the barriers performed after
the last descriptor write for TX, since a writel() is performed
shortly after. However, for RX, since this barrier is performed after
setting the own bit, and there is no writel() performed shortly after,
I don't know if this should be a dma_wmb() or has to be a wmb().

Best regards,
Niklas
On Wed 2018-03-07 12:42:49, David Miller wrote:
> From: Niklas Cassel <niklas.cassel@axis.com>
> Date: Wed, 7 Mar 2018 18:21:57 +0100
>
> > Considering this, you can drop/revert:
> > 95eb930a40a0 ("net: stmmac: use correct barrier between coherent memory and MMIO")
> > or perhaps you want me to send a revert?
>
> You must submit explicit patches to do a revert or any other change.
>
> > After reverting 95eb930a40a0, we will still have a dma_wmb() _after_ the
> > last descriptor word write. You just explained that nothing else is needed
> > after the last descriptor word write, so I actually think that this last
> > barrier is superfluous.
>
> You don't need one after the last descriptor write.
>
> Look, you're only concerned with ordering within the descriptor writes.
>
> So it's only about:
>
> 	desc->a = x;
>
> 	/* Write to 'a' must be visible to the hardware before 'b'. */
> 	dma_wmb();
> 	desc->b = y;
>
> 	writel();
>
> That's all that you need.

We may need to fix the docs then; there's a wmb() there in the docs:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/memory-barriers.txt?h=v4.16-rc1#n1913

Pavel
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 3b5e7b06e796..6dd04f237b2a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2997,7 +2997,7 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
 	 * descriptor and then barrier is needed to make sure that
 	 * all is coherent before granting the DMA engine.
 	 */
-	dma_wmb();
+	wmb();
 
 	if (netif_msg_pktdata(priv)) {
 		pr_info("%s: curr=%d dirty=%d f=%d, e=%d, f_p=%p, nfrags %d\n",
@@ -3221,7 +3221,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 		 * descriptor and then barrier is needed to make sure that
 		 * all is coherent before granting the DMA engine.
 		 */
-		dma_wmb();
+		wmb();
 	}
 
 	netdev_tx_sent_queue(netdev_get_tx_queue(dev, queue), skb->len);
The last memory barrier in stmmac_xmit()/stmmac_tso_xmit() is placed
between a coherent memory write and a MMIO write:

The own bit is written in First Desc (TSO: MSS desc or First Desc).
<barrier>
The DMA engine is started by a write to the tx desc tail pointer/
enable dma transmission register, i.e. a MMIO write.

This barrier cannot be a simple dma_wmb(), since a dma_wmb() is only
used to guarantee the ordering, with respect to other writes,
to cache coherent DMA memory.

To guarantee that the cache coherent memory writes have completed
before we attempt to write to the cache incoherent MMIO region,
we need to use the more heavyweight barrier wmb().

Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)