diff mbox

[net,3/3] r8169: increase the lifespan of the hardware counters dump area.

Message ID 20150907144412.GA26524@calimero.vinschen.de
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Corinna Vinschen Sept. 7, 2015, 2:44 p.m. UTC
On Sep  6 12:37, Corinna Vinschen wrote:
> On Sep  5 14:18, romieu@fr.zoreil.com wrote:
> > From: Francois Romieu <romieu@fr.zoreil.com>
> > 
> > net/core/net-sysfs.c::netstat_show retrieves stats with spinlock held.
> > 
> > This change avoids sleepable allocation and performs some housekeeping:
> > - receive ring, transmit ring and counters dump area allocation failures
> >   are all considered fatal during open()
> > - netif_warn is now redundant with rtl_reset_counters_cond built-in
> >   failure message.
> > - rtl_reset_counters_cond induced failures in open() are also considered
> >   fatal: it takes acceptable work to unwind comfortably.
> 
> Why?  The counter reset is not a fatal condition for the operation of
> the NIC.  Even if the counters show incorrect values, the normal
> operation of the NIC is not affected.  And to top that off, even if
> resetting the counters actually doesn't work, the driver keeps the
> offset values, so a failed reset won't even harm the statistics.  It
> just doesn't make sense to break the NIC operation for such a minor
> problem.  And worse:
> 
> > -static bool rtl8169_reset_counters(struct net_device *dev)
> > +static int rtl8169_reset_counters(struct net_device *dev)
> >  {
> >  	struct rtl8169_private *tp = netdev_priv(dev);
> > -	struct rtl8169_counters *counters;
> > -	dma_addr_t paddr;
> > -	bool ret = true;
> >  
> >  	/*
> >  	 * Versions prior to RTL_GIGA_MAC_VER_19 don't support resetting the
> >  	 * tally counters.
> >  	 */
> >  	if (tp->mac_version < RTL_GIGA_MAC_VER_19)
> > -		return true;
> > -
> > -	counters = rtl8169_map_counters(dev, &paddr, CounterReset);
> > -	if (!counters)
> > -		return false;
> > -
> > -	if (!rtl_udelay_loop_wait_low(tp, &rtl_reset_counters_cond, 10, 1000))
> > -		ret = false;
> > -
> > -	rtl8169_unmap_counters(dev, paddr, counters);
> > +		return -EINVAL;
> 
> This returns -EINVAL even for older chip versions which are not capable
> of resetting the counters.  The result is, this driver will not work at
> all on chip versions prior to RTL_GIGA_MAC_VER_19 anymore, because
> rtl_open will always fail.
> 
> Other then that, I can test the patch probably tomorrow.

I have a bit of a problem with this patch.  It crashes when loading the
driver manually w/ modprobe.  For some reason dev_get_stats is called
during rtl_init_one and at that time the counters pointer is NULL, so
the kernel gets a SEGV.  After I worked around that I got a SEGV in
rtl_remove_one, which I still have to find out why.  I didn't test with
the latest kernel, though, so I still have to check if that's the reason
for the crashes.  That takes a bit longer, I just wanted to let you
know.

There's also a problem with rtl8169_cmd_counters:  It always calls
rtl_udelay_loop_wait_low w/ rtl_reset_counters_cond, even when called
from rtl8169_update_counters, where it should call
rtl_udelay_loop_wait_low w/ rtl_counters_cond to check for the
CounterDump flag, rather than for the CounterReset flag.

For now I applied the below patch, but I think it's a bit ugly to
conditionalize the rtl_cond struct by the incoming flag value.


Corinna

Comments

Francois Romieu Sept. 7, 2015, 9:52 p.m. UTC | #1
Corinna Vinschen <vinschen@redhat.com> :
[...]
> I have a bit of a problem with this patch.  It crashes when loading the
> driver manually w/ modprobe.  For some reason dev_get_stats is called
> during rtl_init_one and at that time the counters pointer is NULL, so
> the kernel gets a SEGV.
>
>  After I worked around that I got a SEGV in
> rtl_remove_one, which I still have to find out why.  I didn't test with
> the latest kernel, though, so I still have to check if that's the reason
> for the crashes.  That takes a bit longer, I just wanted to let you
> know.

Call me stupid: I forgot that it's perfectly fine to request the stats
of a down interface. :o/

Updated patch is on the way.

> There's also a problem with rtl8169_cmd_counters:  It always calls
> rtl_udelay_loop_wait_low w/ rtl_reset_counters_cond, even when called
> from rtl8169_update_counters, where it should call
> rtl_udelay_loop_wait_low w/ rtl_counters_cond to check for the
> CounterDump flag, rather than for the CounterReset flag.
> 
> For now I applied the below patch, but I think it's a bit ugly to
> conditionalize the rtl_cond struct by the incoming flag value.

<captain obvious>

Let's check both at once:

	return RTL_R32(CounterAddrLow) & (CounterDump | CounterReset);

</captain obvious>

Thanks for reviewing.
Corinna Vinschen Sept. 8, 2015, 8:09 a.m. UTC | #2
On Sep  7 23:52, Francois Romieu wrote:
> Corinna Vinschen <vinschen@redhat.com> :
> [...]
> > I have a bit of a problem with this patch.  It crashes when loading the
> > driver manually w/ modprobe.  For some reason dev_get_stats is called
> > during rtl_init_one and at that time the counters pointer is NULL, so
> > the kernel gets a SEGV.
> >
> >  After I worked around that I got a SEGV in
> > rtl_remove_one, which I still have to find out why.  I didn't test with
> > the latest kernel, though, so I still have to check if that's the reason
> > for the crashes.  That takes a bit longer, I just wanted to let you
> > know.
> 
> Call me stupid: I forgot that it's perfectly fine to request the stats
> of a down interface. :o/
> 
> Updated patch is on the way.
> 
> > There's also a problem with rtl8169_cmd_counters:  It always calls
> > rtl_udelay_loop_wait_low w/ rtl_reset_counters_cond, even when called
> > from rtl8169_update_counters, where it should call
> > rtl_udelay_loop_wait_low w/ rtl_counters_cond to check for the
> > CounterDump flag, rather than for the CounterReset flag.
> > 
> > For now I applied the below patch, but I think it's a bit ugly to
> > conditionalize the rtl_cond struct by the incoming flag value.
> 
> <captain obvious>
> 
> Let's check both at once:
> 
> 	return RTL_R32(CounterAddrLow) & (CounterDump | CounterReset);

If you're sure that's valid, then why not?  It's certainly cleaner
code.


Corinna
diff mbox

Patch

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index ae07567..cd7adbf 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -2200,6 +2200,13 @@  DECLARE_RTL_COND(rtl_reset_counters_cond)
 	return RTL_R32(CounterAddrLow) & CounterReset;
 }
 
+DECLARE_RTL_COND(rtl_counters_cond)
+{
+	void __iomem *ioaddr = tp->mmio_addr;
+
+	return RTL_R32(CounterAddrLow) & CounterDump;
+}
+
 static int rtl8169_cmd_counters(struct net_device *dev, u32 counter_cmd)
 {
 	struct rtl8169_private *tp = netdev_priv(dev);
@@ -2212,7 +2219,10 @@  static int rtl8169_cmd_counters(struct net_device *dev, u32 counter_cmd)
 	RTL_W32(CounterAddrLow, cmd);
 	RTL_W32(CounterAddrLow, cmd | counter_cmd);
 
-	return rtl_udelay_loop_wait_low(tp, &rtl_reset_counters_cond, 10, 1000);
+	return rtl_udelay_loop_wait_low(tp,
+		counter_cmd == CounterDump ? &rtl_counters_cond :
+					     &rtl_reset_counters_cond,
+		10, 1000);
 }
 
 static int rtl8169_reset_counters(struct net_device *dev)
@@ -2229,13 +2239,6 @@  static int rtl8169_reset_counters(struct net_device *dev)
 	return rtl8169_cmd_counters(dev, CounterReset);
 }
 
-DECLARE_RTL_COND(rtl_counters_cond)
-{
-	void __iomem *ioaddr = tp->mmio_addr;
-
-	return RTL_R32(CounterAddrLow) & CounterDump;
-}
-
 static int rtl8169_update_counters(struct net_device *dev)
 {
 	struct rtl8169_private *tp = netdev_priv(dev);
@@ -7800,8 +7803,11 @@  rtl8169_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
 
 	/*
 	 * Fetch additonal counter values missing in stats collected by driver
-	 * from tally counters.
+	 * from tally counters, but only if we're already initialized.
 	 */
+	if (!counters)
+	  return stats;
+
 	rtl8169_update_counters(dev);
 
 	/*