diff mbox

[Q] How to invalidate ARP cache for a network device from within kernel

Message ID 1290821143.4145.3.camel@maxim-laptop
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Maxim Levitsky Nov. 27, 2010, 1:25 a.m. UTC
On Sat, 2010-11-27 at 02:18 +0100, Stefan Richter wrote:
> On Nov 26 Maxim Levitsky wrote:
> > However as soon as bus reset happens, the upper layer ARP cache isn't
> > invalidated, thus all attempts to send packets to remote node now fail,
> > because the additional information (node id and bus address) about
> > remote node is now invalid, but ARP core doesn't send ARP requests
> > because it has the response in the cache.
> 
> When is this a problem?  With nodes which stay on the bus (i.e. are
> present before and after the bus reset)?  Or with nodes which go away
> and come back much later (but before the old ARP cache entry was cleaned
> out)?
Its about later.
A node that disconnects and connects after 5 seconds for example or 20
seconds.
ARP timeout is I think 30 seconds or even more.

Btw I already solved that problem.
Patches attached.

With this and all great patches from you and Clemens, the firewire
networking strongly resembles ethernet in terms of speed and
reliability.
(When I resume from ram my desktop, connection restores after less that
5 seconds).

On laptop I still see few issues on s2ram cycle, I am tackling them now.

Best regards,
	Maxim Levitsky

Comments

Stefan Richter Nov. 27, 2010, 8:33 a.m. UTC | #1
On Nov 27 Maxim Levitsky wrote:
> Subject: [PATCH 1/3] firewire: net: restart ISO channel on bus resets
> 
> ---
>  drivers/firewire/net.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/firewire/net.c b/drivers/firewire/net.c
> index 1a467a9..007969c 100644
> --- a/drivers/firewire/net.c
> +++ b/drivers/firewire/net.c
> @@ -1593,10 +1593,15 @@ static void fwnet_update(struct fw_unit *unit)
>  {
>  	struct fw_device *device = fw_parent_device(unit);
>  	struct fwnet_peer *peer = dev_get_drvdata(&unit->device);
> +	struct fwnet_device *dev = peer->dev;
>  	int generation;
>  
>  	generation = device->generation;
>  
> +	fw_iso_context_stop(dev->broadcast_rcv_context);
> +	fw_iso_context_start(dev->broadcast_rcv_context, -1, 0,
> +			FW_ISO_CONTEXT_MATCH_ALL_TAGS);
> +
>  	spin_lock_irq(&peer->dev->lock);
>  	peer->node_id    = device->node_id;
>  	peer->generation = generation;

Could you add a changelog?

And it can be optimized to do only once per bus generation.  E.g. add a
generation field for the IR context in fwnet_device.  Or do it only if
fw_unit is that of the local node.

OTOH, is this actually necessary on normal bus resets?  It should only
be necessary after PM resume, right?  If so, perhaps do it only if the
generation increased by more than one.

Also, wouldn't a fw_iso_context_start alone, without prior stop,
suffice?
Stefan Richter Nov. 27, 2010, 2:13 p.m. UTC | #2
On Nov 27 Maxim Levitsky wrote:
> > > However as soon as bus reset happens, the upper layer ARP cache
> > > isn't invalidated, thus all attempts to send packets to remote
> > > node now fail, because the additional information (node id and
> > > bus address) about remote node is now invalid, but ARP core
> > > doesn't send ARP requests because it has the response in the
> > > cache.  
> > 
> > When is this a problem?  With nodes which stay on the bus (i.e. are
> > present before and after the bus reset)?  Or with nodes which go
> > away and come back much later (but before the old ARP cache entry
> > was cleaned out)?  
> Its about later.
> A node that disconnects and connects after 5 seconds for example or 20
> seconds.
> ARP timeout is I think 30 seconds or even more.
> 
> Btw I already solved that problem.
> Patches attached.
[...]
> Subject: [PATCH 2/3] NET: ARP: allow to invalidate specific ARP entries
> 
> IPv4 over firewire needs to be able to remove ARP entries
> from cache that belong to nodes that are removed, because
> IPv4 over firewire uses ARP packets for private information
> about nodes.
> 
> This information becames invalid on node removal, thus
> as soon as it is connected again, ARP packet should be sent
> to it which is not done due to valid cache entry.
> 
> CC: netdev@vger.kernel.org
> Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
> ---
>  include/net/arp.h |    1 +
>  net/ipv4/arp.c    |   29 ++++++++++++++++++-----------
>  2 files changed, 19 insertions(+), 11 deletions(-)

[...]

> Subject: [PATCH 3/3] firewire: net: invalidate ARP entries for
> removed nodes.
> 
> This allows to be able to connect to nodes that disappered
> from the bus and after some time appeared again.
> 
> Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
> ---
>  drivers/firewire/net.c |    7 +++++++
>  1 files changed, 7 insertions(+), 0 deletions(-)

I wonder if this is the right approach.

Suppose somebody implements IPv6 over 1394 (RFC 3146) which uses
Neighbour Discovery (RFC 2461).  What are we going to do then to solve
the very same problem?

(Is it a problem at all?  There is just an annoying period of 30
seconds or so during which packets are dropped.  And that period
starts when the cable was pulled or the remote node PM-suspended or a
hub powered down or the likes.)

Anyhow.  I suspect eth1394's/ firewire-net's neighbour (fwnet_peer)
management is lacking.  Consider this example session between
Linux/firewire-net and OS X.

1.) Plug them together, ifup on Linux.  On the Linux node, the local
node is fw5 and the remote OS X node is fw9.

2.) On OS X, don't start any user action on the FireWire networking
interface.  On Linux, start pinging the remote node.  Ping gets replies.

3.) Unplug the cable.  Ping's requests are being dropped from now on.
There is a bit of log spam until firewire-core releases the fw9
fw_device instance, which includes that firewire-net removes the
corresponding fwnet_peer instance:
Nov 27 12:17:15 stein kernel: firewire_net: fwnet_write_complete: failed: 13
Nov 27 12:17:16 stein kernel: firewire_net: fwnet_write_complete: failed: 13

4.) Plug the cable back in a few seconds later.  Resulting dmesg:
Nov 27 12:17:19 stein kernel: firewire_core: skipped bus generations, destroying all nodes
Nov 27 12:17:20 stein kernel: firewire_net: No peer for ARP packet from 0017f2fffe66fb80
Nov 27 12:17:20 stein kernel: firewire_net: No peer for ARP packet from 0017f2fffe66fb80
Nov 27 12:17:20 stein kernel: firewire_core: rediscovered device fw5
Nov 27 12:17:20 stein kernel: firewire_core: phy config: card 2, new root=ffc1, gap_count=5
Nov 27 12:17:20 stein kernel: firewire_net: No peer for ARP packet from 0017f2fffe66fb80
Nov 27 12:17:20 stein kernel: firewire_net: No peer for ARP packet from 0017f2fffe66fb80
Nov 27 12:17:20 stein kernel: firewire_net: No peer for ARP packet from 0017f2fffe66fb80
Nov 27 12:17:21 stein kernel: firewire_net: No peer for ARP packet from 0017f2fffe66fb80
Nov 27 12:17:21 stein kernel: firewire_net: No peer for ARP packet from
0017f2fffe66fb80 Nov 27 12:17:21 stein kernel: firewire_net: No peer
for ARP packet from 0017f2fffe66fb80 Nov 27 12:17:22 stein kernel:
firewire_net: No peer for ARP packet from 0017f2fffe66fb80 Nov 27
12:17:23 stein kernel: firewire_core: created device fw9: GUID
0017f2fffe66fb80, S400, 1 config ROM retries

5.) At this point, ping's requests are still being dropped.

6.) A whole while later, ping is back in business again, obviously
because the old ARP entry was cleared and a new ARP request--response
was performed.

We learn two things from that:

  - OS X sends gratuitous ARP messages.  Maybe that's Zeroconf (RFC
    3927), or maybe that's just part of their RFC 2734 driver.
    There seem to be consistently nine of such messages sent within a
    period of 3 or 4 seconds, starting almost immediately after
    self-ID-complete after cable replug.

  - fwnet_probe, which adds the fwnet_peer instance that pertains to
    fw9, is performed just a little bit too late to match one of those
    ARP packets with an fwnet_peer instance.

Should firewire-net send gratuitous ARP messages too?  I.e., in
fwnet_probe, if the interface is up, send an ARP Request packet which
solicits a response.  Likewise, if/when IPv6-over-1394 is implemented,
let fwnet_probe send a Neighbour Solicitation packet.  ---  In effect,
this means that we would not add EXPORT_SYMBOL(arp_invalidate) and,
perspectively, EXPORT_SYMBOL(ndisc_invalidate), and call those when a
node went away.  Instead, we solicit an ARP Response or a Neighbor
Advertisement when a node joined us and let that response or
advertisement update the ARP cache or NDP cache.

The question is, is the link-layer driver firewire-net a proper place
to call arp_send() and ndisc_send_ns()?

And is this any better than a new arp_invalidate() and
ndisc_invalidate()?

----

On a loosely related note, after looking at 1394 AR and at NDP,
shouldn't we rather set
	net_device.addr_len = 16
and
	net_device.dev_addr = concatenation of EUI-64, max_rec, spd,
	                      and unicast_FIFO
?
Maxim Levitsky Nov. 27, 2010, 2:33 p.m. UTC | #3
On Sat, 2010-11-27 at 15:13 +0100, Stefan Richter wrote:
> On Nov 27 Maxim Levitsky wrote:
> > > > However as soon as bus reset happens, the upper layer ARP cache
> > > > isn't invalidated, thus all attempts to send packets to remote
> > > > node now fail, because the additional information (node id and
> > > > bus address) about remote node is now invalid, but ARP core
> > > > doesn't send ARP requests because it has the response in the
> > > > cache.  
> > > 
> > > When is this a problem?  With nodes which stay on the bus (i.e. are
> > > present before and after the bus reset)?  Or with nodes which go
> > > away and come back much later (but before the old ARP cache entry
> > > was cleaned out)?  
> > Its about later.
> > A node that disconnects and connects after 5 seconds for example or 20
> > seconds.
> > ARP timeout is I think 30 seconds or even more.
> > 
> > Btw I already solved that problem.
> > Patches attached.
> [...]
> > Subject: [PATCH 2/3] NET: ARP: allow to invalidate specific ARP entries
> > 
> > IPv4 over firewire needs to be able to remove ARP entries
> > from cache that belong to nodes that are removed, because
> > IPv4 over firewire uses ARP packets for private information
> > about nodes.
> > 
> > This information becames invalid on node removal, thus
> > as soon as it is connected again, ARP packet should be sent
> > to it which is not done due to valid cache entry.
> > 
> > CC: netdev@vger.kernel.org
> > Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
> > ---
> >  include/net/arp.h |    1 +
> >  net/ipv4/arp.c    |   29 ++++++++++++++++++-----------
> >  2 files changed, 19 insertions(+), 11 deletions(-)
> 
> [...]
> 
> > Subject: [PATCH 3/3] firewire: net: invalidate ARP entries for
> > removed nodes.
> > 
> > This allows to be able to connect to nodes that disappered
> > from the bus and after some time appeared again.
> > 
> > Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
> > ---
> >  drivers/firewire/net.c |    7 +++++++
> >  1 files changed, 7 insertions(+), 0 deletions(-)
> 
> I wonder if this is the right approach.
> 
> Suppose somebody implements IPv6 over 1394 (RFC 3146) which uses
> Neighbour Discovery (RFC 2461).  What are we going to do then to solve
> the very same problem?
Well, thats a problem, but firewire is somewhat unique.
I don't image any other networking transport to be protocol dependent.

> 
> (Is it a problem at all?  There is just an annoying period of 30
> seconds or so during which packets are dropped.  And that period
> starts when the cable was pulled or the remote node PM-suspended or a
> hub powered down or the likes.)
It is somewhat a problem, if you for example suspend a system by mistake
and on resume you need to wait too much.
It is annoying.


> 
> Anyhow.  I suspect eth1394's/ firewire-net's neighbour (fwnet_peer)
> management is lacking.  Consider this example session between
> Linux/firewire-net and OS X.
> 
> 1.) Plug them together, ifup on Linux.  On the Linux node, the local
> node is fw5 and the remote OS X node is fw9.
> 
> 2.) On OS X, don't start any user action on the FireWire networking
> interface.  On Linux, start pinging the remote node.  Ping gets replies.
> 
> 3.) Unplug the cable.  Ping's requests are being dropped from now on.
> There is a bit of log spam until firewire-core releases the fw9
> fw_device instance, which includes that firewire-net removes the
> corresponding fwnet_peer instance:
> Nov 27 12:17:15 stein kernel: firewire_net: fwnet_write_complete: failed: 13
> Nov 27 12:17:16 stein kernel: firewire_net: fwnet_write_complete: failed: 13
> 
> 4.) Plug the cable back in a few seconds later.  Resulting dmesg:
> Nov 27 12:17:19 stein kernel: firewire_core: skipped bus generations, destroying all nodes
> Nov 27 12:17:20 stein kernel: firewire_net: No peer for ARP packet from 0017f2fffe66fb80
> Nov 27 12:17:20 stein kernel: firewire_net: No peer for ARP packet from 0017f2fffe66fb80
> Nov 27 12:17:20 stein kernel: firewire_core: rediscovered device fw5
> Nov 27 12:17:20 stein kernel: firewire_core: phy config: card 2, new root=ffc1, gap_count=5
> Nov 27 12:17:20 stein kernel: firewire_net: No peer for ARP packet from 0017f2fffe66fb80
> Nov 27 12:17:20 stein kernel: firewire_net: No peer for ARP packet from 0017f2fffe66fb80
> Nov 27 12:17:20 stein kernel: firewire_net: No peer for ARP packet from 0017f2fffe66fb80
> Nov 27 12:17:21 stein kernel: firewire_net: No peer for ARP packet from 0017f2fffe66fb80
> Nov 27 12:17:21 stein kernel: firewire_net: No peer for ARP packet from
> 0017f2fffe66fb80 Nov 27 12:17:21 stein kernel: firewire_net: No peer
> for ARP packet from 0017f2fffe66fb80 Nov 27 12:17:22 stein kernel:
> firewire_net: No peer for ARP packet from 0017f2fffe66fb80 Nov 27
> 12:17:23 stein kernel: firewire_core: created device fw9: GUID
> 0017f2fffe66fb80, S400, 1 config ROM retries
> 
> 5.) At this point, ping's requests are still being dropped.
> 
> 6.) A whole while later, ping is back in business again, obviously
> because the old ARP entry was cleared and a new ARP request--response
> was performed.
> 
> We learn two things from that:
> 
>   - OS X sends gratuitous ARP messages.  Maybe that's Zeroconf (RFC
>     3927), or maybe that's just part of their RFC 2734 driver.
>     There seem to be consistently nine of such messages sent within a
>     period of 3 or 4 seconds, starting almost immediately after
>     self-ID-complete after cable replug.
> 
>   - fwnet_probe, which adds the fwnet_peer instance that pertains to
>     fw9, is performed just a little bit too late to match one of those
>     ARP packets with an fwnet_peer instance.
Which means that even if we teach firewire-net to send ARP requests,
these won't be handled by other side that runs firewire-net too.
Of course 

> 
> Should firewire-net send gratuitous ARP messages too?  I.e., in
> fwnet_probe, if the interface is up, send an ARP Request packet which
> solicits a response.  Likewise, if/when IPv6-over-1394 is implemented,
> let fwnet_probe send a Neighbour Solicitation packet.  ---  In effect,
> this means that we would not add EXPORT_SYMBOL(arp_invalidate) and,
> perspectively, EXPORT_SYMBOL(ndisc_invalidate), and call those when a
> node went away.  Instead, we solicit an ARP Response or a Neighbor
> Advertisement when a node joined us and let that response or
> advertisement update the ARP cache or NDP cache.
I am not against that at all.
Clearning the cache seemed just to be very robust and solve a root case.
This is less robust solution (which you even proved because OSX does
it...)


> 
> The question is, is the link-layer driver firewire-net a proper place
> to call arp_send() and ndisc_send_ns()?
> 
> And is this any better than a new arp_invalidate() and
> ndisc_invalidate()?
That what I am not sure at all.
I can bypass arp_send, and just create a 1394 ARP packet and send it
using fw_request.
But doing that as I did seemed to be also quite simple.
It is protocol depedent but that is firewire  fault not mine.

> 
> ----
> 
> On a loosely related note, after looking at 1394 AR and at NDP,
> shouldn't we rather set
> 	net_device.addr_len = 16
> and
> 	net_device.dev_addr = concatenation of EUI-64, max_rec, spd,
> 	                      and unicast_FIFO
> ?
The problem is that except GUID, the rest can change.
And hardware addresses should be fixed.


Best regards,
	Maxim Levitsky

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stefan Richter Nov. 27, 2010, 3:19 p.m. UTC | #4
On Nov 27 Stefan Richter wrote:
> > @@ -1593,10 +1593,15 @@ static void fwnet_update(struct fw_unit
> > *unit) {
> >  	struct fw_device *device = fw_parent_device(unit);
> >  	struct fwnet_peer *peer = dev_get_drvdata(&unit->device);
> > +	struct fwnet_device *dev = peer->dev;
> >  	int generation;
> >  
> >  	generation = device->generation;
> >  
> > +	fw_iso_context_stop(dev->broadcast_rcv_context);
> > +	fw_iso_context_start(dev->broadcast_rcv_context, -1, 0,
> > +			FW_ISO_CONTEXT_MATCH_ALL_TAGS);
[...]
> OTOH, is this actually necessary on normal bus resets?  It should only
> be necessary after PM resume, right?  If so, perhaps do it only if the
> generation increased by more than one.

Actually, shouldn't context restart at resume happen in firewire-ohci
rather?  Any other protocol driver which uses isochronous I/O (be it a
kernel driver or userspace driver) would have the same issue, I guess.
Maxim Levitsky Nov. 27, 2010, 3:44 p.m. UTC | #5
On Sat, 2010-11-27 at 16:19 +0100, Stefan Richter wrote:
> On Nov 27 Stefan Richter wrote:
> > > @@ -1593,10 +1593,15 @@ static void fwnet_update(struct fw_unit
> > > *unit) {
> > >  	struct fw_device *device = fw_parent_device(unit);
> > >  	struct fwnet_peer *peer = dev_get_drvdata(&unit->device);
> > > +	struct fwnet_device *dev = peer->dev;
> > >  	int generation;
> > >  
> > >  	generation = device->generation;
> > >  
> > > +	fw_iso_context_stop(dev->broadcast_rcv_context);
> > > +	fw_iso_context_start(dev->broadcast_rcv_context, -1, 0,
> > > +			FW_ISO_CONTEXT_MATCH_ALL_TAGS);
> [...]
> > OTOH, is this actually necessary on normal bus resets?  It should only
> > be necessary after PM resume, right?  If so, perhaps do it only if the
> > generation increased by more than one.
> 
> Actually, shouldn't context restart at resume happen in firewire-ohci
> rather?  Any other protocol driver which uses isochronous I/O (be it a
> kernel driver or userspace driver) would have the same issue, I guess.
Well, the spec says that upon bus reset ISO channels must be allocated
again.
This somewhat mimics that, although it can be done better.

Of course the same true after suspend/resume.

Best regards,
	Maxim Levitsky

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

From f284b2644694797f42df9df94cc6ccbaa17155ca Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <maximlevitsky@gmail.com>
Date: Sat, 27 Nov 2010 00:35:04 +0200
Subject: [PATCH 3/3] firewire: net: invalidate ARP entries for removed nodes.

This allows to be able to connect to nodes that disappered
from the bus and after some time appeared again.

Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
---
 drivers/firewire/net.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/firewire/net.c b/drivers/firewire/net.c
index 007969c..bb7939a 100644
--- a/drivers/firewire/net.c
+++ b/drivers/firewire/net.c
@@ -189,6 +189,7 @@  struct fwnet_peer {
 	struct fwnet_device *dev;
 	u64 guid;
 	u64 fifo;
+	__be32 ip;
 
 	/* guarded by dev->lock */
 	struct list_head pd_list; /* received partial datagrams */
@@ -568,6 +569,8 @@  static int fwnet_finish_incoming_packet(struct net_device *net,
 				peer->speed = sspd;
 			if (peer->max_payload > max_payload)
 				peer->max_payload = max_payload;
+
+			peer->ip = arp1394->sip;
 		}
 		spin_unlock_irqrestore(&dev->lock, flags);
 
@@ -1443,6 +1446,7 @@  static int fwnet_add_peer(struct fwnet_device *dev,
 	peer->dev = dev;
 	peer->guid = (u64)device->config_rom[3] << 32 | device->config_rom[4];
 	peer->fifo = FWNET_NO_FIFO_ADDR;
+	peer->ip = 0;
 	INIT_LIST_HEAD(&peer->pd_list);
 	peer->pdg_size = 0;
 	peer->datagram_label = 0;
@@ -1558,6 +1562,9 @@  static int fwnet_remove(struct device *_dev)
 
 	mutex_lock(&fwnet_device_mutex);
 
+	if (dev->netdev && peer->ip)
+		arp_invalidate(dev->netdev, peer->ip);
+
 	fwnet_remove_peer(peer);
 
 	if (list_empty(&dev->peer_list)) {
-- 
1.7.1