Message ID | 1424381068-22252-1-git-send-email-simon@farnz.org.uk |
---|---|
State | Superseded, archived |
Delegated to: | David Miller |
Headers | show |
Michal's e-mail as listed in MAINTAINERS is bouncing (he's listed as PPPoE maintainer). If anyone has alternate contact details for him, please let him know that his kernel-related e-mail is bouncing. The relevant chunk of MAINTAINERS is: PPP OVER ETHERNET M: Michal Ostrowski <mostrows@earthlink.net> S: Maintained F: drivers/net/ppp/pppoe.c F: drivers/net/ppp/pppox.c If I don't hear from him in the next couple of weeks, I'll send the trivial patch to remove PPP OVER ETHERNET from maintained status. On Thursday 19 February 2015 21:24:28 Simon Farnsworth wrote: > When a PADT frame is received, the socket may not be in a good state to > close down the PPP interface. The current implementation handles this by > simply blocking all further PPP traffic, and hoping that the lack of traffic > will trigger the user to investigate. > > Use schedule_work to get to a process context from which we clear down the > PPP interface, in a fashion analogous to hangup on a TTY-based PPP > interface. This causes pppd to disconnect immediately, and allows tools to > take immediate corrective action. > > Signed-off-by: Simon Farnsworth <simon@farnz.org.uk> > --- > Note that I'm not subscribed to netdev; please cc me on any replies. > > The patch falls out of https://bugzilla.gnome.org/show_bug.cgi?id=742939 > I'm trying to get NetworkManager back to using kernel PPPoE partly because > it performs a little better, and mostly because kernel PPPoE copes with > larger MTUs than userspace PPPoE. > > Dan Williams (cc'd) has tested a previous version of this patch; the > differences to this version are only cosmetic. > > drivers/net/ppp/pppoe.c | 17 ++++++++++++++++- > include/linux/if_pppox.h | 2 ++ > 2 files changed, 18 insertions(+), 1 deletion(-) > <snip patch> -- Simon Farnsworth
(Cc: linux-ppp@vger.kernel.org, mostrows@gmail.com) Hello! Simon Farnsworth schrieb am Fri, 20 Feb 2015 11:17:26 +0000: > Michal's e-mail as listed in MAINTAINERS is bouncing (he's listed as PPPoE > maintainer). I once had the same problem. I finally used mostrows@gmail.com, found elsewhere (see e.g. http://patchwork.ozlabs.org/patch/36932/). Regards,
(Cc: linux-ppp@vger.kernel.org, mostrows@gmail.com) Hello! Simon Farnsworth schrieb am Thu, 19 Feb 2015 21:24:28 +0000: > When a PADT frame is received, the socket may not be in a good state to > close down the PPP interface. The current implementation handles this by > simply blocking all further PPP traffic, and hoping that the lack of traffic > will trigger the user to investigate. > > Use schedule_work to get to a process context from which we clear down the > PPP interface, in a fashion analogous to hangup on a TTY-based PPP > interface. This causes pppd to disconnect immediately, and allows tools to > take immediate corrective action. Tested-by: Christoph Schulz <develop@kristov.de> I tested your patch von top of 3.14.33 and 3.18.7 together with a slightly modified pppd based on Debian's pppd-2.4.6-3 (which is almost identical to pppd 2.4.7). The client now indeed receives PADT feedback immediately. Closing a PPPoE connection on the server side Feb 20 16:45:44 swing local2.notice pppoe-server-wrapper[20154]: circ4: pppoe-server died with exit code 0, cleaning up leads to immediate hangup on the client side: Feb 20 16:45:44 sandbox local2.notice pppd[539]: Modem hangup Formerly, the LCP echo timeout had to be reached. On the server side: Feb 20 14:56:53 swing local2.notice pppoe-server-wrapper[12744]: circ4: pppoe-server died with exit code 0, cleaning up On the client side: Feb 20 14:57:01 sandbox local2.info pppd[15575]: No response to 2 echo-requests Feb 20 14:57:01 sandbox local2.notice pppd[15575]: Serial link appears to be disconnected. Note that the short detection of the terminated link (~ 8 s) stems from very aggressive LCP echo settings (lcp-echo-interval "3", lcp-echo-failure "2"). Typically, these timeouts are much larger to accomodate for bad links. Often the client recognizes a dead link only after two to three minutes (lcp-echo-interval "30", lcp-echo-failure "5"). However, note also that your patch causes pppd (or rather the rp-pppoe plugin) to wonder about the socket closed by the kernel: Feb 20 16:45:44 sandbox local2.err pppd[539]: Failed to disconnect PPPoE socket: 114 Operation already in progress I don't fully understand the code there; it seems that the plugin *connects* the PPPoE session socket in order to *disconnect* it: static void PPPOEDisconnectDevice(void) { struct sockaddr_pppox sp; sp.sa_family = AF_PPPOX; sp.sa_protocol = PX_PROTO_OE; sp.sa_addr.pppoe.sid = 0; memcpy(sp.sa_addr.pppoe.dev, conn->ifName, IFNAMSIZ); memcpy(sp.sa_addr.pppoe.remote, conn->peerEth, ETH_ALEN); if (connect(conn->sessionSocket, (struct sockaddr *) &sp, sizeof(struct sockaddr_pppox)) < 0) error("Failed to disconnect PPPoE socket: %d %m", errno); close(conn->sessionSocket); /* don't send PADT?? */ if (conn->discoverySocket >= 0) close(conn->discoverySocket); } Regards,
On Friday 20 February 2015 17:10:14 Christoph Schulz wrote: > (Cc: linux-ppp@vger.kernel.org, mostrows@gmail.com) > > Hello! > > Simon Farnsworth schrieb am Thu, 19 Feb 2015 21:24:28 +0000: > > > When a PADT frame is received, the socket may not be in a good state to > > close down the PPP interface. The current implementation handles this by > > simply blocking all further PPP traffic, and hoping that the lack of traffic > > will trigger the user to investigate. > > > > Use schedule_work to get to a process context from which we clear down the > > PPP interface, in a fashion analogous to hangup on a TTY-based PPP > > interface. This causes pppd to disconnect immediately, and allows tools to > > take immediate corrective action. > > Tested-by: Christoph Schulz <develop@kristov.de> > <snip success report> > However, note also that your patch causes pppd (or rather the rp-pppoe > plugin) to wonder about the socket closed by the kernel: > > Feb 20 16:45:44 sandbox local2.err pppd[539]: Failed to disconnect > PPPoE socket: 114 Operation already in progress > I assume there's nothing else wrong here, other than pppd complaining? The code doesn't suggest there will be issues if we fail to disconnect. > I don't fully understand the code there; it seems that the plugin > *connects* the PPPoE session socket in order to *disconnect* it: > > static void > PPPOEDisconnectDevice(void) > { > struct sockaddr_pppox sp; > > sp.sa_family = AF_PPPOX; > sp.sa_protocol = PX_PROTO_OE; > sp.sa_addr.pppoe.sid = 0; > memcpy(sp.sa_addr.pppoe.dev, conn->ifName, IFNAMSIZ); > memcpy(sp.sa_addr.pppoe.remote, conn->peerEth, ETH_ALEN); > if (connect(conn->sessionSocket, (struct sockaddr *) &sp, > sizeof(struct sockaddr_pppox)) < 0) > error("Failed to disconnect PPPoE socket: %d %m", errno); > close(conn->sessionSocket); > /* don't send PADT?? */ > if (conn->discoverySocket >= 0) > close(conn->discoverySocket); > } The code is trying to disconnect the session by connecting to session 0 (which is invalid) in order to stop data flow. I'll have another look at the kernel code tonight to see if that does anything that close(conn->sessionSocket) won't do - I can't see a good reason for it, though. I suspect this is a straight bug in the rp-pppoe.so plugin. -- Simon Farnsworth
On Thu, 2015-02-19 at 21:24 +0000, Simon Farnsworth wrote: > When a PADT frame is received, the socket may not be in a good state to > close down the PPP interface. The current implementation handles this by > simply blocking all further PPP traffic, and hoping that the lack of traffic > will trigger the user to investigate. > > Use schedule_work to get to a process context from which we clear down the > PPP interface, in a fashion analogous to hangup on a TTY-based PPP > interface. This causes pppd to disconnect immediately, and allows tools to > take immediate corrective action. > > Signed-off-by: Simon Farnsworth <simon@farnz.org.uk> Adding my tested-by since I tested the patch and confirmed it fixes the issue before Simon posted to netdev. Tested-by: Dan Williams <dcbw@redhat.com> > --- > Note that I'm not subscribed to netdev; please cc me on any replies. > > The patch falls out of https://bugzilla.gnome.org/show_bug.cgi?id=742939 > I'm trying to get NetworkManager back to using kernel PPPoE partly because > it performs a little better, and mostly because kernel PPPoE copes with > larger MTUs than userspace PPPoE. > > Dan Williams (cc'd) has tested a previous version of this patch; the > differences to this version are only cosmetic. > > drivers/net/ppp/pppoe.c | 17 ++++++++++++++++- > include/linux/if_pppox.h | 2 ++ > 2 files changed, 18 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c > index d2408a5..9c97e9b 100644 > --- a/drivers/net/ppp/pppoe.c > +++ b/drivers/net/ppp/pppoe.c > @@ -455,6 +455,18 @@ out: > return NET_RX_DROP; > } > > +static void pppoe_unbind_sock_work(struct work_struct *work) > +{ > + struct pppox_sock *po = container_of(work, struct pppox_sock, > + proto.pppoe.padt_work); > + struct sock *sk = sk_pppox(po); > + > + lock_sock(sk); > + pppox_unbind_sock(sk); > + release_sock(sk); > + sock_put(sk); > +} > + > /************************************************************************ > * > * Receive a PPPoE Discovery frame. > @@ -500,7 +512,8 @@ static int pppoe_disc_rcv(struct sk_buff *skb, struct net_device *dev, > } > > bh_unlock_sock(sk); > - sock_put(sk); > + if (!schedule_work(&po->proto.pppoe.padt_work)) > + sock_put(sk); > } > > abort: > @@ -613,6 +626,8 @@ static int pppoe_connect(struct socket *sock, struct sockaddr *uservaddr, > > lock_sock(sk); > > + INIT_WORK(&po->proto.pppoe.padt_work, pppoe_unbind_sock_work); > + > error = -EINVAL; > if (sp->sa_protocol != PX_PROTO_OE) > goto end; > diff --git a/include/linux/if_pppox.h b/include/linux/if_pppox.h > index aff7ad8..66a7d76 100644 > --- a/include/linux/if_pppox.h > +++ b/include/linux/if_pppox.h > @@ -19,6 +19,7 @@ > #include <linux/netdevice.h> > #include <linux/ppp_channel.h> > #include <linux/skbuff.h> > +#include <linux/workqueue.h> > #include <uapi/linux/if_pppox.h> > > static inline struct pppoe_hdr *pppoe_hdr(const struct sk_buff *skb) > @@ -32,6 +33,7 @@ struct pppoe_opt { > struct pppoe_addr pa; /* what this socket is bound to*/ > struct sockaddr_pppox relay; /* what socket data will be > relayed to (PPPoE relaying) */ > + struct work_struct padt_work;/* Work item for handling PADT */ > }; > > struct pptp_opt { -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello! Simon Farnsworth schrieb am Fri, 20 Feb 2015 16:41:17 +0000: >> Feb 20 16:45:44 sandbox local2.err pppd[539]: Failed to disconnect >> PPPoE socket: 114 Operation already in progress >> > I assume there's nothing else wrong here, other than pppd complaining? The > code doesn't suggest there will be issues if we fail to disconnect. Yes, there are no further problems beside the message. Nevertheless I wanted to mention it because any error message containing something like "Failed to disconnect" may upset the user. Of course, most PPPoE users nowadays use a flat rate for accessing the Internet and need not worry much about a "failed" disconnection, but you never know for sure... Regards, Christoph Schulz -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Friday 20 February 2015 20:49:23 Christoph Schulz wrote: > Hello! > > Simon Farnsworth schrieb am Fri, 20 Feb 2015 16:41:17 +0000: > >> Feb 20 16:45:44 sandbox local2.err pppd[539]: Failed to disconnect > >> PPPoE socket: 114 Operation already in progress > > > > I assume there's nothing else wrong here, other than pppd complaining? The > > code doesn't suggest there will be issues if we fail to disconnect. > > Yes, there are no further problems beside the message. Nevertheless I > wanted to mention it because any error message containing something > like "Failed to disconnect" may upset the user. Of course, most PPPoE > users nowadays use a flat rate for accessing the Internet and need not > worry much about a "failed" disconnection, but you never know for > sure... > I've now looked at the kernel code; the message is harmless, as closing the socket will also get rid of the associated session. However, I think I should probably do a v2 patch with a clear commit message, so that if someone does bisect down to this commit as a cause of the "error" they're seeing, they'll understand why it's not a problem. I'll also put together a pppd patch to not change session ID during shutdown. Won't be tonight - but that's no bad thing as it leaves time for more reviewers to comment on my kernel patch.
From: Simon Farnsworth <simon@farnz.org.uk> Date: Fri, 20 Feb 2015 21:04:53 +0000 > However, I think I should probably do a v2 patch with a clear commit message, > so that if someone does bisect down to this commit as a cause of the "error" > they're seeing, they'll understand why it's not a problem. Ok then, I'll wait for that v2. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c index d2408a5..9c97e9b 100644 --- a/drivers/net/ppp/pppoe.c +++ b/drivers/net/ppp/pppoe.c @@ -455,6 +455,18 @@ out: return NET_RX_DROP; } +static void pppoe_unbind_sock_work(struct work_struct *work) +{ + struct pppox_sock *po = container_of(work, struct pppox_sock, + proto.pppoe.padt_work); + struct sock *sk = sk_pppox(po); + + lock_sock(sk); + pppox_unbind_sock(sk); + release_sock(sk); + sock_put(sk); +} + /************************************************************************ * * Receive a PPPoE Discovery frame. @@ -500,7 +512,8 @@ static int pppoe_disc_rcv(struct sk_buff *skb, struct net_device *dev, } bh_unlock_sock(sk); - sock_put(sk); + if (!schedule_work(&po->proto.pppoe.padt_work)) + sock_put(sk); } abort: @@ -613,6 +626,8 @@ static int pppoe_connect(struct socket *sock, struct sockaddr *uservaddr, lock_sock(sk); + INIT_WORK(&po->proto.pppoe.padt_work, pppoe_unbind_sock_work); + error = -EINVAL; if (sp->sa_protocol != PX_PROTO_OE) goto end; diff --git a/include/linux/if_pppox.h b/include/linux/if_pppox.h index aff7ad8..66a7d76 100644 --- a/include/linux/if_pppox.h +++ b/include/linux/if_pppox.h @@ -19,6 +19,7 @@ #include <linux/netdevice.h> #include <linux/ppp_channel.h> #include <linux/skbuff.h> +#include <linux/workqueue.h> #include <uapi/linux/if_pppox.h> static inline struct pppoe_hdr *pppoe_hdr(const struct sk_buff *skb) @@ -32,6 +33,7 @@ struct pppoe_opt { struct pppoe_addr pa; /* what this socket is bound to*/ struct sockaddr_pppox relay; /* what socket data will be relayed to (PPPoE relaying) */ + struct work_struct padt_work;/* Work item for handling PADT */ }; struct pptp_opt {
When a PADT frame is received, the socket may not be in a good state to close down the PPP interface. The current implementation handles this by simply blocking all further PPP traffic, and hoping that the lack of traffic will trigger the user to investigate. Use schedule_work to get to a process context from which we clear down the PPP interface, in a fashion analogous to hangup on a TTY-based PPP interface. This causes pppd to disconnect immediately, and allows tools to take immediate corrective action. Signed-off-by: Simon Farnsworth <simon@farnz.org.uk> --- Note that I'm not subscribed to netdev; please cc me on any replies. The patch falls out of https://bugzilla.gnome.org/show_bug.cgi?id=742939 I'm trying to get NetworkManager back to using kernel PPPoE partly because it performs a little better, and mostly because kernel PPPoE copes with larger MTUs than userspace PPPoE. Dan Williams (cc'd) has tested a previous version of this patch; the differences to this version are only cosmetic. drivers/net/ppp/pppoe.c | 17 ++++++++++++++++- include/linux/if_pppox.h | 2 ++ 2 files changed, 18 insertions(+), 1 deletion(-)