Message ID | 5232806B.6050601@cn.fujitsu.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
From: Duan Jiong <duanj.fnst@cn.fujitsu.com> Date: Fri, 13 Sep 2013 11:03:07 +0800 > From: Duan Jiong <duanj.fnst@cn.fujitsu.com> > > Do the whole verification and route updating in ndisc > lay and then just call into icmpv6_notify() to notify > the upper protocols. > > Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> This is completely broken, and I believe your patch set fundamentally is too. We absolutely _must_ handle the redirect at the socket level when we are able to, otherwise we cannot specify the mark properly and the mark is an essential part of the key used to find the correct route to work with. I am not applying this patch series until you deal with this deficiency. I am not willing to consider changes which stop using the more precise keying information available from a socket. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Sep 17, 2013 at 08:29:36PM -0400, David Miller wrote: > From: Duan Jiong <duanj.fnst@cn.fujitsu.com> > Date: Fri, 13 Sep 2013 11:03:07 +0800 > > > From: Duan Jiong <duanj.fnst@cn.fujitsu.com> > > > > Do the whole verification and route updating in ndisc > > lay and then just call into icmpv6_notify() to notify > > the upper protocols. > > > > Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> > > This is completely broken, and I believe your patch set fundamentally > is too. > > We absolutely _must_ handle the redirect at the socket level when > we are able to, otherwise we cannot specify the mark properly and > the mark is an essential part of the key used to find the correct > route to work with. > > I am not applying this patch series until you deal with this > deficiency. I am not willing to consider changes which stop using the > more precise keying information available from a socket. Oh, Duan, I am very sorry for not catching this earlier. We use the sk->mark to select the proper routing table where we clone the rt6_info into. And we only get that value out of the sockets. I missed that. We should leave the redirect logic in the socket layer where it is possible. But parts of this series are still valid. We need to fix redirects for tunnels and I do think we can still simplify some code in the error handlers. Thanks for pointing this out, Hannes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
于 2013年09月18日 09:39, Hannes Frederic Sowa 写道: > On Tue, Sep 17, 2013 at 08:29:36PM -0400, David Miller wrote: >> From: Duan Jiong <duanj.fnst@cn.fujitsu.com> >> Date: Fri, 13 Sep 2013 11:03:07 +0800 >> >>> From: Duan Jiong <duanj.fnst@cn.fujitsu.com> >>> >>> Do the whole verification and route updating in ndisc >>> lay and then just call into icmpv6_notify() to notify >>> the upper protocols. >>> >>> Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> >> >> This is completely broken, and I believe your patch set fundamentally >> is too. >> >> We absolutely _must_ handle the redirect at the socket level when >> we are able to, otherwise we cannot specify the mark properly and >> the mark is an essential part of the key used to find the correct >> route to work with. >> >> I am not applying this patch series until you deal with this >> deficiency. I am not willing to consider changes which stop using the >> more precise keying information available from a socket. > > Oh, Duan, I am very sorry for not catching this earlier. We use the > sk->mark to select the proper routing table where we clone the rt6_info into. > And we only get that value out of the sockets. I missed that. We should leave > the redirect logic in the socket layer where it is possible. > > But parts of this series are still valid. We need to fix redirects for tunnels > and I do think we can still simplify some code in the error handlers. > I got it. Thanks, Duan -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Sep 18, 2013 at 09:52:42AM +0800, Duan Jiong wrote: > 于 2013年09月18日 09:39, Hannes Frederic Sowa 写道: > > On Tue, Sep 17, 2013 at 08:29:36PM -0400, David Miller wrote: > >> From: Duan Jiong <duanj.fnst@cn.fujitsu.com> > >> Date: Fri, 13 Sep 2013 11:03:07 +0800 > >> > >>> From: Duan Jiong <duanj.fnst@cn.fujitsu.com> > >>> > >>> Do the whole verification and route updating in ndisc > >>> lay and then just call into icmpv6_notify() to notify > >>> the upper protocols. > >>> > >>> Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> > >> > >> This is completely broken, and I believe your patch set fundamentally > >> is too. > >> > >> We absolutely _must_ handle the redirect at the socket level when > >> we are able to, otherwise we cannot specify the mark properly and > >> the mark is an essential part of the key used to find the correct > >> route to work with. > >> > >> I am not applying this patch series until you deal with this > >> deficiency. I am not willing to consider changes which stop using the > >> more precise keying information available from a socket. > > > > Oh, Duan, I am very sorry for not catching this earlier. We use the > > sk->mark to select the proper routing table where we clone the rt6_info into. > > And we only get that value out of the sockets. I missed that. We should leave > > the redirect logic in the socket layer where it is possible. > > > > But parts of this series are still valid. We need to fix redirects for tunnels > > and I do think we can still simplify some code in the error handlers. > > > > I got it. I gave it a bit more thought: RFC 4861 8.3: " Redirect messages apply to all flows that are being sent to a given destination. That is, upon receipt of a Redirect for a Destination Address, all Destination Cache entries to that address should be updated to use the specified next-hop, regardless of the contents of the Flow Label field that appears in the Redirected Header option. " Especially because redirects also help in the on-link determination (same RFC, section 8), I changed my mind and am still in favour of updating it in the ndisc layer. In my opinion we just have to consider all routing tables and apply the update to every one which carries a valid next hop to the source of the redirect (under consideration of the destination). This will be important if we actually try to get linux to correctly implement the ipv6 subnet model (RFC 5942, Section 4 Rule 1). In that case we are not allowed to assume nodes on-link even if they would match the same prefix as a locally configured address. Greetings, Hannes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
于 2013年09月18日 12:13, Hannes Frederic Sowa 写道: > On Wed, Sep 18, 2013 at 09:52:42AM +0800, Duan Jiong wrote: >> 于 2013年09月18日 09:39, Hannes Frederic Sowa 写道: >>> On Tue, Sep 17, 2013 at 08:29:36PM -0400, David Miller wrote: >>>> From: Duan Jiong <duanj.fnst@cn.fujitsu.com> >>>> Date: Fri, 13 Sep 2013 11:03:07 +0800 >>>> >>>>> From: Duan Jiong <duanj.fnst@cn.fujitsu.com> >>>>> >>>>> Do the whole verification and route updating in ndisc >>>>> lay and then just call into icmpv6_notify() to notify >>>>> the upper protocols. >>>>> >>>>> Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> >>>> >>>> This is completely broken, and I believe your patch set fundamentally >>>> is too. >>>> >>>> We absolutely _must_ handle the redirect at the socket level when >>>> we are able to, otherwise we cannot specify the mark properly and >>>> the mark is an essential part of the key used to find the correct >>>> route to work with. >>>> >>>> I am not applying this patch series until you deal with this >>>> deficiency. I am not willing to consider changes which stop using the >>>> more precise keying information available from a socket. >>> >>> Oh, Duan, I am very sorry for not catching this earlier. We use the >>> sk->mark to select the proper routing table where we clone the rt6_info into. >>> And we only get that value out of the sockets. I missed that. We should leave >>> the redirect logic in the socket layer where it is possible. >>> >>> But parts of this series are still valid. We need to fix redirects for tunnels >>> and I do think we can still simplify some code in the error handlers. >>> >> >> I got it. > > I gave it a bit more thought: > > RFC 4861 8.3: > " > Redirect messages apply to all flows that are being sent to a given > destination. That is, upon receipt of a Redirect for a Destination > Address, all Destination Cache entries to that address should be > updated to use the specified next-hop, regardless of the contents of > the Flow Label field that appears in the Redirected Header option. > " > > Especially because redirects also help in the on-link determination (same > RFC, section 8), I changed my mind and am still in favour of updating it > in the ndisc layer. In my opinion we just have to consider all routing > tables and apply the update to every one which carries a valid next hop > to the source of the redirect (under consideration of the destination). > > This will be important if we actually try to get linux to correctly > implement the ipv6 subnet model (RFC 5942, Section 4 Rule 1). In that > case we are not allowed to assume nodes on-link even if they would match > the same prefix as a locally configured address. > I think this need a little time to discuss with David Miller, so i will send the other patchs to fix redirects for tunnels and to fix the sk->sk_err problems in udpv6_err()/rawv6_err(). Thanks, Duan -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Duan! On Wed, Sep 18, 2013 at 06:13:37AM +0200, Hannes Frederic Sowa wrote: > Especially because redirects also help in the on-link determination (same > RFC, section 8), I changed my mind and am still in favour of updating it > in the ndisc layer. In my opinion we just have to consider all routing > tables and apply the update to every one which carries a valid next hop > to the source of the redirect (under consideration of the destination). > > This will be important if we actually try to get linux to correctly > implement the ipv6 subnet model (RFC 5942, Section 4 Rule 1). In that > case we are not allowed to assume nodes on-link even if they would match > the same prefix as a locally configured address. I am playing around with a simple patch which does suppress adding routing information for the on-link assumption we currently do in linux. Are you intereseted in following up on this? I still do think we should update not only the routing table the socket uses but all routing tables which have a valid route towards the router which emitted the redirect. I try to check if we actually handle redirect messages when ECMP routes are in use correctly. Greetings, Hannes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
于 2013年10月09日 09:43, Hannes Frederic Sowa 写道: > Hi Duan! > > On Wed, Sep 18, 2013 at 06:13:37AM +0200, Hannes Frederic Sowa wrote: >> Especially because redirects also help in the on-link determination (same >> RFC, section 8), I changed my mind and am still in favour of updating it >> in the ndisc layer. In my opinion we just have to consider all routing >> tables and apply the update to every one which carries a valid next hop >> to the source of the redirect (under consideration of the destination). >> >> This will be important if we actually try to get linux to correctly >> implement the ipv6 subnet model (RFC 5942, Section 4 Rule 1). In that >> case we are not allowed to assume nodes on-link even if they would match >> the same prefix as a locally configured address. > > I am playing around with a simple patch which does suppress adding routing > information for the on-link assumption we currently do in linux. > > Are you intereseted in following up on this? I still do think we should update > not only the routing table the socket uses but all routing tables which have a > valid route towards the router which emitted the redirect. > No, thanks, i have got other things on my mind, more important, but i will pay attention to it. Thanks, Duan. > I try to check if we actually handle redirect messages when ECMP routes are in > use correctly. > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h index f525e70..5db259e 100644 --- a/include/net/ip6_route.h +++ b/include/net/ip6_route.h @@ -133,9 +133,6 @@ extern void ip6_update_pmtu(struct sk_buff *skb, struct net *net, __be32 mtu, extern void ip6_sk_update_pmtu(struct sk_buff *skb, struct sock *sk, __be32 mtu); extern void ip6_redirect(struct sk_buff *skb, struct net *net, int oif, u32 mark); -extern void ip6_redirect_no_header(struct sk_buff *skb, struct net *net, int oif, - u32 mark); -extern void ip6_sk_redirect(struct sk_buff *skb, struct sock *sk); struct netlink_callback; diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c index f8a55ff..6bd1b41 100644 --- a/net/ipv6/ndisc.c +++ b/net/ipv6/ndisc.c @@ -1368,11 +1368,9 @@ static void ndisc_redirect_rcv(struct sk_buff *skb) if (!ndisc_parse_options(msg->opt, ndoptlen, &ndopts)) return; - if (!ndopts.nd_opts_rh) { - ip6_redirect_no_header(skb, dev_net(skb->dev), - skb->dev->ifindex, 0); + ip6_redirect(skb, dev_net(skb->dev), skb->dev->ifindex, 0); + if (!ndopts.nd_opts_rh) return; - } hdr = (u8 *)ndopts.nd_opts_rh; hdr += 8; diff --git a/net/ipv6/route.c b/net/ipv6/route.c index c979dd9..151bd6c 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1227,27 +1227,7 @@ static struct dst_entry *ip6_route_redirect(struct net *net, flags, __ip6_route_redirect); } -void ip6_redirect(struct sk_buff *skb, struct net *net, int oif, u32 mark) -{ - const struct ipv6hdr *iph = (struct ipv6hdr *) skb->data; - struct dst_entry *dst; - struct flowi6 fl6; - - memset(&fl6, 0, sizeof(fl6)); - fl6.flowi6_oif = oif; - fl6.flowi6_mark = mark; - fl6.flowi6_flags = 0; - fl6.daddr = iph->daddr; - fl6.saddr = iph->saddr; - fl6.flowlabel = ip6_flowinfo(iph); - - dst = ip6_route_redirect(net, &fl6, &ipv6_hdr(skb)->saddr); - rt6_do_redirect(dst, NULL, skb); - dst_release(dst); -} -EXPORT_SYMBOL_GPL(ip6_redirect); - -void ip6_redirect_no_header(struct sk_buff *skb, struct net *net, int oif, +void ip6_redirect(struct sk_buff *skb, struct net *net, int oif, u32 mark) { const struct ipv6hdr *iph = ipv6_hdr(skb); @@ -1266,12 +1246,7 @@ void ip6_redirect_no_header(struct sk_buff *skb, struct net *net, int oif, rt6_do_redirect(dst, NULL, skb); dst_release(dst); } - -void ip6_sk_redirect(struct sk_buff *skb, struct sock *sk) -{ - ip6_redirect(skb, sock_net(sk), sk->sk_bound_dev_if, sk->sk_mark); -} -EXPORT_SYMBOL_GPL(ip6_sk_redirect); +EXPORT_SYMBOL_GPL(ip6_redirect); static unsigned int ip6_default_advmss(const struct dst_entry *dst) {