Message ID | 1287666383-17615-1-git-send-email-timo.teras@iki.fi |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
Headers | show |
Le jeudi 21 octobre 2010 à 16:06 +0300, Timo Teräs a écrit : > Otherwise we have race condition to user land: > 1. process A: changes IP address > 2. process A: kernel sends RTM_NEWADDR (and schedules out) > 3. process B: gets notification > 4. process B: tries to bind() to new IP, but fails with EADDRNOTAVAIL > because FIB is not yet updated and inet_addr_type() in inet_bind() > does not recognize the IP as local > 5. process A: calls inetaddr_chain notifiers which updates FIB > > Fix the error path to synchronize with configuration changes and retry > the address type check. > > IPv6 side seems to handle the notifications properly: bind() immediately > after RTM_NEWADDR succeeds as expected. This is because ipv6_chk_addr() > uses inet6_addr_lst which is updated before address notification. > > Signed-off-by: Timo Teräs <timo.teras@iki.fi> > --- > Since there was no reply to my question if this is ok, I interpreted it > as "maybe". So here's the code for review. Hopefully this helps determining > if this is an acceptable fix. > Just say : no Really Timo, this problem must get another fix. I understand you need an urgent fix, you can use your patch in the meantime, of course ;) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/21/2010 05:10 PM, Eric Dumazet wrote: > Le jeudi 21 octobre 2010 à 16:06 +0300, Timo Teräs a écrit : >> Otherwise we have race condition to user land: >> 1. process A: changes IP address >> 2. process A: kernel sends RTM_NEWADDR (and schedules out) >> 3. process B: gets notification >> 4. process B: tries to bind() to new IP, but fails with EADDRNOTAVAIL >> because FIB is not yet updated and inet_addr_type() in inet_bind() >> does not recognize the IP as local >> 5. process A: calls inetaddr_chain notifiers which updates FIB >> >> Fix the error path to synchronize with configuration changes and retry >> the address type check. >> >> IPv6 side seems to handle the notifications properly: bind() immediately >> after RTM_NEWADDR succeeds as expected. This is because ipv6_chk_addr() >> uses inet6_addr_lst which is updated before address notification. >> >> Signed-off-by: Timo Teräs <timo.teras@iki.fi> >> --- >> Since there was no reply to my question if this is ok, I interpreted it >> as "maybe". So here's the code for review. Hopefully this helps determining >> if this is an acceptable fix. > > Just say : no > > Really Timo, this problem must get another fix. > > I understand you need an urgent fix, you can use your patch in the > meantime, of course ;) Fine. I figured you might feel this way. My final idea to fix this, is to modify inet_addr_type() do the address type check using the ifa lists. This probably involves making ifa lists rcu (did not take close look on how ifa lists work in current ipv4 code). This is basically how ipv6 side works. I'm not too sure how to go on with this, so I'll wait up until someone more qualified gets the time to look at this. I think for my immediate needs I might be able to get away with using the kludge patch, enabling non-local binding or fixing my userland code to listen RTM_NEWROUTE/RTN_LOCAL events (though, IPv6 link-local addresses are special and would need RTM_NEWADDR handling still to get the real device's ifindex). Btw. why do IPv6 RTN_LOCAL routes have loopback interface as dst instead of the real device? IPv4 RTN_LOCAL routes seem to have the real device so this looks inconsistent at first sight. I guess IPv6 requires this for a bunch of other things. This way it's just not really possible to get link-local IPv6 routes. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index 6a1100c..013ab11 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -481,8 +481,22 @@ int inet_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) addr->sin_addr.s_addr != htonl(INADDR_ANY) && chk_addr_ret != RTN_LOCAL && chk_addr_ret != RTN_MULTICAST && - chk_addr_ret != RTN_BROADCAST) - goto out; + chk_addr_ret != RTN_BROADCAST) { + /* inet_addr_type() does a FIB lookup to check the + * address type. Since FIB is updated after sending + * RTM_NEWADDR notification, an application may end up + * doing bind() before the FIB is updated. To avoid + * returning a false negative, wait for possible ongoing + * address changes to finish by acquiring rtnl lock and + * retry the address type lookup. */ + rtnl_lock(); + rtnl_unlock(); + chk_addr_ret = inet_addr_type(sock_net(sk), addr->sin_addr.s_addr); + if (chk_addr_ret != RTN_LOCAL && + chk_addr_ret != RTN_MULTICAST && + chk_addr_ret != RTN_BROADCAST) + goto out; + } snum = ntohs(addr->sin_port); err = -EACCES; diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index 56b9bf2..b1a83e1 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -300,7 +300,7 @@ int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) goto out; } - /* Reproduce AF_INET checks to make the bindings consitant */ + /* Reproduce AF_INET checks to make the bindings consistent */ v4addr = addr->sin6_addr.s6_addr32[3]; chk_addr_ret = inet_addr_type(net, v4addr); if (!sysctl_ip_nonlocal_bind && @@ -309,8 +309,15 @@ int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) chk_addr_ret != RTN_LOCAL && chk_addr_ret != RTN_MULTICAST && chk_addr_ret != RTN_BROADCAST) { - err = -EADDRNOTAVAIL; - goto out; + rtnl_lock(); + rtnl_unlock(); + chk_addr_ret = inet_addr_type(net, v4addr); + if (chk_addr_ret != RTN_LOCAL && + chk_addr_ret != RTN_MULTICAST && + chk_addr_ret != RTN_BROADCAST) { + err = -EADDRNOTAVAIL; + goto out; + } } } else { if (addr_type != IPV6_ADDR_ANY) {
Otherwise we have race condition to user land: 1. process A: changes IP address 2. process A: kernel sends RTM_NEWADDR (and schedules out) 3. process B: gets notification 4. process B: tries to bind() to new IP, but fails with EADDRNOTAVAIL because FIB is not yet updated and inet_addr_type() in inet_bind() does not recognize the IP as local 5. process A: calls inetaddr_chain notifiers which updates FIB Fix the error path to synchronize with configuration changes and retry the address type check. IPv6 side seems to handle the notifications properly: bind() immediately after RTM_NEWADDR succeeds as expected. This is because ipv6_chk_addr() uses inet6_addr_lst which is updated before address notification. Signed-off-by: Timo Teräs <timo.teras@iki.fi> --- Since there was no reply to my question if this is ok, I interpreted it as "maybe". So here's the code for review. Hopefully this helps determining if this is an acceptable fix. net/ipv4/af_inet.c | 18 ++++++++++++++++-- net/ipv6/af_inet6.c | 13 ++++++++++--- 2 files changed, 26 insertions(+), 5 deletions(-)