Message ID | 20190225212721.2908-1-dima@arista.com |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Headers | show |
Series | rtnetlink: Synchronze net in rtnl_unregister() | expand |
On 02/25/2019 01:27 PM, Dmitry Safonov wrote: > rtnl_unregister() unsets handler from table, which is protected > by rtnl_lock or RCU. At this moment only dump handlers access the table > with rcu_lock(). Every other user accesses under rtnl. > > Callers may expect that rtnl_unregister() prevents any further handlers > calls, alike rtnl_unregister_all(). And they do expect it. > > I've looked on in-tree caller uses: > br_mdb: safe, but in err-path br_netlink_init() > fib_rules: safe - err-path is very early in __init > ip6mr: safe - following unregister_pernet_subsys() calls internally rcu_barrier() > qrtr: safe - following sock_unregister() calls internally synchronize_rcu() If rcu_barrier() was needed, then all callers should use it. If synchronize_rcu() was needed, then all callers should use it. But mixing is probably wrong. > > While it's possible to document that rtnl_unregister() requires > synchronize_net() afterwards - unlike rtnl_unregister_all(), I believe > the module exit is very much slow-path. rtnl_unregister_all() needs the sychronize_rcu() at this moment because of the kfree(tab), not because of the kfree_rcu(link, rcu); > > Issue seems to be very theoretical and unlikely, so I'm not Cc'ing > stable tree. > > Fixes: 6853dd488119 ("rtnetlink: protect handler table with rcu") > Cc: "David S. Miller" <davem@davemloft.net> > Cc: Florian Westphal <fw@strlen.de> > Cc: "Hannes Frederic Sowa" <hannes@stressinduktion.org> > Cc: netdev@vger.kernel.org > Signed-off-by: Dmitry Safonov <dima@arista.com> > --- > net/core/rtnetlink.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c > index 5ea1bed08ede..3db70da4f951 100644 > --- a/net/core/rtnetlink.c > +++ b/net/core/rtnetlink.c > @@ -308,7 +308,9 @@ int rtnl_unregister(int protocol, int msgtype) > rcu_assign_pointer(tab[msgindex], NULL); > rtnl_unlock(); > > - kfree_rcu(link, rcu); > + synchronize_net(); > + > + kfree(link); I really do not see a difference here (other than this being much slower of course) If the caller needs rcu_barrier(), then add it in the caller ?
Hi Eric, On 2/25/19 11:09 PM, Eric Dumazet wrote: > On 02/25/2019 01:27 PM, Dmitry Safonov wrote: >> While it's possible to document that rtnl_unregister() requires >> synchronize_net() afterwards - unlike rtnl_unregister_all(), I believe >> the module exit is very much slow-path. > > rtnl_unregister_all() needs the sychronize_rcu() at this moment > because of the kfree(tab), not because of the kfree_rcu(link, rcu); I may be wrong here, but shouldn't we wait for grace period to elapse by the reason that rtnl_msg_handlers are protected by RCU, not only by rtnl? Like, without synchronize_net() in rtnl_unregister() - what prevents module exit race to say, rtnetlink_rcv_msg()=>rtnl_get_link()? >> --- a/net/core/rtnetlink.c >> +++ b/net/core/rtnetlink.c >> @@ -308,7 +308,9 @@ int rtnl_unregister(int protocol, int msgtype) >> rcu_assign_pointer(tab[msgindex], NULL); >> rtnl_unlock(); >> >> - kfree_rcu(link, rcu); >> + synchronize_net(); >> + >> + kfree(link); > > > I really do not see a difference here (other than this being much slower of course) > > If the caller needs rcu_barrier(), then add it in the caller ? Well, sure - but it seems confusing that rtnl_unregister() will require synchronize_rcu(), while rtnl_unregister_all() will not. And I thought no one would care about another synchronize_rcu() in exit path. Thanks, Dmitry
On 2/25/19 11:21 PM, Dmitry Safonov wrote: > Hi Eric, > > On 2/25/19 11:09 PM, Eric Dumazet wrote: >> On 02/25/2019 01:27 PM, Dmitry Safonov wrote: >>> While it's possible to document that rtnl_unregister() requires >>> synchronize_net() afterwards - unlike rtnl_unregister_all(), I believe >>> the module exit is very much slow-path. >> >> rtnl_unregister_all() needs the sychronize_rcu() at this moment >> because of the kfree(tab), not because of the kfree_rcu(link, rcu); > > I may be wrong here, but shouldn't we wait for grace period to elapse by > the reason that rtnl_msg_handlers are protected by RCU, not only by rtnl? > Like, without synchronize_net() in rtnl_unregister() - what prevents > module exit race to say, rtnetlink_rcv_msg()=>rtnl_get_link()? There is synchronize_rcu() in free_module() - but I believe it's a bit too far in unloading. Maybe, I'm missing another call on the way. Thanks, Dmitry
On 02/25/2019 03:21 PM, Dmitry Safonov wrote: > Hi Eric, > > On 2/25/19 11:09 PM, Eric Dumazet wrote: >> On 02/25/2019 01:27 PM, Dmitry Safonov wrote: >>> While it's possible to document that rtnl_unregister() requires >>> synchronize_net() afterwards - unlike rtnl_unregister_all(), I believe >>> the module exit is very much slow-path. >> >> rtnl_unregister_all() needs the sychronize_rcu() at this moment >> because of the kfree(tab), not because of the kfree_rcu(link, rcu); > > I may be wrong here, but shouldn't we wait for grace period to elapse by > the reason that rtnl_msg_handlers are protected by RCU, not only by rtnl? > Like, without synchronize_net() in rtnl_unregister() - what prevents > module exit race to say, rtnetlink_rcv_msg()=>rtnl_get_link()? > > >>> --- a/net/core/rtnetlink.c >>> +++ b/net/core/rtnetlink.c >>> @@ -308,7 +308,9 @@ int rtnl_unregister(int protocol, int msgtype) >>> rcu_assign_pointer(tab[msgindex], NULL); >>> rtnl_unlock(); >>> >>> - kfree_rcu(link, rcu); >>> + synchronize_net(); >>> + >>> + kfree(link); >> >> >> I really do not see a difference here (other than this being much slower of course) >> >> If the caller needs rcu_barrier(), then add it in the caller ? > > Well, sure - but it seems confusing that rtnl_unregister() will require > synchronize_rcu(), while rtnl_unregister_all() will not. rtnl_unregister_all() is a different beast, since it removes the whole rtnl_msg_handlers[protocol] rtnl_unregister() only removes a subset, with different usages. > And I thought no one would care about another synchronize_rcu() in exit > path. We definitely care about things be done properly. If synchronize_rcu() is needed there, be it, but kfree_rcu() seems to be fine. In any case, I believe you need to more carefully explain what is the problem here, because I could not really see it.
On 2/25/19 11:31 PM, Eric Dumazet wrote: > On 02/25/2019 03:21 PM, Dmitry Safonov wrote: >> Well, sure - but it seems confusing that rtnl_unregister() will require >> synchronize_rcu(), while rtnl_unregister_all() will not. > > rtnl_unregister_all() is a different beast, since it removes the whole rtnl_msg_handlers[protocol] > > rtnl_unregister() only removes a subset, with different usages. > >> And I thought no one would care about another synchronize_rcu() in exit >> path. > > We definitely care about things be done properly. > > If synchronize_rcu() is needed there, be it, but kfree_rcu() seems to be fine. > > In any case, I believe you need to more carefully explain what is the problem here, > because I could not really see it. Ugh, sorry - it seems that I haven't had enough coffee today. I've read again rtnetlink_rcv_msg(), who is the only user of rtnl_msg_handlers[] under read_lock, and it calls try_module_get(owner). Sorry for the noise, Dmitry
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 5ea1bed08ede..3db70da4f951 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -308,7 +308,9 @@ int rtnl_unregister(int protocol, int msgtype) rcu_assign_pointer(tab[msgindex], NULL); rtnl_unlock(); - kfree_rcu(link, rcu); + synchronize_net(); + + kfree(link); return 0; }
rtnl_unregister() unsets handler from table, which is protected by rtnl_lock or RCU. At this moment only dump handlers access the table with rcu_lock(). Every other user accesses under rtnl. Callers may expect that rtnl_unregister() prevents any further handlers calls, alike rtnl_unregister_all(). And they do expect it. I've looked on in-tree caller uses: br_mdb: safe, but in err-path br_netlink_init() fib_rules: safe - err-path is very early in __init ip6mr: safe - following unregister_pernet_subsys() calls internally rcu_barrier() qrtr: safe - following sock_unregister() calls internally synchronize_rcu() While it's possible to document that rtnl_unregister() requires synchronize_net() afterwards - unlike rtnl_unregister_all(), I believe the module exit is very much slow-path. Issue seems to be very theoretical and unlikely, so I'm not Cc'ing stable tree. Fixes: 6853dd488119 ("rtnetlink: protect handler table with rcu") Cc: "David S. Miller" <davem@davemloft.net> Cc: Florian Westphal <fw@strlen.de> Cc: "Hannes Frederic Sowa" <hannes@stressinduktion.org> Cc: netdev@vger.kernel.org Signed-off-by: Dmitry Safonov <dima@arista.com> --- net/core/rtnetlink.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)