Message ID | 1483625601-10552-1-git-send-email-wangyufen@huawei.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
From: Wang Yufen <wangyufen@huawei.com> Date: Thu, 5 Jan 2017 22:13:21 +0800 > From: Yufen Wang <wangyufen@huawei.com> > > A possible NULL pointer dereference in tg3_get_stats64 while doing > tg3_free_consistent. ... > This patch avoids the NULL pointer dereference by using !tg3_flag(tp, INIT_COMPLETE) > instate of !tp->hw_stats. > > Signed-off-by: Yufen Wang <wangyufen@huawei.com> > --- > drivers/net/ethernet/broadcom/tg3.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c > index 185e9e0..012f18d 100644 > --- a/drivers/net/ethernet/broadcom/tg3.c > +++ b/drivers/net/ethernet/broadcom/tg3.c > @@ -14148,7 +14148,7 @@ static struct rtnl_link_stats64 *tg3_get_stats64(struct net_device *dev, > struct tg3 *tp = netdev_priv(dev); > > spin_lock_bh(&tp->lock); > - if (!tp->hw_stats) { > + if (!tg3_flag(tp, INIT_COMPLETE)) { The real issue is the manner and order in which the driver performs initialization actions relative to netif_device_{attach,detach}(). That is what needs to be fixed here, instead of adding more and more ad-hoc tests to the various methods which can be invoked once the netif_device_attach() occurs.
On Thu, Jan 5, 2017 at 9:33 AM, David Miller <davem@davemloft.net> wrote: > From: Wang Yufen <wangyufen@huawei.com> > Date: Thu, 5 Jan 2017 22:13:21 +0800 > >> From: Yufen Wang <wangyufen@huawei.com> >> >> A possible NULL pointer dereference in tg3_get_stats64 while doing >> tg3_free_consistent. > ... >> This patch avoids the NULL pointer dereference by using !tg3_flag(tp, INIT_COMPLETE) >> instate of !tp->hw_stats. >> >> Signed-off-by: Yufen Wang <wangyufen@huawei.com> >> --- >> drivers/net/ethernet/broadcom/tg3.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c >> index 185e9e0..012f18d 100644 >> --- a/drivers/net/ethernet/broadcom/tg3.c >> +++ b/drivers/net/ethernet/broadcom/tg3.c >> @@ -14148,7 +14148,7 @@ static struct rtnl_link_stats64 *tg3_get_stats64(struct net_device *dev, >> struct tg3 *tp = netdev_priv(dev); >> >> spin_lock_bh(&tp->lock); >> - if (!tp->hw_stats) { >> + if (!tg3_flag(tp, INIT_COMPLETE)) { > > The real issue is the manner and order in which the driver performs > initialization actions relative to netif_device_{attach,detach}(). > > That is what needs to be fixed here, instead of adding more and more > ad-hoc tests to the various methods which can be invoked once the > netif_device_attach() occurs. Normally, ndo_get_stats64() should be under rtnl lock in the netlink code path and we should be safe. We only free tp->hw_stats under rtnl lock in the close path or ethtool path. But it looks like ndo_get_stats() can be called without rtnl lock from net-procfs.c. So it is possible that we'll read tp->hw_stats after it has been freed. For example, if we are reading /proc/net/dev and closing tg3 at the same time. David, is not taking rtnl_lock in net-procfs.c by design?
From: Michael Chan <michael.chan@broadcom.com> Date: Thu, 5 Jan 2017 12:04:13 -0800 > But it looks like ndo_get_stats() can be called without rtnl lock from > net-procfs.c. So it is possible that we'll read tp->hw_stats after it > has been freed. For example, if we are reading /proc/net/dev and > closing tg3 at the same time. David, is not taking rtnl_lock in > net-procfs.c by design? Probably not, that dev_get_stats() call probably should be surrounded by RTNL protection. Doing a quick grep on dev_get_stats() shows other call sites, most of which are using it to fetch slave device statistics from the get stats method of the parent. Which should be ok. It appears that the vlan procfs code in net/8021q/vlanproc.c has a similar bug as net/core/net-procfs.c Maybe net/core/net-sysfs.c has the same issue as well, and perhaps also net/openvswitch/vport.c:ovs_vport_get_stats().
On Thu, Jan 5, 2017 at 12:17 PM, David Miller <davem@davemloft.net> wrote: > From: Michael Chan <michael.chan@broadcom.com> > Date: Thu, 5 Jan 2017 12:04:13 -0800 > >> But it looks like ndo_get_stats() can be called without rtnl lock from >> net-procfs.c. So it is possible that we'll read tp->hw_stats after it >> has been freed. For example, if we are reading /proc/net/dev and >> closing tg3 at the same time. David, is not taking rtnl_lock in >> net-procfs.c by design? > > Probably not, that dev_get_stats() call probably should be surrounded > by RTNL protection. > > Doing a quick grep on dev_get_stats() shows other call sites, most of > which are using it to fetch slave device statistics from the get stats > method of the parent. Which should be ok. > > It appears that the vlan procfs code in net/8021q/vlanproc.c has a > similar bug as net/core/net-procfs.c > > Maybe net/core/net-sysfs.c has the same issue as well, and perhaps also > net/openvswitch/vport.c:ovs_vport_get_stats(). > OK. I will send a patch later today to add rtnl_lock to these callers.
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c index 185e9e0..012f18d 100644 --- a/drivers/net/ethernet/broadcom/tg3.c +++ b/drivers/net/ethernet/broadcom/tg3.c @@ -14148,7 +14148,7 @@ static struct rtnl_link_stats64 *tg3_get_stats64(struct net_device *dev, struct tg3 *tp = netdev_priv(dev); spin_lock_bh(&tp->lock); - if (!tp->hw_stats) { + if (!tg3_flag(tp, INIT_COMPLETE)) { *stats = tp->net_stats_prev; spin_unlock_bh(&tp->lock); return stats;