Message ID | 20111007124954.7089.9776.stgit@ltc219.sdl.hitachi.co.jp |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
On Fri, Oct 7, 2011 at 8:49 PM, Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com> wrote: > The bond->recv_probe is called in bond_handle_frame() when > a packet is received, but bond_close() sets it to NULL. So, > a panic occurs when both functions work in parallel. > > Why this happen: > After null pointer check of bond->recv_probe, an sk_buff is > duplicated and bond->recv_probe is called in bond_handle_frame. > So, a panic occurs when bond_close() is called between the > check and call of bond->recv_probe. > > Patch: > This patch uses a local function pointer of bond->recv_probe > in bond_handle_frame(). So, it can avoid the null pointer > dereference. > Hmm, I don't doubt it can fix the problem, I am wondering if bond->recv_probe should be protected by bond->lock... -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi WANG Cong Thank you for your comments. (2011/10/07 22:24), Américo Wang wrote: > On Fri, Oct 7, 2011 at 8:49 PM, Mitsuo Hayasaka > <mitsuo.hayasaka.hu@hitachi.com> wrote: >> The bond->recv_probe is called in bond_handle_frame() when >> a packet is received, but bond_close() sets it to NULL. So, >> a panic occurs when both functions work in parallel. >> >> Why this happen: >> After null pointer check of bond->recv_probe, an sk_buff is >> duplicated and bond->recv_probe is called in bond_handle_frame. >> So, a panic occurs when bond_close() is called between the >> check and call of bond->recv_probe. >> >> Patch: >> This patch uses a local function pointer of bond->recv_probe >> in bond_handle_frame(). So, it can avoid the null pointer >> dereference. >> > > Hmm, I don't doubt it can fix the problem, I am wondering if > bond->recv_probe should be protected by bond->lock... Indeed, in general any resources should be protected from the asynchronous workers. At first, I thought it should be handled with lock protection, as well. However, I guess that using bond->lock on this kind of hot-path may introduces unnecessary overhead. In addition, this code works well without the strict lock protection. So, I think this change is the right way to fix it. Thanks. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le mardi 11 octobre 2011 à 22:02 +0900, HAYASAKA Mitsuo a écrit : > Hi WANG Cong > > Thank you for your comments. > > (2011/10/07 22:24), Américo Wang wrote: > > On Fri, Oct 7, 2011 at 8:49 PM, Mitsuo Hayasaka > > <mitsuo.hayasaka.hu@hitachi.com> wrote: > >> The bond->recv_probe is called in bond_handle_frame() when > >> a packet is received, but bond_close() sets it to NULL. So, > >> a panic occurs when both functions work in parallel. > >> > >> Why this happen: > >> After null pointer check of bond->recv_probe, an sk_buff is > >> duplicated and bond->recv_probe is called in bond_handle_frame. > >> So, a panic occurs when bond_close() is called between the > >> check and call of bond->recv_probe. > >> > >> Patch: > >> This patch uses a local function pointer of bond->recv_probe > >> in bond_handle_frame(). So, it can avoid the null pointer > >> dereference. > >> > > > > Hmm, I don't doubt it can fix the problem, I am wondering if > > bond->recv_probe should be protected by bond->lock... > > Indeed, in general any resources should be protected from the asynchronous > workers. > > At first, I thought it should be handled with lock protection, as well. > However, I guess that using bond->lock on this kind of hot-path may > introduces unnecessary overhead. In addition, this code works well > without the strict lock protection. So, I think this change is the > right way to fix it. Maybe, but then ACCESS_ONCE() is needed to prevent compiler to 'optimize' the temporary variable. Or use rcu_dereference() to make the whole thing really safe and self documented. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Eric, Thank you for your comment. (2011/10/11 22:23), Eric Dumazet wrote: > Le mardi 11 octobre 2011 à 22:02 +0900, HAYASAKA Mitsuo a écrit : >> Hi WANG Cong >> >> Thank you for your comments. >> >> (2011/10/07 22:24), Américo Wang wrote: >>> On Fri, Oct 7, 2011 at 8:49 PM, Mitsuo Hayasaka >>> <mitsuo.hayasaka.hu@hitachi.com> wrote: >>>> The bond->recv_probe is called in bond_handle_frame() when >>>> a packet is received, but bond_close() sets it to NULL. So, >>>> a panic occurs when both functions work in parallel. >>>> >>>> Why this happen: >>>> After null pointer check of bond->recv_probe, an sk_buff is >>>> duplicated and bond->recv_probe is called in bond_handle_frame. >>>> So, a panic occurs when bond_close() is called between the >>>> check and call of bond->recv_probe. >>>> >>>> Patch: >>>> This patch uses a local function pointer of bond->recv_probe >>>> in bond_handle_frame(). So, it can avoid the null pointer >>>> dereference. >>>> >>> >>> Hmm, I don't doubt it can fix the problem, I am wondering if >>> bond->recv_probe should be protected by bond->lock... >> >> Indeed, in general any resources should be protected from the asynchronous >> workers. >> >> At first, I thought it should be handled with lock protection, as well. >> However, I guess that using bond->lock on this kind of hot-path may >> introduces unnecessary overhead. In addition, this code works well >> without the strict lock protection. So, I think this change is the >> right way to fix it. > > Maybe, but then ACCESS_ONCE() is needed to prevent compiler to > 'optimize' the temporary variable. > > Or use rcu_dereference() to make the whole thing really safe and self > documented. > I agreed. I'd like to send the patch with ACCESS_ONCE(), again. Thanks. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 6d79b78..f1baec5 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1435,6 +1435,8 @@ static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb) struct sk_buff *skb = *pskb; struct slave *slave; struct bonding *bond; + void (*recv_probe)(struct sk_buff *, struct bonding *, + struct slave *); skb = skb_share_check(skb, GFP_ATOMIC); if (unlikely(!skb)) @@ -1448,11 +1450,12 @@ static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb) if (bond->params.arp_interval) slave->dev->last_rx = jiffies; - if (bond->recv_probe) { + recv_probe = bond->recv_probe; + if (recv_probe) { struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC); if (likely(nskb)) { - bond->recv_probe(nskb, bond, slave); + recv_probe(nskb, bond, slave); dev_kfree_skb(nskb); } }
The bond->recv_probe is called in bond_handle_frame() when a packet is received, but bond_close() sets it to NULL. So, a panic occurs when both functions work in parallel. Why this happen: After null pointer check of bond->recv_probe, an sk_buff is duplicated and bond->recv_probe is called in bond_handle_frame. So, a panic occurs when bond_close() is called between the check and call of bond->recv_probe. Patch: This patch uses a local function pointer of bond->recv_probe in bond_handle_frame(). So, it can avoid the null pointer dereference. Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com> Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Andy Gospodarek <andy@greyhouse.net> --- drivers/net/bonding/bond_main.c | 7 +++++-- 1 files changed, 5 insertions(+), 2 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html