diff mbox

[RFC,net,1/2,BUGFIX] bonding: use local function pointer of bond->recv_probe in bond_handle_frame

Message ID 20111007124954.7089.9776.stgit@ltc219.sdl.hitachi.co.jp
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Mitsuo Hayasaka Oct. 7, 2011, 12:49 p.m. UTC
The bond->recv_probe is called in bond_handle_frame() when
a packet is received, but bond_close() sets it to NULL. So,
a panic occurs when both functions work in parallel.

Why this happen:
After null pointer check of bond->recv_probe, an sk_buff is
duplicated and bond->recv_probe is called in bond_handle_frame.
So, a panic occurs when bond_close() is called between the
check and call of bond->recv_probe.

Patch:
This patch uses a local function pointer of bond->recv_probe
in bond_handle_frame(). So, it can avoid the null pointer
dereference.


Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
---

 drivers/net/bonding/bond_main.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Cong Wang Oct. 7, 2011, 1:24 p.m. UTC | #1
On Fri, Oct 7, 2011 at 8:49 PM, Mitsuo Hayasaka
<mitsuo.hayasaka.hu@hitachi.com> wrote:
> The bond->recv_probe is called in bond_handle_frame() when
> a packet is received, but bond_close() sets it to NULL. So,
> a panic occurs when both functions work in parallel.
>
> Why this happen:
> After null pointer check of bond->recv_probe, an sk_buff is
> duplicated and bond->recv_probe is called in bond_handle_frame.
> So, a panic occurs when bond_close() is called between the
> check and call of bond->recv_probe.
>
> Patch:
> This patch uses a local function pointer of bond->recv_probe
> in bond_handle_frame(). So, it can avoid the null pointer
> dereference.
>

Hmm, I don't doubt it can fix the problem, I am wondering if
bond->recv_probe should be protected by bond->lock...
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mitsuo Hayasaka Oct. 11, 2011, 1:02 p.m. UTC | #2
Hi WANG Cong

Thank you for your comments.

(2011/10/07 22:24), Américo Wang wrote:
> On Fri, Oct 7, 2011 at 8:49 PM, Mitsuo Hayasaka
> <mitsuo.hayasaka.hu@hitachi.com> wrote:
>> The bond->recv_probe is called in bond_handle_frame() when
>> a packet is received, but bond_close() sets it to NULL. So,
>> a panic occurs when both functions work in parallel.
>>
>> Why this happen:
>> After null pointer check of bond->recv_probe, an sk_buff is
>> duplicated and bond->recv_probe is called in bond_handle_frame.
>> So, a panic occurs when bond_close() is called between the
>> check and call of bond->recv_probe.
>>
>> Patch:
>> This patch uses a local function pointer of bond->recv_probe
>> in bond_handle_frame(). So, it can avoid the null pointer
>> dereference.
>>
> 
> Hmm, I don't doubt it can fix the problem, I am wondering if
> bond->recv_probe should be protected by bond->lock...

Indeed, in general any resources should be protected from the asynchronous
workers.

At first, I thought it should be handled with lock protection, as well.
However, I guess that using bond->lock on this kind of hot-path may
introduces unnecessary overhead. In addition, this code works well
without the strict lock protection. So, I think this change is the
right way to fix it.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Oct. 11, 2011, 1:23 p.m. UTC | #3
Le mardi 11 octobre 2011 à 22:02 +0900, HAYASAKA Mitsuo a écrit :
> Hi WANG Cong
> 
> Thank you for your comments.
> 
> (2011/10/07 22:24), Américo Wang wrote:
> > On Fri, Oct 7, 2011 at 8:49 PM, Mitsuo Hayasaka
> > <mitsuo.hayasaka.hu@hitachi.com> wrote:
> >> The bond->recv_probe is called in bond_handle_frame() when
> >> a packet is received, but bond_close() sets it to NULL. So,
> >> a panic occurs when both functions work in parallel.
> >>
> >> Why this happen:
> >> After null pointer check of bond->recv_probe, an sk_buff is
> >> duplicated and bond->recv_probe is called in bond_handle_frame.
> >> So, a panic occurs when bond_close() is called between the
> >> check and call of bond->recv_probe.
> >>
> >> Patch:
> >> This patch uses a local function pointer of bond->recv_probe
> >> in bond_handle_frame(). So, it can avoid the null pointer
> >> dereference.
> >>
> > 
> > Hmm, I don't doubt it can fix the problem, I am wondering if
> > bond->recv_probe should be protected by bond->lock...
> 
> Indeed, in general any resources should be protected from the asynchronous
> workers.
> 
> At first, I thought it should be handled with lock protection, as well.
> However, I guess that using bond->lock on this kind of hot-path may
> introduces unnecessary overhead. In addition, this code works well
> without the strict lock protection. So, I think this change is the
> right way to fix it.

Maybe, but then ACCESS_ONCE() is needed to prevent compiler to
'optimize' the temporary variable.

Or use rcu_dereference() to make the whole thing really safe and self
documented.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mitsuo Hayasaka Oct. 12, 2011, 7:04 a.m. UTC | #4
Hi Eric,

Thank you for your comment.

(2011/10/11 22:23), Eric Dumazet wrote:
> Le mardi 11 octobre 2011 à 22:02 +0900, HAYASAKA Mitsuo a écrit :
>> Hi WANG Cong
>>
>> Thank you for your comments.
>>
>> (2011/10/07 22:24), Américo Wang wrote:
>>> On Fri, Oct 7, 2011 at 8:49 PM, Mitsuo Hayasaka
>>> <mitsuo.hayasaka.hu@hitachi.com> wrote:
>>>> The bond->recv_probe is called in bond_handle_frame() when
>>>> a packet is received, but bond_close() sets it to NULL. So,
>>>> a panic occurs when both functions work in parallel.
>>>>
>>>> Why this happen:
>>>> After null pointer check of bond->recv_probe, an sk_buff is
>>>> duplicated and bond->recv_probe is called in bond_handle_frame.
>>>> So, a panic occurs when bond_close() is called between the
>>>> check and call of bond->recv_probe.
>>>>
>>>> Patch:
>>>> This patch uses a local function pointer of bond->recv_probe
>>>> in bond_handle_frame(). So, it can avoid the null pointer
>>>> dereference.
>>>>
>>>
>>> Hmm, I don't doubt it can fix the problem, I am wondering if
>>> bond->recv_probe should be protected by bond->lock...
>>
>> Indeed, in general any resources should be protected from the asynchronous
>> workers.
>>
>> At first, I thought it should be handled with lock protection, as well.
>> However, I guess that using bond->lock on this kind of hot-path may
>> introduces unnecessary overhead. In addition, this code works well
>> without the strict lock protection. So, I think this change is the
>> right way to fix it.
> 
> Maybe, but then ACCESS_ONCE() is needed to prevent compiler to
> 'optimize' the temporary variable.
> 
> Or use rcu_dereference() to make the whole thing really safe and self
> documented.
> 
I agreed.
I'd like to send the patch with ACCESS_ONCE(), again.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 6d79b78..f1baec5 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1435,6 +1435,8 @@  static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb)
 	struct sk_buff *skb = *pskb;
 	struct slave *slave;
 	struct bonding *bond;
+	void (*recv_probe)(struct sk_buff *, struct bonding *,
+				struct slave *);
 
 	skb = skb_share_check(skb, GFP_ATOMIC);
 	if (unlikely(!skb))
@@ -1448,11 +1450,12 @@  static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb)
 	if (bond->params.arp_interval)
 		slave->dev->last_rx = jiffies;
 
-	if (bond->recv_probe) {
+	recv_probe = bond->recv_probe;
+	if (recv_probe) {
 		struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
 
 		if (likely(nskb)) {
-			bond->recv_probe(nskb, bond, slave);
+			recv_probe(nskb, bond, slave);
 			dev_kfree_skb(nskb);
 		}
 	}