[net,v2] openvswitch: fix a possible deadlock and lockdep warning

Message ID 1395929134-4487-1-git-send-email-fbl@redhat.com
State Accepted, archived
Delegated to: David Miller

Commit Message

Flavio Leitner March 27, 2014, 2:05 p.m. UTC
There are two problematic situations.

A deadlock can happen when is_percpu is false because the code
can be interrupted while holding the spinlock: the softirq then
runs ovs_flow_stats_update(), which tries to take the same lock.

The second situation is that when is_percpu is true, the code
correctly disables BH, but only for the local CPU, so the
following can happen when locking a remote CPU without
disabling BH:

       CPU#0                            CPU#1
  ovs_flow_stats_get()
   stats_read()
 +->spin_lock remote CPU#1        ovs_flow_stats_get()
 |  <interrupted>                  stats_read()
 |  ...                       +-->  spin_lock remote CPU#0
 |                            |     <interrupted>
 |  ovs_flow_stats_update()   |     ...
 |   spin_lock local CPU#0 <--+     ovs_flow_stats_update()
 +---------------------------------- spin_lock local CPU#1

This patch disables BH for both cases, fixing the deadlocks.
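
For reference, this is the read path as it looks after the patch,
reconstructed from the diff at the bottom of this page. A single
local_bh_disable()/local_bh_enable() pair now covers both the
shared-stats case and the whole per-CPU loop, so a softirq on this
CPU can no longer interrupt a stats read and re-take a stats lock:

void ovs_flow_stats_get(struct sw_flow *flow, struct ovs_flow_stats *ovs_stats,
			unsigned long *used, __be16 *tcp_flags)
{
	int cpu;

	*used = 0;
	*tcp_flags = 0;
	memset(ovs_stats, 0, sizeof(*ovs_stats));

	/* BH off once, for both cases: while it is held, no softirq on
	 * this CPU can run ovs_flow_stats_update() and try to take a
	 * stats lock that stats_read() already holds. */
	local_bh_disable();
	if (!flow->stats.is_percpu) {
		stats_read(flow->stats.stat, ovs_stats, used, tcp_flags);
	} else {
		for_each_possible_cpu(cpu) {
			struct flow_stats *stats;

			stats = per_cpu_ptr(flow->stats.cpu_stats, cpu);
			stats_read(stats, ovs_stats, used, tcp_flags);
		}
	}
	local_bh_enable();
}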

Comments

Pravin B Shelar March 27, 2014, 5:19 p.m. UTC | #1
On Thu, Mar 27, 2014 at 7:05 AM, Flavio Leitner <fbl@redhat.com> wrote:
> There are two problematic situations.
> [...]
> This patch disables BH for both cases, fixing the deadlocks.

This bug is already fixed in OVS.

Jesse,
Can you send the fix upstream, or are you waiting for other patches?
Flavio Leitner March 27, 2014, 5:33 p.m. UTC | #2
On Thu, Mar 27, 2014 at 10:19:23AM -0700, Pravin Shelar wrote:
> On Thu, Mar 27, 2014 at 7:05 AM, Flavio Leitner <fbl@redhat.com> wrote:
> > [...]
>
> This bug is already fixed in OVS.

Could you point me to the commit? I am not finding anything
recent.

Jesse Gross March 27, 2014, 5:37 p.m. UTC | #3
On Thu, Mar 27, 2014 at 10:33 AM, Flavio Leitner <fbl@redhat.com> wrote:
> On Thu, Mar 27, 2014 at 10:19:23AM -0700, Pravin Shelar wrote:
>> On Thu, Mar 27, 2014 at 7:05 AM, Flavio Leitner <fbl@redhat.com> wrote:
>> [...]
>>
>> This bug is already fixed in OVS.
>
> Could you point me to the commit? I am not finding anything
> recent.

This is the commit:

commit 9d73c9cac76ba557fdac4a89c1b7eafe132b85a3
Author: Pravin B Shelar <pshelar@nicira.com>
Date:   Tue Dec 17 15:43:30 2013 -0800

    datapath: Fix deadlock during stats update.

I thought that I had sent it in the most recent batch of changes for
net but it looks like I missed it.
Flavio Leitner March 27, 2014, 5:44 p.m. UTC | #4
On Thu, Mar 27, 2014 at 10:37:32AM -0700, Jesse Gross wrote:
> On Thu, Mar 27, 2014 at 10:33 AM, Flavio Leitner <fbl@redhat.com> wrote:
> > [...]
> >
> > Could you point me to the commit? I am not finding anything
> > recent.
> 
> This is the commit:
> 
> commit 9d73c9cac76ba557fdac4a89c1b7eafe132b85a3
> Author: Pravin B Shelar <pshelar@nicira.com>
> Date:   Tue Dec 17 15:43:30 2013 -0800
> 
>     datapath: Fix deadlock during stats update.
> 
> I thought that I had sent it in the most recent batch of changes for
> net but it looks like I missed it.

That commit is incomplete. Look at scenario #2, where I explain why it
is necessary to disable BH on all CPUs and not just the local one.

fbl
Jesse Gross March 27, 2014, 6:32 p.m. UTC | #5
On Thu, Mar 27, 2014 at 10:44 AM, Flavio Leitner <fbl@redhat.com> wrote:
> [...]
>
> That commit is incomplete. Look at scenario #2, where I explain why it
> is necessary to disable BH on all CPUs and not just the local one.

OK, I understand the second problem now. OVS master (which I am
currently working to cross-port to net-next) uses a different strategy
that also always disables bottom halves, though for a different reason.
Since I forgot to send the original patch, maybe we can just apply this
one to net instead and use the new stuff directly everywhere else.

Pravin, any objections?
Pravin B Shelar March 27, 2014, 6:40 p.m. UTC | #6
On Thu, Mar 27, 2014 at 11:32 AM, Jesse Gross <jesse@nicira.com> wrote:
> [...]
>
> OK, I understand the second problem now. OVS master (which I am
> currently working to cross-port to net-next) uses a different strategy
> that also always disables bottom halves, though for a different reason.
> Since I forgot to send the original patch, maybe we can just apply this
> one to net instead and use the new stuff directly everywhere else.

The OVS patch fixes the locking issue.

The local CPU check is an optimization for better latency, which can be
significant on a large SMP system. I think we should fix lockdep rather
than add latency to OVS packet processing. But all of this is going to
change with NUMA stats anyway, so I do not have any problem pushing
this new patch.
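
For context, the optimization in question is visible in the lines the
patch removes (reconstructed from the diff below): BH was disabled only
around the local CPU's stats, so taking a remote CPU's stats lock
remained interruptible by a softirq, which is exactly what scenario #2
in the commit message exploits:

	cur_cpu = get_cpu();
	for_each_possible_cpu(cpu) {
		struct flow_stats *stats;

		/* BH disabled only while reading the local CPU's stats. */
		if (cpu == cur_cpu)
			local_bh_disable();

		stats = per_cpu_ptr(flow->stats.cpu_stats, cpu);
		stats_read(stats, ovs_stats, used, tcp_flags);

		if (cpu == cur_cpu)
			local_bh_enable();

		/* For a remote CPU, the spin_lock inside stats_read() is
		 * taken with BH enabled, so a softirq running
		 * ovs_flow_stats_update() can interrupt it. */
	}
	put_cpu();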
Jesse Gross March 27, 2014, 8:01 p.m. UTC | #7
On Thu, Mar 27, 2014 at 11:40 AM, Pravin Shelar <pshelar@nicira.com> wrote:
> [...]
>
> The OVS patch fixes the locking issue.
>
> The local CPU check is an optimization for better latency, which can be
> significant on a large SMP system. I think we should fix lockdep rather
> than add latency to OVS packet processing. But all of this is going to
> change with NUMA stats anyway, so I do not have any problem pushing
> this new patch.

I think there is actually a possible deadlock here, not just a lockdep
warning. If you look at Flavio's diagram, a rather complicated series
of events would be necessary, but it seems theoretically possible.
Pravin B Shelar March 27, 2014, 9:18 p.m. UTC | #8
On Thu, Mar 27, 2014 at 1:01 PM, Jesse Gross <jesse@nicira.com> wrote:
> [...]
>
> I think there is actually a possible deadlock here, not just a lockdep
> warning. If you look at Flavio's diagram, a rather complicated series
> of events would be necessary, but it seems theoretically possible.

Right, I missed the v2 commit message. This patch is required to fix all
the deadlock cases.
Jesse Gross March 27, 2014, 9:51 p.m. UTC | #9
On Thu, Mar 27, 2014 at 7:05 AM, Flavio Leitner <fbl@redhat.com> wrote:
> There are two problematic situations.
> [...]
> Signed-off-by: Flavio Leitner <fbl@redhat.com>

David, can you apply this directly to the net tree at this point?

Acked-by: Jesse Gross <jesse@nicira.com>
David Miller March 28, 2014, 8:42 p.m. UTC | #10
From: Flavio Leitner <fbl@redhat.com>
Date: Thu, 27 Mar 2014 11:05:34 -0300

> There are two problematic situations.
> 
> A deadlock can happen when is_percpu is false because the code
> can be interrupted while holding the spinlock: the softirq then
> runs ovs_flow_stats_update(), which tries to take the same lock.
> 
> The second situation is that when is_percpu is true, the code
> correctly disables BH, but only for the local CPU, so the
> following can happen when locking a remote CPU without
> disabling BH:
 ...
> Signed-off-by: Flavio Leitner <fbl@redhat.com>
> ---
>  v2: better commit message

Ok, I just saw Jesse's request that I apply this directly, so I
have done so.

Thanks.
Patch

=================================
[ INFO: inconsistent lock state ]
3.14.0-rc8-00007-g632b06a #1 Tainted: G          I
---------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
swapper/0/0 [HC0[0]:SC1[5]:HE1:SE0] takes:
(&(&cpu_stats->lock)->rlock){+.?...}, at: [<ffffffffa05dd8a1>] ovs_flow_stats_update+0x51/0xd0 [openvswitch]
{SOFTIRQ-ON-W} state was registered at:
[<ffffffff810f973f>] __lock_acquire+0x68f/0x1c40
[<ffffffff810fb4e2>] lock_acquire+0xa2/0x1d0
[<ffffffff817d8d9e>] _raw_spin_lock+0x3e/0x80
[<ffffffffa05dd9e4>] ovs_flow_stats_get+0xc4/0x1e0 [openvswitch]
[<ffffffffa05da855>] ovs_flow_cmd_fill_info+0x185/0x360 [openvswitch]
[<ffffffffa05daf05>] ovs_flow_cmd_build_info.constprop.27+0x55/0x90 [openvswitch]
[<ffffffffa05db41d>] ovs_flow_cmd_new_or_set+0x4dd/0x570 [openvswitch]
[<ffffffff816c245d>] genl_family_rcv_msg+0x1cd/0x3f0
[<ffffffff816c270e>] genl_rcv_msg+0x8e/0xd0
[<ffffffff816c0239>] netlink_rcv_skb+0xa9/0xc0
[<ffffffff816c0798>] genl_rcv+0x28/0x40
[<ffffffff816bf830>] netlink_unicast+0x100/0x1e0
[<ffffffff816bfc57>] netlink_sendmsg+0x347/0x770
[<ffffffff81668e9c>] sock_sendmsg+0x9c/0xe0
[<ffffffff816692d9>] ___sys_sendmsg+0x3a9/0x3c0
[<ffffffff8166a911>] __sys_sendmsg+0x51/0x90
[<ffffffff8166a962>] SyS_sendmsg+0x12/0x20
[<ffffffff817e3ce9>] system_call_fastpath+0x16/0x1b
irq event stamp: 1740726
hardirqs last  enabled at (1740726): [<ffffffff8175d5e0>] ip6_finish_output2+0x4f0/0x840
hardirqs last disabled at (1740725): [<ffffffff8175d59b>] ip6_finish_output2+0x4ab/0x840
softirqs last  enabled at (1740674): [<ffffffff8109be12>] _local_bh_enable+0x22/0x50
softirqs last disabled at (1740675): [<ffffffff8109db05>] irq_exit+0xc5/0xd0

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&cpu_stats->lock)->rlock);
  <Interrupt>
    lock(&(&cpu_stats->lock)->rlock);

 *** DEADLOCK ***

5 locks held by swapper/0/0:
 #0:  (((&ifa->dad_timer))){+.-...}, at: [<ffffffff810a7155>] call_timer_fn+0x5/0x320
 #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff81788a55>] mld_sendpack+0x5/0x4a0
 #2:  (rcu_read_lock_bh){.+....}, at: [<ffffffff8175d149>] ip6_finish_output2+0x59/0x840
 #3:  (rcu_read_lock_bh){.+....}, at: [<ffffffff8168ba75>] __dev_queue_xmit+0x5/0x9b0
 #4:  (rcu_read_lock){.+.+..}, at: [<ffffffffa05e41b5>] internal_dev_xmit+0x5/0x110 [openvswitch]

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G          I  3.14.0-rc8-00007-g632b06a #1
Hardware name:                  /DX58SO, BIOS SOX5810J.86A.5599.2012.0529.2218 05/29/2012
 0000000000000000 0fcf20709903df0c ffff88042d603808 ffffffff817cfe3c
 ffffffff81c134c0 ffff88042d603858 ffffffff817cb6da 0000000000000005
 ffffffff00000001 ffff880400000000 0000000000000006 ffffffff81c134c0
Call Trace:
 <IRQ>  [<ffffffff817cfe3c>] dump_stack+0x4d/0x66
 [<ffffffff817cb6da>] print_usage_bug+0x1f4/0x205
 [<ffffffff810f7f10>] ? check_usage_backwards+0x180/0x180
 [<ffffffff810f8963>] mark_lock+0x223/0x2b0
 [<ffffffff810f96d3>] __lock_acquire+0x623/0x1c40
 [<ffffffff810f5707>] ? __lock_is_held+0x57/0x80
 [<ffffffffa05e26c6>] ? masked_flow_lookup+0x236/0x250 [openvswitch]
 [<ffffffff810fb4e2>] lock_acquire+0xa2/0x1d0
 [<ffffffffa05dd8a1>] ? ovs_flow_stats_update+0x51/0xd0 [openvswitch]
 [<ffffffff817d8d9e>] _raw_spin_lock+0x3e/0x80
 [<ffffffffa05dd8a1>] ? ovs_flow_stats_update+0x51/0xd0 [openvswitch]
 [<ffffffffa05dd8a1>] ovs_flow_stats_update+0x51/0xd0 [openvswitch]
 [<ffffffffa05dcc64>] ovs_dp_process_received_packet+0x84/0x120 [openvswitch]
 [<ffffffff810f93f7>] ? __lock_acquire+0x347/0x1c40
 [<ffffffffa05e3bea>] ovs_vport_receive+0x2a/0x30 [openvswitch]
 [<ffffffffa05e4218>] internal_dev_xmit+0x68/0x110 [openvswitch]
 [<ffffffffa05e41b5>] ? internal_dev_xmit+0x5/0x110 [openvswitch]
 [<ffffffff8168b4a6>] dev_hard_start_xmit+0x2e6/0x8b0
 [<ffffffff8168be87>] __dev_queue_xmit+0x417/0x9b0
 [<ffffffff8168ba75>] ? __dev_queue_xmit+0x5/0x9b0
 [<ffffffff8175d5e0>] ? ip6_finish_output2+0x4f0/0x840
 [<ffffffff8168c430>] dev_queue_xmit+0x10/0x20
 [<ffffffff8175d641>] ip6_finish_output2+0x551/0x840
 [<ffffffff8176128a>] ? ip6_finish_output+0x9a/0x220
 [<ffffffff8176128a>] ip6_finish_output+0x9a/0x220
 [<ffffffff8176145f>] ip6_output+0x4f/0x1f0
 [<ffffffff81788c29>] mld_sendpack+0x1d9/0x4a0
 [<ffffffff817895b8>] mld_send_initial_cr.part.32+0x88/0xa0
 [<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220
 [<ffffffff8178e301>] ipv6_mc_dad_complete+0x31/0x50
 [<ffffffff817690d7>] addrconf_dad_completed+0x147/0x220
 [<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220
 [<ffffffff8176934f>] addrconf_dad_timer+0x19f/0x1c0
 [<ffffffff810a71e9>] call_timer_fn+0x99/0x320
 [<ffffffff810a7155>] ? call_timer_fn+0x5/0x320
 [<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220
 [<ffffffff810a76c4>] run_timer_softirq+0x254/0x3b0
 [<ffffffff8109d47d>] __do_softirq+0x12d/0x480

Signed-off-by: Flavio Leitner <fbl@redhat.com>
---
 v2: better commit message

 net/openvswitch/flow.c | 26 ++++++--------------------
 1 file changed, 6 insertions(+), 20 deletions(-)

diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index dda451f..2998989 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -103,30 +103,24 @@  static void stats_read(struct flow_stats *stats,
 void ovs_flow_stats_get(struct sw_flow *flow, struct ovs_flow_stats *ovs_stats,
 			unsigned long *used, __be16 *tcp_flags)
 {
-	int cpu, cur_cpu;
+	int cpu;
 
 	*used = 0;
 	*tcp_flags = 0;
 	memset(ovs_stats, 0, sizeof(*ovs_stats));
 
+	local_bh_disable();
 	if (!flow->stats.is_percpu) {
 		stats_read(flow->stats.stat, ovs_stats, used, tcp_flags);
 	} else {
-		cur_cpu = get_cpu();
 		for_each_possible_cpu(cpu) {
 			struct flow_stats *stats;
 
-			if (cpu == cur_cpu)
-				local_bh_disable();
-
 			stats = per_cpu_ptr(flow->stats.cpu_stats, cpu);
 			stats_read(stats, ovs_stats, used, tcp_flags);
-
-			if (cpu == cur_cpu)
-				local_bh_enable();
 		}
-		put_cpu();
 	}
+	local_bh_enable();
 }
 
 static void stats_reset(struct flow_stats *stats)
@@ -141,25 +135,17 @@  static void stats_reset(struct flow_stats *stats)
 
 void ovs_flow_stats_clear(struct sw_flow *flow)
 {
-	int cpu, cur_cpu;
+	int cpu;
 
+	local_bh_disable();
 	if (!flow->stats.is_percpu) {
 		stats_reset(flow->stats.stat);
 	} else {
-		cur_cpu = get_cpu();
-
 		for_each_possible_cpu(cpu) {
-
-			if (cpu == cur_cpu)
-				local_bh_disable();
-
 			stats_reset(per_cpu_ptr(flow->stats.cpu_stats, cpu));
-
-			if (cpu == cur_cpu)
-				local_bh_enable();
 		}
-		put_cpu();
 	}
+	local_bh_enable();
 }
 
 static int check_header(struct sk_buff *skb, int len)