Message ID | 1566270830-28981-1-git-send-email-decui@microsoft.com |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Series | vsock: Fix a lockdep warning in __vsock_release() |
From: Dexuan Cui <decui@microsoft.com>
Date: Tue, 20 Aug 2019 03:14:22 +0000

> +static void __vsock_release2(struct sock *sk)

Do not duplicate an entire function just to adjust some aspect of
the lock debugging, please find a cleaner and more minimal way to
implement this fix.

On Tue, Aug 20, 2019 at 03:14:22AM +0000, Dexuan Cui wrote:
> Lockdep is unhappy if two locks from the same class are held.
>
> Fix the below warning by making __vsock_release() non-recursive -- this
> patch is kind of ugly, but it looks to me there is not a better way to
> deal with the problem here.
>
> ============================================
> WARNING: possible recursive locking detected
> 5.2.0+ #6 Not tainted
> --------------------------------------------
> a.out/1020 is trying to acquire lock:
> 0000000074731a98 (sk_lock-AF_VSOCK){+.+.}, at: hvs_release+0x10/0x120 [hv_sock]
>
> but task is already holding lock:
> 0000000014ff8397 (sk_lock-AF_VSOCK){+.+.}, at: __vsock_release+0x2e/0xf0 [vsock]
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(sk_lock-AF_VSOCK);
> lock(sk_lock-AF_VSOCK);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 2 locks held by a.out/1020:
> #0: 00000000f8bceaa7 (&sb->s_type->i_mutex_key#10){+.+.}, at: __sock_release+0x2d/0xa0
> #1: 0000000014ff8397 (sk_lock-AF_VSOCK){+.+.}, at: __vsock_release+0x2e/0xf0 [vsock]
>
> stack backtrace:
> CPU: 7 PID: 1020 Comm: a.out Not tainted 5.2.0+ #6
> Call Trace:
> dump_stack+0x67/0x90
> __lock_acquire.cold.66+0x14d/0x1f8
> lock_acquire+0xb5/0x1c0
> lock_sock_nested+0x6d/0x90
> hvs_release+0x10/0x120 [hv_sock]
> __vsock_release+0x24/0xf0 [vsock]
> __vsock_release+0xa0/0xf0 [vsock]
> vsock_release+0x12/0x30 [vsock]
> __sock_release+0x37/0xa0
> sock_close+0x14/0x20
> __fput+0xc1/0x250
> task_work_run+0x98/0xc0
> do_exit+0x3dd/0xc60
> do_group_exit+0x47/0xc0
> get_signal+0x169/0xc60
> do_signal+0x30/0x710
> exit_to_usermode_loop+0x50/0xa0
> do_syscall_64+0x1fc/0x220
> entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> ---
>  net/vmw_vsock/af_vsock.c         | 33 ++++++++++++++++++++++++++++++++-
>  net/vmw_vsock/hyperv_transport.c |  2 +-
>  2 files changed, 33 insertions(+), 2 deletions(-)
>
> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> index ab47bf3..420f605 100644
> --- a/net/vmw_vsock/af_vsock.c
> +++ b/net/vmw_vsock/af_vsock.c
> @@ -638,6 +638,37 @@ struct sock *__vsock_create(struct net *net,
>  }
>  EXPORT_SYMBOL_GPL(__vsock_create);
>
> +static void __vsock_release2(struct sock *sk)
> +{
> +	if (sk) {
> +		struct sk_buff *skb;
> +		struct vsock_sock *vsk;
> +
> +		vsk = vsock_sk(sk);
> +
> +		/* The release call is supposed to use lock_sock_nested()
> +		 * rather than lock_sock(), if a lock should be acquired.
> +		 */
> +		transport->release(vsk);
> +
> +		/* Use the nested version to avoid the warning
> +		 * "possible recursive locking detected".
> +		 */
> +		lock_sock_nested(sk, SINGLE_DEPTH_NESTING);

What about using lock_sock_nested() in the __vsock_release() without
define this new function?

> +		sock_orphan(sk);
> +		sk->sk_shutdown = SHUTDOWN_MASK;
> +
> +		while ((skb = skb_dequeue(&sk->sk_receive_queue)))
> +			kfree_skb(skb);
> +
> +		/* This sk can not be a listener, so it's unnecessary
> +		 * to call vsock_dequeue_accept().
> +		 */
> +		release_sock(sk);
> +		sock_put(sk);
> +	}
> +}
> +
>  static void __vsock_release(struct sock *sk)
>  {
>  	if (sk) {
> @@ -659,7 +690,7 @@ static void __vsock_release(struct sock *sk)
>
>  		/* Clean up any sockets that never were accepted. */
>  		while ((pending = vsock_dequeue_accept(sk)) != NULL) {
> -			__vsock_release(pending);
> +			__vsock_release2(pending);
>  			sock_put(pending);
>  		}
>
> diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
> index 9d864eb..4b126b2 100644
> --- a/net/vmw_vsock/hyperv_transport.c
> +++ b/net/vmw_vsock/hyperv_transport.c
> @@ -559,7 +559,7 @@ static void hvs_release(struct vsock_sock *vsk)
>  	struct sock *sk = sk_vsock(vsk);
>  	bool remove_sock;
>
> -	lock_sock(sk);
> +	lock_sock_nested(sk, SINGLE_DEPTH_NESTING);

Should we update also other transports?

Thanks,
Stefano

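Editor's sketch: one possible reading of the suggestion above is a single release function that takes the lock nesting level as a parameter, so that pending children are locked one level deeper than the listener. The userspace model below only illustrates that control flow. Every `toy_` name is hypothetical, the struct is a stand-in for `struct sock`, and this is not the code that was posted or merged.

```c
#include <assert.h>
#include <stddef.h>

/* Stands in for the kernel's SINGLE_DEPTH_NESTING (value 1). */
#define TOY_SINGLE_DEPTH_NESTING 1

/* Hypothetical stand-in for struct sock; not a kernel type. */
struct toy_sock {
	struct toy_sock *accept_queue;	/* first never-accepted child, or NULL */
	struct toy_sock *next;		/* next sibling in the parent's queue */
	int locked_at_level;		/* nesting level used to lock, -1 if never */
	int released;			/* set once the socket has been torn down */
};

/* Models lock_sock_nested(sk, level) by recording the level used. */
static void toy_lock_sock_nested(struct toy_sock *sk, int level)
{
	assert(level >= 0);
	sk->locked_at_level = level;
}

/* Models vsock_dequeue_accept(): pop one pending child, or NULL. */
static struct toy_sock *toy_dequeue_accept(struct toy_sock *sk)
{
	struct toy_sock *child = sk->accept_queue;

	if (child) {
		sk->accept_queue = child->next;
		child->next = NULL;
	}
	return child;
}

/* One function for both cases: the caller chooses the nesting level. */
static void toy_vsock_release(struct toy_sock *sk, int level)
{
	struct toy_sock *pending;

	if (!sk)
		return;

	toy_lock_sock_nested(sk, level);

	/* Children are released while the listener's lock is held,
	 * so they are locked one nesting level deeper.
	 */
	while ((pending = toy_dequeue_accept(sk)) != NULL)
		toy_vsock_release(pending, TOY_SINGLE_DEPTH_NESTING);

	sk->released = 1;
}
```

In this model the top-level caller would pass level 0 for the listener, and the recursion passes `TOY_SINGLE_DEPTH_NESTING` for each pending child. A transport release callback that takes the sock lock would need the same annotation.
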
> From: Stefano Garzarella <sgarzare@redhat.com>
> Sent: Thursday, August 22, 2019 3:25 AM
>
> [...snipped...]
> > --- a/net/vmw_vsock/hyperv_transport.c
> > +++ b/net/vmw_vsock/hyperv_transport.c
> > @@ -559,7 +559,7 @@ static void hvs_release(struct vsock_sock *vsk)
> >  	struct sock *sk = sk_vsock(vsk);
> >  	bool remove_sock;
> >
> > -	lock_sock(sk);
> > +	lock_sock_nested(sk, SINGLE_DEPTH_NESTING);
>
> Should we update also other transports?
>
> Stefano

Hi Stefano,
Sorry for the late reply! I'll post a v2 shortly.

As I checked, hyperv socket and virtio socket need to be fixed.

The vmci socket code doesn't acquire the sock lock in the release
callback, so it doesn't need any fix.

Thanks,
-- Dexuan

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index ab47bf3..420f605 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -638,6 +638,37 @@ struct sock *__vsock_create(struct net *net,
 }
 EXPORT_SYMBOL_GPL(__vsock_create);
 
+static void __vsock_release2(struct sock *sk)
+{
+	if (sk) {
+		struct sk_buff *skb;
+		struct vsock_sock *vsk;
+
+		vsk = vsock_sk(sk);
+
+		/* The release call is supposed to use lock_sock_nested()
+		 * rather than lock_sock(), if a lock should be acquired.
+		 */
+		transport->release(vsk);
+
+		/* Use the nested version to avoid the warning
+		 * "possible recursive locking detected".
+		 */
+		lock_sock_nested(sk, SINGLE_DEPTH_NESTING);
+		sock_orphan(sk);
+		sk->sk_shutdown = SHUTDOWN_MASK;
+
+		while ((skb = skb_dequeue(&sk->sk_receive_queue)))
+			kfree_skb(skb);
+
+		/* This sk can not be a listener, so it's unnecessary
+		 * to call vsock_dequeue_accept().
+		 */
+		release_sock(sk);
+		sock_put(sk);
+	}
+}
+
 static void __vsock_release(struct sock *sk)
 {
 	if (sk) {
@@ -659,7 +690,7 @@ static void __vsock_release(struct sock *sk)
 
 		/* Clean up any sockets that never were accepted. */
 		while ((pending = vsock_dequeue_accept(sk)) != NULL) {
-			__vsock_release(pending);
+			__vsock_release2(pending);
 			sock_put(pending);
 		}
 
diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
index 9d864eb..4b126b2 100644
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -559,7 +559,7 @@ static void hvs_release(struct vsock_sock *vsk)
 	struct sock *sk = sk_vsock(vsk);
 	bool remove_sock;
 
-	lock_sock(sk);
+	lock_sock_nested(sk, SINGLE_DEPTH_NESTING);
 	remove_sock = hvs_close_lock_held(vsk);
 	release_sock(sk);
 	if (remove_sock)

Lockdep is unhappy if two locks from the same class are held.

Fix the below warning by making __vsock_release() non-recursive -- this
patch is kind of ugly, but it looks to me there is not a better way to
deal with the problem here.

============================================
WARNING: possible recursive locking detected
5.2.0+ #6 Not tainted
--------------------------------------------
a.out/1020 is trying to acquire lock:
0000000074731a98 (sk_lock-AF_VSOCK){+.+.}, at: hvs_release+0x10/0x120 [hv_sock]

but task is already holding lock:
0000000014ff8397 (sk_lock-AF_VSOCK){+.+.}, at: __vsock_release+0x2e/0xf0 [vsock]

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(sk_lock-AF_VSOCK);
lock(sk_lock-AF_VSOCK);

*** DEADLOCK ***

May be due to missing lock nesting notation

2 locks held by a.out/1020:
#0: 00000000f8bceaa7 (&sb->s_type->i_mutex_key#10){+.+.}, at: __sock_release+0x2d/0xa0
#1: 0000000014ff8397 (sk_lock-AF_VSOCK){+.+.}, at: __vsock_release+0x2e/0xf0 [vsock]

stack backtrace:
CPU: 7 PID: 1020 Comm: a.out Not tainted 5.2.0+ #6
Call Trace:
dump_stack+0x67/0x90
__lock_acquire.cold.66+0x14d/0x1f8
lock_acquire+0xb5/0x1c0
lock_sock_nested+0x6d/0x90
hvs_release+0x10/0x120 [hv_sock]
__vsock_release+0x24/0xf0 [vsock]
__vsock_release+0xa0/0xf0 [vsock]
vsock_release+0x12/0x30 [vsock]
__sock_release+0x37/0xa0
sock_close+0x14/0x20
__fput+0xc1/0x250
task_work_run+0x98/0xc0
do_exit+0x3dd/0xc60
do_group_exit+0x47/0xc0
get_signal+0x169/0xc60
do_signal+0x30/0x710
exit_to_usermode_loop+0x50/0xa0
do_syscall_64+0x1fc/0x220
entry_SYSCALL_64_after_hwframe+0x49/0xbe

Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 net/vmw_vsock/af_vsock.c         | 33 ++++++++++++++++++++++++++++++++-
 net/vmw_vsock/hyperv_transport.c |  2 +-
 2 files changed, 33 insertions(+), 2 deletions(-)
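
Editor's sketch: the warning in the commit message comes from lockdep seeing the same lock class acquired twice by one task. A nesting annotation such as `lock_sock_nested(sk, SINGLE_DEPTH_NESTING)` places the second acquisition in a different subclass, which lockdep treats as a separate class. The toy userspace model below mimics only that one rule. All `toy_` names are hypothetical, and real lockdep tracks far more state (dependency chains, IRQ contexts) than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_HELD 8

/* One entry in a task's held-lock stack (toy model, not a kernel type). */
struct toy_held_lock {
	const char *class_name;	/* e.g. "sk_lock-AF_VSOCK" */
	unsigned int subclass;	/* 0 for a plain lock, 1 for single-depth nesting */
};

struct toy_task {
	struct toy_held_lock held[TOY_MAX_HELD];
	size_t depth;
};

/* Records an acquisition and returns true if this toy checker would
 * report "possible recursive locking": the task already holds a lock
 * with the same class name and the same subclass.
 */
static bool toy_acquire(struct toy_task *t, const char *class_name,
			unsigned int subclass)
{
	bool warn = false;
	size_t i;

	for (i = 0; i < t->depth; i++) {
		if (strcmp(t->held[i].class_name, class_name) == 0 &&
		    t->held[i].subclass == subclass)
			warn = true;
	}

	assert(t->depth < TOY_MAX_HELD);
	t->held[t->depth].class_name = class_name;
	t->held[t->depth].subclass = subclass;
	t->depth++;
	return warn;
}

/* Drops the most recently acquired lock. */
static void toy_release(struct toy_task *t)
{
	assert(t->depth > 0);
	t->depth--;
}
```

In this model, acquiring `"sk_lock-AF_VSOCK"` with subclass 0 twice is flagged, which corresponds to the listener and a pending child both being locked with plain `lock_sock()`. Taking the second lock with subclass 1 is not flagged.
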