diff mbox series

tipc: fix an interrupt unsafe locking scenario

Message ID 1533628352-31928-1-git-send-email-ying.xue@windriver.com
State Accepted, archived
Delegated to: David Miller
Headers show
Series tipc: fix an interrupt unsafe locking scenario | expand

Commit Message

Ying Xue Aug. 7, 2018, 7:52 a.m. UTC
Commit 9faa89d4ed9d ("tipc: make function tipc_net_finalize() thread
safe") tries to make it thread safe to set node address, so it uses
node_list_lock lock to serialize the whole process of setting node
address in tipc_net_finalize(). But it causes the following interrupt
unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  rht_deferred_worker()
  rhashtable_rehash_table()
  lock(&(&ht->lock)->rlock)
			       tipc_nl_compat_doit()
                               tipc_net_finalize()
                               local_irq_disable();
                               lock(&(&tn->node_list_lock)->rlock);
                               tipc_sk_reinit()
                               rhashtable_walk_enter()
                               lock(&(&ht->lock)->rlock);
  <Interrupt>
  tipc_disc_rcv()
  tipc_node_check_dest()
  tipc_node_create()
  lock(&(&tn->node_list_lock)->rlock);

 *** DEADLOCK ***

When rhashtable_rehash_table() holds ht->lock on CPU0, it doesn't
disable BH. So if an interrupt happens after the lock, it can create
an inverse lock ordering between ht->lock and tn->node_list_lock. As
a consequence, deadlock might happen.

The reason causing the inverse lock ordering scenario above is because
the initial purpose of node_list_lock is not designed to do the
serialization of node address setting.

As cmpxchg() can guarantee CAS (compare-and-swap) process is atomic,
we use it to replace node_list_lock to ensure setting node address can
be atomically finished. It turns out the potential deadlock can be
avoided as well.

Fixes: 9faa89d4ed9d ("tipc: make function tipc_net_finalize() thread safe")
Signed-off-by: Ying Xue <ying.xue@windriver.com>
---
 net/tipc/net.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

Comments

Jon Maloy Aug. 7, 2018, 4:10 p.m. UTC | #1
Nice fix. I was a little in doubt about using that lock, but deemed it was overkill to introduce an extra lock just for this case.

Acked-by: Jon Maloy <maloy@donjonn.com>

> -----Original Message-----
> From: Ying Xue <ying.xue@windriver.com>
> Sent: Tuesday, 07 August, 2018 02:53
> To: netdev@vger.kernel.org; Jon Maloy <jon.maloy@ericsson.com>
> Cc: davem@davemloft.net; tipc-discussion@lists.sourceforge.net
> Subject: [PATCH] tipc: fix an interrupt unsafe locking scenario
> 
> Commit 9faa89d4ed9d ("tipc: make function tipc_net_finalize() thread
> safe") tries to make it thread safe to set node address, so it uses
> node_list_lock lock to serialize the whole process of setting node
> address in tipc_net_finalize(). But it causes the following interrupt
> unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   rht_deferred_worker()
>   rhashtable_rehash_table()
>   lock(&(&ht->lock)->rlock)
> 			       tipc_nl_compat_doit()
>                                tipc_net_finalize()
>                                local_irq_disable();
>                                lock(&(&tn->node_list_lock)->rlock);
>                                tipc_sk_reinit()
>                                rhashtable_walk_enter()
>                                lock(&(&ht->lock)->rlock);
>   <Interrupt>
>   tipc_disc_rcv()
>   tipc_node_check_dest()
>   tipc_node_create()
>   lock(&(&tn->node_list_lock)->rlock);
> 
>  *** DEADLOCK ***
> 
> When rhashtable_rehash_table() holds ht->lock on CPU0, it doesn't
> disable BH. So if an interrupt happens after the lock, it can create
> an inverse lock ordering between ht->lock and tn->node_list_lock. As
> a consequence, deadlock might happen.
> 
> The reason causing the inverse lock ordering scenario above is because
> the initial purpose of node_list_lock is not designed to do the
> serialization of node address setting.
> 
> As cmpxchg() can guarantee CAS (compare-and-swap) process is atomic,
> we use it to replace node_list_lock to ensure setting node address can
> be atomically finished. It turns out the potential deadlock can be
> avoided as well.
> 
> Fixes: 9faa89d4ed9d ("tipc: make function tipc_net_finalize() thread safe")
> Signed-off-by: Ying Xue <ying.xue@windriver.com>
> ---
>  net/tipc/net.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/net/tipc/net.c b/net/tipc/net.c
> index a7f6964..62199cf 100644
> --- a/net/tipc/net.c
> +++ b/net/tipc/net.c
> @@ -123,15 +123,13 @@ void tipc_net_finalize(struct net *net, u32 addr)
>  {
>  	struct tipc_net *tn = tipc_net(net);
> 
> -	spin_lock_bh(&tn->node_list_lock);
> -	if (!tipc_own_addr(net)) {
> +	if (!cmpxchg(&tn->node_addr, 0, addr)) {
>  		tipc_set_node_addr(net, addr);
>  		tipc_named_reinit(net);
>  		tipc_sk_reinit(net);
>  		tipc_nametbl_publish(net, TIPC_CFG_SRV, addr, addr,
>  				     TIPC_CLUSTER_SCOPE, 0, addr);
>  	}
> -	spin_unlock_bh(&tn->node_list_lock);
>  }
> 
>  void tipc_net_stop(struct net *net)
> --
> 2.7.4
David Miller Aug. 7, 2018, 8:18 p.m. UTC | #2
From: Ying Xue <ying.xue@windriver.com>
Date: Tue, 7 Aug 2018 15:52:32 +0800

> Commit 9faa89d4ed9d ("tipc: make function tipc_net_finalize() thread
> safe") tries to make it thread safe to set node address, so it uses
> node_list_lock lock to serialize the whole process of setting node
> address in tipc_net_finalize(). But it causes the following interrupt
> unsafe locking scenario:
 ...
> When rhashtable_rehash_table() holds ht->lock on CPU0, it doesn't
> disable BH. So if an interrupt happens after the lock, it can create
> an inverse lock ordering between ht->lock and tn->node_list_lock. As
> a consequence, deadlock might happen.
> 
> The reason causing the inverse lock ordering scenario above is because
> the initial purpose of node_list_lock is not designed to do the
> serialization of node address setting.
> 
> As cmpxchg() can guarantee CAS (compare-and-swap) process is atomic,
> we use it to replace node_list_lock to ensure setting node address can
> be atomically finished. It turns out the potential deadlock can be
> avoided as well.
> 
> Fixes: 9faa89d4ed9d ("tipc: make function tipc_net_finalize() thread safe")
> Signed-off-by: Ying Xue <ying.xue@windriver.com>

Applied, thanks.
diff mbox series

Patch

diff --git a/net/tipc/net.c b/net/tipc/net.c
index a7f6964..62199cf 100644
--- a/net/tipc/net.c
+++ b/net/tipc/net.c
@@ -123,15 +123,13 @@  void tipc_net_finalize(struct net *net, u32 addr)
 {
 	struct tipc_net *tn = tipc_net(net);
 
-	spin_lock_bh(&tn->node_list_lock);
-	if (!tipc_own_addr(net)) {
+	if (!cmpxchg(&tn->node_addr, 0, addr)) {
 		tipc_set_node_addr(net, addr);
 		tipc_named_reinit(net);
 		tipc_sk_reinit(net);
 		tipc_nametbl_publish(net, TIPC_CFG_SRV, addr, addr,
 				     TIPC_CLUSTER_SCOPE, 0, addr);
 	}
-	spin_unlock_bh(&tn->node_list_lock);
 }
 
 void tipc_net_stop(struct net *net)