[net] ipv6: avoid lockdep issue in fib6_del()

Message ID	20200908082023.3690438-1-edumazet@google.com
State	Accepted
Delegated to:	David Miller
Headers	show Return-Path: <netdev-owner@vger.kernel.org> Date: Tue, 8 Sep 2020 01:20:23 -0700 Message-Id: <20200908082023.3690438-1-edumazet@google.com> Mime-Version: 1.0 Subject: [PATCH net] ipv6: avoid lockdep issue in fib6_del() From: Eric Dumazet <edumazet@google.com> To: "David S . Miller" <davem@davemloft.net> Cc: netdev <netdev@vger.kernel.org>, Eric Dumazet <edumazet@google.com>, Eric Dumazet <eric.dumazet@gmail.com>, David Ahern <dsahern@gmail.com> Content-Type: text/plain; charset="UTF-8" Sender: netdev-owner@vger.kernel.org Precedence: bulk
Series	[net] ipv6: avoid lockdep issue in fib6_del() \| expand [net] ipv6: avoid lockdep issue in fib6_del()

Eric Dumazet Sept. 8, 2020, 8:20 a.m. UTC

syzbot reported twice a lockdep issue in fib6_del() [1]
which I think is caused by net->ipv6.fib6_null_entry
having a NULL fib6_table pointer.

fib6_del() already checks for fib6_null_entry special
case, we only need to return earlier.

Bug seems to occur very rarely, I have thus chosen
a 'bug origin' that makes backports not too complex.

[1]
WARNING: suspicious RCU usage
5.9.0-rc4-syzkaller #0 Not tainted
-----------------------------
net/ipv6/ip6_fib.c:1996 suspicious rcu_dereference_protected() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
4 locks held by syz-executor.5/8095:
 #0: ffffffff8a7ea708 (rtnl_mutex){+.+.}-{3:3}, at: ppp_release+0x178/0x240 drivers/net/ppp/ppp_generic.c:401
 #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: spin_trylock_bh include/linux/spinlock.h:414 [inline]
 #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: fib6_run_gc+0x21b/0x2d0 net/ipv6/ip6_fib.c:2312
 #2: ffffffff89bd6a40 (rcu_read_lock){....}-{1:2}, at: __fib6_clean_all+0x0/0x290 net/ipv6/ip6_fib.c:2613
 #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:359 [inline]
 #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: __fib6_clean_all+0x107/0x290 net/ipv6/ip6_fib.c:2245

stack backtrace:
CPU: 1 PID: 8095 Comm: syz-executor.5 Not tainted 5.9.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x198/0x1fd lib/dump_stack.c:118
 fib6_del+0x12b4/0x1630 net/ipv6/ip6_fib.c:1996
 fib6_clean_node+0x39b/0x570 net/ipv6/ip6_fib.c:2180
 fib6_walk_continue+0x4aa/0x8e0 net/ipv6/ip6_fib.c:2102
 fib6_walk+0x182/0x370 net/ipv6/ip6_fib.c:2150
 fib6_clean_tree+0xdb/0x120 net/ipv6/ip6_fib.c:2230
 __fib6_clean_all+0x120/0x290 net/ipv6/ip6_fib.c:2246
 fib6_clean_all net/ipv6/ip6_fib.c:2257 [inline]
 fib6_run_gc+0x113/0x2d0 net/ipv6/ip6_fib.c:2320
 ndisc_netdev_event+0x217/0x350 net/ipv6/ndisc.c:1805
 notifier_call_chain+0xb5/0x200 kernel/notifier.c:83
 call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:2033
 call_netdevice_notifiers_extack net/core/dev.c:2045 [inline]
 call_netdevice_notifiers net/core/dev.c:2059 [inline]
 dev_close_many+0x30b/0x650 net/core/dev.c:1634
 rollback_registered_many+0x3a8/0x1210 net/core/dev.c:9261
 rollback_registered net/core/dev.c:9329 [inline]
 unregister_netdevice_queue+0x2dd/0x570 net/core/dev.c:10410
 unregister_netdevice include/linux/netdevice.h:2774 [inline]
 ppp_release+0x216/0x240 drivers/net/ppp/ppp_generic.c:403
 __fput+0x285/0x920 fs/file_table.c:281
 task_work_run+0xdd/0x190 kernel/task_work.c:141
 tracehook_notify_resume include/linux/tracehook.h:188 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:163 [inline]
 exit_to_user_mode_prepare+0x1e1/0x200 kernel/entry/common.c:190
 syscall_exit_to_user_mode+0x7e/0x2e0 kernel/entry/common.c:265
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 421842edeaf6 ("net/ipv6: Add fib6_null_entry")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Ahern <dsahern@gmail.com>
---
 net/ipv6/ip6_fib.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

David Ahern Sept. 8, 2020, 4:50 p.m. UTC | #1

On 9/8/20 2:20 AM, Eric Dumazet wrote:
> syzbot reported twice a lockdep issue in fib6_del() [1]
> which I think is caused by net->ipv6.fib6_null_entry
> having a NULL fib6_table pointer.
> 
> fib6_del() already checks for fib6_null_entry special
> case, we only need to return earlier.
> 
> Bug seems to occur very rarely, I have thus chosen
> a 'bug origin' that makes backports not too complex.
> 
> [1]
> WARNING: suspicious RCU usage
> 5.9.0-rc4-syzkaller #0 Not tainted
> -----------------------------
> net/ipv6/ip6_fib.c:1996 suspicious rcu_dereference_protected() usage!
> 
> other info that might help us debug this:
> 
> rcu_scheduler_active = 2, debug_locks = 1
> 4 locks held by syz-executor.5/8095:
>  #0: ffffffff8a7ea708 (rtnl_mutex){+.+.}-{3:3}, at: ppp_release+0x178/0x240 drivers/net/ppp/ppp_generic.c:401
>  #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: spin_trylock_bh include/linux/spinlock.h:414 [inline]
>  #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: fib6_run_gc+0x21b/0x2d0 net/ipv6/ip6_fib.c:2312
>  #2: ffffffff89bd6a40 (rcu_read_lock){....}-{1:2}, at: __fib6_clean_all+0x0/0x290 net/ipv6/ip6_fib.c:2613
>  #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:359 [inline]
>  #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: __fib6_clean_all+0x107/0x290 net/ipv6/ip6_fib.c:2245
> 
> stack backtrace:
> CPU: 1 PID: 8095 Comm: syz-executor.5 Not tainted 5.9.0-rc4-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x198/0x1fd lib/dump_stack.c:118
>  fib6_del+0x12b4/0x1630 net/ipv6/ip6_fib.c:1996
>  fib6_clean_node+0x39b/0x570 net/ipv6/ip6_fib.c:2180
>  fib6_walk_continue+0x4aa/0x8e0 net/ipv6/ip6_fib.c:2102
>  fib6_walk+0x182/0x370 net/ipv6/ip6_fib.c:2150
>  fib6_clean_tree+0xdb/0x120 net/ipv6/ip6_fib.c:2230
>  __fib6_clean_all+0x120/0x290 net/ipv6/ip6_fib.c:2246

This is walking a table and __fib6_clean_all takes the lock for the
table (and you can see that above), so puzzling how fib6_del can be
called for an entry with NULL fib6_table.

Eric Dumazet Sept. 8, 2020, 5:06 p.m. UTC | #2

On Tue, Sep 8, 2020 at 6:50 PM David Ahern <dsahern@gmail.com> wrote:
>
> On 9/8/20 2:20 AM, Eric Dumazet wrote:
> > syzbot reported twice a lockdep issue in fib6_del() [1]
> > which I think is caused by net->ipv6.fib6_null_entry
> > having a NULL fib6_table pointer.
> >
> > fib6_del() already checks for fib6_null_entry special
> > case, we only need to return earlier.
> >
> > Bug seems to occur very rarely, I have thus chosen
> > a 'bug origin' that makes backports not too complex.
> >
> > [1]
> > WARNING: suspicious RCU usage
> > 5.9.0-rc4-syzkaller #0 Not tainted
> > -----------------------------
> > net/ipv6/ip6_fib.c:1996 suspicious rcu_dereference_protected() usage!
> >
> > other info that might help us debug this:
> >
> > rcu_scheduler_active = 2, debug_locks = 1
> > 4 locks held by syz-executor.5/8095:
> >  #0: ffffffff8a7ea708 (rtnl_mutex){+.+.}-{3:3}, at: ppp_release+0x178/0x240 drivers/net/ppp/ppp_generic.c:401
> >  #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: spin_trylock_bh include/linux/spinlock.h:414 [inline]
> >  #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: fib6_run_gc+0x21b/0x2d0 net/ipv6/ip6_fib.c:2312
> >  #2: ffffffff89bd6a40 (rcu_read_lock){....}-{1:2}, at: __fib6_clean_all+0x0/0x290 net/ipv6/ip6_fib.c:2613
> >  #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:359 [inline]
> >  #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: __fib6_clean_all+0x107/0x290 net/ipv6/ip6_fib.c:2245
> >
> > stack backtrace:
> > CPU: 1 PID: 8095 Comm: syz-executor.5 Not tainted 5.9.0-rc4-syzkaller #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > Call Trace:
> >  __dump_stack lib/dump_stack.c:77 [inline]
> >  dump_stack+0x198/0x1fd lib/dump_stack.c:118
> >  fib6_del+0x12b4/0x1630 net/ipv6/ip6_fib.c:1996
> >  fib6_clean_node+0x39b/0x570 net/ipv6/ip6_fib.c:2180
> >  fib6_walk_continue+0x4aa/0x8e0 net/ipv6/ip6_fib.c:2102
> >  fib6_walk+0x182/0x370 net/ipv6/ip6_fib.c:2150
> >  fib6_clean_tree+0xdb/0x120 net/ipv6/ip6_fib.c:2230
> >  __fib6_clean_all+0x120/0x290 net/ipv6/ip6_fib.c:2246
>
> This is walking a table and __fib6_clean_all takes the lock for the
> table (and you can see that above), so puzzling how fib6_del can be
> called for an entry with NULL fib6_table.

So you think the test for  (rt == net->ipv6.fib6_null_entry)
should be replaced by

BUG_ON(rt == net->ipv6.fib6_null_entry); ?

David Ahern Sept. 8, 2020, 5:41 p.m. UTC | #3

On 9/8/20 11:06 AM, Eric Dumazet wrote:
> On Tue, Sep 8, 2020 at 6:50 PM David Ahern <dsahern@gmail.com> wrote:
>>
>> On 9/8/20 2:20 AM, Eric Dumazet wrote:
>>> syzbot reported twice a lockdep issue in fib6_del() [1]
>>> which I think is caused by net->ipv6.fib6_null_entry
>>> having a NULL fib6_table pointer.
>>>
>>> fib6_del() already checks for fib6_null_entry special
>>> case, we only need to return earlier.
>>>
>>> Bug seems to occur very rarely, I have thus chosen
>>> a 'bug origin' that makes backports not too complex.
>>>
>>> [1]
>>> WARNING: suspicious RCU usage
>>> 5.9.0-rc4-syzkaller #0 Not tainted
>>> -----------------------------
>>> net/ipv6/ip6_fib.c:1996 suspicious rcu_dereference_protected() usage!
>>>
>>> other info that might help us debug this:
>>>
>>> rcu_scheduler_active = 2, debug_locks = 1
>>> 4 locks held by syz-executor.5/8095:
>>>  #0: ffffffff8a7ea708 (rtnl_mutex){+.+.}-{3:3}, at: ppp_release+0x178/0x240 drivers/net/ppp/ppp_generic.c:401
>>>  #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: spin_trylock_bh include/linux/spinlock.h:414 [inline]
>>>  #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: fib6_run_gc+0x21b/0x2d0 net/ipv6/ip6_fib.c:2312
>>>  #2: ffffffff89bd6a40 (rcu_read_lock){....}-{1:2}, at: __fib6_clean_all+0x0/0x290 net/ipv6/ip6_fib.c:2613
>>>  #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:359 [inline]
>>>  #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: __fib6_clean_all+0x107/0x290 net/ipv6/ip6_fib.c:2245
>>>
>>> stack backtrace:
>>> CPU: 1 PID: 8095 Comm: syz-executor.5 Not tainted 5.9.0-rc4-syzkaller #0
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>> Call Trace:
>>>  __dump_stack lib/dump_stack.c:77 [inline]
>>>  dump_stack+0x198/0x1fd lib/dump_stack.c:118
>>>  fib6_del+0x12b4/0x1630 net/ipv6/ip6_fib.c:1996
>>>  fib6_clean_node+0x39b/0x570 net/ipv6/ip6_fib.c:2180
>>>  fib6_walk_continue+0x4aa/0x8e0 net/ipv6/ip6_fib.c:2102
>>>  fib6_walk+0x182/0x370 net/ipv6/ip6_fib.c:2150
>>>  fib6_clean_tree+0xdb/0x120 net/ipv6/ip6_fib.c:2230
>>>  __fib6_clean_all+0x120/0x290 net/ipv6/ip6_fib.c:2246
>>
>> This is walking a table and __fib6_clean_all takes the lock for the
>> table (and you can see that above), so puzzling how fib6_del can be
>> called for an entry with NULL fib6_table.
> 
> So you think the test for  (rt == net->ipv6.fib6_null_entry)
> should be replaced by
> 
> BUG_ON(rt == net->ipv6.fib6_null_entry); ?
> 

BUG_ON does not seem right.

Backing out to the callers, why does fib6_clean_node not catch that it
is the root of the table and abort the walk or at least not try to
remove the root? This might be related to the problem Ben has complained
about many times.

If syzbot has only triggered it a few times then I presume no reproducer.

Eric Dumazet Sept. 8, 2020, 5:46 p.m. UTC | #4

On Tue, Sep 8, 2020 at 7:41 PM David Ahern <dsahern@gmail.com> wrote:
>
> On 9/8/20 11:06 AM, Eric Dumazet wrote:
> > On Tue, Sep 8, 2020 at 6:50 PM David Ahern <dsahern@gmail.com> wrote:
> >>
> >> On 9/8/20 2:20 AM, Eric Dumazet wrote:
> >>> syzbot reported twice a lockdep issue in fib6_del() [1]
> >>> which I think is caused by net->ipv6.fib6_null_entry
> >>> having a NULL fib6_table pointer.
> >>>
> >>> fib6_del() already checks for fib6_null_entry special
> >>> case, we only need to return earlier.
> >>>
> >>> Bug seems to occur very rarely, I have thus chosen
> >>> a 'bug origin' that makes backports not too complex.
> >>>
> >>> [1]
> >>> WARNING: suspicious RCU usage
> >>> 5.9.0-rc4-syzkaller #0 Not tainted
> >>> -----------------------------
> >>> net/ipv6/ip6_fib.c:1996 suspicious rcu_dereference_protected() usage!
> >>>
> >>> other info that might help us debug this:
> >>>
> >>> rcu_scheduler_active = 2, debug_locks = 1
> >>> 4 locks held by syz-executor.5/8095:
> >>>  #0: ffffffff8a7ea708 (rtnl_mutex){+.+.}-{3:3}, at: ppp_release+0x178/0x240 drivers/net/ppp/ppp_generic.c:401
> >>>  #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: spin_trylock_bh include/linux/spinlock.h:414 [inline]
> >>>  #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: fib6_run_gc+0x21b/0x2d0 net/ipv6/ip6_fib.c:2312
> >>>  #2: ffffffff89bd6a40 (rcu_read_lock){....}-{1:2}, at: __fib6_clean_all+0x0/0x290 net/ipv6/ip6_fib.c:2613
> >>>  #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:359 [inline]
> >>>  #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: __fib6_clean_all+0x107/0x290 net/ipv6/ip6_fib.c:2245
> >>>
> >>> stack backtrace:
> >>> CPU: 1 PID: 8095 Comm: syz-executor.5 Not tainted 5.9.0-rc4-syzkaller #0
> >>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> >>> Call Trace:
> >>>  __dump_stack lib/dump_stack.c:77 [inline]
> >>>  dump_stack+0x198/0x1fd lib/dump_stack.c:118
> >>>  fib6_del+0x12b4/0x1630 net/ipv6/ip6_fib.c:1996
> >>>  fib6_clean_node+0x39b/0x570 net/ipv6/ip6_fib.c:2180
> >>>  fib6_walk_continue+0x4aa/0x8e0 net/ipv6/ip6_fib.c:2102
> >>>  fib6_walk+0x182/0x370 net/ipv6/ip6_fib.c:2150
> >>>  fib6_clean_tree+0xdb/0x120 net/ipv6/ip6_fib.c:2230
> >>>  __fib6_clean_all+0x120/0x290 net/ipv6/ip6_fib.c:2246
> >>
> >> This is walking a table and __fib6_clean_all takes the lock for the
> >> table (and you can see that above), so puzzling how fib6_del can be
> >> called for an entry with NULL fib6_table.
> >
> > So you think the test for  (rt == net->ipv6.fib6_null_entry)
> > should be replaced by
> >
> > BUG_ON(rt == net->ipv6.fib6_null_entry); ?
> >
>
> BUG_ON does not seem right.

Yes, currently we return -ENOENT, which looks reasonable.


>
> Backing out to the callers, why does fib6_clean_node not catch that it
> is the root of the table and abort the walk or at least not try to
> remove the root? This might be related to the problem Ben has complained
> about many times.
>
> If syzbot has only triggered it a few times then I presume no reproducer.

No repro, only occurred twice...

David Ahern Sept. 8, 2020, 5:56 p.m. UTC | #5

On 9/8/20 2:20 AM, Eric Dumazet wrote:
> syzbot reported twice a lockdep issue in fib6_del() [1]
> which I think is caused by net->ipv6.fib6_null_entry
> having a NULL fib6_table pointer.
> 
> fib6_del() already checks for fib6_null_entry special
> case, we only need to return earlier.
> 
> Bug seems to occur very rarely, I have thus chosen
> a 'bug origin' that makes backports not too complex.

Make sense.
> 
> [1]
...
> 
> Fixes: 421842edeaf6 ("net/ipv6: Add fib6_null_entry")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: David Ahern <dsahern@gmail.com>
> ---
>  net/ipv6/ip6_fib.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
> index 25a90f3f705c7e6d53615f490f36c5722f3bd8b1..4a664ad4f4d4bb2b521f67e8433a06c77bd301ee 100644
> --- a/net/ipv6/ip6_fib.c
> +++ b/net/ipv6/ip6_fib.c
> @@ -1993,14 +1993,19 @@ static void fib6_del_route(struct fib6_table *table, struct fib6_node *fn,
>  /* Need to own table->tb6_lock */
>  int fib6_del(struct fib6_info *rt, struct nl_info *info)
>  {
> -	struct fib6_node *fn = rcu_dereference_protected(rt->fib6_node,
> -				    lockdep_is_held(&rt->fib6_table->tb6_lock));
> -	struct fib6_table *table = rt->fib6_table;
>  	struct net *net = info->nl_net;
>  	struct fib6_info __rcu **rtp;
>  	struct fib6_info __rcu **rtp_next;
> +	struct fib6_table *table;
> +	struct fib6_node *fn;
>  
> -	if (!fn || rt == net->ipv6.fib6_null_entry)
> +	if (rt == net->ipv6.fib6_null_entry)
> +		return -ENOENT;
> +
> +	table = rt->fib6_table;
> +	fn = rcu_dereference_protected(rt->fib6_node,
> +				       lockdep_is_held(&table->tb6_lock));
> +	if (!fn)
>  		return -ENOENT;
>  
>  	WARN_ON(!(fn->fn_flags & RTN_RTINFO));
> 

seems like a reasonable refactoring for the noted problem.

Reviewed-by: David Ahern <dsahern@gmail.com>

David Miller Sept. 9, 2020, 2:57 a.m. UTC | #6

From: Eric Dumazet <edumazet@google.com>
Date: Tue,  8 Sep 2020 01:20:23 -0700

> syzbot reported twice a lockdep issue in fib6_del() [1]
> which I think is caused by net->ipv6.fib6_null_entry
> having a NULL fib6_table pointer.
> 
> fib6_del() already checks for fib6_null_entry special
> case, we only need to return earlier.
> 
> Bug seems to occur very rarely, I have thus chosen
> a 'bug origin' that makes backports not too complex.
 ...
> Fixes: 421842edeaf6 ("net/ipv6: Add fib6_null_entry")
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied and queued up for -stable, thanks Eric.

[net] ipv6: avoid lockdep issue in fib6_del()

Commit Message

Comments

Patch