diff mbox series

mwifiex: don't call del_timer_sync() on uninitialized timer

Message ID 1595900652-3842-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
State Awaiting Upstream
Delegated to: David Miller
Headers show
Series mwifiex: don't call del_timer_sync() on uninitialized timer | expand

Commit Message

Tetsuo Handa July 28, 2020, 1:44 a.m. UTC
syzbot is reporting that del_timer_sync() is called from
mwifiex_usb_cleanup_tx_aggr() from mwifiex_unregister_dev() without
checking timer_setup() from mwifiex_usb_tx_init() was called [1].
Since mwifiex_usb_prepare_tx_aggr_skb() is calling del_timer() if
is_hold_timer_set == true, use the same condition for del_timer_sync().

[1] https://syzkaller.appspot.com/bug?id=fdeef9cf7348be8b8ab5b847f2ed993aba8ea7b6

Reported-by: syzbot <syzbot+373e6719b49912399d21@syzkaller.appspotmail.com>
Cc: Ganapathi Bhat <gbhat@marvell.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
A patch from Ganapathi Bhat ( https://patchwork.kernel.org/patch/10990275/ ) is stalling
at https://lore.kernel.org/linux-usb/MN2PR18MB2637D7C742BC235FE38367F0A09C0@MN2PR18MB2637.namprd18.prod.outlook.com/ .
syzbot by now got this report for 10000 times. Do we want to go with this simple patch?

 drivers/net/wireless/marvell/mwifiex/usb.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Andy Shevchenko July 28, 2020, 5:29 p.m. UTC | #1
On Tue, Jul 28, 2020 at 4:46 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> syzbot is reporting that del_timer_sync() is called from
> mwifiex_usb_cleanup_tx_aggr() from mwifiex_unregister_dev() without
> checking timer_setup() from mwifiex_usb_tx_init() was called [1].
> Since mwifiex_usb_prepare_tx_aggr_skb() is calling del_timer() if
> is_hold_timer_set == true, use the same condition for del_timer_sync().
>
> [1] https://syzkaller.appspot.com/bug?id=fdeef9cf7348be8b8ab5b847f2ed993aba8ea7b6
>

Can you use BugLink: tag for above?

> Reported-by: syzbot <syzbot+373e6719b49912399d21@syzkaller.appspotmail.com>
> Cc: Ganapathi Bhat <gbhat@marvell.com>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
> A patch from Ganapathi Bhat ( https://patchwork.kernel.org/patch/10990275/ ) is stalling
> at https://lore.kernel.org/linux-usb/MN2PR18MB2637D7C742BC235FE38367F0A09C0@MN2PR18MB2637.namprd18.prod.outlook.com/ .
> syzbot by now got this report for 10000 times. Do we want to go with this simple patch?
>
>  drivers/net/wireless/marvell/mwifiex/usb.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/usb.c b/drivers/net/wireless/marvell/mwifiex/usb.c
> index 6f3cfde..04a1461 100644
> --- a/drivers/net/wireless/marvell/mwifiex/usb.c
> +++ b/drivers/net/wireless/marvell/mwifiex/usb.c
> @@ -1353,7 +1353,8 @@ static void mwifiex_usb_cleanup_tx_aggr(struct mwifiex_adapter *adapter)
>                                 skb_dequeue(&port->tx_aggr.aggr_list)))
>                                 mwifiex_write_data_complete(adapter, skb_tmp,
>                                                             0, -1);
> -               del_timer_sync(&port->tx_aggr.timer_cnxt.hold_timer);
> +               if (port->tx_aggr.timer_cnxt.is_hold_timer_set)
> +                       del_timer_sync(&port->tx_aggr.timer_cnxt.hold_timer);
>                 port->tx_aggr.timer_cnxt.is_hold_timer_set = false;
>                 port->tx_aggr.timer_cnxt.hold_tmo_msecs = 0;
>         }
> --
> 1.8.3.1
>
Brian Norris July 28, 2020, 6:45 p.m. UTC | #2
Hi,

On Mon, Jul 27, 2020 at 6:45 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> syzbot is reporting that del_timer_sync() is called from
> mwifiex_usb_cleanup_tx_aggr() from mwifiex_unregister_dev() without
> checking timer_setup() from mwifiex_usb_tx_init() was called [1].
> Since mwifiex_usb_prepare_tx_aggr_skb() is calling del_timer() if
> is_hold_timer_set == true, use the same condition for del_timer_sync().
>
> [1] https://syzkaller.appspot.com/bug?id=fdeef9cf7348be8b8ab5b847f2ed993aba8ea7b6
>
> Reported-by: syzbot <syzbot+373e6719b49912399d21@syzkaller.appspotmail.com>
> Cc: Ganapathi Bhat <gbhat@marvell.com>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
> A patch from Ganapathi Bhat ( https://patchwork.kernel.org/patch/10990275/ ) is stalling
> at https://lore.kernel.org/linux-usb/MN2PR18MB2637D7C742BC235FE38367F0A09C0@MN2PR18MB2637.namprd18.prod.outlook.com/ .
> syzbot by now got this report for 10000 times. Do we want to go with this simple patch?

Sorry, that stall is partly my fault, and partly Ganapathi's. It
doesn't help that it took him 4 months to reply to my questions, so I
completely lost even the tiny bit of context I had managed to build up
in my head at initial review time... and so it's still buried in the
dark corners of my inbox. (I think I'll go archive that now, because
it really deserves a better sell than it had initially, if Ganapathi
really wants to land it.)

>  drivers/net/wireless/marvell/mwifiex/usb.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/usb.c b/drivers/net/wireless/marvell/mwifiex/usb.c
> index 6f3cfde..04a1461 100644
> --- a/drivers/net/wireless/marvell/mwifiex/usb.c
> +++ b/drivers/net/wireless/marvell/mwifiex/usb.c
> @@ -1353,7 +1353,8 @@ static void mwifiex_usb_cleanup_tx_aggr(struct mwifiex_adapter *adapter)
>                                 skb_dequeue(&port->tx_aggr.aggr_list)))
>                                 mwifiex_write_data_complete(adapter, skb_tmp,
>                                                             0, -1);
> -               del_timer_sync(&port->tx_aggr.timer_cnxt.hold_timer);
> +               if (port->tx_aggr.timer_cnxt.is_hold_timer_set)

I believe if we ever actually started aggregation, then the timer can
be active at this point, and thus, the access to 'is_hold_timer_set'
is racy.

This *probably* deserves a better refactor, but in absence of that
(and a better explanation than Ganapathi gave), I think you at least
need to hold port->tx_aggr_lock. So perhaps (totally untested):

  spin_lock_bh(&port->tx_aggr_lock);
  if (port->tx_aggr.timer_cnxt.is_hold_timer_set) {
    port->tx_aggr.timer_cnxt.is_hold_timer_set = false;
    spin_unlock_bh(&port->tx_aggr_lock);
    /* Timer could still be running, but it can't be restarted at this
point, so this is safe. */
    del_timer_sync(&port->tx_aggr.timer_cnxt.hold_timer);
  } else {
    spin_unlock_bh(&port->tx_aggr_lock);
  }

Otherwise, I think this is fine:

Reviewed-by: Brian Norris <briannorris@chromium.org>

I also believe mwifiex_usb_prepare_tx_aggr_skb() needs to stop using
del_timer() (without the _sync()), because otherwise we might have
deactivated the timer already but not ensured that it has completely
finished executing on other CPUs. But that is probably orthogonal to
the current patch. (Again, so much in this driver needs refactoring.)

Side note: this entire TX aggregation feature for USB has been hidden
behind the mwifiex.aggr_ctrl module param since its introduction,
which has always been disabled by default. I wonder whether anybody is
*really* testing it, or whether it's 100% broken, as with many things
in this driver...


Brian

> +                       del_timer_sync(&port->tx_aggr.timer_cnxt.hold_timer);
>                 port->tx_aggr.timer_cnxt.is_hold_timer_set = false;
>                 port->tx_aggr.timer_cnxt.hold_tmo_msecs = 0;
>         }
> --
> 1.8.3.1
>
Tetsuo Handa Aug. 17, 2020, 1:06 p.m. UTC | #3
Ganapathi, how do you want to fix this bug?

On 2020/07/29 3:45, Brian Norris wrote:
>> syzbot is reporting that del_timer_sync() is called from
>> mwifiex_usb_cleanup_tx_aggr() from mwifiex_unregister_dev() without
>> checking timer_setup() from mwifiex_usb_tx_init() was called [1].
>> Since mwifiex_usb_prepare_tx_aggr_skb() is calling del_timer() if
>> is_hold_timer_set == true, use the same condition for del_timer_sync().
>>
>> [1] https://syzkaller.appspot.com/bug?id=fdeef9cf7348be8b8ab5b847f2ed993aba8ea7b6
>>
>> Reported-by: syzbot <syzbot+373e6719b49912399d21@syzkaller.appspotmail.com>
>> Cc: Ganapathi Bhat <gbhat@marvell.com>
>> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
>> ---
>> A patch from Ganapathi Bhat ( https://patchwork.kernel.org/patch/10990275/ ) is stalling
>> at https://lore.kernel.org/linux-usb/MN2PR18MB2637D7C742BC235FE38367F0A09C0@MN2PR18MB2637.namprd18.prod.outlook.com/ .
>> syzbot by now got this report for 10000 times. Do we want to go with this simple patch?
> 
> Sorry, that stall is partly my fault, and partly Ganapathi's. It
> doesn't help that it took him 4 months to reply to my questions, so I
> completely lost even the tiny bit of context I had managed to build up
> in my head at initial review time... and so it's still buried in the
> dark corners of my inbox. (I think I'll go archive that now, because
> it really deserves a better sell than it had initially, if Ganapathi
> really wants to land it.)
diff mbox series

Patch

diff --git a/drivers/net/wireless/marvell/mwifiex/usb.c b/drivers/net/wireless/marvell/mwifiex/usb.c
index 6f3cfde..04a1461 100644
--- a/drivers/net/wireless/marvell/mwifiex/usb.c
+++ b/drivers/net/wireless/marvell/mwifiex/usb.c
@@ -1353,7 +1353,8 @@  static void mwifiex_usb_cleanup_tx_aggr(struct mwifiex_adapter *adapter)
 				skb_dequeue(&port->tx_aggr.aggr_list)))
 				mwifiex_write_data_complete(adapter, skb_tmp,
 							    0, -1);
-		del_timer_sync(&port->tx_aggr.timer_cnxt.hold_timer);
+		if (port->tx_aggr.timer_cnxt.is_hold_timer_set)
+			del_timer_sync(&port->tx_aggr.timer_cnxt.hold_timer);
 		port->tx_aggr.timer_cnxt.is_hold_timer_set = false;
 		port->tx_aggr.timer_cnxt.hold_tmo_msecs = 0;
 	}