From patchwork Fri Apr 21 17:25:33 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Torvalds X-Patchwork-Id: 753514 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3w8klv5DzWz9s75 for ; Sat, 22 Apr 2017 04:29:15 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="QftiHq9a"; dkim-atps=neutral Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1422747AbdDUS25 (ORCPT ); Fri, 21 Apr 2017 14:28:57 -0400 Received: from mail-io0-f196.google.com ([209.85.223.196]:36824 "EHLO mail-io0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1036880AbdDUS2p (ORCPT ); Fri, 21 Apr 2017 14:28:45 -0400 Received: by mail-io0-f196.google.com with SMTP id x86so33192788ioe.3; Fri, 21 Apr 2017 11:28:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc; bh=XwsBa13ci1EUjsblbJPz7B++qfo5NBkaYondVk4LMz4=; b=QftiHq9aTfupPQo5I4NCA6ADzAf/KNtofitxG8isPxUELIM6qLrxSbkfokaR+aA0G9 Jvq9iKhrP/b++GU4hBGQ8JvgS0oxJtITv6qS+fRJf/gbRzsomAUVdV+11JgylgE6H4l5 n94HvBLVJDcvHZROWd4l6q8gNJw2c6gznT4uTU+Vp5TbUKHdZXkTstAoIna0h2hokIB2 OqviiKAIxg++N780kjk1wLFCUibzlF3A9uETJFZBv88VLW41FTaCdt0HmGvKXUgzpPZS HLwycGjuv8LNEBCH33RLIiAy8P8HVfKbsFBTAlMZzOrf6Lu+kaBUzynaychMxvfxHU68 Me5Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:sender:in-reply-to:references:from :date:message-id:subject:to:cc; bh=XwsBa13ci1EUjsblbJPz7B++qfo5NBkaYondVk4LMz4=; b=VM6y/TBIFH7Lwv2C9VVOIoUeT6Eoz1zsWcy6CvuoIN1djTX1D3DZnxqsr0WfWW5EYB OYVJ4kdMLjJ1BL1vo93wn4lK94a7bUGdHmqPSIjusIEuVJy7SgKXJliXrJrPusvj8v0t T2uyyl40utjXx3qYY5vcykezj/Dm8WUSP0YT7i3RIls9e7ycw7VMwl7GLAEKxC+Qj+d1 ML91MpgAuvOWAjJpyWv6y+fTma70S9kmF5YQbUguf+lGd7UFJ1XGGxiqbs9Jjgq1FPyK GRrHPC2rH+Pk5aGQVi1yF+GyzoTx0C5XQ4w6GBBNpdDv8qireBWzgQDPlX6t79khnqYc HKJg== X-Gm-Message-State: AN3rC/5lNH9gYebXrlYfNI+Yfo6XIehg4Y/KkLNiUUAlwHJxB5hvtFAL /ycUdfXTAtLGmthuHGUewyj4M3bOSWgg X-Received: by 10.107.136.143 with SMTP id s15mr17352495ioi.224.1492795534401; Fri, 21 Apr 2017 10:25:34 -0700 (PDT) MIME-Version: 1.0 Received: by 10.107.136.152 with HTTP; Fri, 21 Apr 2017 10:25:33 -0700 (PDT) In-Reply-To: References: From: Linus Torvalds Date: Fri, 21 Apr 2017 10:25:33 -0700 X-Google-Sender-Auth: KsNhYkZxj8F8ENw7KbfTutqFK1o Message-ID: Subject: Re: net/core: BUG in unregister_netdevice_many To: Andrey Konovalov , Eric Dumazet Cc: "David S. Miller" , Alexey Kuznetsov , James Morris , Hideaki YOSHIFUJI , Patrick McHardy , netdev , LKML , Alexander Duyck , David Ahern , Daniel Borkmann , tcharding , Jiri Pirko , stephen hemminger , Dmitry Vyukov , Kostya Serebryany , syzkaller Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org On Fri, Apr 21, 2017 at 5:48 AM, Andrey Konovalov wrote: > > I've got the following error report while fuzzing the kernel with syzkaller. > > ------------[ cut here ]------------ > kernel BUG at net/core/dev.c:6813! Another useless BUG_ON() that (a) kills the machine (b) doesn't tell the actual useful information. Grr. But that BUG_ON() is ancient, and actually goes back to pre-git days. So you do seem to have triggered some code-path that doesn't ever happen in any normal use. This code looks odd: void unregister_netdevice_many(struct list_head *head) { struct net_device *dev; if (!list_empty(head)) { rollback_registered_many(head); list_for_each_entry(dev, head, unreg_list) net_set_todo(dev); list_del(head); } } In particular the pattern of "look if list is empty, do something to the list, and then delete the list" is a rather nasty pattern. Why? That final "delete the list" looks like garbage to me. Either that list is never used again - in which case the list_del() is pointless - or it _is_ used again, in which case the list_del is wrong. Why is it wrong? Because "list_del()" only removes the list from the head, but leaves the head itself untouched. So it will still end up pointing to the list entries that we've just walked. Now, almost every single case I looked at, the head is always a temporary list that was created just for this "unregister_netdevice_many()", so the code works fine, and it's a case of "that list_del() is just pointless". But it's a very dangerous pattern. Either the list head should be left alone, or it should be cleaned up with list_del_init() Anyway, because each user seems fine, and really just uses it as a temporary list, this is not the cause of the bug. I'm assuming that the real cause is simply that "dev->reg_state" ends up being NETREG_UNREGISTERING or something. Maybe the BUG_ON() could be just removed, and replaced by the previous warning about NETREG_UNINITIALIZED. Something like the attached (TOTALLY UNTESTED) patch. We really shouldn't have BUG_ON()'s in the kernel, particularly for cases that we already have error handling for and are ignoring. But whatever. Eric Dumazet seems to be the main person to look at this. Linus net/core/dev.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/net/core/dev.c b/net/core/dev.c index 533a6d6f6092..4e50b2fb3a90 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -6801,16 +6801,14 @@ static void rollback_registered_many(struct list_head *head) * for initialization unwind. Remove those * devices and proceed with the remaining. */ - if (dev->reg_state == NETREG_UNINITIALIZED) { - pr_debug("unregister_netdevice: device %s/%p never was registered\n", - dev->name, dev); + if (WARN_ON_ONCE(dev->reg_state != NETREG_REGISTERED)) { + printk("unregister_netdevice: device %s/%p was not registered (%d)\n", + dev->name, dev, dev->reg_state); - WARN_ON(1); list_del(&dev->unreg_list); continue; } dev->dismantle = true; - BUG_ON(dev->reg_state != NETREG_REGISTERED); } /* If device is running, close it first. */