From patchwork Thu Mar 7 14:59:18 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Veaceslav Falico X-Patchwork-Id: 225865 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id B47062C03B1 for ; Fri, 8 Mar 2013 01:59:55 +1100 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933065Ab3CGO7n (ORCPT ); Thu, 7 Mar 2013 09:59:43 -0500 Received: from mail-wg0-f45.google.com ([74.125.82.45]:62448 "EHLO mail-wg0-f45.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1759216Ab3CGO7l (ORCPT ); Thu, 7 Mar 2013 09:59:41 -0500 Received: by mail-wg0-f45.google.com with SMTP id dq12so986753wgb.0 for ; Thu, 07 Mar 2013 06:59:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; bh=y/tZXBufPf8bc0IhMsC1s0v6e5f/L9g9phFqwE0PlGo=; b=OO+i3JEwWeALX7WGlCHp/eDSRGUIuSsoFSYOJ3xcwC67dI/nFl5YUXueTu7S5dXAxe 1/AWQTkW6WIKWngqT25QDLsiKasVXiUqQxBf13mKfQUdznICsdtxqSaGPBdU69yYcykL PuO/e6m260z6yk+7+5mAeSe4l2IwwR9Pp6LQTJh7Hi+d82v9rHVcB259bvhkj+24ej06 Z22cK/f6DL2UQsQphCMJq7mW6Z1R8ri6eq/svfVTSsquCKfrv7H6mL5h1K3xg1SjoPnY 4SSN48HohJjfpsnrWhxmcO65WMa9tsZTTi2HAXi6OcEOhbmAJ6LNJ/R3PYciTIbVkVEH btbQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :x-gm-message-state; bh=y/tZXBufPf8bc0IhMsC1s0v6e5f/L9g9phFqwE0PlGo=; b=JKqHHIdQFP8mkP1uXy1PO5Q7n9QhTqTtuxtymkMOR9fvXNuSBSBqjyNAXMGORekKra mRTHlve209fvZ37crywtQOllBzPqXs5a959PL95U8fUukBu1wBkOmMq6VmmQBuqstvP4 rcVP7e2PdjnvxwGW5P9QsaHq+7bORG44tbtunOVjNZFlwVwXhf7998HSzSGzCRPKeoOR ukC5I/0kbwfRxNYAM5UU8gBTT7u2ho7WZjni5Dh/LZGMbJx0g0IEdK6ZpzM+PIjH9F3E gMfiMqkqBl5PUU+HB7Fytbb3U9q+eBaXvGUbJfWkk6J3Oryc/2rgamGMG7k2qCPUmU/f solg== X-Received: by 10.180.103.167 with SMTP id fx7mr33884432wib.15.1362668379226; Thu, 07 Mar 2013 06:59:39 -0800 (PST) MIME-Version: 1.0 Received: by 10.194.32.225 with HTTP; Thu, 7 Mar 2013 06:59:18 -0800 (PST) In-Reply-To: <513862DA.20704@linux.vnet.ibm.com> References: <513063B4.8070604@tlinx.org> <1362155087.15793.54.camel@edumazet-glaptop> <51317FCF.1070400@tlinx.org> <51318C5B.60001@tlinx.org> <51382AFE.5080302@linux.vnet.ibm.com> <51383C47.7050009@tlinx.org> <513849F8.3090302@linux.vnet.ibm.com> <513854E6.9010709@tlinx.org> <513862DA.20704@linux.vnet.ibm.com> From: Veaceslav Falico Date: Thu, 7 Mar 2013 15:59:18 +0100 X-Google-Sender-Auth: ysGZX9OvhOkGZ7EebUSctNuKYAc Message-ID: Subject: Re: upgrade to 3.8.1 : BUG Scheduling while atomic in bonding driver: To: Michael Wang Cc: Linda Walsh , Eric Dumazet , Linux-Kernel , netdev , Jay Vosburgh , Jeff Kirsher , Cong Wang , Andy Gospodarek X-Gm-Message-State: ALoCoQmeFWbQRY/6VcVZ8DxMPjeMdpVGw5EKnDjWSAIFttGiEpEyLNI199y57brHerZFiDXMXANh Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org On Thu, Mar 7, 2013 at 10:50 AM, Michael Wang wrote: > On 03/07/2013 04:50 PM, Linda Walsh wrote: >> >> I am *not* seeing the bug in 3.8.2 with the 2nd patch applied (in >> addition to the first)... > > So that means bond lock is the reason, nice, but this is really not a > good fix if we just unlock it... You're right, we can't unlock bond->lock while in bond_for_each_slave(), however I think we don't even need that bond_update_speed_duplex() in bond_miimon_commit() - cause the speed/duplex on interface state change were already updated via NETDEV_UP/CHANGE event in bond_slave_netdev_event() - and if not, it's not our fault, but the driver's that he didn't call the notifiers. So, maybe something like this would work (any feedback welcome, it's definitely not the first time we hit the wall with sleeping in bond_update_speed_duplex): Subject: [PATCH] bonding: don't call update_speed_duplex() under spinlocks bond_update_speed_duplex() might sleep while calling underlying slave's routines. Move it out of atomic context in bond_enslave() and remove it from bond_miimon_commit() - it was introduced by commit 546add79, however when the slave interfaces go up/change state it's their responsability to fire NETDEV_UP/NETDEV_CHANGE events so that bonding can properly update their speed. --- drivers/net/bonding/bond_main.c | 6 ++---- 1 files changed, 2 insertions(+), 4 deletions(-) new_slave->last_arp_rx = jiffies - @@ -1798,8 +1800,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) new_slave->link == BOND_LINK_DOWN ? "DOWN" : (new_slave->link == BOND_LINK_UP ? "UP" : "BACK")); - bond_update_speed_duplex(new_slave); - if (USES_PRIMARY(bond->params.mode) && bond->params.primary[0]) { /* if there is a primary slave, remember it */ if (strcmp(bond->params.primary, new_slave->dev->name) == 0) { @@ -2373,8 +2373,6 @@ static void bond_miimon_commit(struct bonding *bond) bond_set_backup_slave(slave); } - bond_update_speed_duplex(slave); - pr_info("%s: link status definitely up for interface %s, %u Mbps %s duplex.\n", bond->dev->name, slave->dev->name, slave->speed, slave->duplex ? "full" : "half"); diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 7bd068a..c63a33c 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1746,6 +1746,8 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) bond_compute_features(bond); + bond_update_speed_duplex(new_slave); + read_lock(&bond->lock);