From patchwork Thu Feb 25 08:35:18 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhu Yanjun
X-Patchwork-Id: 587958
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
        by ozlabs.org (Postfix) with ESMTP id 8F16814032F
        for ; Thu, 25 Feb 2016 19:34:56 +1100 (AEDT)
Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected)
        header.d=gmail.com header.i=@gmail.com header.b=aF9c5Y2L;
        dkim-atps=neutral
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1759232AbcBYIev (ORCPT );
        Thu, 25 Feb 2016 03:34:51 -0500
Received: from mail-pa0-f67.google.com ([209.85.220.67]:35898 "EHLO
        mail-pa0-f67.google.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1758731AbcBYIeu (ORCPT );
        Thu, 25 Feb 2016 03:34:50 -0500
Received: by mail-pa0-f67.google.com with SMTP id a7so1021231pax.3
        for ; Thu, 25 Feb 2016 00:34:50 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20120113;
        h=subject:to:references:cc:from:message-id:date:user-agent
        :mime-version:in-reply-to:content-type:content-transfer-encoding;
        bh=LgI6cg5CtPAQEodWMzVonk76xvSmuJW94ng2bvdXyEw=;
        b=aF9c5Y2LXbfttXaWtxgaCjSUnKuXRlEzLreg/BJdYGgCabEjuk6P00g24N9GO60L/9
        F5HoKJHZ1V4HpZ7j3Sl9E0TWucbOz6X1hyWg2G0d7+zS9YjN9AkmI+lDdLfq33qqsqhy
        LtOmxCC94zP7YWGIak5ZIDAJ9RCmwepbdspXMZUCZwoV4pJejdYDYlN8UTSh2CNuRqaa
        FiAyz22G43tA0zD0u9gUZCgNynzCADIkDRG0cQtUAPONrSW5TYl3u5Pf24IAPl421o6k
        ncr45zP13OFrSNcGiiEmXg7ixoUe3IuTJz/W7TBidFxq3HcokuDyyK26I/fS3RONJOy1
        6IYg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20130820;
        h=x-gm-message-state:subject:to:references:cc:from:message-id:date
        :user-agent:mime-version:in-reply-to:content-type
        :content-transfer-encoding;
        bh=LgI6cg5CtPAQEodWMzVonk76xvSmuJW94ng2bvdXyEw=;
        b=R5JjoSD8RMDvIQTnHIyWz+CeQX4ecIlE9xXoRphyjgy7tNWRPp0NfzWUucZqiXDxfl
        UVxHn+JYewn2lZ79WK33NAeRYoirCl99N90N59xvcu5uPt7Icntv+qlFo6irjaCYkXKq
        QlTbhuJ8kqvJv9HcTxlwjJd/x/MdqWM/VQUHeUtfD+41NQcz6ZK/hCW/s5yGffFRSlm5
        6QFkX891cFMs/VsNGXosJO14SbOT7bTDFtmjuiwbU8y+MgxRY+eUtAnznu0rPTegvvR8
        lc3a/uz3O1zS/6Yc3cc6z6OdtL2BhghhZkoGGjX6dnJqzpm3H+kmJgsT1jqQuX//nW0+
        8nDg==
X-Gm-Message-State: AG10YORrwPh5ygiEJXaoIMMseuRCy4Yqfjr5DrWkE3bOnjz0cRWZ6SFM2YJi7u2aYML1rA==
X-Received: by 10.66.147.74 with SMTP id ti10mr61822975pab.128.1456389290005;
        Thu, 25 Feb 2016 00:34:50 -0800 (PST)
Received: from [0.0.0.0] (unknown-105-122.windriver.com. [147.11.105.122])
        by smtp.gmail.com with ESMTPSA id p8sm10270276pfi.34.2016.02.25.00.34.44
        (version=TLSv1/SSLv3 cipher=OTHER);
        Thu, 25 Feb 2016 00:34:49 -0800 (PST)
Subject: Re: [PATCH v2 net] bonding: don't use stale speed and duplex information
To: Jay Vosburgh , netdev@vger.kernel.org, "Tantilov, Emil S"
References: <25869.1454962202@famine>
Cc: Veaceslav Falico , dingtianhong , Andy Gospodarek , "David S. Miller"
From: zhuyj
Message-ID: <56CEBCC6.3040008@gmail.com>
Date: Thu, 25 Feb 2016 16:35:18 +0800
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:38.0) Gecko/20100101 Thunderbird/38.4.0
MIME-Version: 1.0
In-Reply-To: <25869.1454962202@famine>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

On 02/09/2016 04:10 AM, Jay Vosburgh wrote:
> There is presently a race condition between the bonding periodic
> link monitor and the updating of a slave's speed and duplex. The former
> occurs on a periodic basis, and the latter in response to a driver's
> calling of netif_carrier_on.
>
> It is possible for the periodic monitor to run between the
> driver call of netif_carrier_on and the receipt of the NETDEV_CHANGE
> event that causes bonding to update the slave's speed and duplex.
> This manifests most notably as a report that a slave is up and "0 Mbps full
> duplex" after enslavement, but in principle could report an incorrect
> speed and duplex after any link up event if the device comes up with a
> different speed or duplex. This affects the 802.3ad aggregator
> selection, as the speed and duplex are selection criteria.
>
> This is fixed by updating the speed and duplex in the periodic
> monitor, prior to using that information.
>
> This was done historically in bonding, but the call to
> bond_update_speed_duplex was removed in commit 876254ae2758 ("bonding:
> don't call update_speed_duplex() under spinlocks"), as it might sleep
> under lock. Later, the locking was changed to only hold RTNL, and so
> after commit 876254ae2758 ("bonding: don't call update_speed_duplex()
> under spinlocks") this call is again safe.
>
> Tested-by: "Tantilov, Emil S"
> Cc: Veaceslav Falico
> Cc: dingtianhong
> Fixes: 876254ae2758 ("bonding: don't call update_speed_duplex() under spinlocks")
> Signed-off-by: Jay Vosburgh
>
> ---
>
> v2: Correct Veaceslav's email address
>
> Note: The "Fixes" commit is the commit that makes this operation safe
> again, not the commit that originally introduced the race. I don't see
> any simple way to resolve this bug between these two commits.
>
>  drivers/net/bonding/bond_main.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 56b560558884..cabaeb61333d 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -2127,6 +2127,7 @@ static void bond_miimon_commit(struct bonding *bond)
>  			continue;
>
>  		case BOND_LINK_UP:
> +			bond_update_speed_duplex(slave);
>  			bond_set_slave_link_state(slave, BOND_LINK_UP,
>  						  BOND_SLAVE_NOTIFY_NOW);
>  			slave->last_link_up = jiffies;

Hi, Jay

Thanks for your patch.

I delved into the source code and Emil's tests. I think that the problem
this patch is meant to fix occurs very rarely.
Do you agree with me? If so, maybe the following patch can reduce the
performance loss. Please comment on it.

Thanks a lot.

Best Regards!
Zhu Yanjun

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index b7f1a99..c4c511a 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2129,7 +2129,9 @@ static void bond_miimon_commit(struct bonding *bond)
 			continue;
 
 		case BOND_LINK_UP:
-			bond_update_speed_duplex(slave);
+			if (slave->speed == SPEED_UNKNOWN)
+				bond_update_speed_duplex(slave);
+
 			bond_set_slave_link_state(slave, BOND_LINK_UP,
 						  BOND_SLAVE_NOTIFY_NOW);
 			slave->last_link_up = jiffies;
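
For comparison, the two variants of the BOND_LINK_UP step can be modelled in a
small user-space sketch. This is not kernel code: struct sketch_slave, the
sketch_* functions and SKETCH_SPEED_UNKNOWN are hypothetical stand-ins for
struct slave, bond_update_speed_duplex() and SPEED_UNKNOWN, and the sketch
assumes the refresh always succeeds. It only shows which cached value each
variant would leave behind for a given starting state.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for the kernel's SPEED_UNKNOWN (assumption). */
#define SKETCH_SPEED_UNKNOWN (-1)

/* Hypothetical model: what bonding has cached vs. what the device reports. */
struct sketch_slave {
	int cached_speed;	/* value bonding would use, in Mbps */
	int device_speed;	/* value the driver currently reports, in Mbps */
};

/* Models bond_update_speed_duplex(): refresh the cache from the device. */
static void sketch_update_speed(struct sketch_slave *slave)
{
	assert(slave != NULL);
	slave->cached_speed = slave->device_speed;
}

/* Variant 1: refresh unconditionally on link up (the quoted patch). */
static int sketch_commit_always(struct sketch_slave *slave)
{
	sketch_update_speed(slave);
	return slave->cached_speed;
}

/* Variant 2: refresh only when the cached speed is unknown (the diff above). */
static int sketch_commit_if_unknown(struct sketch_slave *slave)
{
	assert(slave != NULL);
	if (slave->cached_speed == SKETCH_SPEED_UNKNOWN)
		sketch_update_speed(slave);
	return slave->cached_speed;
}
```

Starting from a cached SKETCH_SPEED_UNKNOWN with the device at 1000, both
variants return 1000. Starting from a cached 100 with the device at 1000, the
conditional variant returns 100 and the unconditional one returns 1000. Whether
the second state can be reached in practice depends on the NETDEV_CHANGE path,
which this sketch does not model.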