From patchwork Mon Jan 11 09:03:43 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhu Yanjun X-Patchwork-Id: 565710 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 02DF9140311 for ; Mon, 11 Jan 2016 20:04:12 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b=RwqiyjOW; dkim-atps=neutral Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1759014AbcAKJEF (ORCPT ); Mon, 11 Jan 2016 04:04:05 -0500 Received: from mail-pa0-f66.google.com ([209.85.220.66]:36229 "EHLO mail-pa0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758978AbcAKJDw (ORCPT ); Mon, 11 Jan 2016 04:03:52 -0500 Received: by mail-pa0-f66.google.com with SMTP id a20so19243286pag.3 for ; Mon, 11 Jan 2016 01:03:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=subject:to:references:cc:from:message-id:date:user-agent :mime-version:in-reply-to:content-type:content-transfer-encoding; bh=ue/pDf60dEjEvYasX4pDmEQVysVDHz4YZw3AfH8ozSE=; b=RwqiyjOWeYLtohdWSe4jcTlBQtipPNiNZeHkCG4e9gMaqHUnJLYzUJtGL6fsPr4Ni/ SLNo0qlFKXOHCGcJBBMt9M3r9SwXsZjL/n+4izmSiTbhCXWO54RawvIVMk0nzisHr9Ar kbJMuZTbZICoRjNI23RR/gsab5km/cFXgT3TpAs7i7WX0ixo6XWkBR7viERk6HGfawhu sxZpOxIwAS0kBopiK8FVR7QWQ3KsANVS75AT7F+DOb4jfZfbqFddScPVnvGteq0/i4a6 HZa/2/pZH9ZJC4iBzXLTy2i9ms81HW7qslgCO6XgsiLMMXKrqtSFlneJ2xS9KNMwU21q c9KQ== X-Received: by 10.66.236.129 with SMTP id uu1mr74808767pac.158.1452503031573; Mon, 11 Jan 2016 01:03:51 -0800 (PST) Received: from [0.0.0.0] (unknown-105-122.windriver.com. [147.11.105.122]) by smtp.gmail.com with ESMTPSA id mj1sm184096213pab.34.2016.01.11.01.03.46 (version=TLSv1/SSLv3 cipher=OTHER); Mon, 11 Jan 2016 01:03:50 -0800 (PST) Subject: Re: [RFC PATCH net-next] bonding: Use notifiers for slave link state detection To: Jay Vosburgh , "Tantilov, Emil S" References: <87618083B2453E4A8714035B62D6799250504549@FMSMSX105.amr.corp.intel.com> <1452147313-22886-1-git-send-email-zyjzyj2000@gmail.com> <16587.1452148421@famine> <27321.1452216515@famine> <87618083B2453E4A8714035B62D6799250505491@FMSMSX105.amr.corp.intel.com> <11809.1452305978@famine> Cc: "mkubecek@suse.cz" , "vfalico@gmail.com" , "gospo@cumulusnetworks.com" , "netdev@vger.kernel.org" , "Shteinbock, Boris (Wind River)" From: zhuyj Message-ID: <56936FEF.1020301@gmail.com> Date: Mon, 11 Jan 2016 17:03:43 +0800 User-Agent: Mozilla/5.0 (X11; Linux i686; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 MIME-Version: 1.0 In-Reply-To: <11809.1452305978@famine> Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Hi, Jay && Emil I delved into the source code. This patch is based on notifiers. When a NETDEV_UP notifier is received in bond_slave_netdev_event, in bond_miimon_inspect_slave, bond_check_dev_link is called to detect link_state. Because of link flap, link_state is sometimes different from NETDEV_UP. That is, though event is NETDEV_UP, sometime link_state is down because of link flap. In the following patch, if link_state is different from the event, it is unnecessary to make further setup. { int link_state; bool ignore_updelay; @@ -2022,6 +2023,17 @@ static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave) slave->new_link = BOND_LINK_NOCHANGE; link_state = bond_check_dev_link(bond, slave->dev, 0); + switch (event) { + case NETDEV_UP: + if (!link_state) + return 0; + break; + + case NETDEV_DOWN: + if (link_state) + return 0; + break; + } switch (slave->link) { case BOND_LINK_UP: @@ -2107,7 +2119,7 @@ static int bond_miimon_inspect(struct bonding *bond) int commit = 0; bond_for_each_slave_rcu(bond, slave, iter) - commit += bond_miimon_inspect_slave(bond, slave); + commit += bond_miimon_inspect_slave(bond, slave, 0xFF); return commit; } @@ -3019,7 +3031,7 @@ static int bond_slave_netdev_event(unsigned long event, bond_3ad_adapter_speed_duplex_changed(slave); /* Fallthrough */ case NETDEV_DOWN: - if (bond_miimon_inspect_slave(bond, slave)) + if (bond_miimon_inspect_slave(bond, slave, event)) bond_miimon_commit_slave(bond, slave); /* Refresh slave-array if applicable! Best Regards! Zhu Yanjun On 01/09/2016 10:19 AM, Jay Vosburgh wrote: > Tantilov, Emil S wrote: > >>> -----Original Message----- >> From: Jay Vosburgh [mailto:jay.vosburgh@canonical.com] >>> Sent: Thursday, January 07, 2016 5:29 PM >>> Subject: [RFC PATCH net-next] bonding: Use notifiers for slave link state >>> detection >>> >>> >>> TEST PATCH >>> >>> This patch modifies bonding to utilize notifier callbacks to >>> detect slave link state changes. It is intended to be used with miimon >>> set to zero, and does not support the updelay or downdelay options to >>> bonding. It's not as complicated as it looks; most of the change set is >>> to break out the inner loop of bond_miimon_inspect into its own >>> function. >> Jay, >> >> I managed to do a quick test with this patch and occasionally there is >> a case where I see the bonding driver reporting link up for an >> interface (eth1) that is not up just yet: > [...] >> [12985.213752] ixgbe 0000:01:00.0 eth0: NIC Link is Up 10 Gbps, Flow Control: RX/TX >> [12985.213970] bond0: link status definitely up for interface eth0, 10000 Mbps full duplex >> [12985.213975] bond0: link status definitely up for interface eth1, 0 Mbps full duplex > Thanks for testing; the misbehavior is because I cheaped out and > didn't break out the commit function into a "single slave" version. The > below patch (against net-next, replacing the original patch) shouldn't > generate the erroneous additional link messages any more. > > This does generate an RCU warning, although the code actually is > safe (since the notifier callback holds RTNL); I'll sort that out next > week. > > -J > > > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c > index cab99fd..12dd533 100644 > --- a/drivers/net/bonding/bond_main.c > +++ b/drivers/net/bonding/bond_main.c > @@ -2012,203 +2012,206 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in > /*-------------------------------- Monitoring -------------------------------*/ > > /* called with rcu_read_lock() */ > -static int bond_miimon_inspect(struct bonding *bond) > +static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave) > { > - int link_state, commit = 0; > - struct list_head *iter; > - struct slave *slave; > + int link_state; > bool ignore_updelay; > > ignore_updelay = !rcu_dereference(bond->curr_active_slave); > > - bond_for_each_slave_rcu(bond, slave, iter) { > - slave->new_link = BOND_LINK_NOCHANGE; > + slave->new_link = BOND_LINK_NOCHANGE; > > - link_state = bond_check_dev_link(bond, slave->dev, 0); > + link_state = bond_check_dev_link(bond, slave->dev, 0); > > - switch (slave->link) { > - case BOND_LINK_UP: > - if (link_state) > - continue; > + switch (slave->link) { > + case BOND_LINK_UP: > + if (link_state) > + return 0; > > - bond_set_slave_link_state(slave, BOND_LINK_FAIL, > + bond_set_slave_link_state(slave, BOND_LINK_FAIL, > + BOND_SLAVE_NOTIFY_LATER); > + slave->delay = bond->params.downdelay; > + if (slave->delay) { > + netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n", > + (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP) ? > + (bond_is_active_slave(slave) ? > + "active " : "backup ") : "", > + slave->dev->name, > + bond->params.downdelay * bond->params.miimon); > + } > + /*FALLTHRU*/ > + case BOND_LINK_FAIL: > + if (link_state) { > + /* recovered before downdelay expired */ > + bond_set_slave_link_state(slave, BOND_LINK_UP, > BOND_SLAVE_NOTIFY_LATER); > - slave->delay = bond->params.downdelay; > - if (slave->delay) { > - netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n", > - (BOND_MODE(bond) == > - BOND_MODE_ACTIVEBACKUP) ? > - (bond_is_active_slave(slave) ? > - "active " : "backup ") : "", > - slave->dev->name, > - bond->params.downdelay * bond->params.miimon); > - } > - /*FALLTHRU*/ > - case BOND_LINK_FAIL: > - if (link_state) { > - /* recovered before downdelay expired */ > - bond_set_slave_link_state(slave, BOND_LINK_UP, > - BOND_SLAVE_NOTIFY_LATER); > - slave->last_link_up = jiffies; > - netdev_info(bond->dev, "link status up again after %d ms for interface %s\n", > - (bond->params.downdelay - slave->delay) * > - bond->params.miimon, > - slave->dev->name); > - continue; > - } > + slave->last_link_up = jiffies; > + netdev_info(bond->dev, "link status up again after %d ms for interface %s\n", > + (bond->params.downdelay - slave->delay) * > + bond->params.miimon, slave->dev->name); > + return 0; > + } > > - if (slave->delay <= 0) { > - slave->new_link = BOND_LINK_DOWN; > - commit++; > - continue; > - } > + if (slave->delay <= 0) { > + slave->new_link = BOND_LINK_DOWN; > + return 1; > + } > > - slave->delay--; > - break; > + slave->delay--; > + break; > > - case BOND_LINK_DOWN: > - if (!link_state) > - continue; > + case BOND_LINK_DOWN: > + if (!link_state) > + return 0; > > - bond_set_slave_link_state(slave, BOND_LINK_BACK, > - BOND_SLAVE_NOTIFY_LATER); > - slave->delay = bond->params.updelay; > - > - if (slave->delay) { > - netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n", > - slave->dev->name, > - ignore_updelay ? 0 : > - bond->params.updelay * > - bond->params.miimon); > - } > - /*FALLTHRU*/ > - case BOND_LINK_BACK: > - if (!link_state) { > - bond_set_slave_link_state(slave, > - BOND_LINK_DOWN, > - BOND_SLAVE_NOTIFY_LATER); > - netdev_info(bond->dev, "link status down again after %d ms for interface %s\n", > - (bond->params.updelay - slave->delay) * > - bond->params.miimon, > - slave->dev->name); > + bond_set_slave_link_state(slave, BOND_LINK_BACK, > + BOND_SLAVE_NOTIFY_LATER); > + slave->delay = bond->params.updelay; > > - continue; > - } > + if (slave->delay) { > + netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n", > + slave->dev->name, ignore_updelay ? 0 : > + bond->params.updelay * bond->params.miimon); > + } > + /*FALLTHRU*/ > + case BOND_LINK_BACK: > + if (!link_state) { > + bond_set_slave_link_state(slave, BOND_LINK_DOWN, > + BOND_SLAVE_NOTIFY_LATER); > + netdev_info(bond->dev, "link status down again after %d ms for interface %s\n", > + (bond->params.updelay - slave->delay) * > + bond->params.miimon, slave->dev->name); > > - if (ignore_updelay) > - slave->delay = 0; > + return 0; > + } > > - if (slave->delay <= 0) { > - slave->new_link = BOND_LINK_UP; > - commit++; > - ignore_updelay = false; > - continue; > - } > + if (ignore_updelay) > + slave->delay = 0; > > - slave->delay--; > - break; > + if (slave->delay <= 0) { > + slave->new_link = BOND_LINK_UP; > + return 1; > } > + > + slave->delay--; > + break; > } > > - return commit; > + return 0; > } > > -static void bond_miimon_commit(struct bonding *bond) > +static int bond_miimon_inspect(struct bonding *bond) > { > struct list_head *iter; > - struct slave *slave, *primary; > + struct slave *slave; > + int commit = 0; > > - bond_for_each_slave(bond, slave, iter) { > - switch (slave->new_link) { > - case BOND_LINK_NOCHANGE: > - continue; > + bond_for_each_slave_rcu(bond, slave, iter) > + commit += bond_miimon_inspect_slave(bond, slave); > > - case BOND_LINK_UP: > - bond_set_slave_link_state(slave, BOND_LINK_UP, > - BOND_SLAVE_NOTIFY_NOW); > - slave->last_link_up = jiffies; > + return commit; > +} > > - primary = rtnl_dereference(bond->primary_slave); > - if (BOND_MODE(bond) == BOND_MODE_8023AD) { > - /* prevent it from being the active one */ > - bond_set_backup_slave(slave); > - } else if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP) { > - /* make it immediately active */ > - bond_set_active_slave(slave); > - } else if (slave != primary) { > - /* prevent it from being the active one */ > - bond_set_backup_slave(slave); > - } > +static void bond_miimon_commit_slave(struct bonding *bond, struct slave *slave) > +{ > + struct slave *primary; > > - netdev_info(bond->dev, "link status definitely up for interface %s, %u Mbps %s duplex\n", > - slave->dev->name, > - slave->speed == SPEED_UNKNOWN ? 0 : slave->speed, > - slave->duplex ? "full" : "half"); > + switch (slave->new_link) { > + case BOND_LINK_NOCHANGE: > + return; > > - /* notify ad that the link status has changed */ > - if (BOND_MODE(bond) == BOND_MODE_8023AD) > - bond_3ad_handle_link_change(slave, BOND_LINK_UP); > + case BOND_LINK_UP: > + bond_set_slave_link_state(slave, BOND_LINK_UP, > + BOND_SLAVE_NOTIFY_NOW); > + slave->last_link_up = jiffies; > > - if (bond_is_lb(bond)) > - bond_alb_handle_link_change(bond, slave, > - BOND_LINK_UP); > + primary = rtnl_dereference(bond->primary_slave); > + if (BOND_MODE(bond) == BOND_MODE_8023AD) { > + /* prevent it from being the active one */ > + bond_set_backup_slave(slave); > + } else if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP) { > + /* make it immediately active */ > + bond_set_active_slave(slave); > + } else if (slave != primary) { > + /* prevent it from being the active one */ > + bond_set_backup_slave(slave); > + } > > - if (BOND_MODE(bond) == BOND_MODE_XOR) > - bond_update_slave_arr(bond, NULL); > + netdev_info(bond->dev, "link status definitely up for interface %s, %u Mbps %s duplex\n", > + slave->dev->name, > + slave->speed == SPEED_UNKNOWN ? 0 : slave->speed, > + slave->duplex ? "full" : "half"); > > - if (!bond->curr_active_slave || slave == primary) > - goto do_failover; > + /* notify ad that the link status has changed */ > + if (BOND_MODE(bond) == BOND_MODE_8023AD) > + bond_3ad_handle_link_change(slave, BOND_LINK_UP); > > - continue; > + if (bond_is_lb(bond)) > + bond_alb_handle_link_change(bond, slave, BOND_LINK_UP); > > - case BOND_LINK_DOWN: > - if (slave->link_failure_count < UINT_MAX) > - slave->link_failure_count++; > + if (BOND_MODE(bond) == BOND_MODE_XOR) > + bond_update_slave_arr(bond, NULL); > > - bond_set_slave_link_state(slave, BOND_LINK_DOWN, > - BOND_SLAVE_NOTIFY_NOW); > + if (!bond->curr_active_slave || slave == primary) > + goto do_failover; > > - if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP || > - BOND_MODE(bond) == BOND_MODE_8023AD) > - bond_set_slave_inactive_flags(slave, > - BOND_SLAVE_NOTIFY_NOW); > + goto out; > > - netdev_info(bond->dev, "link status definitely down for interface %s, disabling it\n", > - slave->dev->name); > + case BOND_LINK_DOWN: > + if (slave->link_failure_count < UINT_MAX) > + slave->link_failure_count++; > > - if (BOND_MODE(bond) == BOND_MODE_8023AD) > - bond_3ad_handle_link_change(slave, > - BOND_LINK_DOWN); > + bond_set_slave_link_state(slave, BOND_LINK_DOWN, > + BOND_SLAVE_NOTIFY_NOW); > > - if (bond_is_lb(bond)) > - bond_alb_handle_link_change(bond, slave, > - BOND_LINK_DOWN); > + if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP || > + BOND_MODE(bond) == BOND_MODE_8023AD) > + bond_set_slave_inactive_flags(slave, > + BOND_SLAVE_NOTIFY_NOW); > > - if (BOND_MODE(bond) == BOND_MODE_XOR) > - bond_update_slave_arr(bond, NULL); > + netdev_info(bond->dev, "link status definitely down for interface %s, disabling it\n", > + slave->dev->name); > > - if (slave == rcu_access_pointer(bond->curr_active_slave)) > - goto do_failover; > + if (BOND_MODE(bond) == BOND_MODE_8023AD) > + bond_3ad_handle_link_change(slave, BOND_LINK_DOWN); > > - continue; > + if (bond_is_lb(bond)) > + bond_alb_handle_link_change(bond, slave, BOND_LINK_DOWN); > > - default: > - netdev_err(bond->dev, "invalid new link %d on slave %s\n", > - slave->new_link, slave->dev->name); > - slave->new_link = BOND_LINK_NOCHANGE; > + if (BOND_MODE(bond) == BOND_MODE_XOR) > + bond_update_slave_arr(bond, NULL); > > - continue; > - } > + if (slave == rcu_access_pointer(bond->curr_active_slave)) > + goto do_failover; > > -do_failover: > - block_netpoll_tx(); > - bond_select_active_slave(bond); > - unblock_netpoll_tx(); > + goto out; > + > + default: > + netdev_err(bond->dev, "invalid new link %d on slave %s\n", > + slave->new_link, slave->dev->name); > + slave->new_link = BOND_LINK_NOCHANGE; > + > + goto out; > } > > +do_failover: > + block_netpoll_tx(); > + bond_select_active_slave(bond); > + unblock_netpoll_tx(); > + > +out: > bond_set_carrier(bond); > } > > +static void bond_miimon_commit(struct bonding *bond) > +{ > + struct list_head *iter; > + struct slave *slave; > + > + bond_for_each_slave(bond, slave, iter) > + bond_miimon_commit_slave(bond, slave); > +} > + > /* bond_mii_monitor > * > * Really a wrapper that splits the mii monitor into two phases: an > @@ -3016,6 +3019,9 @@ static int bond_slave_netdev_event(unsigned long event, > bond_3ad_adapter_speed_duplex_changed(slave); > /* Fallthrough */ > case NETDEV_DOWN: > + if (bond_miimon_inspect_slave(bond, slave)) > + bond_miimon_commit_slave(bond, slave); > + > /* Refresh slave-array if applicable! > * If the setup does not use miimon or arpmon (mode-specific!), > * then these events will not cause the slave-array to be > > > --- > -Jay Vosburgh, jay.vosburgh@canonical.com > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 12dd533..1b53da0 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -2012,7 +2012,8 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in /*-------------------------------- Monitoring -------------------------------*/ /* called with rcu_read_lock() */ -static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave) +static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave, + unsigned long event)