From patchwork Mon Apr 22 12:43:45 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Marcelo Henrique Cerri
X-Patchwork-Id: 1088682
From: Marcelo Henrique Cerri
To: kernel-team@lists.ubuntu.com
Subject: [b/oracle][PATCH 1/3] Revert "UBUNTU: SAUCE: net_failover: delay taking over primary device to accommodate udevd renaming"
Date: Mon, 22 Apr 2019 09:43:45 -0300
Message-Id: <20190422124347.31277-2-marcelo.cerri@canonical.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190422124347.31277-1-marcelo.cerri@canonical.com>
References: <20190422124347.31277-1-marcelo.cerri@canonical.com>

BugLink: http://bugs.launchpad.net/bugs/1825229

This reverts commit 6f89769c2772ccdf10bff4da1c3f3f3fbe551ed8.

Signed-off-by: Marcelo Henrique Cerri
---
 drivers/net/net_failover.c | 73 +++++---------------------------------
 include/net/net_failover.h |  6 ----
 2 files changed, 8 insertions(+), 71 deletions(-)

diff --git a/drivers/net/net_failover.c b/drivers/net/net_failover.c
index eeeed475a6a5..4f390fa557e4 100644
--- a/drivers/net/net_failover.c
+++ b/drivers/net/net_failover.c
@@ -28,10 +28,6 @@
 #include
 #include
 
-#define TAKEOVER_DELAY_DEFAULT 100
-static unsigned long takeover_delay = TAKEOVER_DELAY_DEFAULT;
-module_param(takeover_delay, ulong, 0000);
-
 static bool net_failover_xmit_ready(struct net_device *dev)
 {
 	return netif_running(dev) && netif_carrier_ok(dev);
@@ -505,7 +501,6 @@ static int net_failover_slave_register(struct net_device *slave_dev,
 {
 	struct net_device *standby_dev, *primary_dev;
 	struct net_failover_info *nfo_info;
-	bool work_scheduled = false;
 	bool slave_is_standby;
 	u32 orig_mtu;
 	int err;
@@ -521,21 +516,12 @@ static int net_failover_slave_register(struct net_device *slave_dev,
 
 	dev_hold(slave_dev);
 
-	slave_is_standby = slave_dev->dev.parent == failover_dev->dev.parent;
-	nfo_info = netdev_priv(failover_dev);
-
 	if (netif_running(failover_dev)) {
-		if (takeover_delay && !slave_is_standby) {
-			schedule_delayed_work(&nfo_info->takeover,
-					      takeover_delay * HZ / 1000);
-			work_scheduled = true;
-		} else {
-			err = dev_open(slave_dev);
-			if (err && (err != -EBUSY)) {
-				netdev_err(failover_dev, "Opening slave %s failed err:%d\n",
-					   slave_dev->name, err);
-				goto err_dev_open;
-			}
+		err = dev_open(slave_dev);
+		if (err && (err != -EBUSY)) {
+			netdev_err(failover_dev, "Opening slave %s failed err:%d\n",
+				   slave_dev->name, err);
+			goto err_dev_open;
 		}
 	}
 
@@ -548,13 +534,13 @@ static int net_failover_slave_register(struct net_device *slave_dev,
 	if (err) {
 		netdev_err(failover_dev, "Failed to add vlan ids to device %s err:%d\n",
 			   slave_dev->name, err);
-		if (work_scheduled)
-			cancel_delayed_work(&nfo_info->takeover);
 		goto err_vlan_add;
 	}
 
+	nfo_info = netdev_priv(failover_dev);
 	standby_dev = rtnl_dereference(nfo_info->standby_dev);
 	primary_dev = rtnl_dereference(nfo_info->primary_dev);
+	slave_is_standby = slave_dev->dev.parent == failover_dev->dev.parent;
 
 	if (slave_is_standby) {
 		rcu_assign_pointer(nfo_info->standby_dev, slave_dev);
@@ -691,48 +677,11 @@ static int net_failover_slave_name_change(struct net_device *slave_dev,
 	/* We need to bring up the slave after the rename by udev in case
 	 * open failed with EBUSY when it was registered.
 	 */
-	if (netif_running(failover_dev)) {
-		dev_open(slave_dev);
-
-		net_failover_lower_state_changed(slave_dev,
-						 primary_dev, standby_dev);
-	}
+	dev_open(slave_dev);
 
 	return 0;
 }
 
-static void net_failover_takeover_primary(struct work_struct *w)
-{
-	struct net_failover_info *nfo_info
-		= container_of(w, struct net_failover_info, takeover.work);
-	struct net_device *primary_dev, *standby_dev;
-	struct net_device *failover_dev;
-	int err;
-
-	if (!rtnl_trylock()) {
-		schedule_delayed_work(&nfo_info->takeover, 0);
-		return;
-	}
-
-	failover_dev = nfo_info->failover_dev;
-	primary_dev = rtnl_dereference(nfo_info->primary_dev);
-	standby_dev = rtnl_dereference(nfo_info->standby_dev);
-
-	if (primary_dev && netif_running(failover_dev)) {
-		err = dev_open(primary_dev);
-		if (err) {
-			netdev_err(failover_dev, "Opening primary %s failed err:%d\n",
-				   primary_dev->name, err);
-		} else {
-			net_failover_lower_state_changed(primary_dev,
-							 primary_dev,
-							 standby_dev);
-		}
-	}
-
-	rtnl_unlock();
-}
-
 static struct failover_ops net_failover_ops = {
 	.slave_pre_register	= net_failover_slave_pre_register,
 	.slave_register		= net_failover_slave_register,
@@ -759,7 +708,6 @@ static struct failover_ops net_failover_ops = {
 struct failover *net_failover_create(struct net_device *standby_dev)
 {
 	struct device *dev = standby_dev->dev.parent;
-	struct net_failover_info *nfo_info;
 	struct net_device *failover_dev;
 	struct failover *failover;
 	int err;
@@ -811,9 +759,6 @@ struct failover *net_failover_create(struct net_device *standby_dev)
 	}
 
 	netif_carrier_off(failover_dev);
-	nfo_info = netdev_priv(failover_dev);
-	nfo_info->failover_dev = failover_dev;
-	INIT_DELAYED_WORK(&nfo_info->takeover, net_failover_takeover_primary);
 
 	failover = failover_register(failover_dev, &net_failover_ops);
 	if (IS_ERR(failover))
@@ -853,8 +798,6 @@ void net_failover_destroy(struct failover *failover)
 	failover_dev = rcu_dereference(failover->failover_dev);
 	nfo_info = netdev_priv(failover_dev);
 
-	cancel_delayed_work_sync(&nfo_info->takeover);
-
 	netif_device_detach(failover_dev);
 
 	rtnl_lock();
diff --git a/include/net/net_failover.h b/include/net/net_failover.h
index 3cd0a6142b2b..b12a1c469d1c 100644
--- a/include/net/net_failover.h
+++ b/include/net/net_failover.h
@@ -25,12 +25,6 @@ struct net_failover_info {
 
 	/* spinlock while updating stats */
 	spinlock_t stats_lock;
-
-	/* back reference to associated net_device */
-	struct net_device *failover_dev;
-
-	/* delayed work to take over primary netdev */
-	struct delayed_work takeover;
 };
 
 struct failover *net_failover_create(struct net_device *standby_dev);

From patchwork Mon Apr 22 12:43:47 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Marcelo Henrique Cerri
X-Patchwork-Id: 1088684
From: Marcelo Henrique Cerri
To: kernel-team@lists.ubuntu.com
Subject: [b/oracle][PATCH 3/3] UBUNTU: SAUCE: failover: allow name change on IFF_UP slave interfaces
Date: Mon, 22 Apr 2019 09:43:47 -0300
Message-Id: <20190422124347.31277-4-marcelo.cerri@canonical.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190422124347.31277-1-marcelo.cerri@canonical.com>
References: <20190422124347.31277-1-marcelo.cerri@canonical.com>

From: Si-Wei Liu

BugLink: http://bugs.launchpad.net/bugs/1825229

When a netdev appears through hot plug and is then enslaved by a
failover master that is already up and running, the slave is opened
right away after being enslaved. Today there is a race in which
userspace (udev) may fail to rename the slave if the kernel
(net_failover) opens the slave before the userspace rename happens.
Unlike bond or team, the primary slave of failover can't be renamed by
userspace ahead of time, since the kernel-initiated auto-enslavement is
unable to, or rather, is never meant to, be synchronized with the
rename request from userspace.

The failover slave interfaces are not designed to be operated directly
by userspace apps: IP configuration, filter rules for network traffic,
etc. should all be done on the master interface. In general, userspace
apps only care about the name of the master interface, while slave
names are less important as long as admin users can see reliable names
that may carry other information describing the netdev.
For example, they can infer that "ens3nsby" is a standby slave of
"ens3", while for a name like "eth0" they can't tell which master it
belongs to.

Historically, the name of an IFF_UP interface can't be changed because
there might be admin scripts or management software already relying on
such behavior and assuming that the slave name can't be changed once
UP. But failover is special: with the in-kernel auto-enslavement
mechanism, the userspace expectation for device enumeration and
bring-up order is already broken. Previously, initramfs and various
userspace config tools were modified to bypass failover slaves because
of auto-enslavement and duplicate MAC addresses. Similarly, if users
care about seeing reliable slave names, the new type of failover slave
needs to be taken care of specifically in userspace anyway.

It's less risky to lift the rename restriction on a failover slave that
is already UP. Although this change may break userspace components
(most likely configuration scripts or management software) that assume
the slave name can't be changed while UP, that is a relatively limited
and controllable set among all userspace components, which can be fixed
specifically to listen for rename events on failover slaves. Userspace
components interacting with slaves are expected to be changed to
operate on the failover master interface instead, as the failover slave
is dynamic in nature and may come and go at any point. The goal is to
make the role of failover slaves less relevant, and userspace
components should only deal with the failover master in the long run.

Fixes: 30c8bd5aa8b2 ("net: Introduce generic failover module")
Signed-off-by: Si-Wei Liu
Reviewed-by: Liran Alon
Acked-by: Sridhar Samudrala
Signed-off-by: David S. Miller
(cherry picked from commit 8065a779f17e94536a1c4dcee4f9d88011672f97)
[marcelo.cerri: picked from git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git]
Signed-off-by: Marcelo Henrique Cerri
---
 include/linux/netdevice.h |  3 +++
 net/core/dev.c            | 16 +++++++++++++++-
 net/core/failover.c       |  6 +++---
 3 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index cda67002cf1a..3343b48e8737 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1382,6 +1382,7 @@ struct net_device_ops {
  * @IFF_FAILOVER: device is a failover master device
  * @IFF_FAILOVER_SLAVE: device is lower dev of a failover master device
  * @IFF_L3MDEV_RX_HANDLER: only invoke the rx handler of L3 master device
+ * @IFF_LIVE_RENAME_OK: rename is allowed while device is up and running
  */
 enum netdev_priv_flags {
 	IFF_802_1Q_VLAN			= 1<<0,
@@ -1414,6 +1415,7 @@ enum netdev_priv_flags {
 	IFF_FAILOVER			= 1<<27,
 	IFF_FAILOVER_SLAVE		= 1<<28,
 	IFF_L3MDEV_RX_HANDLER		= 1<<29,
+	IFF_LIVE_RENAME_OK		= 1<<30,
 };
 
 #define IFF_802_1Q_VLAN			IFF_802_1Q_VLAN
@@ -1445,6 +1447,7 @@ enum netdev_priv_flags {
 #define IFF_FAILOVER			IFF_FAILOVER
 #define IFF_FAILOVER_SLAVE		IFF_FAILOVER_SLAVE
 #define IFF_L3MDEV_RX_HANDLER		IFF_L3MDEV_RX_HANDLER
+#define IFF_LIVE_RENAME_OK		IFF_LIVE_RENAME_OK
 
 /**
  *	struct net_device - The DEVICE structure.
diff --git a/net/core/dev.c b/net/core/dev.c
index ea49a28976a8..58e69a24a234 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1182,7 +1182,21 @@ int dev_change_name(struct net_device *dev, const char *newname)
 	BUG_ON(!dev_net(dev));
 
 	net = dev_net(dev);
-	if (dev->flags & IFF_UP)
+
+	/* Some auto-enslaved devices e.g. failover slaves are
+	 * special, as userspace might rename the device after
+	 * the interface had been brought up and running since
+	 * the point kernel initiated auto-enslavement. Allow
+	 * live name change even when these slave devices are
+	 * up and running.
+	 *
+	 * Typically, users of these auto-enslaving devices
+	 * don't actually care about slave name change, as
+	 * they are supposed to operate on master interface
+	 * directly.
+	 */
+	if (dev->flags & IFF_UP &&
+	    likely(!(dev->priv_flags & IFF_LIVE_RENAME_OK)))
 		return -EBUSY;
 
 	write_seqcount_begin(&devnet_rename_seq);
diff --git a/net/core/failover.c b/net/core/failover.c
index 4a92a98ccce9..b5cd3c727285 100644
--- a/net/core/failover.c
+++ b/net/core/failover.c
@@ -80,14 +80,14 @@ static int failover_slave_register(struct net_device *slave_dev)
 		goto err_upper_link;
 	}
 
-	slave_dev->priv_flags |= IFF_FAILOVER_SLAVE;
+	slave_dev->priv_flags |= (IFF_FAILOVER_SLAVE | IFF_LIVE_RENAME_OK);
 
 	if (fops && fops->slave_register &&
 	    !fops->slave_register(slave_dev, failover_dev))
 		return NOTIFY_OK;
 
 	netdev_upper_dev_unlink(slave_dev, failover_dev);
-	slave_dev->priv_flags &= ~IFF_FAILOVER_SLAVE;
+	slave_dev->priv_flags &= ~(IFF_FAILOVER_SLAVE | IFF_LIVE_RENAME_OK);
 err_upper_link:
 	netdev_rx_handler_unregister(slave_dev);
 done:
@@ -121,7 +121,7 @@ int failover_slave_unregister(struct net_device *slave_dev)
 
 	netdev_rx_handler_unregister(slave_dev);
 	netdev_upper_dev_unlink(slave_dev, failover_dev);
-	slave_dev->priv_flags &= ~IFF_FAILOVER_SLAVE;
+	slave_dev->priv_flags &= ~(IFF_FAILOVER_SLAVE | IFF_LIVE_RENAME_OK);
 
 	if (fops && fops->slave_unregister &&
 	    !fops->slave_unregister(slave_dev, failover_dev))