From patchwork Tue Mar 26 23:48:13 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Si-Wei Liu
X-Patchwork-Id: 1066240
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: patchwork-incoming-netdev@ozlabs.org
Delivered-To: patchwork-incoming-netdev@ozlabs.org
Authentication-Results: ozlabs.org; spf=none (mailfrom)
 smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67;
 helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org;
 receiver=)
Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none)
 header.from=oracle.com
Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected)
 header.d=oracle.com header.i=@oracle.com header.b="NBHMGmq9";
 dkim-atps=neutral
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by ozlabs.org (Postfix) with ESMTP id 44TT5867qhz9sS5
 for ; Wed, 27 Mar 2019 11:14:20 +1100 (AEDT)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1731769AbfC0ANq (ORCPT );
 Tue, 26 Mar 2019 20:13:46 -0400
Received: from userp2130.oracle.com ([156.151.31.86]:58750 "EHLO
 userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1726982AbfC0ANq (ORCPT );
 Tue, 26 Mar 2019 20:13:46 -0400
Received: from pps.filterd (userp2130.oracle.com [127.0.0.1])
 by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id
 x2R046Gb169878; Wed, 27 Mar 2019 00:13:30 GMT
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com;
 h=from : to : cc : subject : date : message-id; s=corp-2018-07-02;
 bh=JYi/azm0uwMGWzITHNt5JwhXnQkO8mYRgkt7VG7O3rU=;
 b=NBHMGmq9KZb5TmYIIcutSSes1n80rUOPq/FAgVOKQvpk88PKJ9n9Djg6lWGMDE9hLzLJ
 EskDeKyiwgdt4Tn+VSYR06NhfvV9KSbB16/Kj8l9PbD12jDkTbiLF8sBt36xmuUXc5yG
 z73qi7iP3VVQUaxwdDVbjpV9UGcYvpVG7u7npws/SCnJTQBwoOPZNffNAWTPPphMUvB+
 /03wLtGDX+b1JTI/89xLRhm40+fld0/3H/RMWKF58S5OnCxUJSH1k89p4z7Eu85Bct27
 Ig4N8CSbrosFWsEdDRK2uSMS/+UPGfPxHxg0C/hbjozYul8ChqL2xMpmnUYHhGnjlU56
 Mw==
Received: from aserv0022.oracle.com (aserv0022.oracle.com [141.146.126.234])
 by userp2130.oracle.com with ESMTP id 2re6g15j1n-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
 Wed, 27 Mar 2019 00:13:30 +0000
Received: from userv0122.oracle.com (userv0122.oracle.com [156.151.31.75])
 by aserv0022.oracle.com (8.14.4/8.14.4) with ESMTP id x2R0DS8d016352
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
 Wed, 27 Mar 2019 00:13:29 GMT
Received: from abhmp0014.oracle.com (abhmp0014.oracle.com [141.146.116.20])
 by userv0122.oracle.com (8.14.4/8.14.4) with ESMTP id x2R0DPkM030349;
 Wed, 27 Mar 2019 00:13:25 GMT
Received: from ban25x6uut24.us.oracle.com (/10.153.73.24)
 by default (Oracle Beehive Gateway v4.0)
 with ESMTP ; Tue, 26 Mar 2019 17:13:24 -0700
From: Si-Wei Liu
To: mst@redhat.com, sridhar.samudrala@intel.com,
 stephen@networkplumber.org, davem@davemloft.net, kubakici@wp.pl,
 alexander.duyck@gmail.com, jiri@resnulli.us, netdev@vger.kernel.org,
 virtualization@lists.linux-foundation.org
Cc: liran.alon@oracle.com, boris.ostrovsky@oracle.com,
 vijay.balakrishna@oracle.com, si-wei liu
Subject: [PATCH net v3] failover: allow name change on IFF_UP slave interfaces
Date: Tue, 26 Mar 2019 19:48:13 -0400
Message-Id: <1553644093-10917-1-git-send-email-si-wei.liu@oracle.com>
X-Mailer: git-send-email 1.8.3.1
X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9207
 signatures=668685
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0
 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0
 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0
 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0
 reason=mlx scancount=1 engine=8.0.1-1810050000
 definitions=main-1903260163
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

When a netdev appears through hot plug then gets enslaved by a failover
master that is already up and running, the slave
will be opened right away after getting enslaved. Today there is a race:
userspace (udev) may fail to rename the slave if the kernel
(net_failover) opens the slave before the userspace rename happens.
Unlike bond or team, the primary slave of failover can't be renamed by
userspace ahead of time, since the kernel-initiated auto-enslavement
cannot be, or rather is never meant to be, synchronized with the rename
request from userspace.

The failover slave interfaces are not designed to be operated directly
by userspace apps: IP configuration, traffic filter rules, etc. should
all be done on the master interface. In general, userspace apps only
care about the name of the master interface, while slave names are less
important as long as admin users can see reliable names that may carry
other information describing the netdev. For example, they can infer
that "ens3nsby" is a standby slave of "ens3", while for a name like
"eth0" they can't tell which master it belongs to.

Historically, the name of an IFF_UP interface can't be changed because
there might be admin scripts or management software already relying on
that behavior and assuming that the slave name can't be changed once UP.
But failover is special: with the in-kernel auto-enslavement mechanism,
the userspace expectation for device enumeration and bring-up order is
already broken. Previously, initramfs and various userspace config tools
were modified to bypass failover slaves because of auto-enslavement and
the duplicate MAC address. Similarly, if users care about seeing
reliable slave names, the new type of failover slave needs to be handled
specifically in userspace anyway.

It's less risky to lift the rename restriction on a failover slave that
is already UP.
Although this change may break userspace components (most likely
configuration scripts or management software) that assume a slave name
can't be changed while UP, that is a relatively limited and controllable
set among all userspace components, and they can be fixed specifically
to listen for the rename and/or link down/up events on failover slaves.
Userspace components interacting with slaves are expected to be changed
to operate on the failover master interface instead, as the failover
slave is dynamic in nature and may come and go at any point. The goal is
to make the role of failover slaves less relevant, and userspace
components should only deal with the failover master in the long run.

Fixes: 30c8bd5aa8b2 ("net: Introduce generic failover module")
Signed-off-by: Si-Wei Liu
Reviewed-by: Liran Alon
---
v1 -> v2:
- Drop configurable module parameter (Sridhar)

v2 -> v3:
- Drop additional IFF_SLAVE_RENAME_OK flag (Sridhar)
- Send down and up events around rename (Michael S. Tsirkin)
---
 net/core/dev.c | 37 ++++++++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 722d50d..3e0cd80 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1171,6 +1171,7 @@ int dev_get_valid_name(struct net *net, struct net_device *dev,
 int dev_change_name(struct net_device *dev, const char *newname)
 {
 	unsigned char old_assign_type;
+	bool reopen_needed = false;
 	char oldname[IFNAMSIZ];
 	int err = 0;
 	int ret;
@@ -1180,8 +1181,24 @@ int dev_change_name(struct net_device *dev, const char *newname)
 	BUG_ON(!dev_net(dev));
 
 	net = dev_net(dev);
-	if (dev->flags & IFF_UP)
-		return -EBUSY;
+
+	/* Allow failover slave to rename even when
+	 * it is up and running.
+	 *
+	 * Failover slaves are special, since userspace
+	 * might rename the slave after the interface
+	 * has been brought up and running due to
+	 * auto-enslavement.
+	 *
+	 * Failover users don't actually care about slave
+	 * name change, as they are only expected to operate
+	 * on master interface directly.
+	 */
+	if (dev->flags & IFF_UP) {
+		if (likely(!(dev->priv_flags & IFF_FAILOVER_SLAVE)))
+			return -EBUSY;
+		reopen_needed = true;
+	}
 
 	write_seqcount_begin(&devnet_rename_seq);
 
@@ -1198,6 +1215,9 @@ int dev_change_name(struct net_device *dev, const char *newname)
 		return err;
 	}
 
+	if (reopen_needed)
+		dev_close(dev);
+
 	if (oldname[0] && !strchr(oldname, '%'))
 		netdev_info(dev, "renamed from %s\n", oldname);
 
@@ -1210,7 +1230,9 @@ int dev_change_name(struct net_device *dev, const char *newname)
 		memcpy(dev->name, oldname, IFNAMSIZ);
 		dev->name_assign_type = old_assign_type;
 		write_seqcount_end(&devnet_rename_seq);
-		return ret;
+		if (err >= 0)
+			err = ret;
+		goto reopen;
 	}
 
 	write_seqcount_end(&devnet_rename_seq);
@@ -1246,6 +1268,15 @@ int dev_change_name(struct net_device *dev, const char *newname)
 		}
 	}
 
+reopen:
+	if (reopen_needed) {
+		ret = dev_open(dev);
+		if (ret) {
+			pr_err("%s: reopen device failed: %d\n",
+			       dev->name, ret);
+		}
+	}
+
 	return err;
 }
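
For readers who want to follow the control flow without a kernel tree,
the close/rename/reopen sequence described above can be sketched in
plain userspace C. This is only an illustrative model, not kernel code:
struct sim_netdev, the SIM_* flags, and sim_close(), sim_open() and
sim_change_name() are hypothetical stand-ins for struct net_device,
IFF_UP, IFF_FAILOVER_SLAVE, dev_close(), dev_open() and
dev_change_name(). Locking, notifiers and the sysfs rename rollback
are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define SIM_IFNAMSIZ 16

/* Hypothetical stand-ins for IFF_UP and IFF_FAILOVER_SLAVE. */
#define SIM_IFF_UP             0x1u
#define SIM_IFF_FAILOVER_SLAVE 0x2u

/* Hypothetical, heavily reduced model of struct net_device. */
struct sim_netdev {
	char name[SIM_IFNAMSIZ];
	unsigned int flags;      /* carries SIM_IFF_UP */
	unsigned int priv_flags; /* carries SIM_IFF_FAILOVER_SLAVE */
	int close_events;        /* number of simulated link-down events */
	int open_events;         /* number of simulated link-up events */
};

/* Models dev_close(): clear the UP flag and record a down event. */
void sim_close(struct sim_netdev *dev)
{
	dev->flags &= ~SIM_IFF_UP;
	dev->close_events++;
}

/* Models dev_open(): set the UP flag and record an up event. */
int sim_open(struct sim_netdev *dev)
{
	dev->flags |= SIM_IFF_UP;
	dev->open_events++;
	return 0;
}

/* Models the patched rename path: an UP device is refused with -EBUSY
 * unless it is a failover slave, in which case it is closed, renamed
 * and reopened.
 */
int sim_change_name(struct sim_netdev *dev, const char *newname)
{
	bool reopen_needed = false;
	size_t len;
	int ret;

	assert(dev != NULL && newname != NULL);

	if (dev->flags & SIM_IFF_UP) {
		if (!(dev->priv_flags & SIM_IFF_FAILOVER_SLAVE))
			return -EBUSY;
		reopen_needed = true;
	}

	/* Same name: nothing to do, and no down/up events are sent. */
	if (strncmp(newname, dev->name, SIM_IFNAMSIZ) == 0)
		return 0;

	/* Simplified validity check: non-empty and fits the buffer. */
	len = strlen(newname);
	if (len == 0 || len >= SIM_IFNAMSIZ)
		return -EINVAL;

	if (reopen_needed)
		sim_close(dev);

	memcpy(dev->name, newname, len + 1);

	if (reopen_needed) {
		ret = sim_open(dev);
		if (ret)
			fprintf(stderr, "%s: reopen device failed: %d\n",
				dev->name, ret);
	}

	return 0;
}
```

In this model, renaming an UP device that is not a failover slave
still fails with -EBUSY, while an UP failover slave ends up renamed and
UP again, with exactly one down event and one up event around the
rename.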