Message ID | 1452147313-22886-1-git-send-email-zyjzyj2000@gmail.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
Hi, Emil Would you like to help me to make tests with this patch? If the root cause is not the time span, I will make a new patch for this. Thanks a lot. Zhu Yanjun On 01/07/2016 02:15 PM, zyjzyj2000@gmail.com wrote: > From: Zhu Yanjun <yanjun.zhu@windriver.com> > > In 802.3ad mode, the speed and duplex is needed. But in some NIC, > there is a time span between NIC up state and getting speed and duplex. > As such, sometimes a slave in 802.3ad mode is in up state without > speed and duplex. This will make bonding in 802.3ad mode can not > work well. > To make bonding driver be compatible with more NICs, it is > necessary to restrict the up state in 802.3ad mode. > > Signed-off-by: Zhu Yanjun <yanjun.zhu@windriver.com> > --- > drivers/net/bonding/bond_main.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c > index 09f8a48..7df8af5 100644 > --- a/drivers/net/bonding/bond_main.c > +++ b/drivers/net/bonding/bond_main.c > @@ -1991,6 +1991,17 @@ static int bond_miimon_inspect(struct bonding *bond) > > link_state = bond_check_dev_link(bond, slave->dev, 0); > > + if ((BMSR_LSTATUS == link_state) && > + (BOND_MODE(bond) == BOND_MODE_8023AD)) { > + rtnl_lock(); > + bond_update_speed_duplex(slave); > + rtnl_unlock(); > + if ((slave->speed == SPEED_UNKNOWN) || > + (slave->duplex == DUPLEX_UNKNOWN)) { > + link_state = 0; > + netdev_info(bond->dev, "In 802.3ad mode, it is not enough to up without speed and duplex"); > + } > + } > switch (slave->link) { > case BOND_LINK_UP: > if (link_state) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
<zyjzyj2000@gmail.com> wrote: >From: Zhu Yanjun <yanjun.zhu@windriver.com> > >In 802.3ad mode, the speed and duplex is needed. But in some NIC, >there is a time span between NIC up state and getting speed and duplex. >As such, sometimes a slave in 802.3ad mode is in up state without >speed and duplex. This will make bonding in 802.3ad mode can not >work well. From my reading of Emil's comments in the discussion, I'm not sure the above is an accurate description of the problem. If I'm understanding correctly, the cause is due to link flaps racing with the bonding monitor workqueue polling the state. Is this correct? >To make bonding driver be compatible with more NICs, it is >necessary to restrict the up state in 802.3ad mode. > >Signed-off-by: Zhu Yanjun <yanjun.zhu@windriver.com> >--- > drivers/net/bonding/bond_main.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > >diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c >index 09f8a48..7df8af5 100644 >--- a/drivers/net/bonding/bond_main.c >+++ b/drivers/net/bonding/bond_main.c >@@ -1991,6 +1991,17 @@ static int bond_miimon_inspect(struct bonding *bond) > > link_state = bond_check_dev_link(bond, slave->dev, 0); > >+ if ((BMSR_LSTATUS == link_state) && >+ (BOND_MODE(bond) == BOND_MODE_8023AD)) { >+ rtnl_lock(); >+ bond_update_speed_duplex(slave); >+ rtnl_unlock(); This will add a round trip on the RTNL mutex for every miimon interval when the slave is carrier up. At common miimon rates (10 - 50 ms), this will hit RTNL between 20 and 100 times per second. I do not see how this is acceptable. I believe the proper solution here is to supplant the periodic miimon polling from bonding with link state detection based on notifiers (As Stephen suggested, not for the first time). My suggestion is to have bonding set slave link state based on notifiers if miimon is set to zero, and poll as usual if it is not. This would preserve any backwards compatibility with any device out there that might possibly still be doing netif_carrier_on/off incorrectly or not at all. The only minor complication is synchronizing notifier carrier state detection with the ARP monitor. This should have been done a long time ago; I'll work something up tomorrow (it's late here right now) and post a patch for testing. -J >+ if ((slave->speed == SPEED_UNKNOWN) || >+ (slave->duplex == DUPLEX_UNKNOWN)) { >+ link_state = 0; >+ netdev_info(bond->dev, "In 802.3ad mode, it is not enough to up without speed and duplex"); >+ } >+ } > switch (slave->link) { > case BOND_LINK_UP: > if (link_state) >-- >1.7.9.5 > --- -Jay Vosburgh, jay.vosburgh@canonical.com -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jan 07, 2016 at 02:15:13PM +0800, zyjzyj2000@gmail.com wrote: > From: Zhu Yanjun <yanjun.zhu@windriver.com> > > In 802.3ad mode, the speed and duplex is needed. But in some NIC, > there is a time span between NIC up state and getting speed and duplex. > As such, sometimes a slave in 802.3ad mode is in up state without > speed and duplex. This will make bonding in 802.3ad mode can not > work well. > To make bonding driver be compatible with more NICs, it is > necessary to restrict the up state in 802.3ad mode. > > Signed-off-by: Zhu Yanjun <yanjun.zhu@windriver.com> > --- > drivers/net/bonding/bond_main.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c > index 09f8a48..7df8af5 100644 > --- a/drivers/net/bonding/bond_main.c > +++ b/drivers/net/bonding/bond_main.c > @@ -1991,6 +1991,17 @@ static int bond_miimon_inspect(struct bonding *bond) > > link_state = bond_check_dev_link(bond, slave->dev, 0); > > + if ((BMSR_LSTATUS == link_state) && > + (BOND_MODE(bond) == BOND_MODE_8023AD)) { > + rtnl_lock(); > + bond_update_speed_duplex(slave); > + rtnl_unlock(); > + if ((slave->speed == SPEED_UNKNOWN) || > + (slave->duplex == DUPLEX_UNKNOWN)) { > + link_state = 0; > + netdev_info(bond->dev, "In 802.3ad mode, it is not enough to up without speed and duplex"); If I read this right, whenever this state (link up but speed/duplex unknown) is entered, you'll keep writing this message into kernel log every miimon milliseconds until something changes. I'm not sure how long a NIC can stay in such state but it might get quite annoying (even more if something really goes wrong and NIC stays that way which can't be completely ruled out, IMHO). > + } > + } > switch (slave->link) { > case BOND_LINK_UP: > if (link_state) BtW, you accidentally submitted this patch twice. Michal Kubecek -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 01/07/2016 02:53 PM, Michal Kubecek wrote: > On Thu, Jan 07, 2016 at 02:15:13PM +0800, zyjzyj2000@gmail.com wrote: >> From: Zhu Yanjun <yanjun.zhu@windriver.com> >> >> In 802.3ad mode, the speed and duplex is needed. But in some NIC, >> there is a time span between NIC up state and getting speed and duplex. >> As such, sometimes a slave in 802.3ad mode is in up state without >> speed and duplex. This will make bonding in 802.3ad mode can not >> work well. >> To make bonding driver be compatible with more NICs, it is >> necessary to restrict the up state in 802.3ad mode. >> >> Signed-off-by: Zhu Yanjun <yanjun.zhu@windriver.com> >> --- >> drivers/net/bonding/bond_main.c | 11 +++++++++++ >> 1 file changed, 11 insertions(+) >> >> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c >> index 09f8a48..7df8af5 100644 >> --- a/drivers/net/bonding/bond_main.c >> +++ b/drivers/net/bonding/bond_main.c >> @@ -1991,6 +1991,17 @@ static int bond_miimon_inspect(struct bonding *bond) >> >> link_state = bond_check_dev_link(bond, slave->dev, 0); >> >> + if ((BMSR_LSTATUS == link_state) && >> + (BOND_MODE(bond) == BOND_MODE_8023AD)) { >> + rtnl_lock(); >> + bond_update_speed_duplex(slave); >> + rtnl_unlock(); >> + if ((slave->speed == SPEED_UNKNOWN) || >> + (slave->duplex == DUPLEX_UNKNOWN)) { >> + link_state = 0; >> + netdev_info(bond->dev, "In 802.3ad mode, it is not enough to up without speed and duplex"); > If I read this right, whenever this state (link up but speed/duplex > unknown) is entered, you'll keep writing this message into kernel log > every miimon milliseconds until something changes. I'm not sure how long > a NIC can stay in such state but it might get quite annoying (even more > if something really goes wrong and NIC stays that way which can't be > completely ruled out, IMHO). Sure, Thanks a lot. I want to confirm link_up without link_speed. It is not usual. So I think this only lasts for several seconds. It is very important to us since it can help us to find the root cause. Zhu Yanjun > > >> + } >> + } >> switch (slave->link) { >> case BOND_LINK_UP: >> if (link_state) > BtW, you accidentally submitted this patch twice. > > Michal Kubecek -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jan 07, 2016 at 03:37:26PM +0800, zhuyj wrote: > >If I read this right, whenever this state (link up but speed/duplex > >unknown) is entered, you'll keep writing this message into kernel log > >every miimon milliseconds until something changes. I'm not sure how long > >a NIC can stay in such state but it might get quite annoying (even more > >if something really goes wrong and NIC stays that way which can't be > >completely ruled out, IMHO). > > Sure, Thanks a lot. I want to confirm link_up without link_speed. It > is not usual. So I think this only lasts for several seconds. > It is very important to us since it can help us to find the root cause. For debugging purposes it's fine, of course. But this looked like an officially submitted patch so I didn't like the idea of log spamming (even one second could result in 10-100 messages and admins certainly would hate that). Michal Kubecek -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 01/07/2016 03:59 PM, Michal Kubecek wrote: > On Thu, Jan 07, 2016 at 03:37:26PM +0800, zhuyj wrote: >>> If I read this right, whenever this state (link up but speed/duplex >>> unknown) is entered, you'll keep writing this message into kernel log >>> every miimon milliseconds until something changes. I'm not sure how long >>> a NIC can stay in such state but it might get quite annoying (even more >>> if something really goes wrong and NIC stays that way which can't be >>> completely ruled out, IMHO). >> Sure, Thanks a lot. I want to confirm link_up without link_speed. It >> is not usual. So I think this only lasts for several seconds. >> It is very important to us since it can help us to find the root cause. > For debugging purposes it's fine, of course. But this looked like an > officially submitted patch so I didn't like the idea of log spamming > (even one second could result in 10-100 messages and admins certainly > would hate that). > > Michal Kubecek > Thanks a lot. Zhu Yanjun -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
>-----Original Message----- >From: Jay Vosburgh [mailto:jay.vosburgh@canonical.com] >Sent: Wednesday, January 06, 2016 10:34 PM >To: zyjzyj2000@gmail.com >Cc: Tantilov, Emil S; mkubecek@suse.cz; vfalico@gmail.com; >gospo@cumulusnetworks.com; netdev@vger.kernel.org; Shteinbock, Boris (Wind >River) >Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode > ><zyjzyj2000@gmail.com> wrote: > >>From: Zhu Yanjun <yanjun.zhu@windriver.com> >> >>In 802.3ad mode, the speed and duplex is needed. But in some NIC, >>there is a time span between NIC up state and getting speed and duplex. >>As such, sometimes a slave in 802.3ad mode is in up state without >>speed and duplex. This will make bonding in 802.3ad mode can not >>work well. > > From my reading of Emil's comments in the discussion, I'm not >sure the above is an accurate description of the problem. If I'm >understanding correctly, the cause is due to link flaps racing with the >bonding monitor workqueue polling the state. Is this correct? That is correct. >>To make bonding driver be compatible with more NICs, it is >>necessary to restrict the up state in 802.3ad mode. >> >>Signed-off-by: Zhu Yanjun <yanjun.zhu@windriver.com> >>--- >> drivers/net/bonding/bond_main.c | 11 +++++++++++ >> 1 file changed, 11 insertions(+) >> >>diff --git a/drivers/net/bonding/bond_main.c >b/drivers/net/bonding/bond_main.c >>index 09f8a48..7df8af5 100644 >>--- a/drivers/net/bonding/bond_main.c >>+++ b/drivers/net/bonding/bond_main.c >>@@ -1991,6 +1991,17 @@ static int bond_miimon_inspect(struct bonding >*bond) >> >> link_state = bond_check_dev_link(bond, slave->dev, 0); >> >>+ if ((BMSR_LSTATUS == link_state) && >>+ (BOND_MODE(bond) == BOND_MODE_8023AD)) { >>+ rtnl_lock(); >>+ bond_update_speed_duplex(slave); >>+ rtnl_unlock(); > > This will add a round trip on the RTNL mutex for every miimon >interval when the slave is carrier up. At common miimon rates (10 - 50 >ms), this will hit RTNL between 20 and 100 times per second. I do not >see how this is acceptable. > > I believe the proper solution here is to supplant the periodic >miimon polling from bonding with link state detection based on notifiers >(As Stephen suggested, not for the first time). > > My suggestion is to have bonding set slave link state based on >notifiers if miimon is set to zero, and poll as usual if it is not. >This would preserve any backwards compatibility with any device out >there that might possibly still be doing netif_carrier_on/off >incorrectly or not at all. The only minor complication is synchronizing >notifier carrier state detection with the ARP monitor. > > This should have been done a long time ago; I'll work something >up tomorrow (it's late here right now) and post a patch for testing. That would be awesome. Looking forward to it. Thanks, Emil -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 01/07/2016 02:33 PM, Jay Vosburgh wrote: > <zyjzyj2000@gmail.com> wrote: > >> From: Zhu Yanjun <yanjun.zhu@windriver.com> >> >> In 802.3ad mode, the speed and duplex is needed. But in some NIC, >> there is a time span between NIC up state and getting speed and duplex. >> As such, sometimes a slave in 802.3ad mode is in up state without >> speed and duplex. This will make bonding in 802.3ad mode can not >> work well. > From my reading of Emil's comments in the discussion, I'm not > sure the above is an accurate description of the problem. If I'm > understanding correctly, the cause is due to link flaps racing with the > bonding monitor workqueue polling the state. Is this correct? The following are from my user. What I have done is based on it. " Here's one theory that would seem to match my observations: It seems that the x540T can sometimes report 'link up' a few moments before the speed and duplex are known. If the bond driver reads and stores the speed and duplex immediately after the link becomes 'up', it may occasionally miss the actual parameters by reading them before they are ready. If the parameters are only read when the link state changes, the 'unknown' status will stay until next state change happens. I have attached a file 'test.log' that shows the kernel log and states of the relevant interfaces after the problem is hit. It shows that the final state has all the links up and speeds are known on individual interfaces, but bond0 shows only single interface speed. After successful negotiation bond0 shows the aggregate speed i.e. 20000Mb/s. In the end of the file there is bunch of ethtool runs that were taken in a tight loop during the negotiation. It shows in the end that the link becomes up some time before the speed and duplex are actually known. If the bond driver only reads the speed and duplex during this window, it will get it wrong and it won't be corrected when the real speed becomes known since the link state won't change at that time. " Zhu Yanjun > >> To make bonding driver be compatible with more NICs, it is >> necessary to restrict the up state in 802.3ad mode. >> >> Signed-off-by: Zhu Yanjun <yanjun.zhu@windriver.com> >> --- >> drivers/net/bonding/bond_main.c | 11 +++++++++++ >> 1 file changed, 11 insertions(+) >> >> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c >> index 09f8a48..7df8af5 100644 >> --- a/drivers/net/bonding/bond_main.c >> +++ b/drivers/net/bonding/bond_main.c >> @@ -1991,6 +1991,17 @@ static int bond_miimon_inspect(struct bonding *bond) >> >> link_state = bond_check_dev_link(bond, slave->dev, 0); >> >> + if ((BMSR_LSTATUS == link_state) && >> + (BOND_MODE(bond) == BOND_MODE_8023AD)) { >> + rtnl_lock(); >> + bond_update_speed_duplex(slave); >> + rtnl_unlock(); > This will add a round trip on the RTNL mutex for every miimon > interval when the slave is carrier up. At common miimon rates (10 - 50 > ms), this will hit RTNL between 20 and 100 times per second. I do not > see how this is acceptable. > > I believe the proper solution here is to supplant the periodic > miimon polling from bonding with link state detection based on notifiers > (As Stephen suggested, not for the first time). > > My suggestion is to have bonding set slave link state based on > notifiers if miimon is set to zero, and poll as usual if it is not. > This would preserve any backwards compatibility with any device out > there that might possibly still be doing netif_carrier_on/off > incorrectly or not at all. The only minor complication is synchronizing > notifier carrier state detection with the ARP monitor. > > This should have been done a long time ago; I'll work something > up tomorrow (it's late here right now) and post a patch for testing. > > -J > >> + if ((slave->speed == SPEED_UNKNOWN) || >> + (slave->duplex == DUPLEX_UNKNOWN)) { >> + link_state = 0; >> + netdev_info(bond->dev, "In 802.3ad mode, it is not enough to up without speed and duplex"); >> + } >> + } >> switch (slave->link) { >> case BOND_LINK_UP: >> if (link_state) >> -- >> 1.7.9.5 >> > --- > -Jay Vosburgh, jay.vosburgh@canonical.com
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 09f8a48..7df8af5 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1991,6 +1991,17 @@ static int bond_miimon_inspect(struct bonding *bond) link_state = bond_check_dev_link(bond, slave->dev, 0); + if ((BMSR_LSTATUS == link_state) && + (BOND_MODE(bond) == BOND_MODE_8023AD)) { + rtnl_lock(); + bond_update_speed_duplex(slave); + rtnl_unlock(); + if ((slave->speed == SPEED_UNKNOWN) || + (slave->duplex == DUPLEX_UNKNOWN)) { + link_state = 0; + netdev_info(bond->dev, "In 802.3ad mode, it is not enough to up without speed and duplex"); + } + } switch (slave->link) { case BOND_LINK_UP: if (link_state)