Message ID | 1373005979-10196-1-git-send-email-wangyufen@huawei.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
From: Wangyufen <wangyufen@huawei.com> Date: Fri, 5 Jul 2013 14:32:59 +0800 > @@ -2301,8 +2301,11 @@ static int bond_miimon_inspect(struct bonding *bond) > switch (slave->link) { > case BOND_LINK_UP: > - if (link_state) > + if (link_state) { > + if (slave->speed == SPEED_UNKNOWN) > + bond_update_speed_duplex(slave); > continue; bond_miimon_inspect() does not hold the RTNL mutex, and it is required that the RTNL mutex is held when bond_update_speed_duplex() is called. If you ran this new code, you should be hitting the assertion at the beginning of __ethtool_get_settings() which reads: ASSERT_RTNL(); In fact, if you look at bond_miimon_inspect()'s caller it goes: if (!rtnl_trylock()) { right after calling bond_miimon_inspect(). -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jul 5, 2013 at 8:32 AM, Wangyufen <wangyufen@huawei.com> wrote: > From: "Wang Yufen" <wangyufen@huawei.com> > > We bonded nic using LACP mode repeatedly, occasionally LACP bonding failed, > because a slave nic port speed was unknown. But when we used ethtool to > check the slave NIC status, the nic status was normal,speed was 10000Mb/s. Can you give a bit more details on how did you test? And which nic was it? I've tried to reproduce it with with while :; do echo +/- > /sys/.../bonding/slaves; done but failed. > > Bonding get the NIC speed from NIC drivers,just when enslave nic > and receive NETDEV_CHANGE event.We call bond_update_speed_duplex to > update speed and duplex when miimon inspect slave link is OK and slave > speed is unknown. > > > Signed-off-by: Wang Yufen <wangyufen@huawei.com> > --- > drivers/net/bonding/bond_main.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c > index f975696..d288a98 100644 > --- a/drivers/net/bonding/bond_main.c > +++ b/drivers/net/bonding/bond_main.c > @@ -2301,8 +2301,11 @@ static int bond_miimon_inspect(struct bonding *bond) > > switch (slave->link) { > case BOND_LINK_UP: > - if (link_state) > + if (link_state) { > + if (slave->speed == SPEED_UNKNOWN) > + bond_update_speed_duplex(slave); > continue; > + } > > slave->link = BOND_LINK_FAIL; > slave->delay = bond->params.downdelay; > -- > 1.8.0 > > > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Best regards, Veaceslav Falico -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2013/7/5 17:20, Veaceslav Falico wrote: > On Fri, Jul 5, 2013 at 8:32 AM, Wangyufen <wangyufen@huawei.com> wrote: >> From: "Wang Yufen" <wangyufen@huawei.com> >> >> We bonded nic using LACP mode repeatedly, occasionally LACP bonding failed, >> because a slave nic port speed was unknown. But when we used ethtool to >> check the slave NIC status, the nic status was normal,speed was 10000Mb/s. > > Can you give a bit more details on how did you test? And which nic was it? > > I've tried to reproduce it with with while :; do echo +/- > > /sys/.../bonding/slaves; done > but failed. > that is my test script: # !/bin/sh function bond_enable() { echo -vpa0 >/sys/class/net/bonding_masters sleep 2 echo +vpa0 > /sys/class/net/bonding_masters echo 4 > /sys/class/net/vpa0/bonding/mode echo 1 > /sys/class/net/vpa0/bonding/xmit_hash_policy ifconfig vpa0 up sleep 1 echo +eth0 > /sys/class/net/vpa0/bonding/slaves echo +eth5 > /sys/class/net/vpa0/bonding/slaves echo +eth7 > /sys/class/net/vpa0/bonding/slaves echo +eth9 > /sys/class/net/vpa0/bonding/slaves sleep 2 } for((i=0;i<5000;i++)) do bond_enable j=0 while [ $j -lt 10 ] do sleep 0.5 ifconfig vpa0 192.168.13.8/24 out=`ifconfig | grep "192.168.13.8"` if [ -n "$out" ];then break fi ((j++)) done out1=`ping 192.168.13.8 -c 4` out2=`cat /proc/net/bonding/vpa0 | grep "Number of ports: 4"` if [ -n "$out1" -a -n "$out2" ];then echo $i PASS >> dep002.log else echo "out1 is $out1, out2 is $out2" >>dep002.log echo $i FAIL >> dep002.log exit 0 fi done >> >> Bonding get the NIC speed from NIC drivers,just when enslave nic >> and receive NETDEV_CHANGE event.We call bond_update_speed_duplex to >> update speed and duplex when miimon inspect slave link is OK and slave >> speed is unknown. >> >> >> Signed-off-by: Wang Yufen <wangyufen@huawei.com> >> --- >> drivers/net/bonding/bond_main.c | 5 ++++- >> 1 file changed, 4 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c >> index f975696..d288a98 100644 >> --- a/drivers/net/bonding/bond_main.c >> +++ b/drivers/net/bonding/bond_main.c >> @@ -2301,8 +2301,11 @@ static int bond_miimon_inspect(struct bonding *bond) >> >> switch (slave->link) { >> case BOND_LINK_UP: >> - if (link_state) >> + if (link_state) { >> + if (slave->speed == SPEED_UNKNOWN) >> + bond_update_speed_duplex(slave); >> continue; >> + } >> >> slave->link = BOND_LINK_FAIL; >> slave->delay = bond->params.downdelay; >> -- >> 1.8.0 >> >> >> -- >> To unsubscribe from this list: send the line "unsubscribe netdev" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > -- > Best regards, > Veaceslav Falico > > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2013/7/5 16:40, David Miller wrote: > From: Wangyufen <wangyufen@huawei.com> > Date: Fri, 5 Jul 2013 14:32:59 +0800 > >> @@ -2301,8 +2301,11 @@ static int bond_miimon_inspect(struct bonding *bond) > >> switch (slave->link) { >> case BOND_LINK_UP: >> - if (link_state) >> + if (link_state) { >> + if (slave->speed == SPEED_UNKNOWN) >> + bond_update_speed_duplex(slave); >> continue; > > bond_miimon_inspect() does not hold the RTNL mutex, and it is required > that the RTNL mutex is held when bond_update_speed_duplex() is called. > > If you ran this new code, you should be hitting the assertion at the > beginning of __ethtool_get_settings() which reads: > > ASSERT_RTNL(); > > In fact, if you look at bond_miimon_inspect()'s caller it goes: > > if (!rtnl_trylock()) { > > right after calling bond_miimon_inspect(). OK,thanks. > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2013-07-05 at 14:32 +0800, Wangyufen wrote: > From: "Wang Yufen" <wangyufen@huawei.com> > > We bonded nic using LACP mode repeatedly, occasionally LACP bonding failed, > because a slave nic port speed was unknown. But when we used ethtool to > check the slave NIC status, the nic status was normal,speed was 10000Mb/s. > > Bonding get the NIC speed from NIC drivers,just when enslave nic > and receive NETDEV_CHANGE event.We call bond_update_speed_duplex to > update speed and duplex when miimon inspect slave link is OK and slave > speed is unknown. bond_update_sleep_duplex() must not be called in atomic context. Ben. > Signed-off-by: Wang Yufen <wangyufen@huawei.com> > --- > drivers/net/bonding/bond_main.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c > index f975696..d288a98 100644 > --- a/drivers/net/bonding/bond_main.c > +++ b/drivers/net/bonding/bond_main.c > @@ -2301,8 +2301,11 @@ static int bond_miimon_inspect(struct bonding *bond) > > switch (slave->link) { > case BOND_LINK_UP: > - if (link_state) > + if (link_state) { > + if (slave->speed == SPEED_UNKNOWN) > + bond_update_speed_duplex(slave); > continue; > + } > > slave->link = BOND_LINK_FAIL; > slave->delay = bond->params.downdelay;
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index f975696..d288a98 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -2301,8 +2301,11 @@ static int bond_miimon_inspect(struct bonding *bond) switch (slave->link) { case BOND_LINK_UP: - if (link_state) + if (link_state) { + if (slave->speed == SPEED_UNKNOWN) + bond_update_speed_duplex(slave); continue; + } slave->link = BOND_LINK_FAIL; slave->delay = bond->params.downdelay;