Message ID | alpine.DEB.2.00.1009221631520.32661@router.home |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Christoph Lameter <cl@linux.com> wrote on 09/22/2010 02:33:14 PM:

> This can address issues where joins are slow because the initial join is
> frequently lost.
>
> Also increment the frequency so that we get a 10 reports send over a
> few seconds.

Except you want to conform and not conform at the same time. :-)

IGMPv2 should be: default count 2, interval 10secs
IGMPv3 should be: default count 2, interval 1sec

...and no way is it a good idea to send 10 unsolicited reports on an Ethernet.

I think system-wide defaults must be as suggested (which allows for v3 being shortened to 1sec, but not v2) and if you want to use longer values, you should have either a *per-interface* tunable [ie, the default value for your interface only] or make these per-interface variables and have the IB code bump them up for IB interfaces only. An attached Ethernet on the same system shouldn't be using larger values unless bumped for some reason by an administrator.

There is no problem with current values on Ethernet; let's not create one. :-)

+-DLS

--
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
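The per-interface arrangement suggested above might look roughly like the sketch below. It is illustrative only: `unsol_report_cfg`, `unsol_default`, `unsol_for_iface`, and the `is_ib` flag are hypothetical names that do not exist in the kernel. The default values are the ones quoted in the message (count 2, 10 s for IGMPv2, 1 s for IGMPv3).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-interface settings; not a kernel structure. */
struct unsol_report_cfg {
    unsigned int count;        /* unsolicited reports sent per join */
    unsigned int interval_ms;  /* maximum spacing between reports */
};

enum igmp_ver { IGMP_VER_2 = 2, IGMP_VER_3 = 3 };

/* System-wide defaults as proposed in the thread:
 * IGMPv2: 2 reports, 10 s interval; IGMPv3: 2 reports, 1 s interval. */
static struct unsol_report_cfg unsol_default(enum igmp_ver v)
{
    struct unsol_report_cfg c;

    assert(v == IGMP_VER_2 || v == IGMP_VER_3);
    c.count = 2;
    c.interval_ms = (v == IGMP_VER_3) ? 1000 : 10000;
    return c;
}

/* Only an IB interface (or an administrator acting on one interface)
 * raises the count; other interfaces keep the system-wide default. */
static struct unsol_report_cfg unsol_for_iface(enum igmp_ver v, bool is_ib,
                                               unsigned int ib_count)
{
    struct unsol_report_cfg c = unsol_default(v);

    if (is_ib && ib_count > c.count)
        c.count = ib_count;
    return c;
}
```

With this shape, an Ethernet interface on the same host keeps count 2 even when an IB interface is bumped to 10.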
On Wed, 22 Sep 2010, David Stevens wrote:

> > Also increment the frequency so that we get a 10 reports send over a
> > few seconds.
>
> Except you want to conform and not conform at the same time. :-)
> IGMPv2 should be: default count 2, interval 10secs
> IGMPv3 should be: default count 2, interval 1sec

This is during the period of unsolicited igmp reports. We do not know if this group is managed using V3 or V2 since no igmp query/report has been received yet.

> ...and no way is it a good idea to send 10 unsolicited reports on an
> Ethernet.

Why would that be an issue? The IGMPv2 RFC has no strict limit and RFC3376 mentions that the retransmission occurs "Robustness Variable" times minus one. Choosing 10 for the "Robustness Variable" is certainly ok.

If we do not increase the number of reports but just limit the interval then the chance of outages of a second or so during mc group creation causing routers missing igmp reports is significantly increased.
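The outage argument can be made concrete with a rough upper bound. The sketch below assumes each report follows the previous one by at most one interval, so all reports fall inside a window of `count * interval`. The function names are hypothetical, and the actual kernel timing (which randomizes delays) is not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Upper bound on the time span covered by the unsolicited reports,
 * assuming each report is delayed by at most interval_ms. */
static unsigned int max_report_window_ms(unsigned int count,
                                         unsigned int interval_ms)
{
    assert(count > 0 && interval_ms > 0);
    return count * interval_ms;
}

/* An outage that starts at join time and lasts at least as long as the
 * window is certain to swallow every report under the assumption above. */
static bool outage_covers_all_reports(unsigned int outage_ms,
                                      unsigned int count,
                                      unsigned int interval_ms)
{
    return outage_ms >= max_report_window_ms(count, interval_ms);
}
```

Under this bound, 2 reports at 10 s give a window of up to 20 s, 10 reports at 1 s give up to 10 s, and 2 reports at 1 s give at most 2 s, which is the case the message worries about.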
Christoph Lameter <cl@linux.com> wrote on 09/23/2010 08:37:48 AM:

> On Wed, 22 Sep 2010, David Stevens wrote:
>
> > > Also increment the frequency so that we get a 10 reports send over a
> > > few seconds.
> >
> > Except you want to conform and not conform at the same time. :-)
> > IGMPv2 should be: default count 2, interval 10secs
> > IGMPv3 should be: default count 2, interval 1sec
>
> This is during the period of unsolicited igmp reports. We do not know if
> this group is managed using V3 or V2 since no igmp query/report has been
> received yet.

The default is IGMPv3 unless a v2 querier is present. You can force it to be IGMPv2 by having an IGMPv2 querier on the network or by using the force_igmp_version tunable.

> > ...and no way is it a good idea to send 10 unsolicited reports on an
> > Ethernet.
>
> Why would that be an issue?

Because the traffic for all joins is multiplied by >3. If you're joining 1 group, maybe that wouldn't be an issue, but what if I join 100, and what if hundreds of other hosts on that network do too? And applications that dynamically join and leave groups may do this "normally." Even 3 reports on switched networks with low loss is really unnecessary overkill; 10 is just wasted bandwidth.

> The IGMPv2 RFC has no strict limit and RFC3376
> mentions that the retransmission occurs "Robustness Variable" times
> minus one. Choosing 10 for the "Robustness Variable" is certainly ok.

Both of them specify the default value and say a querier is the mechanism for changing that. If you want to follow the RFC, the default is "2", not "10." While it'd be reasonable for a sysadmin to tune this per-interface without a querier, it's not reasonable to make all linux systems on all networks more than triple the number of reports they send from the RFC-specified default. Right?!? :-)

> If we do not increase the number of reports but just limit the interval
> then the chance of outages of a second or so during mc group creation
> causing routers missing igmp reports is significantly increased.

If you can't send on a group for 1 second, all of the initial IGMPv3 reports will be lost about half of the time if we make that conformant (it looks like it now uses the 10sec v2 time instead of the 1 sec v3 time it should). That's a problem IB needs to solve.

Ideally, you wouldn't want to return from the hardware join until you can actually send the reports, but I expect there are locks held and that can't be 1 second of spinning on a processor. So, I think you really should put a queue in IB for that hardware multicast address and send those packets when/if you get positive acknowledgement (much as done for ARP completion, but maybe queue more than 1) from the fabric that you can use it.

If you don't get any sort of ACK for that, then you can instrument a delay for it, but any fixed number you use may be either too big or too small for a particular fabric.

+-DLS
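The queueing idea in the last paragraphs could be sketched as below. This is a user-space illustration, not IB driver code: `pending_queue`, `pq_send`, `pq_join_acked`, the `PENDING_MAX` bound, and the `count_xmit` stand-in transmit callback are all hypothetical. Packets sent before the fabric acknowledges the join are held in a small bounded queue and flushed in order once the acknowledgement arrives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PENDING_MAX 4  /* hypothetical bound on queued packets */

typedef void (*xmit_fn)(const void *pkt, void *ctx);

struct pending_queue {
    const void *pkt[PENDING_MAX];
    size_t len;
    bool joined;  /* true once the fabric has acknowledged the join */
};

static void pq_init(struct pending_queue *q)
{
    q->len = 0;
    q->joined = false;
}

/* Send immediately if the join is complete, otherwise hold the packet.
 * Returns false only when the packet is dropped because the queue is full. */
static bool pq_send(struct pending_queue *q, const void *pkt,
                    xmit_fn xmit, void *ctx)
{
    assert(q != NULL && xmit != NULL);
    if (q->joined) {
        xmit(pkt, ctx);
        return true;
    }
    if (q->len == PENDING_MAX)
        return false;
    q->pkt[q->len++] = pkt;
    return true;
}

/* Called on positive acknowledgement: flush held packets in FIFO order.
 * Returns the number of packets flushed. */
static size_t pq_join_acked(struct pending_queue *q, xmit_fn xmit, void *ctx)
{
    size_t i, n = q->len;

    q->joined = true;
    for (i = 0; i < n; i++)
        xmit(q->pkt[i], ctx);
    q->len = 0;
    return n;
}

/* Stand-in transmit callback that only counts packets. */
static void count_xmit(const void *pkt, void *ctx)
{
    (void)pkt;
    ++*(unsigned int *)ctx;
}
```

A real implementation would also need locking and a timeout for the no-ACK case mentioned above; both are left out here.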
Index: linux-2.6/net/ipv4/igmp.c
===================================================================
--- linux-2.6.orig/net/ipv4/igmp.c 2010-09-22 16:28:17.000000000 -0500
+++ linux-2.6/net/ipv4/igmp.c 2010-09-22 16:28:54.000000000 -0500
@@ -114,9 +114,9 @@
 
 #define IGMP_V1_Router_Present_Timeout (400*HZ)
 #define IGMP_V2_Router_Present_Timeout (400*HZ)
-#define IGMP_Unsolicited_Report_Interval (10*HZ)
+#define IGMP_Unsolicited_Report_Interval (HZ)
 #define IGMP_Query_Response_Interval (10*HZ)
-#define IGMP_Unsolicited_Report_Count 2
+#define IGMP_Unsolicited_Report_Count 10
 
 
 #define IGMP_Initial_Report_Delay (1)
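To put the count change in this patch into numbers, the following back-of-the-envelope sketch estimates the unsolicited report traffic on one segment. `unsolicited_reports` is a hypothetical helper, and it assumes every host joins the same number of groups once, with one report packet per group per transmission.

```c
#include <assert.h>

/* Total unsolicited reports on a segment:
 * hosts * groups joined per host * reports sent per join. */
static unsigned long unsolicited_reports(unsigned long hosts,
                                         unsigned long groups_per_host,
                                         unsigned long count)
{
    assert(count > 0);
    return hosts * groups_per_host * count;
}
```

For an assumed 200 hosts each joining 100 groups, `unsolicited_reports(200, 100, 2)` gives 40000 reports with the old count, and `unsolicited_reports(200, 100, 10)` gives 200000 with the patched count.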