Message ID | alpine.DEB.1.10.0904161035390.19650@qirst.com |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
This isn't what I suggested-- you have the default backwards. It must default to current behavior, or it's pointless. The text you have with it is overstated, too. Of course applications using your model can still receive unexpected data-- it does not reserve the port or multicast address to just your sender or to multicast traffic alone. My suggestion is to do nothing. :-) But if that's too difficult, an alternative would be a socket option that delivers traffic for joined groups only and defaults off. In fact, it'd probably be most useful if it also prevents unicast traffic for sockets using that port, too. None of these things have the magic effect of preventing unwanted data delivery, but it'd allow you to receive multiple, specific groups on a single socket with just the joins to indicate which. +-DLS netdev-owner@vger.kernel.org wrote on 04/16/2009 07:38:23 AM: > Do what David Stevens suggest: Add a per socket option > > > > Subject: Multicast: Filter Multicast traffic per socket mc_list > > If two processes open the same port as a multicast socket and then > join two different multicast groups then traffic for both multicast groups > is forwarded to either process. This means that application will get surprising > data that they did not ask for. Applications will have to filter these out in > order to work correctly if multiple apps run on the same system. > > These are pretty strange semantics but they have been around since the > beginning of multicast support on Unix systems. Most of the other operating > systems supporting Multicast have since changed to only supplying multicast > traffic to a socket that was selected through multicast join operations. > > This patch does change Linux to behave in the same way. But there may be > applications that rely on the old behavior. 
Therefore we provide a means > to switch back to the old behavior using a new multicast socket option > > IP_MULTICAST_ALL > > If set then all multicast traffic to the port is forwarded to the socket > (additional constraints are the SSM inclusion and exclusion lists!). > If not set (default) then only traffic for multicast groups that were > joined by thesocket is received. > > Signed-off-by: Christoph Lameter <cl@linux.com> > > --- > include/linux/in.h | 1 + > include/net/inet_sock.h | 3 ++- > net/ipv4/igmp.c | 4 ++-- > net/ipv4/ip_sockglue.c | 11 +++++++++++ > 4 files changed, 16 insertions(+), 3 deletions(-) > > Index: linux-2.6/include/net/inet_sock.h > =================================================================== > --- linux-2.6.orig/include/net/inet_sock.h 2009-04-16 08:59:20.000000000 -0500 > +++ linux-2.6/include/net/inet_sock.h 2009-04-16 09:04:47.000000000 -0500 > @@ -130,7 +130,8 @@ struct inet_sock { > freebind:1, > hdrincl:1, > mc_loop:1, > - transparent:1; > + transparent:1, > + mc_all:1; > int mc_index; > __be32 mc_addr; > struct ip_mc_socklist *mc_list; > Index: linux-2.6/net/ipv4/igmp.c > =================================================================== > --- linux-2.6.orig/net/ipv4/igmp.c 2009-04-16 08:54:47.000000000 -0500 > +++ linux-2.6/net/ipv4/igmp.c 2009-04-16 09:04:06.000000000 -0500 > @@ -2187,7 +2187,7 @@ int ip_mc_sf_allow(struct sock *sk, __be > struct ip_sf_socklist *psl; > int i; > > - if (!ipv4_is_multicast(loc_addr)) > + if (ipv4_is_lbcast(loc_addr) || !ipv4_is_multicast(loc_addr)) > return 1; > > for (pmc=inet->mc_list; pmc; pmc=pmc->next) { > @@ -2196,7 +2196,7 @@ int ip_mc_sf_allow(struct sock *sk, __be > break; > } > if (!pmc) > - return 1; > + return inet->mc_all; > psl = pmc->sflist; > if (!psl) > return pmc->sfmode == MCAST_EXCLUDE; > Index: linux-2.6/include/linux/in.h > =================================================================== > --- linux-2.6.orig/include/linux/in.h 2009-04-16 09:05:41.000000000 -0500 > 
+++ linux-2.6/include/linux/in.h 2009-04-16 09:32:52.000000000 -0500 > @@ -107,6 +107,7 @@ struct in_addr { > #define MCAST_JOIN_SOURCE_GROUP 46 > #define MCAST_LEAVE_SOURCE_GROUP 47 > #define MCAST_MSFILTER 48 > +#define IP_MULTICAST_ALL 49 > > #define MCAST_EXCLUDE 0 > #define MCAST_INCLUDE 1 > Index: linux-2.6/net/ipv4/ip_sockglue.c > =================================================================== > --- linux-2.6.orig/net/ipv4/ip_sockglue.c 2009-04-16 09:09:52.000000000 -0500 > +++ linux-2.6/net/ipv4/ip_sockglue.c 2009-04-16 09:31:40.000000000 -0500 > @@ -449,6 +449,7 @@ static int do_ip_setsockopt(struct sock > (1<<IP_ROUTER_ALERT) | (1<<IP_FREEBIND) | > (1<<IP_PASSSEC) | (1<<IP_TRANSPARENT))) || > optname == IP_MULTICAST_TTL || > + optname == IP_MULTICAST_ALL || > optname == IP_MULTICAST_LOOP || > optname == IP_RECVORIGDSTADDR) { > if (optlen >= sizeof(int)) { > @@ -895,6 +896,13 @@ static int do_ip_setsockopt(struct sock > kfree(gsf); > break; > } > + case IP_MULTICAST_ALL: > + if (optlen<1) > + goto e_inval; > + if (val != 0 && val != 1) > + goto e_inval; > + inet->mc_all = val; > + break; > case IP_ROUTER_ALERT: > err = ip_ra_control(sk, val ? 1 : 0, NULL); > break; > @@ -1147,6 +1155,9 @@ static int do_ip_getsockopt(struct sock > release_sock(sk); > return err; > } > + case IP_MULTICAST_ALL: > + val = inet->mc_all; > + break; > case IP_PKTOPTIONS: > { > struct msghdr msg; > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Apr 16, 2009 at 10:38:23AM -0400, Christoph Lameter wrote:
> Do what David Stevens suggest: Add a per socket option
>
> Subject: Multicast: Filter Multicast traffic per socket mc_list
>
> If two processes open the same port as a multicast socket and then
> join two different multicast groups then traffic for both multicast groups
> is forwarded to either process. This means that application will get surprising
> data that they did not ask for. Applications will have to filter these out in
> order to work correctly if multiple apps run on the same system.
>
> These are pretty strange semantics but they have been around since the
> beginning of multicast support on Unix systems. Most of the other operating
> systems supporting Multicast have since changed to only supplying multicast
> traffic to a socket that was selected through multicast join operations.
>
> This patch does change Linux to behave in the same way. But there may be
> applications that rely on the old behavior. Therefore we provide a means
> to switch back to the old behavior using a new multicast socket option
>
> IP_MULTICAST_ALL
>
> If set then all multicast traffic to the port is forwarded to the socket
> (additional constraints are the SSM inclusion and exclusion lists!).
> If not set (default) then only traffic for multicast groups that were
> joined by the socket is received.

I think your comment is reversed here, isn't it? The default you have below is that mc_all is set, which defaults you to the existing behavior, rather than the new behavior introduced by this patch.

Ack to the patch though.

Acked-by: Neil Horman <nhorman@tuxdriver.com>

Neil
Christoph Lameter wrote: > Do what David Stevens suggest: Add a per socket option > > > > Subject: Multicast: Filter Multicast traffic per socket mc_list > > If two processes open the same port as a multicast socket and then > join two different multicast groups then traffic for both multicast groups > is forwarded to either process. This means that application will get surprising > data that they did not ask for. Applications will have to filter these out in > order to work correctly if multiple apps run on the same system. > > These are pretty strange semantics but they have been around since the > beginning of multicast support on Unix systems. Most of the other operating > systems supporting Multicast have since changed to only supplying multicast > traffic to a socket that was selected through multicast join operations. > > This patch does change Linux to behave in the same way. But there may be > applications that rely on the old behavior. Therefore we provide a means > to switch back to the old behavior using a new multicast socket option > > IP_MULTICAST_ALL > > If set then all multicast traffic to the port is forwarded to the socket > (additional constraints are the SSM inclusion and exclusion lists!). > If not set (default) then only traffic for multicast groups that were > joined by thesocket is received. 
> > Signed-off-by: Christoph Lameter <cl@linux.com> > > --- > include/linux/in.h | 1 + > include/net/inet_sock.h | 3 ++- > net/ipv4/igmp.c | 4 ++-- > net/ipv4/ip_sockglue.c | 11 +++++++++++ > 4 files changed, 16 insertions(+), 3 deletions(-) > > Index: linux-2.6/include/net/inet_sock.h > =================================================================== > --- linux-2.6.orig/include/net/inet_sock.h 2009-04-16 08:59:20.000000000 -0500 > +++ linux-2.6/include/net/inet_sock.h 2009-04-16 09:04:47.000000000 -0500 > @@ -130,7 +130,8 @@ struct inet_sock { > freebind:1, > hdrincl:1, > mc_loop:1, > - transparent:1; > + transparent:1, > + mc_all:1; > int mc_index; > __be32 mc_addr; > struct ip_mc_socklist *mc_list; > Index: linux-2.6/net/ipv4/igmp.c > =================================================================== > --- linux-2.6.orig/net/ipv4/igmp.c 2009-04-16 08:54:47.000000000 -0500 > +++ linux-2.6/net/ipv4/igmp.c 2009-04-16 09:04:06.000000000 -0500 > @@ -2187,7 +2187,7 @@ int ip_mc_sf_allow(struct sock *sk, __be > struct ip_sf_socklist *psl; > int i; > > - if (!ipv4_is_multicast(loc_addr)) > + if (ipv4_is_lbcast(loc_addr) || !ipv4_is_multicast(loc_addr)) > return 1; I don't think this change is needed. ipv4_is_lbcast() checks if the address is 255.255.255.255. That address is already !ipv4_is_multicast(). Subnet broadcasts are also !ipv4_is_multicast. 
> > for (pmc=inet->mc_list; pmc; pmc=pmc->next) { > @@ -2196,7 +2196,7 @@ int ip_mc_sf_allow(struct sock *sk, __be > break; > } > if (!pmc) > - return 1; > + return inet->mc_all; > psl = pmc->sflist; > if (!psl) > return pmc->sfmode == MCAST_EXCLUDE; > Index: linux-2.6/include/linux/in.h > =================================================================== > --- linux-2.6.orig/include/linux/in.h 2009-04-16 09:05:41.000000000 -0500 > +++ linux-2.6/include/linux/in.h 2009-04-16 09:32:52.000000000 -0500 > @@ -107,6 +107,7 @@ struct in_addr { > #define MCAST_JOIN_SOURCE_GROUP 46 > #define MCAST_LEAVE_SOURCE_GROUP 47 > #define MCAST_MSFILTER 48 > +#define IP_MULTICAST_ALL 49 > > #define MCAST_EXCLUDE 0 > #define MCAST_INCLUDE 1 > Index: linux-2.6/net/ipv4/ip_sockglue.c > =================================================================== > --- linux-2.6.orig/net/ipv4/ip_sockglue.c 2009-04-16 09:09:52.000000000 -0500 > +++ linux-2.6/net/ipv4/ip_sockglue.c 2009-04-16 09:31:40.000000000 -0500 > @@ -449,6 +449,7 @@ static int do_ip_setsockopt(struct sock > (1<<IP_ROUTER_ALERT) | (1<<IP_FREEBIND) | > (1<<IP_PASSSEC) | (1<<IP_TRANSPARENT))) || > optname == IP_MULTICAST_TTL || > + optname == IP_MULTICAST_ALL || > optname == IP_MULTICAST_LOOP || > optname == IP_RECVORIGDSTADDR) { > if (optlen >= sizeof(int)) { > @@ -895,6 +896,13 @@ static int do_ip_setsockopt(struct sock > kfree(gsf); > break; > } > + case IP_MULTICAST_ALL: > + if (optlen<1) > + goto e_inval; > + if (val != 0 && val != 1) > + goto e_inval; > + inet->mc_all = val; > + break; > case IP_ROUTER_ALERT: > err = ip_ra_control(sk, val ? 1 : 0, NULL); > break; > @@ -1147,6 +1155,9 @@ static int do_ip_getsockopt(struct sock > release_sock(sk); > return err; > } > + case IP_MULTICAST_ALL: > + val = inet->mc_all; > + break; > case IP_PKTOPTIONS: > { > struct msghdr msg; You might need to set inet->mc_all to 1 in inet_create() since I am not sure if we want to change the default behavior. 
The knowledge that some apps have a very "unique" way of doing multicast makes me a little hesitant.

-vlad
On Thu, 16 Apr 2009, David Stevens wrote:

> This isn't what I suggested-- you have the default backwards. It must
> default to current behavior, or it's pointless.

If it defaulted to the current behavior, it would be incompatible with the behavior of other operating systems, and the surprising behavior of the Linux multicast stack would continue to exist. The unusual behavior needs to be switched on if wanted for legacy or other reasons.

> The text you have with it is overstated, too. Of course applications using
> your model can still receive unexpected data-- it does not reserve the
> port or multicast address to just your sender or to multicast traffic
> alone.

The application will no longer receive traffic from multicast groups that it did not subscribe to. Yes, unicast can still result in unexpected traffic.
On Thu, 16 Apr 2009, Neil Horman wrote:

> I think your comment is reversed here isn't it? the default you have below is
> that mc_all is set, which defaults you to the existing behavior, rather than the
> new behavior introduced by this patch.

mc_all is 0 by default.

> Ack to the patch though
> Acked-by: Neil Horman <nhorman@tuxdriver.com>
> Neil

Thanks.
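For readers following the argument about the default, the effect of the one-line change in ip_mc_sf_allow() can be sketched as a small user-space model. This is not the kernel code: the struct and function names below are invented for illustration, addresses are in host byte order, and the source-filter lists (sflist/sfmode) are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-socket state the patch touches. */
struct model_sock {
    const uint32_t *joined;   /* groups joined on this socket (host byte order) */
    size_t          njoined;
    bool            mc_all;   /* models inet->mc_all */
};

/* True if addr (host byte order) lies in 224.0.0.0/4. */
static bool model_is_multicast(uint32_t addr)
{
    return (addr & 0xf0000000u) == 0xe0000000u;
}

/* Models the patched decision: deliver a datagram addressed to dst? */
static bool model_mc_allow(const struct model_sock *sk, uint32_t dst)
{
    assert(sk != NULL);
    if (!model_is_multicast(dst))
        return true;              /* unicast and broadcast are not filtered here */
    for (size_t i = 0; i < sk->njoined; i++)
        if (sk->joined[i] == dst)
            return true;          /* joined group (source filters omitted) */
    return sk->mc_all;            /* not joined: the unpatched code returns 1 here */
}
```

With mc_all left at 0, a socket that joined only 239.1.1.1 stops seeing 239.1.1.2 on the same port; with mc_all set to 1 the model behaves like the unpatched kernel.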
On Thu, 16 Apr 2009, Vlad Yasevich wrote:

> > - if (!ipv4_is_multicast(loc_addr))
> > + if (ipv4_is_lbcast(loc_addr) || !ipv4_is_multicast(loc_addr))
> > return 1;
>
> I don't think this change is needed. ipv4_is_lbcast() checks if the
> address is 255.255.255.255. That address is already !ipv4_is_multicast().
>
> Subnet broadcasts are also !ipv4_is_multicast.

OK, will drop this.

> > {
> > struct msghdr msg;
>
> You might need to set inet->mc_all to 1 in inet_create() since I am not sure if
> we want to change the default behavior. The knowledge that some apps have
> a very "unique" way of doing multicast makes me a little hesitant.

Those "unique" applications would only be able to run on Linux. Applications are mostly written for multiple Unix variants. Since the other Unix variants have changed their default behavior, it is reasonable to also change the default under Linux.
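Vlad's point about the redundant check can be verified with a few lines of C. The two helpers below are user-space re-implementations written for this note (the names are not the kernel's), assuming the usual definitions: multicast means 224.0.0.0/4, and limited broadcast means exactly 255.255.255.255.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Both helpers take an IPv4 address in network byte order. */
static bool is_multicast_be(uint32_t addr)
{
    return (addr & htonl(0xf0000000u)) == htonl(0xe0000000u);
}

static bool is_lbcast_be(uint32_t addr)
{
    return addr == htonl(0xffffffffu);   /* INADDR_BROADCAST */
}

/* The test in ip_mc_sf_allow() before and after the proposed change. */
static bool cond_old(uint32_t a) { return !is_multicast_be(a); }
static bool cond_new(uint32_t a) { return is_lbcast_be(a) || !is_multicast_be(a); }

/* Returns true if the two conditions agree for the given address. */
static bool conditions_agree(uint32_t a)
{
    assert(!(is_lbcast_be(a) && is_multicast_be(a)));  /* 0xff... is outside 0xe.../4 */
    return cond_old(a) == cond_new(a);
}
```

Because 255.255.255.255 has a top nibble of 0xf rather than 0xe, it already fails the multicast test, so the extra ipv4_is_lbcast() term never changes the result.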
On Thu, Apr 16, 2009 at 11:36:56AM -0400, Christoph Lameter wrote:
> On Thu, 16 Apr 2009, Neil Horman wrote:
>
> > I think your comment is reversed here isn't it? the default you have below is
> > that mc_all is set, which defaults you to the existing behavior, rather than the
> > new behavior introduced by this patch.
>
> mc_all is 0 by default.
>
> > Ack to the patch though
> > Acked-by: Neil Horman <nhorman@tuxdriver.com>
> > Neil
>
> Thanks.

I'm sorry, I misread it (I confused the definition of a bitfield with its default value). As Dave noted, the default needs to be the current behavior, not your new behavior. Until that's changed, I rescind my Ack.

Neil
On Thu, 16 Apr 2009, Neil Horman wrote:

> I'm sorry, I misread it (confused the definition of a bitfield with its default
> value). As Dave noted, the default needs to be the current behavior, not your
> new behavior. Until that's changed, I rescind my Ack

Well, I guess then we need the global proc setting after all. With the current misbehavior as the default, applications need to be rebuilt, and source code that now runs on multiple OSes would have to be customized with a special case for Linux.

So add a global proc setting to determine the initial setting of IP_MULTICAST_ALL?
> Well guess then we need the global proc setting after all. With the
> current misbehavior as a default applications need to be rebuilt and

The current behavior, as either your or Vlad's RFC quotes pointed out, along with the history that goes with it, has been exactly the expected behavior for decades. I think it is not misbehavior so much as your misconception, though a common one.

> source code that is running on multiple OSes now would have to customized
> to special case for Linux.

No, actually. If you write it for the current behavior, it'll work fine on an OS like Solaris that has departed from the original socket behavior. If you're sloppy and don't handle unexpected traffic, it'll be wrong on both-- you just won't know it until someone runs something with the same port and multicast address on your network and wrecks your app.

> So add a global proc setting to determine the initial setting of IP_MULTICAST_ALL?

This breaks unknown existing applications that are correctly written. I think it's clearly wrong to change the behavior of someone else's socket to match your idea of how it should've been done 25 years too late. An option that enables new behavior for your own socket, which must be a new app, is fine. Adding a socket option as part of a port is no great hurdle, and I'm guessing you aren't trying to run a Solaris binary on Linux. So what's the problem?

+-DLS
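One way an application can "handle unexpected traffic" on any of the systems discussed is to ask for the destination address of each datagram and discard what it did not want. The sketch below is only an illustration under assumptions: it uses the Linux IP_PKTINFO ancillary data, and the helper names (enable_pktinfo, dest_is_wanted, recv_checked) are made up for this note. Other systems expose the destination address through different options.

```c
#define _GNU_SOURCE            /* for struct in_pktinfo on glibc */
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Ask the kernel to attach the destination address to each datagram. */
static int enable_pktinfo(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &on, sizeof(on));
}

/* True if dst is one of the groups this application cares about. */
static bool dest_is_wanted(struct in_addr dst,
                           const struct in_addr *groups, size_t ngroups)
{
    assert(groups != NULL || ngroups == 0);
    for (size_t i = 0; i < ngroups; i++)
        if (groups[i].s_addr == dst.s_addr)
            return true;
    return false;
}

/*
 * Receive one datagram. Returns its length (or -1 on error) and sets
 * *wanted to false when the datagram was addressed to something other
 * than the listed groups, so the caller can ignore it and read again.
 */
static ssize_t recv_checked(int fd, void *buf, size_t len,
                            const struct in_addr *groups, size_t ngroups,
                            bool *wanted)
{
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
    } ctl;
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
                          .msg_control = ctl.buf,
                          .msg_controllen = sizeof(ctl.buf) };

    *wanted = false;              /* no destination info: treat as unwanted */
    ssize_t n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return -1;
    for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_PKTINFO) {
            struct in_pktinfo pi;
            memcpy(&pi, CMSG_DATA(c), sizeof(pi));
            *wanted = dest_is_wanted(pi.ipi_addr, groups, ngroups);
            break;
        }
    }
    return n;
}
```

A receiver written this way does not depend on which delivery model the kernel implements, which is the portability argument being made in this message.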
On Thu, 16 Apr 2009, David Stevens wrote:

> must be a new app, is fine. Adding a socket option as part of a port
> is no great hurdle, and I'm guessing you aren't trying to run a Solaris
> binary on Linux. So what's the problem?

I guess it's the obvious: software should run on multiple OSes without too much special casing. Linux is the only special case that I am aware of that misbehaves.

Adding a socket is no easy thing given the architecture of the software (and of other software) that did not consider that Linux faithfully replicates bugs from 25 years ago that no longer exist in other OSes.

I cannot imagine there is much software out there that relies on this strange behavior. Otherwise the software would not work on various other platforms.

Can you give us a list of products that verifiably rely on the current behavior?
David Stevens wrote: >> Well guess then we need the global proc setting after all. With the >> current misbehavior as a default applications need to be rebuilt and > > The current behavior, as either your or Vlad's RFC quotes pointed > out as easily as the history to go with it, is exactly the expected behavior > for decades. I think it is not misbehavior so much as your misconception, > though a common one. > What seems to be happening though, is that there is an expectation that this behavior would change with advent of IGMPv3, which adds the additional filtering text. Now, we could point out that there is no normative text that requires this filtering on groups, only on sources, but the expectation is still there. >> source code that is running on multiple OSes now would have to customized >> to special case for Linux. > > No, actually. If you write it for the current behavior, it'll work > fine on an OS like Solaris that has departed from the original socket > behavior. If you're sloppy and don't handle unexpected traffic, it'll be > wrong on both-- you just won't know it until someone runs something with > the same port and multicast address on your network and wrecks your app. I'd have to reluctantly agree here. Any application that expects original multicast behavior will be broken by a system-wide change. I think existing applications have already figured out all the workarounds they need. > >> So add a global proc setting to determine the initial setting of > IP_MULTICAST_ALL? > > This breaks unknown existing applications that are correctly > written. I think it's clearly wrong to change the behavior of someone > else's socket to match your idea of how it should've been done 25 years > too late. An option that enables new behavior for your own socket, which > must be a new app, is fine. Adding a socket option as part of a port > is no great hurdle, and I'm guessing you aren't trying to run a Solaris > binary on Linux. So what's the problem? 
>
> +-DLS

I wonder how BSD and Solaris got away with it? They both filter on multicast groups and source addresses. This is not meant as rhetorical or provocative, just genuinely wondering.

-vlad
Christoph Lameter <cl@linux.com> wrote on 04/16/2009 02:04:30 PM: > Guess its the obvious: Software should run on multiple OSes without > too much special casing. Linux is the only special case that I am aware of > that misbehaves. All flavors of UNIX did it this way originally. I never tried it on Windows. I heard years ago when Solaris changed their behavior and it's been reported in this thread that current BSD does, too. But, again, this is not in the least misbehavior. It simply doesn't follow your model of how you thought it behaved. Linux does exactly what Steve Deering wanted multicasting to do when he wrote the RFC for it. It adds an address on the interface, and the binding determines whether it's delivered to a particular socket or not. That is the "ANY" in INADDR_ANY, just like unicasting. If you want particular addresses only, the bind system call does that already. It makes perfect sense to me. > Adding a socket is no easy thing given the architecture of the software > (and of other software) that did not consider that Linux faithfully > replicating bugs from 25 years ago that no longer exist in other OSes. I don't have any say in what other OSes do, but I'd call it a bug in them, too. > Cannot imagine there to be too much software out there that relies on this > strange behavior. Otherwise the software would not work on various other > platforms. I don't know the extent of your survey, but Linux legacy is the problem with changing the default behavior for sockets other than your app. You don't need any special code at all-- write them all to assume they may receive packets not for them, because they are broken if they don't. That works fine on Solaris, too. > Can you give us a list of products that verifiably rely on the current > behavior? I don't do app surveys any more than you do OS surveys. But I don't want to change the semantics of multicast sockets and you do. Can you guarantee nothing will break from this change? 
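The remark that "the bind system call does that already" can be illustrated as follows. This is a minimal sketch, not taken from the thread: the helper names are hypothetical, the join uses the default interface, and error handling is reduced to return codes. The idea is to bind the socket to the group address itself instead of INADDR_ANY.

```c
#define _GNU_SOURCE            /* for struct ip_mreq on glibc */
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Build the address to bind to: the group itself rather than INADDR_ANY. */
static struct sockaddr_in group_bind_addr(struct in_addr group, unsigned short port)
{
    struct sockaddr_in sa;
    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    sa.sin_addr = group;          /* e.g. 239.1.1.1, not INADDR_ANY */
    return sa;
}

/* Bind fd to group:port, then join the group on the default interface. */
static int bind_and_join(int fd, struct in_addr group, unsigned short port)
{
    assert(fd >= 0);
    struct sockaddr_in sa = group_bind_addr(group, port);
    struct ip_mreq mreq;
    memset(&mreq, 0, sizeof(mreq));
    mreq.imr_multiaddr = group;
    mreq.imr_interface.s_addr = htonl(INADDR_ANY);

    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0)
        return -1;
    return setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));
}
```

The trade-off is one socket per group: a socket bound this way is matched only against datagrams addressed to that group, so an application that wants several groups on a single socket, which is the case Christoph describes, still needs one of the other approaches.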
+-DLS
From: Christoph Lameter <cl@linux.com>
Date: Thu, 16 Apr 2009 11:36:10 -0400 (EDT)

> On Thu, 16 Apr 2009, David Stevens wrote:
>
>> This isn't what I suggested-- you have the default backwards. It must
>> default to current behavior, or it's pointless.
>
> If it would default to the current behavior then it would be incompatible
> with the behavior of other operating systems and the surprising behavior
> of the Linux multicast stack would continue to exist. The unusual behavior
> needs to be switched on if wanted for legacy or other reasons.

Umm, no. We don't break existing applications "by default".

You're being entirely selfish here: you want your application to work without having to specify the socket option to get the new behavior.

Well guess what? Under Linux you will have to!
From: Christoph Lameter <cl@linux.com>
Date: Thu, 16 Apr 2009 15:12:43 -0400 (EDT)

> On Thu, 16 Apr 2009, Neil Horman wrote:
>
>> I'm sorry, I misread it (confused the definition of a bitfield with its default
>> value). As Dave noted, the default needs to be the current behavior, not your
>> new behavior. Until that's changed, I rescind my Ack
>
> Well guess then we need the global proc setting after all.

No Christoph, do this right.

Linux by default will behave the way it has for 15+ years. And if an application wants new behavior, you have to ask for it.

End of story.
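What "asking for it" would look like from the application side, as a hedged sketch: it assumes the option is available under the name and number used in the patch in this thread (IP_MULTICAST_ALL, 49), and the two helper names are invented here. Setting the value explicitly means the application does not care which default the kernel ends up with.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef IP_MULTICAST_ALL
#define IP_MULTICAST_ALL 49      /* value taken from the patch in this thread */
#endif

/* Explicitly request "joined groups only", whatever the default is. */
static int deliver_joined_groups_only(int fd)
{
    assert(fd >= 0);
    int off = 0;
    return setsockopt(fd, IPPROTO_IP, IP_MULTICAST_ALL, &off, sizeof(off));
}

/* Read the option back; returns 0 or 1, or -1 on error. */
static int get_multicast_all(int fd)
{
    int val = -1;
    socklen_t len = sizeof(val);
    if (getsockopt(fd, IPPROTO_IP, IP_MULTICAST_ALL, &val, &len) < 0)
        return -1;
    return val;
}
```

A portable program would wrap the call in `#ifdef IP_MULTICAST_ALL` (or ignore a failure with ENOPROTOOPT) so that the same source still builds and runs on systems that filter by joined group without any option.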
From: Christoph Lameter <cl@linux.com>
Date: Thu, 16 Apr 2009 17:04:30 -0400 (EDT)

> Can you give us a list of products that verifiably rely on the current
> behavior?

Christoph, just drop this. We're not creating a system-wide default selection that backs away from 15+ years of precedent.

Maybe Solaris has so few users that it's OK for them to go down that path, but for us it's unacceptable to do things like this.

Fix your application. And as David noted, it will be not only more robust, but also still work on those "other systems."

So even your "works on all systems" argument is groundless. If you make it work under Linux it will in fact work on all systems, and be more robust in the case of other applications using the same multicast address and port.
From: Vlad Yasevich <vladislav.yasevich@hp.com>
Date: Thu, 16 Apr 2009 17:19:14 -0400

> I wonder how BSD and Solaris got away with it? They both filter on
> multicast groups and source addresses. This is not meant as
> rhetorical or provocative, just genuinely wondering.

Smaller user base.
Vlad Yasevich wrote on 04/16/2009 02:19:14 PM:

> What seems to be happening though, is that there is an expectation that
> this behavior would change with advent of IGMPv3, which adds the additional
> filtering text. Now, we could point out that there is no normative text
> that requires this filtering on groups, only on sources, but the expectation
> is still there.

I have no such expectation. :-) The additional filters are (already) applied per-socket, but existing apps not using source filters behave as they did before IGMPv3. That's what I'd expect.

The RFC you quoted for SSM applies only to the SSM address space, mentions this behavior explicitly as the norm outside of that space, and Linux doesn't support that RFC. If it did, it would include an address range check as part of it.

> I wonder how BSD and Solaris got away with it? They both filter on multicast
> groups and source addresses. This is not meant as rhetorical or provocative,
> just genuinely wondering.

I think in practice, it doesn't come up much. That's why people seem so surprised to learn it works this way, and not the way they thought it did after using it, sometimes for years. But the documentation doesn't say a join limits what you receive on a socket, or that it has to be the same socket you're doing I/O on; people simply assume it.

+-DLS
On Thu, 16 Apr 2009 15:22:49 -0700 David Stevens <dlstevens@us.ibm.com> wrote: > Vlad Yasevich wrote on 04/16/2009 02:19:14 PM: > > > What seems to be happening though, is that there is an expectation that > > this behavior would change with advent of IGMPv3, which adds the > additional > > filtering text. Now, we could point out that there is no normative text > > that requires this filtering on groups, only on sources, but the > expectation > > is still there. > > I have no such expectation. :-) The additional filters are > (already) > applied per-socket, but existing apps not using source filters behave as > they did before IGMPv3. That's what I'd expect. > The RFC you quoted for SSM applies to only the SSM address space, > mentions this behavior explicitly as the norm for outside of that space, > and Linux doesn't support that RFC. If it did, it would include an > address range check as part of it. > > > I wonder how BSD and Solaris got away with it? They both filter on > multicast > > groups and source addresses. This is not meant as rhetorical or > provocative, > > just genuinely wondering. > > I think in practice, it doesn't come up much. That's why people > seem so surprised to learn it works this way, and not the way they > thought it did after using it, sometimes for years. But the documentation > doesn't say a join limits what you receive on a socket, or that it > has to be the same socket you're doing I/O on; people simply assume it. > > +-DLS You could always use packet/socket filter to keep the packets from coming out to user space. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Stevens wrote:
> Vlad Yasevich wrote on 04/16/2009 02:19:14 PM:
>
>> What seems to be happening though, is that there is an expectation that
>> this behavior would change with the advent of IGMPv3, which adds the
>> additional filtering text. Now, we could point out that there is no
>> normative text that requires this filtering on groups, only on sources,
>> but the expectation is still there.
>
>         I have no such expectation. :-) The additional filters are (already)
> applied per-socket, but existing apps not using source filters behave as
> they did before IGMPv3. That's what I'd expect.
>         The RFC you quoted for SSM applies to only the SSM address space,
> mentions this behavior explicitly as the norm for outside of that space,
> and Linux doesn't support that RFC. If it did, it would include an
> address range check as part of it.

Yes, after reading more of the SSM spec, it definitely only applies to SSM
addresses, which we don't support yet.

Just to clear this one item up, I think the expectation comes from the
IGMPv3 spec:

   Filtering of packets based upon a socket's multicast reception state
   is a new feature of this service interface.  The previous service
   interface [RFC1112] described no filtering based upon multicast join
   state; rather, a join on a socket simply caused the host to join a
   group on the given interface, and packets destined for that group
   could be delivered to all sockets whether they had joined or not.

It could be inferred from this rather vague text that, in addition to
source filtering, group filtering should be done. Thus the expectation
that we've been dealing with.

That's the last I'll mention this, since most salient points have been
agreed on.

Thanks
-vlad

>
>> I wonder how BSD and Solaris got away with it? They both filter on
>> multicast groups and source addresses. This is not meant as rhetorical
>> or provocative, just genuinely wondering.
>
>         I think in practice, it doesn't come up much. That's why people
> seem so surprised to learn it works this way, and not the way they
> thought it did after using it, sometimes for years. But the documentation
> doesn't say a join limits what you receive on a socket, or that it
> has to be the same socket you're doing I/O on; people simply assume it.
>
>                                                 +-DLS
>
On Thu, 16 Apr 2009, David Miller wrote:

> No Christoph, do this right.
>
> Linux by default will behave the way it has for 15+ years. And if an
> application wants new behavior, you have to ask for it.
>
> End of story.

This is not right. All other OSes filter multicast traffic according to
the multicast groups subscribed to (and that includes the evil one).
There is no requirement of asking for "new" behavior. Why should multicast
applications have to add special code to request something that comes by
default on other platforms?

The old behavior does not seem to be usable anyway, and it certainly looks
buggy if multicast packets are duplicated by the kernel and sent to
applications that never asked for them.

An OS should do the sane thing by default, not only if someone asks for it.
Christoph Lameter wrote:
> On Thu, 16 Apr 2009, David Miller wrote:
>
>> No Christoph, do this right.
>>
>> Linux by default will behave the way it has for 15+ years. And if an
>> application wants new behavior, you have to ask for it.
>>
>> End of story.
>
> This is not right. All other OSes filter multicast traffic according to
> the multicast groups subscribed to (and that includes the evil one).
> There is no requirement of asking for "new" behavior. Why should multicast
> applications have to add special code to request something that comes by
> default on other platforms?

I need the current behaviour to not change, as it would
break some people I support. DaveM is making the right
decision here, and I fully support this.

And I'm one of those people working on low latency and
hoping messaging clients get better in their multicast
usage. It's just that this is not one of those ways.

Ideally, you could tweak an OS environment configuration
setting, if you don't want per socket. But it cannot
be the default.

thanks,
Nivedita
On Fri, 17 Apr 2009, Nivedita Singhvi wrote:

> I need the current behaviour to not change, as it would
> break some people I support. DaveM is making the right
> decision here, and I fully support this.

People or applications? There are applications that only run on Linux and
fail on other OSes? How does this work? Special casing depending on the OS
running?

> Ideally, you could tweak OS environment configuration
> setting, if you don't want per socket. But it cannot
> be the default.

Would you support an additional OS config variable that would set the
default for socket operations? Then we could have a per-socket option that
would allow overriding the OS config variable?
Christoph Lameter wrote:

>> Ideally, you could tweak OS environment configuration
>> setting, if you don't want per socket. But it cannot
>> be the default.
>
> Would you support an additional OS config variable that would set the
> default for socket operations? Then we could have a per socket option that
> would allow overriding the OS config variable?

That would be my choice personally, because it would be
easier than scripting some solution to modify potentially
hundreds of sockets on a system...

Does that sound acceptable?

thanks,
Nivedita
netdev-owner@vger.kernel.org wrote on 04/17/2009 06:56:04 AM:

> On Thu, 16 Apr 2009, David Miller wrote:
>
> > No Christoph, do this right.
> >
> > Linux by default will behave the way it has for 15+ years. And if an
> > application wants new behavior, you have to ask for it.
> >
> > End of story.
>
> This is not right. All other OSes filter multicast traffic according to
> the multicast groups subscribed to (and that includes the evil one).

This is not true.

> There is no requirement of asking for "new" behavior. Why should multicast
> applications have to add special code to request something that comes by
> default on other platforms?

        Linux is not Solaris. I think Solaris is wrong to change the
behavior from the original BSD behavior, but it should be no surprise
that there are other differences in the APIs, too. It's not difficult
to write code that works as intended on both, and the case Solaris is
trying to avoid is not really avoided since you can still receive
unicast traffic, or totally unrelated multicast traffic on the shared
port and multicast address space. If the app doesn't use the port to
distinguish it, it simply should bind the multicast address it wants,
and use PKTINFO, SO_BINDTODEVICE or the like as well. In your case,
multiple sockets or filtering based on the "to" address are possibilities
that work on Solaris too, and fix more unintended traffic problems
than just a different group.
        A per-socket option is a more trivial way to do this, but
turning it on for sockets that want the existing, intended and
long-standing behavior is obviously wrong.

                                                +-DLS
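Filtering on the "to" address, as suggested in the message above, can be sketched with IP_PKTINFO as follows. This is an illustration under stated assumptions rather than code from the thread: it assumes Linux/glibc, the helper names enable_pktinfo, datagram_dst and wanted are hypothetical, and the caller is expected to have passed a control buffer to recvmsg().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* struct in_pktinfo on glibc */
#endif
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Ask the kernel to attach an IP_PKTINFO control message to each datagram. */
int enable_pktinfo(int fd)
{
	int on = 1;

	return setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &on, sizeof(on));
}

/*
 * Scan the control messages filled in by recvmsg() and copy out the address
 * the datagram was sent to. Returns 1 if found, 0 otherwise.
 */
int datagram_dst(struct msghdr *msg, struct in_addr *dst)
{
	struct cmsghdr *c;

	for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
		if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_PKTINFO) {
			struct in_pktinfo pi;

			memcpy(&pi, CMSG_DATA(c), sizeof(pi));
			*dst = pi.ipi_addr;	/* destination from the IP header */
			return 1;
		}
	}
	return 0;
}

/* Keep a datagram only if it was addressed to the group we care about. */
int wanted(struct msghdr *msg, struct in_addr joined_group)
{
	struct in_addr dst;

	return datagram_dst(msg, &dst) && dst.s_addr == joined_group.s_addr;
}
```

A receiver would call enable_pktinfo() once after creating the socket and then discard every datagram for which wanted() returns 0. Unlike a kernel-side option, this check also rejects unicast datagrams that arrive on the same port.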
From: Christoph Lameter <cl@linux.com>
Date: Fri, 17 Apr 2009 12:02:19 -0400 (EDT)

> On Fri, 17 Apr 2009, Nivedita Singhvi wrote:
>
>> I need the current behaviour to not change, as it would
>> break some people I support. DaveM is making the right
>> decision here, and I fully support this.
>
> People or applications? There are applications that only run on Linux and
> fail on other OS? How does this work? Special casing depending on the OS
> running?

Christoph, I just want to let you know that I'm totally ignoring
everything further you say on this issue, because you're way out of
line and totally ignoring the real issues here.

What's next? Tomorrow, if you think Linux's open() system call
behavior doesn't suit your needs, I want you to send a sysctl patch to
Al Viro that changes the system wide behavior and we'll see how far
you get with that.

The fact is, you cannot just say "oops we didn't mean to do that" when
something has behaved a certain way, visible to users, for more than 15
years.

And the fact is, WE DID MEAN to do things this way.

As David Stevens explained, the original creator of multicasting, the
original BSD code, and the RFCs, INTENDED this behavior from the very
beginning.

You want to ignore all of this, as if none of it matters and that what
you want to achieve is so much more important.
On Fri, 17 Apr 2009, David Stevens wrote:

>         Linux is not Solaris. I think Solaris is wrong to change the
> behavior from the original BSD behavior, but it should be no surprise
> that there are other differences in the API's, too. It's not difficult
> to write code that works as intended on both, and the case Solaris is
> trying to avoid is not really avoided since you can still receive
> unicast traffic, or totally unrelated multicast traffic on the shared
> port and multicast address space. If the app doesn't use the port to

By that you mean unrelated multicast traffic destined to the same
multicast address and port?
On Fri, 17 Apr 2009, David Miller wrote:

> And the fact is, WE DID MEAN to do things this way.

I fully agree. We meant to do this.

> As David Stevens explained, the original creator of multicasting, the
> original BSD code, and the RFCs, INTENDED this behavior from the very
> beginning.
>
> You want to ignore all of this, as if none of it matters and that what
> you want to achieve is so much more important.

I am not ignoring it. It just seems that other OSes have moved on from this
and we are one of the last holdouts. It's not only Solaris but also BSD and
Windoze. Best to have a solution that is consistent across multiple OSes.
Christoph Lameter <cl@linux.com> wrote on 04/20/2009 09:43:45 AM:

> On Fri, 17 Apr 2009, David Stevens wrote:
>
> >         Linux is not Solaris. I think Solaris is wrong to change the
> > behavior from the original BSD behavior, but it should be no surprise
> > that there are other differences in the API's, too. It's not difficult
> > to write code that works as intended on both, and the case Solaris is
> > trying to avoid is not really avoided since you can still receive
> > unicast traffic, or totally unrelated multicast traffic on the shared
> > port and multicast address space. If the app doesn't use the port to
>
> By that you mean unrelated multicast traffic destined to the same
> multicast address and port?

Yes. If neither the port nor the multicast address is registered,
then anyone on your network can use them for anything. Even if they
are registered, someone may still use them; sending requires no special
privilege, and neither does joining groups or binding to ports above
1024. Anyone on your network, or within your multicast routing domain,
may reuse both (even if they intend it for a different machine) and
your app will receive the traffic.

I think generally the best approach is to bind to the particular
multicast address and use SO_BINDTODEVICE if it matters to the app.
But the app still has to handle receiving data from a different
source or totally unrelated data; it certainly can receive those,
because anyone can send those.

I can see the value of a per-socket, default-off option in the case
where you want multiple groups on a single socket, and I encourage
you to submit that as a patch. It reduces the work the receiver has
to do, but doesn't eliminate it. The way I'd do that is to use multiple
sockets, one bound to each group, but ok. As long as it doesn't change
the existing behavior out from under existing, unknown apps.

                                                +-DLS
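The "multiple sockets, one bound to each group" approach from the message above might be sketched like this. It is an illustration only: the helper names bind_to_group and join_group are hypothetical, error handling is minimal, and the SO_BINDTODEVICE step mentioned in the message is left out because it normally needs extra privileges.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* struct ip_mreq on glibc */
#endif
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a UDP socket bound to <group>:<port> instead of INADDR_ANY, so that
 * only datagrams addressed to that group are matched to it. `port` is in
 * host byte order. Returns the descriptor, or -1 on error.
 */
int bind_to_group(const char *group, unsigned short port)
{
	struct sockaddr_in sa;
	int fd, on = 1;

	memset(&sa, 0, sizeof(sa));
	sa.sin_family = AF_INET;
	sa.sin_port = htons(port);
	if (inet_pton(AF_INET, group, &sa.sin_addr) != 1)
		return -1;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;
	/* Let several receivers share the same group and port. */
	if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0 ||
	    bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Join `group` on the interface owning `ifaddr` (INADDR_ANY: kernel picks). */
int join_group(int fd, struct in_addr group, struct in_addr ifaddr)
{
	struct ip_mreq mreq;

	mreq.imr_multiaddr = group;
	mreq.imr_interface = ifaddr;
	return setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));
}
```

An application that wants N groups would call bind_to_group() and join_group() once per group and poll the resulting descriptors. As the message notes, each socket can still receive unrelated data that somebody else sends to the same group and port.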
Index: linux-2.6/include/net/inet_sock.h
===================================================================
--- linux-2.6.orig/include/net/inet_sock.h	2009-04-16 08:59:20.000000000 -0500
+++ linux-2.6/include/net/inet_sock.h	2009-04-16 09:04:47.000000000 -0500
@@ -130,7 +130,8 @@ struct inet_sock {
 				freebind:1,
 				hdrincl:1,
 				mc_loop:1,
-				transparent:1;
+				transparent:1,
+				mc_all:1;
 	int			mc_index;
 	__be32			mc_addr;
 	struct ip_mc_socklist	*mc_list;
Index: linux-2.6/net/ipv4/igmp.c
===================================================================
--- linux-2.6.orig/net/ipv4/igmp.c	2009-04-16 08:54:47.000000000 -0500
+++ linux-2.6/net/ipv4/igmp.c	2009-04-16 09:04:06.000000000 -0500
@@ -2187,7 +2187,7 @@ int ip_mc_sf_allow(struct sock *sk, __be
 	struct ip_sf_socklist *psl;
 	int i;
 
-	if (!ipv4_is_multicast(loc_addr))
+	if (ipv4_is_lbcast(loc_addr) || !ipv4_is_multicast(loc_addr))
 		return 1;
 
 	for (pmc=inet->mc_list; pmc; pmc=pmc->next) {
@@ -2196,7 +2196,7 @@ int ip_mc_sf_allow(struct sock *sk, __be
 			break;
 	}
 	if (!pmc)
-		return 1;
+		return inet->mc_all;
 	psl = pmc->sflist;
 	if (!psl)
 		return pmc->sfmode == MCAST_EXCLUDE;
Index: linux-2.6/include/linux/in.h
===================================================================
--- linux-2.6.orig/include/linux/in.h	2009-04-16 09:05:41.000000000 -0500
+++ linux-2.6/include/linux/in.h	2009-04-16 09:32:52.000000000 -0500
@@ -107,6 +107,7 @@ struct in_addr {
 #define MCAST_JOIN_SOURCE_GROUP		46
 #define MCAST_LEAVE_SOURCE_GROUP	47
 #define MCAST_MSFILTER			48
+#define IP_MULTICAST_ALL		49
 
 #define MCAST_EXCLUDE	0
 #define MCAST_INCLUDE	1
Index: linux-2.6/net/ipv4/ip_sockglue.c
===================================================================
--- linux-2.6.orig/net/ipv4/ip_sockglue.c	2009-04-16 09:09:52.000000000 -0500
+++ linux-2.6/net/ipv4/ip_sockglue.c	2009-04-16 09:31:40.000000000 -0500
@@ -449,6 +449,7 @@ static int do_ip_setsockopt(struct sock
 			     (1<<IP_ROUTER_ALERT) | (1<<IP_FREEBIND) |
 			     (1<<IP_PASSSEC) | (1<<IP_TRANSPARENT))) ||
 	    optname == IP_MULTICAST_TTL ||
+	    optname == IP_MULTICAST_ALL ||
 	    optname == IP_MULTICAST_LOOP ||
 	    optname == IP_RECVORIGDSTADDR) {
 		if (optlen >= sizeof(int)) {
@@ -895,6 +896,13 @@ static int do_ip_setsockopt(struct sock
 		kfree(gsf);
 		break;
 	}
+	case IP_MULTICAST_ALL:
+		if (optlen<1)
+			goto e_inval;
+		if (val != 0 && val != 1)
+			goto e_inval;
+		inet->mc_all = val;
+		break;
 	case IP_ROUTER_ALERT:
 		err = ip_ra_control(sk, val ? 1 : 0, NULL);
 		break;
@@ -1147,6 +1155,9 @@ static int do_ip_getsockopt(struct sock
 		release_sock(sk);
 		return err;
 	}
+	case IP_MULTICAST_ALL:
+		val = inet->mc_all;
+		break;
 	case IP_PKTOPTIONS:
 	{
 		struct msghdr msg;
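To make the effect of the ip_mc_sf_allow() change easier to follow, here is a small user-space model of the decision. It is a simplified sketch, not kernel code: struct joined_groups and model_sf_allow are hypothetical names, addresses are in host byte order, and the per-group source filters (and anything else the hunks do not show) are deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a socket's joined-group state. */
struct joined_groups {
	const uint32_t *groups;	/* joined group addresses, host byte order */
	size_t count;
	int mc_all;		/* models inet->mc_all from the patch */
};

static int is_multicast(uint32_t addr)	/* 224.0.0.0/4 */
{
	return (addr & 0xf0000000u) == 0xe0000000u;
}

/*
 * Returns 1 if a datagram addressed to `dst` would be delivered to the
 * socket and 0 if not. A joined group is treated as "accept from any source".
 */
int model_sf_allow(const struct joined_groups *sk, uint32_t dst)
{
	size_t i;

	if (!is_multicast(dst))
		return 1;		/* unicast and broadcast are unaffected */
	for (i = 0; i < sk->count; i++)
		if (sk->groups[i] == dst)
			return 1;	/* joined on this socket: deliver */
	return sk->mc_all;		/* not joined: the new flag decides */
}
```

With mc_all set to 1 the model returns 1 for every multicast destination, which is the long-standing behavior; with mc_all set to 0 only groups joined on that socket get through. The dispute in the thread is about which of the two values a new socket starts with.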
Do what David Stevens suggests: Add a per-socket option



Subject: Multicast: Filter Multicast traffic per socket mc_list

If two processes open the same port as a multicast socket and then
join two different multicast groups, then traffic for both multicast groups
is delivered to both processes. This means that applications will get
surprising data that they did not ask for. Applications will have to filter
these out in order to work correctly if multiple apps run on the same system.

These are pretty strange semantics but they have been around since the
beginning of multicast support on Unix systems. Most of the other operating
systems supporting Multicast have since changed to only supplying multicast
traffic to a socket that was selected through multicast join operations.

This patch changes Linux to behave in the same way. But there may be
applications that rely on the old behavior. Therefore we provide a means
to switch back to the old behavior using a new multicast socket option

	IP_MULTICAST_ALL

If set then all multicast traffic to the port is forwarded to the socket
(additional constraints are the SSM inclusion and exclusion lists!).
If not set (default) then only traffic for multicast groups that were
joined by the socket is received.

Signed-off-by: Christoph Lameter <cl@linux.com>

---
 include/linux/in.h      |    1 +
 include/net/inet_sock.h |    3 ++-
 net/ipv4/igmp.c         |    4 ++--
 net/ipv4/ip_sockglue.c  |   11 +++++++++++
 4 files changed, 16 insertions(+), 3 deletions(-)
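From user space, the option described in this patch would be toggled with an ordinary setsockopt() call. The following is a sketch under assumptions: it presumes a kernel that implements IP_MULTICAST_ALL with the number 49 proposed in the patch, and the helper names set_multicast_all and get_multicast_all are hypothetical. Since the default is exactly what the thread argues about, the sketch sets the value explicitly instead of relying on it.

```c
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef IP_MULTICAST_ALL
#define IP_MULTICAST_ALL 49	/* option number proposed in the patch */
#endif

/*
 * Select the delivery mode for one socket: all != 0 asks for every group
 * that arrives on the bound port, all == 0 for joined groups only.
 * Returns 0 on success, -1 on error.
 */
int set_multicast_all(int fd, int all)
{
	all = all ? 1 : 0;	/* the patch rejects anything but 0 and 1 */
	return setsockopt(fd, IPPROTO_IP, IP_MULTICAST_ALL, &all, sizeof(all));
}

/* Read the current mode back: 1, 0, or -1 on error. */
int get_multicast_all(int fd)
{
	int all = -1;
	socklen_t len = sizeof(all);

	if (getsockopt(fd, IPPROTO_IP, IP_MULTICAST_ALL, &all, &len) < 0)
		return -1;
	return all;
}
```

An application that wants only the groups it joined would call set_multicast_all(fd, 0) right after creating the socket. On a kernel without the option the call fails, so portable code still needs its own destination check.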