net: VRF: Fix receiving multicast traffic

Message ID: 20160922041308.30848-1-mark.tomlinson@alliedtelesis.co.nz
State: Rejected, archived
Delegated to: David Miller

Commit Message

Mark Tomlinson Sept. 22, 2016, 4:13 a.m. UTC
The previous patch to ensure that the original iif was used when
checking for forwarding also meant that this same interface was used to
determine whether multicast packets should be received or not. This was
incorrect, and would cause multicast packets to be dropped.

The fix here is to use skb->dev when checking multicast addresses.
skb->dev has been set to the l3mdev by this point, so the check will be
against that, rather than the ingress interface.

Fixes: "net:VRF: Pass original iif to ip_route_input()"
Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
---
 net/ipv4/route.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

David Ahern Sept. 22, 2016, 3:14 p.m. UTC | #1
On 9/21/16 10:13 PM, Mark Tomlinson wrote:
> The previous patch to ensure that the original iif was used when
> checking for forwarding also meant that this same interface was used to
> determine whether multicast packets should be received or not. This was
> incorrect, and would cause multicast packets to be dropped.
> 
> The fix here is to use skb->dev when checking multicast addresses.
> skb->dev has been set to the l3mdev by this point, so the check will be
> against that, rather than the ingress interface.

l3mdev devices do not support IPv4 multicast, so checking mcast against that device should not be working at all. For that reason I was fine with the change in the previous patch; i.e., you want the real ingress device there, not the vrf device.

What test are you running that says your previous patch broke something?
Mark Tomlinson Sept. 22, 2016, 10:10 p.m. UTC | #2
On 09/23/2016 03:14 AM, David Ahern wrote:
>
> l3mdev devices do not support IPv4 multicast so checking mcast against that device should not be working at all. For that reason I was fine with the change in the previous patch. ie., you want the real ingress device there not the vrf device.
>
> What test are you running that says your previous patch broke something?
Although we do not expect any multicast routing (IGMP snooping or PIM)
to work in an l3mdev, we still want multicast packets delivered for
protocols such as RIP. This was working before my previous patch, but
these multicast packets are now dropped. This current patch fixes that
again, hopefully while keeping the benefits of my first patch.
David Ahern Sept. 22, 2016, 10:41 p.m. UTC | #3
On 9/22/16 4:10 PM, Mark Tomlinson wrote:
> 
> On 09/23/2016 03:14 AM, David Ahern wrote:
>>
>> l3mdev devices do not support IPv4 multicast so checking mcast against that device should not be working at all. For that reason I was fine with the change in the previous patch. ie., you want the real ingress device there not the vrf device.
>>
>> What test are you running that says your previous patch broke something?
> Although we do not expect any multicast routing to work in an l3mdev, 
> (IGMP snooping or PIM), we still want to have multicast packets 
> delivered for protocols such as RIP. This was working before my previous 
> patch, but these multicast packets are now dropped. This current patch 
> fixes that again, hopefully still with the benefits of my first patch.
> 

Can you discern which check is making that happen?

It does not make sense to look at the in_device of a vrf device for mcast addresses. For IPv6, link-local and mcast addresses are specifically blocked. IPv4 should do the same. So, how is RIP getting the packet at all?
Mark Tomlinson Sept. 23, 2016, 3:06 a.m. UTC | #4
On 09/23/2016 10:41 AM, David Ahern wrote:
> On 9/22/16 4:10 PM, Mark Tomlinson wrote:
>> On 09/23/2016 03:14 AM, David Ahern wrote:
>>> l3mdev devices do not support IPv4 multicast so checking mcast against that device should not be working at all. For that reason I was fine with the change in the previous patch. ie., you want the real ingress device there not the vrf device.
>>>
>>> What test are you running that says your previous patch broke something?
>> Although we do not expect any multicast routing to work in an l3mdev,
>> (IGMP snooping or PIM), we still want to have multicast packets
>> delivered for protocols such as RIP. This was working before my previous
>> patch, but these multicast packets are now dropped. This current patch
>> fixes that again, hopefully still with the benefits of my first patch.
>>
> can you discern which check is making that happen?
>
> It does not make sense to look at the in_device of a vrf device for mcast addresses. For IPv6 linklocal and mcast is specifically blocked. IPv4 should do the same. So, how is RIP getting the packet at all?
This might be due to some other changes we've made for VRF and multicast 
but haven't sent upstream. In particular, a change to do_ip_setsockopt() 
and its handling of IP_MULTICAST_IF as well as IP_ADD/DROP_MEMBERSHIP. I 
am guessing that without these changes, we wouldn't be able to receive 
multicast packets in RIP. With our changes, the in_dev->mc_list does 
contain the RIP MC address (224.0.0.9) in the master interface, and so 
the function ip_check_mc_rcu() returns success with the master only.
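
A minimal userspace sketch of the lookup described above, assuming a
much-simplified model. It is not kernel code: struct toy_in_device,
toy_check_mc() and the device name "eth0" are hypothetical stand-ins,
and the real ip_check_mc_rcu() also takes the source address and
protocol into account, which this model ignores. It only shows why a
group that is present on the master's list, but not on the ingress
device's list, passes the check against one device and fails against
the other.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-device IPv4 state (not the kernel struct). */
struct toy_in_device {
	const char *name;
	const uint32_t *mc_list;	/* joined groups, host byte order */
	size_t mc_count;
};

#define RIP_GROUP 0xe0000009u	/* 224.0.0.9 */

/* True if addr lies in 224.0.0.0/4. */
bool toy_is_multicast(uint32_t addr)
{
	return (addr & 0xf0000000u) == 0xe0000000u;
}

/* Simplified membership test: is the group on this device's list? */
bool toy_check_mc(const struct toy_in_device *in_dev, uint32_t group)
{
	if (in_dev == NULL)
		return false;
	for (size_t i = 0; i < in_dev->mc_count; i++) {
		if (in_dev->mc_list[i] == group)
			return true;
	}
	return false;
}

/* Assumed state: the RIP group was joined on the VRF master only. */
static const uint32_t toy_master_groups[] = { RIP_GROUP };
static const struct toy_in_device toy_vrf_master = {
	"vrf-master", toy_master_groups, 1
};
static const struct toy_in_device toy_ingress = { "eth0", NULL, 0 };
```

In this model, toy_check_mc(&toy_vrf_master, RIP_GROUP) succeeds while
toy_check_mc(&toy_ingress, RIP_GROUP) fails, which mirrors the
difference between checking skb->dev and checking the ingress device.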

Our RIP daemon is VRF-aware. So it does use setsockopt(SO_BINDTODEVICE, 
"vrf-master") when running in a VRF. Without following it all the way 
down, I believe that it is this that allows the multicast lookup at the 
top of ip_check_mc_rcu() to succeed on the vrf-master, but not the 
ingress interface. That is, in_dev->mc_list does contain 224.0.0.9 only 
on the vrf-master. Provided the lookup in ip_check_mc_rcu() succeeds (im 
!= NULL), this function can return success.
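
For illustration, a hedged sketch of how a VRF-aware daemon might set
up such a socket from userspace. The helper names rip_mreqn() and
open_rip_socket() are hypothetical, and "vrf-master" is only a
placeholder device name. The calls themselves (SO_BINDTODEVICE and
IP_ADD_MEMBERSHIP with struct ip_mreqn) are standard Linux socket
options, but whether the membership ends up on the VRF master's
mc_list depends on the unsent local changes mentioned above.
SO_BINDTODEVICE normally requires elevated privileges.

```c
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define RIP_GROUP "224.0.0.9"

/* Build a membership request for the RIP group on the given ifindex. */
struct ip_mreqn rip_mreqn(int ifindex)
{
	struct ip_mreqn mreq;

	memset(&mreq, 0, sizeof(mreq));
	inet_pton(AF_INET, RIP_GROUP, &mreq.imr_multiaddr);
	mreq.imr_ifindex = ifindex;
	return mreq;
}

/*
 * Open a UDP socket bound to the named device and join the RIP group
 * on it. Returns the socket fd, or -1 on any failure.
 */
int open_rip_socket(const char *vrf_name)
{
	struct ip_mreqn mreq;
	unsigned int ifindex = if_nametoindex(vrf_name);
	int fd;

	if (ifindex == 0)
		return -1;	/* no such device */

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	/* Restrict the socket to traffic associated with the VRF device. */
	if (setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE,
		       vrf_name, strlen(vrf_name) + 1) < 0)
		goto fail;

	/* Ask for the group membership to be added on that same device. */
	mreq = rip_mreqn((int)ifindex);
	if (setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP,
		       &mreq, sizeof(mreq)) < 0)
		goto fail;

	return fd;
fail:
	close(fd);
	return -1;
}
```

A real daemon would also bind() the socket to the RIP UDP port and
check errno to report which step failed; both are omitted here to keep
the sketch short.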

Are you interested in the other patches at the moment?
David Ahern Sept. 23, 2016, 1:49 p.m. UTC | #5
On 9/22/16 9:06 PM, Mark Tomlinson wrote:
> 
> On 09/23/2016 10:41 AM, David Ahern wrote:
>> On 9/22/16 4:10 PM, Mark Tomlinson wrote:
>>> On 09/23/2016 03:14 AM, David Ahern wrote:
>>>> l3mdev devices do not support IPv4 multicast so checking mcast against that device should not be working at all. For that reason I was fine with the change in the previous patch. ie., you want the real ingress device there not the vrf device.
>>>>
>>>> What test are you running that says your previous patch broke something?
>>> Although we do not expect any multicast routing to work in an l3mdev,
>>> (IGMP snooping or PIM), we still want to have multicast packets
>>> delivered for protocols such as RIP. This was working before my previous
>>> patch, but these multicast packets are now dropped. This current patch
>>> fixes that again, hopefully still with the benefits of my first patch.
>>>
>> can you discern which check is making that happen?
>>
>> It does not make sense to look at the in_device of a vrf device for mcast addresses. For IPv6 linklocal and mcast is specifically blocked. IPv4 should do the same. So, how is RIP getting the packet at all?
> This might be due to some other changes we've made for VRF and multicast 
> but haven't sent upstream. In particular, a change to do_ip_setsockopt() 
> and its handling of IP_MULTICAST_IF as well as IP_ADD/DROP_MEMBERSHIP. I 
> am guessing that without these changes, we wouldn't be able to receive 
> multicast packets in RIP. With our changes, the in_dev->mc_list does 
> contain the RIP MC address (224.0.0.9) in the master interface, and so 
> the function ip_check_mc_rcu() returns success with the master only.
> 
> Our RIP daemon is VRF-aware. So it does use setsockopt(SO_BINDTODEVICE, 
> "vrf-master") when running in a VRF. Without following it all the way 
> down, I believe that it is this that allows the multicast lookup at the 
> top of ip_check_mc_rcu() to succeed on the vrf-master, but not the 
> ingress interface. That is, in_dev->mc_list does contain 224.0.0.9 only 
> on the vrf-master. Provided the lookup in ip_check_mc_rcu() succeeds (im 
> != NULL), this function can return success.
> 
> Are you interested in the other patches at the moment?
> 

Yes, but with the context of the bigger IPv4 multicast solution. I am on PTO today - about to get on a plane. How about we leave the upstream kernel as is - i.e., drop this patch. You can carry it locally with the others. We can take a look at the bigger mcast picture for 4.10. Agree?

Patch

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index a1f2830..75e1de6 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1971,7 +1971,7 @@  int ip_route_input_noref(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 	   route cache entry is created eventually.
 	 */
 	if (ipv4_is_multicast(daddr)) {
-		struct in_device *in_dev = __in_dev_get_rcu(dev);
+		struct in_device *in_dev = __in_dev_get_rcu(skb->dev);
 
 		if (in_dev) {
 			int our = ip_check_mc_rcu(in_dev, daddr, saddr,