net: VRF: Pass original iif to ip_route_input()

Message ID 20160912014553.20927-1-mark.tomlinson@alliedtelesis.co.nz
State Deferred, archived
Delegated to: David Miller

Commit Message

Mark Tomlinson Sept. 12, 2016, 1:45 a.m. UTC
The function ip_rcv_finish() calls l3mdev_ip_rcv(). On any VRF except
the global VRF, this replaces skb->dev with the VRF master interface.
When calling ip_route_input_noref() from here, the checks for forwarding
look at this master device instead of the initial ingress interface.
This will allow packets to be routed which normally would be dropped.
For example, an interface that is not assigned an IP address should
drop packets, but because the checking is against the master device, the
packet will be forwarded.

The fix here is to still call l3mdev_ip_rcv(), but remember the initial
net_device. This is passed to the other functions within ip_rcv_finish,
so they still see the original interface.
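
A minimal userspace sketch of the problem and the fix described above. Everything in it is an assumption made for illustration: struct toy_dev, struct toy_skb and the toy_* functions are hypothetical stand-ins, not kernel APIs, and the forwarding check is reduced to "does this device have an IP address", which is far simpler than what ip_route_input_noref() really does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device. */
struct toy_dev {
	const char *name;
	bool has_ip_addr;        /* models "an IP address is assigned" */
	struct toy_dev *master;  /* VRF master, or NULL in the global VRF */
};

/* Hypothetical stand-in for struct sk_buff. */
struct toy_skb {
	struct toy_dev *dev;
};

/* Models l3mdev_ip_rcv(): on an enslaved interface, skb->dev is
 * replaced with the VRF master device.
 */
void toy_l3mdev_rcv(struct toy_skb *skb)
{
	assert(skb != NULL && skb->dev != NULL);
	if (skb->dev->master)
		skb->dev = skb->dev->master;
}

/* Simplified forwarding check: accept only if the device that is
 * passed in has an address.
 */
bool toy_route_input(const struct toy_dev *dev)
{
	return dev->has_ip_addr;
}

/* Before the patch: the check sees skb->dev after it was replaced. */
bool toy_rcv_finish_before(struct toy_skb *skb)
{
	toy_l3mdev_rcv(skb);
	return toy_route_input(skb->dev);
}

/* After the patch: the ingress device is remembered first, and the
 * check is made against it instead of the master.
 */
bool toy_rcv_finish_after(struct toy_skb *skb)
{
	struct toy_dev *dev = skb->dev;

	toy_l3mdev_rcv(skb);
	return toy_route_input(dev);
}
```

In this model, a packet arriving on an address-less interface enslaved to a VRF that has an address is accepted by toy_rcv_finish_before() and rejected by toy_rcv_finish_after(). In both cases skb->dev ends up pointing at the master.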

Please note that while this patch fixes my issue, I am not entirely sure
why the skb->dev is changed to the master device, so I am not sure this
is the right fix.

Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
---
 net/ipv4/ip_input.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

David Ahern Sept. 12, 2016, 2:48 p.m. UTC | #1
On 9/11/16 7:45 PM, Mark Tomlinson wrote:
> The function ip_rcv_finish() calls l3mdev_ip_rcv(). On any VRF except
> the global VRF, this replaces skb->dev with the VRF master interface.
> When calling ip_route_input_noref() from here, the checks for forwarding
> look at this master device instead of the initial ingress interface.
> This will allow packets to be routed which normally would be dropped.
> For example, an interface that is not assigned an IP address should
> drop packets, but because the checking is against the master device, the
> packet will be forwarded.
> 
> The fix here is to still call l3mdev_ip_rcv(), but remember the initial
> net_device. This is passed to the other functions within ip_rcv_finish,
> so they still see the original interface.
> 
> Please note that while this patch fixes my issue, I am not entirely sure
> why the skb->dev is changed to the master device, so I am not sure this
> is the right fix.

It is done for socket lookups. VRF can handle global sockets, with connected sockets bound to the VRF domain (VRF device index), or sockets bound to the VRF device, which requires inet_iif / skb_iif to be the VRF device.
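
A rough sketch of why the input interface index matters for a device-bound socket. This is an assumption-laden toy, not the kernel's lookup code: struct toy_sock and toy_sock_matches() are hypothetical, and the only rule modelled is that a socket bound to a device matches a packet only when the packet's input interface index equals the bound index, while an unbound socket matches any interface.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a socket; 0 means "not bound to a device". */
struct toy_sock {
	int bound_dev_if;
};

/* Returns true if the socket may receive a packet whose input
 * interface index is iif.
 */
bool toy_sock_matches(const struct toy_sock *sk, int iif)
{
	return sk->bound_dev_if == 0 || sk->bound_dev_if == iif;
}
```

With a socket bound to a VRF device of index 10 and a packet that arrived on an enslaved interface of index 3, the lookup in this model succeeds only if the packet is presented with iif 10, that is, with the VRF device as its input interface.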

With the changes to l3mdev to look at enslaved devices as well as master devices, this change might be fine, but I need to confirm.

Please cc me on VRF-related patches or questions. I only scan netdev as time allows.

David Ahern Sept. 14, 2016, 7:29 p.m. UTC | #2
On 9/11/16 7:45 PM, Mark Tomlinson wrote:
> Please note that while this patch fixes my issue, I am not entirely sure
> why the skb->dev is changed to the master device, so I am not sure this
> is the right fix.

Can you send a v2 dropping the above paragraph from the commit log?

> 
> Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
> ---
>  net/ipv4/ip_input.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)

And you can add

Acked-by: David Ahern <dsa@cumulusnetworks.com>

Patch

diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index 4b351af..d6feabb 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -312,6 +312,7 @@  static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
 {
 	const struct iphdr *iph = ip_hdr(skb);
 	struct rtable *rt;
+	struct net_device *dev = skb->dev;
 
 	/* if ingress device is enslaved to an L3 master device pass the
 	 * skb to its handler for processing
@@ -341,7 +342,7 @@  static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
 	 */
 	if (!skb_valid_dst(skb)) {
 		int err = ip_route_input_noref(skb, iph->daddr, iph->saddr,
-					       iph->tos, skb->dev);
+					       iph->tos, dev);
 		if (unlikely(err)) {
 			if (err == -EXDEV)
 				__NET_INC_STATS(net, LINUX_MIB_IPRPFILTER);
@@ -370,7 +371,7 @@  static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
 		__IP_UPD_PO_STATS(net, IPSTATS_MIB_INBCAST, skb->len);
 	} else if (skb->pkt_type == PACKET_BROADCAST ||
 		   skb->pkt_type == PACKET_MULTICAST) {
-		struct in_device *in_dev = __in_dev_get_rcu(skb->dev);
+		struct in_device *in_dev = __in_dev_get_rcu(dev);
 
 		/* RFC 1122 3.3.6:
 		 *