Message ID | 1383725153-26298-1-git-send-email-christophe.gouault@6wind.com |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
Headers | show |
On Wed, Nov 06, 2013 at 09:05:53AM +0100, Christophe Gouault wrote: > The vti interface inbound and outbound SPD lookups are based on the > ipsec packet instead of the plaintext packet. > > Not only is it counterintuitive, it also restricts vti interfaces > to a single policy (whose selector must match the tunnel local and > remote addresses). > > The policy selector is supposed to match the plaintext packet, before > encryption or after decryption. > > This patch performs the SPD lookup based on the plaintext packet. It > enables to create several polices bound to the vti interface (via a > mark equal to the vti interface okey). > > It remains possible to apply the same policy to all packets entering > the vti interface, by setting an any-to-any selector (src 0.0.0.0/0 > dst 0.0.0.0/0 proto any mark OKEY). > > Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com> Hm, this patch breaks my current vti test setup. I would need to configure the policies and states dependent on the kernel version if we apply this patch... It would be good to hear from the original author of the vti code whether the current behaviour is intentional before we do anything here. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello Steffen, I am also interested in knowing Saurabh's intentions regarding the behavior of policies bound to vti interfaces. However, please note that setting a policy with a wildcard selector works in both cases (before or after this patch), so a common test case can be defined. Actually the *previous* patch on vti (7263a5187f9e vti: get rid of nf mark rule in prerouting) introduced significant changes, and implies behaviors dependant on the kernel version, but it seemed to meet Saurabh's agreement, as the following thread witnesses: http://www.spinics.net/lists/netdev/msg253134.html Best Regards, Christophe On 11/07/2013 12:25 PM, Steffen Klassert wrote: > On Wed, Nov 06, 2013 at 09:05:53AM +0100, Christophe Gouault wrote: >> The vti interface inbound and outbound SPD lookups are based on the >> ipsec packet instead of the plaintext packet. >> >> Not only is it counterintuitive, it also restricts vti interfaces >> to a single policy (whose selector must match the tunnel local and >> remote addresses). >> >> The policy selector is supposed to match the plaintext packet, before >> encryption or after decryption. >> >> This patch performs the SPD lookup based on the plaintext packet. It >> enables to create several polices bound to the vti interface (via a >> mark equal to the vti interface okey). >> >> It remains possible to apply the same policy to all packets entering >> the vti interface, by setting an any-to-any selector (src 0.0.0.0/0 >> dst 0.0.0.0/0 proto any mark OKEY). >> >> Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com> > Hm, this patch breaks my current vti test setup. I would need to > configure the policies and states dependent on the kernel version > if we apply this patch... > > It would be good to hear from the original author of the vti code > whether the current behaviour is intentional before we do anything > here. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Steffen Klassert <steffen.klassert@secunet.com> Date: Thu, 7 Nov 2013 12:25:49 +0100 > On Wed, Nov 06, 2013 at 09:05:53AM +0100, Christophe Gouault wrote: >> The vti interface inbound and outbound SPD lookups are based on the >> ipsec packet instead of the plaintext packet. >> >> Not only is it counterintuitive, it also restricts vti interfaces >> to a single policy (whose selector must match the tunnel local and >> remote addresses). >> >> The policy selector is supposed to match the plaintext packet, before >> encryption or after decryption. >> >> This patch performs the SPD lookup based on the plaintext packet. It >> enables to create several polices bound to the vti interface (via a >> mark equal to the vti interface okey). >> >> It remains possible to apply the same policy to all packets entering >> the vti interface, by setting an any-to-any selector (src 0.0.0.0/0 >> dst 0.0.0.0/0 proto any mark OKEY). >> >> Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com> > > Hm, this patch breaks my current vti test setup. I would need to > configure the policies and states dependent on the kernel version > if we apply this patch... > > It would be good to hear from the original author of the vti code > whether the current behaviour is intentional before we do anything > here. I'm disappointed that we're breaking IPSEC semantics several times. This really isn't acceptable at all, not even remotely. The fact that a developer has a configuration he was actually using, and would be broken by this patch, says to me that there is absolutely no way I can apply this. Sorry. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Nov 07, 2013 at 01:55:33PM +0100, Christophe Gouault wrote: > Hello Steffen, > > I am also interested in knowing Saurabh's intentions regarding the > behavior of policies bound to vti interfaces. > > However, please note that setting a policy with a wildcard selector > works in both cases (before or after this patch), so a common test > case can be defined. Yes, I looked at the Cisco vti documents but all examples I found use wildcard selectors which work for both. So I'm still not sure which version is the right one. Let's wait on Saurabh's explaination. > > Actually the *previous* patch on vti (7263a5187f9e vti: get rid of > nf mark rule in prerouting) introduced significant changes, and > implies behaviors dependant on the kernel version, but it seemed to > meet Saurabh's agreement, as the following thread witnesses: > > http://www.spinics.net/lists/netdev/msg253134.html I've just noticed that this went to the stable trees. People who update a stable kernel want (security) fixes in the first place, they don't want to change their configuration on the IPsec gateways. So I think patches that require a configuration change should better go to net-next, unless it's a urgent fix. I was not on Cc and it looks like I've overlooked this on the list. The vti interfaces are pure IPsec interfaces, so perhaps we should add them to the IPsec section in the maintainers file (maybe together with the main IPsec protocols esp, ah and ipcomp, which are also not listed there). David, would you agree with such a patch? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi David, On 11/08/2013 12:17 AM, David Miller wrote: > From: Steffen Klassert <steffen.klassert@secunet.com> > Date: Thu, 7 Nov 2013 12:25:49 +0100 > >> On Wed, Nov 06, 2013 at 09:05:53AM +0100, Christophe Gouault wrote: >>> The vti interface inbound and outbound SPD lookups are based on the >>> ipsec packet instead of the plaintext packet. >>> >>> Not only is it counterintuitive, it also restricts vti interfaces >>> to a single policy (whose selector must match the tunnel local and >>> remote addresses). >>> >>> The policy selector is supposed to match the plaintext packet, before >>> encryption or after decryption. >>> >>> This patch performs the SPD lookup based on the plaintext packet. It >>> enables to create several polices bound to the vti interface (via a >>> mark equal to the vti interface okey). >>> >>> It remains possible to apply the same policy to all packets entering >>> the vti interface, by setting an any-to-any selector (src 0.0.0.0/0 >>> dst 0.0.0.0/0 proto any mark OKEY). >>> >>> Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com> >> Hm, this patch breaks my current vti test setup. I would need to >> configure the policies and states dependent on the kernel version >> if we apply this patch... >> >> It would be good to hear from the original author of the vti code >> whether the current behaviour is intentional before we do anything >> here. > I'm disappointed that we're breaking IPSEC semantics several times. > This really isn't acceptable at all, not even remotely. I understand your disappointment, however this patch precisely aims at *restoring* theIPsec semantics: the original vti code uses the SP selector to match the ipsec encrypted packetinstead of the plaintext packet, which is contrary to the IPsec semantics. > The fact that a developer has a configuration he was actually > using, and would be broken by this patch, says to me that there > is absolutely no way I can apply this. As wrote Steffen Klassert, it would be good to hear from the original author ofthe vti code (Saurabh Mohan) whether the current behaviour is intentional. It would also be good to know how many people currently use vti, but my feeling is that vti is still at an experimental stage. Best Regards, Christophe > Sorry. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Steffen Klassert <steffen.klassert@secunet.com> Date: Fri, 8 Nov 2013 12:01:01 +0100 > I was not on Cc and it looks like I've overlooked this on the list. > The vti interfaces are pure IPsec interfaces, so perhaps we should > add them to the IPsec section in the maintainers file (maybe together > with the main IPsec protocols esp, ah and ipcomp, which are also not > listed there). > > David, would you agree with such a patch? Yes. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message----- > From: Christophe Gouault [mailto:christophe.gouault@6wind.com] > Sent: Thursday, November 07, 2013 4:56 AM > To: Steffen Klassert > Cc: David S. Miller; Herbert Xu; netdev@vger.kernel.org; Saurabh Mohan; > Sergei Shtylyov; Eric Dumazet > Subject: Re: [PATCH net v3] vti: fix spd lookup: match plaintext pkt, not ipsec > pkt > > Hello Steffen, > > I am also interested in knowing Saurabh's intentions regarding the > behavior of policies bound to vti interfaces. > The semantics is to match the policy "src 0.0.0.0/0 dst 0.0.0.0/0 proto any" That is the only policy that VTI should use. The mark is needed to distinguish and limit the policy to a specific vti tunnel interface only. There is no other policy that may be applied to a vti interface. The fact that traffic is going over the tunnel interface implies that it must be encrypted/decrypted. Applying the above policy is a way to achieve that. > However, please note that setting a policy with a wildcard selector > works in both cases (before or after this patch), so a common test case > can be defined. > > Actually the *previous* patch on vti (7263a5187f9e vti: get rid of nf > mark rule in prerouting) introduced significant changes, and implies > behaviors dependant on the kernel version, but it seemed to meet > Saurabh's agreement, as the following thread witnesses: > > http://www.spinics.net/lists/netdev/msg253134.html > Getting rid of the pre-routing mark, which had to be done outside of the vti tunnel code was prone to misconfiguration. Though it is unfortunate that it creates a kernel version dependency. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> Getting rid of the pre-routing mark, which had to be done outside of > the vti tunnel code was prone to misconfiguration. > Though it is unfortunate that it creates a kernel version dependency. Perhaps a section in the ip-link manpage for VTI describing the expected configuration is worthwhile? I quite like the functionality, especially since the pre-routing mark removal, but I ended up having to read through commit logs to figure out how to use it which is obviously non-ideal. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi, Saurabh On 2013年11月19日 05:38, Saurabh Mohan wrote: > > >> -----Original Message----- >> From: Christophe Gouault [mailto:christophe.gouault@6wind.com] >> Sent: Thursday, November 07, 2013 4:56 AM >> To: Steffen Klassert >> Cc: David S. Miller; Herbert Xu; netdev@vger.kernel.org; Saurabh Mohan; >> Sergei Shtylyov; Eric Dumazet >> Subject: Re: [PATCH net v3] vti: fix spd lookup: match plaintext pkt, not ipsec >> pkt >> >> Hello Steffen, >> >> I am also interested in knowing Saurabh's intentions regarding the >> behavior of policies bound to vti interfaces. >> > The semantics is to match the policy "src 0.0.0.0/0 dst 0.0.0.0/0 proto any" > That is the only policy that VTI should use. The mark is needed to > distinguish and limit the policy to a specific vti tunnel interface only. > There is no other policy that may be applied to a vti interface. > The fact that traffic is going over the tunnel interface implies that it > must be encrypted/decrypted. Applying the above policy is a way > to achieve that. I'm not much experienced with VTI usage practical production usage scenario, but I have one question about the necessity of policy checking on VTI receiving part. - A VTI tunnel is hashed by destination address and i_key when creating them; - After each tunneled IP packet delivered to vti_rcv, the first step is looking for the right tunnel, this is done by using tunneled IP packet outer source and destination address without any key matching rule involved. If there are any other tunnel with the same source/destination address, but not the same mark in place, the tunnel lookup in the vti_rcv will properly not hit VTI tunnel, but the non-VTI tunnel. So the VTI net device statistics will not be accurate, and what's the point of checking policy for the wrong tunnel interface? Or the VTI tunnel is the only tunnel with this specific source/destination address in the production deployment. Again the upper layer 4 will check the policy after all, that's the right place to do the policy checking. So IMO, it's unnecessary to check policy for a net_device like VTI, actually I hold a patch of removing the VTI policy checking due to net-next closure for the moment. >> However, please note that setting a policy with a wildcard selector >> works in both cases (before or after this patch), so a common test case >> can be defined. >> >> Actually the *previous* patch on vti (7263a5187f9e vti: get rid of nf >> mark rule in prerouting) introduced significant changes, and implies >> behaviors dependant on the kernel version, but it seemed to meet >> Saurabh's agreement, as the following thread witnesses: >> >> http://www.spinics.net/lists/netdev/msg253134.html >> > Getting rid of the pre-routing mark, which had to be done outside of > the vti tunnel code was prone to misconfiguration. > Though it is unfortunate that it creates a kernel version dependency. > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >
Hi Saurabh, Good to read your rationale. On 11/18/2013 10:38 PM, Saurabh Mohan wrote: > > >> -----Original Message----- From: Christophe Gouault >> [mailto:christophe.gouault@6wind.com] Sent: Thursday, November 07, >> 2013 4:56 AM To: Steffen Klassert Cc: David S. Miller; Herbert Xu; >> netdev@vger.kernel.org; Saurabh Mohan; Sergei Shtylyov; Eric >> Dumazet Subject: Re: [PATCH net v3] vti: fix spd lookup: match >> plaintext pkt, not ipsec pkt >> >> Hello Steffen, >> >> I am also interested in knowing Saurabh's intentions regarding the >> behavior of policies bound to vti interfaces. >> > The semantics is to match the policy "src 0.0.0.0/0 dst 0.0.0.0/0 > proto any" That is the only policy that VTI should use. The mark is > needed to distinguish and limit the policy to a specific vti tunnel > interface only. There is no other policy that may be applied to a vti > interface. The fact that traffic is going over the tunnel interface > implies that it must be encrypted/decrypted. Applying the above > policy is a way to achieve that. The proposed patch respects this model and accepts the same configuration, but extends the possibilities: you can still set the policy "src 0.0.0.0/0 dst 0.0.0.0/0 proto any" (which is the typical use case), the mark is still used to distinguish and limit the policy to a specific vti tunnel interface only, the traffic that is going over the tunnel interface still implies that it must be encrypted/decrypted (in tunnel mode). But you can optionally apply differentiated policies within the same tunnel, by setting SPs with narrower selectors: according to the plaintext traffic that crosses the tunnel, you can request to use different protocols (esp/ah), different SAs, maybe drop some traffic. Only ipsec tunnel mode and drop policies should be bound to a VTI interface. And the patch restores the SP semantics: the selector is used to match the plaintext traffic, not the IPsec encrypted traffic. Best Regards, Christophe >> However, please note that setting a policy with a wildcard >> selector works in both cases (before or after this patch), so a >> common test case can be defined. >> >> Actually the *previous* patch on vti (7263a5187f9e vti: get rid of >> nf mark rule in prerouting) introduced significant changes, and >> implies behaviors dependant on the kernel version, but it seemed to >> meet Saurabh's agreement, as the following thread witnesses: >> >> http://www.spinics.net/lists/netdev/msg253134.html >> > Getting rid of the pre-routing mark, which had to be done outside of > the vti tunnel code was prone to misconfiguration. Though it is > unfortunate that it creates a kernel version dependency. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Nov 21, 2013 at 11:07:27AM +0100, Christophe Gouault wrote: > > But you can optionally apply differentiated policies within the same > tunnel, by setting SPs with narrower selectors: according to the > plaintext traffic that crosses the tunnel, you can request to use > different protocols (esp/ah), different SAs, maybe drop some traffic. This raises the question about the MTU of a vti device. If the SA is not unique, it is not clear which MTU we should use for that device. > Only ipsec tunnel mode and drop policies should be bound to a VTI interface. > > And the patch restores the SP semantics: the selector is used to match > the plaintext traffic, not the IPsec encrypted traffic. > On the other hand, I've spend quite some time to figure out how inter address family tunneling can work with vti devices. It seems that we need plaintext matching to get this to work. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Nov 06, 2013 at 09:05:53AM +0100, Christophe Gouault wrote: > > @@ -133,7 +134,13 @@ static int vti_rcv(struct sk_buff *skb) > * only match policies with this mark. > */ > skb->mark = be32_to_cpu(tunnel->parms.o_key); > + /* The packet is decrypted, but not yet decapsulated. > + * Temporarily make network_header point to the inner header > + * for policy check. > + */ > + skb_reset_network_header(skb); > ret = xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb); If we do it like this, we would do an input policy check even for packets that should be forwarded. I think that's a bit odd. If we really change to match plaintext traffic, we should do it like Fan Du proposed. Remove the policy check here and let the further layers do the policy enforcement. All we have to do is to set the skb mark, then the lookup should match the vti policy. It is already clear that this packet was IPsec transformed when it enters vti_rcv, so deferring the policy check should be ok. > > if (skb->protocol != htons(ETH_P_IP)) > @@ -173,17 +182,35 @@ static netdev_tx_t vti_tunnel_xmit(struct sk_buff *skb, struct net_device *dev) > > tos = old_iph->tos; > > + /* SPD lookup: we must provide a dst_entry to xfrm_lookup, normally the > + * route to the final destination. However this route is a route via > + * the vti interface. Now vti interfaces typically have the NOXFRM > + * flag, hence xfrm_lookup would bypass IPsec. > + * > + * Therefore, we feed xfrm_lookup with a route to the vti tunnel remote > + * endpoint instead. > + */ > memset(&fl4, 0, sizeof(fl4)); > flowi4_init_output(&fl4, tunnel->parms.link, > be32_to_cpu(tunnel->parms.o_key), RT_TOS(tos), > RT_SCOPE_UNIVERSE, > IPPROTO_IPIP, 0, > dst, tiph->saddr, 0, 0); > - rt = ip_route_output_key(dev_net(dev), &fl4); > + rt = __ip_route_output_key(tunnel->net, &fl4); > if (IS_ERR(rt)) { > dev->stats.tx_carrier_errors++; > goto tx_error_icmp; > } > + > + memset(&fl, 0, sizeof(fl)); > + /* Temporarily mark the skb with the tunnel o_key, to look up > + * for a policy with this mark, matching the plaintext traffic. > + */ > + skb->mark = be32_to_cpu(tunnel->parms.o_key); > + __xfrm_decode_session(skb, &fl, AF_INET, 0); > + skb->mark = oldmark; > + rt = (struct rtable *)xfrm_lookup(tunnel->net, &rt->dst, &fl, NULL, 0); > + I think we should not do such a workarround for the NOXFRM case. In particular because this is not really typical, it is not the default and it is not documented that it should be like that. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Nov 19, 2013 at 05:16:34PM +0800, Fan Du wrote: > > Or the VTI tunnel is the only tunnel with this specific source/destination address > in the production deployment. Again the upper layer 4 will check the policy after > all, that's the right place to do the policy checking. > > So IMO, it's unnecessary to check policy for a net_device like VTI, actually I hold > a patch of removing the VTI policy checking due to net-next closure for the moment. > Please keep in mind that this will change the lookup from the IPsec traffic to the plaintext traffic, like Christophe proposed to do. This means that the transmit side has to be changed too. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message----- > From: Steffen Klassert [mailto:steffen.klassert@secunet.com] > Sent: Thursday, November 21, 2013 4:13 AM > To: Christophe Gouault > Cc: David S. Miller; Herbert Xu; netdev@vger.kernel.org; Saurabh Mohan; > Sergei Shtylyov; Eric Dumazet > Subject: Re: [PATCH net v3] vti: fix spd lookup: match plaintext pkt, not ipsec > pkt > > On Wed, Nov 06, 2013 at 09:05:53AM +0100, Christophe Gouault wrote: > > > > @@ -133,7 +134,13 @@ static int vti_rcv(struct sk_buff *skb) > > * only match policies with this mark. > > */ > > skb->mark = be32_to_cpu(tunnel->parms.o_key); > > + /* The packet is decrypted, but not yet decapsulated. > > + * Temporarily make network_header point to the inner > header > > + * for policy check. > > + */ > > + skb_reset_network_header(skb); > > ret = xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb); > > If we do it like this, we would do an input policy check even for > packets that should be forwarded. I think that's a bit odd. > > If we really change to match plaintext traffic, we should do > it like Fan Du proposed. Remove the policy check here and > let the further layers do the policy enforcement. All we > have to do is to set the skb mark, then the lookup should > match the vti policy. > > It is already clear that this packet was IPsec transformed > when it enters vti_rcv, so deferring the policy check should > be ok. Agreed. The right thing to do here is apply the vti mark and then let xfrm policy check match the clear packet with the vti-mark on it. The side-effect is that we'll lose the old mark. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message----- > From: Fan Du [mailto:fan.du@windriver.com] > Sent: Tuesday, November 19, 2013 1:17 AM > To: Saurabh Mohan > Cc: Christophe Gouault; Steffen Klassert; David S. Miller; Herbert Xu; > netdev@vger.kernel.org; Sergei Shtylyov; Eric Dumazet > Subject: Re: [PATCH net v3] vti: fix spd lookup: match plaintext pkt, not ipsec > pkt > > Hi, Saurabh > > On 2013年11月19日 05:38, Saurabh Mohan wrote: > > > > > >> -----Original Message----- > >> From: Christophe Gouault [mailto:christophe.gouault@6wind.com] > >> Sent: Thursday, November 07, 2013 4:56 AM > >> To: Steffen Klassert > >> Cc: David S. Miller; Herbert Xu; netdev@vger.kernel.org; Saurabh Mohan; > >> Sergei Shtylyov; Eric Dumazet > >> Subject: Re: [PATCH net v3] vti: fix spd lookup: match plaintext pkt, not > ipsec > >> pkt > >> > >> Hello Steffen, > >> > >> I am also interested in knowing Saurabh's intentions regarding the > >> behavior of policies bound to vti interfaces. > >> > > The semantics is to match the policy "src 0.0.0.0/0 dst 0.0.0.0/0 proto any" > > That is the only policy that VTI should use. The mark is needed to > > distinguish and limit the policy to a specific vti tunnel interface only. > > There is no other policy that may be applied to a vti interface. > > The fact that traffic is going over the tunnel interface implies that it > > must be encrypted/decrypted. Applying the above policy is a way > > to achieve that. > > I'm not much experienced with VTI usage practical production usage > scenario, but > I have one question about the necessity of policy checking on VTI receiving > part. > - A VTI tunnel is hashed by destination address and i_key when creating > them; > - After each tunneled IP packet delivered to vti_rcv, the first step is looking > for the right tunnel, this is done by using tunneled IP packet outer source > and > destination address without any key matching rule involved. > > If there are any other tunnel with the same source/destination address, but > not > the same mark in place, the tunnel lookup in the vti_rcv will properly not hit > VTI tunnel, but the non-VTI tunnel. So the VTI net device statistics will not be > accurate, and what's the point of checking policy for the wrong tunnel > interface? So far this is not supported. If it were needed then we'd have to use another key on the tunnel(s) to distinguish between tunnel with same src and dst. In such a case there would be two keys on the tunnel (one for vti mark and the other one to separate out tunnels with same src and dst).
On 11/21/2013 01:12 PM, Steffen Klassert wrote: > On Wed, Nov 06, 2013 at 09:05:53AM +0100, Christophe Gouault wrote: >> >> @@ -133,7 +134,13 @@ static int vti_rcv(struct sk_buff *skb) >> * only match policies with this mark. >> */ >> skb->mark = be32_to_cpu(tunnel->parms.o_key); >> + /* The packet is decrypted, but not yet decapsulated. >> + * Temporarily make network_header point to the inner header >> + * for policy check. >> + */ >> + skb_reset_network_header(skb); >> ret = xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb); > > If we do it like this, we would do an input policy check even for > packets that should be forwarded. I think that's a bit odd. Admittedly, a forward policy check would be more appropriate for forwarded traffic. > If we really change to match plaintext traffic, we should do > it like Fan Du proposed. Remove the policy check here and > let the further layers do the policy enforcement. All we > have to do is to set the skb mark, then the lookup should > match the vti policy. This solution sounds seductive, however, we must be careful because we change the skb input device (from the physical interface to the vti interface). So we are supposed to call skb_scrub_packet as is normally done when decapsulating a packet from a tunnel. This will reset the skb secpath and mark, and hence will compromise the deferred inbound policy check. > It is already clear that this packet was IPsec transformed > when it enters vti_rcv, so deferring the policy check should > be ok. I had in mind to later support cross netns in vti interfaces like for ipip tunnels (different netns for the decapsulated and encapsulated packets). With the deferred inbound policy check, the SA and SP will not be in the same netns, this will cause problems for the inbound policy check. Best Regards, Christophe -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2013年11月22日 02:39, Saurabh Mohan wrote: > > >> -----Original Message----- >> From: Fan Du [mailto:fan.du@windriver.com] >> Sent: Tuesday, November 19, 2013 1:17 AM >> To: Saurabh Mohan >> Cc: Christophe Gouault; Steffen Klassert; David S. Miller; Herbert Xu; >> netdev@vger.kernel.org; Sergei Shtylyov; Eric Dumazet >> Subject: Re: [PATCH net v3] vti: fix spd lookup: match plaintext pkt, not ipsec >> pkt >> >> Hi, Saurabh >> >> On 2013年11月19日 05:38, Saurabh Mohan wrote: >>> >>> >>>> -----Original Message----- >>>> From: Christophe Gouault [mailto:christophe.gouault@6wind.com] >>>> Sent: Thursday, November 07, 2013 4:56 AM >>>> To: Steffen Klassert >>>> Cc: David S. Miller; Herbert Xu; netdev@vger.kernel.org; Saurabh Mohan; >>>> Sergei Shtylyov; Eric Dumazet >>>> Subject: Re: [PATCH net v3] vti: fix spd lookup: match plaintext pkt, not >> ipsec >>>> pkt >>>> >>>> Hello Steffen, >>>> >>>> I am also interested in knowing Saurabh's intentions regarding the >>>> behavior of policies bound to vti interfaces. >>>> >>> The semantics is to match the policy "src 0.0.0.0/0 dst 0.0.0.0/0 proto any" >>> That is the only policy that VTI should use. The mark is needed to >>> distinguish and limit the policy to a specific vti tunnel interface only. >>> There is no other policy that may be applied to a vti interface. >>> The fact that traffic is going over the tunnel interface implies that it >>> must be encrypted/decrypted. Applying the above policy is a way >>> to achieve that. >> >> I'm not much experienced with VTI usage practical production usage >> scenario, but >> I have one question about the necessity of policy checking on VTI receiving >> part. >> - A VTI tunnel is hashed by destination address and i_key when creating >> them; >> - After each tunneled IP packet delivered to vti_rcv, the first step is looking >> for the right tunnel, this is done by using tunneled IP packet outer source >> and >> destination address without any key matching rule involved. >> >> If there are any other tunnel with the same source/destination address, but >> not >> the same mark in place, the tunnel lookup in the vti_rcv will properly not hit >> VTI tunnel, but the non-VTI tunnel. So the VTI net device statistics will not be >> accurate, and what's the point of checking policy for the wrong tunnel >> interface? > > So far this is not supported. If it were needed then we'd have to use another > key on the tunnel(s) to distinguish between tunnel with same src and dst. > In such a case there would be two keys on the tunnel (one for vti mark > and the other one to separate out tunnels with same src and dst). > That's indeed what I am pointing, one vti tunnel with mark_a, another tunnel sharing same VTI tunnel's src/dst address with only different mark_b/wildcard mark. This configuration probably cause vti_rcv using the non-VTI tunnel for the policy checking. So after b2942004fb5c9f3304b77e187b8a1977b3626c9b ("ipv4/ip_vti.c: VTI fix post-decryption forwarding"), no other non-VTI tunnel should be mingled with VTI tunnel, otherwise, the forward process will be malfunctional.
On Fri, Nov 22, 2013 at 03:33:22PM +0100, Christophe Gouault wrote: > > I had in mind to later support cross netns in vti interfaces like for > ipip tunnels (different netns for the decapsulated and encapsulated > packets). With the deferred inbound policy check, the SA and SP will not > be in the same netns, this will cause problems for the inbound policy check. > Well, I think the current vti implementation has two problems. The first is that the receive hook is at the wrong place. I've objected this already when vti was originally implemented, but my objections remained unheared. The receive hook is in the middle of the decapsulation process, some of the header pointers point still into the IPsec packet others point already into the decapsulated packet. This makes it very unflexible and proper namespace and interfamily support can't be done as it is. I think vti should register it's own receive hooks for the IPsec protocols, then we have the control over the decryption and decapsulation process. The second problem is that we missuse the gre keys to mark the packets. Currently, we can use only the o_key to mark packets because the i_key is used to do the tunnel lookup. But if we want to do the policy check for the decapsulated packet we need two keys, one to mark transmitted and one to mark received packets. This is because vti typically uses wildcard selectors, so on forwarding the output policy maches in both directions. This generates a loop where the IPsec gateways bouncing the packet back and forth until the ttl is exceeded. I'm currently testing a patchset that implements an IPsec protocol multiplexer, so that vti can register it's own receive path hooks. Further it makes the i_key usable for vti and changes the vti4 code to do the following: vti uses the IPsec protocol multiplexer to register it's own receive side hooks for ESP and AH. Vti then does the following on receive side: 1. Do an input policy check for the IPsec packet we received. This is required because this packet could be already processed by IPsec (tunnel in tunnel or a block policy is present), so an inbound policy check is needed. 2. Clean the skb to not leak informations on namespace transitions. 3. Mark the packet with the i_key. The policy and the state must match this key now. Policy and state belong to the vti namespace and policy enforcement is done at the further layers. 4. Call the generic xfrm layer to do decryption and decapsulation. 5. Wait for a callback from the xfrm layer to properly update the device statistics. On transmit side: 1. Mark the packet with the o_key. The policy and the state must match this key now. 2. Do a xfrm_lookup on the original packet with the mark applied. 3. Check if we got an IPsec route. 4. Clean the skb to not leak informations on namespace transitions. 5. Attach the dst_enty we got from the xfrm_lookup to the skb. 6. Call dst_output to do the IPsec processing. 7. Do the device statistics. I hope to get a RFC version of this patchset ready by the end of the week. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Steffen, On 12/03/2013 08:55 AM, Steffen Klassert wrote: > On Fri, Nov 22, 2013 at 03:33:22PM +0100, Christophe Gouault wrote: >> >> I had in mind to later support cross netns in vti interfaces like for >> ipip tunnels (different netns for the decapsulated and encapsulated >> packets). With the deferred inbound policy check, the SA and SP will not >> be in the same netns, this will cause problems for the inbound policy check. >> > > Well, I think the current vti implementation has two problems. > The first is that the receive hook is at the wrong place. I've > objected this already when vti was originally implemented, but > my objections remained unheared. > The receive hook is in the middle of the decapsulation process, > some of the header pointers point still into the IPsec packet > others point already into the decapsulated packet. This makes it > very unflexible and proper namespace and interfamily support can't > be done as it is. > > I think vti should register it's own receive hooks for the IPsec > protocols, then we have the control over the decryption and > decapsulation process. > > The second problem is that we missuse the gre keys to mark > the packets. Currently, we can use only the o_key to mark > packets because the i_key is used to do the tunnel lookup. > But if we want to do the policy check for the decapsulated > packet we need two keys, one to mark transmitted and one to > mark received packets. This is because vti typically uses > wildcard selectors, so on forwarding the output policy maches > in both directions. This generates a loop where the IPsec > gateways bouncing the packet back and forth until the ttl > is exceeded. > > I'm currently testing a patchset that implements an IPsec > protocol multiplexer, so that vti can register it's own > receive path hooks. Further it makes the i_key usable > for vti and changes the vti4 code to do the following: > vti uses the IPsec protocol multiplexer to register it's > own receive side hooks for ESP and AH. > > Vti then does the following on receive side: > > 1. Do an input policy check for the IPsec packet we received. > This is required because this packet could be already > processed by IPsec (tunnel in tunnel or a block policy > is present), so an inbound policy check is needed. > > 2. Clean the skb to not leak informations on namespace > transitions. > > 3. Mark the packet with the i_key. The policy and the state > must match this key now. Policy and state belong to the vti > namespace and policy enforcement is done at the further layers. > > 4. Call the generic xfrm layer to do decryption and decapsulation. > > 5. Wait for a callback from the xfrm layer to properly update > the device statistics. > > On transmit side: > > 1. Mark the packet with the o_key. The policy and the state > must match this key now. > > 2. Do a xfrm_lookup on the original packet with the mark applied. > > 3. Check if we got an IPsec route. > > 4. Clean the skb to not leak informations on namespace > transitions. > > 5. Attach the dst_enty we got from the xfrm_lookup to the skb. I am just wondering when exactly the netns transition IPsec and route lookups are performed (as far as I understand, we first need to perform the SP+SA lookup in inner netns, then change namespaces to outer netns, then perform a route lookup). But I guess the patchset will answer this question. > 6. Call dst_output to do the IPsec processing. > > 7. Do the device statistics. > > I hope to get a RFC version of this patchset ready by the > end of the week. This patchset sounds really promising. I am eagerly waiting for it. Best Regards, Christophe -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Dec 03, 2013 at 10:01:07AM +0100, Christophe Gouault wrote: > > > >On transmit side: > > > >1. Mark the packet with the o_key. The policy and the state > > must match this key now. > > > >2. Do a xfrm_lookup on the original packet with the mark applied. > > > >3. Check if we got an IPsec route. > > > >4. Clean the skb to not leak informations on namespace > > transitions. > > > >5. Attach the dst_enty we got from the xfrm_lookup to the skb. > > I am just wondering when exactly the netns transition IPsec and route > lookups are performed (as far as I understand, we first need to perform > the SP+SA lookup in inner netns, then change namespaces to outer > netns, then perform a route lookup). > The idea is to do the xfrm_lookup based on the original dst_entry that comes with the skb. xfrm_lookup then generates a bundle with the IPsec dst entries on top. Later xfrm_output pops all IPsec dst entires from the skb, such that only the original routing one remains. I did not do much with namespaces so far, but after the IPsec processing we should be in the situation as we are when we transmitting via ipip. If it works there it should work with vti too. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c index 6e87f85..2368262 100644 --- a/net/ipv4/ip_vti.c +++ b/net/ipv4/ip_vti.c @@ -126,6 +126,7 @@ static int vti_rcv(struct sk_buff *skb) if (tunnel != NULL) { struct pcpu_tstats *tstats; u32 oldmark = skb->mark; + unsigned int old_nh = skb->network_header; int ret; @@ -133,7 +134,13 @@ static int vti_rcv(struct sk_buff *skb) * only match policies with this mark. */ skb->mark = be32_to_cpu(tunnel->parms.o_key); + /* The packet is decrypted, but not yet decapsulated. + * Temporarily make network_header point to the inner header + * for policy check. + */ + skb_reset_network_header(skb); ret = xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb); + skb->network_header = old_nh; skb->mark = oldmark; if (!ret) return -1; @@ -166,6 +173,8 @@ static netdev_tx_t vti_tunnel_xmit(struct sk_buff *skb, struct net_device *dev) struct iphdr *old_iph = ip_hdr(skb); __be32 dst = tiph->daddr; struct flowi4 fl4; + struct flowi fl; + u32 oldmark = skb->mark; int err; if (skb->protocol != htons(ETH_P_IP)) @@ -173,17 +182,35 @@ static netdev_tx_t vti_tunnel_xmit(struct sk_buff *skb, struct net_device *dev) tos = old_iph->tos; + /* SPD lookup: we must provide a dst_entry to xfrm_lookup, normally the + * route to the final destination. However this route is a route via + * the vti interface. Now vti interfaces typically have the NOXFRM + * flag, hence xfrm_lookup would bypass IPsec. + * + * Therefore, we feed xfrm_lookup with a route to the vti tunnel remote + * endpoint instead. + */ memset(&fl4, 0, sizeof(fl4)); flowi4_init_output(&fl4, tunnel->parms.link, be32_to_cpu(tunnel->parms.o_key), RT_TOS(tos), RT_SCOPE_UNIVERSE, IPPROTO_IPIP, 0, dst, tiph->saddr, 0, 0); - rt = ip_route_output_key(dev_net(dev), &fl4); + rt = __ip_route_output_key(tunnel->net, &fl4); if (IS_ERR(rt)) { dev->stats.tx_carrier_errors++; goto tx_error_icmp; } + + memset(&fl, 0, sizeof(fl)); + /* Temporarily mark the skb with the tunnel o_key, to look up + * for a policy with this mark, matching the plaintext traffic. + */ + skb->mark = be32_to_cpu(tunnel->parms.o_key); + __xfrm_decode_session(skb, &fl, AF_INET, 0); + skb->mark = oldmark; + rt = (struct rtable *)xfrm_lookup(tunnel->net, &rt->dst, &fl, NULL, 0); + /* if there is no transform then this tunnel is not functional. * Or if the xfrm is not mode tunnel. */
The vti interface inbound and outbound SPD lookups are based on the ipsec packet instead of the plaintext packet. Not only is it counterintuitive, it also restricts vti interfaces to a single policy (whose selector must match the tunnel local and remote addresses). The policy selector is supposed to match the plaintext packet, before encryption or after decryption. This patch performs the SPD lookup based on the plaintext packet. It enables to create several polices bound to the vti interface (via a mark equal to the vti interface okey). It remains possible to apply the same policy to all packets entering the vti interface, by setting an any-to-any selector (src 0.0.0.0/0 dst 0.0.0.0/0 proto any mark OKEY). Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com> --- v2: - Fixed comment style - Checked with checkpatch.pl and sparse v3: - vti_rcv: optimized skb network header shift and restore - Checked with checkpatch.pl and sparse --- net/ipv4/ip_vti.c | 29 ++++++++++++++++++++++++++++- 1 file changed, 28 insertions(+), 1 deletion(-)