Message ID | 20160718045025.GA2490@penelope.isobedori.kobe.vergenet.net |
---|---|
State | Not Applicable |
Headers | show |
On Sun, Jul 17, 2016 at 9:50 PM, Simon Horman <simon.horman@netronome.com> wrote: > [CC Jiri Benc for portion regarding GRE] > > Hi Pravin, > > On Fri, Jul 15, 2016 at 02:07:37PM -0700, pravin shelar wrote: >> On Wed, Jul 13, 2016 at 12:31 AM, Simon Horman >> <simon.horman@netronome.com> wrote: >> > Hi Pravin, >> > >> > On Thu, Jul 07, 2016 at 01:54:15PM -0700, pravin shelar wrote: >> >> On Wed, Jul 6, 2016 at 10:59 AM, Simon Horman >> >> <simon.horman@netronome.com> wrote: >> > >> > ... >> >> > >> >> > diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c >> >> > index 0ea128eeeab2..86f2cfb19de3 100644 >> >> > --- a/net/openvswitch/flow.c >> >> > +++ b/net/openvswitch/flow.c >> >> ... >> >> >> >> > @@ -723,9 +729,17 @@ int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info, >> >> > key->phy.skb_mark = skb->mark; >> >> > ovs_ct_fill_key(skb, key); >> >> > key->ovs_flow_hash = 0; >> >> > + key->phy.is_layer3 = skb->mac_len == 0; >> >> >> >> I do not think mac_len can be used. mac_header needs to be checked. >> >> ... >> > >> > Yes, indeed. The update to use skb_mac_header_was_set() here accidently >> > slipped into the following patch, sorry about that. >> > >> > With that change in place I believe that this patch is internally >> > consistent because mac_header and mac_len are set correctly by the >> > call to key_extract() which is called by ovs_flow_key_extract() just >> > after where the excerpt above ends. >> > >> > That said, I do think that it is possible to rely on skb_mac_header_was_set >> > throughout the datapath, including action processing etc... I have provided >> > an incremental patch - which I created on top of this entire series - at >> > the end of this email. If you prefer that approach I am happy to take it, >> > though I do feel that using mac_len leads to slightly cleaner code. Let me >> > know what you think. >> > >> >> >> I am not sure if you can use only mac_len to detect L3 packet. This >> does not work with MPLS packets, mac_len is used to account MPLS >> headers pushed on skb. Therefore in case of a MPLS header on L3 >> packet, mac_len would be non zero and we have to look at either >> mac_header or some other metadata like is_layer3 flag from key to >> check for L3 packet. > > At least within OvS mac_len does not include the length of the MPLS label > stack. Rather, the MPLS label stack length is the difference between the > end of (mac_header + mac_len) and network_header. > > So I think that the scheme does work as mac_len is 0 if there is no L2 > header regardless of if an MPLS label stack is present or not. > I was thinking in overall networking stack rather than just ovs datapath. I think we should have consistent method of detecting L3 packet. As commented in previous mail it could be achieved using skb-protocol and device type.
On Mon, Jul 18, 2016 at 03:34:52PM -0700, pravin shelar wrote: > On Sun, Jul 17, 2016 at 9:50 PM, Simon Horman > <simon.horman@netronome.com> wrote: > > [CC Jiri Benc for portion regarding GRE] > > > > Hi Pravin, > > > > On Fri, Jul 15, 2016 at 02:07:37PM -0700, pravin shelar wrote: > >> On Wed, Jul 13, 2016 at 12:31 AM, Simon Horman > >> <simon.horman@netronome.com> wrote: > >> > Hi Pravin, > >> > > >> > On Thu, Jul 07, 2016 at 01:54:15PM -0700, pravin shelar wrote: > >> >> On Wed, Jul 6, 2016 at 10:59 AM, Simon Horman > >> >> <simon.horman@netronome.com> wrote: > >> > > >> > ... > >> > >> > > >> >> > diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c > >> >> > index 0ea128eeeab2..86f2cfb19de3 100644 > >> >> > --- a/net/openvswitch/flow.c > >> >> > +++ b/net/openvswitch/flow.c > >> >> ... > >> >> > >> >> > @@ -723,9 +729,17 @@ int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info, > >> >> > key->phy.skb_mark = skb->mark; > >> >> > ovs_ct_fill_key(skb, key); > >> >> > key->ovs_flow_hash = 0; > >> >> > + key->phy.is_layer3 = skb->mac_len == 0; > >> >> > >> >> I do not think mac_len can be used. mac_header needs to be checked. > >> >> ... > >> > > >> > Yes, indeed. The update to use skb_mac_header_was_set() here accidently > >> > slipped into the following patch, sorry about that. > >> > > >> > With that change in place I believe that this patch is internally > >> > consistent because mac_header and mac_len are set correctly by the > >> > call to key_extract() which is called by ovs_flow_key_extract() just > >> > after where the excerpt above ends. > >> > > >> > That said, I do think that it is possible to rely on skb_mac_header_was_set > >> > throughout the datapath, including action processing etc... I have provided > >> > an incremental patch - which I created on top of this entire series - at > >> > the end of this email. If you prefer that approach I am happy to take it, > >> > though I do feel that using mac_len leads to slightly cleaner code. Let me > >> > know what you think. > >> > > >> > >> > >> I am not sure if you can use only mac_len to detect L3 packet. This > >> does not work with MPLS packets, mac_len is used to account MPLS > >> headers pushed on skb. Therefore in case of a MPLS header on L3 > >> packet, mac_len would be non zero and we have to look at either > >> mac_header or some other metadata like is_layer3 flag from key to > >> check for L3 packet. > > > > At least within OvS mac_len does not include the length of the MPLS label > > stack. Rather, the MPLS label stack length is the difference between the > > end of (mac_header + mac_len) and network_header. > > > > So I think that the scheme does work as mac_len is 0 if there is no L2 > > header regardless of if an MPLS label stack is present or not. > > > > I was thinking in overall networking stack rather than just ovs > datapath. I think we should have consistent method of detecting L3 > packet. As commented in previous mail it could be achieved using > skb-protocol and device type. This is somewhat of a surprise to me. As far as I recall when MPLS support was added to OvS it and the accompanying support for MPLS GSO was the only MPLS support present in the kernel. And at the time the scheme developed by Jesse Gross, myself and others was as I describe above. Internally OvS relies on this scheme and in particular it is used by skb_mpls_header() to calculate the beginning of the MPLS label stack accurately in the presence of VLAN tags. Is it mpls_gso_segment() that you are concerned about? If so, perhaps the problem could be addressed there.
On Tue, Jul 19, 2016 at 5:02 PM, Simon Horman <simon.horman@netronome.com> wrote: > On Mon, Jul 18, 2016 at 03:34:52PM -0700, pravin shelar wrote: >> On Sun, Jul 17, 2016 at 9:50 PM, Simon Horman >> <simon.horman@netronome.com> wrote: >> > [CC Jiri Benc for portion regarding GRE] >> > >> > Hi Pravin, >> > >> > On Fri, Jul 15, 2016 at 02:07:37PM -0700, pravin shelar wrote: >> >> On Wed, Jul 13, 2016 at 12:31 AM, Simon Horman >> >> <simon.horman@netronome.com> wrote: >> >> > Hi Pravin, >> >> > >> >> > On Thu, Jul 07, 2016 at 01:54:15PM -0700, pravin shelar wrote: >> >> >> On Wed, Jul 6, 2016 at 10:59 AM, Simon Horman >> >> >> <simon.horman@netronome.com> wrote: >> >> > >> >> > ... >> >> >> >> > >> >> >> > diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c >> >> >> > index 0ea128eeeab2..86f2cfb19de3 100644 >> >> >> > --- a/net/openvswitch/flow.c >> >> >> > +++ b/net/openvswitch/flow.c >> >> >> ... >> >> >> >> >> >> > @@ -723,9 +729,17 @@ int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info, >> >> >> > key->phy.skb_mark = skb->mark; >> >> >> > ovs_ct_fill_key(skb, key); >> >> >> > key->ovs_flow_hash = 0; >> >> >> > + key->phy.is_layer3 = skb->mac_len == 0; >> >> >> >> >> >> I do not think mac_len can be used. mac_header needs to be checked. >> >> >> ... >> >> > >> >> > Yes, indeed. The update to use skb_mac_header_was_set() here accidently >> >> > slipped into the following patch, sorry about that. >> >> > >> >> > With that change in place I believe that this patch is internally >> >> > consistent because mac_header and mac_len are set correctly by the >> >> > call to key_extract() which is called by ovs_flow_key_extract() just >> >> > after where the excerpt above ends. >> >> > >> >> > That said, I do think that it is possible to rely on skb_mac_header_was_set >> >> > throughout the datapath, including action processing etc... I have provided >> >> > an incremental patch - which I created on top of this entire series - at >> >> > the end of this email. If you prefer that approach I am happy to take it, >> >> > though I do feel that using mac_len leads to slightly cleaner code. Let me >> >> > know what you think. >> >> > >> >> >> >> >> >> I am not sure if you can use only mac_len to detect L3 packet. This >> >> does not work with MPLS packets, mac_len is used to account MPLS >> >> headers pushed on skb. Therefore in case of a MPLS header on L3 >> >> packet, mac_len would be non zero and we have to look at either >> >> mac_header or some other metadata like is_layer3 flag from key to >> >> check for L3 packet. >> > >> > At least within OvS mac_len does not include the length of the MPLS label >> > stack. Rather, the MPLS label stack length is the difference between the >> > end of (mac_header + mac_len) and network_header. >> > >> > So I think that the scheme does work as mac_len is 0 if there is no L2 >> > header regardless of if an MPLS label stack is present or not. >> > >> >> I was thinking in overall networking stack rather than just ovs >> datapath. I think we should have consistent method of detecting L3 >> packet. As commented in previous mail it could be achieved using >> skb-protocol and device type. > > This is somewhat of a surprise to me. As far as I recall when MPLS support > was added to OvS it and the accompanying support for MPLS GSO was the only > MPLS support present in the kernel. And at the time the scheme developed by > Jesse Gross, myself and others was as I describe above. > > Internally OvS relies on this scheme and in particular it is used > by skb_mpls_header() to calculate the beginning of the MPLS label stack > accurately in the presence of VLAN tags. > > Is it mpls_gso_segment() that you are concerned about? > If so, perhaps the problem could be addressed there. Yes. Can you read the comment I made in previous main in context of function skb_mpls_header(). I have given rational for requested change.
On Mon, 18 Jul 2016 13:50:27 +0900, Simon Horman wrote: > On Fri, Jul 15, 2016 at 02:07:37PM -0700, pravin shelar wrote: > > I think we should send L2 header with l2 header pushed on skb. This is > > what OVS expect. The skb-push should be done for all l2 packets rather > > than for particular type of device. > > The following seems to achieve that. > Jiri, what do you think? > > diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c > index a20248355da0..edbc10690b60 100644 > --- a/net/ipv4/ip_gre.c > +++ b/net/ipv4/ip_gre.c > @@ -281,10 +281,9 @@ static int __ipgre_rcv(struct sk_buff *skb, const struct tnl_ptk_info *tpi, > raw_proto, false) < 0) > goto drop; > > - if (tunnel->dev->type != ARPHRD_NONE) > + if (tunnel->dev->type != ARPHRD_NONE || > + tpi->proto == htons(ETH_P_TEB)) > skb_pop_mac_header(skb); This is wrong. The MAC header for ARPHRD_NONE interfaces is null, that's the meaning of ARPHRD_NONE. mac_header cannot point to the outer IP header. That would be ARPHRD_IPGRE. Jiri
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c index a20248355da0..edbc10690b60 100644 --- a/net/ipv4/ip_gre.c +++ b/net/ipv4/ip_gre.c @@ -281,10 +281,9 @@ static int __ipgre_rcv(struct sk_buff *skb, const struct tnl_ptk_info *tpi, raw_proto, false) < 0) goto drop; - if (tunnel->dev->type != ARPHRD_NONE) + if (tunnel->dev->type != ARPHRD_NONE || + tpi->proto == htons(ETH_P_TEB)) skb_pop_mac_header(skb); - else if (tpi->proto != htons(ETH_P_TEB)) - skb_unset_mac_header(skb); else skb_reset_mac_header(skb); if (tunnel->collect_md) { @@ -326,7 +325,7 @@ static int ipgre_rcv(struct sk_buff *skb, const struct tnl_ptk_info *tpi, * also ETH_P_TEB traffic. */ itn = net_generic(net, ipgre_net_id); - res = __ipgre_rcv(skb, tpi, itn, hdr_len, true); + res = __ipgre_rcv(skb, tpi, itn, hdr_len, false); } return res; } diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c index 05985209f611..933ac46db53a 100644 --- a/net/openvswitch/vport-netdev.c +++ b/net/openvswitch/vport-netdev.c @@ -57,7 +57,8 @@ static void netdev_port_receive(struct sk_buff *skb) if (unlikely(!skb)) return; - if (vport->dev->type == ARPHRD_ETHER) { + if (vport->dev->type != ARPHRD_NONE || + skb->protocol == htons(ETH_P_TEB)) { skb_push(skb, ETH_HLEN); skb_postpush_rcsum(skb, skb->data, ETH_HLEN); }