From patchwork Mon Jul 18 04:50:27 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Simon Horman X-Patchwork-Id: 649335 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from archives.nicira.com (archives.nicira.com [96.126.127.54]) by ozlabs.org (Postfix) with ESMTP id 3rt9jC6Stvz9s4q for ; Mon, 18 Jul 2016 14:50:39 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=netronome-com.20150623.gappssmtp.com header.i=@netronome-com.20150623.gappssmtp.com header.b=orCZz8iB; dkim-atps=neutral Received: from archives.nicira.com (localhost [127.0.0.1]) by archives.nicira.com (Postfix) with ESMTP id F3F3B1071E; Sun, 17 Jul 2016 21:50:38 -0700 (PDT) X-Original-To: dev@openvswitch.org Delivered-To: dev@openvswitch.org Received: from mx3v3.cudamail.com (mx3.cudamail.com [64.34.241.5]) by archives.nicira.com (Postfix) with ESMTPS id BBC7F106BF for ; Sun, 17 Jul 2016 21:50:37 -0700 (PDT) Received: from bar6.cudamail.com (localhost [127.0.0.1]) by mx3v3.cudamail.com (Postfix) with ESMTPS id 50640162C38 for ; Sun, 17 Jul 2016 22:50:37 -0600 (MDT) X-ASG-Debug-ID: 1468817434-0b32372d9722500001-byXFYA Received: from mx1-pf1.cudamail.com ([192.168.24.1]) by bar6.cudamail.com with ESMTP id C54SYv0KuyqGWvkI (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO) for ; Sun, 17 Jul 2016 22:50:34 -0600 (MDT) X-Barracuda-Envelope-From: simon.horman@netronome.com X-Barracuda-RBL-Trusted-Forwarder: 192.168.24.1 Received: from unknown (HELO mail-pf0-f181.google.com) (209.85.192.181) by mx1-pf1.cudamail.com with ESMTPS (AES128-SHA encrypted); 18 Jul 2016 04:50:34 -0000 Received-SPF: neutral (mx1-pf1.cudamail.com: 209.85.192.181 is neither permitted nor denied by SPF record at mktomail.com) X-Barracuda-RBL-Trusted-Forwarder: 209.85.192.181 Received: by mail-pf0-f181.google.com with SMTP id h186so8367223pfg.3 for ; Sun, 17 Jul 2016 21:50:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=netronome-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=h48H5ud+dQceEnIkgA/iduyVvY/n8s04RKYsTc/iJa8=; b=orCZz8iBqWWBha/KIj7gLTPGY+0TbIdg2OxJCqi96XgvOvKNVCTSOWbla0aB5C8Tv1 C2Tst1WNLlc/9GX5kFBAMYaNlFO18RkZnl29QNzRLzn7mPF5520BYEFC4M/Qede4R8s2 hbaEyXtqUTcKUPAvevD8fuWWV9dL+0YbV0tF0Ek1cHhCFEKKkVQrB0wOTvA/ikZlVR4u qaLk0sFd8tbqjoBqUv9e36GG19fwfAR21s9LGBMVe++UIpDwE6YZYNGD5gL7ACfUf/Sd bg6uN5VCEhesWgpTto7jbotpu0RqGVdLVBvayXGcGD/jiRDGfWngV2ou5zT9VRsFxKJ0 nZAA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=h48H5ud+dQceEnIkgA/iduyVvY/n8s04RKYsTc/iJa8=; b=l0kkanolvzPEZpoh9F4sgOGR6HoBgoU06k57Z7tkiPdx9+vsp1TXAaKNV6v/b0Oeks FesBBrJ8fsfJKihpWh2XNw/tBqz4y9KPABPO4ELBd89KuPugKTt8bSqj+Mxojol0E4rZ W8uCO3mPljEri2alyVclULaoGw5sVfJZnnH23ggZYzXwOYAwbhX8MSzdCfO0lPdm8WGt JLalrmPw0aKXS+RlgrdX6dETior0amzOw7DYCsV4Qu2MCofqMCy5mp8QdyXO5W5F1BJo abQXTWN6Fj+eDPv9doZ3kkakFsVZFWGJsPQg5QV/Z7XId8Xx4QizjKKKnB80xRdtxRFa HxbA== X-Gm-Message-State: ALyK8tJoJKloPD3LEpjD32K0VhHYG8lQPyIcFCvMhXw/TmKyrIyCaDxkgPrW4cxOhYeoMbnS X-Received: by 10.98.9.194 with SMTP id 63mr43251356pfj.56.1468817432783; Sun, 17 Jul 2016 21:50:32 -0700 (PDT) Received: from penelope.isobedori.kobe.vergenet.net (p6047-ipbfp602kobeminato.hyogo.ocn.ne.jp. [118.10.78.47]) by smtp.gmail.com with ESMTPSA id ln4sm453347pab.38.2016.07.17.21.50.30 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 17 Jul 2016 21:50:31 -0700 (PDT) Date: Mon, 18 Jul 2016 13:50:27 +0900 X-Barracuda-Apparent-Source-IP: 118.10.78.47 X-CudaMail-Envelope-Sender: simon.horman@netronome.com From: Simon Horman To: pravin shelar X-CudaMail-MID: CM-E1-716025182 X-CudaMail-DTE: 071716 X-CudaMail-Originating-IP: 209.85.192.181 Message-ID: <20160718045025.GA2490@penelope.isobedori.kobe.vergenet.net> X-ASG-Orig-Subj: [##CM-E1-716025182##]Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support References: <1467827996-32547-1-git-send-email-simon.horman@netronome.com> <1467827996-32547-6-git-send-email-simon.horman@netronome.com> <20160713073152.GC29661@penelope.isobedori.kobe.vergenet.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) X-Barracuda-Connect: UNKNOWN[192.168.24.1] X-Barracuda-Start-Time: 1468817434 X-Barracuda-Encrypted: ECDHE-RSA-AES256-GCM-SHA384 X-Barracuda-URL: https://web.cudamail.com:443/cgi-mod/mark.cgi X-Virus-Scanned: by bsmtpd at cudamail.com X-Barracuda-BRTS-Status: 1 X-Barracuda-Spam-Score: 0.10 X-Barracuda-Spam-Status: No, SCORE=0.10 using global scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=4.0 tests=BSF_SC0_MISMATCH_TO, DKIM_SIGNED, RDNS_NONE X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.3.31338 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.00 BSF_SC0_MISMATCH_TO Envelope rcpt doesn't match header 0.00 DKIM_SIGNED Domain Keys Identified Mail: message has a signature 0.10 RDNS_NONE Delivered to trusted network by a host with no rDNS Cc: ovs dev , Linux Kernel Network Developers , Jiri Benc Subject: Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support X-BeenThere: dev@openvswitch.org X-Mailman-Version: 2.1.16 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@openvswitch.org Sender: "dev" [CC Jiri Benc for portion regarding GRE] Hi Pravin, On Fri, Jul 15, 2016 at 02:07:37PM -0700, pravin shelar wrote: > On Wed, Jul 13, 2016 at 12:31 AM, Simon Horman > wrote: > > Hi Pravin, > > > > On Thu, Jul 07, 2016 at 01:54:15PM -0700, pravin shelar wrote: > >> On Wed, Jul 6, 2016 at 10:59 AM, Simon Horman > >> wrote: > > > > ... > > > > >> > diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c > >> > index 0ea128eeeab2..86f2cfb19de3 100644 > >> > --- a/net/openvswitch/flow.c > >> > +++ b/net/openvswitch/flow.c > >> ... > >> > >> > @@ -723,9 +729,17 @@ int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info, > >> > key->phy.skb_mark = skb->mark; > >> > ovs_ct_fill_key(skb, key); > >> > key->ovs_flow_hash = 0; > >> > + key->phy.is_layer3 = skb->mac_len == 0; > >> > >> I do not think mac_len can be used. mac_header needs to be checked. > >> ... > > > > Yes, indeed. The update to use skb_mac_header_was_set() here accidently > > slipped into the following patch, sorry about that. > > > > With that change in place I believe that this patch is internally > > consistent because mac_header and mac_len are set correctly by the > > call to key_extract() which is called by ovs_flow_key_extract() just > > after where the excerpt above ends. > > > > That said, I do think that it is possible to rely on skb_mac_header_was_set > > throughout the datapath, including action processing etc... I have provided > > an incremental patch - which I created on top of this entire series - at > > the end of this email. If you prefer that approach I am happy to take it, > > though I do feel that using mac_len leads to slightly cleaner code. Let me > > know what you think. > > > > > I am not sure if you can use only mac_len to detect L3 packet. This > does not work with MPLS packets, mac_len is used to account MPLS > headers pushed on skb. Therefore in case of a MPLS header on L3 > packet, mac_len would be non zero and we have to look at either > mac_header or some other metadata like is_layer3 flag from key to > check for L3 packet. At least within OvS mac_len does not include the length of the MPLS label stack. Rather, the MPLS label stack length is the difference between the end of (mac_header + mac_len) and network_header. So I think that the scheme does work as mac_len is 0 if there is no L2 header regardless of if an MPLS label stack is present or not. > >> > diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c > >> > index 4e3972344aa6..733e7914f6bd 100644 > >> > --- a/net/openvswitch/vport-netdev.c > >> > +++ b/net/openvswitch/vport-netdev.c > >> > @@ -57,8 +57,10 @@ static void netdev_port_receive(struct sk_buff *skb) > >> > if (unlikely(!skb)) > >> > return; > >> > > >> > - skb_push(skb, ETH_HLEN); > >> > - skb_postpush_rcsum(skb, skb->data, ETH_HLEN); > >> > + if (vport->dev->type == ARPHRD_ETHER) { > >> > + skb_push(skb, ETH_HLEN); > >> > + skb_postpush_rcsum(skb, skb->data, ETH_HLEN); > >> > + } > >> This is still required for tunnel device of ARPHRD_NONE which can > >> handle l2 packets. > > > > That is not necessary given the current implementation (of ipgre) as it > > supplies an skb with the mac header in place if the inner packet was an > > Ethernet packet. This scheme could of course be adjusted. > > > > ... > > > > I think we should send L2 header with l2 header pushed on skb. This is > what OVS expect. The skb-push should be done for all l2 packets rather > than for particular type of device. The following seems to achieve that. Jiri, what do you think? > > Update to use skb_mac_header_was_set() more as mentioned above. > > Please let me know what you think about this approach. > > > > include/net/mpls.h | 4 ++- > > net/openvswitch/actions.c | 42 ++++++++++++++++++++--------------- > > net/openvswitch/flow.c | 23 +++++++++++-------- > > net/openvswitch/vport-internal_dev.c | 2 - > > net/openvswitch/vport-netdev.c | 4 +-- > > 5 files changed, 44 insertions(+), 31 deletions(-) > > > > diff --git a/include/net/mpls.h b/include/net/mpls.h > > index 5b3b5addfb08..296b68661be0 100644 > > --- a/include/net/mpls.h > > +++ b/include/net/mpls.h > > @@ -34,6 +34,8 @@ static inline bool eth_p_mpls(__be16 eth_type) > > */ > > static inline unsigned char *skb_mpls_header(struct sk_buff *skb) > > { > > - return skb_mac_header(skb) + skb->mac_len; > > + return skb_mac_header_was_set(skb) ? > > + skb_mac_header(skb) + skb->mac_len : > > + skb->data; > > } > > This function is also called from GSO layer. issue is in GSO layer, it > does reset mac header and mac length and then calls mpls-gso-handler. > So all subsequent check for L3 packet fails. > So far we have explored three different ways to detect L3 packet but > each has its own issue. > 1. skb mac header : GSO can reset mac header. > 2. skb mac length : MPLS uses mac_len to account for MPLS header > length along with L2 header > 3. skb protocol: ETH_P_TEB is not set for all L2 frames, networking > stack is not ready to handle this type for given skb. > > So none of them works consistently. I think the only option to detect > L3 packet reliably (and without adding field to skb) is to use > skb-protocol along with ARPHRD_NONE device type. If ARPHRD_NONE type > device generates L2 packet it needs to set protocol to ETH_P_TEB. Some > networking stack function also needs to be fixed to handle this > protocol type, e.g. vlan_get_protocol(), br_dev_queue_push_xmit(), > etc. diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c index a20248355da0..edbc10690b60 100644 --- a/net/ipv4/ip_gre.c +++ b/net/ipv4/ip_gre.c @@ -281,10 +281,9 @@ static int __ipgre_rcv(struct sk_buff *skb, const struct tnl_ptk_info *tpi, raw_proto, false) < 0) goto drop; - if (tunnel->dev->type != ARPHRD_NONE) + if (tunnel->dev->type != ARPHRD_NONE || + tpi->proto == htons(ETH_P_TEB)) skb_pop_mac_header(skb); - else if (tpi->proto != htons(ETH_P_TEB)) - skb_unset_mac_header(skb); else skb_reset_mac_header(skb); if (tunnel->collect_md) { @@ -326,7 +325,7 @@ static int ipgre_rcv(struct sk_buff *skb, const struct tnl_ptk_info *tpi, * also ETH_P_TEB traffic. */ itn = net_generic(net, ipgre_net_id); - res = __ipgre_rcv(skb, tpi, itn, hdr_len, true); + res = __ipgre_rcv(skb, tpi, itn, hdr_len, false); } return res; } diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c index 05985209f611..933ac46db53a 100644 --- a/net/openvswitch/vport-netdev.c +++ b/net/openvswitch/vport-netdev.c @@ -57,7 +57,8 @@ static void netdev_port_receive(struct sk_buff *skb) if (unlikely(!skb)) return; - if (vport->dev->type == ARPHRD_ETHER) { + if (vport->dev->type != ARPHRD_NONE || + skb->protocol == htons(ETH_P_TEB)) { skb_push(skb, ETH_HLEN); skb_postpush_rcsum(skb, skb->data, ETH_HLEN); }