Message ID | 20160520052858.GA15505@vergenet.net |
---|---|
State | Not Applicable |
Headers | show |
On Fri, 20 May 2016 14:29:01 +0900, Simon Horman wrote: > The second option does seem rather tempting although I'm not sure > that it actually plays out in the access-port scenario at this time. We support gre ports to be access ports currently. With conversion to ipgre, this needs to continue working. It's no problem for frames with the Ethernet header but now we have a situation where a port is tagged, thus the user expects that packets received on that port will behave accordingly. I don't think we can make some packets honor this and some ignore this; and we can't disallow gre to be an access port. How do you plan to solve this? By user space always pushing an ethernet header before push_vlan? Jiri
On Fri, May 20, 2016 at 10:00:28AM +0200, Jiri Benc wrote: > On Fri, 20 May 2016 14:29:01 +0900, Simon Horman wrote: > > The second option does seem rather tempting although I'm not sure > > that it actually plays out in the access-port scenario at this time. > > We support gre ports to be access ports currently. With conversion to > ipgre, this needs to continue working. It's no problem for frames with > the Ethernet header but now we have a situation where a port is tagged, > thus the user expects that packets received on that port will behave > accordingly. I don't think we can make some packets honor this and some > ignore this; and we can't disallow gre to be an access port. > > How do you plan to solve this? By user space always pushing an ethernet > header before push_vlan? Yes. That is my understanding of how OvS currently handles access ports but I have a feeling that either I am mistaken or that you are referring to a slightly different scenario.
On Fri, May 20, 2016 at 05:11:23PM +0900, Simon Horman wrote: > On Fri, May 20, 2016 at 10:00:28AM +0200, Jiri Benc wrote: > > On Fri, 20 May 2016 14:29:01 +0900, Simon Horman wrote: > > > The second option does seem rather tempting although I'm not sure > > > that it actually plays out in the access-port scenario at this time. > > > > We support gre ports to be access ports currently. With conversion to > > ipgre, this needs to continue working. It's no problem for frames with > > the Ethernet header but now we have a situation where a port is tagged, > > thus the user expects that packets received on that port will behave > > accordingly. I don't think we can make some packets honor this and some > > ignore this; and we can't disallow gre to be an access port. > > > > How do you plan to solve this? By user space always pushing an ethernet > > header before push_vlan? > > Yes. That is my understanding of how OvS currently handles access ports but > I have a feeling that either I am mistaken or that you are referring to a > slightly different scenario. Hi again. I apologise for having sent my previous email a little too quickly. My understanding is that currently OvS handles access ports using a push_vlan action. And that this patch set in conjunction with its user-space counterpart should ensure that a push_eth action occurs first. This is the context of my remarks above.
On Fri, 20 May 2016 17:16:13 +0900, Simon Horman wrote: > My understanding is that currently OvS handles access ports using a > push_vlan action. When needed (i.e. when the packet goes to a non-access port), yes. > And that this patch set in conjunction with its > user-space counterpart should ensure that a push_eth action occurs first. > This is the context of my remarks above. Okay, works for me principally. If it's later found insufficient, relaxing push_vlan and pop_vlan to work also for L3 frames is still easily possible without breaking compatibility. Out of curiosity (and without looking at the user space patchset), what will the pushed Ethernet header contain? E.g., consider the following scenario: two GRE ports, both set as access ports with the same tag, and a mirror port mirroring everything. Now an IP packet without inner Ethernet header is received on one of the GRE interfaces. Will the packet be output to the second GRE port? Will it be sent out without the inner Ethernet header? Are custom rules necessary, or will NORMAL take care of this? What will be sent to the mirror port? Thanks! Jiri
On Fri, May 20, 2016 at 10:39:39AM +0200, Jiri Benc wrote: > On Fri, 20 May 2016 17:16:13 +0900, Simon Horman wrote: > > My understanding is that currently OvS handles access ports using a > > push_vlan action. > > When needed (i.e. when the packet goes to a non-access port), yes. > > > And that this patch set in conjunction with its > > user-space counterpart should ensure that a push_eth action occurs first. > > This is the context of my remarks above. > > Okay, works for me principally. If it's later found insufficient, > relaxing push_vlan and pop_vlan to work also for L3 frames is still > easily possible without breaking compatibility. Right. I'm all for allowing extension later if the need arises. > Out of curiosity (and without looking at the user space patchset), what > will the pushed Ethernet header contain? E.g., consider the following > scenario: two GRE ports, both set as access ports with the same tag, > and a mirror port mirroring everything. Now an IP packet without inner > Ethernet header is received on one of the GRE interfaces. > > Will the packet be output to the second GRE port? Will it be sent out > without the inner Ethernet header? Are custom rules necessary, or will > NORMAL take care of this? What will be sent to the mirror port? Let me take a stab at answering that without running any tests. 1. push_eth adds an Ethernet header with all-zero addresses and the Ethernet type as determined from skb->protocol which is in turn determined by the tunnel header (we have discussed that bit before). In principle it is pushed when needed. And this happens automatically as controlled by user-space. It is possible to modify the Ethernet addresses using a custom rule. (I need to exercise that more often.) 2. For the GRE part of the scenario above it is important to know that with the accompanying user-space patch set OvS user-space the user-space representation of a vport (from now on simply vport) may be "layer3" or not. This allows OvS user-space to determine if an Ethernet header should be present or not on receive. And if it needs to be present or not on transmit. This allows it to automatically use pop_eth and push_eth to control the presence of an Ethernet header so its there when it needs to be and not when it doesn't. So if a GRE vport is "layer3" then no Ethernet header should be present on transmit, regardless of where the packet came from. And conversely if the GRE vport is not "layer3" then an Ethernet header should be present. 3. With regards to the mirroring part of your connection, I need to check on that and possibly its broken. But my thinking is that a mirroring vport would regarded in the same way as any other vport in this respect and the presence of the "layer3" flag would control whether an Ethernet header is present or not. It may be the case that its not possible to use a tunnel vport as a mirroring vport. And as all other vports currently do not support "layer3" then currently an Ethernet header would always be present on output to a mirror.
On Fri, 20 May 2016 18:12:05 +0900, Simon Horman wrote: > 1. push_eth adds an Ethernet header with all-zero addresses and > the Ethernet type as determined from skb->protocol which is in > turn determined by the tunnel header (we have discussed that > bit before). > > In principle it is pushed when needed. And this happens automatically > as controlled by user-space. > > It is possible to modify the Ethernet addresses using a custom rule. > (I need to exercise that more often.) > > 2. For the GRE part of the scenario above it is important to know that with > the accompanying user-space patch set OvS user-space the user-space > representation of a vport (from now on simply vport) may be "layer3" or > not. > > This allows OvS user-space to determine if an Ethernet header should be > present or not on receive. And if it needs to be present or not on > transmit. This allows it to automatically use pop_eth and push_eth to > control the presence of an Ethernet header so its there when it needs to > be and not when it doesn't. > > So if a GRE vport is "layer3" then no Ethernet header should be > present on transmit, regardless of where the packet came from. And > conversely if the GRE vport is not "layer3" then an Ethernet header > should be present. I think this works for me. Thanks a lot for answering my questions! > 3. With regards to the mirroring part of your connection, I need to check > on that and possibly its broken. But my thinking is that a mirroring > vport would regarded in the same way as any other vport in this respect > and the presence of the "layer3" flag would control whether an Ethernet > header is present or not. > > It may be the case that its not possible to use a tunnel vport as a > mirroring vport. And as all other vports currently do not support > "layer3" then currently an Ethernet header would always be present > on output to a mirror. We can just require a mirror port to be always L2 and output the L3 packets with zero Ethernet addresses. For mirroring, this should be okay(?) Jiri
On Fri, May 20, 2016 at 11:20:04AM +0200, Jiri Benc wrote: > On Fri, 20 May 2016 18:12:05 +0900, Simon Horman wrote: [...] > > 3. With regards to the mirroring part of your question, I need to check > > on that and possibly its broken. But my thinking is that a mirroring > > vport would regarded in the same way as any other vport in this respect > > and the presence of the "layer3" flag would control whether an Ethernet > > header is present or not. > > > > It may be the case that its not possible to use a tunnel vport as a > > mirroring vport. And as all other vports currently do not support > > "layer3" then currently an Ethernet header would always be present > > on output to a mirror. > > We can just require a mirror port to be always L2 and output the L3 > packets with zero Ethernet addresses. For mirroring, this should be > okay(?) Yes, I expect that is a good approach. I'll look over the code to see about making that so if its not already the case.
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index c413c588a24f..6853ab008861 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -2994,6 +2994,7 @@ int skb_vlan_pop(struct sk_buff *skb); int skb_vlan_push(struct sk_buff *skb, __be16 vlan_proto, u16 vlan_tci); struct sk_buff *pskb_extract(struct sk_buff *skb, int off, int to_copy, gfp_t gfp); +int skb_vlan_accel(struct sk_buff *skb); static inline int memcpy_from_msg(void *data, struct msghdr *msg, int len) { diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 7a1d48983f81..a36c7491f714 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -4482,12 +4482,28 @@ pull: return err; } -int skb_vlan_pop(struct sk_buff *skb) +/* If a vlan tag is present move it to hw accel tag */ +int skb_vlan_accel(struct sk_buff *skb) { u16 vlan_tci; __be16 vlan_proto; int err; + vlan_proto = skb->protocol; + err = __skb_vlan_pop(skb, &vlan_tci); + if (unlikely(err)) + return err; + + __vlan_hwaccel_put_tag(skb, vlan_proto, vlan_tci); + return 0; +} +EXPORT_SYMBOL(skb_vlan_accel); + +int skb_vlan_pop(struct sk_buff *skb) +{ + u16 vlan_tci; + int err; + if (likely(skb_vlan_tag_present(skb))) { skb->vlan_tci = 0; } else { @@ -4500,19 +4516,13 @@ int skb_vlan_pop(struct sk_buff *skb) if (err) return err; } - /* move next vlan tag to hw accel tag */ + if (likely((skb->protocol != htons(ETH_P_8021Q) && skb->protocol != htons(ETH_P_8021AD)) || skb->len < VLAN_ETH_HLEN)) return 0; - vlan_proto = skb->protocol; - err = __skb_vlan_pop(skb, &vlan_tci); - if (unlikely(err)) - return err; - - __vlan_hwaccel_put_tag(skb, vlan_proto, vlan_tci); - return 0; + return skb_vlan_accel(skb); } EXPORT_SYMBOL(skb_vlan_pop); diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c index 7d9b2307d6eb..ad2331cde732 100644 --- a/net/openvswitch/actions.c +++ b/net/openvswitch/actions.c @@ -302,6 +302,17 @@ static int set_eth_addr(struct sk_buff *skb, struct sw_flow_key *flow_key, static int pop_eth(struct sk_buff *skb, struct sw_flow_key *key) { + /* Push outermost VLAN tag to skb metadata unless a VLAN tag + * is already present there. + */ + if ((skb->protocol == htons(ETH_P_8021Q) || + skb->protocol == htons(ETH_P_8021AD)) && + !skb_vlan_tag_present(skb)) { + int err = skb_vlan_accel(skb); + if (unlikely(err)) + return err; + } + skb_pull_rcsum(skb, ETH_HLEN); skb_reset_mac_header(skb); skb->mac_len -= ETH_HLEN; @@ -314,13 +325,6 @@ static int push_eth(struct sk_buff *skb, struct sw_flow_key *key, const struct ovs_action_push_eth *ethh) { struct ethhdr *hdr; - int err; - - /* De-accelerate any hardware accelerated VLAN tag added to a previous - * Ethernet header */ - err = skb_vlan_deaccel(skb); - if (unlikely(err)) - return err; /* Add the new Ethernet header */ if (skb_cow_head(skb, ETH_HLEN) < 0) diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c index 4d2698596033..fdefee776d4f 100644 --- a/net/openvswitch/flow.c +++ b/net/openvswitch/flow.c @@ -469,8 +469,10 @@ static int key_extract(struct sk_buff *skb, struct sw_flow_key *key) skb_reset_mac_header(skb); /* Link layer. */ + key->eth.tci = 0; if (key->phy.is_layer3) { - key->eth.tci = 0; + if (skb_vlan_tag_present(skb)) + key->eth.tci = htons(skb->vlan_tci); } else { eth = eth_hdr(skb); ether_addr_copy(key->eth.src, eth->h_source); @@ -481,7 +483,6 @@ static int key_extract(struct sk_buff *skb, struct sw_flow_key *key) * update skb->csum here. */ - key->eth.tci = 0; if (skb_vlan_tag_present(skb)) key->eth.tci = htons(skb->vlan_tci); else if (eth->h_proto == htons(ETH_P_8021Q))