[net-next,2/3] net/act_pedit: Support using offset relative to the conventional network headers

Message ID 20161130090928.14816-3-amir@vadai.me
State Changes Requested, archived
Delegated to: David Miller

Commit Message

Amir Vadai Nov. 30, 2016, 9:09 a.m. UTC
Extend pedit so that the user can specify an offset relative to one of
the conventional network headers.  This makes it possible to work with
more complex header schemes (vs. the simple IPv4 case), where a fixed
offset relative to the network header is not enough.  It is also forward
looking: it should make hardware offloading of pedit easier.

The header type is embedded in the 8 MSBs of the u32 key->shift, which
were never used until now.  Backward compatibility is therefore kept.
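
The encoding described above can be sketched in userspace C. This is only
an illustration, not part of the patch: the macros and enum values are
copied from the tc_pedit.h hunk of this patch, while pedit_shift_pack() is
a hypothetical helper name. It also shows why an old-style value that only
carries a shift amount decodes as PEDIT_HDR_TYPE_RAW.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the tc_pedit.h additions in this patch. */
#define PEDIT_TYPE_SHIFT 24
#define PEDIT_TYPE_MASK 0xff

#define PEDIT_TYPE_GET(_val) \
	(((_val) >> PEDIT_TYPE_SHIFT) & PEDIT_TYPE_MASK)
#define PEDIT_SHIFT_GET(_val) ((_val) & 0xff)

enum pedit_header_type {
	PEDIT_HDR_TYPE_RAW = 0,

	PEDIT_HDR_TYPE_ETH = 1,
	PEDIT_HDR_TYPE_IP4 = 2,
	PEDIT_HDR_TYPE_IP6 = 3,
	PEDIT_HDR_TYPE_TCP = 4,
	PEDIT_HDR_TYPE_UDP = 5,
};

/*
 * Hypothetical helper (not in the patch): build a key->shift value with
 * the header type in the top 8 bits and the shift amount in the low 8.
 */
static uint32_t pedit_shift_pack(enum pedit_header_type htype, uint8_t shift)
{
	assert((unsigned int)htype <= PEDIT_TYPE_MASK);

	return ((uint32_t)htype << PEDIT_TYPE_SHIFT) | shift;
}
```

A value written by older userspace has its upper bits zero, so
PEDIT_TYPE_GET() yields PEDIT_HDR_TYPE_RAW (0) for it and the offset is
still taken relative to the network header, as before.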

Usage example:
$ tc filter add dev enp0s9 protocol ip parent ffff: \
  flower \
    ip_proto tcp \
    src_port 80 \
  action pedit munge tcp dport set 8080 pipe \
  action mirred egress redirect dev veth0

This redirects the TCP traffic matched by the filter above (src_port 80)
to veth0 and rewrites its destination port to 8080.
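
For illustration, here is a minimal userspace sketch of what such a munge
amounts to on the packet bytes. It mirrors the kernel's
"*ptr = (*ptr & mask) ^ val" step on the 32-bit word at the start of the
TCP header (source port in bytes 0-1, destination port in bytes 2-3).
munge_word() and set_tcp_dport() are hypothetical names, and the way the
mask and value are built here is an assumption, not taken from tc.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Apply one pedit-style key to the 32-bit word at hdr + off. */
static void munge_word(uint8_t *hdr, size_t off, uint32_t mask, uint32_t val)
{
	uint32_t word;

	memcpy(&word, hdr + off, sizeof(word));
	/* Keep the bits selected by mask, then XOR in the new value. */
	word = (word & mask) ^ val;
	memcpy(hdr + off, &word, sizeof(word));
}

/*
 * Set the TCP destination port.  The mask and value are assembled in
 * memory (network) byte order, so this works on any host endianness.
 */
static void set_tcp_dport(uint8_t *tcp_hdr, uint16_t port)
{
	/* Keep the source port bytes, clear the destination port bytes. */
	const uint8_t mask_bytes[4] = { 0xff, 0xff, 0x00, 0x00 };
	const uint8_t val_bytes[4] = {
		0x00, 0x00, (uint8_t)(port >> 8), (uint8_t)(port & 0xff)
	};
	uint32_t mask, val;

	memcpy(&mask, mask_bytes, sizeof(mask));
	memcpy(&val, val_bytes, sizeof(val));
	munge_word(tcp_hdr, 0, mask, val);
}
```

With the patch applied, offset 0 here would be relative to the transport
header rather than to the network header.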

Change-Id: Ibd7bbbe0b8c2f6adae0591868bb6892c55e75732
Signed-off-by: Amir Vadai <amir@vadai.me>
---
 include/uapi/linux/tc_act/tc_pedit.h | 17 ++++++++++
 net/sched/act_pedit.c                | 65 +++++++++++++++++++++++++++++-------
 2 files changed, 70 insertions(+), 12 deletions(-)

Comments

David Miller Dec. 1, 2016, 7:41 p.m. UTC | #1
From: Amir Vadai <amir@vadai.me>
Date: Wed, 30 Nov 2016 11:09:27 +0200

> @@ -119,18 +119,45 @@ static bool offset_valid(struct sk_buff *skb, int offset)
>  	return true;
>  }
>  
> +static int pedit_skb_hdr_offset(struct sk_buff *skb,
> +				enum pedit_header_type htype, int *hoffset)
> +{
> +	int ret = -1;
> +
> +	switch (htype) {
> +	case PEDIT_HDR_TYPE_ETH:
> +		if (skb_mac_header_was_set(skb)) {
> +			*hoffset = skb_mac_offset(skb);
> +			ret = 0;
> +		}
> +		break;
> +	case PEDIT_HDR_TYPE_RAW:
> +	case PEDIT_HDR_TYPE_IP4:
> +	case PEDIT_HDR_TYPE_IP6:
> +		*hoffset = skb_network_offset(skb);
> +		ret = 0;
> +		break;
> +	case PEDIT_HDR_TYPE_TCP:
> +	case PEDIT_HDR_TYPE_UDP:
> +		if (skb_transport_header_was_set(skb)) {
> +			*hoffset = skb_transport_offset(skb);
> +			ret = 0;
> +		}
> +		break;
> +	};
> +
> +	return ret;
> +}
> +

The only distinction between the cases is "L2", "L3", and "L4".

Therefore I don't see any reason to break it down into IP4 vs. IP6 vs.
RAW, for example.  They all map to the same thing.

So why not just have PEDIT_HDR_TYPE_L2, PEDIT_HDR_TYPE_L3, and
PEDIT_HDR_TYPE_L4?  It definitely seems more straightforward
and cleaner that way.

Thanks.
Amir Vadai Dec. 2, 2016, 10:40 a.m. UTC | #2
On Thu, Dec 01, 2016 at 02:41:14PM -0500, David Miller wrote:
> From: Amir Vadai <amir@vadai.me>
> Date: Wed, 30 Nov 2016 11:09:27 +0200
> 
> > @@ -119,18 +119,45 @@ static bool offset_valid(struct sk_buff *skb, int offset)
> >  	return true;
> >  }
> >  
> > +static int pedit_skb_hdr_offset(struct sk_buff *skb,
> > +				enum pedit_header_type htype, int *hoffset)
> > +{
> > +	int ret = -1;
> > +
> > +	switch (htype) {
> > +	case PEDIT_HDR_TYPE_ETH:
> > +		if (skb_mac_header_was_set(skb)) {
> > +			*hoffset = skb_mac_offset(skb);
> > +			ret = 0;
> > +		}
> > +		break;
> > +	case PEDIT_HDR_TYPE_RAW:
> > +	case PEDIT_HDR_TYPE_IP4:
> > +	case PEDIT_HDR_TYPE_IP6:
> > +		*hoffset = skb_network_offset(skb);
> > +		ret = 0;
> > +		break;
> > +	case PEDIT_HDR_TYPE_TCP:
> > +	case PEDIT_HDR_TYPE_UDP:
> > +		if (skb_transport_header_was_set(skb)) {
> > +			*hoffset = skb_transport_offset(skb);
> > +			ret = 0;
> > +		}
> > +		break;
> > +	};
> > +
> > +	return ret;
> > +}
> > +
> 
> The only distinction between the cases is "L2", "L3", and "L4".
> 
> Therefore I don't see any reason to break it down into IP4 vs. IP6 vs.
> RAW, for example.  They all map to the same thing.
> 
> So why not just have PEDIT_HDR_TYPE_L2, PEDIT_HDR_TYPE_L3, and
> PEDIT_HDR_TYPE_L4?  It definitely seems more straightforward
> and cleaner that way.
Yeah, it isn't by mistake. The next step will be to implement hardware
offloading of the action, and for that we would like to keep the
information about the specific header type.

> 
> Thanks.
Or Gerlitz Dec. 4, 2016, 9:55 p.m. UTC | #3
On Fri, Dec 2, 2016 at 12:40 PM, Amir Vadai <amir@vadai.me> wrote:
> On Thu, Dec 01, 2016 at 02:41:14PM -0500, David Miller wrote:
>> From: Amir Vadai <amir@vadai.me>
>> Date: Wed, 30 Nov 2016 11:09:27 +0200

>> > +static int pedit_skb_hdr_offset(struct sk_buff *skb,
>> > +                           enum pedit_header_type htype, int *hoffset)
>> > +{
>> > +   int ret = -1;
>> > +
>> > +   switch (htype) {
>> > +   case PEDIT_HDR_TYPE_ETH:
>> > +           if (skb_mac_header_was_set(skb)) {
>> > +                   *hoffset = skb_mac_offset(skb);
>> > +                   ret = 0;
>> > +           }
>> > +           break;
>> > +   case PEDIT_HDR_TYPE_RAW:
>> > +   case PEDIT_HDR_TYPE_IP4:
>> > +   case PEDIT_HDR_TYPE_IP6:
>> > +           *hoffset = skb_network_offset(skb);
>> > +           ret = 0;
>> > +           break;
>> > +   case PEDIT_HDR_TYPE_TCP:
>> > +   case PEDIT_HDR_TYPE_UDP:
>> > +           if (skb_transport_header_was_set(skb)) {
>> > +                   *hoffset = skb_transport_offset(skb);
>> > +                   ret = 0;
>> > +           }
>> > +           break;
>> > +   };
>> > +
>> > +   return ret;
>> > +}
>> > +

>> The only distinction between the cases is "L2", "L3", and "L4".

>> Therefore I don't see any reason to break it down into IP4 vs. IP6 vs.
>> RAW, for example.  They all map to the same thing.

>> So why not just have PEDIT_HDR_TYPE_L2, PEDIT_HDR_TYPE_L3, and
>> PEDIT_HDR_TYPE_L4?  It definitely seems more straightforward
>> and cleaner that way.

> Yeah, it isn't by mistake. The next step will be to implement hardware
> offloading of the action, and for that we would like to keep the
> information about the specific header type.

Hi Dave,

I see that this patch is marked as "Changes Requested" in your patchwork.

Just wanted to note that, as Amir explained here and as mentioned in
the change log, this was done on purpose, as a heads-up for HW offloads.
Typically HW APIs let you do things based on the header type they have
parsed, so that's why we added this small redundancy, e.g. IPv4/IPv6
header IDs instead of a single network header ID. SW-wise both IPv4 and
IPv6 use the same code path, but for HW offloads the HW driver could
choose to use the IPv4/IPv6 header ID info.

Or.
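
The trade-off discussed in this thread can be sketched as follows. This is
a hypothetical illustration, not code from the patch or from any driver:
enum pedit_layer and pedit_layer_of() are made-up names. It shows the
software view, where the specific header types collapse to L2/L3/L4 as
David suggests, while the original htype stays available for an offload
driver that wants to tell IPv4 from IPv6.

```c
/* Header types as defined by this patch. */
enum pedit_header_type {
	PEDIT_HDR_TYPE_RAW = 0,

	PEDIT_HDR_TYPE_ETH = 1,
	PEDIT_HDR_TYPE_IP4 = 2,
	PEDIT_HDR_TYPE_IP6 = 3,
	PEDIT_HDR_TYPE_TCP = 4,
	PEDIT_HDR_TYPE_UDP = 5,
};

/* Hypothetical: the coarser layer view that software actually needs. */
enum pedit_layer {
	PEDIT_LAYER_INVALID = -1,
	PEDIT_LAYER_L2,
	PEDIT_LAYER_L3,
	PEDIT_LAYER_L4,
};

/* Hypothetical helper: map a specific header type to its layer. */
static enum pedit_layer pedit_layer_of(enum pedit_header_type htype)
{
	switch (htype) {
	case PEDIT_HDR_TYPE_ETH:
		return PEDIT_LAYER_L2;
	case PEDIT_HDR_TYPE_RAW:
	case PEDIT_HDR_TYPE_IP4:
	case PEDIT_HDR_TYPE_IP6:
		return PEDIT_LAYER_L3;
	case PEDIT_HDR_TYPE_TCP:
	case PEDIT_HDR_TYPE_UDP:
		return PEDIT_LAYER_L4;
	}

	/* Unknown type values fall through to here. */
	return PEDIT_LAYER_INVALID;
}
```

The mapping only goes one way: once a key is stored as "L3", the
IPv4 vs. IPv6 distinction cannot be recovered, which is the information
the authors say they want to keep for hardware offload.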
Patch

diff --git a/include/uapi/linux/tc_act/tc_pedit.h b/include/uapi/linux/tc_act/tc_pedit.h
index 6389959a5157..604e6729ad38 100644
--- a/include/uapi/linux/tc_act/tc_pedit.h
+++ b/include/uapi/linux/tc_act/tc_pedit.h
@@ -32,4 +32,21 @@  struct tc_pedit_sel {
 };
 #define tc_pedit tc_pedit_sel
 
+#define PEDIT_TYPE_SHIFT 24
+#define PEDIT_TYPE_MASK 0xff
+
+#define PEDIT_TYPE_GET(_val) \
+	(((_val) >> PEDIT_TYPE_SHIFT) & PEDIT_TYPE_MASK)
+#define PEDIT_SHIFT_GET(_val) ((_val) & 0xff)
+
+enum pedit_header_type {
+	PEDIT_HDR_TYPE_RAW = 0,
+
+	PEDIT_HDR_TYPE_ETH = 1,
+	PEDIT_HDR_TYPE_IP4 = 2,
+	PEDIT_HDR_TYPE_IP6 = 3,
+	PEDIT_HDR_TYPE_TCP = 4,
+	PEDIT_HDR_TYPE_UDP = 5,
+};
+
 #endif
diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
index b27c4daec88f..4b9c7184c752 100644
--- a/net/sched/act_pedit.c
+++ b/net/sched/act_pedit.c
@@ -119,18 +119,45 @@  static bool offset_valid(struct sk_buff *skb, int offset)
 	return true;
 }
 
+static int pedit_skb_hdr_offset(struct sk_buff *skb,
+				enum pedit_header_type htype, int *hoffset)
+{
+	int ret = -1;
+
+	switch (htype) {
+	case PEDIT_HDR_TYPE_ETH:
+		if (skb_mac_header_was_set(skb)) {
+			*hoffset = skb_mac_offset(skb);
+			ret = 0;
+		}
+		break;
+	case PEDIT_HDR_TYPE_RAW:
+	case PEDIT_HDR_TYPE_IP4:
+	case PEDIT_HDR_TYPE_IP6:
+		*hoffset = skb_network_offset(skb);
+		ret = 0;
+		break;
+	case PEDIT_HDR_TYPE_TCP:
+	case PEDIT_HDR_TYPE_UDP:
+		if (skb_transport_header_was_set(skb)) {
+			*hoffset = skb_transport_offset(skb);
+			ret = 0;
+		}
+		break;
+	};
+
+	return ret;
+}
+
 static int tcf_pedit(struct sk_buff *skb, const struct tc_action *a,
 		     struct tcf_result *res)
 {
 	struct tcf_pedit *p = to_pedit(a);
 	int i;
-	unsigned int off;
 
 	if (skb_unclone(skb, GFP_ATOMIC))
 		return p->tcf_action;
 
-	off = skb_network_offset(skb);
-
 	spin_lock(&p->tcf_lock);
 
 	tcf_lastuse_update(&p->tcf_tm);
@@ -141,20 +168,32 @@  static int tcf_pedit(struct sk_buff *skb, const struct tc_action *a,
 		for (i = p->tcfp_nkeys; i > 0; i--, tkey++) {
 			u32 *ptr, _data;
 			int offset = tkey->off;
+			int hoffset;
+			int rc;
+			enum pedit_header_type htype =
+				PEDIT_TYPE_GET(tkey->shift);
+
+			rc = pedit_skb_hdr_offset(skb, htype, &hoffset);
+			if (rc) {
+				pr_info("tc filter pedit bad header type specified (0x%x)\n",
+					htype);
+				goto bad;
+			}
 
 			if (tkey->offmask) {
 				char *d, _d;
 
-				if (!offset_valid(skb, off + tkey->at)) {
+				if (!offset_valid(skb, hoffset + tkey->at)) {
 					pr_info("tc filter pedit 'at' offset %d out of bounds\n",
-						off + tkey->at);
+						hoffset + tkey->at);
 					goto bad;
 				}
-				d = skb_header_pointer(skb, off + tkey->at, 1,
-						       &_d);
+				d = skb_header_pointer(skb,
+						       hoffset + tkey->at,
+						       1, &_d);
 				if (!d)
 					goto bad;
-				offset += (*d & tkey->offmask) >> tkey->shift;
+				offset += (*d & tkey->offmask) >> PEDIT_SHIFT_GET(tkey->shift);
 			}
 
 			if (offset % 4) {
@@ -163,19 +202,21 @@  static int tcf_pedit(struct sk_buff *skb, const struct tc_action *a,
 				goto bad;
 			}
 
-			if (!offset_valid(skb, off + offset)) {
+			if (!offset_valid(skb, hoffset + offset)) {
 				pr_info("tc filter pedit offset %d out of bounds\n",
-					offset);
+					hoffset + offset);
 				goto bad;
 			}
 
-			ptr = skb_header_pointer(skb, off + offset, 4, &_data);
+			ptr = skb_header_pointer(skb,
+						 hoffset + offset,
+						 4, &_data);
 			if (!ptr)
 				goto bad;
 			/* just do it, baby */
 			*ptr = ((*ptr & tkey->mask) ^ tkey->val);
 			if (ptr == &_data)
-				skb_store_bits(skb, off + offset, ptr, 4);
+				skb_store_bits(skb, hoffset + offset, ptr, 4);
 		}
 
 		goto done;