diff mbox

[RFC,1/3] tc: fix return values of ingress qdisc

Message ID 1429644476-8914-2-git-send-email-ast@plumgrid.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Alexei Starovoitov April 21, 2015, 7:27 p.m. UTC
ingress qdisc should return NET_XMIT_* values just like all other qdiscs.

Since it's invoked via qdisc_enqueue_root() (which suppose to return
only NET_XMIT_* values as well), it was working by accident,
since TC_ACT_* values fit within NET_XMIT_MASK.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
---
 net/core/dev.c          |    8 ++------
 net/sched/sch_ingress.c |    9 ++++-----
 2 files changed, 6 insertions(+), 11 deletions(-)

Comments

Cong Wang April 22, 2015, 4:59 a.m. UTC | #1
On Tue, Apr 21, 2015 at 12:27 PM, Alexei Starovoitov <ast@plumgrid.com> wrote:
> ingress qdisc should return NET_XMIT_* values just like all other qdiscs.
>

XMIT already means egress...

> Since it's invoked via qdisc_enqueue_root() (which suppose to return
> only NET_XMIT_* values as well), it was working by accident,
> since TC_ACT_* values fit within NET_XMIT_MASK.
>

Why not just add a BUILD_BUG_ON() to capture this?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexei Starovoitov April 22, 2015, 10:04 p.m. UTC | #2
On 4/21/15 9:59 PM, Cong Wang wrote:
> On Tue, Apr 21, 2015 at 12:27 PM, Alexei Starovoitov <ast@plumgrid.com> wrote:
>> ingress qdisc should return NET_XMIT_* values just like all other qdiscs.
>>
>
> XMIT already means egress...

may be then it should be renamed as well.
from include/linux/netdevice.h:
/* qdisc ->enqueue() return codes. */
#define NET_XMIT_SUCCESS        0x00
...

the point is that qdisc->enqeue() must return NET_XMIT_* values.
ingress qdisc is violating this and therefore should be fixed.

>> Since it's invoked via qdisc_enqueue_root() (which suppose to return
>> only NET_XMIT_* values as well), it was working by accident,
>> since TC_ACT_* values fit within NET_XMIT_MASK.
>>
>
> Why not just add a BUILD_BUG_ON() to capture this?

ingress qdisc returning TC_ACT_* values is an obvious layering
violation. I'm puzzled why it's been this way for so long.
Adding BUILD_BUG_ON is not an option.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Cong Wang April 22, 2015, 11:29 p.m. UTC | #3
On Wed, Apr 22, 2015 at 3:04 PM, Alexei Starovoitov <ast@plumgrid.com> wrote:
> On 4/21/15 9:59 PM, Cong Wang wrote:
>>
>> On Tue, Apr 21, 2015 at 12:27 PM, Alexei Starovoitov <ast@plumgrid.com>
>> wrote:
>>>
>>> ingress qdisc should return NET_XMIT_* values just like all other qdiscs.
>>>
>>
>> XMIT already means egress...
>
>
> may be then it should be renamed as well.
> from include/linux/netdevice.h:
> /* qdisc ->enqueue() return codes. */
> #define NET_XMIT_SUCCESS        0x00
> ...
>
> the point is that qdisc->enqeue() must return NET_XMIT_* values.
> ingress qdisc is violating this and therefore should be fixed.

XMIT is non-sense for ingress, you really need to pick another
name for it if TC_ACT_OK isn't okay for you (it is okay for me).


>
>>> Since it's invoked via qdisc_enqueue_root() (which suppose to return
>>> only NET_XMIT_* values as well), it was working by accident,
>>> since TC_ACT_* values fit within NET_XMIT_MASK.
>>>
>>
>> Why not just add a BUILD_BUG_ON() to capture this?
>
>
> ingress qdisc returning TC_ACT_* values is an obvious layering
> violation. I'm puzzled why it's been this way for so long.

Why? Everyone knows ingress is the only qdisc on ingress
and it has no queue and can only accept filters (actions are on filters).

> Adding BUILD_BUG_ON is not an option.
>

It at least tells people we know their value are same, IOW, not a bug.

I don't see the need for the change except a BUILD_BUG_ON().
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Graf April 23, 2015, 8:46 a.m. UTC | #4
On 04/22/15 at 04:29pm, Cong Wang wrote:
> On Wed, Apr 22, 2015 at 3:04 PM, Alexei Starovoitov <ast@plumgrid.com> wrote:
> > On 4/21/15 9:59 PM, Cong Wang wrote:
> >>
> >> On Tue, Apr 21, 2015 at 12:27 PM, Alexei Starovoitov <ast@plumgrid.com>
> >> wrote:
> >>>
> >>> ingress qdisc should return NET_XMIT_* values just like all other qdiscs.
> >>>
> >>
> >> XMIT already means egress...
> >
> >
> > may be then it should be renamed as well.
> > from include/linux/netdevice.h:
> > /* qdisc ->enqueue() return codes. */
> > #define NET_XMIT_SUCCESS        0x00
> > ...
> >
> > the point is that qdisc->enqeue() must return NET_XMIT_* values.
> > ingress qdisc is violating this and therefore should be fixed.
> 
> XMIT is non-sense for ingress, you really need to pick another
> name for it if TC_ACT_OK isn't okay for you (it is okay for me).

You transmit into a qdisc. If that terminology doesn't suit
you then rename it to NET_QUEUE_* but moving away from returning
TC_ACT_* is definitely the right thing to do here.
> 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index 1796cef55ab5..ac6233f6f353 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3533,7 +3533,7 @@  static int ing_filter(struct sk_buff *skb, struct netdev_queue *rxq)
 {
 	struct net_device *dev = skb->dev;
 	u32 ttl = G_TC_RTTL(skb->tc_verd);
-	int result = TC_ACT_OK;
+	int result = NET_XMIT_SUCCESS;
 	struct Qdisc *q;
 
 	if (unlikely(MAX_RED_LOOP < ttl++)) {
@@ -3570,12 +3570,8 @@  static inline struct sk_buff *handle_ing(struct sk_buff *skb,
 		*pt_prev = NULL;
 	}
 
-	switch (ing_filter(skb, rxq)) {
-	case TC_ACT_SHOT:
-	case TC_ACT_STOLEN:
-		kfree_skb(skb);
+	if (ing_filter(skb, rxq) == NET_XMIT_DROP)
 		return NULL;
-	}
 
 	return skb;
 }
diff --git a/net/sched/sch_ingress.c b/net/sched/sch_ingress.c
index 4cdbfb85686a..e68f4a5dbeba 100644
--- a/net/sched/sch_ingress.c
+++ b/net/sched/sch_ingress.c
@@ -65,21 +65,20 @@  static int ingress_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 
 	result = tc_classify(skb, fl, &res);
 
-	qdisc_bstats_update(sch, skb);
 	switch (result) {
 	case TC_ACT_SHOT:
-		result = TC_ACT_SHOT;
 		qdisc_qstats_drop(sch);
-		break;
 	case TC_ACT_STOLEN:
 	case TC_ACT_QUEUED:
-		result = TC_ACT_STOLEN;
+		result = NET_XMIT_DROP;
+		kfree_skb(skb);
 		break;
 	case TC_ACT_RECLASSIFY:
 	case TC_ACT_OK:
 		skb->tc_index = TC_H_MIN(res.classid);
 	default:
-		result = TC_ACT_OK;
+		qdisc_bstats_update(sch, skb);
+		result = NET_XMIT_SUCCESS;
 		break;
 	}