
[net-next] net: sched: run ingress qdisc without locks

Message ID 1430450047-5671-1-git-send-email-ast@plumgrid.com
State Accepted, archived
Delegated to: David Miller

Commit Message

Alexei Starovoitov May 1, 2015, 3:14 a.m. UTC
TC classifiers/actions were converted to RCU by John in the series:
http://thread.gmane.org/gmane.linux.network/329739/focus=329739
and many follow-on patches.
This is the last patch from that series that finally drops
ingress spin_lock.

Single cpu ingress+u32 performance goes from 22.9 Mpps to 24.5 Mpps.

In the two-CPU case, when both cores receive traffic on the same
device and go through the same ingress+u32, performance jumps
from 4.5 + 4.5 Mpps to 23.5 + 23.5 Mpps.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
---
All the credit goes to John. I merely tested and benchmarked.

 net/core/dev.c          |    2 --
 net/sched/sch_ingress.c |    5 +++--
 2 files changed, 3 insertions(+), 4 deletions(-)

Comments

Jamal Hadi Salim May 1, 2015, 12:08 p.m. UTC | #1
On 04/30/15 23:14, Alexei Starovoitov wrote:
> TC classifiers/actions were converted to RCU by John in the series:
> http://thread.gmane.org/gmane.linux.network/329739/focus=329739
> and many follow on patches.
> This is the last patch from that series that finally drops
> ingress spin_lock.
>

As the culprit who added this lock, I would like to offer kudos
to John for the hard work. Alexei, thanks for the hard work in
pursuing this further.
And to the rest of the community who has been beating up on me
all these years, you can stop now ;->

I think this at least deserves a:
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>

cheers,
jamal
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Borkmann May 1, 2015, 12:41 p.m. UTC | #2
On 05/01/2015 05:14 AM, Alexei Starovoitov wrote:
> TC classifiers/actions were converted to RCU by John in the series:
> http://thread.gmane.org/gmane.linux.network/329739/focus=329739
> and many follow on patches.
> This is the last patch from that series that finally drops
> ingress spin_lock.
>
> Single cpu ingress+u32 performance goes from 22.9 Mpps to 24.5 Mpps.
>
> In two cpu case when both cores are receiving traffic on the same
> device and go into the same ingress+u32 the performance jumps
> from 4.5 + 4.5 Mpps to 23.5 + 23.5 Mpps
>
> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>

Looks good to me, thanks!

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
John Fastabend May 1, 2015, 2:04 p.m. UTC | #3
On 05/01/2015 05:08 AM, Jamal Hadi Salim wrote:
> On 04/30/15 23:14, Alexei Starovoitov wrote:
>> TC classifiers/actions were converted to RCU by John in the series:
>> http://thread.gmane.org/gmane.linux.network/329739/focus=329739
>> and many follow on patches.
>> This is the last patch from that series that finally drops
>> ingress spin_lock.
>>
> 
> As the culprit who added this lock I would like to offer a kudos
> to John for the hard work. Alexei thanks for the hard work in
> pursuing this further.
> And to the rest of the community who has been beating up on me
> all these years, you can stop now ;->

Well, there is still the TX side, although it is not as obvious because
of mq and mqprio. Hopefully by, say, next LPC we can remove the
qdisc lock there in many scenarios. The netif_tx_lock is there
as well, but many (most?) of the 10G/40G/etc. devices can
create per-core descriptor rings. I'll start prodding away on that
side shortly, I suppose.

> 
> I think this at least deserves a:
> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
> 
> cheers,
> jamal

Alexei Starovoitov May 1, 2015, 5:32 p.m. UTC | #4
On 5/1/15 7:04 AM, John Fastabend wrote:
>
> Well there is still the TX side, although not as obvious because
> of mq and mqprio. Hopefully by say next LPC we can remove the
> qdisc lock there in many scenarios. Also the netif_tx_lock is
> there as well but many (most?) of the 10G/40G/etc devices can
> create per core descriptor rings. I'll start prodding away on that
> side shortly I suppose.

that will be massive! 'per-core descriptor ring' sounds like a new
driver API. Looking forward to your work :)

David Miller May 4, 2015, 3:42 a.m. UTC | #5
From: Alexei Starovoitov <ast@plumgrid.com>
Date: Thu, 30 Apr 2015 20:14:07 -0700

> TC classifiers/actions were converted to RCU by John in the series:
> http://thread.gmane.org/gmane.linux.network/329739/focus=329739
> and many follow on patches.
> This is the last patch from that series that finally drops
> ingress spin_lock.
> 
> Single cpu ingress+u32 performance goes from 22.9 Mpps to 24.5 Mpps.
> 
> In two cpu case when both cores are receiving traffic on the same
> device and go into the same ingress+u32 the performance jumps
> from 4.5 + 4.5 Mpps to 23.5 + 23.5 Mpps
> 
> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>

Applied, thanks.

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index 97a15ae8d07a..862875ec8f2f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3538,10 +3538,8 @@  static int ing_filter(struct sk_buff *skb, struct netdev_queue *rxq)
 
 	q = rcu_dereference(rxq->qdisc);
 	if (q != &noop_qdisc) {
-		spin_lock(qdisc_lock(q));
 		if (likely(!test_bit(__QDISC_STATE_DEACTIVATED, &q->state)))
 			result = qdisc_enqueue_root(skb, q);
-		spin_unlock(qdisc_lock(q));
 	}
 
 	return result;
diff --git a/net/sched/sch_ingress.c b/net/sched/sch_ingress.c
index 4cdbfb85686a..a89cc3278bfb 100644
--- a/net/sched/sch_ingress.c
+++ b/net/sched/sch_ingress.c
@@ -65,11 +65,11 @@  static int ingress_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 
 	result = tc_classify(skb, fl, &res);
 
-	qdisc_bstats_update(sch, skb);
+	qdisc_bstats_update_cpu(sch, skb);
 	switch (result) {
 	case TC_ACT_SHOT:
 		result = TC_ACT_SHOT;
-		qdisc_qstats_drop(sch);
+		qdisc_qstats_drop_cpu(sch);
 		break;
 	case TC_ACT_STOLEN:
 	case TC_ACT_QUEUED:
@@ -91,6 +91,7 @@  static int ingress_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 static int ingress_init(struct Qdisc *sch, struct nlattr *opt)
 {
 	net_inc_ingress_queue();
+	sch->flags |= TCQ_F_CPUSTATS;
 
 	return 0;
 }