multi bpf filter will impact performance?

Message ID	AANLkTimynbhgo2dqFwKOc8NT9TXpkCtCJMvwkySEBORz@mail.gmail.com
State	Superseded, archived
Delegated to:	David Miller

Changli Gao Dec. 1, 2010, 7:36 a.m. UTC

On Wed, Dec 1, 2010 at 11:48 AM, Rui <wirelesser@gmail.com> wrote:
> one more question is
>
> if  RPS can spread the load into 4 separate cpus, how about the
> "packet_rcv(or tpacket_rcv)" ? will they run in parallel?
>

You mentioned RPS. But the current bpf doesn't have an instruction to
get the current CPU number. You can try this patch attached.

Maybe we can leverage the bpf and SO_REUSEPORT to direct the traffic
to the socket instance on the local CPU.
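
A minimal sketch of what such a filter could look like from user space, assuming the kernel exposes the CPU number as an ancillary load at SKF_AD_OFF + SKF_AD_CPU (the name that comes up later in this thread). The helper name build_cpu_filter and the fallback value 36 for SKF_AD_CPU are assumptions for illustration, not part of the attached patch. The program accepts a packet only when the filter runs on the chosen CPU:

```c
#include <assert.h>
#include <stddef.h>
#include <linux/filter.h>

/* Assumption: headers that predate SKF_AD_CPU need a fallback definition. */
#ifndef SKF_AD_CPU
#define SKF_AD_CPU 36
#endif

/*
 * Hypothetical helper: write a 4-instruction classic BPF program into prog
 * that accepts a packet only when the filter runs on `cpu`.
 * Returns the number of instructions written, or 0 if `cap` is too small.
 */
static size_t build_cpu_filter(struct sock_filter *prog, size_t cap,
                               unsigned int cpu)
{
    const struct sock_filter insns[] = {
        /* A = number of the CPU running the filter (ancillary load) */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                 (unsigned int)(SKF_AD_OFF + SKF_AD_CPU)),
        /* if (A == cpu) fall through to accept, else skip one insn */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, cpu, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, 0xffffffff), /* accept whole packet */
        BPF_STMT(BPF_RET | BPF_K, 0),          /* drop */
    };
    const size_t n = sizeof(insns) / sizeof(insns[0]);
    size_t i;

    assert(prog != NULL);
    if (cap < n)
        return 0;
    for (i = 0; i < n; i++)
        prog[i] = insns[i];
    return n;
}
```

The resulting array would typically be wrapped in a struct sock_fprog and attached with setsockopt(fd, SOL_SOCKET, SO_ATTACH_FILTER, ...), using one socket per CPU.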

Eric Dumazet Dec. 1, 2010, 7:47 a.m. UTC | #1

On Wednesday, 1 December 2010 at 15:36 +0800, Changli Gao wrote:
> On Wed, Dec 1, 2010 at 11:48 AM, Rui <wirelesser@gmail.com> wrote:
> > one more question is
> >
> > if  RPS can spread the load into 4 separate cpus, how about the
> > "packet_rcv(or tpacket_rcv)" ? will they run in parallel?
> >
> 
> You mentioned RPS. But the current bpf doesn't have an instruction to
> get the current CPU number. You can try this patch attached.
> 
> Maybe we can leverage the bpf and SO_REUSEPORT to direct the traffic
> to the socket instance on the local CPU.
> 

Oh well, it seems you read over my neck, I was preparing a patch with
SKF_AD_RXHASH and SKF_AD_CPU




--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Changli Gao Dec. 1, 2010, 7:59 a.m. UTC | #2

On Wed, Dec 1, 2010 at 3:47 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Wednesday, 1 December 2010 at 15:36 +0800, Changli Gao wrote:
>
> Oh well, it seems you read over my neck, I was preparing a patch with
> SKF_AD_RXHASH and SKF_AD_CPU
>
>

Nice to hear it. :)

There are too many filters: bpf, iptables and tc. Maybe a unified one
such as nft is needed. Then the duplicate code would be reduced. Maybe
it is just a good dream.

Eric Dumazet Dec. 1, 2010, 8:09 a.m. UTC | #3

On Wednesday, 1 December 2010 at 15:59 +0800, Changli Gao wrote:
> On Wed, Dec 1, 2010 at 3:47 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Wednesday, 1 December 2010 at 15:36 +0800, Changli Gao wrote:
> >
> > Oh well, it seems you read over my neck, I was preparing a patch with
> > SKF_AD_RXHASH and SKF_AD_CPU
> >
> >
> 
> Nice to hear it. :)
> 
> There are too many filters: bpf, iptables and tc. Maybe a unified one
> such as nft is needed. Then the duplicate code would be reduced. Maybe
> it is just a good dream.
> 

You forgot the rxhash function as well, it would be nice to augment it
with custom code if necessary.

A dream would be to have a native compiler, and not interpret pseudo
code...




Changli Gao Dec. 1, 2010, 8:15 a.m. UTC | #4

On Wed, Dec 1, 2010 at 4:09 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> A dream would be to have a native compiler, and not interpret pseudo
> code...
>

FYI: FreeBSD's BPF implementation has JIT compilers in the kernel on both
i386 and amd64.

Eric Dumazet Dec. 1, 2010, 8:42 a.m. UTC | #5

On Wednesday, 1 December 2010 at 16:15 +0800, Changli Gao wrote:
> On Wed, Dec 1, 2010 at 4:09 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> > A dream would be to have a native compiler, and not interpret pseudo
> > code...
> >
> 
> FYI: FreeBSD's BPF implementation has JIT compilers in the kernel on both
> i386 and amd64.
> 


IMHO, a better pcap optimizer would be the first step.

If you take a look at their generated code, it's not a real win over the
code we currently have.


Really.




Hagen Paul Pfeifer Dec. 1, 2010, 5:22 p.m. UTC | #6

On Wed, 01 Dec 2010 09:42:47 +0100, Eric Dumazet wrote:

> IMHO, a better pcap optimizer would be the first step.

+1

Optimizing complex filter rules is step one in the process of optimizing
the packet processing. A JIT compiler like FreeBSD provides cannot polish a
(pcap)turd. I thought Patrick was working on a generic filter mechanism for
netfilter!? ... ;)

Hagen


David Miller Dec. 1, 2010, 6:18 p.m. UTC | #7

From: Hagen Paul Pfeifer <hagen@jauu.net>
Date: Wed, 01 Dec 2010 18:22:48 +0100

> On Wed, 01 Dec 2010 09:42:47 +0100, Eric Dumazet wrote:
> 
>> IMHO, a better pcap optimizer would be the first step.
> 
> +1
> 
> Optimizing complex filter rules is step one in the process of optimizing
> the packet processing. A JIT compiler like FreeBSD provides cannot polish a
> (pcap)turd. I thought Patrick was working on a generic filter mechanism for
> netfilter!? ... ;)

Yes, and we spoke at the netfilter workshop about making that interpreter
available to socket filters and the packet classifier layer.

However, I think it's still valuable to write a few JIT compilers for
the existing BPF stuff.  I considered working on a sparc64 JIT just to
see what it would look like.

If people work on the BPF optimizer and BPF JITs in parallel, we'll have
both ready at the same time.  win++

David Miller Dec. 1, 2010, 6:24 p.m. UTC | #8

From: David Miller <davem@davemloft.net>
Date: Wed, 01 Dec 2010 10:18:09 -0800 (PST)

> If people work on the BPF optimizer and BPF JITs in parallel, we'll have
> both ready at the same time.  win++

BTW, the JIT is non-trivial work for us because of non-linear SKBs.
We'd need some kind of helper stub or similar, with a straight line
fast path for the linear case.

Eric Dumazet Dec. 1, 2010, 6:24 p.m. UTC | #9

On Wednesday, 1 December 2010 at 10:18 -0800, David Miller wrote:
> From: Hagen Paul Pfeifer <hagen@jauu.net>
> Date: Wed, 01 Dec 2010 18:22:48 +0100
> 
> > On Wed, 01 Dec 2010 09:42:47 +0100, Eric Dumazet wrote:
> > 
> >> IMHO, a better pcap optimizer would be the first step.
> > 
> > +1
> > 
> > Optimizing complex filter rules is step one in the process of optimizing
> > the packet processing. A JIT compiler like FreeBSD provides cannot polish a
> > (pcap)turd. I thought Patrick was working on a generic filter mechanism for
> > netfilter!? ... ;)
> 
> Yes, and we spoke at the netfilter workshop about making that interpreter
> available to socket filters and the packet classifier layer.
> 
> However, I think it's still valuable to write a few JIT compilers for
> the existing BPF stuff.  I considered working on a sparc64 JIT just to
> see what it would look like.
> 
> If people work on the BPF optimizer and BPF JITs in parallel, we'll have
> both ready at the same time.  win++

A third work in progress (from my side) is to add a check in
sk_chk_filter() to remove the memvalid we added lately to protect the
LOAD M(K).

It is needed anyway for arches without a BPF JIT :)
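
A rough sketch of the kind of load-time check this implies, not the actual patch: a single forward pass that tracks which scratch cells M[k] have been stored on every path, and rejects any load from a cell that might be uninitialized. It relies on classic BPF jumps being forward-only. The function name scratch_loads_are_safe and the MAX_INSNS cap are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/filter.h>

#define MAX_INSNS 64 /* hypothetical cap to keep the sketch allocation-free */

/* Returns true if every load from M[k] is preceded by a store on all paths. */
static bool scratch_loads_are_safe(const struct sock_filter *f, size_t len)
{
    uint16_t in[MAX_INSNS]; /* cells known valid when reaching each insn */
    uint16_t valid = 0;     /* nothing has been stored at program entry */
    size_t pc, t1, t2;

    assert(f != NULL);
    if (len == 0 || len > MAX_INSNS)
        return false;
    for (pc = 0; pc < len; pc++)
        in[pc] = 0xffff;

    for (pc = 0; pc < len; pc++) {
        const uint16_t code = f[pc].code;

        valid &= in[pc]; /* merge with states carried by jumps to here */
        switch (BPF_CLASS(code)) {
        case BPF_ST:
        case BPF_STX:
            if (f[pc].k >= BPF_MEMWORDS)
                return false;
            valid |= (uint16_t)(1u << f[pc].k);
            break;
        case BPF_LD:
        case BPF_LDX:
            if (BPF_MODE(code) == BPF_MEM &&
                (f[pc].k >= BPF_MEMWORDS || !(valid & (1u << f[pc].k))))
                return false;
            break;
        case BPF_JMP:
            if (BPF_OP(code) == BPF_JA) {
                if (f[pc].k >= len - pc - 1)
                    return false;
                t1 = t2 = pc + 1 + f[pc].k;
            } else {
                t1 = pc + 1 + f[pc].jt;
                t2 = pc + 1 + f[pc].jf;
            }
            if (t1 >= len || t2 >= len)
                return false;
            in[t1] &= valid;
            in[t2] &= valid;
            valid = 0xffff; /* next insn is reached only through in[] */
            break;
        case BPF_RET:
            valid = 0xffff; /* no fall-through after a return */
            break;
        default:
            break;
        }
    }
    return true;
}
```

With a check like this done once when the filter is attached, the interpreter would not need a per-packet validity mask for the scratch memory.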




David Miller Dec. 1, 2010, 6:44 p.m. UTC | #10

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 01 Dec 2010 19:24:53 +0100

> A third work in progress (from my side) is to add a check in
> sk_chk_filter() to remove the memvalid we added lately to protect the
> LOAD M(K).

I understand your idea, but the static checkers are still going to
complain.  So better add a huge comment in sk_run_filter() explaining
why the checker's complaint should be ignored :-)

Eric Dumazet Dec. 3, 2010, 6:32 a.m. UTC | #11

On Wednesday, 1 December 2010 at 10:18 -0800, David Miller wrote:

> However, I think it's still valuable to write a few JIT compilers for
> the existing BPF stuff.  I considered working on a sparc64 JIT just to
> see what it would look like.
> 
> If people work on the BPF optimizer and BPF JITs in parallel, we'll have
> both ready at the same time.  win++

I began work on implementing a BPF JIT for x86_64

My plan is to use external helpers to load skb data/metadata, to keep
BPF program very short and have no dependencies against struct layouts.

These helpers would be the three load_word, load_half, load_byte.

In case the bits are in skb head, these helpers should be fast.

For practical reasons, they would be in ASM for their fast path, and C
for the slow path. They are ASM because they are able to perform the
shortcut (in case of error, doing the stack unwind to perform the
"return 0;") so that we don't have to test their return from the JIT
program.
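
A user-space sketch of the fast-path/slow-path split described above, under simplifying assumptions: toy_skb is a hypothetical stand-in for struct sk_buff with one linear head and a single fragment, and errors are reported through a bool return instead of the stack unwind an ASM helper could perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an skb: linear head plus one paged fragment. */
struct toy_skb {
    const uint8_t *head;
    size_t head_len;
    const uint8_t *frag;
    size_t frag_len;
};

/* Slow path: gather n bytes starting at off, crossing into the fragment. */
static bool toy_copy_bits(const struct toy_skb *skb, size_t off,
                          uint8_t *dst, size_t n)
{
    const size_t total = skb->head_len + skb->frag_len;
    size_t i;

    if (off > total || n > total - off)
        return false;
    for (i = 0; i < n; i++) {
        const size_t pos = off + i;

        dst[i] = pos < skb->head_len ? skb->head[pos]
                                     : skb->frag[pos - skb->head_len];
    }
    return true;
}

/* Load a big-endian 32-bit word at off; false means "filter returns 0". */
static bool load_word(const struct toy_skb *skb, size_t off, uint32_t *out)
{
    uint8_t tmp[4];
    const uint8_t *p;

    assert(skb != NULL && out != NULL);
    if (off <= skb->head_len && skb->head_len - off >= 4) {
        p = skb->head + off; /* fast path: all 4 bytes in the linear head */
    } else {
        if (!toy_copy_bits(skb, off, tmp, 4))
            return false;    /* slow path failed: out of bounds */
        p = tmp;
    }
    *out = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    return true;
}
```

load_half and load_byte would follow the same pattern with 2 and 1 bytes. In the scheme described above, the caller would not check a return value at all, since the helper itself would unwind and return 0 on error.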



