Message ID | AANLkTimynbhgo2dqFwKOc8NT9TXpkCtCJMvwkySEBORz@mail.gmail.com |
---|---|
State | Superseded, archived |
Delegated to: | David Miller |
On Wednesday, 1 December 2010 at 15:36 +0800, Changli Gao wrote:
> On Wed, Dec 1, 2010 at 11:48 AM, Rui <wirelesser@gmail.com> wrote:
> > one more question is
> >
> > if RPS can spread the load into 4 separate cpus, how about the
> > "packet_rcv(or tpacket_rcv)" ? will they run in parallel?
> >
>
> You mentioned RPS. But the current bpf doesn't have an instruction to
> get the current CPU number. You can try this patch attached.
>
> Maybe we can leverage the bpf and SO_REUSEPORT to direct the traffic
> to the socket instance on the local CPU.
>

Oh well, it seems you read over my neck, I was preparing a patch with
SKF_AD_RXHASH and SKF_AD_CPU

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Dec 1, 2010 at 3:47 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Wednesday, 1 December 2010 at 15:36 +0800, Changli Gao wrote:
>
> Oh well, it seems you read over my neck, I was preparing a patch with
> SKF_AD_RXHASH and SKF_AD_CPU
>
>

Nice to hear it. :)

There are too many filters: bpf, iptables and tc. Maybe a unified one
such as nft is needed. Then the duplicate code would be reduced. Maybe
it is just a good dream.
On Wednesday, 1 December 2010 at 15:59 +0800, Changli Gao wrote:
> On Wed, Dec 1, 2010 at 3:47 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Wednesday, 1 December 2010 at 15:36 +0800, Changli Gao wrote:
> >
> > Oh well, it seems you read over my neck, I was preparing a patch with
> > SKF_AD_RXHASH and SKF_AD_CPU
> >
> >
>
> Nice to hear it. :)
>
> There are too many filters: bpf, iptables and tc. Maybe a unified one
> such as nft is needed. Then the duplicate code would be reduced. Maybe
> it is just a good dream.
>

You forgot the rxhash function as well, it would be nice to augment it
with custom code if necessary.

A dream would be to have a native compiler, and not interpret pseudo
code...
On Wed, Dec 1, 2010 at 4:09 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> A dream would be to have a native compiler, and not interpret pseudo
> code...
>

FYI: FreeBSD's BPF implementation has JIT compilers in kernel on both
i386 and amd64.
On Wednesday, 1 December 2010 at 16:15 +0800, Changli Gao wrote:
> On Wed, Dec 1, 2010 at 4:09 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> > A dream would be to have a native compiler, and not interpret pseudo
> > code...
> >
>
> FYI: FreeBSD's BPF implementation has JIT compilers in kernel on both
> i386 and amd64.
>

IMHO, a better pcap optimizer would be the first step.

If you take a look at their generated code, it's not a real win over the
code we currently have. Really.
On Wed, 01 Dec 2010 09:42:47 +0100, Eric Dumazet wrote:
> IMHO, a better pcap optimizer would be the first step.
+1
Optimizing complex filter rules is step one in the process of optimizing
the packet processing. A JIT compiler like FreeBSD provides cannot polish a
(pcap)turd. I thought Patrick was working on a generic filter mechanism for
netfilter!? ... ;)
Hagen
From: Hagen Paul Pfeifer <hagen@jauu.net>
Date: Wed, 01 Dec 2010 18:22:48 +0100

> On Wed, 01 Dec 2010 09:42:47 +0100, Eric Dumazet wrote:
>
>> IMHO, a better pcap optimizer would be the first step.
>
> +1
>
> Optimizing complex filter rules is step one in the process of optimizing
> the packet processing. A JIT compiler like FreeBSD provides cannot polish a
> (pcap)turd. I thought Patrick was working on a generic filter mechanism for
> netfilter!? ... ;)

Yes, and we spoke at the netfilter workshop about making that interpreter
available to socket filters and the packet classifier layer.

However, I think it's still valuable to write a few JIT compilers for
the existing BPF stuff. I considered working on a sparc64 JIT just to
see what it would look like.

If people work on the BPF optimizer and BPF JITs in parallel, we'll have
both ready at the same time. win++
From: David Miller <davem@davemloft.net>
Date: Wed, 01 Dec 2010 10:18:09 -0800 (PST)

> If people work on the BPF optimizer and BPF JITs in parallel, we'll have
> both ready at the same time. win++

BTW, the JIT is non-trivial work for us because of non-linear SKBs.
We'd need some kind of helper stub or similar, with a straight line
fast path for the linear case.
On Wednesday, 1 December 2010 at 10:18 -0800, David Miller wrote:
> From: Hagen Paul Pfeifer <hagen@jauu.net>
> Date: Wed, 01 Dec 2010 18:22:48 +0100
>
> > On Wed, 01 Dec 2010 09:42:47 +0100, Eric Dumazet wrote:
> >
> >> IMHO, a better pcap optimizer would be the first step.
> >
> > +1
> >
> > Optimizing complex filter rules is step one in the process of optimizing
> > the packet processing. A JIT compiler like FreeBSD provides cannot polish a
> > (pcap)turd. I thought Patrick was working on a generic filter mechanism for
> > netfilter!? ... ;)
>
> Yes, and we spoke at the netfilter workshop about making that interpreter
> available to socket filters and the packet classifier layer.
>
> However, I think it's still valuable to write a few JIT compilers for
> the existing BPF stuff. I considered working on a sparc64 JIT just to
> see what it would look like.
>
> If people work on the BPF optimizer and BPF JITs in parallel, we'll have
> both ready at the same time. win++

A third work in progress (from my side) is to add a check in
sk_chk_filter() to remove the memvalid we added lately to protect the
LOAD M(K).

It is needed anyway for arches without a BPF JIT :)
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 01 Dec 2010 19:24:53 +0100

> A third work in progress (from my side) is to add a check in
> sk_chk_filter() to remove the memvalid we added lately to protect the
> LOAD M(K).

I understand your idea, but the static checkers are still going to
complain. So better add a huge comment in sk_run_filter() explaining
why the checker's complaint should be ignored :-)
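To illustrate the kind of attach-time check being discussed, here is a minimal user-space sketch. It is not the kernel's `sk_chk_filter()`; the `struct insn` encoding, the `enum op` values and the function name `loads_are_initialized` are all hypothetical simplifications. It assumes 16 scratch slots and forward-only jumps, and it tracks, per instruction, the set of slots written on every path that reaches it. A load from a slot outside that set is rejected, which is what would make a run-time "memvalid" test unnecessary.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MEMWORDS 16 /* assumed number of scratch slots M[0..15] */

/* Hypothetical, simplified instruction set (not the kernel's encoding). */
enum op { OP_ST, OP_LD_MEM, OP_JA, OP_JCOND, OP_RET, OP_OTHER };

struct insn {
	enum op op;
	uint32_t k;     /* slot index (ST/LD_MEM) or forward offset (JA) */
	uint8_t jt, jf; /* forward offsets for OP_JCOND */
};

/* Return true if every OP_LD_MEM reads a slot that was stored on all
 * paths leading to it. */
bool loads_are_initialized(const struct insn *prog, size_t len)
{
	uint16_t *masks, valid = 0; /* bit n set = M[n] known to be written */
	bool ok = true;
	size_t pc;

	if (len == 0)
		return false;
	masks = malloc(len * sizeof(*masks));
	if (!masks)
		return false;
	/* All-ones is the identity for AND: "no jump has landed here yet". */
	memset(masks, 0xff, len * sizeof(*masks));

	for (pc = 0; ok && pc < len; pc++) {
		const struct insn *i = &prog[pc];
		size_t t, f;

		valid &= masks[pc]; /* merge with paths that jump here */
		switch (i->op) {
		case OP_ST:
			if (i->k >= MEMWORDS)
				ok = false;
			else
				valid |= (uint16_t)(1u << i->k);
			break;
		case OP_LD_MEM:
			if (i->k >= MEMWORDS || !(valid & (1u << i->k)))
				ok = false;
			break;
		case OP_JA:
			t = pc + 1 + i->k;
			if (t >= len) {
				ok = false;
				break;
			}
			masks[t] &= valid;
			valid = 0xffff; /* no fall-through after a jump */
			break;
		case OP_JCOND:
			t = pc + 1 + i->jt;
			f = pc + 1 + i->jf;
			if (t >= len || f >= len) {
				ok = false;
				break;
			}
			masks[t] &= valid;
			masks[f] &= valid;
			valid = 0xffff;
			break;
		case OP_RET:
			valid = 0xffff; /* no fall-through after a return */
			break;
		default:
			break;
		}
	}
	free(masks);
	return ok;
}
```

A program that stores to M[0] and then loads it passes. A program whose conditional jump can skip the store is rejected, because the skipped path reaches the load with the slot unwritten.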
On Wednesday, 1 December 2010 at 10:18 -0800, David Miller wrote:
> However, I think it's still valuable to write a few JIT compilers for
> the existing BPF stuff. I considered working on a sparc64 JIT just to
> see what it would look like.
>
> If people work on the BPF optimizer and BPF JITs in parallel, we'll have
> both ready at the same time. win++

I began work on implementing a BPF JIT for x86_64.

My plan is to use external helpers to load skb data/metadata, to keep
BPF program very short and have no dependencies against struct layouts.

These helpers would be the three load_word, load_half, load_byte.

In case the bits are in skb head, these helpers should be fast.

For practical reasons, they would be in ASM for their fast path, and C
for the slow path. They are ASM because they are able to perform the
shortcut (in case of error, doing the stack unwind to perform the
"return 0;") so that we don't have to test their return from the JIT
program.
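The fast-path/slow-path split described here can be modelled in plain user-space C. The sketch below is only an illustration under stated assumptions: `struct pkt` is a hypothetical stand-in for an skb with a linear head and a single extra fragment, and `copy_bits` is a hypothetical slow-path gather routine. Unlike the plan above, the helper reports failure through its return value instead of unwinding the caller's stack, so the caller does have to test it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an skb: linear head plus one fragment. */
struct pkt {
	const uint8_t *head;
	size_t head_len;
	const uint8_t *frag;
	size_t frag_len;
};

/* Slow path: gather n bytes starting at off, crossing from the head
 * into the fragment if needed. Returns false when out of bounds. */
static bool copy_bits(const struct pkt *p, size_t off, uint8_t *dst, size_t n)
{
	for (size_t i = 0; i < n; i++, off++) {
		if (off < p->head_len)
			dst[i] = p->head[off];
		else if (off - p->head_len < p->frag_len)
			dst[i] = p->frag[off - p->head_len];
		else
			return false;
	}
	return true;
}

/* Load a 32-bit big-endian word at off, as a classic BPF word load does. */
bool load_word(const struct pkt *p, size_t off, uint32_t *out)
{
	uint8_t buf[4];
	const uint8_t *src;

	if (off <= p->head_len && p->head_len - off >= 4) {
		src = p->head + off; /* fast path: all 4 bytes in the head */
	} else {
		if (!copy_bits(p, off, buf, sizeof(buf)))
			return false; /* a filter would "return 0" here */
		src = buf;
	}
	*out = (uint32_t)src[0] << 24 | (uint32_t)src[1] << 16 |
	       (uint32_t)src[2] << 8 | (uint32_t)src[3];
	return true;
}
```

`load_half` and `load_byte` would follow the same pattern with 2 and 1 bytes. Keeping the bounds test and the direct read at the top is what makes the common linear case cheap.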
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 447a775..35db44a 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -124,7 +124,8 @@ struct sock_fprog { /* Required for SO_ATTACH_FILTER. */
 #define SKF_AD_MARK 20
 #define SKF_AD_QUEUE 24
 #define SKF_AD_HATYPE 28
-#define SKF_AD_MAX 32
+#define SKF_AD_CPU 32
+#define SKF_AD_MAX 36
 #define SKF_NET_OFF (-0x100000)
 #define SKF_LL_OFF (-0x200000)
 
diff --git a/net/core/filter.c b/net/core/filter.c
index a44d27f..3baa3f7 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -410,6 +410,9 @@ load_b:
 				A = 0;
 			continue;
 		}
+		case SKF_AD_CPU:
+			A = raw_smp_processor_id();
+			continue;
 		default:
 			return 0;
 		}
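For context, a user-space program could use such an ancillary load roughly as sketched below. This is an illustration only: `build_cpu_filter` is a hypothetical helper name, and the code assumes the installed `<linux/filter.h>` defines `SKF_AD_CPU`, whose numeric value may differ from the 32 proposed in this patch. The filter accepts a packet only when it is processed on the requested CPU.

```c
#include <linux/filter.h>
#include <stddef.h>

/* Hypothetical helper: fill prog[] with a 4-instruction classic BPF
 * program that accepts a packet only when the filter runs on `cpu`.
 * Assumes SKF_AD_CPU is provided by the installed kernel headers. */
size_t build_cpu_filter(struct sock_filter prog[4], unsigned int cpu)
{
	/* A = number of the CPU running the filter (ancillary load) */
	prog[0] = (struct sock_filter)BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			(unsigned int)(SKF_AD_OFF + SKF_AD_CPU));
	/* if (A == cpu) continue with the accept, else skip to the drop */
	prog[1] = (struct sock_filter)BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K,
			cpu, 0, 1);
	/* accept: keep the whole packet */
	prog[2] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K, 0xffffffff);
	/* drop */
	prog[3] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K, 0);
	return 4;
}
```

The resulting array would be wrapped in a `struct sock_fprog` and attached with `setsockopt(fd, SOL_SOCKET, SO_ATTACH_FILTER, ...)`. With one socket per CPU, each filter built for its own CPU number, every socket would see only the packets handled locally, which is the use case raised at the start of the thread.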