[net] x86: bpf_jit: fix compilation of large bpf programs

Message ID	1432334575-16959-1-git-send-email-ast@plumgrid.com
State	Accepted, archived
Delegated to:	David Miller
Headers
Return-Path: <netdev-owner@vger.kernel.org>
From: Alexei Starovoitov <ast@plumgrid.com>
To: "David S. Miller" <davem@davemloft.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>, Eric Dumazet <edumazet@google.com>, netdev@vger.kernel.org
Subject: [PATCH net] x86: bpf_jit: fix compilation of large bpf programs
Date: Fri, 22 May 2015 15:42:55 -0700
Message-Id: <1432334575-16959-1-git-send-email-ast@plumgrid.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk

Alexei Starovoitov May 22, 2015, 10:42 p.m. UTC

x86 has a variable-length instruction encoding, and the x86 JIT
compiler tries to pick the shortest encoding for each bpf instruction.
Doing so changes the jump targets, so the JIT makes multiple passes
over the program. A typical program needs 3 passes. Some very short
programs converge in 2 passes; large programs may need 4 or 5.
Specially crafted bpf programs, however, may hit the pass limit, and
if the program converges on the last iteration the JIT compiler
produces an image full of 'int 3' insns. Fix this corner case by
doing a final iteration over the bpf program.

Fixes: 0a14842f5a3c ("net: filter: Just In Time compiler for x86-64")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
---
Daniel wrote the 'Edge hopping nuthouse' test case with 4k jump
instructions that managed to trigger this bug.
The test case is nuts and the bug is real.
It's an old bug, but I think it is worth backporting all the way.
This fix will apply cleanly only back to commit
f3c2af7ba17a ("net: filter: x86: split bpf_jit_compile()"), though.
The older kernels should be similar. They have
'for (pass = 0; pass < 10; pass++) {' at line 153 or so,
and all have a similar problem as far as I can see.

 arch/x86/net/bpf_jit_comp.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
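
The corner case described in the commit message can be illustrated with
a toy model. This is a sketch only, not the patch itself and not the
kernel's code: toy_pass_len(), toy_compile(), MAX_PASSES and the
byte counts are all hypothetical names and numbers made up for the
illustration. It assumes the loop detects convergence when two
consecutive passes yield the same length, marks the image as allocated
at that point, and emits the real code on the following pass.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PASSES 10	/* same limit as the 'pass < 10' loop quoted above */

/*
 * Hypothetical stand-in for one JIT pass: returns the program length
 * computed on this pass. The length shrinks by one byte per pass
 * until it settles at pass 'settle'.
 */
static int toy_pass_len(int pass, int settle)
{
	const int base = 100;	/* arbitrary starting length */

	assert(settle >= 0);
	return pass < settle ? base - pass : base - settle;
}

/*
 * Toy version of the pass loop. Returns true if the final code was
 * written into the image, false if the loop ended without emitting
 * (in this model that stands for an image that was never filled in).
 */
static bool toy_compile(int settle, bool extra_final_pass)
{
	bool image = false;	/* models "image buffer has been allocated" */
	bool emitted = false;
	int oldlen = 0;

	for (int pass = 0;
	     pass < MAX_PASSES || (extra_final_pass && image);
	     pass++) {
		int len = toy_pass_len(pass, settle);

		if (image) {
			/* This pass wrote into the allocated image. */
			emitted = (len == oldlen);
			break;
		}
		if (len == oldlen)
			image = true;	/* converged: emit on the next pass */
		oldlen = len;
	}
	return emitted;
}
```

In this model toy_compile(8, false) converges on the last allowed pass
and returns false, because the loop exits before the emitting pass runs.
toy_compile(8, true) runs one more iteration and returns true. A program
that never converges, e.g. toy_compile(20, true), still stops at the
pass limit.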

Daniel Borkmann May 22, 2015, 10:46 p.m. UTC | #1

On 05/23/2015 12:42 AM, Alexei Starovoitov wrote:
> x86 has a variable-length instruction encoding, and the x86 JIT
> compiler tries to pick the shortest encoding for each bpf instruction.
> Doing so changes the jump targets, so the JIT makes multiple passes
> over the program. A typical program needs 3 passes. Some very short
> programs converge in 2 passes; large programs may need 4 or 5.
> Specially crafted bpf programs, however, may hit the pass limit, and
> if the program converges on the last iteration the JIT compiler
> produces an image full of 'int 3' insns. Fix this corner case by
> doing a final iteration over the bpf program.
>
> Fixes: 0a14842f5a3c ("net: filter: Just In Time compiler for x86-64")
> Reported-by: Daniel Borkmann <daniel@iogearbox.net>
> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>

LGTM, thanks!

Tested-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

David Miller May 25, 2015, 4:19 a.m. UTC | #2

From: Alexei Starovoitov <ast@plumgrid.com>
Date: Fri, 22 May 2015 15:42:55 -0700

> x86 has a variable-length instruction encoding, and the x86 JIT
> compiler tries to pick the shortest encoding for each bpf instruction.
> Doing so changes the jump targets, so the JIT makes multiple passes
> over the program. A typical program needs 3 passes. Some very short
> programs converge in 2 passes; large programs may need 4 or 5.
> Specially crafted bpf programs, however, may hit the pass limit, and
> if the program converges on the last iteration the JIT compiler
> produces an image full of 'int 3' insns. Fix this corner case by
> doing a final iteration over the bpf program.
> 
> Fixes: 0a14842f5a3c ("net: filter: Just In Time compiler for x86-64")
> Reported-by: Daniel Borkmann <daniel@iogearbox.net>
> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>

Applied and queued up for -stable, thanks.

David Laight May 26, 2015, 1:40 p.m. UTC | #3

From: Alexei Starovoitov
> Sent: 22 May 2015 23:43
> x86 has a variable-length instruction encoding, and the x86 JIT
> compiler tries to pick the shortest encoding for each bpf instruction.
> Doing so changes the jump targets, so the JIT makes multiple passes
> over the program. A typical program needs 3 passes. Some very short
> programs converge in 2 passes; large programs may need 4 or 5.
> Specially crafted bpf programs, however, may hit the pass limit, and
> if the program converges on the last iteration the JIT compiler
> produces an image full of 'int 3' insns. Fix this corner case by
> doing a final iteration over the bpf program.

If the JIT compiler is only changing the encoding of the constants
in the x86 instructions (rather than changing the instructions themselves)
then there is likely to be an unmeasurable change in the execution time.
For instance I don't remember there being a difference in execution time
between long and short branches - the only difference is the amount of
cache they use.

	David

Eric Dumazet May 26, 2015, 2:35 p.m. UTC | #4

On Tue, 2015-05-26 at 13:40 +0000, David Laight wrote:

> If the JIT compiler is only changing the encoding of the constants
> in the x86 instructions (rather than changing the instructions themselves)
> then there is likely to be an unmeasurable change in the execution time.
> For instance I don't remember there being a difference in execution time
> between long and short branches - the only difference is the amount of
> cache they use.

icache is precisely the matter here. In the end, it makes a difference.

You could check this interesting study Ingo did recently :

https://lkml.org/lkml/2015/5/19/1009


Eric Dumazet May 26, 2015, 3:29 p.m. UTC | #5

On Tue, 2015-05-26 at 15:13 +0000, David Laight wrote:

> Yes, interesting, a benchmark that manages to run a lot of code 'cold cache'.

We have binaries here at Google with 400 or 500 MBytes of text.

Not benchmark, super real workloads you know.


David Laight May 26, 2015, 3:47 p.m. UTC | #6

From: Eric Dumazet
> Sent: 26 May 2015 16:30
>
> > Yes, interesting, a benchmark that manages to run a lot of code 'cold cache'.
>
> We have binaries here at Google with 400 or 500 MBytes of text.
>
> Not benchmark, super real workloads you know.

Indeed, and a lot of the code is likely to be running 'cold cache'.

I was alluding to the problem where people will benchmark a small function
by running it 1000s of times in a tight loop with exactly the same data.
Not only is it 'hot cache' but any dynamic branch prediction is 'trained'
to the specific data.

	David
