Message ID | 20190513185402.220122-1-sdf@google.com |
---|---|
State | Changes Requested |
Delegated to: | BPF Maintainers |
Headers | show |
Series | [bpf,1/2] flow_dissector: support FLOW_DISSECTOR_KEY_ETH_ADDRSwith BPF | expand |
On Mon, May 13, 2019 at 3:53 PM Stanislav Fomichev <sdf@google.com> wrote: > > If we have a flow dissector BPF program attached to the namespace, > FLOW_DISSECTOR_KEY_ETH_ADDRS won't trigger because we exit early. I suppose that this is true for a variety of keys? For instance, also FLOW_DISSECTOR_KEY_IPV4_ADDRS. We originally intended BPF flow dissection for all paths except tc_flower. As that catches all the vulnerable cases on the ingress path on the one hand and it is infeasible to support all the flower features, now and future. I think that is the real fix. > > Handle FLOW_DISSECTOR_KEY_ETH_ADDRS before BPF and only if we have > an skb (used by tc-flower only). > > Fixes: d58e468b1112 ("flow_dissector: implements flow dissector BPF hook") > Signed-off-by: Stanislav Fomichev <sdf@google.com> > --- > net/core/flow_dissector.c | 23 ++++++++++++----------- > 1 file changed, 12 insertions(+), 11 deletions(-) > > diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c > index 9ca784c592ac..ba76d9168c8b 100644 > --- a/net/core/flow_dissector.c > +++ b/net/core/flow_dissector.c > @@ -825,6 +825,18 @@ bool __skb_flow_dissect(const struct net *net, > else if (skb->sk) > net = sock_net(skb->sk); > } > + > + if (dissector_uses_key(flow_dissector, > + FLOW_DISSECTOR_KEY_ETH_ADDRS)) { > + struct ethhdr *eth = eth_hdr(skb); Here as well as in the original patch: is it safe to just cast to eth_hdr? In the same file, __skb_flow_dissect_gre does test for (encapsulated) protocol first.
On 05/13, Willem de Bruijn wrote: > On Mon, May 13, 2019 at 3:53 PM Stanislav Fomichev <sdf@google.com> wrote: > > > > If we have a flow dissector BPF program attached to the namespace, > > FLOW_DISSECTOR_KEY_ETH_ADDRS won't trigger because we exit early. > > I suppose that this is true for a variety of keys? For instance, also > FLOW_DISSECTOR_KEY_IPV4_ADDRS. I though the intent was to support most of the basic stuff (eth/ip/tcp/udp) without any esoteric protocols. Not sure about FLOW_DISSECTOR_KEY_IPV4_ADDRS, looks like we support that (except FLOW_DISSECTOR_KEY_TIPC part). > We originally intended BPF flow dissection for all paths except > tc_flower. As that catches all the vulnerable cases on the ingress > path on the one hand and it is infeasible to support all the > flower features, now and future. I think that is the real fix. Sorry, didn't get what you meant by the real fix. Don't care about tc_flower? Just support a minimal set of features needed by selftests? > > > > Handle FLOW_DISSECTOR_KEY_ETH_ADDRS before BPF and only if we have > > an skb (used by tc-flower only). > > > > Fixes: d58e468b1112 ("flow_dissector: implements flow dissector BPF hook") > > Signed-off-by: Stanislav Fomichev <sdf@google.com> > > --- > > net/core/flow_dissector.c | 23 ++++++++++++----------- > > 1 file changed, 12 insertions(+), 11 deletions(-) > > > > diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c > > index 9ca784c592ac..ba76d9168c8b 100644 > > --- a/net/core/flow_dissector.c > > +++ b/net/core/flow_dissector.c > > @@ -825,6 +825,18 @@ bool __skb_flow_dissect(const struct net *net, > > else if (skb->sk) > > net = sock_net(skb->sk); > > } > > + > > + if (dissector_uses_key(flow_dissector, > > + FLOW_DISSECTOR_KEY_ETH_ADDRS)) { > > + struct ethhdr *eth = eth_hdr(skb); > > Here as well as in the original patch: is it safe to just cast to > eth_hdr? In the same file, __skb_flow_dissect_gre does test for > (encapsulated) protocol first. Good question, I guess the assumption here is that FLOW_DISSECTOR_KEY_ETH_ADDRS is only used by tc_flower and the appropriate checks should be there as well. It's probably better to check skb->proto here though.
On Mon, May 13, 2019 at 5:02 PM Stanislav Fomichev <sdf@fomichev.me> wrote: > > On 05/13, Willem de Bruijn wrote: > > On Mon, May 13, 2019 at 3:53 PM Stanislav Fomichev <sdf@google.com> wrote: > > > > > > If we have a flow dissector BPF program attached to the namespace, > > > FLOW_DISSECTOR_KEY_ETH_ADDRS won't trigger because we exit early. > > > > I suppose that this is true for a variety of keys? For instance, also > > FLOW_DISSECTOR_KEY_IPV4_ADDRS. > I though the intent was to support most of the basic stuff (eth/ip/tcp/udp) > without any esoteric protocols. Indeed. But this applies both to protocols and the feature set. Both are more limited. > Not sure about FLOW_DISSECTOR_KEY_IPV4_ADDRS, > looks like we support that (except FLOW_DISSECTOR_KEY_TIPC part). Ah, I chose a bad example then. > > We originally intended BPF flow dissection for all paths except > > tc_flower. As that catches all the vulnerable cases on the ingress > > path on the one hand and it is infeasible to support all the > > flower features, now and future. I think that is the real fix. > Sorry, didn't get what you meant by the real fix. > Don't care about tc_flower? Just support a minimal set of features > needed by selftests? I do mean exclude BPF flow dissector (only) for tc_flower, as we cannot guarantee that the BPF program can fully implement the requested feature. > > > > > > > Handle FLOW_DISSECTOR_KEY_ETH_ADDRS before BPF and only if we have > > > an skb (used by tc-flower only). > > > > > > Fixes: d58e468b1112 ("flow_dissector: implements flow dissector BPF hook") > > > Signed-off-by: Stanislav Fomichev <sdf@google.com> > > > --- > > > net/core/flow_dissector.c | 23 ++++++++++++----------- > > > 1 file changed, 12 insertions(+), 11 deletions(-) > > > > > > diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c > > > index 9ca784c592ac..ba76d9168c8b 100644 > > > --- a/net/core/flow_dissector.c > > > +++ b/net/core/flow_dissector.c > > > @@ -825,6 +825,18 @@ bool __skb_flow_dissect(const struct net *net, > > > else if (skb->sk) > > > net = sock_net(skb->sk); > > > } > > > + > > > + if (dissector_uses_key(flow_dissector, > > > + FLOW_DISSECTOR_KEY_ETH_ADDRS)) { > > > + struct ethhdr *eth = eth_hdr(skb); > > > > Here as well as in the original patch: is it safe to just cast to > > eth_hdr? In the same file, __skb_flow_dissect_gre does test for > > (encapsulated) protocol first. > Good question, I guess the assumption here is that > FLOW_DISSECTOR_KEY_ETH_ADDRS is only used by tc_flower and the appropriate > checks should be there as well. > It's probably better to check skb->proto here though. Right, as a mistaken or malicious admin can request it on a non Ethernet device and read garbage.
On Mon, May 13, 2019 at 5:21 PM Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote: > > On Mon, May 13, 2019 at 5:02 PM Stanislav Fomichev <sdf@fomichev.me> wrote: > > > > On 05/13, Willem de Bruijn wrote: > > > On Mon, May 13, 2019 at 3:53 PM Stanislav Fomichev <sdf@google.com> wrote: > > > > > > > > If we have a flow dissector BPF program attached to the namespace, > > > > FLOW_DISSECTOR_KEY_ETH_ADDRS won't trigger because we exit early. > > > > > > I suppose that this is true for a variety of keys? For instance, also > > > FLOW_DISSECTOR_KEY_IPV4_ADDRS. > > > I though the intent was to support most of the basic stuff (eth/ip/tcp/udp) > > without any esoteric protocols. > > Indeed. But this applies both to protocols and the feature set. Both > are more limited. > > > Not sure about FLOW_DISSECTOR_KEY_IPV4_ADDRS, > > looks like we support that (except FLOW_DISSECTOR_KEY_TIPC part). > > Ah, I chose a bad example then. > > > > We originally intended BPF flow dissection for all paths except > > > tc_flower. As that catches all the vulnerable cases on the ingress > > > path on the one hand and it is infeasible to support all the > > > flower features, now and future. I think that is the real fix. > > > Sorry, didn't get what you meant by the real fix. > > Don't care about tc_flower? Just support a minimal set of features > > needed by selftests? > > I do mean exclude BPF flow dissector (only) for tc_flower, as we > cannot guarantee that the BPF program can fully implement the > requested feature. Though, the user inserting the BPF flow dissector is the same as the user inserting the flower program, the (per netns) admin. So arguably is aware of the constraints incurred by BPF flow dissection. And else can still detect when a feature is not supported from the (lack of) output, as in this case of Ethernet address. I don't think we want to mix BPF and non-BPF flow dissection though. That subverts the safety argument of switching to BPF for flow dissection.
On 05/13, Willem de Bruijn wrote: > On Mon, May 13, 2019 at 5:21 PM Willem de Bruijn > <willemdebruijn.kernel@gmail.com> wrote: > > > > On Mon, May 13, 2019 at 5:02 PM Stanislav Fomichev <sdf@fomichev.me> wrote: > > > > > > On 05/13, Willem de Bruijn wrote: > > > > On Mon, May 13, 2019 at 3:53 PM Stanislav Fomichev <sdf@google.com> wrote: > > > > > > > > > > If we have a flow dissector BPF program attached to the namespace, > > > > > FLOW_DISSECTOR_KEY_ETH_ADDRS won't trigger because we exit early. > > > > > > > > I suppose that this is true for a variety of keys? For instance, also > > > > FLOW_DISSECTOR_KEY_IPV4_ADDRS. > > > > > I though the intent was to support most of the basic stuff (eth/ip/tcp/udp) > > > without any esoteric protocols. > > > > Indeed. But this applies both to protocols and the feature set. Both > > are more limited. > > > > > Not sure about FLOW_DISSECTOR_KEY_IPV4_ADDRS, > > > looks like we support that (except FLOW_DISSECTOR_KEY_TIPC part). > > > > Ah, I chose a bad example then. > > > > > > We originally intended BPF flow dissection for all paths except > > > > tc_flower. As that catches all the vulnerable cases on the ingress > > > > path on the one hand and it is infeasible to support all the > > > > flower features, now and future. I think that is the real fix. > > > > > Sorry, didn't get what you meant by the real fix. > > > Don't care about tc_flower? Just support a minimal set of features > > > needed by selftests? > > > > I do mean exclude BPF flow dissector (only) for tc_flower, as we > > cannot guarantee that the BPF program can fully implement the > > requested feature. > > Though, the user inserting the BPF flow dissector is the same as the > user inserting the flower program, the (per netns) admin. So arguably > is aware of the constraints incurred by BPF flow dissection. And else > can still detect when a feature is not supported from the (lack of) > output, as in this case of Ethernet address. I don't think we want to > mix BPF and non-BPF flow dissection though. That subverts the safety > argument of switching to BPF for flow dissection. Yes, we cannot completely avoid tc_flower because we use it to do the end-to-end testing. That's why I was trying to make sure "basic" stuff works (it might feel confusing that tc_flower {src,dst}_mac stop working with a bpf program installed). TBH, I'd not call this particular piece of code that exports src/dst addresses a dissection. At this point, it's a well-formed skb with a proper l2 header and we just copy the addresses out. It's probably part of the reason the original patch didn't include any skb->protocol checks.
> On 05/13, Willem de Bruijn wrote: > > On Mon, May 13, 2019 at 5:21 PM Willem de Bruijn > > <willemdebruijn.kernel@gmail.com> wrote: > > > > > > On Mon, May 13, 2019 at 5:02 PM Stanislav Fomichev <sdf@fomichev.me> wrote: > > > > > > > > On 05/13, Willem de Bruijn wrote: > > > > > On Mon, May 13, 2019 at 3:53 PM Stanislav Fomichev <sdf@google.com> wrote: > > > > > > > > > > > > If we have a flow dissector BPF program attached to the namespace, > > > > > > FLOW_DISSECTOR_KEY_ETH_ADDRS won't trigger because we exit early. > > > > > > > > > > I suppose that this is true for a variety of keys? For instance, also > > > > > FLOW_DISSECTOR_KEY_IPV4_ADDRS. > > > > > > > I though the intent was to support most of the basic stuff (eth/ip/tcp/udp) > > > > without any esoteric protocols. > > > > > > Indeed. But this applies both to protocols and the feature set. Both > > > are more limited. > > > > > > > Not sure about FLOW_DISSECTOR_KEY_IPV4_ADDRS, > > > > looks like we support that (except FLOW_DISSECTOR_KEY_TIPC part). > > > > > > Ah, I chose a bad example then. > > > > > > > > We originally intended BPF flow dissection for all paths except > > > > > tc_flower. As that catches all the vulnerable cases on the ingress > > > > > path on the one hand and it is infeasible to support all the > > > > > flower features, now and future. I think that is the real fix. > > > > > > > Sorry, didn't get what you meant by the real fix. > > > > Don't care about tc_flower? Just support a minimal set of features > > > > needed by selftests? > > > > > > I do mean exclude BPF flow dissector (only) for tc_flower, as we > > > cannot guarantee that the BPF program can fully implement the > > > requested feature. > > > > Though, the user inserting the BPF flow dissector is the same as the > > user inserting the flower program, the (per netns) admin. So arguably > > is aware of the constraints incurred by BPF flow dissection. And else > > can still detect when a feature is not supported from the (lack of) > > output, as in this case of Ethernet address. I don't think we want to > > mix BPF and non-BPF flow dissection though. That subverts the safety > > argument of switching to BPF for flow dissection. > Yes, we cannot completely avoid tc_flower because we use it to do > the end-to-end testing. That's why I was trying to make sure "basic" > stuff works (it might feel confusing that tc_flower {src,dst}_mac > stop working with a bpf program installed). > > TBH, I'd not call this particular piece of code that exports src/dst > addresses a dissection. At this point, it's a well-formed skb with > a proper l2 header and we just copy the addresses out. It's probably > part of the reason the original patch didn't include any skb->protocol > checks. On the other hand, we can probably follow a simple rule: if it's not exported via bpf_flow_keys (and src/dsc mac is not), tc_flower is not supported as well.
From: Stanislav Fomichev <sdf@google.com> Date: Mon, May 13, 2019 at 7:21 PM To: Stanislav Fomichev Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>, Network Development, bpf, David Miller, Alexei Starovoitov, Daniel Borkmann, Willem de Bruijn <willemb@google.com>, Petar Penkov > > On 05/13, Willem de Bruijn wrote: > > > On Mon, May 13, 2019 at 5:21 PM Willem de Bruijn > > > <willemdebruijn.kernel@gmail.com> wrote: > > > > > > > > On Mon, May 13, 2019 at 5:02 PM Stanislav Fomichev <sdf@fomichev.me> wrote: > > > > > > > > > > On 05/13, Willem de Bruijn wrote: > > > > > > On Mon, May 13, 2019 at 3:53 PM Stanislav Fomichev <sdf@google.com> wrote: > > > > > > > > > > > > > > If we have a flow dissector BPF program attached to the namespace, > > > > > > > FLOW_DISSECTOR_KEY_ETH_ADDRS won't trigger because we exit early. > > > > > > > > > > > > I suppose that this is true for a variety of keys? For instance, also > > > > > > FLOW_DISSECTOR_KEY_IPV4_ADDRS. > > > > > > > > > I though the intent was to support most of the basic stuff (eth/ip/tcp/udp) > > > > > without any esoteric protocols. > > > > > > > > Indeed. But this applies both to protocols and the feature set. Both > > > > are more limited. > > > > > > > > > Not sure about FLOW_DISSECTOR_KEY_IPV4_ADDRS, > > > > > looks like we support that (except FLOW_DISSECTOR_KEY_TIPC part). > > > > > > > > Ah, I chose a bad example then. > > > > > > > > > > We originally intended BPF flow dissection for all paths except > > > > > > tc_flower. As that catches all the vulnerable cases on the ingress > > > > > > path on the one hand and it is infeasible to support all the > > > > > > flower features, now and future. I think that is the real fix. > > > > > > > > > Sorry, didn't get what you meant by the real fix. > > > > > Don't care about tc_flower? Just support a minimal set of features > > > > > needed by selftests? > > > > > > > > I do mean exclude BPF flow dissector (only) for tc_flower, as we > > > > cannot guarantee that the BPF program can fully implement the > > > > requested feature. > > > > > > Though, the user inserting the BPF flow dissector is the same as the > > > user inserting the flower program, the (per netns) admin. So arguably > > > is aware of the constraints incurred by BPF flow dissection. And else > > > can still detect when a feature is not supported from the (lack of) > > > output, as in this case of Ethernet address. I don't think we want to > > > mix BPF and non-BPF flow dissection though. That subverts the safety > > > argument of switching to BPF for flow dissection. > > Yes, we cannot completely avoid tc_flower because we use it to do > > the end-to-end testing. That's why I was trying to make sure "basic" > > stuff works (it might feel confusing that tc_flower {src,dst}_mac > > stop working with a bpf program installed). > > > > TBH, I'd not call this particular piece of code that exports src/dst > > addresses a dissection. At this point, it's a well-formed skb with > > a proper l2 header and we just copy the addresses out. It's probably > > part of the reason the original patch didn't include any skb->protocol > > checks. But it is not guaranteed to be an Ethernet link layer device. Making this a good example of why when moving to BPF for safety we should not keep any C dissection code in the path at all. > On the other hand, we can probably follow a simple rule: > if it's not exported via bpf_flow_keys (and src/dsc mac is not), > tc_flower is not supported as well. Agreed. I was using that as point of reference just now, too.
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c index 9ca784c592ac..ba76d9168c8b 100644 --- a/net/core/flow_dissector.c +++ b/net/core/flow_dissector.c @@ -825,6 +825,18 @@ bool __skb_flow_dissect(const struct net *net, else if (skb->sk) net = sock_net(skb->sk); } + + if (dissector_uses_key(flow_dissector, + FLOW_DISSECTOR_KEY_ETH_ADDRS)) { + struct ethhdr *eth = eth_hdr(skb); + struct flow_dissector_key_eth_addrs *key_eth_addrs; + + key_eth_addrs = skb_flow_dissector_target(flow_dissector, + FLOW_DISSECTOR_KEY_ETH_ADDRS, + target_container); + memcpy(key_eth_addrs, ð->h_dest, + sizeof(*key_eth_addrs)); + } } WARN_ON_ONCE(!net); @@ -860,17 +872,6 @@ bool __skb_flow_dissect(const struct net *net, rcu_read_unlock(); } - if (dissector_uses_key(flow_dissector, - FLOW_DISSECTOR_KEY_ETH_ADDRS)) { - struct ethhdr *eth = eth_hdr(skb); - struct flow_dissector_key_eth_addrs *key_eth_addrs; - - key_eth_addrs = skb_flow_dissector_target(flow_dissector, - FLOW_DISSECTOR_KEY_ETH_ADDRS, - target_container); - memcpy(key_eth_addrs, ð->h_dest, sizeof(*key_eth_addrs)); - } - proto_again: fdret = FLOW_DISSECT_RET_CONTINUE;
If we have a flow dissector BPF program attached to the namespace, FLOW_DISSECTOR_KEY_ETH_ADDRS won't trigger because we exit early. Handle FLOW_DISSECTOR_KEY_ETH_ADDRS before BPF and only if we have an skb (used by tc-flower only). Fixes: d58e468b1112 ("flow_dissector: implements flow dissector BPF hook") Signed-off-by: Stanislav Fomichev <sdf@google.com> --- net/core/flow_dissector.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-)