Message ID | 1481014882-99783-1-git-send-email-nic@opencloud.tech |
---|---|
State | Changes Requested |
Headers | show |
On Tue, Dec 06, 2016 at 01:01:19AM -0800, nickcooper-zhangtonghao wrote: > When the datapath, whose type is "netdev", processes packets > in userspce action, it may cause a segmentation fault. In the > dp_execute_userspace_action(), we pass the "wc" argument to > dp_netdev_upcall() using NULL. In the dp_netdev_upcall() call tree, > the "wc" will be used. For example, dp_netdev_upcall() uses the > &wc->masks for debugging, and flow_wildcards_init_for_packet() > uses the "wc" if we disable megaflow, which is described in > more detail below. > > Segmentation fault in flow_wildcards_init_for_packet: > > #0 0x0000000000468fe8 flow_wildcards_init_for_packet lib/flow.c:1275 > #1 0x0000000000436c0b upcall_cb ofproto/ofproto-dpif-upcall.c:1231 > #2 0x000000000045bd96 dp_netdev_upcall lib/dpif-netdev.c:3857 > #3 0x0000000000461bf3 dp_execute_userspace_action lib/dpif-netdev.c:4388 > #4 dp_execute_cb lib/dpif-netdev.c:4521 > #5 0x0000000000486ae2 odp_execute_actions lib/odp-execute.c:538 > #6 0x00000000004607f9 dp_netdev_execute_actions lib/dpif-netdev.c:4627 > #7 packet_batch_per_flow_execute lib/dpif-netdev.c:3927 > #8 dp_netdev_input__ lib/dpif-netdev.c:4229 > #9 0x0000000000460ba8 dp_netdev_input lib/dpif-netdev.c:4238 > #10 dp_netdev_process_rxq_port lib/dpif-netdev.c:2873 > #11 0x000000000046126e dpif_netdev_run lib/dpif-netdev.c:3000 > #12 0x000000000042baf5 type_run ofproto/ofproto-dpif.c:504 > #13 0x00000000004192bf ofproto_type_run ofproto/ofproto.c:1687 > #14 0x0000000000409965 bridge_run__ vswitchd/bridge.c:2875 > #15 0x000000000040f145 bridge_run vswitchd/bridge.c:2938 > #16 0x00000000004062e5 main vswitchd/ovs-vswitchd.c:111 > > Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech> > --- > lib/dpif-netdev.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > index 1400511..8175b7e 100644 > --- a/lib/dpif-netdev.c > +++ b/lib/dpif-netdev.c > @@ -4380,11 +4380,12 @@ dp_execute_userspace_action(struct dp_netdev_pmd_thread *pmd, > const struct nlattr *userdata, long long now) > { > struct dp_packet_batch b; > + struct flow_wildcards wc; > int error; > > ofpbuf_clear(actions); > > - error = dp_netdev_upcall(pmd, packet, flow, NULL, ufid, > + error = dp_netdev_upcall(pmd, packet, flow, &wc, ufid, > DPIF_UC_ACTION, userdata, actions, > NULL); > if (!error || error == ENOSPC) { I'm not too familiar with dpif-netdev. However, there's probably some reason that this wasn't done to start with; for example, it might be a performance optimization of some kind. Therefore, it's probably best to at least consider fixing whatever function is doing the null dereference.
2016-12-06 11:27 GMT-08:00 Ben Pfaff <blp@ovn.org>: > On Tue, Dec 06, 2016 at 01:01:19AM -0800, nickcooper-zhangtonghao wrote: >> When the datapath, whose type is "netdev", processes packets >> in userspce action, it may cause a segmentation fault. In the >> dp_execute_userspace_action(), we pass the "wc" argument to >> dp_netdev_upcall() using NULL. In the dp_netdev_upcall() call tree, >> the "wc" will be used. For example, dp_netdev_upcall() uses the >> &wc->masks for debugging, and flow_wildcards_init_for_packet() >> uses the "wc" if we disable megaflow, which is described in >> more detail below. >> >> Segmentation fault in flow_wildcards_init_for_packet: >> >> #0 0x0000000000468fe8 flow_wildcards_init_for_packet lib/flow.c:1275 >> #1 0x0000000000436c0b upcall_cb ofproto/ofproto-dpif-upcall.c:1231 >> #2 0x000000000045bd96 dp_netdev_upcall lib/dpif-netdev.c:3857 >> #3 0x0000000000461bf3 dp_execute_userspace_action lib/dpif-netdev.c:4388 >> #4 dp_execute_cb lib/dpif-netdev.c:4521 >> #5 0x0000000000486ae2 odp_execute_actions lib/odp-execute.c:538 >> #6 0x00000000004607f9 dp_netdev_execute_actions lib/dpif-netdev.c:4627 >> #7 packet_batch_per_flow_execute lib/dpif-netdev.c:3927 >> #8 dp_netdev_input__ lib/dpif-netdev.c:4229 >> #9 0x0000000000460ba8 dp_netdev_input lib/dpif-netdev.c:4238 >> #10 dp_netdev_process_rxq_port lib/dpif-netdev.c:2873 >> #11 0x000000000046126e dpif_netdev_run lib/dpif-netdev.c:3000 >> #12 0x000000000042baf5 type_run ofproto/ofproto-dpif.c:504 >> #13 0x00000000004192bf ofproto_type_run ofproto/ofproto.c:1687 >> #14 0x0000000000409965 bridge_run__ vswitchd/bridge.c:2875 >> #15 0x000000000040f145 bridge_run vswitchd/bridge.c:2938 >> #16 0x00000000004062e5 main vswitchd/ovs-vswitchd.c:111 >> >> Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech> Thanks for the patch >> --- >> lib/dpif-netdev.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c >> index 1400511..8175b7e 100644 >> --- a/lib/dpif-netdev.c >> +++ b/lib/dpif-netdev.c >> @@ -4380,11 +4380,12 @@ dp_execute_userspace_action(struct dp_netdev_pmd_thread *pmd, >> const struct nlattr *userdata, long long now) >> { >> struct dp_packet_batch b; >> + struct flow_wildcards wc; >> int error; >> >> ofpbuf_clear(actions); >> >> - error = dp_netdev_upcall(pmd, packet, flow, NULL, ufid, >> + error = dp_netdev_upcall(pmd, packet, flow, &wc, ufid, >> DPIF_UC_ACTION, userdata, actions, >> NULL); >> if (!error || error == ENOSPC) { > > I'm not too familiar with dpif-netdev. However, there's probably some > reason that this wasn't done to start with; for example, it might be a > performance optimization of some kind. Therefore, it's probably best to > at least consider fixing whatever function is doing the null > dereference. I agree, it seems better to handle the case where 'wc' is NULL in upcall_cb(). process_upcall() already handles it for non-miss upcalls, i.e. when coming from an OVS_ACTION_ATTR_USERSPACE datapath action. Also, I think we could add a test, like the following: ---8<--- AT_SETUP([ofproto-dpif - controller action without megaflows]) OVS_VSWITCHD_START add_of_ports br0 1 AT_CHECK([ovs-ofctl add-flow br0 in_port=1,action=controller]) AT_CHECK([ovs-appctl upcall/disable-megaflows], [0], [dnl megaflows disabled ]) AT_CAPTURE_FILE([ofctl_monitor.log]) AT_CHECK([ovs-ofctl monitor br0 65534 invalid_ttl -P nxt_packet_in --detach --no-chdir --pidfile 2> ofctl_monitor.log]) for i in 1 2; do AT_CHECK([ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x1234)']) done OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 4]) OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit]) AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*\(packets:\)/\1/' | sed 's/used:[[0-9]].[[0-9]]*s/used:0.001s/'], [0], [dnl flow-dump from non-dpdk interfaces: packets:1, bytes:14, used:0.001s, actions:userspace(pid=0,slow_path(controller)) ]) AT_CHECK([cat ofctl_monitor.log], [0], [dnl NXT_PACKET_IN (xid=0x0): cookie=0x0 total_len=14 in_port=1 (via action) data_len=14 (unbuffered) vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,dl_type=0x1234 dnl NXT_PACKET_IN (xid=0x0): cookie=0x0 total_len=14 in_port=1 (via action) data_len=14 (unbuffered) vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,dl_type=0x1234 ]) OVS_VSWITCHD_STOP AT_CLEANUP ---8<--- Would you mind sending a v2? Thanks, Daniele
Hi Ben and Daniele, Thanks for your tips. I submitted the patch again. The test has been added. May I also add “Signed-off-by: Daniele Di Proietto <diproiettod@ovn.org>” ? Thanks. Nick > On Dec 7, 2016, at 7:51 AM, Daniele Di Proietto <diproiettod@ovn.org> wrote: > > Thanks for the patch > >>> --- >>> lib/dpif-netdev.c | 3 ++- >>> 1 file changed, 2 insertions(+), 1 deletion(-) >>> >>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c >>> index 1400511..8175b7e 100644 >>> --- a/lib/dpif-netdev.c >>> +++ b/lib/dpif-netdev.c >>> @@ -4380,11 +4380,12 @@ dp_execute_userspace_action(struct dp_netdev_pmd_thread *pmd, >>> const struct nlattr *userdata, long long now) >>> { >>> struct dp_packet_batch b; >>> + struct flow_wildcards wc; >>> int error; >>> >>> ofpbuf_clear(actions); >>> >>> - error = dp_netdev_upcall(pmd, packet, flow, NULL, ufid, >>> + error = dp_netdev_upcall(pmd, packet, flow, &wc, ufid, >>> DPIF_UC_ACTION, userdata, actions, >>> NULL); >>> if (!error || error == ENOSPC) { >> >> I'm not too familiar with dpif-netdev. However, there's probably some >> reason that this wasn't done to start with; for example, it might be a >> performance optimization of some kind. Therefore, it's probably best to >> at least consider fixing whatever function is doing the null >> dereference. > > I agree, it seems better to handle the case where 'wc' is NULL in upcall_cb(). > process_upcall() already handles it for non-miss upcalls, i.e. when coming from > an OVS_ACTION_ATTR_USERSPACE datapath action. > > Also, I think we could add a test, like the following:
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 1400511..8175b7e 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -4380,11 +4380,12 @@ dp_execute_userspace_action(struct dp_netdev_pmd_thread *pmd, const struct nlattr *userdata, long long now) { struct dp_packet_batch b; + struct flow_wildcards wc; int error; ofpbuf_clear(actions); - error = dp_netdev_upcall(pmd, packet, flow, NULL, ufid, + error = dp_netdev_upcall(pmd, packet, flow, &wc, ufid, DPIF_UC_ACTION, userdata, actions, NULL); if (!error || error == ENOSPC) {
When the datapath, whose type is "netdev", processes packets in userspce action, it may cause a segmentation fault. In the dp_execute_userspace_action(), we pass the "wc" argument to dp_netdev_upcall() using NULL. In the dp_netdev_upcall() call tree, the "wc" will be used. For example, dp_netdev_upcall() uses the &wc->masks for debugging, and flow_wildcards_init_for_packet() uses the "wc" if we disable megaflow, which is described in more detail below. Segmentation fault in flow_wildcards_init_for_packet: #0 0x0000000000468fe8 flow_wildcards_init_for_packet lib/flow.c:1275 #1 0x0000000000436c0b upcall_cb ofproto/ofproto-dpif-upcall.c:1231 #2 0x000000000045bd96 dp_netdev_upcall lib/dpif-netdev.c:3857 #3 0x0000000000461bf3 dp_execute_userspace_action lib/dpif-netdev.c:4388 #4 dp_execute_cb lib/dpif-netdev.c:4521 #5 0x0000000000486ae2 odp_execute_actions lib/odp-execute.c:538 #6 0x00000000004607f9 dp_netdev_execute_actions lib/dpif-netdev.c:4627 #7 packet_batch_per_flow_execute lib/dpif-netdev.c:3927 #8 dp_netdev_input__ lib/dpif-netdev.c:4229 #9 0x0000000000460ba8 dp_netdev_input lib/dpif-netdev.c:4238 #10 dp_netdev_process_rxq_port lib/dpif-netdev.c:2873 #11 0x000000000046126e dpif_netdev_run lib/dpif-netdev.c:3000 #12 0x000000000042baf5 type_run ofproto/ofproto-dpif.c:504 #13 0x00000000004192bf ofproto_type_run ofproto/ofproto.c:1687 #14 0x0000000000409965 bridge_run__ vswitchd/bridge.c:2875 #15 0x000000000040f145 bridge_run vswitchd/bridge.c:2938 #16 0x00000000004062e5 main vswitchd/ovs-vswitchd.c:111 Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech> --- lib/dpif-netdev.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)