Message ID | 20230731045523.315568-1-hepeng.0320@bytedance.com |
---|---|
State | Changes Requested |
Delegated to: | Ilya Maximets |
Headers | show |
Series | [ovs-dev,ovs-dev,v3] dpif-netdev: fix dpif_netdev_flow_put | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | warning | apply and check: warning |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
Bleep bloop. Greetings Peng He, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. checkpatch: ERROR: Author Peng He <xnhp0320@gmail.com> needs to sign off. WARNING: Unexpected sign-offs from developers who are not authors or co-authors or committers: Peng He <hepeng.0320@bytedance.com> Lines checked: 186, Warnings: 1, Errors: 1 Please check this out. If you feel there has been an error, please email aconole@redhat.com Thanks, 0-day Robot
just found that I miss the Fixes tag, will add it in the next post. Peng He <xnhp0320@gmail.com> 于2023年7月31日周一 14:44写道: > OVS allows overlapping megaflows, as long as the actions of these > megaflows are equal. However, the current implementation of action > modification relies on flow_lookup instead of ufid, this could result > in looking up a wrong megaflow and make the ukeys and megaflows > inconsistent > > Just like the test case in the patch, at first we have a rule with the > prefix: > > 10.1.2.0/24 > > and we will get a megaflow with prefixes 10.1.2.2/24 when a packet with IP > 10.1.2.2 is received. > > Then suppose we change the rule into 10.1.0.0/16. OVS prefers to keep the > 10.1.2.2/24 megaflow and just changes its action instead of extending > the prefix into 10.1.2.2/16. > > then suppose we have a 10.1.0.2 packet, since it misses the megaflow, > this time, we will have an overlapping megaflow with the right prefix: > 10.1.0.2/16 > > now we have two megaflows: > 10.1.2.2/24 > 10.1.0.2/16 > > last, suppose we have changed the ruleset again. The revalidator this > time still decides to change the actions of both megaflows instead of > deleting them. > > The dpif_netdev_flow_put will search the megaflow to modify with unmasked > keys, however it might lookup the wrong megaflow as the key 10.1.2.2 > matches > both 10.1.2.2/24 and 10.1.0.2/16! > > This patch changes the megaflow lookup code in modification path into > relying the ufid to find the correct megaflow instead of key lookup. > > Signed-off-by: Peng He <hepeng.0320@bytedance.com> > --- > lib/dpif-netdev.c | 49 ++++++++++++++++++++++++++++++++--------------- > tests/pmd.at | 48 ++++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 82 insertions(+), 15 deletions(-) > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > index 70b953ae6..8c88a5dc0 100644 > --- a/lib/dpif-netdev.c > +++ b/lib/dpif-netdev.c > @@ -4191,7 +4191,7 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd, > const struct dpif_flow_put *put, > struct dpif_flow_stats *stats) > { > - struct dp_netdev_flow *netdev_flow; > + struct dp_netdev_flow *netdev_flow = NULL; > int error = 0; > > if (stats) { > @@ -4199,16 +4199,39 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd, > } > > ovs_mutex_lock(&pmd->flow_mutex); > - netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL); > - if (!netdev_flow) { > - if (put->flags & DPIF_FP_CREATE) { > - dp_netdev_flow_add(pmd, match, ufid, put->actions, > - put->actions_len, ODPP_NONE); > + if (put->ufid) { > + netdev_flow = dp_netdev_pmd_find_flow(pmd, put->ufid, > + put->key, put->key_len); > + } else { > + /* Use key instead of the locally generated ufid > + * to search netdev_flow. > + */ > + netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL); > + } > + > + if (put->flags & DPIF_FP_CREATE) { > + if (!netdev_flow) { > + /* If ufid is provided, use provided ufid, otherwise > + * use locally generated ufid. > + */ > + dp_netdev_flow_add(pmd, match, put->ufid ? put->ufid : ufid, > + put->actions, put->actions_len, ODPP_NONE); > } else { > - error = ENOENT; > + error = EEXIST; > } > - } else { > - if (put->flags & DPIF_FP_MODIFY) { > + goto exit; > + } > + > + if (put->flags & DPIF_FP_MODIFY) { > + if (!netdev_flow) { > + error = ENOENT; > + } else { > + if (!flow_equal(&match->flow, &netdev_flow->flow)) { > + /* Overlapping flow. */ > + error = EINVAL; > + goto exit; > + } > + > struct dp_netdev_actions *new_actions; > struct dp_netdev_actions *old_actions; > > @@ -4239,15 +4262,11 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd, > * counter, and subtracting it before outputting the > stats */ > error = EOPNOTSUPP; > } > - > ovsrcu_postpone(dp_netdev_actions_free, old_actions); > - } else if (put->flags & DPIF_FP_CREATE) { > - error = EEXIST; > - } else { > - /* Overlapping flow. */ > - error = EINVAL; > } > } > + > +exit: > ovs_mutex_unlock(&pmd->flow_mutex); > return error; > } > diff --git a/tests/pmd.at b/tests/pmd.at > index 48f3d432d..f399294d2 100644 > --- a/tests/pmd.at > +++ b/tests/pmd.at > @@ -1300,3 +1300,51 @@ OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log > | grep "PMD load based sleeps > > OVS_VSWITCHD_STOP > AT_CLEANUP > + > +AT_SETUP([PMD - revalidator wrongly modify userspace megaflows]) > + > +OVS_VSWITCHD_START( > +[add-port br0 p1 \ > + -- set interface p1 type=dummy-pmd \ > + -- set bridge br0 datapath-type=dummy \ > + -- add-port br0 p2 \ > + -- set interface p2 type=dummy-pmd -- > +]) > + > +dnl Add one openflow rule and generate a megaflow. > +AT_CHECK([ovs-ofctl add-flow br0 'table=0,in_port=p1,ip,nw_dst= > 10.1.2.0/24,actions=p2']) > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 > 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) > + > +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//'], [0], > [ > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst= > 10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:never, actions:2 > +]) > + > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 > 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) > +dnl Replace openflow rules, trigger the revalidation. > +AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 > actions=ct(commit)' | dnl > +ovs-ofctl --bundle replace-flows br0 -]) > +AT_CHECK([ovs-appctl revalidator/wait]) > + > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 > 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)']) > +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' | > strip_xout_keep_actions], [0], [ > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst= > 10.1.0.2/255.255.0.0,frag=no), packets:0, bytes:0, used:never, > actions:ct(commit) > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst= > 10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:0.0s, > actions:ct(commit) > +]) > + > +dnl Hold the prefix 10.1.2.2/24 by another 10s. > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 > 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) > +dnl Send more 10.1.0.2 to make 10.1.0.0/16 tuple preprend 10.1.2.0/24 > tuple in the pvector of subtables. > +for i in `seq 0 256`;do > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 > 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)']) > +done > + > +AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 actions=p2' | > dnl > +ovs-ofctl --bundle replace-flows br0 -]) > + > +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' | > strip_xout_keep_actions], [0], [ > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst= > 10.1.0.2/255.255.0.0,frag=no), packets:0, bytes:0, used:0.0s, actions:2 > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst= > 10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:0.0s, actions:2 > +]) > + > +OVS_VSWITCHD_STOP > +AT_CLEANUP > -- > 2.25.1 > >
Hi, Ilya, are there any more comments beside the Fixes tag? Peng He <xnhp0320@gmail.com> 于2023年7月31日周一 20:53写道: > just found that I miss the Fixes tag, will add it in the next post. > > Peng He <xnhp0320@gmail.com> 于2023年7月31日周一 14:44写道: > >> OVS allows overlapping megaflows, as long as the actions of these >> megaflows are equal. However, the current implementation of action >> modification relies on flow_lookup instead of ufid, this could result >> in looking up a wrong megaflow and make the ukeys and megaflows >> inconsistent >> >> Just like the test case in the patch, at first we have a rule with the >> prefix: >> >> 10.1.2.0/24 >> >> and we will get a megaflow with prefixes 10.1.2.2/24 when a packet with >> IP >> 10.1.2.2 is received. >> >> Then suppose we change the rule into 10.1.0.0/16. OVS prefers to keep the >> 10.1.2.2/24 megaflow and just changes its action instead of extending >> the prefix into 10.1.2.2/16. >> >> then suppose we have a 10.1.0.2 packet, since it misses the megaflow, >> this time, we will have an overlapping megaflow with the right prefix: >> 10.1.0.2/16 >> >> now we have two megaflows: >> 10.1.2.2/24 >> 10.1.0.2/16 >> >> last, suppose we have changed the ruleset again. The revalidator this >> time still decides to change the actions of both megaflows instead of >> deleting them. >> >> The dpif_netdev_flow_put will search the megaflow to modify with unmasked >> keys, however it might lookup the wrong megaflow as the key 10.1.2.2 >> matches >> both 10.1.2.2/24 and 10.1.0.2/16! >> >> This patch changes the megaflow lookup code in modification path into >> relying the ufid to find the correct megaflow instead of key lookup. >> >> Signed-off-by: Peng He <hepeng.0320@bytedance.com> >> --- >> lib/dpif-netdev.c | 49 ++++++++++++++++++++++++++++++++--------------- >> tests/pmd.at | 48 ++++++++++++++++++++++++++++++++++++++++++++++ >> 2 files changed, 82 insertions(+), 15 deletions(-) >> >> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c >> index 70b953ae6..8c88a5dc0 100644 >> --- a/lib/dpif-netdev.c >> +++ b/lib/dpif-netdev.c >> @@ -4191,7 +4191,7 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd, >> const struct dpif_flow_put *put, >> struct dpif_flow_stats *stats) >> { >> - struct dp_netdev_flow *netdev_flow; >> + struct dp_netdev_flow *netdev_flow = NULL; >> int error = 0; >> >> if (stats) { >> @@ -4199,16 +4199,39 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd, >> } >> >> ovs_mutex_lock(&pmd->flow_mutex); >> - netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL); >> - if (!netdev_flow) { >> - if (put->flags & DPIF_FP_CREATE) { >> - dp_netdev_flow_add(pmd, match, ufid, put->actions, >> - put->actions_len, ODPP_NONE); >> + if (put->ufid) { >> + netdev_flow = dp_netdev_pmd_find_flow(pmd, put->ufid, >> + put->key, put->key_len); >> + } else { >> + /* Use key instead of the locally generated ufid >> + * to search netdev_flow. >> + */ >> + netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL); >> + } >> + >> + if (put->flags & DPIF_FP_CREATE) { >> + if (!netdev_flow) { >> + /* If ufid is provided, use provided ufid, otherwise >> + * use locally generated ufid. >> + */ >> + dp_netdev_flow_add(pmd, match, put->ufid ? put->ufid : ufid, >> + put->actions, put->actions_len, >> ODPP_NONE); >> } else { >> - error = ENOENT; >> + error = EEXIST; >> } >> - } else { >> - if (put->flags & DPIF_FP_MODIFY) { >> + goto exit; >> + } >> + >> + if (put->flags & DPIF_FP_MODIFY) { >> + if (!netdev_flow) { >> + error = ENOENT; >> + } else { >> + if (!flow_equal(&match->flow, &netdev_flow->flow)) { >> + /* Overlapping flow. */ >> + error = EINVAL; >> + goto exit; >> + } >> + >> struct dp_netdev_actions *new_actions; >> struct dp_netdev_actions *old_actions; >> >> @@ -4239,15 +4262,11 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd, >> * counter, and subtracting it before outputting the >> stats */ >> error = EOPNOTSUPP; >> } >> - >> ovsrcu_postpone(dp_netdev_actions_free, old_actions); >> - } else if (put->flags & DPIF_FP_CREATE) { >> - error = EEXIST; >> - } else { >> - /* Overlapping flow. */ >> - error = EINVAL; >> } >> } >> + >> +exit: >> ovs_mutex_unlock(&pmd->flow_mutex); >> return error; >> } >> diff --git a/tests/pmd.at b/tests/pmd.at >> index 48f3d432d..f399294d2 100644 >> --- a/tests/pmd.at >> +++ b/tests/pmd.at >> @@ -1300,3 +1300,51 @@ OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log >> | grep "PMD load based sleeps >> >> OVS_VSWITCHD_STOP >> AT_CLEANUP >> + >> +AT_SETUP([PMD - revalidator wrongly modify userspace megaflows]) >> + >> +OVS_VSWITCHD_START( >> +[add-port br0 p1 \ >> + -- set interface p1 type=dummy-pmd \ >> + -- set bridge br0 datapath-type=dummy \ >> + -- add-port br0 p2 \ >> + -- set interface p2 type=dummy-pmd -- >> +]) >> + >> +dnl Add one openflow rule and generate a megaflow. >> +AT_CHECK([ovs-ofctl add-flow br0 'table=0,in_port=p1,ip,nw_dst= >> 10.1.2.0/24,actions=p2']) >> +AT_CHECK([ovs-appctl netdev-dummy/receive p1 >> 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) >> + >> +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//'], >> [0], [ >> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst= >> 10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:never, >> actions:2 >> +]) >> + >> +AT_CHECK([ovs-appctl netdev-dummy/receive p1 >> 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) >> +dnl Replace openflow rules, trigger the revalidation. >> +AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 >> actions=ct(commit)' | dnl >> +ovs-ofctl --bundle replace-flows br0 -]) >> +AT_CHECK([ovs-appctl revalidator/wait]) >> + >> +AT_CHECK([ovs-appctl netdev-dummy/receive p1 >> 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)']) >> +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' | >> strip_xout_keep_actions], [0], [ >> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst= >> 10.1.0.2/255.255.0.0,frag=no), packets:0, bytes:0, used:never, >> actions:ct(commit) >> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst= >> 10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:0.0s, >> actions:ct(commit) >> +]) >> + >> +dnl Hold the prefix 10.1.2.2/24 by another 10s. >> +AT_CHECK([ovs-appctl netdev-dummy/receive p1 >> 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) >> +dnl Send more 10.1.0.2 to make 10.1.0.0/16 tuple preprend 10.1.2.0/24 >> tuple in the pvector of subtables. >> +for i in `seq 0 256`;do >> +AT_CHECK([ovs-appctl netdev-dummy/receive p1 >> 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)']) >> +done >> + >> +AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 actions=p2' | >> dnl >> +ovs-ofctl --bundle replace-flows br0 -]) >> + >> +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' | >> strip_xout_keep_actions], [0], [ >> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst= >> 10.1.0.2/255.255.0.0,frag=no), packets:0, bytes:0, used:0.0s, actions:2 >> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst= >> 10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:0.0s, actions:2 >> +]) >> + >> +OVS_VSWITCHD_STOP >> +AT_CLEANUP >> -- >> 2.25.1 >> >> > > -- > hepeng >
On Mon, Aug 07, 2023 at 11:13:09AM +0800, Peng He wrote: > Hi, Ilya, > > are there any more comments beside the Fixes tag? FWIIW, the tag aside, this looks good to me. Acked-by: Simon Horman <horms@ovn.org>
On 7/31/23 06:55, Peng He wrote: > OVS allows overlapping megaflows, as long as the actions of these > megaflows are equal. However, the current implementation of action > modification relies on flow_lookup instead of ufid, this could result > in looking up a wrong megaflow and make the ukeys and megaflows inconsistent > > Just like the test case in the patch, at first we have a rule with the > prefix: > > 10.1.2.0/24 > > and we will get a megaflow with prefixes 10.1.2.2/24 when a packet with IP > 10.1.2.2 is received. > > Then suppose we change the rule into 10.1.0.0/16. OVS prefers to keep the > 10.1.2.2/24 megaflow and just changes its action instead of extending > the prefix into 10.1.2.2/16. > > then suppose we have a 10.1.0.2 packet, since it misses the megaflow, > this time, we will have an overlapping megaflow with the right prefix: > 10.1.0.2/16 > > now we have two megaflows: > 10.1.2.2/24 > 10.1.0.2/16 > > last, suppose we have changed the ruleset again. The revalidator this > time still decides to change the actions of both megaflows instead of > deleting them. > > The dpif_netdev_flow_put will search the megaflow to modify with unmasked > keys, however it might lookup the wrong megaflow as the key 10.1.2.2 matches > both 10.1.2.2/24 and 10.1.0.2/16! > > This patch changes the megaflow lookup code in modification path into > relying the ufid to find the correct megaflow instead of key lookup. > > Signed-off-by: Peng He <hepeng.0320@bytedance.com> > --- Thanks for the update! This version looks logically correct to me. See some small comments below. Best regards, Ilya Maximets. > lib/dpif-netdev.c | 49 ++++++++++++++++++++++++++++++++--------------- > tests/pmd.at | 48 ++++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 82 insertions(+), 15 deletions(-) > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > index 70b953ae6..8c88a5dc0 100644 > --- a/lib/dpif-netdev.c > +++ b/lib/dpif-netdev.c > @@ -4191,7 +4191,7 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd, > const struct dpif_flow_put *put, > struct dpif_flow_stats *stats) > { > - struct dp_netdev_flow *netdev_flow; > + struct dp_netdev_flow *netdev_flow = NULL; > int error = 0; > > if (stats) { > @@ -4199,16 +4199,39 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd, > } > > ovs_mutex_lock(&pmd->flow_mutex); > - netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL); > - if (!netdev_flow) { > - if (put->flags & DPIF_FP_CREATE) { > - dp_netdev_flow_add(pmd, match, ufid, put->actions, > - put->actions_len, ODPP_NONE); > + if (put->ufid) { > + netdev_flow = dp_netdev_pmd_find_flow(pmd, put->ufid, > + put->key, put->key_len); > + } else { > + /* Use key instead of the locally generated ufid > + * to search netdev_flow. > + */ Nit: Usually, /* and */ in the comment are on their own lines or both on the lines with the text, i.e. /* * Line 1 * Line 2 */ Or /* Line 1 * Line 2 */ Rarely the style is mixed. In the case above, I'd suggest to move the closing '*/' to the previous line. > + netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL); > + } > + > + if (put->flags & DPIF_FP_CREATE) { > + if (!netdev_flow) { > + /* If ufid is provided, use provided ufid, otherwise > + * use locally generated ufid. > + */ > + dp_netdev_flow_add(pmd, match, put->ufid ? put->ufid : ufid, This looks abit strange, because ufid equals put->ufid, if provided. I'd remove the comment and just use 'ufid' here without a ternary operator. Or maybe a cleaner solution is to remove 'ufid = *put->ufid;' assignment from dpif_netdev_flow_put() and re-name the argument of flow_put_on_pmd() function from 'ufid' to 'local_ufid' or something like that. In this case we should still remove the comment, but keep the ternary operator. Might be easier to read this way. What do you think? > + put->actions, put->actions_len, ODPP_NONE); > } else { > - error = ENOENT; > + error = EEXIST; > } > - } else { > - if (put->flags & DPIF_FP_MODIFY) { > + goto exit; > + } > + > + if (put->flags & DPIF_FP_MODIFY) { > + if (!netdev_flow) { > + error = ENOENT; > + } else { > + if (!flow_equal(&match->flow, &netdev_flow->flow)) { Should this be (!put->ufid && !flow_equal(&match->flow, &netdev_flow->flow)) ? i.e. is it necessary to compare flows if found by external uuid? > + /* Overlapping flow. */ > + error = EINVAL; > + goto exit; > + } > + > struct dp_netdev_actions *new_actions; > struct dp_netdev_actions *old_actions; > > @@ -4239,15 +4262,11 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd, > * counter, and subtracting it before outputting the stats */ > error = EOPNOTSUPP; > } > - > ovsrcu_postpone(dp_netdev_actions_free, old_actions); > - } else if (put->flags & DPIF_FP_CREATE) { > - error = EEXIST; > - } else { > - /* Overlapping flow. */ > - error = EINVAL; > } > } > + > +exit: > ovs_mutex_unlock(&pmd->flow_mutex); > return error; > } > diff --git a/tests/pmd.at b/tests/pmd.at > index 48f3d432d..f399294d2 100644 > --- a/tests/pmd.at > +++ b/tests/pmd.at > @@ -1300,3 +1300,51 @@ OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "PMD load based sleeps > > OVS_VSWITCHD_STOP > AT_CLEANUP > + > +AT_SETUP([PMD - revalidator wrongly modify userspace megaflows]) > + > +OVS_VSWITCHD_START( > +[add-port br0 p1 \ > + -- set interface p1 type=dummy-pmd \ > + -- set bridge br0 datapath-type=dummy \ It's probably better to change order of above two lines. > + -- add-port br0 p2 \ > + -- set interface p2 type=dummy-pmd -- '--' at the end is not needed. > +]) > + > +dnl Add one openflow rule and generate a megaflow. > +AT_CHECK([ovs-ofctl add-flow br0 'table=0,in_port=p1,ip,nw_dst=10.1.2.0/24,actions=p2']) > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) > + > +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//'], [0], [ > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:never, actions:2 > +]) For some reason, this check always fails on FreeBSD: ./pmd.at:1349: ovs-appctl dpctl/dump-flows | sed 's/.*core: [0-9]*//' --- - 2023-08-07 22:43:09.208141000 +0000 +++ /tmp/cirrus-ci-build/tests/testsuite.dir/at-groups/1088/stdout 2023-08-07 22:43:09.207372000 +0000 @@ -1,3 +1 @@ -recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:never, actions:2 - --- The flow dump is empty, presumably the PMD thread didn't fully process the packet yet at the time of a dump. > + > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) > +dnl Replace openflow rules, trigger the revalidation. > +AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 actions=ct(commit)' | dnl > +ovs-ofctl --bundle replace-flows br0 -]) Indent this line to the level of 'echo' above. > +AT_CHECK([ovs-appctl revalidator/wait]) > + > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)']) > +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' | strip_xout_keep_actions], [0], [ > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.0.2/255.255.0.0,frag=no), packets:0, bytes:0, used:never, actions:ct(commit) > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:0.0s, actions:ct(commit) > +]) > + > +dnl Hold the prefix 10.1.2.2/24 by another 10s. > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) > +dnl Send more 10.1.0.2 to make 10.1.0.0/16 tuple preprend 10.1.2.0/24 tuple in the pvector of subtables. > +for i in `seq 0 256`;do A space after ';'. Also, it's better to use $(...) instead of `...`. > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)']) Indent this 2 spaces to the right. > +done > + > +AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 actions=p2' | dnl > +ovs-ofctl --bundle replace-flows br0 -]) Indentation here again. Should, probably, wait here for revalidation as well? > + > +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' | strip_xout_keep_actions], [0], [ > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.0.2/255.255.0.0,frag=no), packets:0, bytes:0, used:0.0s, actions:2 > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:0.0s, actions:2 > +]) > + > +OVS_VSWITCHD_STOP > +AT_CLEANUP
Ilya Maximets <i.maximets@ovn.org> 于2023年8月8日周二 07:30写道: > On 7/31/23 06:55, Peng He wrote: > > OVS allows overlapping megaflows, as long as the actions of these > > megaflows are equal. However, the current implementation of action > > modification relies on flow_lookup instead of ufid, this could result > > in looking up a wrong megaflow and make the ukeys and megaflows > inconsistent > > > > Just like the test case in the patch, at first we have a rule with the > > prefix: > > > > 10.1.2.0/24 > > > > and we will get a megaflow with prefixes 10.1.2.2/24 when a packet with > IP > > 10.1.2.2 is received. > > > > Then suppose we change the rule into 10.1.0.0/16. OVS prefers to keep > the > > 10.1.2.2/24 megaflow and just changes its action instead of extending > > the prefix into 10.1.2.2/16. > > > > then suppose we have a 10.1.0.2 packet, since it misses the megaflow, > > this time, we will have an overlapping megaflow with the right prefix: > > 10.1.0.2/16 > > > > now we have two megaflows: > > 10.1.2.2/24 > > 10.1.0.2/16 > > > > last, suppose we have changed the ruleset again. The revalidator this > > time still decides to change the actions of both megaflows instead of > > deleting them. > > > > The dpif_netdev_flow_put will search the megaflow to modify with unmasked > > keys, however it might lookup the wrong megaflow as the key 10.1.2.2 > matches > > both 10.1.2.2/24 and 10.1.0.2/16! > > > > This patch changes the megaflow lookup code in modification path into > > relying the ufid to find the correct megaflow instead of key lookup. > > > > Signed-off-by: Peng He <hepeng.0320@bytedance.com> > > --- > > Thanks for the update! This version looks logically correct to me. > See some small comments below. > > Best regards, Ilya Maximets. > > > lib/dpif-netdev.c | 49 ++++++++++++++++++++++++++++++++--------------- > > tests/pmd.at | 48 ++++++++++++++++++++++++++++++++++++++++++++++ > > 2 files changed, 82 insertions(+), 15 deletions(-) > > > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > > index 70b953ae6..8c88a5dc0 100644 > > --- a/lib/dpif-netdev.c > > +++ b/lib/dpif-netdev.c > > @@ -4191,7 +4191,7 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd, > > const struct dpif_flow_put *put, > > struct dpif_flow_stats *stats) > > { > > - struct dp_netdev_flow *netdev_flow; > > + struct dp_netdev_flow *netdev_flow = NULL; > > int error = 0; > > > > if (stats) { > > @@ -4199,16 +4199,39 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd, > > } > > > > ovs_mutex_lock(&pmd->flow_mutex); > > - netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL); > > - if (!netdev_flow) { > > - if (put->flags & DPIF_FP_CREATE) { > > - dp_netdev_flow_add(pmd, match, ufid, put->actions, > > - put->actions_len, ODPP_NONE); > > + if (put->ufid) { > > + netdev_flow = dp_netdev_pmd_find_flow(pmd, put->ufid, > > + put->key, put->key_len); > > + } else { > > + /* Use key instead of the locally generated ufid > > + * to search netdev_flow. > > + */ > > Nit: Usually, /* and */ in the comment are on their own lines or > both on the lines with the text, i.e. > > /* > * Line 1 > * Line 2 > */ > > Or > > /* Line 1 > * Line 2 */ > > Rarely the style is mixed. In the case above, I'd suggest to move > the closing '*/' to the previous line. > > > + netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL); > > + } > > + > > + if (put->flags & DPIF_FP_CREATE) { > > + if (!netdev_flow) { > > + /* If ufid is provided, use provided ufid, otherwise > > + * use locally generated ufid. > > + */ > > + dp_netdev_flow_add(pmd, match, put->ufid ? put->ufid : ufid, > > This looks abit strange, because ufid equals put->ufid, if provided. > I'd remove the comment and just use 'ufid' here without a ternary > operator. > > Or maybe a cleaner solution is to remove 'ufid = *put->ufid;' assignment > from dpif_netdev_flow_put() and re-name the argument of flow_put_on_pmd() > function from 'ufid' to 'local_ufid' or something like that. In this > case we should still remove the comment, but keep the ternary operator. > Might be easier to read this way. What do you think? > I miss ufid = put->ufid here. If so, I guess just use ufid here is fine. > > > + put->actions, put->actions_len, > ODPP_NONE); > > } else { > > - error = ENOENT; > > + error = EEXIST; > > } > > - } else { > > - if (put->flags & DPIF_FP_MODIFY) { > > + goto exit; > > + } > > + > > + if (put->flags & DPIF_FP_MODIFY) { > > + if (!netdev_flow) { > > + error = ENOENT; > > + } else { > > + if (!flow_equal(&match->flow, &netdev_flow->flow)) { > > Should this be (!put->ufid && !flow_equal(&match->flow, > &netdev_flow->flow)) ? > i.e. is it necessary to compare flows if found by external uuid? > Yes, no need to check against the flow if ufid is provided. will fix it in the next version. Will fix all the issues you mentioned below. > > > + /* Overlapping flow. */ > > + error = EINVAL; > > + goto exit; > > + } > > + > > struct dp_netdev_actions *new_actions; > > struct dp_netdev_actions *old_actions; > > > > @@ -4239,15 +4262,11 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd, > > * counter, and subtracting it before outputting the > stats */ > > error = EOPNOTSUPP; > > } > > - > > ovsrcu_postpone(dp_netdev_actions_free, old_actions); > > - } else if (put->flags & DPIF_FP_CREATE) { > > - error = EEXIST; > > - } else { > > - /* Overlapping flow. */ > > - error = EINVAL; > > } > > } > > + > > +exit: > > ovs_mutex_unlock(&pmd->flow_mutex); > > return error; > > } > > diff --git a/tests/pmd.at b/tests/pmd.at > > index 48f3d432d..f399294d2 100644 > > --- a/tests/pmd.at > > +++ b/tests/pmd.at > > @@ -1300,3 +1300,51 @@ OVS_WAIT_UNTIL([tail -n +$LINENUM > ovs-vswitchd.log | grep "PMD load based sleeps > > > > OVS_VSWITCHD_STOP > > AT_CLEANUP > > + > > +AT_SETUP([PMD - revalidator wrongly modify userspace megaflows]) > > + > > +OVS_VSWITCHD_START( > > +[add-port br0 p1 \ > > + -- set interface p1 type=dummy-pmd \ > > + -- set bridge br0 datapath-type=dummy \ > > It's probably better to change order of above two lines. > > > + -- add-port br0 p2 \ > > + -- set interface p2 type=dummy-pmd -- > > '--' at the end is not needed. > > > +]) > > + > > +dnl Add one openflow rule and generate a megaflow. > > +AT_CHECK([ovs-ofctl add-flow br0 'table=0,in_port=p1,ip,nw_dst= > 10.1.2.0/24,actions=p2']) > > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 > 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) > > + > > +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//'], > [0], [ > > > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst= > 10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:never, actions:2 > > +]) > > For some reason, this check always fails on FreeBSD: > > ./pmd.at:1349: ovs-appctl dpctl/dump-flows | sed 's/.*core: [0-9]*//' > --- - 2023-08-07 22:43:09.208141000 +0000 > +++ /tmp/cirrus-ci-build/tests/testsuite.dir/at-groups/1088/stdout > 2023-08-07 22:43:09.207372000 +0000 > @@ -1,3 +1 @@ > > -recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst= > 10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:never, actions:2 > - > --- > > The flow dump is empty, presumably the PMD thread didn't fully > process the packet yet at the time of a dump. > So, what's your suggestion? set a large max_idle and sleep like a 1s? > > + > > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 > 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) > > +dnl Replace openflow rules, trigger the revalidation. > > +AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 > actions=ct(commit)' | dnl > > +ovs-ofctl --bundle replace-flows br0 -]) > > Indent this line to the level of 'echo' above. > > > +AT_CHECK([ovs-appctl revalidator/wait]) > > + > > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 > 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)']) > > +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' | > strip_xout_keep_actions], [0], [ > > > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst= > 10.1.0.2/255.255.0.0,frag=no), packets:0, bytes:0, used:never, > actions:ct(commit) > > > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst= > 10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:0.0s, > actions:ct(commit) > > +]) > > + > > +dnl Hold the prefix 10.1.2.2/24 by another 10s. > > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 > 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) > > +dnl Send more 10.1.0.2 to make 10.1.0.0/16 tuple preprend 10.1.2.0/24 > tuple in the pvector of subtables. > > +for i in `seq 0 256`;do > > A space after ';'. Also, it's better to use $(...) instead of `...`. > > > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 > 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)']) > > Indent this 2 spaces to the right. > > > +done > > + > > +AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 actions=p2' | > dnl > > +ovs-ofctl --bundle replace-flows br0 -]) > > Indentation here again. > > Should, probably, wait here for revalidation as well? > > > + > > +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' | > strip_xout_keep_actions], [0], [ > > > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst= > 10.1.0.2/255.255.0.0,frag=no), packets:0, bytes:0, used:0.0s, actions:2 > > > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst= > 10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:0.0s, actions:2 > > +]) > > + > > +OVS_VSWITCHD_STOP > > +AT_CLEANUP > >
On 8/8/23 09:20, Peng He wrote: > > > Ilya Maximets <i.maximets@ovn.org <mailto:i.maximets@ovn.org>> 于2023年8月8日周二 07:30写道: > > On 7/31/23 06:55, Peng He wrote: > > OVS allows overlapping megaflows, as long as the actions of these > > megaflows are equal. However, the current implementation of action > > modification relies on flow_lookup instead of ufid, this could result > > in looking up a wrong megaflow and make the ukeys and megaflows inconsistent > > > > Just like the test case in the patch, at first we have a rule with the > > prefix: > > > > 10.1.2.0/24 <http://10.1.2.0/24> > > > > and we will get a megaflow with prefixes 10.1.2.2/24 <http://10.1.2.2/24> when a packet with IP > > 10.1.2.2 is received. > > > > Then suppose we change the rule into 10.1.0.0/16 <http://10.1.0.0/16>. OVS prefers to keep the > > 10.1.2.2/24 <http://10.1.2.2/24> megaflow and just changes its action instead of extending > > the prefix into 10.1.2.2/16 <http://10.1.2.2/16>. > > > > then suppose we have a 10.1.0.2 packet, since it misses the megaflow, > > this time, we will have an overlapping megaflow with the right prefix: > > 10.1.0.2/16 <http://10.1.0.2/16> > > > > now we have two megaflows: > > 10.1.2.2/24 <http://10.1.2.2/24> > > 10.1.0.2/16 <http://10.1.0.2/16> > > > > last, suppose we have changed the ruleset again. The revalidator this > > time still decides to change the actions of both megaflows instead of > > deleting them. > > > > The dpif_netdev_flow_put will search the megaflow to modify with unmasked > > keys, however it might lookup the wrong megaflow as the key 10.1.2.2 matches > > both 10.1.2.2/24 <http://10.1.2.2/24> and 10.1.0.2/16 <http://10.1.0.2/16>! > > > > This patch changes the megaflow lookup code in modification path into > > relying the ufid to find the correct megaflow instead of key lookup. > > > > Signed-off-by: Peng He <hepeng.0320@bytedance.com <mailto:hepeng.0320@bytedance.com>> > > --- > > Thanks for the update! This version looks logically correct to me. > See some small comments below. > > Best regards, Ilya Maximets. > > > lib/dpif-netdev.c | 49 ++++++++++++++++++++++++++++++++--------------- > > tests/pmd.at <http://pmd.at> | 48 ++++++++++++++++++++++++++++++++++++++++++++++ > > 2 files changed, 82 insertions(+), 15 deletions(-) > > > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > > index 70b953ae6..8c88a5dc0 100644 > > --- a/lib/dpif-netdev.c > > +++ b/lib/dpif-netdev.c > > @@ -4191,7 +4191,7 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd, > > const struct dpif_flow_put *put, > > struct dpif_flow_stats *stats) > > { > > - struct dp_netdev_flow *netdev_flow; > > + struct dp_netdev_flow *netdev_flow = NULL; > > int error = 0; > > > > if (stats) { > > @@ -4199,16 +4199,39 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd, > > } > > > > ovs_mutex_lock(&pmd->flow_mutex); > > - netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL); > > - if (!netdev_flow) { > > - if (put->flags & DPIF_FP_CREATE) { > > - dp_netdev_flow_add(pmd, match, ufid, put->actions, > > - put->actions_len, ODPP_NONE); > > + if (put->ufid) { > > + netdev_flow = dp_netdev_pmd_find_flow(pmd, put->ufid, > > + put->key, put->key_len); > > + } else { > > + /* Use key instead of the locally generated ufid > > + * to search netdev_flow. > > + */ > > Nit: Usually, /* and */ in the comment are on their own lines or > both on the lines with the text, i.e. > > /* > * Line 1 > * Line 2 > */ > > Or > > /* Line 1 > * Line 2 */ > > Rarely the style is mixed. In the case above, I'd suggest to move > the closing '*/' to the previous line. > > > + netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL); > > + } > > + > > + if (put->flags & DPIF_FP_CREATE) { > > + if (!netdev_flow) { > > + /* If ufid is provided, use provided ufid, otherwise > > + * use locally generated ufid. > > + */ > > + dp_netdev_flow_add(pmd, match, put->ufid ? put->ufid : ufid, > > This looks abit strange, because ufid equals put->ufid, if provided. > I'd remove the comment and just use 'ufid' here without a ternary > operator. > > Or maybe a cleaner solution is to remove 'ufid = *put->ufid;' assignment > from dpif_netdev_flow_put() and re-name the argument of flow_put_on_pmd() > function from 'ufid' to 'local_ufid' or something like that. In this > case we should still remove the comment, but keep the ternary operator. > Might be easier to read this way. What do you think? > > > I miss ufid = put->ufid here. If so, I guess just use ufid here is fine. > > > > > > + put->actions, put->actions_len, ODPP_NONE); > > } else { > > - error = ENOENT; > > + error = EEXIST; > > } > > - } else { > > - if (put->flags & DPIF_FP_MODIFY) { > > + goto exit; > > + } > > + > > + if (put->flags & DPIF_FP_MODIFY) { > > + if (!netdev_flow) { > > + error = ENOENT; > > + } else { > > + if (!flow_equal(&match->flow, &netdev_flow->flow)) { > > Should this be (!put->ufid && !flow_equal(&match->flow, &netdev_flow->flow)) ? > i.e. is it necessary to compare flows if found by external uuid? > > Yes, no need to check against the flow if ufid is provided. > will fix it in the next version. > > Will fix all the issues you mentioned below. > > > > > + /* Overlapping flow. */ > > + error = EINVAL; > > + goto exit; > > + } > > + > > struct dp_netdev_actions *new_actions; > > struct dp_netdev_actions *old_actions; > > > > @@ -4239,15 +4262,11 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd, > > * counter, and subtracting it before outputting the stats */ > > error = EOPNOTSUPP; > > } > > - > > ovsrcu_postpone(dp_netdev_actions_free, old_actions); > > - } else if (put->flags & DPIF_FP_CREATE) { > > - error = EEXIST; > > - } else { > > - /* Overlapping flow. */ > > - error = EINVAL; > > } > > } > > + > > +exit: > > ovs_mutex_unlock(&pmd->flow_mutex); > > return error; > > } > > diff --git a/tests/pmd.at <http://pmd.at> b/tests/pmd.at <http://pmd.at> > > index 48f3d432d..f399294d2 100644 > > --- a/tests/pmd.at <http://pmd.at> > > +++ b/tests/pmd.at <http://pmd.at> > > @@ -1300,3 +1300,51 @@ OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "PMD load based sleeps > > > > OVS_VSWITCHD_STOP > > AT_CLEANUP > > + > > +AT_SETUP([PMD - revalidator wrongly modify userspace megaflows]) > > + > > +OVS_VSWITCHD_START( > > +[add-port br0 p1 \ > > + -- set interface p1 type=dummy-pmd \ > > + -- set bridge br0 datapath-type=dummy \ > > It's probably better to change order of above two lines. > > > + -- add-port br0 p2 \ > > + -- set interface p2 type=dummy-pmd -- > > '--' at the end is not needed. > > > +]) > > + > > +dnl Add one openflow rule and generate a megaflow. > > +AT_CHECK([ovs-ofctl add-flow br0 'table=0,in_port=p1,ip,nw_dst=10.1.2.0/24,actions=p2' <http://10.1.2.0/24,actions=p2'>]) > > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) > > + > > +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//'], [0], [ > > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no <http://10.1.2.2/255.255.255.0,frag=no>), packets:0, bytes:0, used:never, actions:2 > > +]) > > For some reason, this check always fails on FreeBSD: > > ./pmd.at:1349 <http://pmd.at:1349>: ovs-appctl dpctl/dump-flows | sed 's/.*core: [0-9]*//' > --- - 2023-08-07 22:43:09.208141000 +0000 > +++ /tmp/cirrus-ci-build/tests/testsuite.dir/at-groups/1088/stdout 2023-08-07 22:43:09.207372000 +0000 > @@ -1,3 +1 @@ > > -recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no <http://10.1.2.2/255.255.255.0,frag=no>), packets:0, bytes:0, used:never, actions:2 > - > --- > > The flow dump is empty, presumably the PMD thread didn't fully > process the packet yet at the time of a dump. > > > So, what's your suggestion? set a large max_idle and sleep like a 1s? Sleeps are not great. We should use OVS_WAIT_UNTIL_EQUAL, I guess. > > > > + > > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) > > +dnl Replace openflow rules, trigger the revalidation. > > +AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 <http://10.1.0.0/16> actions=ct(commit)' | dnl > > +ovs-ofctl --bundle replace-flows br0 -]) > > Indent this line to the level of 'echo' above. > > > +AT_CHECK([ovs-appctl revalidator/wait]) > > + > > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)']) > > +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' | strip_xout_keep_actions], [0], [ > > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.0.2/255.255.0.0,frag=no <http://10.1.0.2/255.255.0.0,frag=no>), packets:0, bytes:0, used:never, actions:ct(commit) > > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no <http://10.1.2.2/255.255.255.0,frag=no>), packets:0, bytes:0, used:0.0s, actions:ct(commit) > > +]) > > + > > +dnl Hold the prefix 10.1.2.2/24 <http://10.1.2.2/24> by another 10s. > > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) > > +dnl Send more 10.1.0.2 to make 10.1.0.0/16 <http://10.1.0.0/16> tuple preprend 10.1.2.0/24 <http://10.1.2.0/24> tuple in the pvector of subtables. > > +for i in `seq 0 256`;do > > A space after ';'. Also, it's better to use $(...) instead of `...`. > > > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)']) > > Indent this 2 spaces to the right. > > > +done > > + > > +AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 <http://10.1.0.0/16> actions=p2' | dnl > > +ovs-ofctl --bundle replace-flows br0 -]) > > Indentation here again. > > Should, probably, wait here for revalidation as well? > > > + > > +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' | strip_xout_keep_actions], [0], [ > > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.0.2/255.255.0.0,frag=no <http://10.1.0.2/255.255.0.0,frag=no>), packets:0, bytes:0, used:0.0s, actions:2 > > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no <http://10.1.2.2/255.255.255.0,frag=no>), packets:0, bytes:0, used:0.0s, actions:2 > > +]) > > + > > +OVS_VSWITCHD_STOP > > +AT_CLEANUP > > > > -- > hepeng
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 70b953ae6..8c88a5dc0 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -4191,7 +4191,7 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd, const struct dpif_flow_put *put, struct dpif_flow_stats *stats) { - struct dp_netdev_flow *netdev_flow; + struct dp_netdev_flow *netdev_flow = NULL; int error = 0; if (stats) { @@ -4199,16 +4199,39 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd, } ovs_mutex_lock(&pmd->flow_mutex); - netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL); - if (!netdev_flow) { - if (put->flags & DPIF_FP_CREATE) { - dp_netdev_flow_add(pmd, match, ufid, put->actions, - put->actions_len, ODPP_NONE); + if (put->ufid) { + netdev_flow = dp_netdev_pmd_find_flow(pmd, put->ufid, + put->key, put->key_len); + } else { + /* Use key instead of the locally generated ufid + * to search netdev_flow. + */ + netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL); + } + + if (put->flags & DPIF_FP_CREATE) { + if (!netdev_flow) { + /* If ufid is provided, use provided ufid, otherwise + * use locally generated ufid. + */ + dp_netdev_flow_add(pmd, match, put->ufid ? put->ufid : ufid, + put->actions, put->actions_len, ODPP_NONE); } else { - error = ENOENT; + error = EEXIST; } - } else { - if (put->flags & DPIF_FP_MODIFY) { + goto exit; + } + + if (put->flags & DPIF_FP_MODIFY) { + if (!netdev_flow) { + error = ENOENT; + } else { + if (!flow_equal(&match->flow, &netdev_flow->flow)) { + /* Overlapping flow. */ + error = EINVAL; + goto exit; + } + struct dp_netdev_actions *new_actions; struct dp_netdev_actions *old_actions; @@ -4239,15 +4262,11 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd, * counter, and subtracting it before outputting the stats */ error = EOPNOTSUPP; } - ovsrcu_postpone(dp_netdev_actions_free, old_actions); - } else if (put->flags & DPIF_FP_CREATE) { - error = EEXIST; - } else { - /* Overlapping flow. */ - error = EINVAL; } } + +exit: ovs_mutex_unlock(&pmd->flow_mutex); return error; } diff --git a/tests/pmd.at b/tests/pmd.at index 48f3d432d..f399294d2 100644 --- a/tests/pmd.at +++ b/tests/pmd.at @@ -1300,3 +1300,51 @@ OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "PMD load based sleeps OVS_VSWITCHD_STOP AT_CLEANUP + +AT_SETUP([PMD - revalidator wrongly modify userspace megaflows]) + +OVS_VSWITCHD_START( +[add-port br0 p1 \ + -- set interface p1 type=dummy-pmd \ + -- set bridge br0 datapath-type=dummy \ + -- add-port br0 p2 \ + -- set interface p2 type=dummy-pmd -- +]) + +dnl Add one openflow rule and generate a megaflow. +AT_CHECK([ovs-ofctl add-flow br0 'table=0,in_port=p1,ip,nw_dst=10.1.2.0/24,actions=p2']) +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) + +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//'], [0], [ +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:never, actions:2 +]) + +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) +dnl Replace openflow rules, trigger the revalidation. +AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 actions=ct(commit)' | dnl +ovs-ofctl --bundle replace-flows br0 -]) +AT_CHECK([ovs-appctl revalidator/wait]) + +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)']) +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' | strip_xout_keep_actions], [0], [ +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.0.2/255.255.0.0,frag=no), packets:0, bytes:0, used:never, actions:ct(commit) +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:0.0s, actions:ct(commit) +]) + +dnl Hold the prefix 10.1.2.2/24 by another 10s. +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) +dnl Send more 10.1.0.2 to make 10.1.0.0/16 tuple preprend 10.1.2.0/24 tuple in the pvector of subtables. +for i in `seq 0 256`;do +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)']) +done + +AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 actions=p2' | dnl +ovs-ofctl --bundle replace-flows br0 -]) + +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' | strip_xout_keep_actions], [0], [ +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.0.2/255.255.0.0,frag=no), packets:0, bytes:0, used:0.0s, actions:2 +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:0.0s, actions:2 +]) + +OVS_VSWITCHD_STOP +AT_CLEANUP
OVS allows overlapping megaflows, as long as the actions of these megaflows are equal. However, the current implementation of action modification relies on flow_lookup instead of ufid, this could result in looking up a wrong megaflow and make the ukeys and megaflows inconsistent Just like the test case in the patch, at first we have a rule with the prefix: 10.1.2.0/24 and we will get a megaflow with prefixes 10.1.2.2/24 when a packet with IP 10.1.2.2 is received. Then suppose we change the rule into 10.1.0.0/16. OVS prefers to keep the 10.1.2.2/24 megaflow and just changes its action instead of extending the prefix into 10.1.2.2/16. then suppose we have a 10.1.0.2 packet, since it misses the megaflow, this time, we will have an overlapping megaflow with the right prefix: 10.1.0.2/16 now we have two megaflows: 10.1.2.2/24 10.1.0.2/16 last, suppose we have changed the ruleset again. The revalidator this time still decides to change the actions of both megaflows instead of deleting them. The dpif_netdev_flow_put will search the megaflow to modify with unmasked keys, however it might lookup the wrong megaflow as the key 10.1.2.2 matches both 10.1.2.2/24 and 10.1.0.2/16! This patch changes the megaflow lookup code in modification path into relying the ufid to find the correct megaflow instead of key lookup. Signed-off-by: Peng He <hepeng.0320@bytedance.com> --- lib/dpif-netdev.c | 49 ++++++++++++++++++++++++++++++++--------------- tests/pmd.at | 48 ++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 82 insertions(+), 15 deletions(-)