Message ID | CAPWQB7G0kPWwU+-mx7T8F8h9xfPXiohwJD6uv-iH_PKqEeOk0g@mail.gmail.com |
---|---|
State | Not Applicable |
Headers | show |
On 10 March 2017 at 11:34, Joe Stringer <joe@ovn.org> wrote: > On 9 March 2017 at 10:22, Yi-Hung Wei <yihung.wei@gmail.com> wrote: >> On Wed, Mar 8, 2017 at 11:40 AM, Joe Stringer <joe@ovn.org> wrote: > diff --git a/lib/meta-flow.c b/lib/meta-flow.c > index e844008f6294..bef5aad768a3 100644 > --- a/lib/meta-flow.c > +++ b/lib/meta-flow.c > @@ -2836,9 +2836,13 @@ mf_vl_mff_ref_cnt_mod(const struct vl_mff_map > *map, uint64_t tlv_bitmap, > vmf = mf_get_vl_mff__(i + MFF_TUN_METADATA0, map); > if (vmf) { > if (ref) { > - ovs_refcount_ref(&vmf->ref_cnt); > + if (ovs_refcount_ref(&vmf->ref_cnt) == 0) { > + VLOG_WARN("Taking reference on freed VMF %d", i); > + } Turns out that ovs_refcount_ref() already asserts this ;-)
On Fri, Mar 10, 2017 at 11:34 AM, Joe Stringer <joe@ovn.org> wrote: > On 9 March 2017 at 10:22, Yi-Hung Wei <yihung.wei@gmail.com> wrote: > > On Wed, Mar 8, 2017 at 11:40 AM, Joe Stringer <joe@ovn.org> wrote: > >> > @@ -145,6 +150,7 @@ learn_execute(const struct ofpact_learn *learn, > >> > const struct flow *flow, > >> > spec->n_bits); > >> > bitwise_one(ofpact_set_field_mask(sf), > >> > spec->dst.field->n_bytes, > >> > spec->dst.ofs, spec->n_bits); > >> > + mf_vl_mff_set_tlv_bitmap(spec->dst.field, > >> > &fm->ofpacts_tlv_bitmap); > >> > break; > >> > > >> > case NX_LEARN_DST_OUTPUT: > >> > >> I think we should also set the tlv bitmap in this LEARN_DST_OUTPUT > >> case? IIUC this uses an OXM field to determine the output port. I > >> believe it's possible for someone to use a TLV to specify an output > >> action using LEARN - so it would set the bitmap in the same way that > >> the NX_LEARN_DST_LOAD case does. > > > > Yes, for output action in learn, we can specify a TLV to be the SRC > field, > > but the DST field will eventually be an immediate value that we derive > from > > the SRC field. So the ref count for a TLV in LEARN_OUTPUT is incresed > when > > we decode the learn flow indecode_NXAST_RAW_LEARN(), but not when we > create > > the "learned" flow in learn_execute(). > > If I follow, you're saying that the learn flow (original) takes a > reference on the field, so when we create the "learned" flow in > learn_execute, we don't take a reference on the field. But could the > original learn flow be deleted and leave the learned flow behind > without a reference? > Yes, this patch does account the ref counting for learned flows. There are four types of mf_fields usage in the learn action, as in learn_execute(), they are NX_LEARN_SRC_FIELD, NX_LEARN_DST_MATCH, NX_LEARN_DST_LOAD, and NX_LEARN_DST_OUTPUT. In the "learned" flow, we need to account the ref counting of vl_mff_field onDST_MATCH and DST_LOAD, since we need to read values from vl_mf_field in the learned flow. For SRC_FIELD, and DST_OUTPUT, it is translated into immediate value when we create the learned flow. > > >> > >> > >> > diff --git a/lib/meta-flow.c b/lib/meta-flow.c > >> > index 40704e628aaa..7807b3055f5b 100644 > >> > --- a/lib/meta-flow.c > >> > +++ b/lib/meta-flow.c > >> > @@ -27,6 +27,8 @@ > >> > #include "openvswitch/dynamic-string.h" > >> > #include "nx-match.h" > >> > #include "openvswitch/ofp-util.h" > >> > +#include "ofproto/ofproto-provider.h" > >> > +#include "ovs-atomic.h" > >> > #include "ovs-rcu.h" > >> > #include "ovs-thread.h" > >> > #include "packets.h" > >> > @@ -2646,6 +2648,7 @@ field_array_set(enum mf_field_id id, const union > >> > mf_value *value, > >> > * struct vl_mff_map.*/ > >> > struct vl_mf_field { > >> > struct mf_field mf; > >> > + struct ovs_refcount ref_cnt; > >> > struct cmap_node cmap_node; /* In ofproto->vl_mff_map->cmap. */ > >> > }; > >> > > >> > @@ -2655,17 +2658,25 @@ mf_field_hash(uint32_t key) > >> > return hash_int(key, 0); > >> > } > >> > > >> > -void > >> > -mf_vl_mff_map_clear(struct vl_mff_map *vl_mff_map) > >> > +enum ofperr > >> > +mf_vl_mff_map_clear(struct vl_mff_map *vl_mff_map, bool clear) > >> > OVS_REQUIRES(vl_mff_map->mutex) > >> > { > >> > struct vl_mf_field *vmf; > >> > > >> > CMAP_FOR_EACH (vmf, cmap_node, &vl_mff_map->cmap) { > >> > - cmap_remove(&vl_mff_map->cmap, &vmf->cmap_node, > >> > - mf_field_hash(vmf->mf.id)); > >> > - ovsrcu_postpone(free, vmf); > >> > + if (clear) { > >> > + cmap_remove(&vl_mff_map->cmap, &vmf->cmap_node, > >> > + mf_field_hash(vmf->mf.id)); > >> > + ovsrcu_postpone(free, vmf); > >> > + } else { > >> > + if (ovs_refcount_read(&vmf->ref_cnt) != 1) { > >> > + return OFPERR_NXTTMFC_INVALID_TLV_DEL; > >> > + } > >> > + } > >> > } > >> > + > >> > + return 0; > >> > } > >> > >> I found this a bit confusing. mf_vl_mff_map_clear() has a boolean, > >> also named 'clear' that determines whether it will actually clear or > >> not. If you call a function like "clear(clear=false)", what would it > >> intuitively do? > >> > >> I see two calling conventions - first, for TLV clear command from > >> controller: > >> error = mf_vl_mff_map_clear(vl_mff_map, false); > >> if (error) > >> return error; > >> error = mf_vl_mff_map_clear(vl_mff_map, true); > >> > >> Second, from ofproto cleanup code: > >> mf_vl_mff_map_clear(vl_mff_map, true); > >> > >> I suggest that this would be easier to understand if the function were > >> structured with the the bool to be 'force': > >> > >> enum ofperr > >> mf_vl_mff_map_clear(struct vl_mff_map *vl_mff_map, bool force) > >> OVS_REQUIRES(vl_mff_map->mutex) > >> { > >> struct vl_mf_field *vmf; > >> > >> /* First, check whether any flows still refer to the current TLV > >> map. */ > >> if (!force) { > >> CMAP_FOR_EACH (vmf, cmap_node, &vl_mff_map->cmap) { > >> if (ovs_refcount_read(&vmf->ref_cnt) != 1) { > >> return OFPERR_NXTTMFC_INVALID_TLV_DEL; > >> } > >> } > >> } > >> > >> CMAP_FOR_EACH (vmf, cmap_node, &vl_mff_map->cmap) { > >> cmap_remove(&vl_mff_map->cmap, &vmf->cmap_node, > >> mf_field_hash(vmf->mf.id)); > >> ovsrcu_postpone(free, vmf); > >> } > >> > >> return 0; > >> } > > > > Thanks for suggestion, it is much clear after modification. > > > >> > >> > >> You may also consider refactoring the ovsrcu_postpone(free, vmf) into > >> a separate, vmf_delete() function which looks something like this: > >> > >> static void vmf_delete(struct vl_mf_field *vmf) > >> { > >> if (ovs_refcount_unref(&vmf->ref_cnt) == 1) { > >> /* Postpone as this function is typically called immediately > >> after removing from cmap. */ > >> ovsrcu_postpone(free, vmf); > >> } else { > >> VLOG_WARN_RL(&rl, "Attempted to delete VMF %s but refcount is > >> nonzero!", > >> vmf->mf.name); > >> } > >> } > >> > >> This could be reused from the other delete locations as well. > > > > I do not factor out the ovsrcu_postpone(), becuase the only case that we > > force to clear the vl_mff_map is when we destory ofproto. Otherwise, we > are > > not going to remove vmf from vl_mff_map if ref_count > 1. > > I guess my point was partly that both mf_vl_mff_map_del() and > mf_vl_mff_map_clear() can potentially RCU-defer freeing of the VMF, > and if we push that out to a separate function here that ensures that > the refcount is what we expect it to be, then it's more defensively > coded against potential misuse - attempting to free when it's still > being used, for instance. > Thanks, I applied this to v3. > > >> > +static void > >> > +mf_vl_mff_ref_cnt_mod(const struct vl_mff_map *map, uint64_t > >> > tlv_bitmap, > >> > + bool ref) > >> > +{ > >> > + struct vl_mf_field *vmf; > >> > + int i; > >> > + > >> > + if (map) { > >> > + ULLONG_FOR_EACH_1 (i, tlv_bitmap) { > >> > + vmf = mf_get_vl_mff__(i + MFF_TUN_METADATA0, map); > >> > + if (vmf) { > >> > + if (ref) { > >> > + ovs_refcount_ref(&vmf->ref_cnt); > >> > + } else { > >> > + ovs_refcount_unref(&vmf->ref_cnt); > >> > >> We should warn if we're trying to unref and this is the final ref, > >> because then we're not freeing the object. > > > > It could be the case that no active flow is currently use this tlv, but > the > > controller is going to use this tlv entry for some new flows. In this > case, > > it is not necessary to free vmf since it will be useful. > > Typically when refcounts are used, the last one to decrement it -> 0 > is responsible for freeing it. Also, if it goes to 0 and something is > able to find the element, then it probably shouldn't increment the > refcount because most likely the piece of code that decremented it to > 0 also rcu-deferred the free. It's easier to reason about how we > ensure that the memory a) valid to access and b) is eventually freed > if we stick to this idiom. > > For most cases, this is hit from the main thread which handles > OpenFlow messages; if it's decoding a message and acquiring references > to fields, then by definition it can't also be removing the fields > that we're trying to reference, so I don't think this should ever > decrement the refcount -> 0; therefore, if we warn, we could easily > see that we have some bad assumption in how we understand this code. > > I will note though, I also see some learn_execute() callers for this, > which means that another thread is getting involved - perhaps an > upcall handler or perhaps a revalidator. This makes me a little more > nervous about ensuring that references to the fields are being > correctly counted and whether we may be trying to retain access to a > field which was recently freed. Most likely it's fine, since that > other thread would have to find a flow which references these fields > to be able to take another reference (eg for the "learned" flow); but > warning something like this would be more explicit in this function: > > diff --git a/lib/meta-flow.c b/lib/meta-flow.c > index e844008f6294..bef5aad768a3 100644 > --- a/lib/meta-flow.c > +++ b/lib/meta-flow.c > @@ -2836,9 +2836,13 @@ mf_vl_mff_ref_cnt_mod(const struct vl_mff_map > *map, uint64_t tlv_bitmap, > vmf = mf_get_vl_mff__(i + MFF_TUN_METADATA0, map); > if (vmf) { > if (ref) { > - ovs_refcount_ref(&vmf->ref_cnt); > + if (ovs_refcount_ref(&vmf->ref_cnt) == 0) { > + VLOG_WARN("Taking reference on freed VMF %d", i); > + } > } else { > - ovs_refcount_unref(&vmf->ref_cnt); > + if (ovs_refcount_unref(&vmf->ref_cnt) == 1) { > + VLOG_WARN("Dereferencing VMF %d without free", i); > + } > } > } else { > VLOG_WARN("Invalid TLV index %d.", i); > > Maybe it's over the top. But it would be nicer to know through a log > message than just have some crazy bug show up later on and try to > figure out that this is what is happening. > Thanks, as you pointed out on in the following e-mail, since ovs_refcount_ref() already asserts this, we are not going to vlog here. > >> > @@ -1164,6 +1169,8 @@ decode_NXAST_RAW_OUTPUT_REG2(const struct > >> > nx_action_output_reg2 *naor, > >> > if (error) { > >> > return error; > >> > } > >> > + mf_vl_mff_set_tlv_bitmap(output_reg->src.field, tlv_bitmap); > >> > + > >> > >> I wonder if it's cleaner to fold this into nx_pull_header(). > >> > >> It's not so obvious why the ofpacts decode functions in this file > >> inconsistently need to call the mf_check_vl_mff() and/or > >> mf_vl_mff_set_tlv_bitmap(), which makes it a little more difficult to > >> verify (and easier to mess up if new actions are added). It'd be nice > >> to make this more consistent. > > > > > > Thanks, it is hard to follow as in v1. Actually, we use nx_pull_header(), > > nx_pull_entry(), and mf_from_nxm_header() to get a mf_field in > ofp-action.c > > and those three functions are heavily use in other part of the code base > > that may not need to update the bitmap. So instead to fold > > mf_vl_mff_set_tlv_bitmap() directly into the three functions, three > wrapper > > functions mf_vl_mff_nx_pull_header(), mf_vl_mff_nx_pull_entry(), and > > mf_vl_mff_mf_from_nxm_header() are created that handle mf_field operation > > and set the tlv_bitmap, and it should be more easier to follow now. > > OK, fair enough, thanks. > > >> > @@ -1603,6 +1599,12 @@ ofproto_destroy__(struct ofproto *ofproto) > >> > } > >> > free(ofproto->tables); > >> > > >> > + ovs_mutex_lock(&ofproto->vl_mff_map.mutex); > >> > + mf_vl_mff_map_clear(&ofproto->vl_mff_map, true); > >> > + ovs_mutex_unlock(&ofproto->vl_mff_map.mutex); > >> > + cmap_destroy(&ofproto->vl_mff_map.cmap); > >> > + ovs_mutex_destroy(&ofproto->vl_mff_map.mutex); > >> > + > >> > ovs_assert(hindex_is_empty(&ofproto->cookies)); > >> > hindex_destroy(&ofproto->cookies); > >> > > >> > >> Is there a reason for the above two hunks to shift this logic later? > >> Should this be in a separate patch to explain why, or could it be > >> dropped? > >> > > Cause I feel more comfortable to remove the vl_mf_field after all the > rules > > are deleted. In practice, it is invoked when ofproto is destroyed, so it > > probabaly does not matter? > > I think it should be another patch to move it, and describe this > concern in the commit message. > Thanks, I moved it to anther patch in v3. > > >> > @@ -7784,14 +7802,15 @@ handle_tlv_table_mod(struct ofconn *ofconn, > >> > const struct ofp_header *oh) > >> > old_tab = ovsrcu_get_protected(struct tun_table *, > >> > &ofproto->metadata_tab); > >> > error = tun_metadata_table_mod(&ttm, old_tab, &new_tab); > >> > if (!error) { > >> > - ovsrcu_set(&ofproto->metadata_tab, new_tab); > >> > - tun_metadata_postpone_free(old_tab); > >> > - > >> > ovs_mutex_lock(&ofproto->vl_mff_map.mutex); > >> > error = > >> > mf_vl_mff_map_mod_from_tun_metadata(&ofproto->vl_mff_map, > >> > &ttm); > >> > ovs_mutex_unlock(&ofproto->vl_mff_map.mutex); > >> > } > >> > + if (!error) { > >> > + ovsrcu_set(&ofproto->metadata_tab, new_tab); > >> > + tun_metadata_postpone_free(old_tab); > >> > + } > >> > >> Maybe I didn't quite dig deep enough, but what's the implications of > >> moving this? Could it fix another bug, should it be a separate patch? > > > > > > I'm not sure if it be a separate patch? What this change is that we can > only > > modify update the TLV mapping table if all the table modification > operations > > are valid. May it makes more sense like this > > > > if (!error) { > > ovs_mutex_lock(&ofproto->vl_mff_map.mutex); > > error = mf_vl_mff_map_mod_from_tun_ > metadata(&ofproto->vl_mff_map, > > &ttm); > > ovs_mutex_unlock(&ofproto->vl_mff_map.mutex); > > if (!error) { > > ovsrcu_set(&ofproto->metadata_tab, new_tab); > > tun_metadata_postpone_free(old_tab); > > } > > } > > LGTM. >
diff --git a/lib/meta-flow.c b/lib/meta-flow.c index e844008f6294..bef5aad768a3 100644 --- a/lib/meta-flow.c +++ b/lib/meta-flow.c @@ -2836,9 +2836,13 @@ mf_vl_mff_ref_cnt_mod(const struct vl_mff_map *map, uint64_t tlv_bitmap, vmf = mf_get_vl_mff__(i + MFF_TUN_METADATA0, map); if (vmf) { if (ref) { - ovs_refcount_ref(&vmf->ref_cnt); + if (ovs_refcount_ref(&vmf->ref_cnt) == 0) { + VLOG_WARN("Taking reference on freed VMF %d", i); + } } else { - ovs_refcount_unref(&vmf->ref_cnt); + if (ovs_refcount_unref(&vmf->ref_cnt) == 1) { + VLOG_WARN("Dereferencing VMF %d without free", i); + } } } else {