diff mbox

[ovs-dev,RFC,PATCHv2] ofp-actions: Add clone action.

Message ID 1480541749-14284-1-git-send-email-u9012063@gmail.com
State RFC
Headers show

Commit Message

William Tu Nov. 30, 2016, 9:35 p.m. UTC
This patch adds OpenFlow clone action with syntax as below:
"clone([action][,action...])".  The clone() action makes a copy of the
current packet and executes the list of actions against the packet,
without affecting the packet after the "clone(...)" action.  In other
word, the packet before the clone() and after the clone() is the same,
no matter what actions executed inside the clone().

The patch reuses the dapatah SAMPLE action, with probability of 100%.
The kernel datapath 'sample()' clones the skb and its flow key under this
circumstance, and actions specified in the clone() are executed against
the cloned skb.  Note that there is no performance overhead of clone, since
if eventually the packet is output, it will also clone the packet.  Here we
simply move the copy at the beginning.

The complexity of adding clone might outweight its use cases.  I'm looking
for comments as well as listing some cases below:
Use case 1:
Set different fields and output to different ports without unset
actions=
  clone(mod_dl_src:<mac1>, output:1), clone(mod_dl_dst:<mac2>, output:2), output:3
Since each clone() has independent packet, output:1 has only dl_src modified,
output:2 has only dl_dst modified, output:3 has original packet.

Similar to case1
actions=
  push_vlan(...), output:2, pop_vlan, push_vlan(...), output:3
can be changed to
actions=
  clone(push_vlan(...), output:2),clone(push_vlan(...), output:3)
without having to add pop_vlan.

case 2: resubmit to another table without worrying packet being modified
  actions=clone(resubmit(1,2)), ...

case 3: truncate in the clone action
Currently OVS can truncate packet by 'output(port=1,max_len=100)', which
ties truncate and output together.  However, sometimes the layer decides
to truncate is separate from the layer to output.  One proposal is to
introduce actions s.t like truncate_set() and truncate_unset(), where
only the action in between sees truncated packet.  Another approach is
to use clone() as below:
actions=
  clone(truncate(100), push_vlan, resubmit, ...)
where we don't need to worry about missing the truncate_unset() because
truncated packet is not visible outside the clone().

We definitely should put some limit on the action types available inside
clone(). For this patch, there is no restriction.

Signed-off-by: William Tu <u9012063@gmail.com>
---
v1->v2
- rebase and change NX1.0+(41) to 42
---
 include/openvswitch/ofp-actions.h |  23 +++++++++
 lib/ofp-actions.c                 | 101 +++++++++++++++++++++++++++++++++++++-
 ofproto/ofproto-dpif-xlate.c      |  35 +++++++++++++
 tests/ofproto-dpif.at             |  19 +++++++
 tests/system-traffic.at           |  29 +++++++++++
 5 files changed, 206 insertions(+), 1 deletion(-)

Comments

Gao Zhenyu Dec. 2, 2016, 2:25 a.m. UTC | #1
Thanks for working on it!

Can we have nested clone?

actions=clone(push_vlan(1), resubmit(1,2), clone(mod_dl_src:<mac1>,
output:5) )

BTW, I believe the port 3 will get a 104byte size packet with this action,
right?
actions= clone(truncate(100), push_vlan, output:3)


Thanks
Zhenyu Gao

2016-12-01 5:35 GMT+08:00 William Tu <u9012063@gmail.com>:

> This patch adds OpenFlow clone action with syntax as below:
> "clone([action][,action...])".  The clone() action makes a copy of the
> current packet and executes the list of actions against the packet,
> without affecting the packet after the "clone(...)" action.  In other
> word, the packet before the clone() and after the clone() is the same,
> no matter what actions executed inside the clone().
>
> The patch reuses the dapatah SAMPLE action, with probability of 100%.
> The kernel datapath 'sample()' clones the skb and its flow key under this
> circumstance, and actions specified in the clone() are executed against
> the cloned skb.  Note that there is no performance overhead of clone, since
> if eventually the packet is output, it will also clone the packet.  Here we
> simply move the copy at the beginning.
>
> The complexity of adding clone might outweight its use cases.  I'm looking
> for comments as well as listing some cases below:
> Use case 1:
> Set different fields and output to different ports without unset
> actions=
>   clone(mod_dl_src:<mac1>, output:1), clone(mod_dl_dst:<mac2>, output:2),
> output:3
> Since each clone() has independent packet, output:1 has only dl_src
> modified,
> output:2 has only dl_dst modified, output:3 has original packet.
>
> Similar to case1
> actions=
>   push_vlan(...), output:2, pop_vlan, push_vlan(...), output:3
> can be changed to
> actions=
>   clone(push_vlan(...), output:2),clone(push_vlan(...), output:3)
> without having to add pop_vlan.
>
> case 2: resubmit to another table without worrying packet being modified
>   actions=clone(resubmit(1,2)), ...
>
> case 3: truncate in the clone action
> Currently OVS can truncate packet by 'output(port=1,max_len=100)', which
> ties truncate and output together.  However, sometimes the layer decides
> to truncate is separate from the layer to output.  One proposal is to
> introduce actions s.t like truncate_set() and truncate_unset(), where
> only the action in between sees truncated packet.  Another approach is
> to use clone() as below:
> actions=
>   clone(truncate(100), push_vlan, resubmit, ...)
> where we don't need to worry about missing the truncate_unset() because
> truncated packet is not visible outside the clone().
>
> We definitely should put some limit on the action types available inside
> clone(). For this patch, there is no restriction.
>
> Signed-off-by: William Tu <u9012063@gmail.com>
> ---
> v1->v2
> - rebase and change NX1.0+(41) to 42
> ---
>  include/openvswitch/ofp-actions.h |  23 +++++++++
>  lib/ofp-actions.c                 | 101 ++++++++++++++++++++++++++++++
> +++++++-
>  ofproto/ofproto-dpif-xlate.c      |  35 +++++++++++++
>  tests/ofproto-dpif.at             |  19 +++++++
>  tests/system-traffic.at           |  29 +++++++++++
>  5 files changed, 206 insertions(+), 1 deletion(-)
>
> diff --git a/include/openvswitch/ofp-actions.h b/include/openvswitch/ofp-
> actions.h
> index 2999261..be4a991 100644
> --- a/include/openvswitch/ofp-actions.h
> +++ b/include/openvswitch/ofp-actions.h
> @@ -109,6 +109,7 @@
>      OFPACT(CT,              ofpact_conntrack,   ofpact, "ct")           \
>      OFPACT(NAT,             ofpact_nat,         ofpact, "nat")          \
>      OFPACT(OUTPUT_TRUNC,    ofpact_output_trunc,ofpact, "output_trunc") \
> +    OFPACT(CLONE,           ofpact_clone,       ofpact, "clone")        \
>                                                                          \
>      /* Debugging actions.                                               \
>       *                                                                  \
> @@ -868,6 +869,28 @@ struct ofpact_sample {
>      enum nx_action_sample_direction direction;
>  };
>
> +/* OFPACT_CLONE.
> + *
> + * Used for cloning the packet and execute actions
> + * in the clone envelope. */
> +struct ofpact_clone {
> +    OFPACT_PADDED_MEMBERS(
> +        struct ofpact ofpact;
> +    );
> +    struct ofpact actions[0];
> +};
> +
> +BUILD_ASSERT_DECL(offsetof(struct ofpact_clone, actions)
> +                  % OFPACT_ALIGNTO == 0);
> +BUILD_ASSERT_DECL(offsetof(struct ofpact_clone, actions)
> +                  == sizeof(struct ofpact_clone));
> +
> +static inline size_t
> +ofpact_clone_get_action_len(const struct ofpact_clone *oc)
> +{
> +    return oc->ofpact.len - offsetof(struct ofpact_clone, actions);
> +}
> +
>  /* OFPACT_DEC_TTL.
>   *
>   * Used for OFPAT11_DEC_NW_TTL, NXAST_DEC_TTL and NXAST_DEC_TTL_CNT_IDS.
> */
> diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c
> index 7507558..227ecb2 100644
> --- a/lib/ofp-actions.c
> +++ b/lib/ofp-actions.c
> @@ -306,6 +306,9 @@ enum ofp_raw_action_type {
>      /* NX1.0+(39): struct nx_action_output_trunc. */
>      NXAST_RAW_OUTPUT_TRUNC,
>
> +    /* NX1.0+(42): struct nx_action_clone, ... */
> +    NXAST_RAW_CLONE,
> +
>  /* ## ------------------ ## */
>  /* ## Debugging actions. ## */
>  /* ## ------------------ ## */
> @@ -386,6 +389,7 @@ ofpact_next_flattened(const struct ofpact *ofpact)
>      switch (ofpact->type) {
>      case OFPACT_OUTPUT:
>      case OFPACT_GROUP:
> +    case OFPACT_CLONE:
>      case OFPACT_CONTROLLER:
>      case OFPACT_ENQUEUE:
>      case OFPACT_OUTPUT_REG:
> @@ -4801,6 +4805,90 @@ format_UNROLL_XLATE(const struct
> ofpact_unroll_xlate *a, struct ds *s)
>                    colors.paren,   colors.end);
>  }
>
> +
> +struct nx_action_clone {
> +    ovs_be16 type;                  /* OFPAT_VENDOR. */
> +    ovs_be16 len;                   /* Length is at least 16. */
> +    ovs_be32 vendor;                /* NX_VENDOR_ID. */
> +    ovs_be16 subtype;               /* NXAST_CLONE. */
> +    ovs_be16 pad0;
> +    ovs_be32 pad1;
> +    /* Followed by a sequence of zero or more OpenFlow actions.  The
> length
> +     * of these is included in 'len'. */
> +};
> +OFP_ASSERT(sizeof(struct nx_action_clone) == 16);
> +
> +static enum ofperr
> +decode_NXAST_RAW_CLONE(const struct nx_action_clone *nac,
> +                        enum ofp_version ofp_version,
> +                        struct ofpbuf *out)
> +{
> +    int error;
> +    struct ofpbuf openflow;
> +    const size_t clone_offset = ofpacts_pull(out);
> +    struct ofpact_clone *clone = ofpact_put_CLONE(out);
> +
> +    /* decode action list */
> +    ofpbuf_pull(out, sizeof(*clone));
> +    openflow = ofpbuf_const_initializer(
> +                    nac + 1, ntohs(nac->len) - sizeof(*nac));
> +    error = ofpacts_pull_openflow_actions__(&openflow, openflow.size,
> +                                            ofp_version,
> +                                            1u <<
> OVSINST_OFPIT11_APPLY_ACTIONS,
> +                                            out, OFPACT_CLONE);
> +    clone = ofpbuf_push_uninit(out, sizeof(*clone));
> +    out->header = &clone->ofpact;
> +    ofpact_finish_CLONE(out, &clone);
> +    ofpbuf_push_uninit(out, clone_offset);
> +    return error;
> +}
> +
> +static void
> +encode_CLONE(const struct ofpact_clone *clone,
> +              enum ofp_version ofp_version, struct ofpbuf *out)
> +{
> +    size_t len;
> +    const size_t ofs = out->size;
> +    struct nx_action_clone *nac;
> +
> +    nac = put_NXAST_CLONE(out);
> +    len = ofpacts_put_openflow_actions(clone->actions,
> +                                 ofpact_clone_get_action_len(clone),
> +                                 out, ofp_version);
> +    len += sizeof(*nac);
> +    nac = ofpbuf_at(out, ofs, sizeof(*nac));
> +    nac->len = htons(len);
> +}
> +
> +static char * OVS_WARN_UNUSED_RESULT
> +parse_CLONE(char *arg, struct ofpbuf *ofpacts,
> +             enum ofputil_protocol *usable_protocols)
> +{
> +
> +    const size_t ct_offset = ofpacts_pull(ofpacts);
> +    struct ofpact_clone *clone = ofpact_put_CLONE(ofpacts);
> +    char *error;
> +
> +    ofpbuf_pull(ofpacts, sizeof(*clone));
> +    error = ofpacts_parse_copy(arg, ofpacts, usable_protocols,
> +                               false, OFPACT_CLONE);
> +    /* header points to the action list */
> +    ofpacts->header = ofpbuf_push_uninit(ofpacts, sizeof(*clone));
> +    clone = ofpacts->header;
> +
> +    ofpact_finish_CLONE(ofpacts, &clone);
> +    ofpbuf_push_uninit(ofpacts, ct_offset);
> +    return error;
> +}
> +
> +static void
> +format_CLONE(const struct ofpact_clone *a, struct ds *s)
> +{
> +    ds_put_format(s, "%sclone(%s", colors.paren, colors.end);
> +    ofpacts_format(a->actions, ofpact_clone_get_action_len(a), s);
> +    ds_put_format(s, "%s)%s", colors.paren, colors.end);
> +}
> +
>  /* Action structure for NXAST_SAMPLE.
>   *
>   * Samples matching packets with the given probability and sends them
> @@ -6212,6 +6300,7 @@ ofpact_is_set_or_move_action(const struct ofpact *a)
>      case OFPACT_BUNDLE:
>      case OFPACT_CLEAR_ACTIONS:
>      case OFPACT_CT:
> +    case OFPACT_CLONE:
>      case OFPACT_NAT:
>      case OFPACT_CONTROLLER:
>      case OFPACT_DEC_MPLS_TTL:
> @@ -6288,6 +6377,7 @@ ofpact_is_allowed_in_actions_set(const struct
> ofpact *a)
>       * the specification.  Thus the order in which they should be applied
>       * in the action set is undefined. */
>      case OFPACT_BUNDLE:
> +    case OFPACT_CLONE:
>      case OFPACT_CONTROLLER:
>      case OFPACT_CT:
>      case OFPACT_NAT:
> @@ -6480,6 +6570,7 @@ ovs_instruction_type_from_ofpact_type(enum
> ofpact_type type)
>          return OVSINST_OFPIT11_GOTO_TABLE;
>      case OFPACT_OUTPUT:
>      case OFPACT_GROUP:
> +    case OFPACT_CLONE:
>      case OFPACT_CONTROLLER:
>      case OFPACT_ENQUEUE:
>      case OFPACT_OUTPUT_REG:
> @@ -7080,6 +7171,9 @@ ofpact_check__(enum ofputil_protocol
> *usable_protocols, struct ofpact *a,
>      case OFPACT_SAMPLE:
>          return 0;
>
> +    case OFPACT_CLONE:
> +        return 0;
> +
>      case OFPACT_CT: {
>          struct ofpact_conntrack *oc = ofpact_get_CT(a);
>
> @@ -7271,7 +7365,7 @@ ofpacts_verify_nested(const struct ofpact *a, enum
> ofpact_type outer_action)
>
>      if (outer_action) {
>          ovs_assert(outer_action == OFPACT_WRITE_ACTIONS
> -                   || outer_action == OFPACT_CT);
> +                   || outer_action == OFPACT_CT || outer_action ==
> OFPACT_CLONE);
>
>          if (outer_action == OFPACT_CT) {
>              if (!field) {
> @@ -7282,6 +7376,10 @@ ofpacts_verify_nested(const struct ofpact *a, enum
> ofpact_type outer_action)
>                  return OFPERR_OFPBAC_BAD_ARGUMENT;
>              }
>          }
> +        if (outer_action == OFPACT_CLONE) {
> +                /* add some constraints. */
> +                return 0;
> +        }
>      }
>
>      return 0;
> @@ -7635,6 +7733,7 @@ ofpact_outputs_to_port(const struct ofpact *ofpact,
> ofp_port_t port)
>      case OFPACT_POP_MPLS:
>      case OFPACT_SAMPLE:
>      case OFPACT_CLEAR_ACTIONS:
> +    case OFPACT_CLONE:
>      case OFPACT_WRITE_ACTIONS:
>      case OFPACT_GOTO_TABLE:
>      case OFPACT_METER:
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 55ac371..b534b31 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -4289,6 +4289,35 @@ xlate_sample_action(struct xlate_ctx *ctx,
>                            tunnel_out_port, false);
>  }
>
> +static void
> +compose_clone_action(struct xlate_ctx *ctx,
> +                    const struct ofpact_clone *oc)
> +{
> +    struct flow old_flow, old_base_flow;
> +    size_t actions_offset;
> +    size_t sample_offset = nl_msg_start_nested(ctx->odp_actions,
> +                                               OVS_ACTION_ATTR_SAMPLE);
> +
> +    /* Ensure that any prior actions are applied. */
> +    xlate_commit_actions(ctx);
> +
> +    /* clone might change the flow key, store the original. */
> +    old_flow = ctx->xin->flow;
> +    old_base_flow = ctx->base_flow;
> +
> +    /* Sample with 100% Probability */
> +    nl_msg_put_u32(ctx->odp_actions, OVS_SAMPLE_ATTR_PROBABILITY,
> UINT32_MAX);
> +    actions_offset = nl_msg_start_nested(ctx->odp_actions,
> +                                         OVS_SAMPLE_ATTR_ACTIONS);
> +    do_xlate_actions(oc->actions, ofpact_clone_get_action_len(oc), ctx);
> +    nl_msg_end_nested(ctx->odp_actions, actions_offset);
> +    nl_msg_end_nested(ctx->odp_actions, sample_offset);
> +
> +    /* restore flow key */
> +    ctx->xin->flow = old_flow;
> +    ctx->base_flow = old_base_flow;
> +}
> +
>  static bool
>  may_receive(const struct xport *xport, struct xlate_ctx *ctx)
>  {
> @@ -4447,6 +4476,7 @@ freeze_unroll_actions(const struct ofpact *a, const
> struct ofpact *end,
>          case OFPACT_WRITE_ACTIONS:
>          case OFPACT_METER:
>          case OFPACT_SAMPLE:
> +        case OFPACT_CLONE:
>          case OFPACT_DEBUG_RECIRC:
>          case OFPACT_CT:
>          case OFPACT_NAT:
> @@ -4695,6 +4725,7 @@ recirc_for_mpls(const struct ofpact *a, struct
> xlate_ctx *ctx)
>      case OFPACT_NOTE:
>      case OFPACT_EXIT:
>      case OFPACT_SAMPLE:
> +    case OFPACT_CLONE:
>      case OFPACT_UNROLL_XLATE:
>      case OFPACT_CT:
>      case OFPACT_NAT:
> @@ -5054,6 +5085,10 @@ do_xlate_actions(const struct ofpact *ofpacts,
> size_t ofpacts_len,
>              xlate_sample_action(ctx, ofpact_get_SAMPLE(a));
>              break;
>
> +        case OFPACT_CLONE:
> +            compose_clone_action(ctx, ofpact_get_CLONE(a));
> +            break;
> +
>          case OFPACT_CT:
>              compose_conntrack_action(ctx, ofpact_get_CT(a));
>              break;
> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
> index ec7bd60..999f9fd 100644
> --- a/tests/ofproto-dpif.at
> +++ b/tests/ofproto-dpif.at
> @@ -6388,6 +6388,25 @@ AT_CHECK([ovs-vsctl destroy
> Flow_Sample_Collector_Set 1], [0], [ignore])
>  OVS_VSWITCHD_STOP
>  AT_CLEANUP
>
> +dnl OpenFlow Clone action using Sample datapath action
> +AT_SETUP([ofproto-dpif - clone action])
> +OVS_VSWITCHD_START
> +add_of_ports br0 1 2 3 4
> +
> +AT_DATA([flows.txt], [dnl
> +in_port=1, actions=clone(set_field:192.168.3.3->ip_src),clone(set_
> field:192.168.4.4->ip_dst,output:2),clone(mod_dl_src:80:
> 81:81:81:81:81,set_field:192.168.5.5->ip_dst,output:3),output:4
> +])
> +AT_CHECK([ovs-ofctl add-flows br0 flows.txt], [0], [ignore])
> +
> +AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'in_port(1),eth(src=50:54:00:
> 00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=
> 10.10.10.2,dst=10.10.10.1,proto=1,tos=1,ttl=128,frag=no),icmp(type=8,code=0)'],
> [0], [stdout])
> +
> +AT_CHECK([tail -1 stdout], [0], [dnl
> +Datapath actions: sample(sample=100.0%,actions(
> drop)),sample(sample=100.0%,actions(set(ipv4(src=10.10.10.
> 2,dst=192.168.4.4)),2)),sample(sample=100.0%,actions(
> set(eth(src=80:81:81:81:81:81)),set(ipv4(src=10.10.10.2,dst=
> 192.168.5.5)),3)),4
> +])
> +
> +OVS_VSWITCHD_STOP
> +AT_CLEANUP
> +
>  dnl Flow based IPFIX statistics check
>  AT_SETUP([ofproto-dpif - Flow IPFIX statistics check])
>  OVS_VSWITCHD_START
> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
> index 801dfe3..31e5a16 100644
> --- a/tests/system-traffic.at
> +++ b/tests/system-traffic.at
> @@ -337,6 +337,35 @@ NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3
> -w 2 10.1.1.100 | FORMAT_PI
>  OVS_TRAFFIC_VSWITCHD_STOP
>  AT_CLEANUP
>
> +AT_SETUP([datapath - clone action])
> +OVS_TRAFFIC_VSWITCHD_START()
> +
> +ADD_NAMESPACES(at_ns0, at_ns1, at_ns2)
> +
> +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
> +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")
> +ADD_VETH(p2, at_ns2, br0, "10.1.1.3/24")
> +
> +AT_CHECK([ovs-vsctl -- set interface ovs-p0 ofport_request=1])
> +AT_CHECK([ovs-vsctl -- set interface ovs-p1 ofport_request=2])
> +AT_CHECK([ovs-vsctl -- set interface ovs-p2 ofport_request=3])
> +
> +dnl verify that the clone(...) won't affect the original packet, so ping
> still works OK
> +dnl without 'output' in 'clone()'
> +AT_CHECK([ovs-ofctl add-flow br0 "in_port=1,actions=clone(mod_
> dl_dst(50:54:00:00:00:0a),set_field:192.168.3.3->ip_dst), output:2"])
> +dnl with 'output' in 'clone()'
> +AT_CHECK([ovs-ofctl add-flow br0 "in_port=2,actions=clone(mod_
> dl_dst(50:54:00:00:00:0b),set_field:192.168.4.4->ip_dst, output:3),
> output:1"])
> +
> +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 |
> FORMAT_PING], [0], [dnl
> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> +])
> +
> +NETNS_DAEMONIZE([at_ns1], [[$PYTHON $srcdir/test-l7.py]], [http0.pid])
> +NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v
> -o wget0.log])
> +
> +OVS_TRAFFIC_VSWITCHD_STOP
> +AT_CLEANUP
> +
>  AT_SETUP([datapath - basic truncate action])
>  OVS_TRAFFIC_VSWITCHD_START()
>  AT_CHECK([ovs-ofctl del-flows br0])
> --
> 2.5.0
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
William Tu Dec. 2, 2016, 5:11 a.m. UTC | #2
On Thu, Dec 1, 2016 at 6:25 PM, Gao Zhenyu <sysugaozhenyu@gmail.com> wrote:
> Thanks for working on it!
>
> Can we have nested clone?
>
> actions=clone(push_vlan(1), resubmit(1,2), clone(mod_dl_src:<mac1>,
> output:5) )
>

I tested it for a couple of simple cases, it seems ok. But it might
complicate things a lot.

> BTW, I believe the port 3 will get a 104byte size packet with this action,
> right?
> actions= clone(truncate(100), push_vlan, output:3)
>

Yes.

Regards,
William

>
> Thanks
> Zhenyu Gao
>
> 2016-12-01 5:35 GMT+08:00 William Tu <u9012063@gmail.com>:
>>
>> This patch adds OpenFlow clone action with syntax as below:
>> "clone([action][,action...])".  The clone() action makes a copy of the
>> current packet and executes the list of actions against the packet,
>> without affecting the packet after the "clone(...)" action.  In other
>> word, the packet before the clone() and after the clone() is the same,
>> no matter what actions executed inside the clone().
>>
>> The patch reuses the dapatah SAMPLE action, with probability of 100%.
>> The kernel datapath 'sample()' clones the skb and its flow key under this
>> circumstance, and actions specified in the clone() are executed against
>> the cloned skb.  Note that there is no performance overhead of clone,
>> since
>> if eventually the packet is output, it will also clone the packet.  Here
>> we
>> simply move the copy at the beginning.
>>
>> The complexity of adding clone might outweight its use cases.  I'm looking
>> for comments as well as listing some cases below:
>> Use case 1:
>> Set different fields and output to different ports without unset
>> actions=
>>   clone(mod_dl_src:<mac1>, output:1), clone(mod_dl_dst:<mac2>, output:2),
>> output:3
>> Since each clone() has independent packet, output:1 has only dl_src
>> modified,
>> output:2 has only dl_dst modified, output:3 has original packet.
>>
>> Similar to case1
>> actions=
>>   push_vlan(...), output:2, pop_vlan, push_vlan(...), output:3
>> can be changed to
>> actions=
>>   clone(push_vlan(...), output:2),clone(push_vlan(...), output:3)
>> without having to add pop_vlan.
>>
>> case 2: resubmit to another table without worrying packet being modified
>>   actions=clone(resubmit(1,2)), ...
>>
>> case 3: truncate in the clone action
>> Currently OVS can truncate packet by 'output(port=1,max_len=100)', which
>> ties truncate and output together.  However, sometimes the layer decides
>> to truncate is separate from the layer to output.  One proposal is to
>> introduce actions s.t like truncate_set() and truncate_unset(), where
>> only the action in between sees truncated packet.  Another approach is
>> to use clone() as below:
>> actions=
>>   clone(truncate(100), push_vlan, resubmit, ...)
>> where we don't need to worry about missing the truncate_unset() because
>> truncated packet is not visible outside the clone().
>>
>> We definitely should put some limit on the action types available inside
>> clone(). For this patch, there is no restriction.
>>
>> Signed-off-by: William Tu <u9012063@gmail.com>
>> ---
>> v1->v2
>> - rebase and change NX1.0+(41) to 42
>> ---
>>  include/openvswitch/ofp-actions.h |  23 +++++++++
>>  lib/ofp-actions.c                 | 101
>> +++++++++++++++++++++++++++++++++++++-
>>  ofproto/ofproto-dpif-xlate.c      |  35 +++++++++++++
>>  tests/ofproto-dpif.at             |  19 +++++++
>>  tests/system-traffic.at           |  29 +++++++++++
>>  5 files changed, 206 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/openvswitch/ofp-actions.h
>> b/include/openvswitch/ofp-actions.h
>> index 2999261..be4a991 100644
>> --- a/include/openvswitch/ofp-actions.h
>> +++ b/include/openvswitch/ofp-actions.h
>> @@ -109,6 +109,7 @@
>>      OFPACT(CT,              ofpact_conntrack,   ofpact, "ct")           \
>>      OFPACT(NAT,             ofpact_nat,         ofpact, "nat")          \
>>      OFPACT(OUTPUT_TRUNC,    ofpact_output_trunc,ofpact, "output_trunc") \
>> +    OFPACT(CLONE,           ofpact_clone,       ofpact, "clone")        \
>>                                                                          \
>>      /* Debugging actions.                                               \
>>       *                                                                  \
>> @@ -868,6 +869,28 @@ struct ofpact_sample {
>>      enum nx_action_sample_direction direction;
>>  };
>>
>> +/* OFPACT_CLONE.
>> + *
>> + * Used for cloning the packet and execute actions
>> + * in the clone envelope. */
>> +struct ofpact_clone {
>> +    OFPACT_PADDED_MEMBERS(
>> +        struct ofpact ofpact;
>> +    );
>> +    struct ofpact actions[0];
>> +};
>> +
>> +BUILD_ASSERT_DECL(offsetof(struct ofpact_clone, actions)
>> +                  % OFPACT_ALIGNTO == 0);
>> +BUILD_ASSERT_DECL(offsetof(struct ofpact_clone, actions)
>> +                  == sizeof(struct ofpact_clone));
>> +
>> +static inline size_t
>> +ofpact_clone_get_action_len(const struct ofpact_clone *oc)
>> +{
>> +    return oc->ofpact.len - offsetof(struct ofpact_clone, actions);
>> +}
>> +
>>  /* OFPACT_DEC_TTL.
>>   *
>>   * Used for OFPAT11_DEC_NW_TTL, NXAST_DEC_TTL and NXAST_DEC_TTL_CNT_IDS.
>> */
>> diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c
>> index 7507558..227ecb2 100644
>> --- a/lib/ofp-actions.c
>> +++ b/lib/ofp-actions.c
>> @@ -306,6 +306,9 @@ enum ofp_raw_action_type {
>>      /* NX1.0+(39): struct nx_action_output_trunc. */
>>      NXAST_RAW_OUTPUT_TRUNC,
>>
>> +    /* NX1.0+(42): struct nx_action_clone, ... */
>> +    NXAST_RAW_CLONE,
>> +
>>  /* ## ------------------ ## */
>>  /* ## Debugging actions. ## */
>>  /* ## ------------------ ## */
>> @@ -386,6 +389,7 @@ ofpact_next_flattened(const struct ofpact *ofpact)
>>      switch (ofpact->type) {
>>      case OFPACT_OUTPUT:
>>      case OFPACT_GROUP:
>> +    case OFPACT_CLONE:
>>      case OFPACT_CONTROLLER:
>>      case OFPACT_ENQUEUE:
>>      case OFPACT_OUTPUT_REG:
>> @@ -4801,6 +4805,90 @@ format_UNROLL_XLATE(const struct
>> ofpact_unroll_xlate *a, struct ds *s)
>>                    colors.paren,   colors.end);
>>  }
>>
>> +
>> +struct nx_action_clone {
>> +    ovs_be16 type;                  /* OFPAT_VENDOR. */
>> +    ovs_be16 len;                   /* Length is at least 16. */
>> +    ovs_be32 vendor;                /* NX_VENDOR_ID. */
>> +    ovs_be16 subtype;               /* NXAST_CLONE. */
>> +    ovs_be16 pad0;
>> +    ovs_be32 pad1;
>> +    /* Followed by a sequence of zero or more OpenFlow actions.  The
>> length
>> +     * of these is included in 'len'. */
>> +};
>> +OFP_ASSERT(sizeof(struct nx_action_clone) == 16);
>> +
>> +static enum ofperr
>> +decode_NXAST_RAW_CLONE(const struct nx_action_clone *nac,
>> +                        enum ofp_version ofp_version,
>> +                        struct ofpbuf *out)
>> +{
>> +    int error;
>> +    struct ofpbuf openflow;
>> +    const size_t clone_offset = ofpacts_pull(out);
>> +    struct ofpact_clone *clone = ofpact_put_CLONE(out);
>> +
>> +    /* decode action list */
>> +    ofpbuf_pull(out, sizeof(*clone));
>> +    openflow = ofpbuf_const_initializer(
>> +                    nac + 1, ntohs(nac->len) - sizeof(*nac));
>> +    error = ofpacts_pull_openflow_actions__(&openflow, openflow.size,
>> +                                            ofp_version,
>> +                                            1u <<
>> OVSINST_OFPIT11_APPLY_ACTIONS,
>> +                                            out, OFPACT_CLONE);
>> +    clone = ofpbuf_push_uninit(out, sizeof(*clone));
>> +    out->header = &clone->ofpact;
>> +    ofpact_finish_CLONE(out, &clone);
>> +    ofpbuf_push_uninit(out, clone_offset);
>> +    return error;
>> +}
>> +
>> +static void
>> +encode_CLONE(const struct ofpact_clone *clone,
>> +              enum ofp_version ofp_version, struct ofpbuf *out)
>> +{
>> +    size_t len;
>> +    const size_t ofs = out->size;
>> +    struct nx_action_clone *nac;
>> +
>> +    nac = put_NXAST_CLONE(out);
>> +    len = ofpacts_put_openflow_actions(clone->actions,
>> +                                 ofpact_clone_get_action_len(clone),
>> +                                 out, ofp_version);
>> +    len += sizeof(*nac);
>> +    nac = ofpbuf_at(out, ofs, sizeof(*nac));
>> +    nac->len = htons(len);
>> +}
>> +
>> +static char * OVS_WARN_UNUSED_RESULT
>> +parse_CLONE(char *arg, struct ofpbuf *ofpacts,
>> +             enum ofputil_protocol *usable_protocols)
>> +{
>> +
>> +    const size_t ct_offset = ofpacts_pull(ofpacts);
>> +    struct ofpact_clone *clone = ofpact_put_CLONE(ofpacts);
>> +    char *error;
>> +
>> +    ofpbuf_pull(ofpacts, sizeof(*clone));
>> +    error = ofpacts_parse_copy(arg, ofpacts, usable_protocols,
>> +                               false, OFPACT_CLONE);
>> +    /* header points to the action list */
>> +    ofpacts->header = ofpbuf_push_uninit(ofpacts, sizeof(*clone));
>> +    clone = ofpacts->header;
>> +
>> +    ofpact_finish_CLONE(ofpacts, &clone);
>> +    ofpbuf_push_uninit(ofpacts, ct_offset);
>> +    return error;
>> +}
>> +
>> +static void
>> +format_CLONE(const struct ofpact_clone *a, struct ds *s)
>> +{
>> +    ds_put_format(s, "%sclone(%s", colors.paren, colors.end);
>> +    ofpacts_format(a->actions, ofpact_clone_get_action_len(a), s);
>> +    ds_put_format(s, "%s)%s", colors.paren, colors.end);
>> +}
>> +
>>  /* Action structure for NXAST_SAMPLE.
>>   *
>>   * Samples matching packets with the given probability and sends them
>> @@ -6212,6 +6300,7 @@ ofpact_is_set_or_move_action(const struct ofpact *a)
>>      case OFPACT_BUNDLE:
>>      case OFPACT_CLEAR_ACTIONS:
>>      case OFPACT_CT:
>> +    case OFPACT_CLONE:
>>      case OFPACT_NAT:
>>      case OFPACT_CONTROLLER:
>>      case OFPACT_DEC_MPLS_TTL:
>> @@ -6288,6 +6377,7 @@ ofpact_is_allowed_in_actions_set(const struct ofpact
>> *a)
>>       * the specification.  Thus the order in which they should be applied
>>       * in the action set is undefined. */
>>      case OFPACT_BUNDLE:
>> +    case OFPACT_CLONE:
>>      case OFPACT_CONTROLLER:
>>      case OFPACT_CT:
>>      case OFPACT_NAT:
>> @@ -6480,6 +6570,7 @@ ovs_instruction_type_from_ofpact_type(enum
>> ofpact_type type)
>>          return OVSINST_OFPIT11_GOTO_TABLE;
>>      case OFPACT_OUTPUT:
>>      case OFPACT_GROUP:
>> +    case OFPACT_CLONE:
>>      case OFPACT_CONTROLLER:
>>      case OFPACT_ENQUEUE:
>>      case OFPACT_OUTPUT_REG:
>> @@ -7080,6 +7171,9 @@ ofpact_check__(enum ofputil_protocol
>> *usable_protocols, struct ofpact *a,
>>      case OFPACT_SAMPLE:
>>          return 0;
>>
>> +    case OFPACT_CLONE:
>> +        return 0;
>> +
>>      case OFPACT_CT: {
>>          struct ofpact_conntrack *oc = ofpact_get_CT(a);
>>
>> @@ -7271,7 +7365,7 @@ ofpacts_verify_nested(const struct ofpact *a, enum
>> ofpact_type outer_action)
>>
>>      if (outer_action) {
>>          ovs_assert(outer_action == OFPACT_WRITE_ACTIONS
>> -                   || outer_action == OFPACT_CT);
>> +                   || outer_action == OFPACT_CT || outer_action ==
>> OFPACT_CLONE);
>>
>>          if (outer_action == OFPACT_CT) {
>>              if (!field) {
>> @@ -7282,6 +7376,10 @@ ofpacts_verify_nested(const struct ofpact *a, enum
>> ofpact_type outer_action)
>>                  return OFPERR_OFPBAC_BAD_ARGUMENT;
>>              }
>>          }
>> +        if (outer_action == OFPACT_CLONE) {
>> +                /* add some constraints. */
>> +                return 0;
>> +        }
>>      }
>>
>>      return 0;
>> @@ -7635,6 +7733,7 @@ ofpact_outputs_to_port(const struct ofpact *ofpact,
>> ofp_port_t port)
>>      case OFPACT_POP_MPLS:
>>      case OFPACT_SAMPLE:
>>      case OFPACT_CLEAR_ACTIONS:
>> +    case OFPACT_CLONE:
>>      case OFPACT_WRITE_ACTIONS:
>>      case OFPACT_GOTO_TABLE:
>>      case OFPACT_METER:
>> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
>> index 55ac371..b534b31 100644
>> --- a/ofproto/ofproto-dpif-xlate.c
>> +++ b/ofproto/ofproto-dpif-xlate.c
>> @@ -4289,6 +4289,35 @@ xlate_sample_action(struct xlate_ctx *ctx,
>>                            tunnel_out_port, false);
>>  }
>>
>> +static void
>> +compose_clone_action(struct xlate_ctx *ctx,
>> +                    const struct ofpact_clone *oc)
>> +{
>> +    struct flow old_flow, old_base_flow;
>> +    size_t actions_offset;
>> +    size_t sample_offset = nl_msg_start_nested(ctx->odp_actions,
>> +                                               OVS_ACTION_ATTR_SAMPLE);
>> +
>> +    /* Ensure that any prior actions are applied. */
>> +    xlate_commit_actions(ctx);
>> +
>> +    /* clone might change the flow key, store the original. */
>> +    old_flow = ctx->xin->flow;
>> +    old_base_flow = ctx->base_flow;
>> +
>> +    /* Sample with 100% Probability */
>> +    nl_msg_put_u32(ctx->odp_actions, OVS_SAMPLE_ATTR_PROBABILITY,
>> UINT32_MAX);
>> +    actions_offset = nl_msg_start_nested(ctx->odp_actions,
>> +                                         OVS_SAMPLE_ATTR_ACTIONS);
>> +    do_xlate_actions(oc->actions, ofpact_clone_get_action_len(oc), ctx);
>> +    nl_msg_end_nested(ctx->odp_actions, actions_offset);
>> +    nl_msg_end_nested(ctx->odp_actions, sample_offset);
>> +
>> +    /* restore flow key */
>> +    ctx->xin->flow = old_flow;
>> +    ctx->base_flow = old_base_flow;
>> +}
>> +
>>  static bool
>>  may_receive(const struct xport *xport, struct xlate_ctx *ctx)
>>  {
>> @@ -4447,6 +4476,7 @@ freeze_unroll_actions(const struct ofpact *a, const
>> struct ofpact *end,
>>          case OFPACT_WRITE_ACTIONS:
>>          case OFPACT_METER:
>>          case OFPACT_SAMPLE:
>> +        case OFPACT_CLONE:
>>          case OFPACT_DEBUG_RECIRC:
>>          case OFPACT_CT:
>>          case OFPACT_NAT:
>> @@ -4695,6 +4725,7 @@ recirc_for_mpls(const struct ofpact *a, struct
>> xlate_ctx *ctx)
>>      case OFPACT_NOTE:
>>      case OFPACT_EXIT:
>>      case OFPACT_SAMPLE:
>> +    case OFPACT_CLONE:
>>      case OFPACT_UNROLL_XLATE:
>>      case OFPACT_CT:
>>      case OFPACT_NAT:
>> @@ -5054,6 +5085,10 @@ do_xlate_actions(const struct ofpact *ofpacts,
>> size_t ofpacts_len,
>>              xlate_sample_action(ctx, ofpact_get_SAMPLE(a));
>>              break;
>>
>> +        case OFPACT_CLONE:
>> +            compose_clone_action(ctx, ofpact_get_CLONE(a));
>> +            break;
>> +
>>          case OFPACT_CT:
>>              compose_conntrack_action(ctx, ofpact_get_CT(a));
>>              break;
>> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
>> index ec7bd60..999f9fd 100644
>> --- a/tests/ofproto-dpif.at
>> +++ b/tests/ofproto-dpif.at
>> @@ -6388,6 +6388,25 @@ AT_CHECK([ovs-vsctl destroy
>> Flow_Sample_Collector_Set 1], [0], [ignore])
>>  OVS_VSWITCHD_STOP
>>  AT_CLEANUP
>>
>> +dnl OpenFlow Clone action using Sample datapath action
>> +AT_SETUP([ofproto-dpif - clone action])
>> +OVS_VSWITCHD_START
>> +add_of_ports br0 1 2 3 4
>> +
>> +AT_DATA([flows.txt], [dnl
>> +in_port=1,
>> actions=clone(set_field:192.168.3.3->ip_src),clone(set_field:192.168.4.4->ip_dst,output:2),clone(mod_dl_src:80:81:81:81:81:81,set_field:192.168.5.5->ip_dst,output:3),output:4
>> +])
>> +AT_CHECK([ovs-ofctl add-flows br0 flows.txt], [0], [ignore])
>> +
>> +AT_CHECK([ovs-appctl ofproto/trace ovs-dummy
>> 'in_port(1),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.10.10.2,dst=10.10.10.1,proto=1,tos=1,ttl=128,frag=no),icmp(type=8,code=0)'],
>> [0], [stdout])
>> +
>> +AT_CHECK([tail -1 stdout], [0], [dnl
>> +Datapath actions:
>> sample(sample=100.0%,actions(drop)),sample(sample=100.0%,actions(set(ipv4(src=10.10.10.2,dst=192.168.4.4)),2)),sample(sample=100.0%,actions(set(eth(src=80:81:81:81:81:81)),set(ipv4(src=10.10.10.2,dst=192.168.5.5)),3)),4
>> +])
>> +
>> +OVS_VSWITCHD_STOP
>> +AT_CLEANUP
>> +
>>  dnl Flow based IPFIX statistics check
>>  AT_SETUP([ofproto-dpif - Flow IPFIX statistics check])
>>  OVS_VSWITCHD_START
>> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
>> index 801dfe3..31e5a16 100644
>> --- a/tests/system-traffic.at
>> +++ b/tests/system-traffic.at
>> @@ -337,6 +337,35 @@ NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3
>> -w 2 10.1.1.100 | FORMAT_PI
>>  OVS_TRAFFIC_VSWITCHD_STOP
>>  AT_CLEANUP
>>
>> +AT_SETUP([datapath - clone action])
>> +OVS_TRAFFIC_VSWITCHD_START()
>> +
>> +ADD_NAMESPACES(at_ns0, at_ns1, at_ns2)
>> +
>> +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
>> +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")
>> +ADD_VETH(p2, at_ns2, br0, "10.1.1.3/24")
>> +
>> +AT_CHECK([ovs-vsctl -- set interface ovs-p0 ofport_request=1])
>> +AT_CHECK([ovs-vsctl -- set interface ovs-p1 ofport_request=2])
>> +AT_CHECK([ovs-vsctl -- set interface ovs-p2 ofport_request=3])
>> +
>> +dnl verify that the clone(...) won't affect the original packet, so ping
>> still works OK
>> +dnl without 'output' in 'clone()'
>> +AT_CHECK([ovs-ofctl add-flow br0
>> "in_port=1,actions=clone(mod_dl_dst(50:54:00:00:00:0a),set_field:192.168.3.3->ip_dst),
>> output:2"])
>> +dnl with 'output' in 'clone()'
>> +AT_CHECK([ovs-ofctl add-flow br0
>> "in_port=2,actions=clone(mod_dl_dst(50:54:00:00:00:0b),set_field:192.168.4.4->ip_dst,
>> output:3), output:1"])
>> +
>> +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 |
>> FORMAT_PING], [0], [dnl
>> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
>> +])
>> +
>> +NETNS_DAEMONIZE([at_ns1], [[$PYTHON $srcdir/test-l7.py]], [http0.pid])
>> +NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v
>> -o wget0.log])
>> +
>> +OVS_TRAFFIC_VSWITCHD_STOP
>> +AT_CLEANUP
>> +
>>  AT_SETUP([datapath - basic truncate action])
>>  OVS_TRAFFIC_VSWITCHD_START()
>>  AT_CHECK([ovs-ofctl del-flows br0])
>> --
>> 2.5.0
>>
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
Ben Pfaff Dec. 2, 2016, 5:09 p.m. UTC | #3
On Wed, Nov 30, 2016 at 01:35:49PM -0800, William Tu wrote:
> This patch adds OpenFlow clone action with syntax as below:
> "clone([action][,action...])".  The clone() action makes a copy of the
> current packet and executes the list of actions against the packet,
> without affecting the packet after the "clone(...)" action.  In other
> word, the packet before the clone() and after the clone() is the same,
> no matter what actions executed inside the clone().

I'm experimenting with using this to eliminate most of the patch ports
that ovn-controller creates.  I'm planning to get that work at least
mostly done before I do the full review here.
William Tu Dec. 2, 2016, 9 p.m. UTC | #4
Note that there are some differences between kernel datapath and
userspace datapath, when using the SAMPLE action. For userspace, we
execute the nested actions inside SAMPLE inline/immediately, while in
kernel datapath, we defer the nested actions to a fifo queue and
execute in the end.

For example:
actions=output:1, clone(output:2), clone(output:3), output:4

Execution sequence seen from kernel DP:
 output:1, output:4, output:2, output:3
Sequence of packet seen from userspace DP:
 output:1, output:2, output:3, output:4

Regards,
William

On Fri, Dec 2, 2016 at 9:09 AM, Ben Pfaff <blp@ovn.org> wrote:
> On Wed, Nov 30, 2016 at 01:35:49PM -0800, William Tu wrote:
>> This patch adds OpenFlow clone action with syntax as below:
>> "clone([action][,action...])".  The clone() action makes a copy of the
>> current packet and executes the list of actions against the packet,
>> without affecting the packet after the "clone(...)" action.  In other
>> word, the packet before the clone() and after the clone() is the same,
>> no matter what actions executed inside the clone().
>
> I'm experimenting with using this to eliminate most of the patch ports
> that ovn-controller creates.  I'm planning to get that work at least
> mostly done before I do the full review here.
Ben Pfaff Dec. 5, 2016, 7:20 a.m. UTC | #5
I sent out a series based on this, but it appears to reveal a bug in the
clone action that causes segfaults.  Try the series without the patch
"XXX Fix segfault in clone action" applied and you should get test
failures in several tests, e.g. 2249 2251 2252 2253 2254 2256 2259 2260
2266 2268 2277 2278

The series starts at:
        https://patchwork.ozlabs.org/patch/702602/

Thanks,

Ben.

On Fri, Dec 02, 2016 at 01:00:26PM -0800, William Tu wrote:
> Note that there are some differences between kernel datapath and
> userspace datapath, when using the SAMPLE action. For userspace, we
> execute the nested actions inside SAMPLE inline/immediately, while in
> kernel datapath, we defer the nested actions to a fifo queue and
> execute in the end.
> 
> For example:
> actions=output:1, clone(output:2), clone(output:3), output:4
> 
> Execution sequence seen from kernel DP:
>  output:1, output:4, output:2, output:3
> Sequence of packet seen from userspace DP:
>  output:1, output:2, output:3, output:4
> 
> Regards,
> William
> 
> On Fri, Dec 2, 2016 at 9:09 AM, Ben Pfaff <blp@ovn.org> wrote:
> > On Wed, Nov 30, 2016 at 01:35:49PM -0800, William Tu wrote:
> >> This patch adds OpenFlow clone action with syntax as below:
> >> "clone([action][,action...])".  The clone() action makes a copy of the
> >> current packet and executes the list of actions against the packet,
> >> without affecting the packet after the "clone(...)" action.  In other
> >> word, the packet before the clone() and after the clone() is the same,
> >> no matter what actions executed inside the clone().
> >
> > I'm experimenting with using this to eliminate most of the patch ports
> > that ovn-controller creates.  I'm planning to get that work at least
> > mostly done before I do the full review here.
William Tu Dec. 15, 2016, 2:41 a.m. UTC | #6
Thanks for the feedback.

>> actions=
>>   clone(truncate(100), push_vlan, resubmit, ...)
>> where we don't need to worry about missing the truncate_unset() because
>> truncated packet is not visible outside the clone().
>
> I see how "clone" helps with this conceptually, but I'm not sure why the
> "sample" is necessary.  I think that the proposed value here is that
> "sample" allows the truncate to be canceled if no output occurs after
> "truncate" and before the end of the "sample" action.  But it's also
> possible for the translation code to see whether there's an output
> action within the clone and, if there is none, then to refrain from
> emitting the datapath "truncate" action entirely.  That seems like a
> more efficient way to implement it.  Will that work?
>

Yes that will work, but we have to add a new datapath clone action.
The reason I use "sample" is try not to add another new datapath
action to kernel.  Current "sample" action happens to provide a list
of actions to execute, so I repurpose it to clone.

>> We definitely should put some limit on the action types available inside
>> clone(). For this patch, there is no restriction.
>
> Why should we limit the actions available inside clone?
>
Currently there is a limit of max 10 actions to execute inside sample.
"#define DEFERRED_ACTION_FIFO_SIZE 10".  So I assume we need to limit
the number or bump up this value.  Other than this, I'm not sure
whether any action can put inside clone, for example some nested case:
clone(clone(sample(...))).

>> Signed-off-by: William Tu <u9012063@gmail.com>
>
> This incremental is needed to avoid putting anything emitted by
> xlate_commit_actions() into the middle of the sample action, otherwise
> the OVN test cases segfault (after my patches are applied):
>
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 9bcefcd..c30f93b 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -4296,8 +4296,6 @@ compose_clone_action(struct xlate_ctx *ctx,
>  {
>      struct flow old_flow, old_base_flow;
>      size_t actions_offset;
> -    size_t sample_offset = nl_msg_start_nested(ctx->odp_actions,
> -                                               OVS_ACTION_ATTR_SAMPLE);
>
>      /* Ensure that any prior actions are applied. */
>      xlate_commit_actions(ctx);
> @@ -4307,6 +4305,8 @@ compose_clone_action(struct xlate_ctx *ctx,
>      old_base_flow = ctx->base_flow;
>
>      /* Sample with 100% Probability */
> +    size_t sample_offset = nl_msg_start_nested(ctx->odp_actions,
> +                                               OVS_ACTION_ATTR_SAMPLE);
>      nl_msg_put_u32(ctx->odp_actions, OVS_SAMPLE_ATTR_PROBABILITY, UINT32_MAX);
>      actions_offset = nl_msg_start_nested(ctx->odp_actions,
>                                           OVS_SAMPLE_ATTR_ACTIONS);
>
Thanks for the fix.
William
Ben Pfaff Dec. 15, 2016, 5:20 a.m. UTC | #7
On Wed, Dec 14, 2016 at 06:41:46PM -0800, William Tu wrote:
> >> actions=
> >>   clone(truncate(100), push_vlan, resubmit, ...)
> >> where we don't need to worry about missing the truncate_unset() because
> >> truncated packet is not visible outside the clone().
> >
> > I see how "clone" helps with this conceptually, but I'm not sure why the
> > "sample" is necessary.  I think that the proposed value here is that
> > "sample" allows the truncate to be canceled if no output occurs after
> > "truncate" and before the end of the "sample" action.  But it's also
> > possible for the translation code to see whether there's an output
> > action within the clone and, if there is none, then to refrain from
> > emitting the datapath "truncate" action entirely.  That seems like a
> > more efficient way to implement it.  Will that work?
> 
> Yes that will work, but we have to add a new datapath clone action.
> The reason I use "sample" is try not to add another new datapath
> action to kernel.  Current "sample" action happens to provide a list
> of actions to execute, so I repurpose it to clone.

I still don't understand.

Here are some examples of translations from OpenFlow to datapath
actions, the way I would expect them to happen:

    OF: 1, clone(truncate(100), push_vlan, 2), 3
    dp: 1, truncate(100), push_vlan, 2, pop_vlan, 3

    OF: 1, clone(truncate(100), push_vlan), 2
    dp: 1, 2

Where does the sample action help?

Thanks,

Ben.
William Tu Dec. 15, 2016, 3:38 p.m. UTC | #8
> Here are some examples of translations from OpenFlow to datapath
> actions, the way I would expect them to happen:
>
>     OF: 1, clone(truncate(100), push_vlan, 2), 3
>     dp: 1, truncate(100), push_vlan, 2, pop_vlan, 3
>
I see your point; the "clone" is handled at translation code, and
pop_vlan is appended before the end of clone.

I was thinking about actually making a copy of the packet, and kernel
sample implementation happens to call 'skb_clone()'.
If using sample,
    OF: 1, clone(truncate(100), push_vlan, 2), 3
     dp: 1, sample(truncate(100), push_vlan, 2), 3
So the "truncate(100), push_vlan, 2" applies to a copy of skb, no need
to pop_vlan.

William
Ben Pfaff Dec. 15, 2016, 5:05 p.m. UTC | #9
On Thu, Dec 15, 2016 at 07:38:15AM -0800, William Tu wrote:
> > Here are some examples of translations from OpenFlow to datapath
> > actions, the way I would expect them to happen:
> >
> >     OF: 1, clone(truncate(100), push_vlan, 2), 3
> >     dp: 1, truncate(100), push_vlan, 2, pop_vlan, 3
> >
> I see your point; the "clone" is handled at translation code, and
> pop_vlan is appended before the end of clone.
> 
> I was thinking about actually making a copy of the packet, and kernel
> sample implementation happens to call 'skb_clone()'.
> If using sample,
>     OF: 1, clone(truncate(100), push_vlan, 2), 3
>      dp: 1, sample(truncate(100), push_vlan, 2), 3
> So the "truncate(100), push_vlan, 2" applies to a copy of skb, no need
> to pop_vlan.

It's slower and has additional limitations (for example, you pointed out
that only a few actions can be put inside the sample action), so the
benefit seems more like a tradeoff.
diff mbox

Patch

diff --git a/include/openvswitch/ofp-actions.h b/include/openvswitch/ofp-actions.h
index 2999261..be4a991 100644
--- a/include/openvswitch/ofp-actions.h
+++ b/include/openvswitch/ofp-actions.h
@@ -109,6 +109,7 @@ 
     OFPACT(CT,              ofpact_conntrack,   ofpact, "ct")           \
     OFPACT(NAT,             ofpact_nat,         ofpact, "nat")          \
     OFPACT(OUTPUT_TRUNC,    ofpact_output_trunc,ofpact, "output_trunc") \
+    OFPACT(CLONE,           ofpact_clone,       ofpact, "clone")        \
                                                                         \
     /* Debugging actions.                                               \
      *                                                                  \
@@ -868,6 +869,28 @@  struct ofpact_sample {
     enum nx_action_sample_direction direction;
 };
 
+/* OFPACT_CLONE.
+ *
+ * Used for cloning the packet and execute actions
+ * in the clone envelope. */
+struct ofpact_clone {
+    OFPACT_PADDED_MEMBERS(
+        struct ofpact ofpact;
+    );
+    struct ofpact actions[0];
+};
+
+BUILD_ASSERT_DECL(offsetof(struct ofpact_clone, actions)
+                  % OFPACT_ALIGNTO == 0);
+BUILD_ASSERT_DECL(offsetof(struct ofpact_clone, actions)
+                  == sizeof(struct ofpact_clone));
+
+static inline size_t
+ofpact_clone_get_action_len(const struct ofpact_clone *oc)
+{
+    return oc->ofpact.len - offsetof(struct ofpact_clone, actions);
+}
+
 /* OFPACT_DEC_TTL.
  *
  * Used for OFPAT11_DEC_NW_TTL, NXAST_DEC_TTL and NXAST_DEC_TTL_CNT_IDS. */
diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c
index 7507558..227ecb2 100644
--- a/lib/ofp-actions.c
+++ b/lib/ofp-actions.c
@@ -306,6 +306,9 @@  enum ofp_raw_action_type {
     /* NX1.0+(39): struct nx_action_output_trunc. */
     NXAST_RAW_OUTPUT_TRUNC,
 
+    /* NX1.0+(42): struct nx_action_clone, ... */
+    NXAST_RAW_CLONE,
+
 /* ## ------------------ ## */
 /* ## Debugging actions. ## */
 /* ## ------------------ ## */
@@ -386,6 +389,7 @@  ofpact_next_flattened(const struct ofpact *ofpact)
     switch (ofpact->type) {
     case OFPACT_OUTPUT:
     case OFPACT_GROUP:
+    case OFPACT_CLONE:
     case OFPACT_CONTROLLER:
     case OFPACT_ENQUEUE:
     case OFPACT_OUTPUT_REG:
@@ -4801,6 +4805,90 @@  format_UNROLL_XLATE(const struct ofpact_unroll_xlate *a, struct ds *s)
                   colors.paren,   colors.end);
 }
 
+
+struct nx_action_clone {
+    ovs_be16 type;                  /* OFPAT_VENDOR. */
+    ovs_be16 len;                   /* Length is at least 16. */
+    ovs_be32 vendor;                /* NX_VENDOR_ID. */
+    ovs_be16 subtype;               /* NXAST_CLONE. */
+    ovs_be16 pad0;
+    ovs_be32 pad1;
+    /* Followed by a sequence of zero or more OpenFlow actions.  The length
+     * of these is included in 'len'. */
+};
+OFP_ASSERT(sizeof(struct nx_action_clone) == 16);
+
+static enum ofperr
+decode_NXAST_RAW_CLONE(const struct nx_action_clone *nac,
+                        enum ofp_version ofp_version,
+                        struct ofpbuf *out)
+{
+    int error;
+    struct ofpbuf openflow;
+    const size_t clone_offset = ofpacts_pull(out);
+    struct ofpact_clone *clone = ofpact_put_CLONE(out);
+
+    /* decode action list */
+    ofpbuf_pull(out, sizeof(*clone));
+    openflow = ofpbuf_const_initializer(
+                    nac + 1, ntohs(nac->len) - sizeof(*nac));
+    error = ofpacts_pull_openflow_actions__(&openflow, openflow.size,
+                                            ofp_version,
+                                            1u << OVSINST_OFPIT11_APPLY_ACTIONS,
+                                            out, OFPACT_CLONE);
+    clone = ofpbuf_push_uninit(out, sizeof(*clone));
+    out->header = &clone->ofpact;
+    ofpact_finish_CLONE(out, &clone);
+    ofpbuf_push_uninit(out, clone_offset);
+    return error;
+}
+
+static void
+encode_CLONE(const struct ofpact_clone *clone,
+              enum ofp_version ofp_version, struct ofpbuf *out)
+{
+    size_t len;
+    const size_t ofs = out->size;
+    struct nx_action_clone *nac;
+
+    nac = put_NXAST_CLONE(out);
+    len = ofpacts_put_openflow_actions(clone->actions,
+                                 ofpact_clone_get_action_len(clone),
+                                 out, ofp_version);
+    len += sizeof(*nac);
+    nac = ofpbuf_at(out, ofs, sizeof(*nac));
+    nac->len = htons(len);
+}
+
+static char * OVS_WARN_UNUSED_RESULT
+parse_CLONE(char *arg, struct ofpbuf *ofpacts,
+             enum ofputil_protocol *usable_protocols)
+{
+
+    const size_t ct_offset = ofpacts_pull(ofpacts);
+    struct ofpact_clone *clone = ofpact_put_CLONE(ofpacts);
+    char *error;
+
+    ofpbuf_pull(ofpacts, sizeof(*clone));
+    error = ofpacts_parse_copy(arg, ofpacts, usable_protocols,
+                               false, OFPACT_CLONE);
+    /* header points to the action list */
+    ofpacts->header = ofpbuf_push_uninit(ofpacts, sizeof(*clone));
+    clone = ofpacts->header;
+
+    ofpact_finish_CLONE(ofpacts, &clone);
+    ofpbuf_push_uninit(ofpacts, ct_offset);
+    return error;
+}
+
+static void
+format_CLONE(const struct ofpact_clone *a, struct ds *s)
+{
+    ds_put_format(s, "%sclone(%s", colors.paren, colors.end);
+    ofpacts_format(a->actions, ofpact_clone_get_action_len(a), s);
+    ds_put_format(s, "%s)%s", colors.paren, colors.end);
+}
+
 /* Action structure for NXAST_SAMPLE.
  *
  * Samples matching packets with the given probability and sends them
@@ -6212,6 +6300,7 @@  ofpact_is_set_or_move_action(const struct ofpact *a)
     case OFPACT_BUNDLE:
     case OFPACT_CLEAR_ACTIONS:
     case OFPACT_CT:
+    case OFPACT_CLONE:
     case OFPACT_NAT:
     case OFPACT_CONTROLLER:
     case OFPACT_DEC_MPLS_TTL:
@@ -6288,6 +6377,7 @@  ofpact_is_allowed_in_actions_set(const struct ofpact *a)
      * the specification.  Thus the order in which they should be applied
      * in the action set is undefined. */
     case OFPACT_BUNDLE:
+    case OFPACT_CLONE:
     case OFPACT_CONTROLLER:
     case OFPACT_CT:
     case OFPACT_NAT:
@@ -6480,6 +6570,7 @@  ovs_instruction_type_from_ofpact_type(enum ofpact_type type)
         return OVSINST_OFPIT11_GOTO_TABLE;
     case OFPACT_OUTPUT:
     case OFPACT_GROUP:
+    case OFPACT_CLONE:
     case OFPACT_CONTROLLER:
     case OFPACT_ENQUEUE:
     case OFPACT_OUTPUT_REG:
@@ -7080,6 +7171,9 @@  ofpact_check__(enum ofputil_protocol *usable_protocols, struct ofpact *a,
     case OFPACT_SAMPLE:
         return 0;
 
+    case OFPACT_CLONE:
+        return 0;
+
     case OFPACT_CT: {
         struct ofpact_conntrack *oc = ofpact_get_CT(a);
 
@@ -7271,7 +7365,7 @@  ofpacts_verify_nested(const struct ofpact *a, enum ofpact_type outer_action)
 
     if (outer_action) {
         ovs_assert(outer_action == OFPACT_WRITE_ACTIONS
-                   || outer_action == OFPACT_CT);
+                   || outer_action == OFPACT_CT || outer_action == OFPACT_CLONE);
 
         if (outer_action == OFPACT_CT) {
             if (!field) {
@@ -7282,6 +7376,10 @@  ofpacts_verify_nested(const struct ofpact *a, enum ofpact_type outer_action)
                 return OFPERR_OFPBAC_BAD_ARGUMENT;
             }
         }
+        if (outer_action == OFPACT_CLONE) {
+                /* add some constraints. */
+                return 0;
+        }
     }
 
     return 0;
@@ -7635,6 +7733,7 @@  ofpact_outputs_to_port(const struct ofpact *ofpact, ofp_port_t port)
     case OFPACT_POP_MPLS:
     case OFPACT_SAMPLE:
     case OFPACT_CLEAR_ACTIONS:
+    case OFPACT_CLONE:
     case OFPACT_WRITE_ACTIONS:
     case OFPACT_GOTO_TABLE:
     case OFPACT_METER:
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 55ac371..b534b31 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -4289,6 +4289,35 @@  xlate_sample_action(struct xlate_ctx *ctx,
                           tunnel_out_port, false);
 }
 
+static void
+compose_clone_action(struct xlate_ctx *ctx,
+                    const struct ofpact_clone *oc)
+{
+    struct flow old_flow, old_base_flow;
+    size_t actions_offset;
+    size_t sample_offset = nl_msg_start_nested(ctx->odp_actions,
+                                               OVS_ACTION_ATTR_SAMPLE);
+
+    /* Ensure that any prior actions are applied. */
+    xlate_commit_actions(ctx);
+
+    /* clone might change the flow key, store the original. */
+    old_flow = ctx->xin->flow;
+    old_base_flow = ctx->base_flow;
+
+    /* Sample with 100% Probability */
+    nl_msg_put_u32(ctx->odp_actions, OVS_SAMPLE_ATTR_PROBABILITY, UINT32_MAX);
+    actions_offset = nl_msg_start_nested(ctx->odp_actions,
+                                         OVS_SAMPLE_ATTR_ACTIONS);
+    do_xlate_actions(oc->actions, ofpact_clone_get_action_len(oc), ctx);
+    nl_msg_end_nested(ctx->odp_actions, actions_offset);
+    nl_msg_end_nested(ctx->odp_actions, sample_offset);
+
+    /* restore flow key */
+    ctx->xin->flow = old_flow;
+    ctx->base_flow = old_base_flow;
+}
+
 static bool
 may_receive(const struct xport *xport, struct xlate_ctx *ctx)
 {
@@ -4447,6 +4476,7 @@  freeze_unroll_actions(const struct ofpact *a, const struct ofpact *end,
         case OFPACT_WRITE_ACTIONS:
         case OFPACT_METER:
         case OFPACT_SAMPLE:
+        case OFPACT_CLONE:
         case OFPACT_DEBUG_RECIRC:
         case OFPACT_CT:
         case OFPACT_NAT:
@@ -4695,6 +4725,7 @@  recirc_for_mpls(const struct ofpact *a, struct xlate_ctx *ctx)
     case OFPACT_NOTE:
     case OFPACT_EXIT:
     case OFPACT_SAMPLE:
+    case OFPACT_CLONE:
     case OFPACT_UNROLL_XLATE:
     case OFPACT_CT:
     case OFPACT_NAT:
@@ -5054,6 +5085,10 @@  do_xlate_actions(const struct ofpact *ofpacts, size_t ofpacts_len,
             xlate_sample_action(ctx, ofpact_get_SAMPLE(a));
             break;
 
+        case OFPACT_CLONE:
+            compose_clone_action(ctx, ofpact_get_CLONE(a));
+            break;
+
         case OFPACT_CT:
             compose_conntrack_action(ctx, ofpact_get_CT(a));
             break;
diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index ec7bd60..999f9fd 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -6388,6 +6388,25 @@  AT_CHECK([ovs-vsctl destroy Flow_Sample_Collector_Set 1], [0], [ignore])
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
+dnl OpenFlow Clone action using Sample datapath action
+AT_SETUP([ofproto-dpif - clone action])
+OVS_VSWITCHD_START
+add_of_ports br0 1 2 3 4
+
+AT_DATA([flows.txt], [dnl
+in_port=1, actions=clone(set_field:192.168.3.3->ip_src),clone(set_field:192.168.4.4->ip_dst,output:2),clone(mod_dl_src:80:81:81:81:81:81,set_field:192.168.5.5->ip_dst,output:3),output:4
+])
+AT_CHECK([ovs-ofctl add-flows br0 flows.txt], [0], [ignore])
+
+AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'in_port(1),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.10.10.2,dst=10.10.10.1,proto=1,tos=1,ttl=128,frag=no),icmp(type=8,code=0)'], [0], [stdout])
+
+AT_CHECK([tail -1 stdout], [0], [dnl
+Datapath actions: sample(sample=100.0%,actions(drop)),sample(sample=100.0%,actions(set(ipv4(src=10.10.10.2,dst=192.168.4.4)),2)),sample(sample=100.0%,actions(set(eth(src=80:81:81:81:81:81)),set(ipv4(src=10.10.10.2,dst=192.168.5.5)),3)),4
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
 dnl Flow based IPFIX statistics check
 AT_SETUP([ofproto-dpif - Flow IPFIX statistics check])
 OVS_VSWITCHD_START
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index 801dfe3..31e5a16 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -337,6 +337,35 @@  NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3 -w 2 10.1.1.100 | FORMAT_PI
 OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
 
+AT_SETUP([datapath - clone action])
+OVS_TRAFFIC_VSWITCHD_START()
+
+ADD_NAMESPACES(at_ns0, at_ns1, at_ns2)
+
+ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
+ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")
+ADD_VETH(p2, at_ns2, br0, "10.1.1.3/24")
+
+AT_CHECK([ovs-vsctl -- set interface ovs-p0 ofport_request=1])
+AT_CHECK([ovs-vsctl -- set interface ovs-p1 ofport_request=2])
+AT_CHECK([ovs-vsctl -- set interface ovs-p2 ofport_request=3])
+
+dnl verify that the clone(...) won't affect the original packet, so ping still works OK
+dnl without 'output' in 'clone()'
+AT_CHECK([ovs-ofctl add-flow br0 "in_port=1,actions=clone(mod_dl_dst(50:54:00:00:00:0a),set_field:192.168.3.3->ip_dst), output:2"])
+dnl with 'output' in 'clone()'
+AT_CHECK([ovs-ofctl add-flow br0 "in_port=2,actions=clone(mod_dl_dst(50:54:00:00:00:0b),set_field:192.168.4.4->ip_dst, output:3), output:1"])
+
+NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 | FORMAT_PING], [0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+NETNS_DAEMONIZE([at_ns1], [[$PYTHON $srcdir/test-l7.py]], [http0.pid])
+NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v -o wget0.log])
+
+OVS_TRAFFIC_VSWITCHD_STOP
+AT_CLEANUP
+
 AT_SETUP([datapath - basic truncate action])
 OVS_TRAFFIC_VSWITCHD_START()
 AT_CHECK([ovs-ofctl del-flows br0])