Message ID | 1565657498-62682-5-git-send-email-yihung.wei@gmail.com |
---|---|
State | Superseded |
Series | Support zone-based conntrack timeout policy |
> -----Original Message----- > From: ovs-dev-bounces@openvswitch.org <ovs-dev- > bounces@openvswitch.org> On Behalf Of Yi-Hung Wei > Sent: Tuesday, August 13, 2019 3:52 AM > To: dev@openvswitch.org > Subject: [ovs-dev] [PATCH v3 4/9] ct-dpif, dpif-netlink: Add conntrack > timeout policy support > > This patch first defines the dpif interface for a datapath to support > adding, deleting, getting and dumping conntrack timeout policy. > The timeout policy is identified by a 4 bytes unsigned integer in > datapath, and it currently support timeout for TCP, UDP, and ICMP > protocols. > > Moreover, this patch provides the implementation for Linux kernel > datapath in dpif-netlink. > > In Linux kernel, the timeout policy is maintained per L3/L4 protocol, > and it is identified by 32 bytes null terminated string. On the other > hand, in vswitchd, the timeout policy is a generic one that consists of > all the supported L4 protocols. Therefore, one of the main task in > dpif-netlink is to break down the generic timeout policy into 6 > sub policies (ipv4 tcp, udp, icmp, and ipv6 tcp, udp, icmp), > and push down the configuration using the netlink API in > netlink-conntrack.c. > > This patch also adds missing symbols in the windows datapath so > that the build on windows can pass. 
> > Appveyor CI: > * https://ci.appveyor.com/project/YiHungWei/ovs/builds/26387754 > > Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> > --- > Documentation/faq/releases.rst | 3 +- > datapath-windows/include/OvsDpInterfaceCtExt.h | 114 +++++ > datapath-windows/ovsext/Netlink/NetlinkProto.h | 8 +- > include/windows/automake.mk | 1 + > .../windows/linux/netfilter/nfnetlink_cttimeout.h | 0 > lib/ct-dpif.c | 104 +++++ > lib/ct-dpif.h | 56 +++ > lib/dpif-netdev.c | 6 + > lib/dpif-netlink.c | 469 +++++++++++++++++++++ > lib/dpif-netlink.h | 1 - > lib/dpif-provider.h | 44 ++ > lib/netlink-conntrack.c | 308 ++++++++++++++ > lib/netlink-conntrack.h | 27 +- > lib/netlink-protocol.h | 8 +- > 14 files changed, 1142 insertions(+), 7 deletions(-) > create mode 100644 include/windows/linux/netfilter/nfnetlink_cttimeout.h > [Alin] This is not an actual review. I'm okay with the Windows changes. I also tested the series and things look good. Do you mind folding in the following: diff --git a/datapath-windows/include/OvsDpInterfaceCtExt.h b/datapath-windows/include/OvsDpInterfaceCtExt.h index 4379855bb..3379f0a25 100644 --- a/datapath-windows/include/OvsDpInterfaceCtExt.h +++ b/datapath-windows/include/OvsDpInterfaceCtExt.h @@ -421,7 +421,7 @@ struct nf_ct_tcp_flags { UINT8 mask; }; -/* File: nfnetlink_cttimeout.h */ +/* File: nfnetlink_cttimeout.h. XXX: the following are not implemented */ enum ctnl_timeout_msg_types { IPCTNL_MSG_TIMEOUT_NEW, IPCTNL_MSG_TIMEOUT_GET, Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
On Tue, Aug 13, 2019 at 4:25 AM <aserdean@ovn.org> wrote:
> > ---
> >  Documentation/faq/releases.rst                     |   3 +-
> >  datapath-windows/include/OvsDpInterfaceCtExt.h     | 114 +++++
> >  datapath-windows/ovsext/Netlink/NetlinkProto.h     |   8 +-
> >  include/windows/automake.mk                        |   1 +
> >  .../windows/linux/netfilter/nfnetlink_cttimeout.h  |   0
> >  lib/ct-dpif.c                                      | 104 +++++
> >  lib/ct-dpif.h                                      |  56 +++
> >  lib/dpif-netdev.c                                  |   6 +
> >  lib/dpif-netlink.c                                 | 469 +++++++++++++++++++++
> >  lib/dpif-netlink.h                                 |   1 -
> >  lib/dpif-provider.h                                |  44 ++
> >  lib/netlink-conntrack.c                            | 308 ++++++++++++++
> >  lib/netlink-conntrack.h                            |  27 +-
> >  lib/netlink-protocol.h                             |   8 +-
> >  14 files changed, 1142 insertions(+), 7 deletions(-)
> >  create mode 100644 include/windows/linux/netfilter/nfnetlink_cttimeout.h
> >
> [Alin] This is not an actual review.
>
> I'm okay with the Windows changes.
>
> I also tested the series and things look good.
>
> Do you mind folding in the following:
> diff --git a/datapath-windows/include/OvsDpInterfaceCtExt.h b/datapath-windows/include/OvsDpInterfaceCtExt.h
> index 4379855bb..3379f0a25 100644
> --- a/datapath-windows/include/OvsDpInterfaceCtExt.h
> +++ b/datapath-windows/include/OvsDpInterfaceCtExt.h
> @@ -421,7 +421,7 @@ struct nf_ct_tcp_flags {
>      UINT8 mask;
>  };
>
> -/* File: nfnetlink_cttimeout.h */
> +/* File: nfnetlink_cttimeout.h. XXX: the following are not implemented */
>  enum ctnl_timeout_msg_types {
>      IPCTNL_MSG_TIMEOUT_NEW,
>      IPCTNL_MSG_TIMEOUT_GET,
>
>
> Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
>

Thanks Alin! Will fold in your diff into OvsDpInterfaceCtExt.h

Thanks,

-Yi-Hung
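Note on the naming scheme described in the commit message: the sketch below is a standalone illustration, not code from the patch. It assumes lower-case protocol labels (tcp, udp, icmp, icmpv6) and uses the hypothetical names l4_label() and format_tp_name(); only the "ovs_tp_" prefix, the 32-byte name limit, and the "4"/"6" suffix rule are taken from the patch under review.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>   /* AF_INET, AF_INET6 */
#include <netinet/in.h>   /* IPPROTO_TCP, IPPROTO_UDP, IPPROTO_ICMP, IPPROTO_ICMPV6 */

#define TP_NAME_PREFIX "ovs_tp_"   /* Same string as NL_TP_NAME_PREFIX. */
#define TP_NAME_MAX 32             /* Mirrors CTNL_TIMEOUT_NAME_MAX. */

/* Hypothetical helper: returns a label for an L4 protocol number.
 * The exact strings are an assumption made for this sketch. */
static const char *
l4_label(uint8_t l4num)
{
    switch (l4num) {
    case IPPROTO_TCP:    return "tcp";
    case IPPROTO_UDP:    return "udp";
    case IPPROTO_ICMP:   return "icmp";
    case IPPROTO_ICMPV6: return "icmpv6";
    default:             return "unknown";
    }
}

/* Hypothetical helper: builds the per-protocol kernel policy name for
 * datapath policy 'id'.  IPv4 names get a "4" suffix; IPv6 names get a
 * "6" suffix, except ICMPv6, whose label already implies the family. */
static void
format_tp_name(uint32_t id, uint16_t l3num, uint8_t l4num,
               char name[TP_NAME_MAX])
{
    const char *suffix = "";

    if (l3num == AF_INET) {
        suffix = "4";
    } else if (l3num == AF_INET6 && l4num != IPPROTO_ICMPV6) {
        suffix = "6";
    }

    int n = snprintf(name, TP_NAME_MAX, "%s%" PRIu32 "_%s%s",
                     TP_NAME_PREFIX, id, l4_label(l4num), suffix);

    /* The name, including its terminating NUL, must fit in 32 bytes. */
    assert(n > 0 && n < TP_NAME_MAX);
}
```

Under these assumptions, policy ID 1 would expand to ovs_tp_1_tcp4, ovs_tp_1_udp4, ovs_tp_1_icmp4, ovs_tp_1_tcp6, ovs_tp_1_udp6 and ovs_tp_1_icmpv6. Even with a 10-digit ID the longest name is 24 characters, so each one fits the 32-byte name field.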
Thanks for the patch; mostly minor comments.

On Mon, Aug 12, 2019 at 5:54 PM Yi-Hung Wei <yihung.wei@gmail.com> wrote:
> This patch first defines the dpif interface for a datapath to support
> adding, deleting, getting and dumping conntrack timeout policy.
> The timeout policy is identified by a 4 bytes unsigned integer in
> datapath, and it currently support timeout for TCP, UDP, and ICMP
> protocols.
>
> Moreover, this patch provides the implementation for Linux kernel
> datapath in dpif-netlink.
>
> In Linux kernel, the timeout policy is maintained per L3/L4 protocol,
> and it is identified by 32 bytes null terminated string. On the other
> hand, in vswitchd, the timeout policy is a generic one that consists of
> all the supported L4 protocols. Therefore, one of the main task in
> dpif-netlink is to break down the generic timeout policy into 6
> sub policies (ipv4 tcp, udp, icmp, and ipv6 tcp, udp, icmp),
> and push down the configuration using the netlink API in
> netlink-conntrack.c.
>
> This patch also adds missing symbols in the windows datapath so
> that the build on windows can pass.
> > Appveyor CI: > * https://ci.appveyor.com/project/YiHungWei/ovs/builds/26387754 > > Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> > --- > Documentation/faq/releases.rst | 3 +- > datapath-windows/include/OvsDpInterfaceCtExt.h | 114 +++++ > datapath-windows/ovsext/Netlink/NetlinkProto.h | 8 +- > include/windows/automake.mk | 1 + > .../windows/linux/netfilter/nfnetlink_cttimeout.h | 0 > lib/ct-dpif.c | 104 +++++ > lib/ct-dpif.h | 56 +++ > lib/dpif-netdev.c | 6 + > lib/dpif-netlink.c | 469 > +++++++++++++++++++++ > lib/dpif-netlink.h | 1 - > lib/dpif-provider.h | 44 ++ > lib/netlink-conntrack.c | 308 ++++++++++++++ > lib/netlink-conntrack.h | 27 +- > lib/netlink-protocol.h | 8 +- > 14 files changed, 1142 insertions(+), 7 deletions(-) > create mode 100644 include/windows/linux/netfilter/nfnetlink_cttimeout.h > > diff --git a/Documentation/faq/releases.rst > b/Documentation/faq/releases.rst > index 8daa23bb2d0c..0b7eaab1b143 100644 > --- a/Documentation/faq/releases.rst > +++ b/Documentation/faq/releases.rst > @@ -110,8 +110,9 @@ Q: Are all features available with all datapaths? > ========================== ============== ============== ========= > ======= > Connection tracking 4.3 YES YES > YES > Conntrack Fragment Reass. 
4.3 YES YES > YES > + Conntrack Timeout Policies 5.2 YES NO > NO > + Conntrack Zone Limit 4.18 YES NO > YES > NAT 4.6 YES YES > YES > - Conntrack zone limit 4.18 YES NO > YES > Tunnel - LISP NO YES NO > NO > Tunnel - STT NO YES NO > YES > Tunnel - GRE 3.11 YES YES > YES > diff --git a/datapath-windows/include/OvsDpInterfaceCtExt.h > b/datapath-windows/include/OvsDpInterfaceCtExt.h > index 3b947782e90c..4379855bb8dd 100644 > --- a/datapath-windows/include/OvsDpInterfaceCtExt.h > +++ b/datapath-windows/include/OvsDpInterfaceCtExt.h > @@ -421,4 +421,118 @@ struct nf_ct_tcp_flags { > UINT8 mask; > }; > > +/* File: nfnetlink_cttimeout.h */ > +enum ctnl_timeout_msg_types { > + IPCTNL_MSG_TIMEOUT_NEW, > + IPCTNL_MSG_TIMEOUT_GET, > + IPCTNL_MSG_TIMEOUT_DELETE, > + IPCTNL_MSG_TIMEOUT_DEFAULT_SET, > + IPCTNL_MSG_TIMEOUT_DEFAULT_GET, > + > + IPCTNL_MSG_TIMEOUT_MAX > +}; > + > +enum ctattr_timeout { > + CTA_TIMEOUT_UNSPEC, > + CTA_TIMEOUT_NAME, > + CTA_TIMEOUT_L3PROTO, > + CTA_TIMEOUT_L4PROTO, > + CTA_TIMEOUT_DATA, > + CTA_TIMEOUT_USE, > + __CTA_TIMEOUT_MAX > +}; > +#define CTA_TIMEOUT_MAX (__CTA_TIMEOUT_MAX - 1) > + > +enum ctattr_timeout_generic { > + CTA_TIMEOUT_GENERIC_UNSPEC, > + CTA_TIMEOUT_GENERIC_TIMEOUT, > + __CTA_TIMEOUT_GENERIC_MAX > +}; > +#define CTA_TIMEOUT_GENERIC_MAX (__CTA_TIMEOUT_GENERIC_MAX - 1) > + > +enum ctattr_timeout_tcp { > + CTA_TIMEOUT_TCP_UNSPEC, > + CTA_TIMEOUT_TCP_SYN_SENT, > + CTA_TIMEOUT_TCP_SYN_RECV, > + CTA_TIMEOUT_TCP_ESTABLISHED, > + CTA_TIMEOUT_TCP_FIN_WAIT, > + CTA_TIMEOUT_TCP_CLOSE_WAIT, > + CTA_TIMEOUT_TCP_LAST_ACK, > + CTA_TIMEOUT_TCP_TIME_WAIT, > + CTA_TIMEOUT_TCP_CLOSE, > + CTA_TIMEOUT_TCP_SYN_SENT2, > + CTA_TIMEOUT_TCP_RETRANS, > + CTA_TIMEOUT_TCP_UNACK, > + __CTA_TIMEOUT_TCP_MAX > +}; > +#define CTA_TIMEOUT_TCP_MAX (__CTA_TIMEOUT_TCP_MAX - 1) > + > +enum ctattr_timeout_udp { > + CTA_TIMEOUT_UDP_UNSPEC, > + CTA_TIMEOUT_UDP_UNREPLIED, > + CTA_TIMEOUT_UDP_REPLIED, > + __CTA_TIMEOUT_UDP_MAX > +}; > +#define CTA_TIMEOUT_UDP_MAX 
(__CTA_TIMEOUT_UDP_MAX - 1) > + > +enum ctattr_timeout_udplite { > + CTA_TIMEOUT_UDPLITE_UNSPEC, > + CTA_TIMEOUT_UDPLITE_UNREPLIED, > + CTA_TIMEOUT_UDPLITE_REPLIED, > + __CTA_TIMEOUT_UDPLITE_MAX > +}; > +#define CTA_TIMEOUT_UDPLITE_MAX (__CTA_TIMEOUT_UDPLITE_MAX - 1) > + > +enum ctattr_timeout_icmp { > + CTA_TIMEOUT_ICMP_UNSPEC, > + CTA_TIMEOUT_ICMP_TIMEOUT, > + __CTA_TIMEOUT_ICMP_MAX > +}; > +#define CTA_TIMEOUT_ICMP_MAX (__CTA_TIMEOUT_ICMP_MAX - 1) > + > +enum ctattr_timeout_dccp { > + CTA_TIMEOUT_DCCP_UNSPEC, > + CTA_TIMEOUT_DCCP_REQUEST, > + CTA_TIMEOUT_DCCP_RESPOND, > + CTA_TIMEOUT_DCCP_PARTOPEN, > + CTA_TIMEOUT_DCCP_OPEN, > + CTA_TIMEOUT_DCCP_CLOSEREQ, > + CTA_TIMEOUT_DCCP_CLOSING, > + CTA_TIMEOUT_DCCP_TIMEWAIT, > + __CTA_TIMEOUT_DCCP_MAX > +}; > +#define CTA_TIMEOUT_DCCP_MAX (__CTA_TIMEOUT_DCCP_MAX - 1) > + > +enum ctattr_timeout_sctp { > + CTA_TIMEOUT_SCTP_UNSPEC, > + CTA_TIMEOUT_SCTP_CLOSED, > + CTA_TIMEOUT_SCTP_COOKIE_WAIT, > + CTA_TIMEOUT_SCTP_COOKIE_ECHOED, > + CTA_TIMEOUT_SCTP_ESTABLISHED, > + CTA_TIMEOUT_SCTP_SHUTDOWN_SENT, > + CTA_TIMEOUT_SCTP_SHUTDOWN_RECD, > + CTA_TIMEOUT_SCTP_SHUTDOWN_ACK_SENT, > + CTA_TIMEOUT_SCTP_HEARTBEAT_SENT, > + CTA_TIMEOUT_SCTP_HEARTBEAT_ACKED, > + __CTA_TIMEOUT_SCTP_MAX > +}; > +#define CTA_TIMEOUT_SCTP_MAX (__CTA_TIMEOUT_SCTP_MAX - 1) > + > +enum ctattr_timeout_icmpv6 { > + CTA_TIMEOUT_ICMPV6_UNSPEC, > + CTA_TIMEOUT_ICMPV6_TIMEOUT, > + __CTA_TIMEOUT_ICMPV6_MAX > +}; > +#define CTA_TIMEOUT_ICMPV6_MAX (__CTA_TIMEOUT_ICMPV6_MAX - 1) > + > +enum ctattr_timeout_gre { > + CTA_TIMEOUT_GRE_UNSPEC, > + CTA_TIMEOUT_GRE_UNREPLIED, > + CTA_TIMEOUT_GRE_REPLIED, > + __CTA_TIMEOUT_GRE_MAX > +}; > +#define CTA_TIMEOUT_GRE_MAX (__CTA_TIMEOUT_GRE_MAX - 1) > + > +#define CTNL_TIMEOUT_NAME_MAX 32 > + > #endif /* __OVS_DP_INTERFACE_CT_EXT_H_ */ > diff --git a/datapath-windows/ovsext/Netlink/NetlinkProto.h > b/datapath-windows/ovsext/Netlink/NetlinkProto.h > index 59b56565c1dc..b32f6f7fb114 100644 > --- 
a/datapath-windows/ovsext/Netlink/NetlinkProto.h > +++ b/datapath-windows/ovsext/Netlink/NetlinkProto.h > @@ -50,13 +50,17 @@ > #define NLM_F_ACK 0x004 > #define NLM_F_ECHO 0x008 > > +/* GET request flag.*/ > #define NLM_F_ROOT 0x100 > #define NLM_F_MATCH 0x200 > -#define NLM_F_EXCL 0x200 > #define NLM_F_ATOMIC 0x400 > -#define NLM_F_CREATE 0x400 > #define NLM_F_DUMP (NLM_F_ROOT | NLM_F_MATCH) > > +/* NEW request flags. */ > +#define NLM_F_REPLACE 0x100 > +#define NLM_F_EXCL 0x200 > +#define NLM_F_CREATE 0x400 > + > /* nlmsg_type values. */ > #define NLMSG_NOOP 1 > #define NLMSG_ERROR 2 > diff --git a/include/windows/automake.mk b/include/windows/automake.mk > index 382627b51787..883bbbf5d97c 100644 > --- a/include/windows/automake.mk > +++ b/include/windows/automake.mk > @@ -15,6 +15,7 @@ noinst_HEADERS += \ > include/windows/linux/netfilter/nf_conntrack_tcp.h \ > include/windows/linux/netfilter/nfnetlink.h \ > include/windows/linux/netfilter/nfnetlink_conntrack.h \ > + include/windows/linux/netfilter/nfnetlink_cttimeout.h \ > include/windows/linux/pkt_sched.h \ > include/windows/linux/types.h \ > include/windows/net/if.h \ > diff --git a/include/windows/linux/netfilter/nfnetlink_cttimeout.h > b/include/windows/linux/netfilter/nfnetlink_cttimeout.h > new file mode 100644 > index 000000000000..e69de29bb2d1 > diff --git a/lib/ct-dpif.c b/lib/ct-dpif.c > index 6ea7feb0ee35..7f9ce0a561f7 100644 > --- a/lib/ct-dpif.c > +++ b/lib/ct-dpif.c > @@ -760,3 +760,107 @@ ct_dpif_format_zone_limits(uint32_t default_limit, > ds_put_format(ds, ",count=%"PRIu32, zone_limit->count); > } > } > + > +static const char *const ct_dpif_tp_attr_string[] = { > +#define CT_DPIF_TP_TCP_ATTR(ATTR) \ > + [CT_DPIF_TP_ATTR_TCP_##ATTR] = "TCP_"#ATTR, > + CT_DPIF_TP_TCP_ATTRS > +#undef CT_DPIF_TP_TCP_ATTR > +#define CT_DPIF_TP_UDP_ATTR(ATTR) \ > + [CT_DPIF_TP_ATTR_UDP_##ATTR] = "UDP_"#ATTR, > + CT_DPIF_TP_UDP_ATTRS > +#undef CT_DPIF_TP_UDP_ATTR > +#define CT_DPIF_TP_ICMP_ATTR(ATTR) \ > + 
[CT_DPIF_TP_ATTR_ICMP_##ATTR] = "ICMP_"#ATTR, > + CT_DPIF_TP_ICMP_ATTRS > +#undef CT_DPIF_TP_ICMP_ATTR > +}; > + > +static bool > +ct_dpif_set_timeout_policy_attr(struct ct_dpif_timeout_policy *tp, > + uint32_t attr, uint32_t value) > +{ > + if (tp->present & (1 << attr) && tp->attrs[attr] == value) { > + return false; > + } > + tp->attrs[attr] = value; > + tp->present |= 1 << attr; > + return true; > +} > + > +/* Sets a timeout value identified by '*name' to 'value'. > + * Returns true if the attribute is changed */ > +bool > +ct_dpif_set_timeout_policy_attr_by_name(struct ct_dpif_timeout_policy *tp, > + const char *name, uint32_t value) > +{ > + uint32_t i; > move > + > + for (i = 0; i < CT_DPIF_TP_ATTR_MAX; ++i) { > for (uint32_t i = 0; i < CT_DPIF_TP_ATTR_MAX; ++i) { > + if (!strcasecmp(name, ct_dpif_tp_attr_string[i])) { > + return ct_dpif_set_timeout_policy_attr(tp, i, value); > + } > + } > + return false; > +} > + > +bool > +ct_dpif_timeout_policy_support_ipproto(uint8_t ipproto) > +{ > + if (ipproto == IPPROTO_TCP || ipproto == IPPROTO_UDP || > + ipproto == IPPROTO_ICMP || ipproto == IPPROTO_ICMPV6) { > + return true; > + } > + return false; > +} > + > +int > +ct_dpif_set_timeout_policy(struct dpif *dpif, > + const struct ct_dpif_timeout_policy *tp) > +{ > + return (dpif->dpif_class->ct_set_timeout_policy > + ? dpif->dpif_class->ct_set_timeout_policy(dpif, tp) > + : EOPNOTSUPP); > +} > + > +int > +ct_dpif_del_timeout_policy(struct dpif *dpif, uint32_t tp_id) > +{ > + return (dpif->dpif_class->ct_del_timeout_policy > + ? dpif->dpif_class->ct_del_timeout_policy(dpif, tp_id) > + : EOPNOTSUPP); > +} > + > +int > +ct_dpif_get_timeout_policy(struct dpif *dpif, uint32_t tp_id, > + struct ct_dpif_timeout_policy *tp) > +{ > + return (dpif->dpif_class->ct_get_timeout_policy > + ? 
dpif->dpif_class->ct_get_timeout_policy( > + dpif, tp_id, tp) : EOPNOTSUPP); > +} > + > +int > +ct_dpif_timeout_policy_dump_start(struct dpif *dpif, void **statep) > +{ > + return (dpif->dpif_class->ct_timeout_policy_dump_start > + ? dpif->dpif_class->ct_timeout_policy_dump_start(dpif, statep) > + : EOPNOTSUPP); > +} > + > +int > +ct_dpif_timeout_policy_dump_next(struct dpif *dpif, void *state, > + struct ct_dpif_timeout_policy *tp) > +{ > + return (dpif->dpif_class->ct_timeout_policy_dump_next > + ? dpif->dpif_class->ct_timeout_policy_dump_next(dpif, state, > tp) > + : EOPNOTSUPP); > +} > + > +int > +ct_dpif_timeout_policy_dump_done(struct dpif *dpif, void *state) > +{ > + return (dpif->dpif_class->ct_timeout_policy_dump_done > + ? dpif->dpif_class->ct_timeout_policy_dump_done(dpif, state) > + : EOPNOTSUPP); > +} > diff --git a/lib/ct-dpif.h b/lib/ct-dpif.h > index 2f4906817946..aabd6962f2c0 100644 > --- a/lib/ct-dpif.h > +++ b/lib/ct-dpif.h > @@ -225,6 +225,50 @@ struct ct_dpif_zone_limit { > struct ovs_list node; > }; > > +#define CT_DPIF_TP_TCP_ATTRS \ > + CT_DPIF_TP_TCP_ATTR(SYN_SENT) \ > + CT_DPIF_TP_TCP_ATTR(SYN_RECV) \ > + CT_DPIF_TP_TCP_ATTR(ESTABLISHED) \ > + CT_DPIF_TP_TCP_ATTR(FIN_WAIT) \ > + CT_DPIF_TP_TCP_ATTR(CLOSE_WAIT) \ > + CT_DPIF_TP_TCP_ATTR(LAST_ACK) \ > + CT_DPIF_TP_TCP_ATTR(TIME_WAIT) \ > + CT_DPIF_TP_TCP_ATTR(CLOSE) \ > + CT_DPIF_TP_TCP_ATTR(SYN_SENT2) \ > + CT_DPIF_TP_TCP_ATTR(RETRANSMIT) \ > + CT_DPIF_TP_TCP_ATTR(UNACK) > + > +#define CT_DPIF_TP_UDP_ATTRS \ > + CT_DPIF_TP_UDP_ATTR(FIRST) \ > + CT_DPIF_TP_UDP_ATTR(SINGLE) \ > + CT_DPIF_TP_UDP_ATTR(MULTIPLE) > + > +#define CT_DPIF_TP_ICMP_ATTRS \ > + CT_DPIF_TP_ICMP_ATTR(FIRST) \ > + CT_DPIF_TP_ICMP_ATTR(REPLY) > + > +enum OVS_PACKED_ENUM ct_dpif_tp_attr { > +#define CT_DPIF_TP_TCP_ATTR(ATTR) CT_DPIF_TP_ATTR_TCP_##ATTR, > + CT_DPIF_TP_TCP_ATTRS > +#undef CT_DPIF_TP_TCP_ATTR > +#define CT_DPIF_TP_UDP_ATTR(ATTR) CT_DPIF_TP_ATTR_UDP_##ATTR, > + CT_DPIF_TP_UDP_ATTRS > +#undef CT_DPIF_TP_UDP_ATTR 
> +#define CT_DPIF_TP_ICMP_ATTR(ATTR) CT_DPIF_TP_ATTR_ICMP_##ATTR, > + CT_DPIF_TP_ICMP_ATTRS > +#undef CT_DPIF_TP_ICMP_ATTR > + CT_DPIF_TP_ATTR_MAX > +}; > + > +struct ct_dpif_timeout_policy { > + uint32_t id; /* Unique identifier for the timeout policy in > + * the datapath. */ > + uint32_t present; /* If a timeout attribute is present set the > + * corresponding CT_DPIF_TP_ATTR_* mapping > bit. */ > + uint32_t attrs[CT_DPIF_TP_ATTR_MAX]; /* An array that specifies > + * timeout attribute > values */ > +}; > + > int ct_dpif_dump_start(struct dpif *, struct ct_dpif_dump_state **, > const uint16_t *zone, int *); > int ct_dpif_dump_next(struct ct_dpif_dump_state *, struct ct_dpif_entry > *); > @@ -262,5 +306,17 @@ bool ct_dpif_parse_zone_limit_tuple(const char *s, > uint16_t *pzone, > uint32_t *plimit, struct ds *); > void ct_dpif_format_zone_limits(uint32_t default_limit, > const struct ovs_list *, struct ds *); > +bool ct_dpif_set_timeout_policy_attr_by_name(struct > ct_dpif_timeout_policy *tp, > + const char *key, uint32_t > value); > +bool ct_dpif_timeout_policy_support_ipproto(uint8_t ipproto); > +int ct_dpif_set_timeout_policy(struct dpif *dpif, > + const struct ct_dpif_timeout_policy *tp); > +int ct_dpif_get_timeout_policy(struct dpif *dpif, uint32_t tp_id, > + struct ct_dpif_timeout_policy *tp); > +int ct_dpif_del_timeout_policy(struct dpif *dpif, uint32_t tp_id); > +int ct_dpif_timeout_policy_dump_start(struct dpif *dpif, void **statep); > +int ct_dpif_timeout_policy_dump_next(struct dpif *dpif, void *state, > + struct ct_dpif_timeout_policy *tp); > +int ct_dpif_timeout_policy_dump_done(struct dpif *dpif, void *state); > > #endif /* CT_DPIF_H */ > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > index d0a1c58adace..2079e368fb52 100644 > --- a/lib/dpif-netdev.c > +++ b/lib/dpif-netdev.c > @@ -7529,6 +7529,12 @@ const struct dpif_class dpif_netdev_class = { > NULL, /* ct_set_limits */ > NULL, /* ct_get_limits */ > NULL, /* ct_del_limits */ > + NULL, /* 
ct_set_timeout_policy */ > + NULL, /* ct_get_timeout_policy */ > + NULL, /* ct_del_timeout_policy */ > + NULL, /* ct_timeout_policy_dump_start */ > + NULL, /* ct_timeout_policy_dump_next */ > + NULL, /* ct_timeout_policy_dump_done */ > dpif_netdev_ipf_set_enabled, > dpif_netdev_ipf_set_min_frag, > dpif_netdev_ipf_set_max_nfrags, > diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c > index 7bc71d6d19d7..c2ac19dff887 100644 > --- a/lib/dpif-netlink.c > +++ b/lib/dpif-netlink.c > @@ -50,6 +50,7 @@ > #include "odp-util.h" > #include "openvswitch/dynamic-string.h" > #include "openvswitch/flow.h" > +#include "openvswitch/hmap.h" > #include "openvswitch/match.h" > #include "openvswitch/ofpbuf.h" > #include "openvswitch/poll-loop.h" > @@ -3023,6 +3024,468 @@ dpif_netlink_ct_del_limits(struct dpif *dpif > OVS_UNUSED, > ofpbuf_delete(request); > return err; > } > + > +#define NL_TP_NAME_PREFIX "ovs_tp_" > + > +struct dpif_netlink_timeout_policy_protocol { > + uint16_t l3num; > + uint8_t l4num; > +}; > + > +enum OVS_PACKED_ENUM dpif_netlink_support_timeout_policy_protocol { > + DPIF_NL_TP_AF_INET_TCP, > + DPIF_NL_TP_AF_INET_UDP, > + DPIF_NL_TP_AF_INET_ICMP, > + DPIF_NL_TP_AF_INET6_TCP, > + DPIF_NL_TP_AF_INET6_UDP, > + DPIF_NL_TP_AF_INET6_ICMPV6, > + DPIF_NL_TP_MAX > +}; > + > +#define DPIF_NL_ALL_TP ((1UL << DPIF_NL_TP_MAX) - 1) > + > + > +static struct dpif_netlink_timeout_policy_protocol tp_protos[] = { > + [DPIF_NL_TP_AF_INET_TCP] = { .l3num = AF_INET, .l4num = IPPROTO_TCP }, > + [DPIF_NL_TP_AF_INET_UDP] = { .l3num = AF_INET, .l4num = IPPROTO_UDP }, > + [DPIF_NL_TP_AF_INET_ICMP] = { .l3num = AF_INET, .l4num = IPPROTO_ICMP > }, > + [DPIF_NL_TP_AF_INET6_TCP] = { .l3num = AF_INET6, .l4num = IPPROTO_TCP > }, > + [DPIF_NL_TP_AF_INET6_UDP] = { .l3num = AF_INET6, .l4num = IPPROTO_UDP > }, > + [DPIF_NL_TP_AF_INET6_ICMPV6] = { .l3num = AF_INET6, > + .l4num = IPPROTO_ICMPV6 }, > +}; > + > +static void > +dpif_netlink_format_tp_name(uint32_t id, uint16_t l3num, uint8_t l4num, > + 
struct ds *tp_name) > +{ > + ds_clear(tp_name); > + ds_put_format(tp_name, "%s%"PRIu32"_", NL_TP_NAME_PREFIX, id); > + ct_dpif_format_ipproto(tp_name, l4num); > + > + if (l3num == AF_INET) { > + ds_put_cstr(tp_name, "4"); > + } else if (l3num == AF_INET6 && l4num != IPPROTO_ICMPV6) { > + ds_put_cstr(tp_name, "6"); > + } > + > + ovs_assert(tp_name->length < CTNL_TIMEOUT_NAME_MAX); > +} > + > +#define CT_DPIF_NL_TP_TCP_MAPPINGS \ > + CT_DPIF_NL_TP_MAPPING(TCP, TCP, SYN_SENT, SYN_SENT) \ > + CT_DPIF_NL_TP_MAPPING(TCP, TCP, SYN_RECV, SYN_RECV) \ > + CT_DPIF_NL_TP_MAPPING(TCP, TCP, ESTABLISHED, ESTABLISHED) \ > + CT_DPIF_NL_TP_MAPPING(TCP, TCP, FIN_WAIT, FIN_WAIT) \ > + CT_DPIF_NL_TP_MAPPING(TCP, TCP, CLOSE_WAIT, CLOSE_WAIT) \ > + CT_DPIF_NL_TP_MAPPING(TCP, TCP, LAST_ACK, LAST_ACK) \ > + CT_DPIF_NL_TP_MAPPING(TCP, TCP, TIME_WAIT, TIME_WAIT) \ > + CT_DPIF_NL_TP_MAPPING(TCP, TCP, CLOSE, CLOSE) \ > + CT_DPIF_NL_TP_MAPPING(TCP, TCP, SYN_SENT2, SYN_SENT2) \ > + CT_DPIF_NL_TP_MAPPING(TCP, TCP, RETRANSMIT, RETRANS) \ > + CT_DPIF_NL_TP_MAPPING(TCP, TCP, UNACK, UNACK) > + > +#define CT_DPIF_NL_TP_UDP_MAPPINGS \ > + CT_DPIF_NL_TP_MAPPING(UDP, UDP, SINGLE, UNREPLIED) \ > + CT_DPIF_NL_TP_MAPPING(UDP, UDP, MULTIPLE, REPLIED) > + > +#define CT_DPIF_NL_TP_ICMP_MAPPINGS \ > + CT_DPIF_NL_TP_MAPPING(ICMP, ICMP, FIRST, TIMEOUT) > + > +#define CT_DPIF_NL_TP_ICMPV6_MAPPINGS \ > + CT_DPIF_NL_TP_MAPPING(ICMP, ICMPV6, FIRST, TIMEOUT) > + > + > +#define CT_DPIF_NL_TP_MAPPING(PROTO1, PROTO2, ATTR1, ATTR2) \ > +if (tp->present & (1 << CT_DPIF_TP_ATTR_##PROTO1##_##ATTR1)) { \ > + nl_tp->present |= 1 << CTA_TIMEOUT_##PROTO2##_##ATTR2; \ > + nl_tp->attrs[CTA_TIMEOUT_##PROTO2##_##ATTR2] = \ > + tp->attrs[CT_DPIF_TP_ATTR_##PROTO1##_##ATTR1]; \ > +} > + > +static void > +dpif_netlink_get_nl_tp_tcp_attrs(const struct ct_dpif_timeout_policy *tp, > + struct nl_ct_timeout_policy *nl_tp) > +{ > + CT_DPIF_NL_TP_TCP_MAPPINGS > +} > + > +static void > +dpif_netlink_get_nl_tp_udp_attrs(const struct 
ct_dpif_timeout_policy *tp, > + struct nl_ct_timeout_policy *nl_tp) > +{ > + CT_DPIF_NL_TP_UDP_MAPPINGS > +} > + > +static void > +dpif_netlink_get_nl_tp_icmp_attrs(const struct ct_dpif_timeout_policy *tp, > + struct nl_ct_timeout_policy *nl_tp) > +{ > + CT_DPIF_NL_TP_ICMP_MAPPINGS > +} > + > +static void > +dpif_netlink_get_nl_tp_icmpv6_attrs(const struct ct_dpif_timeout_policy > *tp, > + struct nl_ct_timeout_policy *nl_tp) > +{ > + CT_DPIF_NL_TP_ICMPV6_MAPPINGS > +} > + > +#undef CT_DPIF_NL_TP_MAPPING > + > +static void > +dpif_netlink_get_nl_tp_attrs(const struct ct_dpif_timeout_policy *tp, > + uint8_t l4num, struct nl_ct_timeout_policy > *nl_tp) > +{ > + nl_tp->present = 0; > + > + if (l4num == IPPROTO_TCP) { > + dpif_netlink_get_nl_tp_tcp_attrs(tp, nl_tp); > + } else if (l4num == IPPROTO_UDP) { > + dpif_netlink_get_nl_tp_udp_attrs(tp, nl_tp); > + } else if (l4num == IPPROTO_ICMP) { > + dpif_netlink_get_nl_tp_icmp_attrs(tp, nl_tp); > + } else if (l4num == IPPROTO_ICMPV6) { > + dpif_netlink_get_nl_tp_icmpv6_attrs(tp, nl_tp); > + } > +} > + > +#define CT_DPIF_NL_TP_MAPPING(PROTO1, PROTO2, ATTR1, ATTR2) > \ > +if (nl_tp->present & (1 << CTA_TIMEOUT_##PROTO2##_##ATTR2)) { > \ > + if (tp->present & (1 << CT_DPIF_TP_ATTR_##PROTO1##_##ATTR1)) { > \ > + if (tp->attrs[CT_DPIF_TP_ATTR_##PROTO1##_##ATTR1] != > \ > + nl_tp->attrs[CTA_TIMEOUT_##PROTO2##_##ATTR2]) { > \ > + VLOG_WARN_RL(&error_rl, "Inconsistent timeout policy %s " > \ > + "attribute %s=%"PRIu32" while %s=%"PRIu32, > \ > + nl_tp->name, "CTA_TIMEOUT_"#PROTO2"_"#ATTR2, > \ > + nl_tp->attrs[CTA_TIMEOUT_##PROTO2##_##ATTR2], > \ > + "CT_DPIF_TP_ATTR_"#PROTO1"_"#ATTR1, > \ > + tp->attrs[CT_DPIF_TP_ATTR_##PROTO1##_##ATTR1]); > \ > + } > \ > + } else { > \ > + tp->present |= 1 << CT_DPIF_TP_ATTR_##PROTO1##_##ATTR1; > \ > + tp->attrs[CT_DPIF_TP_ATTR_##PROTO1##_##ATTR1] = > \ > + nl_tp->attrs[CTA_TIMEOUT_##PROTO2##_##ATTR2]; > \ > + } > \ > +} > + > +static void > +dpif_netlink_set_ct_dpif_tp_tcp_attrs(const struct 
nl_ct_timeout_policy > *nl_tp, > + struct ct_dpif_timeout_policy *tp) > +{ > + CT_DPIF_NL_TP_TCP_MAPPINGS > +} > + > +static void > +dpif_netlink_set_ct_dpif_tp_udp_attrs(const struct nl_ct_timeout_policy > *nl_tp, > + struct ct_dpif_timeout_policy *tp) > +{ > + CT_DPIF_NL_TP_UDP_MAPPINGS > +} > + > +static void > +dpif_netlink_set_ct_dpif_tp_icmp_attrs( > + const struct nl_ct_timeout_policy *nl_tp, > + struct ct_dpif_timeout_policy *tp) > +{ > + CT_DPIF_NL_TP_ICMP_MAPPINGS > +} > + > +static void > +dpif_netlink_set_ct_dpif_tp_icmpv6_attrs( > + const struct nl_ct_timeout_policy *nl_tp, > + struct ct_dpif_timeout_policy *tp) > +{ > + CT_DPIF_NL_TP_ICMPV6_MAPPINGS > +} > + > +#undef CT_DPIF_NL_TP_MAPPING > + > +static void > +dpif_netlink_set_ct_dpif_tp_attrs(const struct nl_ct_timeout_policy > *nl_tp, > + struct ct_dpif_timeout_policy *tp) > +{ > + if (nl_tp->l4num == IPPROTO_TCP) { > + dpif_netlink_set_ct_dpif_tp_tcp_attrs(nl_tp, tp); > + } else if (nl_tp->l4num == IPPROTO_UDP) { > + dpif_netlink_set_ct_dpif_tp_udp_attrs(nl_tp, tp); > + } else if (nl_tp->l4num == IPPROTO_ICMP) { > + dpif_netlink_set_ct_dpif_tp_icmp_attrs(nl_tp, tp); > + } else if (nl_tp->l4num == IPPROTO_ICMPV6) { > + dpif_netlink_set_ct_dpif_tp_icmpv6_attrs(nl_tp, tp); > + } > +} > + > +#ifdef _WIN32 > +static int > +dpif_netlink_ct_set_timeout_policy(struct dpif *dpif OVS_UNUSED, > + const struct ct_dpif_timeout_policy > *tp) > +{ > + return EOPNOTSUPP; > +} > + > +static int > +dpif_netlink_ct_get_timeout_policy(struct dpif *dpif OVS_UNUSED, > + uint32_t tp_id, > + struct ct_dpif_timeout_policy *tp) > +{ > + return EOPNOTSUPP; > +} > + > +static int > +dpif_netlink_ct_del_timeout_policy(struct dpif *dpif OVS_UNUSED, > + uint32_t tp_id) > +{ > + return EOPNOTSUPP; > +} > + > +static int > +dpif_netlink_ct_timeout_policy_dump_start(struct dpif *dpif OVS_UNUSED, > + void **statep) > +{ > + return EOPNOTSUPP; > +} > + > +static int > +dpif_netlink_ct_timeout_policy_dump_next(struct dpif *dpif 
OVS_UNUSED, > + void *state, > + struct ct_dpif_timeout_policy > **tp) > +{ > + return EOPNOTSUPP; > +} > + > +static int > +dpif_netlink_ct_timeout_policy_dump_done(struct dpif *dpif OVS_UNUSED, > + void *state) > +{ > + return EOPNOTSUPP; > +} > +#else > +static int > +dpif_netlink_ct_set_timeout_policy(struct dpif *dpif OVS_UNUSED, > + const struct ct_dpif_timeout_policy > *tp) > +{ > + struct nl_ct_timeout_policy nl_tp; > + struct ds nl_tp_name = DS_EMPTY_INITIALIZER; > + int i, err = 0; > + > + for (i = 0; i < ARRAY_SIZE(tp_protos); ++i) { > for (int i = 0; i < ARRAY_SIZE(tp_protos); ++i) { There are several other cases as well; I am not going to mention them all. > + dpif_netlink_format_tp_name(tp->id, tp_protos[i].l3num, > + tp_protos[i].l4num, &nl_tp_name); > + ovs_strlcpy(nl_tp.name, ds_cstr(&nl_tp_name), sizeof nl_tp.name); > + nl_tp.l3num = tp_protos[i].l3num; > + nl_tp.l4num = tp_protos[i].l4num; > + dpif_netlink_get_nl_tp_attrs(tp, tp_protos[i].l4num, &nl_tp); > + err = nl_ct_set_timeout_policy(&nl_tp); > + if (err) { > + VLOG_WARN_RL(&error_rl, "failed to add timeout policy %s > (%s)", > + nl_tp.name, ovs_strerror(err)); > + goto out; > + } > + } > + > +out: > + ds_destroy(&nl_tp_name); > + return err; > +} > + > +static int > +dpif_netlink_ct_get_timeout_policy(struct dpif *dpif OVS_UNUSED, > + uint32_t tp_id, > + struct ct_dpif_timeout_policy *tp) > +{ > + struct nl_ct_timeout_policy nl_tp; > + struct ds nl_tp_name = DS_EMPTY_INITIALIZER; > + int i, err = 0; > + > + tp->id = tp_id; > + tp->present = 0; > + for (i = 0; i < ARRAY_SIZE(tp_protos); ++i) { > + dpif_netlink_format_tp_name(tp_id, tp_protos[i].l3num, > + tp_protos[i].l4num, &nl_tp_name); > + err = nl_ct_get_timeout_policy(ds_cstr(&nl_tp_name), &nl_tp); > + > + if (err) { > + VLOG_WARN_RL(&error_rl, "failed to get timeout policy %s > (%s)", > + nl_tp.name, ovs_strerror(err)); > + goto out; > + } > + dpif_netlink_set_ct_dpif_tp_attrs(&nl_tp, tp); > + } > + > +out: > + ds_destroy(&nl_tp_name); > + 
return err; > +} > + > +/* Returns 0 if all the sub timeout policies are deleted or > + * not exist in the kernel. */ > +static int > +dpif_netlink_ct_del_timeout_policy(struct dpif *dpif OVS_UNUSED, > + uint32_t tp_id) > +{ > + struct ds nl_tp_name = DS_EMPTY_INITIALIZER; > + int i, err = 0; > + > + for (i = 0; i < ARRAY_SIZE(tp_protos); ++i) { > for (int i = 0; i < ARRAY_SIZE(tp_protos); ++i) { > + dpif_netlink_format_tp_name(tp_id, tp_protos[i].l3num, > + tp_protos[i].l4num, &nl_tp_name); > + err = nl_ct_del_timeout_policy(ds_cstr(&nl_tp_name)); > + if (err == ENOENT) { > + err = 0; > + } > + if (err) { > + VLOG_WARN_RL(&error_rl, "failed to delete timeout policy %s > (%s)", > + ds_cstr(&nl_tp_name), ovs_strerror(err)); > + goto out; > + } > + } > + > +out: > + ds_destroy(&nl_tp_name); > + return err; > +} > + > +struct dpif_netlink_ct_timeout_policy_dump_state { > + struct nl_ct_timeout_policy_dump_state *nl_dump_state; > + struct hmap tp_dump_map; > +}; > + > +struct dpif_netlink_tp_dump_node { > + struct hmap_node hmap_node; /* node in tp_dump_map. 
*/ > + struct ct_dpif_timeout_policy *tp; > + uint32_t l3_l4_present; > +}; > + > +static struct dpif_netlink_tp_dump_node * > +get_dpif_netlink_tp_dump_node_by_tp_id(uint32_t tp_id, > + struct hmap *tp_dump_map) > +{ > + struct dpif_netlink_tp_dump_node *tp_dump_node; > + > + HMAP_FOR_EACH_WITH_HASH (tp_dump_node, hmap_node, hash_int(tp_id, 0), > + tp_dump_map) { > + if (tp_dump_node->tp->id == tp_id) { > + return tp_dump_node; > + } > + } > + return NULL; > +} > + > +static void > +update_dpif_netlink_tp_dump_node( > + const struct nl_ct_timeout_policy *nl_tp, > + struct dpif_netlink_tp_dump_node *tp_dump_node) > +{ > + int i; > + > + dpif_netlink_set_ct_dpif_tp_attrs(nl_tp, tp_dump_node->tp); > + for (i = 0; i < DPIF_NL_TP_MAX; ++i) { > another one for (int i = 0; i < DPIF_NL_TP_MAX; ++i) { > + if (nl_tp->l3num == tp_protos[i].l3num && > + nl_tp->l4num == tp_protos[i].l4num) { > + tp_dump_node->l3_l4_present |= 1 << i; > + break; > + } > + } > +} > + > +static int > +dpif_netlink_ct_timeout_policy_dump_start(struct dpif *dpif OVS_UNUSED, > + void **statep) > +{ > + struct dpif_netlink_ct_timeout_policy_dump_state *dump_state; > + int err; > + > + *statep = dump_state = xzalloc(sizeof *dump_state); > + err = nl_ct_timeout_policy_dump_start(&dump_state->nl_dump_state); > + if (err) { > + free(dump_state); > + return err; > + } > + hmap_init(&dump_state->tp_dump_map); > + return 0; > +} > + > +static int > +dpif_netlink_ct_timeout_policy_dump_next(struct dpif *dpif OVS_UNUSED, > + void *state, > + struct ct_dpif_timeout_policy > *tp) > I think it would be super helpful to add some comments to this function. 
> +{ > + struct dpif_netlink_ct_timeout_policy_dump_state *dump_state = state; > + struct dpif_netlink_tp_dump_node *tp_dump_node; > + int err; > + > + do { > + struct nl_ct_timeout_policy nl_tp; > + uint32_t tp_id; > + > + err = nl_ct_timeout_policy_dump_next(dump_state->nl_dump_state, > + &nl_tp); > + if (err) { > + break; > + } > + > + if (!ovs_scan(nl_tp.name, NL_TP_NAME_PREFIX"%"PRIu32, &tp_id)) { > + continue; > + } > + > + tp_dump_node = get_dpif_netlink_tp_dump_node_by_tp_id( > + tp_id, &dump_state->tp_dump_map); > + if (!tp_dump_node) { > + tp_dump_node = xzalloc(sizeof *tp_dump_node); > + tp_dump_node->tp = xzalloc(sizeof *tp_dump_node->tp); > + tp_dump_node->tp->id = tp_id; > + hmap_insert(&dump_state->tp_dump_map, > &tp_dump_node->hmap_node, > + hash_int(tp_id, 0)); > + } > + > + update_dpif_netlink_tp_dump_node(&nl_tp, tp_dump_node); > + if (tp_dump_node->l3_l4_present == DPIF_NL_ALL_TP) { > > + hmap_remove(&dump_state->tp_dump_map, > &tp_dump_node->hmap_node); > + *tp = *tp_dump_node->tp; > + free(tp_dump_node->tp); > + free(tp_dump_node); > common block; write once > + break; > + } > + } while (true); > + > + /* Dump the incomplete timeout policy. 
*/ > + if (err == EOF) { > + if (!hmap_is_empty(&dump_state->tp_dump_map)) { > + struct hmap_node *hmap_node = > hmap_first(&dump_state->tp_dump_map); > + > > + tp_dump_node = CONTAINER_OF(hmap_node, > + struct dpif_netlink_tp_dump_node, hmap_node); > > + hmap_remove(&dump_state->tp_dump_map, hmap_node); + *tp = *tp_dump_node->tp; + free(tp_dump_node->tp); + free(tp_dump_node); common block; write once > + return 0; > + } > + } > + > + return err; > +} > + > +static int > +dpif_netlink_ct_timeout_policy_dump_done(struct dpif *dpif OVS_UNUSED, > + void *state) > +{ > + struct dpif_netlink_ct_timeout_policy_dump_state *dump_state = state; > + struct dpif_netlink_tp_dump_node *tp_dump_node; > + int err; > + > + err = nl_ct_timeout_policy_dump_done(dump_state->nl_dump_state); > int err = nl_ct_timeout_policy_dump_done(dump_state->nl_dump_state); > + HMAP_FOR_EACH_POP (tp_dump_node, hmap_node, &dump_state->tp_dump_map) > { > + free(tp_dump_node->tp); > + free(tp_dump_node); > + } > + hmap_destroy(&dump_state->tp_dump_map); > + free(dump_state); > + return err; > +} > +#endif > + > > /* Meters */ > > @@ -3429,6 +3892,12 @@ const struct dpif_class dpif_netlink_class = { > dpif_netlink_ct_set_limits, > dpif_netlink_ct_get_limits, > dpif_netlink_ct_del_limits, > + dpif_netlink_ct_set_timeout_policy, > + dpif_netlink_ct_get_timeout_policy, > + dpif_netlink_ct_del_timeout_policy, > + dpif_netlink_ct_timeout_policy_dump_start, > + dpif_netlink_ct_timeout_policy_dump_next, > + dpif_netlink_ct_timeout_policy_dump_done, > NULL, /* ipf_set_enabled */ > NULL, /* ipf_set_min_frag */ > NULL, /* ipf_set_max_nfrags */ > diff --git a/lib/dpif-netlink.h b/lib/dpif-netlink.h > index 0a9628088275..24294bc42dc3 100644 > --- a/lib/dpif-netlink.h > +++ b/lib/dpif-netlink.h > @@ -20,7 +20,6 @@ > #include <stdbool.h> > #include <stddef.h> > #include <stdint.h> > -#include "odp-netlink.h" > > #include "flow.h" > > diff --git a/lib/dpif-provider.h b/lib/dpif-provider.h > index 
12898b9e3c6d..e988626ea05b 100644 > --- a/lib/dpif-provider.h > +++ b/lib/dpif-provider.h > @@ -80,6 +80,7 @@ dpif_flow_dump_thread_init(struct dpif_flow_dump_thread > *thread, > struct ct_dpif_dump_state; > struct ct_dpif_entry; > struct ct_dpif_tuple; > +struct ct_dpif_timeout_policy; > > /* 'dpif_ipf_proto_status' and 'dpif_ipf_status' are presently in > * sync with 'ipf_proto_status' and 'ipf_status', but more > @@ -498,6 +499,49 @@ struct dpif_class { > * list of 'struct ct_dpif_zone_limit' entries. */ > int (*ct_del_limits)(struct dpif *, const struct ovs_list > *zone_limits); > > + /* Connection tracking timeout policy */ > + > + /* A connection tracking timeout policy contains a list of timeout > + * attributes that specify timeout values on various connection > states. > + * In a datapath, the timeout policy is identified by a 4-byte > unsigned > + * integer. Unsupported timeout attributes are ignored. When a > + * connection is committed it can be associated with a timeout > + * policy, or it defaults to the datapath's default timeout policy. */ > + > + /* Sets timeout policy '*tp' into the datapath. */ > + int (*ct_set_timeout_policy)(struct dpif *, > + const struct ct_dpif_timeout_policy *tp); > + /* Gets a timeout policy specified by tp_id and stores it into '*tp'. > */ > + int (*ct_get_timeout_policy)(struct dpif *, uint32_t tp_id, > + struct ct_dpif_timeout_policy *tp); > + /* Deletes a timeout policy identified by 'tp_id'. */ > + int (*ct_del_timeout_policy)(struct dpif *, uint32_t tp_id); > + > + /* Conntrack timeout policy dumping interface. > + * > + * These functions provide a datapath-agnostic dumping interface > + * to the conntrack timeout policy provided by the datapaths. > + * > + * ct_timeout_policy_dump_start() should put in '*statep' a pointer to > + * a newly allocated structure that will be passed by the caller to > + * ct_timeout_policy_dump_next() and ct_timeout_policy_dump_done(). 
> + * > + * ct_timeout_policy_dump_next() attempts to retrieve another timeout > + * policy from 'dpif' for 'state', which was initialized by a > successful > + * call to ct_timeout_policy_dump_start(). On success, stores a new > + * timeout policy into 'tp' and returns 0. Returns EOF if the last > + * timeout policy has been dumped, or a positive errno value on error. > + * This function will not be called again once it returns nonzero once > + * for a given iteration (but the ct_timeout_policy_dump_done() will > + * be called afterward). > + * > + * ct_timeout_policy_dump_done() should perform any cleanup necessary > + * (including deallocating the 'state' structure, if applicable). */ > + int (*ct_timeout_policy_dump_start)(struct dpif *, void **statep); > + int (*ct_timeout_policy_dump_next)(struct dpif *, void *state, > + struct ct_dpif_timeout_policy *tp); > + int (*ct_timeout_policy_dump_done)(struct dpif *, void *state); > + > /* IP Fragmentation. */ > > /* Disables or enables conntrack fragment reassembly. 
The default > diff --git a/lib/netlink-conntrack.c b/lib/netlink-conntrack.c > index 7631ba5d5d31..828e4a5a84c1 100644 > --- a/lib/netlink-conntrack.c > +++ b/lib/netlink-conntrack.c > @@ -840,6 +840,314 @@ nl_ct_parse_helper(struct nlattr *nla, struct > ct_dpif_helper *helper) > return parsed; > } > > +static int nl_ct_timeout_policy_max_attr[] = { > + [IPPROTO_TCP] = CTA_TIMEOUT_TCP_MAX, > + [IPPROTO_UDP] = CTA_TIMEOUT_UDP_MAX, > + [IPPROTO_ICMP] = CTA_TIMEOUT_ICMP_MAX, > + [IPPROTO_ICMPV6] = CTA_TIMEOUT_ICMPV6_MAX > +}; > + > +static void > +nl_ct_set_timeout_policy_attr(struct nl_ct_timeout_policy *nl_tp, > + uint32_t attr, uint32_t val) > +{ > + nl_tp->present |= 1 << attr; > + nl_tp->attrs[attr] = val; > +} > + > +static int > +nl_ct_parse_tcp_timeout_policy_data(struct nlattr *nla, > + struct nl_ct_timeout_policy *nl_tp) > +{ > + static const struct nl_policy policy[] = { > + [CTA_TIMEOUT_TCP_SYN_SENT] = { .type = NL_A_BE32, > + .optional = false }, > + [CTA_TIMEOUT_TCP_SYN_RECV] = { .type = NL_A_BE32, > + .optional = false }, > + [CTA_TIMEOUT_TCP_ESTABLISHED] = { .type = NL_A_BE32, > + .optional = false }, > + [CTA_TIMEOUT_TCP_FIN_WAIT] = { .type = NL_A_BE32, > + .optional = false }, > + [CTA_TIMEOUT_TCP_CLOSE_WAIT] = { .type = NL_A_BE32, > + .optional = false }, > + [CTA_TIMEOUT_TCP_LAST_ACK] = { .type = NL_A_BE32, > + .optional = false }, > + [CTA_TIMEOUT_TCP_TIME_WAIT] = { .type = NL_A_BE32, > + .optional = false }, > + [CTA_TIMEOUT_TCP_CLOSE] = { .type = NL_A_BE32, > + .optional = false }, > + [CTA_TIMEOUT_TCP_SYN_SENT2] = { .type = NL_A_BE32, > + .optional = false }, > + [CTA_TIMEOUT_TCP_RETRANS] = { .type = NL_A_BE32, > + .optional = false }, > + [CTA_TIMEOUT_TCP_UNACK] = { .type = NL_A_BE32, > + .optional = false }, > + }; > + struct nlattr *attrs[ARRAY_SIZE(policy)]; > + int i; > + > + if (!nl_parse_nested(nla, policy, attrs, ARRAY_SIZE(policy))) { > + VLOG_ERR_RL(&rl, "Could not parse nested tcp timeout options. 
" > + "Possibly incompatible Linux kernel version."); > + return EINVAL; > + } > + > + for (i = CTA_TIMEOUT_TCP_SYN_SENT; i <= CTA_TIMEOUT_TCP_UNACK; i++) { > + nl_ct_set_timeout_policy_attr(nl_tp, i, > + ntohl(nl_attr_get_be32(attrs[i]))); > + } > + return 0; > +} > + > +static int > +nl_ct_parse_udp_timeout_policy_data(struct nlattr *nla, > + struct nl_ct_timeout_policy *nl_tp) > +{ > + static const struct nl_policy policy[] = { > + [CTA_TIMEOUT_UDP_UNREPLIED] = { .type = NL_A_BE32, > + .optional = false }, > + [CTA_TIMEOUT_UDP_REPLIED] = { .type = NL_A_BE32, > + .optional = false }, > + }; > + struct nlattr *attrs[ARRAY_SIZE(policy)]; > + int i; > + > + if (!nl_parse_nested(nla, policy, attrs, ARRAY_SIZE(policy))) { > + VLOG_ERR_RL(&rl, "Could not parse nested tcp timeout options. " > + "Possibly incompatible Linux kernel version."); > + return EINVAL; > + } > + > + for (i = CTA_TIMEOUT_UDP_UNREPLIED; i <= CTA_TIMEOUT_UDP_REPLIED; > i++) { > + nl_ct_set_timeout_policy_attr(nl_tp, i, > + ntohl(nl_attr_get_be32(attrs[i]))); > + } > + return 0; > +} > + > +static int > +nl_ct_parse_icmp_timeout_policy_data(struct nlattr *nla, > + struct nl_ct_timeout_policy *nl_tp) > +{ > + static const struct nl_policy policy[] = { > + [CTA_TIMEOUT_ICMP_TIMEOUT] = { .type = NL_A_BE32, > + .optional = false }, > + }; > + struct nlattr *attrs[ARRAY_SIZE(policy)]; > + > + if (!nl_parse_nested(nla, policy, attrs, ARRAY_SIZE(policy))) { > + VLOG_ERR_RL(&rl, "Could not parse nested icmp timeout options. 
" > + "Possibly incompatible Linux kernel version."); > + return EINVAL; > + } > + > + nl_ct_set_timeout_policy_attr( > + nl_tp, CTA_TIMEOUT_ICMP_TIMEOUT, > + ntohl(nl_attr_get_be32(attrs[CTA_TIMEOUT_ICMP_TIMEOUT]))); > + return 0; > +} > + > +static int > +nl_ct_parse_icmpv6_timeout_policy_data(struct nlattr *nla, > + struct nl_ct_timeout_policy *nl_tp) > +{ > + static const struct nl_policy policy[] = { > + [CTA_TIMEOUT_ICMPV6_TIMEOUT] = { .type = NL_A_BE32, > + .optional = false }, > + }; > + struct nlattr *attrs[ARRAY_SIZE(policy)]; > + > + if (!nl_parse_nested(nla, policy, attrs, ARRAY_SIZE(policy))) { > + VLOG_ERR_RL(&rl, "Could not parse nested icmpv6 timeout options. " > + "Possibly incompatible Linux kernel version."); > + return EINVAL; > + } > + > + nl_ct_set_timeout_policy_attr( > + nl_tp, CTA_TIMEOUT_ICMPV6_TIMEOUT, > + ntohl(nl_attr_get_be32(attrs[CTA_TIMEOUT_ICMPV6_TIMEOUT]))); > + return 0; > +} > + > +static int > +nl_ct_parse_timeout_policy_data(struct nlattr *nla, > + struct nl_ct_timeout_policy *nl_tp) > +{ > + switch (nl_tp->l4num) { > + case IPPROTO_TCP: > + return nl_ct_parse_tcp_timeout_policy_data(nla, nl_tp); > + case IPPROTO_UDP: > + return nl_ct_parse_udp_timeout_policy_data(nla, nl_tp); > + case IPPROTO_ICMP: > + return nl_ct_parse_icmp_timeout_policy_data(nla, nl_tp); > + case IPPROTO_ICMPV6: > + return nl_ct_parse_icmpv6_timeout_policy_data(nla, nl_tp); > + default: > + return EINVAL; > + } > +} > + > +static int > +nl_ct_timeout_policy_from_ofpbuf(struct ofpbuf *buf, > + struct nl_ct_timeout_policy *nl_tp, > + bool default_tp) > +{ > + static const struct nl_policy policy[] = { > + [CTA_TIMEOUT_NAME] = { .type = NL_A_STRING, .optional = false > }, > + [CTA_TIMEOUT_L3PROTO] = { .type = NL_A_BE16, .optional = false }, > + [CTA_TIMEOUT_L4PROTO] = { .type = NL_A_U8, .optional = false }, > + [CTA_TIMEOUT_DATA] = { .type = NL_A_NESTED, .optional = false } > + }; > + static const struct nl_policy policy_default_tp[] = { > + 
[CTA_TIMEOUT_L3PROTO] = { .type = NL_A_BE16, .optional = false }, > + [CTA_TIMEOUT_L4PROTO] = { .type = NL_A_U8, .optional = false }, > + [CTA_TIMEOUT_DATA] = { .type = NL_A_NESTED, .optional = false } > + }; > + > + struct nlattr *attrs[ARRAY_SIZE(policy)]; > + struct ofpbuf b = ofpbuf_const_initializer(buf->data, buf->size); > + struct nlmsghdr *nlmsg = ofpbuf_try_pull(&b, sizeof *nlmsg); > + struct nfgenmsg *nfmsg = ofpbuf_try_pull(&b, sizeof *nfmsg); > + int err; > err not needed > + > + if (!nlmsg || !nfmsg > + || NFNL_SUBSYS_ID(nlmsg->nlmsg_type) != > NFNL_SUBSYS_CTNETLINK_TIMEOUT > + || nfmsg->version != NFNETLINK_V0 > + || !nl_policy_parse(&b, 0, default_tp ? policy_default_tp : > policy, > + attrs, default_tp ? > ARRAY_SIZE(policy_default_tp) : > + ARRAY_SIZE(policy))) { > + return EINVAL; > + } > + > + if (!default_tp) { > + ovs_strlcpy(nl_tp->name, > nl_attr_get_string(attrs[CTA_TIMEOUT_NAME]), > + sizeof nl_tp->name); > + } > + nl_tp->l3num = ntohs(nl_attr_get_be16(attrs[CTA_TIMEOUT_L3PROTO])); > + nl_tp->l4num = nl_attr_get_u8(attrs[CTA_TIMEOUT_L4PROTO]); > + nl_tp->present = 0; > + > + err = nl_ct_parse_timeout_policy_data(attrs[CTA_TIMEOUT_DATA], nl_tp); + return err; > return nl_ct_parse_timeout_policy_data(attrs[CTA_TIMEOUT_DATA], nl_tp); > +} > + > +int > +nl_ct_set_timeout_policy(const struct nl_ct_timeout_policy *nl_tp) > +{ > + struct ofpbuf buf; > + size_t offset; > + int i, err; > + > + ofpbuf_init(&buf, 512); > + nl_msg_put_nfgenmsg(&buf, 0, AF_UNSPEC, NFNL_SUBSYS_CTNETLINK_TIMEOUT, > + IPCTNL_MSG_TIMEOUT_NEW, NLM_F_REQUEST | > NLM_F_CREATE > + | NLM_F_ACK | NLM_F_REPLACE); > + > + nl_msg_put_string(&buf, CTA_TIMEOUT_NAME, nl_tp->name); > + nl_msg_put_be16(&buf, CTA_TIMEOUT_L3PROTO, htons(nl_tp->l3num)); > + nl_msg_put_u8(&buf, CTA_TIMEOUT_L4PROTO, nl_tp->l4num); > + > + offset = nl_msg_start_nested(&buf, CTA_TIMEOUT_DATA); > + for (i = 1; i <= nl_ct_timeout_policy_max_attr[nl_tp->l4num]; ++i) { > for (int i = 1; i <= 
nl_ct_timeout_policy_max_attr[nl_tp->l4num]; ++i) { > + if (nl_tp->present & 1 << i) { > + nl_msg_put_be32(&buf, i, htonl(nl_tp->attrs[i])); > + } > + } > + nl_msg_end_nested(&buf, offset); > + > + err = nl_transact(NETLINK_NETFILTER, &buf, NULL); > int err = nl_transact(NETLINK_NETFILTER, &buf, NULL); > + ofpbuf_uninit(&buf); > + return err; > +} > + > +int > +nl_ct_get_timeout_policy(const char *tp_name, > + struct nl_ct_timeout_policy *nl_tp) > +{ > + struct ofpbuf request, *reply; > + int err; > + > + ofpbuf_init(&request, 512); > + nl_msg_put_nfgenmsg(&request, 0, AF_UNSPEC, > NFNL_SUBSYS_CTNETLINK_TIMEOUT, > + IPCTNL_MSG_TIMEOUT_GET, NLM_F_REQUEST | > NLM_F_ACK); > + nl_msg_put_string(&request, CTA_TIMEOUT_NAME, tp_name); > + err = nl_transact(NETLINK_NETFILTER, &request, &reply); > int err = nl_transact(NETLINK_NETFILTER, &request, &reply); > + if (err) { > + goto out; > + } > + > + err = nl_ct_timeout_policy_from_ofpbuf(reply, nl_tp, false); > + > +out: > + ofpbuf_uninit(&request); > + ofpbuf_delete(reply); > + return err; > +} > + > +int > +nl_ct_del_timeout_policy(const char *tp_name) > +{ > + struct ofpbuf buf; > + int err; > + > + ofpbuf_init(&buf, 64); > + nl_msg_put_nfgenmsg(&buf, 0, AF_UNSPEC, NFNL_SUBSYS_CTNETLINK_TIMEOUT, > + IPCTNL_MSG_TIMEOUT_DELETE, NLM_F_REQUEST | > NLM_F_ACK); > + > + nl_msg_put_string(&buf, CTA_TIMEOUT_NAME, tp_name); > + err = nl_transact(NETLINK_NETFILTER, &buf, NULL); > int err = nl_transact(NETLINK_NETFILTER, &buf, NULL); + ofpbuf_uninit(&buf); > + return err; > +} > + > +struct nl_ct_timeout_policy_dump_state { > + struct nl_dump dump; > + struct ofpbuf buf; > +}; > + > +int > +nl_ct_timeout_policy_dump_start( > + struct nl_ct_timeout_policy_dump_state **statep) > +{ > + struct ofpbuf request; > + struct nl_ct_timeout_policy_dump_state *state; > + > + *statep = state = xzalloc(sizeof *state); > + ofpbuf_init(&request, 512); > + nl_msg_put_nfgenmsg(&request, 0, AF_UNSPEC, > NFNL_SUBSYS_CTNETLINK_TIMEOUT, > + 
IPCTNL_MSG_TIMEOUT_GET, > + NLM_F_REQUEST | NLM_F_ACK | NLM_F_DUMP); > + > + nl_dump_start(&state->dump, NETLINK_NETFILTER, &request); > + ofpbuf_uninit(&request); > + ofpbuf_init(&state->buf, NL_DUMP_BUFSIZE); > + return 0; > +} > + > +int > +nl_ct_timeout_policy_dump_next(struct nl_ct_timeout_policy_dump_state > *state, > + struct nl_ct_timeout_policy *nl_tp) > +{ > + struct ofpbuf reply; > + int err; > + > + if (!nl_dump_next(&state->dump, &reply, &state->buf)) { > + return EOF; > + } > + err = nl_ct_timeout_policy_from_ofpbuf(&reply, nl_tp, false); > int err = nl_ct_timeout_policy_from_ofpbuf(&reply, nl_tp, false); > + ofpbuf_uninit(&reply); > + return err; > +} > + > +int > +nl_ct_timeout_policy_dump_done(struct nl_ct_timeout_policy_dump_state > *state) > +{ > + int err = nl_dump_done(&state->dump); > + ofpbuf_uninit(&state->buf); > + free(state); > + return err; > +} > + > /* Translate netlink entry status flags to CT_DPIF_TCP status flags. */ > static uint32_t > ips_status_to_dpif_flags(uint32_t status) > diff --git a/lib/netlink-conntrack.h b/lib/netlink-conntrack.h > index 8b536fd65ba8..81c74549bd16 100644 > --- a/lib/netlink-conntrack.h > +++ b/lib/netlink-conntrack.h > @@ -17,6 +17,8 @@ > #ifndef NETLINK_CONNTRACK_H > #define NETLINK_CONNTRACK_H > > +#include <linux/netfilter/nfnetlink_cttimeout.h> > + > #include "byte-order.h" > #include "compiler.h" > #include "ct-dpif.h" > @@ -33,10 +35,21 @@ enum nl_ct_event_type { > NL_CT_EVENT_DELETE = 1 << 2, > }; > > +#define NL_CT_TIMEOUT_POLICY_MAX_ATTR (CTA_TIMEOUT_TCP_MAX + 1) > + > +struct nl_ct_timeout_policy { > + char name[CTNL_TIMEOUT_NAME_MAX]; > + uint16_t l3num; > + uint8_t l4num; > + uint32_t attrs[NL_CT_TIMEOUT_POLICY_MAX_ATTR]; > + uint32_t present; > +}; > + > struct nl_ct_dump_state; > +struct nl_ct_timeout_policy_dump_state; > > int nl_ct_dump_start(struct nl_ct_dump_state **, const uint16_t *zone, > - int *ptot_bkts); > + int *ptot_bkts); > int nl_ct_dump_next(struct nl_ct_dump_state *, struct 
ct_dpif_entry *); > int nl_ct_dump_done(struct nl_ct_dump_state *); > > @@ -44,6 +57,18 @@ int nl_ct_flush(void); > int nl_ct_flush_zone(uint16_t zone); > int nl_ct_flush_tuple(const struct ct_dpif_tuple *, uint16_t zone); > > +int nl_ct_set_timeout_policy(const struct nl_ct_timeout_policy *nl_tp); > +int nl_ct_get_timeout_policy(const char *tp_name, > + struct nl_ct_timeout_policy *nl_tp); > +int nl_ct_del_timeout_policy(const char *tp_name); > +int nl_ct_timeout_policy_dump_start( > + struct nl_ct_timeout_policy_dump_state **statep); > +int nl_ct_timeout_policy_dump_next( > + struct nl_ct_timeout_policy_dump_state *state, > + struct nl_ct_timeout_policy *nl_tp); > +int nl_ct_timeout_policy_dump_done( > + struct nl_ct_timeout_policy_dump_state *state); > + > bool nl_ct_parse_entry(struct ofpbuf *, struct ct_dpif_entry *, > enum nl_ct_event_type *); > void nl_ct_format_event_entry(const struct ct_dpif_entry *, > diff --git a/lib/netlink-protocol.h b/lib/netlink-protocol.h > index c0617dfad21f..ceded7915ef8 100644 > --- a/lib/netlink-protocol.h > +++ b/lib/netlink-protocol.h > @@ -47,13 +47,17 @@ > #define NLM_F_ACK 0x004 > #define NLM_F_ECHO 0x008 > > +/* GET request flag.*/ > #define NLM_F_ROOT 0x100 > #define NLM_F_MATCH 0x200 > -#define NLM_F_EXCL 0x200 > #define NLM_F_ATOMIC 0x400 > -#define NLM_F_CREATE 0x400 > #define NLM_F_DUMP (NLM_F_ROOT | NLM_F_MATCH) > > +/* NEW request flags. */ > +#define NLM_F_REPLACE 0x100 > +#define NLM_F_EXCL 0x200 > +#define NLM_F_CREATE 0x400 > + > /* nlmsg_type values. */ > #define NLMSG_NOOP 1 > #define NLMSG_ERROR 2 > -- > 2.7.4 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >
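For readers following the netlink-protocol.h hunk quoted above: the regrouping makes it visible that NLM_F_REPLACE shares its value with NLM_F_ROOT, NLM_F_EXCL with NLM_F_MATCH, and NLM_F_CREATE with NLM_F_ATOMIC, so the meaning of those bits depends on whether the request is a GET or a NEW. The following is only a minimal sketch of the flag combinations that the netlink-conntrack.c functions in this patch pass to nl_msg_put_nfgenmsg(). The enum tp_request and the helper tp_request_flags() are hypothetical names used for illustration, and NLM_F_REQUEST (0x001) is taken from the Linux uapi headers because it does not appear in the quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Values as in the patched lib/netlink-protocol.h.  NLM_F_REQUEST is not
 * shown in the quoted hunk; 0x001 is the Linux uapi value (assumption). */
#define NLM_F_REQUEST 0x001
#define NLM_F_ACK     0x004

/* GET request flags. */
#define NLM_F_ROOT    0x100
#define NLM_F_MATCH   0x200
#define NLM_F_DUMP    (NLM_F_ROOT | NLM_F_MATCH)

/* NEW request flags (same bits, different meaning). */
#define NLM_F_REPLACE 0x100
#define NLM_F_EXCL    0x200
#define NLM_F_CREATE  0x400

/* Hypothetical request kinds, for illustration only. */
enum tp_request { TP_REQ_SET, TP_REQ_GET, TP_REQ_DEL, TP_REQ_DUMP };

/* Returns the nlmsg_flags that the patch uses for each kind of
 * timeout policy request. */
static uint16_t
tp_request_flags(enum tp_request req)
{
    switch (req) {
    case TP_REQ_SET:
        /* Create the policy, or replace it if it already exists. */
        return NLM_F_REQUEST | NLM_F_CREATE | NLM_F_ACK | NLM_F_REPLACE;
    case TP_REQ_GET:
    case TP_REQ_DEL:
        return NLM_F_REQUEST | NLM_F_ACK;
    case TP_REQ_DUMP:
        return NLM_F_REQUEST | NLM_F_ACK | NLM_F_DUMP;
    }
    return 0;
}
```

Because the NEW and GET flag groups overlap, a set request carrying NLM_F_REPLACE and a dump request carrying NLM_F_ROOT set the same bit; only the message type tells the kernel which interpretation applies.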
On Wed, Aug 14, 2019 at 2:35 PM Darrell Ball <dlu998@gmail.com> wrote:
> On Mon, Aug 12, 2019 at 5:54 PM Yi-Hung Wei <yihung.wei@gmail.com> wrote:
>> --- a/lib/dpif-netlink.c
>> +++ b/lib/dpif-netlink.c
>> +static int
>> +dpif_netlink_ct_set_timeout_policy(struct dpif *dpif OVS_UNUSED,
>> +                                   const struct ct_dpif_timeout_policy *tp)
>> +{
>> +    struct nl_ct_timeout_policy nl_tp;
>> +    struct ds nl_tp_name = DS_EMPTY_INITIALIZER;
>> +    int i, err = 0;
>> +
>> +    for (i = 0; i < ARRAY_SIZE(tp_protos); ++i) {
>
> for (int i = 0; i < ARRAY_SIZE(tp_protos); ++i) {
>
> there are several other cases as well
> I'm not going to mention all

Thanks for the review. I made all the similar code changes in ct-dpif.c,
dpif-netlink.c and netlink-conntrack.c.

>> +static int
>> +dpif_netlink_ct_timeout_policy_dump_next(struct dpif *dpif OVS_UNUSED,
>> +                                         void *state,
>> +                                         struct ct_dpif_timeout_policy *tp)
>
> I think it would be super helpful to add some comments to this function.
>

Sure, I added some comments to dpif_netlink_ct_timeout_policy_dump_next()
in v4.

>> +        tp_dump_node = get_dpif_netlink_tp_dump_node_by_tp_id(
>> +                            tp_id, &dump_state->tp_dump_map);
>> +        if (!tp_dump_node) {
>> +            tp_dump_node = xzalloc(sizeof *tp_dump_node);
>> +            tp_dump_node->tp = xzalloc(sizeof *tp_dump_node->tp);
>> +            tp_dump_node->tp->id = tp_id;
>> +            hmap_insert(&dump_state->tp_dump_map, &tp_dump_node->hmap_node,
>> +                        hash_int(tp_id, 0));
>> +        }
>> +
>> +        update_dpif_netlink_tp_dump_node(&nl_tp, tp_dump_node);
>> +        if (tp_dump_node->l3_l4_present == DPIF_NL_ALL_TP) {
>> +            hmap_remove(&dump_state->tp_dump_map, &tp_dump_node->hmap_node);
>> +            *tp = *tp_dump_node->tp;
>> +            free(tp_dump_node->tp);
>> +            free(tp_dump_node);
>
> common block; write once

Sure, I moved the common block to a new function in v4.

static void
get_and_cleanup_tp_dump_node(struct hmap *hmap,
                             struct dpif_netlink_tp_dump_node *tp_dump_node,
                             struct ct_dpif_timeout_policy *tp)
{
    hmap_remove(hmap, &tp_dump_node->hmap_node);
    *tp = *tp_dump_node->tp;
    free(tp_dump_node->tp);
    free(tp_dump_node);
}

Thanks,

-Yi-Hung
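
To make the "write once" suggestion concrete, here is a rough, standalone sketch of how one helper can serve both call sites in the dump path: the case where all six sub-policies of an id have been seen, and the EOF case where an incomplete policy is flushed. It is not the v4 code. A singly linked list stands in for the OVS hmap, and struct tp, struct tp_dump_node, take_node(), feed_sub_policy() and drain_incomplete() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for struct ct_dpif_timeout_policy (assumption). */
struct tp {
    uint32_t id;
};

/* Stand-in for struct dpif_netlink_tp_dump_node; a linked list replaces
 * the OVS hmap so the sketch compiles on its own. */
struct tp_dump_node {
    struct tp_dump_node *next;
    struct tp *tp;
    uint32_t l3_l4_present;     /* One bit per (l3, l4) sub-policy seen. */
};

#define TP_MAX 6                        /* Mirrors DPIF_NL_TP_MAX. */
#define ALL_TP ((1u << TP_MAX) - 1)     /* Mirrors DPIF_NL_ALL_TP. */

static struct tp_dump_node *
find_or_add_node(struct tp_dump_node **head, uint32_t id)
{
    struct tp_dump_node *node;

    for (node = *head; node; node = node->next) {
        if (node->tp->id == id) {
            return node;
        }
    }
    node = calloc(1, sizeof *node);
    if (!node) {
        abort();
    }
    node->tp = calloc(1, sizeof *node->tp);
    if (!node->tp) {
        abort();
    }
    node->tp->id = id;
    node->next = *head;
    *head = node;
    return node;
}

/* The common block, written once: unlink 'node' (which must be in the
 * list), copy its policy into '*out', and free the node. */
static void
take_node(struct tp_dump_node **head, struct tp_dump_node *node,
          struct tp *out)
{
    struct tp_dump_node **pp = head;

    while (*pp != node) {
        pp = &(*pp)->next;
    }
    *pp = node->next;
    *out = *node->tp;
    free(node->tp);
    free(node);
}

/* Call site 1: records sub-policy 'proto_idx' of policy 'id'.  Returns 1
 * and fills '*out' once all TP_MAX sub-policies have been seen. */
static int
feed_sub_policy(struct tp_dump_node **head, uint32_t id, int proto_idx,
                struct tp *out)
{
    struct tp_dump_node *node = find_or_add_node(head, id);

    node->l3_l4_present |= 1u << proto_idx;
    if (node->l3_l4_present == ALL_TP) {
        take_node(head, node, out);
        return 1;
    }
    return 0;
}

/* Call site 2: at EOF, emits one incomplete policy if any remains.
 * Returns 1 if '*out' was filled, 0 if the list is empty. */
static int
drain_incomplete(struct tp_dump_node **head, struct tp *out)
{
    if (*head) {
        take_node(head, *head, out);
        return 1;
    }
    return 0;
}
```

Both paths end in the same unlink, copy, and free sequence, which is the duplication the review comment pointed at.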
diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst index 8daa23bb2d0c..0b7eaab1b143 100644 --- a/Documentation/faq/releases.rst +++ b/Documentation/faq/releases.rst @@ -110,8 +110,9 @@ Q: Are all features available with all datapaths? ========================== ============== ============== ========= ======= Connection tracking 4.3 YES YES YES Conntrack Fragment Reass. 4.3 YES YES YES + Conntrack Timeout Policies 5.2 YES NO NO + Conntrack Zone Limit 4.18 YES NO YES NAT 4.6 YES YES YES - Conntrack zone limit 4.18 YES NO YES Tunnel - LISP NO YES NO NO Tunnel - STT NO YES NO YES Tunnel - GRE 3.11 YES YES YES diff --git a/datapath-windows/include/OvsDpInterfaceCtExt.h b/datapath-windows/include/OvsDpInterfaceCtExt.h index 3b947782e90c..4379855bb8dd 100644 --- a/datapath-windows/include/OvsDpInterfaceCtExt.h +++ b/datapath-windows/include/OvsDpInterfaceCtExt.h @@ -421,4 +421,118 @@ struct nf_ct_tcp_flags { UINT8 mask; }; +/* File: nfnetlink_cttimeout.h */ +enum ctnl_timeout_msg_types { + IPCTNL_MSG_TIMEOUT_NEW, + IPCTNL_MSG_TIMEOUT_GET, + IPCTNL_MSG_TIMEOUT_DELETE, + IPCTNL_MSG_TIMEOUT_DEFAULT_SET, + IPCTNL_MSG_TIMEOUT_DEFAULT_GET, + + IPCTNL_MSG_TIMEOUT_MAX +}; + +enum ctattr_timeout { + CTA_TIMEOUT_UNSPEC, + CTA_TIMEOUT_NAME, + CTA_TIMEOUT_L3PROTO, + CTA_TIMEOUT_L4PROTO, + CTA_TIMEOUT_DATA, + CTA_TIMEOUT_USE, + __CTA_TIMEOUT_MAX +}; +#define CTA_TIMEOUT_MAX (__CTA_TIMEOUT_MAX - 1) + +enum ctattr_timeout_generic { + CTA_TIMEOUT_GENERIC_UNSPEC, + CTA_TIMEOUT_GENERIC_TIMEOUT, + __CTA_TIMEOUT_GENERIC_MAX +}; +#define CTA_TIMEOUT_GENERIC_MAX (__CTA_TIMEOUT_GENERIC_MAX - 1) + +enum ctattr_timeout_tcp { + CTA_TIMEOUT_TCP_UNSPEC, + CTA_TIMEOUT_TCP_SYN_SENT, + CTA_TIMEOUT_TCP_SYN_RECV, + CTA_TIMEOUT_TCP_ESTABLISHED, + CTA_TIMEOUT_TCP_FIN_WAIT, + CTA_TIMEOUT_TCP_CLOSE_WAIT, + CTA_TIMEOUT_TCP_LAST_ACK, + CTA_TIMEOUT_TCP_TIME_WAIT, + CTA_TIMEOUT_TCP_CLOSE, + CTA_TIMEOUT_TCP_SYN_SENT2, + CTA_TIMEOUT_TCP_RETRANS, + CTA_TIMEOUT_TCP_UNACK, + 
__CTA_TIMEOUT_TCP_MAX +}; +#define CTA_TIMEOUT_TCP_MAX (__CTA_TIMEOUT_TCP_MAX - 1) + +enum ctattr_timeout_udp { + CTA_TIMEOUT_UDP_UNSPEC, + CTA_TIMEOUT_UDP_UNREPLIED, + CTA_TIMEOUT_UDP_REPLIED, + __CTA_TIMEOUT_UDP_MAX +}; +#define CTA_TIMEOUT_UDP_MAX (__CTA_TIMEOUT_UDP_MAX - 1) + +enum ctattr_timeout_udplite { + CTA_TIMEOUT_UDPLITE_UNSPEC, + CTA_TIMEOUT_UDPLITE_UNREPLIED, + CTA_TIMEOUT_UDPLITE_REPLIED, + __CTA_TIMEOUT_UDPLITE_MAX +}; +#define CTA_TIMEOUT_UDPLITE_MAX (__CTA_TIMEOUT_UDPLITE_MAX - 1) + +enum ctattr_timeout_icmp { + CTA_TIMEOUT_ICMP_UNSPEC, + CTA_TIMEOUT_ICMP_TIMEOUT, + __CTA_TIMEOUT_ICMP_MAX +}; +#define CTA_TIMEOUT_ICMP_MAX (__CTA_TIMEOUT_ICMP_MAX - 1) + +enum ctattr_timeout_dccp { + CTA_TIMEOUT_DCCP_UNSPEC, + CTA_TIMEOUT_DCCP_REQUEST, + CTA_TIMEOUT_DCCP_RESPOND, + CTA_TIMEOUT_DCCP_PARTOPEN, + CTA_TIMEOUT_DCCP_OPEN, + CTA_TIMEOUT_DCCP_CLOSEREQ, + CTA_TIMEOUT_DCCP_CLOSING, + CTA_TIMEOUT_DCCP_TIMEWAIT, + __CTA_TIMEOUT_DCCP_MAX +}; +#define CTA_TIMEOUT_DCCP_MAX (__CTA_TIMEOUT_DCCP_MAX - 1) + +enum ctattr_timeout_sctp { + CTA_TIMEOUT_SCTP_UNSPEC, + CTA_TIMEOUT_SCTP_CLOSED, + CTA_TIMEOUT_SCTP_COOKIE_WAIT, + CTA_TIMEOUT_SCTP_COOKIE_ECHOED, + CTA_TIMEOUT_SCTP_ESTABLISHED, + CTA_TIMEOUT_SCTP_SHUTDOWN_SENT, + CTA_TIMEOUT_SCTP_SHUTDOWN_RECD, + CTA_TIMEOUT_SCTP_SHUTDOWN_ACK_SENT, + CTA_TIMEOUT_SCTP_HEARTBEAT_SENT, + CTA_TIMEOUT_SCTP_HEARTBEAT_ACKED, + __CTA_TIMEOUT_SCTP_MAX +}; +#define CTA_TIMEOUT_SCTP_MAX (__CTA_TIMEOUT_SCTP_MAX - 1) + +enum ctattr_timeout_icmpv6 { + CTA_TIMEOUT_ICMPV6_UNSPEC, + CTA_TIMEOUT_ICMPV6_TIMEOUT, + __CTA_TIMEOUT_ICMPV6_MAX +}; +#define CTA_TIMEOUT_ICMPV6_MAX (__CTA_TIMEOUT_ICMPV6_MAX - 1) + +enum ctattr_timeout_gre { + CTA_TIMEOUT_GRE_UNSPEC, + CTA_TIMEOUT_GRE_UNREPLIED, + CTA_TIMEOUT_GRE_REPLIED, + __CTA_TIMEOUT_GRE_MAX +}; +#define CTA_TIMEOUT_GRE_MAX (__CTA_TIMEOUT_GRE_MAX - 1) + +#define CTNL_TIMEOUT_NAME_MAX 32 + #endif /* __OVS_DP_INTERFACE_CT_EXT_H_ */ diff --git a/datapath-windows/ovsext/Netlink/NetlinkProto.h 
b/datapath-windows/ovsext/Netlink/NetlinkProto.h index 59b56565c1dc..b32f6f7fb114 100644 --- a/datapath-windows/ovsext/Netlink/NetlinkProto.h +++ b/datapath-windows/ovsext/Netlink/NetlinkProto.h @@ -50,13 +50,17 @@ #define NLM_F_ACK 0x004 #define NLM_F_ECHO 0x008 +/* GET request flag.*/ #define NLM_F_ROOT 0x100 #define NLM_F_MATCH 0x200 -#define NLM_F_EXCL 0x200 #define NLM_F_ATOMIC 0x400 -#define NLM_F_CREATE 0x400 #define NLM_F_DUMP (NLM_F_ROOT | NLM_F_MATCH) +/* NEW request flags. */ +#define NLM_F_REPLACE 0x100 +#define NLM_F_EXCL 0x200 +#define NLM_F_CREATE 0x400 + /* nlmsg_type values. */ #define NLMSG_NOOP 1 #define NLMSG_ERROR 2 diff --git a/include/windows/automake.mk b/include/windows/automake.mk index 382627b51787..883bbbf5d97c 100644 --- a/include/windows/automake.mk +++ b/include/windows/automake.mk @@ -15,6 +15,7 @@ noinst_HEADERS += \ include/windows/linux/netfilter/nf_conntrack_tcp.h \ include/windows/linux/netfilter/nfnetlink.h \ include/windows/linux/netfilter/nfnetlink_conntrack.h \ + include/windows/linux/netfilter/nfnetlink_cttimeout.h \ include/windows/linux/pkt_sched.h \ include/windows/linux/types.h \ include/windows/net/if.h \ diff --git a/include/windows/linux/netfilter/nfnetlink_cttimeout.h b/include/windows/linux/netfilter/nfnetlink_cttimeout.h new file mode 100644 index 000000000000..e69de29bb2d1 diff --git a/lib/ct-dpif.c b/lib/ct-dpif.c index 6ea7feb0ee35..7f9ce0a561f7 100644 --- a/lib/ct-dpif.c +++ b/lib/ct-dpif.c @@ -760,3 +760,107 @@ ct_dpif_format_zone_limits(uint32_t default_limit, ds_put_format(ds, ",count=%"PRIu32, zone_limit->count); } } + +static const char *const ct_dpif_tp_attr_string[] = { +#define CT_DPIF_TP_TCP_ATTR(ATTR) \ + [CT_DPIF_TP_ATTR_TCP_##ATTR] = "TCP_"#ATTR, + CT_DPIF_TP_TCP_ATTRS +#undef CT_DPIF_TP_TCP_ATTR +#define CT_DPIF_TP_UDP_ATTR(ATTR) \ + [CT_DPIF_TP_ATTR_UDP_##ATTR] = "UDP_"#ATTR, + CT_DPIF_TP_UDP_ATTRS +#undef CT_DPIF_TP_UDP_ATTR +#define CT_DPIF_TP_ICMP_ATTR(ATTR) \ + [CT_DPIF_TP_ATTR_ICMP_##ATTR] = 
"ICMP_"#ATTR, + CT_DPIF_TP_ICMP_ATTRS +#undef CT_DPIF_TP_ICMP_ATTR +}; + +static bool +ct_dpif_set_timeout_policy_attr(struct ct_dpif_timeout_policy *tp, + uint32_t attr, uint32_t value) +{ + if (tp->present & (1 << attr) && tp->attrs[attr] == value) { + return false; + } + tp->attrs[attr] = value; + tp->present |= 1 << attr; + return true; +} + +/* Sets a timeout value identified by '*name' to 'value'. + * Returns true if the attribute is changed */ +bool +ct_dpif_set_timeout_policy_attr_by_name(struct ct_dpif_timeout_policy *tp, + const char *name, uint32_t value) +{ + uint32_t i; + + for (i = 0; i < CT_DPIF_TP_ATTR_MAX; ++i) { + if (!strcasecmp(name, ct_dpif_tp_attr_string[i])) { + return ct_dpif_set_timeout_policy_attr(tp, i, value); + } + } + return false; +} + +bool +ct_dpif_timeout_policy_support_ipproto(uint8_t ipproto) +{ + if (ipproto == IPPROTO_TCP || ipproto == IPPROTO_UDP || + ipproto == IPPROTO_ICMP || ipproto == IPPROTO_ICMPV6) { + return true; + } + return false; +} + +int +ct_dpif_set_timeout_policy(struct dpif *dpif, + const struct ct_dpif_timeout_policy *tp) +{ + return (dpif->dpif_class->ct_set_timeout_policy + ? dpif->dpif_class->ct_set_timeout_policy(dpif, tp) + : EOPNOTSUPP); +} + +int +ct_dpif_del_timeout_policy(struct dpif *dpif, uint32_t tp_id) +{ + return (dpif->dpif_class->ct_del_timeout_policy + ? dpif->dpif_class->ct_del_timeout_policy(dpif, tp_id) + : EOPNOTSUPP); +} + +int +ct_dpif_get_timeout_policy(struct dpif *dpif, uint32_t tp_id, + struct ct_dpif_timeout_policy *tp) +{ + return (dpif->dpif_class->ct_get_timeout_policy + ? dpif->dpif_class->ct_get_timeout_policy( + dpif, tp_id, tp) : EOPNOTSUPP); +} + +int +ct_dpif_timeout_policy_dump_start(struct dpif *dpif, void **statep) +{ + return (dpif->dpif_class->ct_timeout_policy_dump_start + ? 
dpif->dpif_class->ct_timeout_policy_dump_start(dpif, statep) + : EOPNOTSUPP); +} + +int +ct_dpif_timeout_policy_dump_next(struct dpif *dpif, void *state, + struct ct_dpif_timeout_policy *tp) +{ + return (dpif->dpif_class->ct_timeout_policy_dump_next + ? dpif->dpif_class->ct_timeout_policy_dump_next(dpif, state, tp) + : EOPNOTSUPP); +} + +int +ct_dpif_timeout_policy_dump_done(struct dpif *dpif, void *state) +{ + return (dpif->dpif_class->ct_timeout_policy_dump_done + ? dpif->dpif_class->ct_timeout_policy_dump_done(dpif, state) + : EOPNOTSUPP); +} diff --git a/lib/ct-dpif.h b/lib/ct-dpif.h index 2f4906817946..aabd6962f2c0 100644 --- a/lib/ct-dpif.h +++ b/lib/ct-dpif.h @@ -225,6 +225,50 @@ struct ct_dpif_zone_limit { struct ovs_list node; }; +#define CT_DPIF_TP_TCP_ATTRS \ + CT_DPIF_TP_TCP_ATTR(SYN_SENT) \ + CT_DPIF_TP_TCP_ATTR(SYN_RECV) \ + CT_DPIF_TP_TCP_ATTR(ESTABLISHED) \ + CT_DPIF_TP_TCP_ATTR(FIN_WAIT) \ + CT_DPIF_TP_TCP_ATTR(CLOSE_WAIT) \ + CT_DPIF_TP_TCP_ATTR(LAST_ACK) \ + CT_DPIF_TP_TCP_ATTR(TIME_WAIT) \ + CT_DPIF_TP_TCP_ATTR(CLOSE) \ + CT_DPIF_TP_TCP_ATTR(SYN_SENT2) \ + CT_DPIF_TP_TCP_ATTR(RETRANSMIT) \ + CT_DPIF_TP_TCP_ATTR(UNACK) + +#define CT_DPIF_TP_UDP_ATTRS \ + CT_DPIF_TP_UDP_ATTR(FIRST) \ + CT_DPIF_TP_UDP_ATTR(SINGLE) \ + CT_DPIF_TP_UDP_ATTR(MULTIPLE) + +#define CT_DPIF_TP_ICMP_ATTRS \ + CT_DPIF_TP_ICMP_ATTR(FIRST) \ + CT_DPIF_TP_ICMP_ATTR(REPLY) + +enum OVS_PACKED_ENUM ct_dpif_tp_attr { +#define CT_DPIF_TP_TCP_ATTR(ATTR) CT_DPIF_TP_ATTR_TCP_##ATTR, + CT_DPIF_TP_TCP_ATTRS +#undef CT_DPIF_TP_TCP_ATTR +#define CT_DPIF_TP_UDP_ATTR(ATTR) CT_DPIF_TP_ATTR_UDP_##ATTR, + CT_DPIF_TP_UDP_ATTRS +#undef CT_DPIF_TP_UDP_ATTR +#define CT_DPIF_TP_ICMP_ATTR(ATTR) CT_DPIF_TP_ATTR_ICMP_##ATTR, + CT_DPIF_TP_ICMP_ATTRS +#undef CT_DPIF_TP_ICMP_ATTR + CT_DPIF_TP_ATTR_MAX +}; + +struct ct_dpif_timeout_policy { + uint32_t id; /* Unique identifier for the timeout policy in + * the datapath. 
*/ + uint32_t present; /* If a timeout attribute is present set the + * corresponding CT_DPIF_TP_ATTR_* mapping bit. */ + uint32_t attrs[CT_DPIF_TP_ATTR_MAX]; /* An array that specifies + * timeout attribute values */ +}; + int ct_dpif_dump_start(struct dpif *, struct ct_dpif_dump_state **, const uint16_t *zone, int *); int ct_dpif_dump_next(struct ct_dpif_dump_state *, struct ct_dpif_entry *); @@ -262,5 +306,17 @@ bool ct_dpif_parse_zone_limit_tuple(const char *s, uint16_t *pzone, uint32_t *plimit, struct ds *); void ct_dpif_format_zone_limits(uint32_t default_limit, const struct ovs_list *, struct ds *); +bool ct_dpif_set_timeout_policy_attr_by_name(struct ct_dpif_timeout_policy *tp, + const char *key, uint32_t value); +bool ct_dpif_timeout_policy_support_ipproto(uint8_t ipproto); +int ct_dpif_set_timeout_policy(struct dpif *dpif, + const struct ct_dpif_timeout_policy *tp); +int ct_dpif_get_timeout_policy(struct dpif *dpif, uint32_t tp_id, + struct ct_dpif_timeout_policy *tp); +int ct_dpif_del_timeout_policy(struct dpif *dpif, uint32_t tp_id); +int ct_dpif_timeout_policy_dump_start(struct dpif *dpif, void **statep); +int ct_dpif_timeout_policy_dump_next(struct dpif *dpif, void *state, + struct ct_dpif_timeout_policy *tp); +int ct_dpif_timeout_policy_dump_done(struct dpif *dpif, void *state); #endif /* CT_DPIF_H */ diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index d0a1c58adace..2079e368fb52 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -7529,6 +7529,12 @@ const struct dpif_class dpif_netdev_class = { NULL, /* ct_set_limits */ NULL, /* ct_get_limits */ NULL, /* ct_del_limits */ + NULL, /* ct_set_timeout_policy */ + NULL, /* ct_get_timeout_policy */ + NULL, /* ct_del_timeout_policy */ + NULL, /* ct_timeout_policy_dump_start */ + NULL, /* ct_timeout_policy_dump_next */ + NULL, /* ct_timeout_policy_dump_done */ dpif_netdev_ipf_set_enabled, dpif_netdev_ipf_set_min_frag, dpif_netdev_ipf_set_max_nfrags, diff --git a/lib/dpif-netlink.c 
b/lib/dpif-netlink.c index 7bc71d6d19d7..c2ac19dff887 100644 --- a/lib/dpif-netlink.c +++ b/lib/dpif-netlink.c @@ -50,6 +50,7 @@ #include "odp-util.h" #include "openvswitch/dynamic-string.h" #include "openvswitch/flow.h" +#include "openvswitch/hmap.h" #include "openvswitch/match.h" #include "openvswitch/ofpbuf.h" #include "openvswitch/poll-loop.h" @@ -3023,6 +3024,468 @@ dpif_netlink_ct_del_limits(struct dpif *dpif OVS_UNUSED, ofpbuf_delete(request); return err; } + +#define NL_TP_NAME_PREFIX "ovs_tp_" + +struct dpif_netlink_timeout_policy_protocol { + uint16_t l3num; + uint8_t l4num; +}; + +enum OVS_PACKED_ENUM dpif_netlink_support_timeout_policy_protocol { + DPIF_NL_TP_AF_INET_TCP, + DPIF_NL_TP_AF_INET_UDP, + DPIF_NL_TP_AF_INET_ICMP, + DPIF_NL_TP_AF_INET6_TCP, + DPIF_NL_TP_AF_INET6_UDP, + DPIF_NL_TP_AF_INET6_ICMPV6, + DPIF_NL_TP_MAX +}; + +#define DPIF_NL_ALL_TP ((1UL << DPIF_NL_TP_MAX) - 1) + + +static struct dpif_netlink_timeout_policy_protocol tp_protos[] = { + [DPIF_NL_TP_AF_INET_TCP] = { .l3num = AF_INET, .l4num = IPPROTO_TCP }, + [DPIF_NL_TP_AF_INET_UDP] = { .l3num = AF_INET, .l4num = IPPROTO_UDP }, + [DPIF_NL_TP_AF_INET_ICMP] = { .l3num = AF_INET, .l4num = IPPROTO_ICMP }, + [DPIF_NL_TP_AF_INET6_TCP] = { .l3num = AF_INET6, .l4num = IPPROTO_TCP }, + [DPIF_NL_TP_AF_INET6_UDP] = { .l3num = AF_INET6, .l4num = IPPROTO_UDP }, + [DPIF_NL_TP_AF_INET6_ICMPV6] = { .l3num = AF_INET6, + .l4num = IPPROTO_ICMPV6 }, +}; + +static void +dpif_netlink_format_tp_name(uint32_t id, uint16_t l3num, uint8_t l4num, + struct ds *tp_name) +{ + ds_clear(tp_name); + ds_put_format(tp_name, "%s%"PRIu32"_", NL_TP_NAME_PREFIX, id); + ct_dpif_format_ipproto(tp_name, l4num); + + if (l3num == AF_INET) { + ds_put_cstr(tp_name, "4"); + } else if (l3num == AF_INET6 && l4num != IPPROTO_ICMPV6) { + ds_put_cstr(tp_name, "6"); + } + + ovs_assert(tp_name->length < CTNL_TIMEOUT_NAME_MAX); +} + +#define CT_DPIF_NL_TP_TCP_MAPPINGS \ + CT_DPIF_NL_TP_MAPPING(TCP, TCP, SYN_SENT, SYN_SENT) \ + 
CT_DPIF_NL_TP_MAPPING(TCP, TCP, SYN_RECV, SYN_RECV) \ + CT_DPIF_NL_TP_MAPPING(TCP, TCP, ESTABLISHED, ESTABLISHED) \ + CT_DPIF_NL_TP_MAPPING(TCP, TCP, FIN_WAIT, FIN_WAIT) \ + CT_DPIF_NL_TP_MAPPING(TCP, TCP, CLOSE_WAIT, CLOSE_WAIT) \ + CT_DPIF_NL_TP_MAPPING(TCP, TCP, LAST_ACK, LAST_ACK) \ + CT_DPIF_NL_TP_MAPPING(TCP, TCP, TIME_WAIT, TIME_WAIT) \ + CT_DPIF_NL_TP_MAPPING(TCP, TCP, CLOSE, CLOSE) \ + CT_DPIF_NL_TP_MAPPING(TCP, TCP, SYN_SENT2, SYN_SENT2) \ + CT_DPIF_NL_TP_MAPPING(TCP, TCP, RETRANSMIT, RETRANS) \ + CT_DPIF_NL_TP_MAPPING(TCP, TCP, UNACK, UNACK) + +#define CT_DPIF_NL_TP_UDP_MAPPINGS \ + CT_DPIF_NL_TP_MAPPING(UDP, UDP, SINGLE, UNREPLIED) \ + CT_DPIF_NL_TP_MAPPING(UDP, UDP, MULTIPLE, REPLIED) + +#define CT_DPIF_NL_TP_ICMP_MAPPINGS \ + CT_DPIF_NL_TP_MAPPING(ICMP, ICMP, FIRST, TIMEOUT) + +#define CT_DPIF_NL_TP_ICMPV6_MAPPINGS \ + CT_DPIF_NL_TP_MAPPING(ICMP, ICMPV6, FIRST, TIMEOUT) + + +#define CT_DPIF_NL_TP_MAPPING(PROTO1, PROTO2, ATTR1, ATTR2) \ +if (tp->present & (1 << CT_DPIF_TP_ATTR_##PROTO1##_##ATTR1)) { \ + nl_tp->present |= 1 << CTA_TIMEOUT_##PROTO2##_##ATTR2; \ + nl_tp->attrs[CTA_TIMEOUT_##PROTO2##_##ATTR2] = \ + tp->attrs[CT_DPIF_TP_ATTR_##PROTO1##_##ATTR1]; \ +} + +static void +dpif_netlink_get_nl_tp_tcp_attrs(const struct ct_dpif_timeout_policy *tp, + struct nl_ct_timeout_policy *nl_tp) +{ + CT_DPIF_NL_TP_TCP_MAPPINGS +} + +static void +dpif_netlink_get_nl_tp_udp_attrs(const struct ct_dpif_timeout_policy *tp, + struct nl_ct_timeout_policy *nl_tp) +{ + CT_DPIF_NL_TP_UDP_MAPPINGS +} + +static void +dpif_netlink_get_nl_tp_icmp_attrs(const struct ct_dpif_timeout_policy *tp, + struct nl_ct_timeout_policy *nl_tp) +{ + CT_DPIF_NL_TP_ICMP_MAPPINGS +} + +static void +dpif_netlink_get_nl_tp_icmpv6_attrs(const struct ct_dpif_timeout_policy *tp, + struct nl_ct_timeout_policy *nl_tp) +{ + CT_DPIF_NL_TP_ICMPV6_MAPPINGS +} + +#undef CT_DPIF_NL_TP_MAPPING + +static void +dpif_netlink_get_nl_tp_attrs(const struct ct_dpif_timeout_policy *tp, + uint8_t l4num, struct 
nl_ct_timeout_policy *nl_tp) +{ + nl_tp->present = 0; + + if (l4num == IPPROTO_TCP) { + dpif_netlink_get_nl_tp_tcp_attrs(tp, nl_tp); + } else if (l4num == IPPROTO_UDP) { + dpif_netlink_get_nl_tp_udp_attrs(tp, nl_tp); + } else if (l4num == IPPROTO_ICMP) { + dpif_netlink_get_nl_tp_icmp_attrs(tp, nl_tp); + } else if (l4num == IPPROTO_ICMPV6) { + dpif_netlink_get_nl_tp_icmpv6_attrs(tp, nl_tp); + } +} + +#define CT_DPIF_NL_TP_MAPPING(PROTO1, PROTO2, ATTR1, ATTR2) \ +if (nl_tp->present & (1 << CTA_TIMEOUT_##PROTO2##_##ATTR2)) { \ + if (tp->present & (1 << CT_DPIF_TP_ATTR_##PROTO1##_##ATTR1)) { \ + if (tp->attrs[CT_DPIF_TP_ATTR_##PROTO1##_##ATTR1] != \ + nl_tp->attrs[CTA_TIMEOUT_##PROTO2##_##ATTR2]) { \ + VLOG_WARN_RL(&error_rl, "Inconsistent timeout policy %s " \ + "attribute %s=%"PRIu32" while %s=%"PRIu32, \ + nl_tp->name, "CTA_TIMEOUT_"#PROTO2"_"#ATTR2, \ + nl_tp->attrs[CTA_TIMEOUT_##PROTO2##_##ATTR2], \ + "CT_DPIF_TP_ATTR_"#PROTO1"_"#ATTR1, \ + tp->attrs[CT_DPIF_TP_ATTR_##PROTO1##_##ATTR1]); \ + } \ + } else { \ + tp->present |= 1 << CT_DPIF_TP_ATTR_##PROTO1##_##ATTR1; \ + tp->attrs[CT_DPIF_TP_ATTR_##PROTO1##_##ATTR1] = \ + nl_tp->attrs[CTA_TIMEOUT_##PROTO2##_##ATTR2]; \ + } \ +} + +static void +dpif_netlink_set_ct_dpif_tp_tcp_attrs(const struct nl_ct_timeout_policy *nl_tp, + struct ct_dpif_timeout_policy *tp) +{ + CT_DPIF_NL_TP_TCP_MAPPINGS +} + +static void +dpif_netlink_set_ct_dpif_tp_udp_attrs(const struct nl_ct_timeout_policy *nl_tp, + struct ct_dpif_timeout_policy *tp) +{ + CT_DPIF_NL_TP_UDP_MAPPINGS +} + +static void +dpif_netlink_set_ct_dpif_tp_icmp_attrs( + const struct nl_ct_timeout_policy *nl_tp, + struct ct_dpif_timeout_policy *tp) +{ + CT_DPIF_NL_TP_ICMP_MAPPINGS +} + +static void +dpif_netlink_set_ct_dpif_tp_icmpv6_attrs( + const struct nl_ct_timeout_policy *nl_tp, + struct ct_dpif_timeout_policy *tp) +{ + CT_DPIF_NL_TP_ICMPV6_MAPPINGS +} + +#undef CT_DPIF_NL_TP_MAPPING + +static void +dpif_netlink_set_ct_dpif_tp_attrs(const struct nl_ct_timeout_policy 
*nl_tp, + struct ct_dpif_timeout_policy *tp) +{ + if (nl_tp->l4num == IPPROTO_TCP) { + dpif_netlink_set_ct_dpif_tp_tcp_attrs(nl_tp, tp); + } else if (nl_tp->l4num == IPPROTO_UDP) { + dpif_netlink_set_ct_dpif_tp_udp_attrs(nl_tp, tp); + } else if (nl_tp->l4num == IPPROTO_ICMP) { + dpif_netlink_set_ct_dpif_tp_icmp_attrs(nl_tp, tp); + } else if (nl_tp->l4num == IPPROTO_ICMPV6) { + dpif_netlink_set_ct_dpif_tp_icmpv6_attrs(nl_tp, tp); + } +} + +#ifdef _WIN32 +static int +dpif_netlink_ct_set_timeout_policy(struct dpif *dpif OVS_UNUSED, + const struct ct_dpif_timeout_policy *tp) +{ + return EOPNOTSUPP; +} + +static int +dpif_netlink_ct_get_timeout_policy(struct dpif *dpif OVS_UNUSED, + uint32_t tp_id, + struct ct_dpif_timeout_policy *tp) +{ + return EOPNOTSUPP; +} + +static int +dpif_netlink_ct_del_timeout_policy(struct dpif *dpif OVS_UNUSED, + uint32_t tp_id) +{ + return EOPNOTSUPP; +} + +static int +dpif_netlink_ct_timeout_policy_dump_start(struct dpif *dpif OVS_UNUSED, + void **statep) +{ + return EOPNOTSUPP; +} + +static int +dpif_netlink_ct_timeout_policy_dump_next(struct dpif *dpif OVS_UNUSED, + void *state, + struct ct_dpif_timeout_policy **tp) +{ + return EOPNOTSUPP; +} + +static int +dpif_netlink_ct_timeout_policy_dump_done(struct dpif *dpif OVS_UNUSED, + void *state) +{ + return EOPNOTSUPP; +} +#else +static int +dpif_netlink_ct_set_timeout_policy(struct dpif *dpif OVS_UNUSED, + const struct ct_dpif_timeout_policy *tp) +{ + struct nl_ct_timeout_policy nl_tp; + struct ds nl_tp_name = DS_EMPTY_INITIALIZER; + int i, err = 0; + + for (i = 0; i < ARRAY_SIZE(tp_protos); ++i) { + dpif_netlink_format_tp_name(tp->id, tp_protos[i].l3num, + tp_protos[i].l4num, &nl_tp_name); + ovs_strlcpy(nl_tp.name, ds_cstr(&nl_tp_name), sizeof nl_tp.name); + nl_tp.l3num = tp_protos[i].l3num; + nl_tp.l4num = tp_protos[i].l4num; + dpif_netlink_get_nl_tp_attrs(tp, tp_protos[i].l4num, &nl_tp); + err = nl_ct_set_timeout_policy(&nl_tp); + if (err) { + VLOG_WARN_RL(&error_rl, "failed to add 
timeout policy %s (%s)", + nl_tp.name, ovs_strerror(err)); + goto out; + } + } + +out: + ds_destroy(&nl_tp_name); + return err; +} + +static int +dpif_netlink_ct_get_timeout_policy(struct dpif *dpif OVS_UNUSED, + uint32_t tp_id, + struct ct_dpif_timeout_policy *tp) +{ + struct nl_ct_timeout_policy nl_tp; + struct ds nl_tp_name = DS_EMPTY_INITIALIZER; + int i, err = 0; + + tp->id = tp_id; + tp->present = 0; + for (i = 0; i < ARRAY_SIZE(tp_protos); ++i) { + dpif_netlink_format_tp_name(tp_id, tp_protos[i].l3num, + tp_protos[i].l4num, &nl_tp_name); + err = nl_ct_get_timeout_policy(ds_cstr(&nl_tp_name), &nl_tp); + + if (err) { + VLOG_WARN_RL(&error_rl, "failed to get timeout policy %s (%s)", + nl_tp.name, ovs_strerror(err)); + goto out; + } + dpif_netlink_set_ct_dpif_tp_attrs(&nl_tp, tp); + } + +out: + ds_destroy(&nl_tp_name); + return err; +} + +/* Returns 0 if all the sub timeout policies are deleted or + * not exist in the kernel. */ +static int +dpif_netlink_ct_del_timeout_policy(struct dpif *dpif OVS_UNUSED, + uint32_t tp_id) +{ + struct ds nl_tp_name = DS_EMPTY_INITIALIZER; + int i, err = 0; + + for (i = 0; i < ARRAY_SIZE(tp_protos); ++i) { + dpif_netlink_format_tp_name(tp_id, tp_protos[i].l3num, + tp_protos[i].l4num, &nl_tp_name); + err = nl_ct_del_timeout_policy(ds_cstr(&nl_tp_name)); + if (err == ENOENT) { + err = 0; + } + if (err) { + VLOG_WARN_RL(&error_rl, "failed to delete timeout policy %s (%s)", + ds_cstr(&nl_tp_name), ovs_strerror(err)); + goto out; + } + } + +out: + ds_destroy(&nl_tp_name); + return err; +} + +struct dpif_netlink_ct_timeout_policy_dump_state { + struct nl_ct_timeout_policy_dump_state *nl_dump_state; + struct hmap tp_dump_map; +}; + +struct dpif_netlink_tp_dump_node { + struct hmap_node hmap_node; /* node in tp_dump_map. 
*/ + struct ct_dpif_timeout_policy *tp; + uint32_t l3_l4_present; +}; + +static struct dpif_netlink_tp_dump_node * +get_dpif_netlink_tp_dump_node_by_tp_id(uint32_t tp_id, + struct hmap *tp_dump_map) +{ + struct dpif_netlink_tp_dump_node *tp_dump_node; + + HMAP_FOR_EACH_WITH_HASH (tp_dump_node, hmap_node, hash_int(tp_id, 0), + tp_dump_map) { + if (tp_dump_node->tp->id == tp_id) { + return tp_dump_node; + } + } + return NULL; +} + +static void +update_dpif_netlink_tp_dump_node( + const struct nl_ct_timeout_policy *nl_tp, + struct dpif_netlink_tp_dump_node *tp_dump_node) +{ + int i; + + dpif_netlink_set_ct_dpif_tp_attrs(nl_tp, tp_dump_node->tp); + for (i = 0; i < DPIF_NL_TP_MAX; ++i) { + if (nl_tp->l3num == tp_protos[i].l3num && + nl_tp->l4num == tp_protos[i].l4num) { + tp_dump_node->l3_l4_present |= 1 << i; + break; + } + } +} + +static int +dpif_netlink_ct_timeout_policy_dump_start(struct dpif *dpif OVS_UNUSED, + void **statep) +{ + struct dpif_netlink_ct_timeout_policy_dump_state *dump_state; + int err; + + *statep = dump_state = xzalloc(sizeof *dump_state); + err = nl_ct_timeout_policy_dump_start(&dump_state->nl_dump_state); + if (err) { + free(dump_state); + return err; + } + hmap_init(&dump_state->tp_dump_map); + return 0; +} + +static int +dpif_netlink_ct_timeout_policy_dump_next(struct dpif *dpif OVS_UNUSED, + void *state, + struct ct_dpif_timeout_policy *tp) +{ + struct dpif_netlink_ct_timeout_policy_dump_state *dump_state = state; + struct dpif_netlink_tp_dump_node *tp_dump_node; + int err; + + do { + struct nl_ct_timeout_policy nl_tp; + uint32_t tp_id; + + err = nl_ct_timeout_policy_dump_next(dump_state->nl_dump_state, + &nl_tp); + if (err) { + break; + } + + if (!ovs_scan(nl_tp.name, NL_TP_NAME_PREFIX"%"PRIu32, &tp_id)) { + continue; + } + + tp_dump_node = get_dpif_netlink_tp_dump_node_by_tp_id( + tp_id, &dump_state->tp_dump_map); + if (!tp_dump_node) { + tp_dump_node = xzalloc(sizeof *tp_dump_node); + tp_dump_node->tp = xzalloc(sizeof *tp_dump_node->tp); 
+ tp_dump_node->tp->id = tp_id; + hmap_insert(&dump_state->tp_dump_map, &tp_dump_node->hmap_node, + hash_int(tp_id, 0)); + } + + update_dpif_netlink_tp_dump_node(&nl_tp, tp_dump_node); + if (tp_dump_node->l3_l4_present == DPIF_NL_ALL_TP) { + hmap_remove(&dump_state->tp_dump_map, &tp_dump_node->hmap_node); + *tp = *tp_dump_node->tp; + free(tp_dump_node->tp); + free(tp_dump_node); + break; + } + } while (true); + + /* Dump the incomplete timeout policy. */ + if (err == EOF) { + if (!hmap_is_empty(&dump_state->tp_dump_map)) { + struct hmap_node *hmap_node = hmap_first(&dump_state->tp_dump_map); + + hmap_remove(&dump_state->tp_dump_map, hmap_node); + tp_dump_node = CONTAINER_OF(hmap_node, + struct dpif_netlink_tp_dump_node, hmap_node); + *tp = *tp_dump_node->tp; + free(tp_dump_node->tp); + free(tp_dump_node); + return 0; + } + } + + return err; +} + +static int +dpif_netlink_ct_timeout_policy_dump_done(struct dpif *dpif OVS_UNUSED, + void *state) +{ + struct dpif_netlink_ct_timeout_policy_dump_state *dump_state = state; + struct dpif_netlink_tp_dump_node *tp_dump_node; + int err; + + err = nl_ct_timeout_policy_dump_done(dump_state->nl_dump_state); + HMAP_FOR_EACH_POP (tp_dump_node, hmap_node, &dump_state->tp_dump_map) { + free(tp_dump_node->tp); + free(tp_dump_node); + } + hmap_destroy(&dump_state->tp_dump_map); + free(dump_state); + return err; +} +#endif + /* Meters */ @@ -3429,6 +3892,12 @@ const struct dpif_class dpif_netlink_class = { dpif_netlink_ct_set_limits, dpif_netlink_ct_get_limits, dpif_netlink_ct_del_limits, + dpif_netlink_ct_set_timeout_policy, + dpif_netlink_ct_get_timeout_policy, + dpif_netlink_ct_del_timeout_policy, + dpif_netlink_ct_timeout_policy_dump_start, + dpif_netlink_ct_timeout_policy_dump_next, + dpif_netlink_ct_timeout_policy_dump_done, NULL, /* ipf_set_enabled */ NULL, /* ipf_set_min_frag */ NULL, /* ipf_set_max_nfrags */ diff --git a/lib/dpif-netlink.h b/lib/dpif-netlink.h index 0a9628088275..24294bc42dc3 100644 --- a/lib/dpif-netlink.h 
+++ b/lib/dpif-netlink.h
@@ -20,7 +20,6 @@
 #include <stdbool.h>
 #include <stddef.h>
 #include <stdint.h>
-#include "odp-netlink.h"
 
 #include "flow.h"
 
diff --git a/lib/dpif-provider.h b/lib/dpif-provider.h
index 12898b9e3c6d..e988626ea05b 100644
--- a/lib/dpif-provider.h
+++ b/lib/dpif-provider.h
@@ -80,6 +80,7 @@ dpif_flow_dump_thread_init(struct dpif_flow_dump_thread *thread,
 struct ct_dpif_dump_state;
 struct ct_dpif_entry;
 struct ct_dpif_tuple;
+struct ct_dpif_timeout_policy;
 
 /* 'dpif_ipf_proto_status' and 'dpif_ipf_status' are presently in
  * sync with 'ipf_proto_status' and 'ipf_status', but more
@@ -498,6 +499,49 @@ struct dpif_class {
      * list of 'struct ct_dpif_zone_limit' entries. */
     int (*ct_del_limits)(struct dpif *, const struct ovs_list *zone_limits);
 
+    /* Connection tracking timeout policy */
+
+    /* A connection tracking timeout policy contains a list of timeout
+     * attributes that specify timeout values on various connection states.
+     * In a datapath, the timeout policy is identified by a 4-byte unsigned
+     * integer. Unsupported timeout attributes are ignored. When a
+     * connection is committed it can be associated with a timeout
+     * policy, or it defaults to the datapath's default timeout policy. */
+
+    /* Sets timeout policy '*tp' into the datapath. */
+    int (*ct_set_timeout_policy)(struct dpif *,
+                                 const struct ct_dpif_timeout_policy *tp);
+    /* Gets a timeout policy specified by tp_id and stores it into '*tp'. */
+    int (*ct_get_timeout_policy)(struct dpif *, uint32_t tp_id,
+                                 struct ct_dpif_timeout_policy *tp);
+    /* Deletes a timeout policy identified by 'tp_id'. */
+    int (*ct_del_timeout_policy)(struct dpif *, uint32_t tp_id);
+
+    /* Conntrack timeout policy dumping interface.
+     *
+     * These functions provide a datapath-agnostic dumping interface
+     * to the conntrack timeout policy provided by the datapaths.
+     *
+     * ct_timeout_policy_dump_start() should put in '*statep' a pointer to
+     * a newly allocated structure that will be passed by the caller to
+     * ct_timeout_policy_dump_next() and ct_timeout_policy_dump_done().
+     *
+     * ct_timeout_policy_dump_next() attempts to retrieve another timeout
+     * policy from 'dpif' for 'state', which was initialized by a successful
+     * call to ct_timeout_policy_dump_start(). On success, stores a new
+     * timeout policy into 'tp' and returns 0. Returns EOF if the last
+     * timeout policy has been dumped, or a positive errno value on error.
+     * This function will not be called again once it returns nonzero once
+     * for a given iteration (but the ct_timeout_policy_dump_done() will
+     * be called afterward).
+     *
+     * ct_timeout_policy_dump_done() should perform any cleanup necessary
+     * (including deallocating the 'state' structure, if applicable). */
+    int (*ct_timeout_policy_dump_start)(struct dpif *, void **statep);
+    int (*ct_timeout_policy_dump_next)(struct dpif *, void *state,
+                                       struct ct_dpif_timeout_policy *tp);
+    int (*ct_timeout_policy_dump_done)(struct dpif *, void *state);
+
     /* IP Fragmentation. */
 
     /* Disables or enables conntrack fragment reassembly.
The default diff --git a/lib/netlink-conntrack.c b/lib/netlink-conntrack.c index 7631ba5d5d31..828e4a5a84c1 100644 --- a/lib/netlink-conntrack.c +++ b/lib/netlink-conntrack.c @@ -840,6 +840,314 @@ nl_ct_parse_helper(struct nlattr *nla, struct ct_dpif_helper *helper) return parsed; } +static int nl_ct_timeout_policy_max_attr[] = { + [IPPROTO_TCP] = CTA_TIMEOUT_TCP_MAX, + [IPPROTO_UDP] = CTA_TIMEOUT_UDP_MAX, + [IPPROTO_ICMP] = CTA_TIMEOUT_ICMP_MAX, + [IPPROTO_ICMPV6] = CTA_TIMEOUT_ICMPV6_MAX +}; + +static void +nl_ct_set_timeout_policy_attr(struct nl_ct_timeout_policy *nl_tp, + uint32_t attr, uint32_t val) +{ + nl_tp->present |= 1 << attr; + nl_tp->attrs[attr] = val; +} + +static int +nl_ct_parse_tcp_timeout_policy_data(struct nlattr *nla, + struct nl_ct_timeout_policy *nl_tp) +{ + static const struct nl_policy policy[] = { + [CTA_TIMEOUT_TCP_SYN_SENT] = { .type = NL_A_BE32, + .optional = false }, + [CTA_TIMEOUT_TCP_SYN_RECV] = { .type = NL_A_BE32, + .optional = false }, + [CTA_TIMEOUT_TCP_ESTABLISHED] = { .type = NL_A_BE32, + .optional = false }, + [CTA_TIMEOUT_TCP_FIN_WAIT] = { .type = NL_A_BE32, + .optional = false }, + [CTA_TIMEOUT_TCP_CLOSE_WAIT] = { .type = NL_A_BE32, + .optional = false }, + [CTA_TIMEOUT_TCP_LAST_ACK] = { .type = NL_A_BE32, + .optional = false }, + [CTA_TIMEOUT_TCP_TIME_WAIT] = { .type = NL_A_BE32, + .optional = false }, + [CTA_TIMEOUT_TCP_CLOSE] = { .type = NL_A_BE32, + .optional = false }, + [CTA_TIMEOUT_TCP_SYN_SENT2] = { .type = NL_A_BE32, + .optional = false }, + [CTA_TIMEOUT_TCP_RETRANS] = { .type = NL_A_BE32, + .optional = false }, + [CTA_TIMEOUT_TCP_UNACK] = { .type = NL_A_BE32, + .optional = false }, + }; + struct nlattr *attrs[ARRAY_SIZE(policy)]; + int i; + + if (!nl_parse_nested(nla, policy, attrs, ARRAY_SIZE(policy))) { + VLOG_ERR_RL(&rl, "Could not parse nested tcp timeout options. 
" + "Possibly incompatible Linux kernel version."); + return EINVAL; + } + + for (i = CTA_TIMEOUT_TCP_SYN_SENT; i <= CTA_TIMEOUT_TCP_UNACK; i++) { + nl_ct_set_timeout_policy_attr(nl_tp, i, + ntohl(nl_attr_get_be32(attrs[i]))); + } + return 0; +} + +static int +nl_ct_parse_udp_timeout_policy_data(struct nlattr *nla, + struct nl_ct_timeout_policy *nl_tp) +{ + static const struct nl_policy policy[] = { + [CTA_TIMEOUT_UDP_UNREPLIED] = { .type = NL_A_BE32, + .optional = false }, + [CTA_TIMEOUT_UDP_REPLIED] = { .type = NL_A_BE32, + .optional = false }, + }; + struct nlattr *attrs[ARRAY_SIZE(policy)]; + int i; + + if (!nl_parse_nested(nla, policy, attrs, ARRAY_SIZE(policy))) { + VLOG_ERR_RL(&rl, "Could not parse nested tcp timeout options. " + "Possibly incompatible Linux kernel version."); + return EINVAL; + } + + for (i = CTA_TIMEOUT_UDP_UNREPLIED; i <= CTA_TIMEOUT_UDP_REPLIED; i++) { + nl_ct_set_timeout_policy_attr(nl_tp, i, + ntohl(nl_attr_get_be32(attrs[i]))); + } + return 0; +} + +static int +nl_ct_parse_icmp_timeout_policy_data(struct nlattr *nla, + struct nl_ct_timeout_policy *nl_tp) +{ + static const struct nl_policy policy[] = { + [CTA_TIMEOUT_ICMP_TIMEOUT] = { .type = NL_A_BE32, + .optional = false }, + }; + struct nlattr *attrs[ARRAY_SIZE(policy)]; + + if (!nl_parse_nested(nla, policy, attrs, ARRAY_SIZE(policy))) { + VLOG_ERR_RL(&rl, "Could not parse nested icmp timeout options. 
" + "Possibly incompatible Linux kernel version."); + return EINVAL; + } + + nl_ct_set_timeout_policy_attr( + nl_tp, CTA_TIMEOUT_ICMP_TIMEOUT, + ntohl(nl_attr_get_be32(attrs[CTA_TIMEOUT_ICMP_TIMEOUT]))); + return 0; +} + +static int +nl_ct_parse_icmpv6_timeout_policy_data(struct nlattr *nla, + struct nl_ct_timeout_policy *nl_tp) +{ + static const struct nl_policy policy[] = { + [CTA_TIMEOUT_ICMPV6_TIMEOUT] = { .type = NL_A_BE32, + .optional = false }, + }; + struct nlattr *attrs[ARRAY_SIZE(policy)]; + + if (!nl_parse_nested(nla, policy, attrs, ARRAY_SIZE(policy))) { + VLOG_ERR_RL(&rl, "Could not parse nested icmpv6 timeout options. " + "Possibly incompatible Linux kernel version."); + return EINVAL; + } + + nl_ct_set_timeout_policy_attr( + nl_tp, CTA_TIMEOUT_ICMPV6_TIMEOUT, + ntohl(nl_attr_get_be32(attrs[CTA_TIMEOUT_ICMPV6_TIMEOUT]))); + return 0; +} + +static int +nl_ct_parse_timeout_policy_data(struct nlattr *nla, + struct nl_ct_timeout_policy *nl_tp) +{ + switch (nl_tp->l4num) { + case IPPROTO_TCP: + return nl_ct_parse_tcp_timeout_policy_data(nla, nl_tp); + case IPPROTO_UDP: + return nl_ct_parse_udp_timeout_policy_data(nla, nl_tp); + case IPPROTO_ICMP: + return nl_ct_parse_icmp_timeout_policy_data(nla, nl_tp); + case IPPROTO_ICMPV6: + return nl_ct_parse_icmpv6_timeout_policy_data(nla, nl_tp); + default: + return EINVAL; + } +} + +static int +nl_ct_timeout_policy_from_ofpbuf(struct ofpbuf *buf, + struct nl_ct_timeout_policy *nl_tp, + bool default_tp) +{ + static const struct nl_policy policy[] = { + [CTA_TIMEOUT_NAME] = { .type = NL_A_STRING, .optional = false }, + [CTA_TIMEOUT_L3PROTO] = { .type = NL_A_BE16, .optional = false }, + [CTA_TIMEOUT_L4PROTO] = { .type = NL_A_U8, .optional = false }, + [CTA_TIMEOUT_DATA] = { .type = NL_A_NESTED, .optional = false } + }; + static const struct nl_policy policy_default_tp[] = { + [CTA_TIMEOUT_L3PROTO] = { .type = NL_A_BE16, .optional = false }, + [CTA_TIMEOUT_L4PROTO] = { .type = NL_A_U8, .optional = false }, + 
[CTA_TIMEOUT_DATA] = { .type = NL_A_NESTED, .optional = false } + }; + + struct nlattr *attrs[ARRAY_SIZE(policy)]; + struct ofpbuf b = ofpbuf_const_initializer(buf->data, buf->size); + struct nlmsghdr *nlmsg = ofpbuf_try_pull(&b, sizeof *nlmsg); + struct nfgenmsg *nfmsg = ofpbuf_try_pull(&b, sizeof *nfmsg); + int err; + + if (!nlmsg || !nfmsg + || NFNL_SUBSYS_ID(nlmsg->nlmsg_type) != NFNL_SUBSYS_CTNETLINK_TIMEOUT + || nfmsg->version != NFNETLINK_V0 + || !nl_policy_parse(&b, 0, default_tp ? policy_default_tp : policy, + attrs, default_tp ? ARRAY_SIZE(policy_default_tp) : + ARRAY_SIZE(policy))) { + return EINVAL; + } + + if (!default_tp) { + ovs_strlcpy(nl_tp->name, nl_attr_get_string(attrs[CTA_TIMEOUT_NAME]), + sizeof nl_tp->name); + } + nl_tp->l3num = ntohs(nl_attr_get_be16(attrs[CTA_TIMEOUT_L3PROTO])); + nl_tp->l4num = nl_attr_get_u8(attrs[CTA_TIMEOUT_L4PROTO]); + nl_tp->present = 0; + + err = nl_ct_parse_timeout_policy_data(attrs[CTA_TIMEOUT_DATA], nl_tp); + return err; +} + +int +nl_ct_set_timeout_policy(const struct nl_ct_timeout_policy *nl_tp) +{ + struct ofpbuf buf; + size_t offset; + int i, err; + + ofpbuf_init(&buf, 512); + nl_msg_put_nfgenmsg(&buf, 0, AF_UNSPEC, NFNL_SUBSYS_CTNETLINK_TIMEOUT, + IPCTNL_MSG_TIMEOUT_NEW, NLM_F_REQUEST | NLM_F_CREATE + | NLM_F_ACK | NLM_F_REPLACE); + + nl_msg_put_string(&buf, CTA_TIMEOUT_NAME, nl_tp->name); + nl_msg_put_be16(&buf, CTA_TIMEOUT_L3PROTO, htons(nl_tp->l3num)); + nl_msg_put_u8(&buf, CTA_TIMEOUT_L4PROTO, nl_tp->l4num); + + offset = nl_msg_start_nested(&buf, CTA_TIMEOUT_DATA); + for (i = 1; i <= nl_ct_timeout_policy_max_attr[nl_tp->l4num]; ++i) { + if (nl_tp->present & 1 << i) { + nl_msg_put_be32(&buf, i, htonl(nl_tp->attrs[i])); + } + } + nl_msg_end_nested(&buf, offset); + + err = nl_transact(NETLINK_NETFILTER, &buf, NULL); + ofpbuf_uninit(&buf); + return err; +} + +int +nl_ct_get_timeout_policy(const char *tp_name, + struct nl_ct_timeout_policy *nl_tp) +{ + struct ofpbuf request, *reply; + int err; + + 
ofpbuf_init(&request, 512); + nl_msg_put_nfgenmsg(&request, 0, AF_UNSPEC, NFNL_SUBSYS_CTNETLINK_TIMEOUT, + IPCTNL_MSG_TIMEOUT_GET, NLM_F_REQUEST | NLM_F_ACK); + nl_msg_put_string(&request, CTA_TIMEOUT_NAME, tp_name); + err = nl_transact(NETLINK_NETFILTER, &request, &reply); + if (err) { + goto out; + } + + err = nl_ct_timeout_policy_from_ofpbuf(reply, nl_tp, false); + +out: + ofpbuf_uninit(&request); + ofpbuf_delete(reply); + return err; +} + +int +nl_ct_del_timeout_policy(const char *tp_name) +{ + struct ofpbuf buf; + int err; + + ofpbuf_init(&buf, 64); + nl_msg_put_nfgenmsg(&buf, 0, AF_UNSPEC, NFNL_SUBSYS_CTNETLINK_TIMEOUT, + IPCTNL_MSG_TIMEOUT_DELETE, NLM_F_REQUEST | NLM_F_ACK); + + nl_msg_put_string(&buf, CTA_TIMEOUT_NAME, tp_name); + err = nl_transact(NETLINK_NETFILTER, &buf, NULL); + ofpbuf_uninit(&buf); + return err; +} + +struct nl_ct_timeout_policy_dump_state { + struct nl_dump dump; + struct ofpbuf buf; +}; + +int +nl_ct_timeout_policy_dump_start( + struct nl_ct_timeout_policy_dump_state **statep) +{ + struct ofpbuf request; + struct nl_ct_timeout_policy_dump_state *state; + + *statep = state = xzalloc(sizeof *state); + ofpbuf_init(&request, 512); + nl_msg_put_nfgenmsg(&request, 0, AF_UNSPEC, NFNL_SUBSYS_CTNETLINK_TIMEOUT, + IPCTNL_MSG_TIMEOUT_GET, + NLM_F_REQUEST | NLM_F_ACK | NLM_F_DUMP); + + nl_dump_start(&state->dump, NETLINK_NETFILTER, &request); + ofpbuf_uninit(&request); + ofpbuf_init(&state->buf, NL_DUMP_BUFSIZE); + return 0; +} + +int +nl_ct_timeout_policy_dump_next(struct nl_ct_timeout_policy_dump_state *state, + struct nl_ct_timeout_policy *nl_tp) +{ + struct ofpbuf reply; + int err; + + if (!nl_dump_next(&state->dump, &reply, &state->buf)) { + return EOF; + } + err = nl_ct_timeout_policy_from_ofpbuf(&reply, nl_tp, false); + ofpbuf_uninit(&reply); + return err; +} + +int +nl_ct_timeout_policy_dump_done(struct nl_ct_timeout_policy_dump_state *state) +{ + int err = nl_dump_done(&state->dump); + ofpbuf_uninit(&state->buf); + free(state); + return 
err; +} + /* Translate netlink entry status flags to CT_DPIF_TCP status flags. */ static uint32_t ips_status_to_dpif_flags(uint32_t status) diff --git a/lib/netlink-conntrack.h b/lib/netlink-conntrack.h index 8b536fd65ba8..81c74549bd16 100644 --- a/lib/netlink-conntrack.h +++ b/lib/netlink-conntrack.h @@ -17,6 +17,8 @@ #ifndef NETLINK_CONNTRACK_H #define NETLINK_CONNTRACK_H +#include <linux/netfilter/nfnetlink_cttimeout.h> + #include "byte-order.h" #include "compiler.h" #include "ct-dpif.h" @@ -33,10 +35,21 @@ enum nl_ct_event_type { NL_CT_EVENT_DELETE = 1 << 2, }; +#define NL_CT_TIMEOUT_POLICY_MAX_ATTR (CTA_TIMEOUT_TCP_MAX + 1) + +struct nl_ct_timeout_policy { + char name[CTNL_TIMEOUT_NAME_MAX]; + uint16_t l3num; + uint8_t l4num; + uint32_t attrs[NL_CT_TIMEOUT_POLICY_MAX_ATTR]; + uint32_t present; +}; + struct nl_ct_dump_state; +struct nl_ct_timeout_policy_dump_state; int nl_ct_dump_start(struct nl_ct_dump_state **, const uint16_t *zone, - int *ptot_bkts); + int *ptot_bkts); int nl_ct_dump_next(struct nl_ct_dump_state *, struct ct_dpif_entry *); int nl_ct_dump_done(struct nl_ct_dump_state *); @@ -44,6 +57,18 @@ int nl_ct_flush(void); int nl_ct_flush_zone(uint16_t zone); int nl_ct_flush_tuple(const struct ct_dpif_tuple *, uint16_t zone); +int nl_ct_set_timeout_policy(const struct nl_ct_timeout_policy *nl_tp); +int nl_ct_get_timeout_policy(const char *tp_name, + struct nl_ct_timeout_policy *nl_tp); +int nl_ct_del_timeout_policy(const char *tp_name); +int nl_ct_timeout_policy_dump_start( + struct nl_ct_timeout_policy_dump_state **statep); +int nl_ct_timeout_policy_dump_next( + struct nl_ct_timeout_policy_dump_state *state, + struct nl_ct_timeout_policy *nl_tp); +int nl_ct_timeout_policy_dump_done( + struct nl_ct_timeout_policy_dump_state *state); + bool nl_ct_parse_entry(struct ofpbuf *, struct ct_dpif_entry *, enum nl_ct_event_type *); void nl_ct_format_event_entry(const struct ct_dpif_entry *, diff --git a/lib/netlink-protocol.h b/lib/netlink-protocol.h index 
c0617dfad21f..ceded7915ef8 100644
--- a/lib/netlink-protocol.h
+++ b/lib/netlink-protocol.h
@@ -47,13 +47,17 @@
 #define NLM_F_ACK               0x004
 #define NLM_F_ECHO              0x008
 
+/* GET request flag.*/
 #define NLM_F_ROOT              0x100
 #define NLM_F_MATCH             0x200
-#define NLM_F_EXCL              0x200
 #define NLM_F_ATOMIC            0x400
-#define NLM_F_CREATE            0x400
 #define NLM_F_DUMP              (NLM_F_ROOT | NLM_F_MATCH)
 
+/* NEW request flags. */
+#define NLM_F_REPLACE           0x100
+#define NLM_F_EXCL              0x200
+#define NLM_F_CREATE            0x400
+
 /* nlmsg_type values. */
 #define NLMSG_NOOP              1
 #define NLMSG_ERROR             2
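The regrouping in the netlink-protocol.h hunk reflects that the 0x100..0x400 flag bits are interpreted differently for GET and NEW requests. The following is a minimal sketch, not part of the patch, that restates the values from the hunk and shows the flag words the patch passes for IPCTNL_MSG_TIMEOUT_NEW and for the dump request. NLM_F_REQUEST (0x001) is taken from the standard netlink definitions rather than from this hunk, and the two helper functions are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Common flags.  NLM_F_REQUEST is assumed from the standard netlink
 * definitions; NLM_F_ACK appears as context in the hunk. */
#define NLM_F_REQUEST 0x001
#define NLM_F_ACK     0x004

/* GET request flags, as in the hunk. */
#define NLM_F_ROOT    0x100
#define NLM_F_MATCH   0x200
#define NLM_F_DUMP    (NLM_F_ROOT | NLM_F_MATCH)

/* NEW request flags, as in the hunk. */
#define NLM_F_REPLACE 0x100
#define NLM_F_EXCL    0x200
#define NLM_F_CREATE  0x400

/* The same bit carries a different meaning depending on the request type. */
static_assert(NLM_F_REPLACE == NLM_F_ROOT, "bit 0x100 is shared");
static_assert(NLM_F_EXCL == NLM_F_MATCH, "bit 0x200 is shared");

/* Hypothetical helper: flags used with IPCTNL_MSG_TIMEOUT_NEW in
 * nl_ct_set_timeout_policy(), i.e. create the policy or replace it. */
static uint16_t
timeout_new_flags(void)
{
    return NLM_F_REQUEST | NLM_F_CREATE | NLM_F_ACK | NLM_F_REPLACE;
}

/* Hypothetical helper: flags used with IPCTNL_MSG_TIMEOUT_GET in
 * nl_ct_timeout_policy_dump_start(). */
static uint16_t
timeout_dump_flags(void)
{
    return NLM_F_REQUEST | NLM_F_ACK | NLM_F_DUMP;
}
```

Because the bits overlap, a flag word is only meaningful together with the message type it is sent with, which is presumably why the hunk groups the definitions by request type.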
This patch first defines the dpif interface for a datapath to support
adding, deleting, getting, and dumping conntrack timeout policies.
A timeout policy is identified by a 4-byte unsigned integer in the
datapath, and it currently supports timeouts for the TCP, UDP, and ICMP
protocols.

Moreover, this patch provides the implementation for the Linux kernel
datapath in dpif-netlink.

In the Linux kernel, a timeout policy is maintained per L3/L4 protocol,
and it is identified by a 32-byte null-terminated string. In vswitchd,
on the other hand, the timeout policy is a generic one that covers all
the supported L4 protocols. Therefore, one of the main tasks in
dpif-netlink is to break down the generic timeout policy into 6
sub-policies (IPv4 TCP, UDP, and ICMP, and IPv6 TCP, UDP, and ICMPv6),
and push down the configuration using the netlink API in
netlink-conntrack.c.

This patch also adds missing symbols in the Windows datapath so
that the build on Windows can pass.

Appveyor CI:
* https://ci.appveyor.com/project/YiHungWei/ovs/builds/26387754

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
---
 Documentation/faq/releases.rst                     |   3 +-
 datapath-windows/include/OvsDpInterfaceCtExt.h     | 114 +++++
 datapath-windows/ovsext/Netlink/NetlinkProto.h     |   8 +-
 include/windows/automake.mk                        |   1 +
 .../windows/linux/netfilter/nfnetlink_cttimeout.h  |   0
 lib/ct-dpif.c                                      | 104 +++++
 lib/ct-dpif.h                                      |  56 +++
 lib/dpif-netdev.c                                  |   6 +
 lib/dpif-netlink.c                                 | 469 +++++++++++++++++++++
 lib/dpif-netlink.h                                 |   1 -
 lib/dpif-provider.h                                |  44 ++
 lib/netlink-conntrack.c                            | 308 ++++++++++++++
 lib/netlink-conntrack.h                            |  27 +-
 lib/netlink-protocol.h                             |   8 +-
 14 files changed, 1142 insertions(+), 7 deletions(-)
 create mode 100644 include/windows/linux/netfilter/nfnetlink_cttimeout.h
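
For readers following the breakdown into six sub-policies, here is a minimal standalone sketch of the naming scheme that dpif_netlink_format_tp_name() appears to implement, and of recovering the policy id from a kernel name as the dump path does with ovs_scan(). It does not use the OVS 'struct ds' helpers. TP_NAME_MAX mirrors the 32-byte limit stated above, l4_name() is a hypothetical stand-in for ct_dpif_format_ipproto() (the lowercase protocol names are an assumption), and parse_tp_id() is likewise a hypothetical helper.

```c
#include <assert.h>
#include <ctype.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <netinet/in.h>     /* IPPROTO_*. */
#include <sys/socket.h>     /* AF_INET, AF_INET6. */

#define TP_NAME_PREFIX "ovs_tp_"
#define TP_NAME_MAX 32      /* Assumed to mirror CTNL_TIMEOUT_NAME_MAX. */

/* Hypothetical stand-in for ct_dpif_format_ipproto(); the exact strings
 * produced by OVS are an assumption here. */
static const char *
l4_name(uint8_t l4num)
{
    switch (l4num) {
    case IPPROTO_TCP: return "tcp";
    case IPPROTO_UDP: return "udp";
    case IPPROTO_ICMP: return "icmp";
    case IPPROTO_ICMPV6: return "icmpv6";
    default: return "unknown";
    }
}

/* Builds "<prefix><id>_<l4><suffix>".  IPv4 policies get a "4" suffix and
 * IPv6 policies get "6", except ICMPv6, whose name already implies IPv6. */
static void
format_tp_name(uint32_t id, uint16_t l3num, uint8_t l4num,
               char name[TP_NAME_MAX])
{
    const char *suffix = "";
    int len;

    if (l3num == AF_INET) {
        suffix = "4";
    } else if (l3num == AF_INET6 && l4num != IPPROTO_ICMPV6) {
        suffix = "6";
    }

    len = snprintf(name, TP_NAME_MAX, "%s%" PRIu32 "_%s%s",
                   TP_NAME_PREFIX, id, l4_name(l4num), suffix);
    /* The name must fit in the kernel's fixed-size name buffer. */
    assert(len > 0 && len < TP_NAME_MAX);
}

/* Hypothetical helper: recovers the policy id from a kernel policy name.
 * Returns 1 on success, 0 if 'name' does not follow the naming scheme. */
static int
parse_tp_id(const char *name, uint32_t *id)
{
    size_t prefix_len = strlen(TP_NAME_PREFIX);

    if (strncmp(name, TP_NAME_PREFIX, prefix_len)
        || !isdigit((unsigned char) name[prefix_len])) {
        return 0;
    }
    *id = (uint32_t) strtoul(name + prefix_len, NULL, 10);
    return 1;
}
```

Under these assumptions, policy id 5 would map to the kernel names ovs_tp_5_tcp4, ovs_tp_5_udp4, ovs_tp_5_icmp4, ovs_tp_5_tcp6, ovs_tp_5_udp6, and ovs_tp_5_icmpv6, and any kernel timeout policy whose name lacks the prefix would be skipped during a dump.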