Message ID | 1512734518-103757-10-git-send-email-mark.b.kavanagh@intel.com |
---|---|
State | Changes Requested |
Delegated to: | Ian Stokes |
Series | netdev-dpdk: support multi-segment mbufs |
Hi Mark,

For some reason, I could not apply this patch cleanly.
I couldn't do much of testing on the feature as such.
Can you please send a proper Patch after rebase.

Regards
_Sugesh

> -----Original Message-----
> From: Kavanagh, Mark B
> Sent: Friday, December 8, 2017 12:02 PM
> To: dev@openvswitch.org; qiudayu@chinac.com
> Cc: Stokes, Ian <ian.stokes@intel.com>; Loftus, Ciara <ciara.loftus@intel.com>;
> santosh.shukla@caviumnetworks.com; Chandran, Sugesh
> <sugesh.chandran@intel.com>; Kavanagh, Mark B
> <mark.b.kavanagh@intel.com>
> Subject: [ovs-dev][RFC PATCH V4 9/9] netdev-dpdk: support multi-segment
> jumbo frames
>
[Snip]
> 1.9.3

>From: Chandran, Sugesh
>Sent: Friday, December 8, 2017 6:05 PM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@openvswitch.org;
>qiudayu@chinac.com
>Cc: Stokes, Ian <ian.stokes@intel.com>; Loftus, Ciara
><ciara.loftus@intel.com>; santosh.shukla@caviumnetworks.com
>Subject: RE: [ovs-dev][RFC PATCH V4 9/9] netdev-dpdk: support multi-segment
>jumbo frames
>
>Hi Mark,
>
>For some reason, I could not apply this patch cleanly.

Apologies for this Sugesh, I'll send an updated version soon.
-Mark

>I couldn't do much of testing on the feature as such.
>Can you please send a proper Patch after rebase.
>
>Regards
>_Sugesh
>
>
>> -----Original Message-----
>> From: Kavanagh, Mark B
>> Sent: Friday, December 8, 2017 12:02 PM
>> To: dev@openvswitch.org; qiudayu@chinac.com
>> Cc: Stokes, Ian <ian.stokes@intel.com>; Loftus, Ciara <ciara.loftus@intel.com>;
>> santosh.shukla@caviumnetworks.com; Chandran, Sugesh
>> <sugesh.chandran@intel.com>; Kavanagh, Mark B
>> <mark.b.kavanagh@intel.com>
>> Subject: [ovs-dev][RFC PATCH V4 9/9] netdev-dpdk: support multi-segment
>> jumbo frames
>>
>
>[Snip]
>> 1.9.3

>From: Kavanagh, Mark B
>Sent: Monday, December 11, 2017 11:49 AM
>To: Chandran, Sugesh <sugesh.chandran@intel.com>; dev@openvswitch.org;
>qiudayu@chinac.com
>Cc: Stokes, Ian <ian.stokes@intel.com>; Loftus, Ciara
><ciara.loftus@intel.com>; santosh.shukla@caviumnetworks.com
>Subject: RE: [ovs-dev][RFC PATCH V4 9/9] netdev-dpdk: support multi-segment
>jumbo frames
>
>>From: Chandran, Sugesh
>>Sent: Friday, December 8, 2017 6:05 PM
>>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@openvswitch.org;
>>qiudayu@chinac.com
>>Cc: Stokes, Ian <ian.stokes@intel.com>; Loftus, Ciara
>><ciara.loftus@intel.com>; santosh.shukla@caviumnetworks.com
>>Subject: RE: [ovs-dev][RFC PATCH V4 9/9] netdev-dpdk: support multi-segment
>>jumbo frames
>>
>>Hi Mark,
>>
>>For some reason, I could not apply this patch cleanly.
>
>Apologies for this Sugesh, I'll send an updated version soon.
>-Mark

Addendum: I think I know what happened Sugesh.

This patchset was built on the DPDK v17.11 upgrade patchset (this was mentioned
in the cover letter!) - did you apply the former in advance of applying the
multi-segment patchset?

Thanks again,
Mark

>
>>I couldn't do much of testing on the feature as such.
>>Can you please send a proper Patch after rebase.
>>
>>Regards
>>_Sugesh
>>
>>
>>> -----Original Message-----
>>> From: Kavanagh, Mark B
>>> Sent: Friday, December 8, 2017 12:02 PM
>>> To: dev@openvswitch.org; qiudayu@chinac.com
>>> Cc: Stokes, Ian <ian.stokes@intel.com>; Loftus, Ciara <ciara.loftus@intel.com>;
>>> santosh.shukla@caviumnetworks.com; Chandran, Sugesh
>>> <sugesh.chandran@intel.com>; Kavanagh, Mark B
>>> <mark.b.kavanagh@intel.com>
>>> Subject: [ovs-dev][RFC PATCH V4 9/9] netdev-dpdk: support multi-segment
>>> jumbo frames
>>>
>>
>>[Snip]
>>> 1.9.3

Regards
_Sugesh

> -----Original Message-----
> From: Kavanagh, Mark B
> Sent: Monday, December 11, 2017 11:59 AM
> To: Chandran, Sugesh <sugesh.chandran@intel.com>; dev@openvswitch.org;
> qiudayu@chinac.com
> Cc: Stokes, Ian <ian.stokes@intel.com>; Loftus, Ciara <ciara.loftus@intel.com>;
> santosh.shukla@caviumnetworks.com
> Subject: RE: [ovs-dev][RFC PATCH V4 9/9] netdev-dpdk: support multi-segment
> jumbo frames
>
> >From: Kavanagh, Mark B
> >Sent: Monday, December 11, 2017 11:49 AM
> >To: Chandran, Sugesh <sugesh.chandran@intel.com>; dev@openvswitch.org;
> >qiudayu@chinac.com
> >Cc: Stokes, Ian <ian.stokes@intel.com>; Loftus, Ciara
> ><ciara.loftus@intel.com>; santosh.shukla@caviumnetworks.com
> >Subject: RE: [ovs-dev][RFC PATCH V4 9/9] netdev-dpdk: support
> >multi-segment jumbo frames
> >
> >>From: Chandran, Sugesh
> >>Sent: Friday, December 8, 2017 6:05 PM
> >>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@openvswitch.org;
> >>qiudayu@chinac.com
> >>Cc: Stokes, Ian <ian.stokes@intel.com>; Loftus, Ciara
> >><ciara.loftus@intel.com>; santosh.shukla@caviumnetworks.com
> >>Subject: RE: [ovs-dev][RFC PATCH V4 9/9] netdev-dpdk: support
> >>multi-segment jumbo frames
> >>
> >>Hi Mark,
> >>
> >>For some reason, I could not apply this patch cleanly.
> >
> >Apologies for this Sugesh, I'll send an updated version soon.
> >-Mark
>
> Addendum: I think I know what happened Sugesh.
>
> This patchset was built on the DPDK v17.11 upgrade patchset (this was
> mentioned in the cover letter!) - did you apply the former in advance of applying
> the multi-segment patchset?

[Sugesh] Apologies Mark, I do have the DPDK17.11 upgrade on my repo. But its from the previous version.
That caused the conflict.. My bad :(

>
> Thanks again,
> Mark
>
> >
> >>I couldn't do much of testing on the feature as such.
> >>Can you please send a proper Patch after rebase.
> >>
> >>Regards
> >>_Sugesh
> >>
> >>
> >>> -----Original Message-----
> >>> From: Kavanagh, Mark B
> >>> Sent: Friday, December 8, 2017 12:02 PM
> >>> To: dev@openvswitch.org; qiudayu@chinac.com
> >>> Cc: Stokes, Ian <ian.stokes@intel.com>; Loftus, Ciara <ciara.loftus@intel.com>;
> >>> santosh.shukla@caviumnetworks.com; Chandran, Sugesh
> >>> <sugesh.chandran@intel.com>; Kavanagh, Mark B
> >>> <mark.b.kavanagh@intel.com>
> >>> Subject: [ovs-dev][RFC PATCH V4 9/9] netdev-dpdk: support
> >>> multi-segment jumbo frames
> >>>
> >>
> >>[Snip]
> >>> 1.9.3

diff --git a/NEWS b/NEWS
index d45904e..74a8910 100644
--- a/NEWS
+++ b/NEWS
@@ -18,6 +18,7 @@ Post-v2.8.0
    - DPDK:
      * Add support for DPDK v17.11
      * Add support for vHost IOMMU
+     * Add support for multi-segment mbufs
 
 v2.8.0 - 31 Aug 2017
 --------------------
diff --git a/lib/dpdk.c b/lib/dpdk.c
index 6710d10..5023d1a 100644
--- a/lib/dpdk.c
+++ b/lib/dpdk.c
@@ -456,6 +456,13 @@ dpdk_init__(const struct smap *ovs_other_config)
 
     /* Finally, register the dpdk classes */
     netdev_dpdk_register();
+
+    bool multi_seg_mbufs_enable = smap_get_bool(ovs_other_config,
+            "dpdk-multi-seg-mbufs", false);
+    if (multi_seg_mbufs_enable) {
+        VLOG_INFO("DPDK multi-segment mbufs enabled\n");
+        netdev_dpdk_multi_segment_mbufs_enable();
+    }
 }
 
 void
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index f83bb9e..a819a8f 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -65,6 +65,7 @@ enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
 
 VLOG_DEFINE_THIS_MODULE(netdev_dpdk);
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
+static bool dpdk_multi_segment_mbufs = false;
 
 #define DPDK_PORT_WATCHDOG_INTERVAL 5
 
@@ -501,6 +502,7 @@ dpdk_mp_create(struct netdev_dpdk *dev, uint16_t mbuf_pkt_data_len)
               + dev->requested_n_txq * dev->requested_txq_size
               + MIN(RTE_MAX_LCORE, dev->requested_n_rxq) * NETDEV_MAX_BURST
               + MIN_NB_MBUF;
+    /* XXX: should n_mbufs be increased if multi-seg mbufs are used? */
 
     ovs_mutex_lock(&dpdk_mp_mutex);
     do {
@@ -588,7 +590,13 @@ dpdk_mp_free(struct rte_mempool *mp)
 
 /* Tries to allocate a new mempool - or re-use an existing one where
  * appropriate - on requested_socket_id with a size determined by
- * requested_mtu and requested Rx/Tx queues.
+ * requested_mtu and requested Rx/Tx queues. Some properties of the mempool's
+ * elements are dependent on the value of 'dpdk_multi_segment_mbufs':
+ * - if 'true', then the mempool contains standard-sized mbufs that are chained
+ *   together to accommodate packets of size 'requested_mtu'.
+ * - if 'false', then the members of the allocated mempool are
+ *   non-standard-sized mbufs. Each mbuf in the mempool is large enough to
+ *   fully accommodate packets of size 'requested_mtu'.
  * On success - or when re-using an existing mempool - the new configuration
  * will be applied.
  * On error, device will be left unchanged. */
@@ -596,10 +604,18 @@ static int
 netdev_dpdk_mempool_configure(struct netdev_dpdk *dev)
     OVS_REQUIRES(dev->mutex)
 {
-    uint16_t buf_size = dpdk_buf_size(dev->requested_mtu);
+    uint16_t buf_size = 0;
     struct rte_mempool *mp;
     int ret = 0;
 
+    /* Contiguous mbufs in use - permit oversized mbufs */
+    if (!dpdk_multi_segment_mbufs) {
+        buf_size = dpdk_buf_size(dev->requested_mtu);
+    } else {
+        /* multi-segment mbufs - use standard mbuf size */
+        buf_size = dpdk_buf_size(ETHER_MTU);
+    }
+
     mp = dpdk_mp_create(dev, buf_size);
     if (!mp) {
         VLOG_ERR("Failed to create memory pool for netdev "
@@ -677,11 +693,25 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)
     int diag = 0;
     int i;
     struct rte_eth_conf conf = port_conf;
+    struct rte_eth_txconf txconf;
+
+    /* Multi-segment-mbuf-specific setup. */
+    if (dpdk_multi_segment_mbufs) {
+        struct rte_eth_dev_info dev_info;
+
+        /* DPDK PMDs typically attempt to use simple or vectorized
+         * transmit functions, neither of which are compatible with
+         * multi-segment mbufs. Ensure that these are disabled when
+         * multi-segment mbufs are enabled.
+         */
+        rte_eth_dev_info_get(dev->port_id, &dev_info);
+        txconf = dev_info.default_txconf;
+        txconf.txq_flags &= ~ETH_TXQ_FLAGS_NOMULTSEGS;
 
-    /* For some NICs (e.g. Niantic), scatter_rx mode needs to be explicitly
-     * enabled. */
-    if (dev->mtu > ETHER_MTU) {
-        conf.rxmode.enable_scatter = 1;
+        /* For some NICs (e.g. Niantic), scattered_rx mode (required for
+         * ingress jumbo frames when multi-segments are enabled) needs to
+         * be explicitly enabled. */
+        conf.rxmode.enable_scatter = 1;
     }
 
     conf.rxmode.hw_ip_checksum = (dev->hw_ol_features &
@@ -712,7 +742,9 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)
 
         for (i = 0; i < n_txq; i++) {
             diag = rte_eth_tx_queue_setup(dev->port_id, i, dev->txq_size,
-                                          dev->socket_id, NULL);
+                                          dev->socket_id,
+                                          dpdk_multi_segment_mbufs ? &txconf
+                                                                   : NULL);
             if (diag) {
                 VLOG_INFO("Interface %s txq(%d) setup error: %s",
                           dev->up.name, i, rte_strerror(-diag));
@@ -3384,6 +3416,12 @@ unlock:
     return err;
 }
 
+void
+netdev_dpdk_multi_segment_mbufs_enable(void)
+{
+    dpdk_multi_segment_mbufs = true;
+}
+
 #define NETDEV_DPDK_CLASS(NAME, INIT, CONSTRUCT, DESTRUCT,    \
                           SET_CONFIG, SET_TX_MULTIQ, SEND,    \
                           GET_CARRIER, GET_STATS,             \
diff --git a/lib/netdev-dpdk.h b/lib/netdev-dpdk.h
index b7d02a7..a3339fe 100644
--- a/lib/netdev-dpdk.h
+++ b/lib/netdev-dpdk.h
@@ -25,6 +25,7 @@ struct dp_packet;
 
 #ifdef DPDK_NETDEV
 
+void netdev_dpdk_multi_segment_mbufs_enable(void);
 void netdev_dpdk_register(void);
 void free_dpdk_buf(struct dp_packet *);
 
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 4c317d0..ccce944 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -331,6 +331,26 @@
         </p>
       </column>
 
+      <column name="other_config" key="dpdk-multi-seg-mbufs"
+              type='{"type": "boolean"}'>
+        <p>
+          Specifies if DPDK uses multi-segment mbufs for handling jumbo frames.
+        </p>
+        <p>
+            If true, DPDK allocates a single mempool per port, irrespective
+            of the ports' requested MTU sizes. The elements of this mempool are
+            'standard'-sized mbufs (typically ~2k bytes), which may be chained
+            together to accommodate jumbo frames. In this approach, each mbuf
+            typically stores a fragment of the overall jumbo frame.
+        </p>
+        <p>
+            If not specified, defaults to <code>false</code>, in which case,
+            the size of each mbuf within a DPDK port's mempool will be grown to
+            accommodate jumbo frames within a single mbuf.
+        </p>
+      </column>
+
+
       <column name="other_config" key="vhost-sock-dir"
               type='{"type": "string"}'>
         <p>
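
To make the effect of the buf_size selection in netdev_dpdk_mempool_configure() concrete, here is a minimal standalone sketch of how many mbufs a single frame would occupy in each mode. It is not part of the patch: the helper names and the SKETCH_STD_SEG_LEN value of 2048 bytes are assumptions for illustration, and the real dpdk_buf_size() also accounts for headroom and rounding, so actual figures will differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative value only: data room assumed for a 'standard' mbuf. */
#define SKETCH_STD_SEG_LEN 2048u

/* Hypothetical helper: data length each mbuf in the pool must hold.
 * Mirrors the if/else in the hunk: the full frame when single-segment
 * mbufs are used, the standard size when multi-segment mbufs are used. */
static uint32_t
sketch_mbuf_data_len(bool multi_seg, uint32_t frame_len)
{
    return multi_seg ? SKETCH_STD_SEG_LEN : frame_len;
}

/* Hypothetical helper: mbufs consumed by one frame of 'frame_len' bytes.
 * 'frame_len' must be non-zero. */
static uint32_t
sketch_segs_per_frame(bool multi_seg, uint32_t frame_len)
{
    uint32_t seg_len = sketch_mbuf_data_len(multi_seg, frame_len);

    assert(seg_len > 0);
    return (frame_len + seg_len - 1) / seg_len;   /* ceiling division */
}
```

Under these assumptions a 9018-byte frame occupies one oversized mbuf in single-segment mode and five standard mbufs in multi-segment mode, which is the kind of increase the XXX comment in dpdk_mp_create() asks about.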

Currently, jumbo frame support for OvS-DPDK is implemented by
increasing the size of mbufs within a mempool, such that each mbuf
within the pool is large enough to contain an entire jumbo frame of
a user-defined size. Typically, for each user-defined MTU,
'requested_mtu', a new mempool is created, containing mbufs of size
~requested_mtu.

With the multi-segment approach, a port uses a single mempool,
(containing standard/default-sized mbufs of ~2k bytes), irrespective
of the user-requested MTU value. To accommodate jumbo frames, mbufs
are chained together, where each mbuf in the chain stores a portion of
the jumbo frame. Each mbuf in the chain is termed a segment, hence the
name.

== Enabling multi-segment mbufs ==
Multi-segment and single-segment mbufs are mutually exclusive, and the
user must decide on which approach to adopt on init. The introduction
of a new OVSDB field, 'dpdk-multi-seg-mbufs', facilitates this. This
is a global boolean value, which determines how jumbo frames are
represented across all DPDK ports. In the absence of a user-supplied
value, 'dpdk-multi-seg-mbufs' defaults to false, i.e. multi-segment
mbufs must be explicitly enabled / single-segment mbufs remain the
default.

Setting the field is identical to setting existing DPDK-specific OVSDB
fields:

    ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
    ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=0x10
    ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-mem=4096,0
==> ovs-vsctl set Open_vSwitch . other_config:dpdk-multi-seg-mbufs=true

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 NEWS                 |  1 +
 lib/dpdk.c           |  7 +++++++
 lib/netdev-dpdk.c    | 52 +++++++++++++++++++++++++++++++++++++++++++++-------
 lib/netdev-dpdk.h    |  1 +
 vswitchd/vswitch.xml | 20 ++++++++++++++++++++
 5 files changed, 74 insertions(+), 7 deletions(-)
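
The chaining described in the commit message can be pictured with a small standalone sketch. The struct below is a hypothetical stand-in, not DPDK's struct rte_mbuf; its field names (next, data_len, pkt_len, nb_segs) are borrowed from rte_mbuf, but the types and the helper functions are assumptions made so that the example builds without DPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the few rte_mbuf fields used here. */
struct sketch_mbuf {
    struct sketch_mbuf *next;  /* next segment, NULL on the last one */
    uint16_t data_len;         /* bytes stored in this segment */
    uint32_t pkt_len;          /* total packet bytes (head segment only) */
    uint16_t nb_segs;          /* segments in the chain (head segment only) */
};

/* Appends 'tail' to the chain starting at 'head' and updates the
 * totals kept in the head segment. */
static void
sketch_chain(struct sketch_mbuf *head, struct sketch_mbuf *tail)
{
    struct sketch_mbuf *last = head;

    while (last->next) {
        last = last->next;
    }
    last->next = tail;
    tail->next = NULL;
    head->pkt_len += tail->data_len;
    head->nb_segs++;
}

/* Sums data_len over every segment; for a consistent chain the result
 * equals the head's pkt_len. */
static uint32_t
sketch_chain_bytes(const struct sketch_mbuf *head)
{
    uint32_t total = 0;

    for (const struct sketch_mbuf *m = head; m; m = m->next) {
        total += m->data_len;
    }
    return total;
}
```

For example, a 5000-byte frame split over segments of 2048, 2048 and 904 bytes yields a chain of three segments whose head reports pkt_len 5000, and any code that touches the payload has to walk next rather than assume one contiguous buffer.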