From patchwork Tue Nov 21 18:29:10 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mark Kavanagh
X-Patchwork-Id: 840148
X-Patchwork-Delegate: ian.stokes@intel.com
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org;
spf=pass (mailfrom) smtp.mailfrom=openvswitch.org
(client-ip=140.211.169.12; helo=mail.linuxfoundation.org;
envelope-from=ovs-dev-bounces@openvswitch.org;
receiver=)
Received: from mail.linuxfoundation.org (mail.linuxfoundation.org
[140.211.169.12])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
bits)) (No client certificate requested)
by ozlabs.org (Postfix) with ESMTPS id 3yhDdx37rnz9t31
for ;
Wed, 22 Nov 2017 05:29:57 +1100 (AEDT)
Received: from mail.linux-foundation.org (localhost [127.0.0.1])
by mail.linuxfoundation.org (Postfix) with ESMTP id 36B80AE1;
Tue, 21 Nov 2017 18:29:24 +0000 (UTC)
X-Original-To: dev@openvswitch.org
Delivered-To: ovs-dev@mail.linuxfoundation.org
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
[172.17.192.35])
by mail.linuxfoundation.org (Postfix) with ESMTPS id D4AEEAD2
for ; Tue, 21 Nov 2017 18:29:21 +0000 (UTC)
X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6
Received: from mga07.intel.com (mga07.intel.com [134.134.136.100])
by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 37A548A
for ; Tue, 21 Nov 2017 18:29:21 +0000 (UTC)
Received: from orsmga001.jf.intel.com ([10.7.209.18])
by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
21 Nov 2017 10:29:21 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.44,432,1505804400"; d="scan'208";a="7930083"
Received: from silpixa00380299.ir.intel.com ([10.237.222.17])
by orsmga001.jf.intel.com with ESMTP; 21 Nov 2017 10:29:19 -0800
From: Mark Kavanagh
To: dev@openvswitch.org,
qiudayu@chinac.com
Date: Tue, 21 Nov 2017 18:29:10 +0000
Message-Id: <1511288957-68599-2-git-send-email-mark.b.kavanagh@intel.com>
X-Mailer: git-send-email 1.9.3
In-Reply-To: <1511288957-68599-1-git-send-email-mark.b.kavanagh@intel.com>
References: <1511288957-68599-1-git-send-email-mark.b.kavanagh@intel.com>
X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED,
T_RP_MATCHES_RCVD autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
smtp1.linux-foundation.org
Subject: [ovs-dev] [RFC PATCH v3 1/8] netdev-dpdk: simplify mbuf sizing
X-BeenThere: ovs-dev@openvswitch.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Sender: ovs-dev-bounces@openvswitch.org
Errors-To: ovs-dev-bounces@openvswitch.org
When calculating the mbuf data_room_size (i.e. the size of actual
packet data that an mbuf can accommodate), it is possible to simply
use the value calculated by dpdk_buf_size() as a parameter to
rte_pktmbuf_pool_create(). This simplifies mbuf sizing considerably.
This patch removes the related size conversions and macros, which
are no longer needed.
The benefits of this approach are threefold:
- the mbuf sizing code is much simpler, and more readable.
- mbuf size will always be cache-aligned [1], satisfying that
requirement of specific PMDs (vNIC thunderx, for example).
- the maximum amount of data that each mbuf contains may now be
calculated as mbuf->buf_len - mbuf->data_off. This is important
in the case of multi-segment jumbo frames.
[1] (this is true since mbuf size is now always a multiple
of 1024, + 128B RTE_PKTMBUF_HEADROOM + 704B dp_packet).
Fixes: 4be4d22 ("netdev-dpdk: clean up mbuf initialization")
Fixes: 31b88c9 ("netdev-dpdk: round up mbuf_size to cache_line_size")
Signed-off-by: Mark Kavanagh
---
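As a rough illustration of the sizing arithmetic in the commit message, here
is a minimal standalone sketch. It is not OVS code: every sketch_ name is
hypothetical, the 26-byte maximum header length and the 64-byte cache line are
assumptions, and the 128B/704B figures are simply taken from footnote [1].

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, for illustration only; the real ones come from the OVS
 * and DPDK headers and from the build configuration. */
#define SKETCH_ETHER_HDR_MAX_LEN 26   /* 14B header + 4B CRC + 2 x 4B VLAN. */
#define SKETCH_HEADROOM          128  /* RTE_PKTMBUF_HEADROOM, footnote [1]. */
#define SKETCH_DP_PACKET_SIZE    704  /* struct dp_packet, footnote [1]. */
#define SKETCH_MBUF_ALIGN        1024 /* NETDEV_DPDK_MBUF_ALIGN. */
#define SKETCH_CACHE_LINE        64   /* Assumed cache line size. */

/* Round 'x' up to the next multiple of 'y'. */
static inline uint32_t
sketch_round_up(uint32_t x, uint32_t y)
{
    assert(y != 0);
    return (x + y - 1) / y * y;
}

/* Same shape as dpdk_buf_size(): maximum frame length plus headroom,
 * rounded up to the mbuf alignment. */
static inline uint32_t
sketch_buf_size(uint32_t mtu)
{
    return sketch_round_up(mtu + SKETCH_ETHER_HDR_MAX_LEN + SKETCH_HEADROOM,
                           SKETCH_MBUF_ALIGN);
}

/* Per-mbuf footprint as described in footnote [1]: a multiple of 1024,
 * plus headroom, plus the dp_packet structure. */
static inline uint32_t
sketch_mbuf_footprint(uint32_t mtu)
{
    return sketch_buf_size(mtu) + SKETCH_HEADROOM + SKETCH_DP_PACKET_SIZE;
}
```

Under these assumptions, sketch_mbuf_footprint() is a multiple of the cache
line size for any MTU, because 1024 and 128 + 704 = 832 are both multiples
of 64.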
lib/netdev-dpdk.c | 22 ++++++++--------------
1 file changed, 8 insertions(+), 14 deletions(-)
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 8906423..c5eb851 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -81,12 +81,6 @@ static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
+ (2 * VLAN_HEADER_LEN))
#define MTU_TO_FRAME_LEN(mtu) ((mtu) + ETHER_HDR_LEN + ETHER_CRC_LEN)
#define MTU_TO_MAX_FRAME_LEN(mtu) ((mtu) + ETHER_HDR_MAX_LEN)
-#define FRAME_LEN_TO_MTU(frame_len) ((frame_len) \
- - ETHER_HDR_LEN - ETHER_CRC_LEN)
-#define MBUF_SIZE(mtu) ROUND_UP((MTU_TO_MAX_FRAME_LEN(mtu) \
- + sizeof(struct dp_packet) \
- + RTE_PKTMBUF_HEADROOM), \
- RTE_CACHE_LINE_SIZE)
#define NETDEV_DPDK_MBUF_ALIGN 1024
#define NETDEV_DPDK_MAX_PKT_LEN 9728
@@ -447,7 +441,7 @@ is_dpdk_class(const struct netdev_class *class)
* behaviour, which reduces performance. To prevent this, use a buffer size
* that is closest to 'mtu', but which satisfies the aforementioned criteria.
*/
-static uint32_t
+static uint16_t
dpdk_buf_size(int mtu)
{
return ROUND_UP((MTU_TO_MAX_FRAME_LEN(mtu) + RTE_PKTMBUF_HEADROOM),
@@ -486,7 +480,7 @@ ovs_rte_pktmbuf_init(struct rte_mempool *mp OVS_UNUSED,
* - a new mempool was just created;
* - a matching mempool already exists. */
static struct rte_mempool *
-dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
+dpdk_mp_create(struct netdev_dpdk *dev, uint16_t frame_len)
{
char mp_name[RTE_MEMPOOL_NAMESIZE];
const char *netdev_name = netdev_get_name(&dev->up);
@@ -513,12 +507,12 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
* longer than RTE_MEMPOOL_NAMESIZE. */
int ret = snprintf(mp_name, RTE_MEMPOOL_NAMESIZE,
"ovs%08x%02d%05d%07u",
- hash, socket_id, mtu, n_mbufs);
+ hash, socket_id, frame_len, n_mbufs);
if (ret < 0 || ret >= RTE_MEMPOOL_NAMESIZE) {
VLOG_DBG("snprintf returned %d. "
"Failed to generate a mempool name for \"%s\". "
- "Hash:0x%x, socket_id: %d, mtu:%d, mbufs:%u.",
- ret, netdev_name, hash, socket_id, mtu, n_mbufs);
+ "Hash:0x%x, socket_id: %d, frame length:%d, mbufs:%u.",
+ ret, netdev_name, hash, socket_id, frame_len, n_mbufs);
break;
}
@@ -529,7 +523,7 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
mp = rte_pktmbuf_pool_create(mp_name, n_mbufs, MP_CACHE_SZ,
sizeof (struct dp_packet) - sizeof (struct rte_mbuf),
- MBUF_SIZE(mtu) - sizeof(struct dp_packet), socket_id);
+ frame_len + RTE_PKTMBUF_HEADROOM, socket_id);
if (mp) {
VLOG_DBG("Allocated \"%s\" mempool with %u mbufs",
@@ -582,11 +576,11 @@ static int
netdev_dpdk_mempool_configure(struct netdev_dpdk *dev)
OVS_REQUIRES(dev->mutex)
{
- uint32_t buf_size = dpdk_buf_size(dev->requested_mtu);
+ uint16_t buf_size = dpdk_buf_size(dev->requested_mtu);
struct rte_mempool *mp;
int ret = 0;
- mp = dpdk_mp_create(dev, FRAME_LEN_TO_MTU(buf_size));
+ mp = dpdk_mp_create(dev, buf_size);
if (!mp) {
VLOG_ERR("Failed to create memory pool for netdev "
"%s, with MTU %d on socket %d: %s\n",
From patchwork Tue Nov 21 18:29:11 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mark Kavanagh
X-Patchwork-Id: 840149
X-Patchwork-Delegate: ian.stokes@intel.com
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org;
spf=pass (mailfrom) smtp.mailfrom=openvswitch.org
(client-ip=140.211.169.12; helo=mail.linuxfoundation.org;
envelope-from=ovs-dev-bounces@openvswitch.org;
receiver=)
Received: from mail.linuxfoundation.org (mail.linuxfoundation.org
[140.211.169.12])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
bits)) (No client certificate requested)
by ozlabs.org (Postfix) with ESMTPS id 3yhDgB50LZz9t8T
for ;
Wed, 22 Nov 2017 05:30:36 +1100 (AEDT)
Received: from mail.linux-foundation.org (localhost [127.0.0.1])
by mail.linuxfoundation.org (Postfix) with ESMTP id 1483CB0A;
Tue, 21 Nov 2017 18:29:25 +0000 (UTC)
X-Original-To: dev@openvswitch.org
Delivered-To: ovs-dev@mail.linuxfoundation.org
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
[172.17.192.35])
by mail.linuxfoundation.org (Postfix) with ESMTPS id 765EBAB6
for ; Tue, 21 Nov 2017 18:29:23 +0000 (UTC)
X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6
Received: from mga07.intel.com (mga07.intel.com [134.134.136.100])
by smtp1.linuxfoundation.org (Postfix) with ESMTPS id E4812478
for ; Tue, 21 Nov 2017 18:29:22 +0000 (UTC)
Received: from orsmga001.jf.intel.com ([10.7.209.18])
by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
21 Nov 2017 10:29:22 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.44,432,1505804400"; d="scan'208";a="7930093"
Received: from silpixa00380299.ir.intel.com ([10.237.222.17])
by orsmga001.jf.intel.com with ESMTP; 21 Nov 2017 10:29:21 -0800
From: Mark Kavanagh
To: dev@openvswitch.org,
qiudayu@chinac.com
Date: Tue, 21 Nov 2017 18:29:11 +0000
Message-Id: <1511288957-68599-3-git-send-email-mark.b.kavanagh@intel.com>
X-Mailer: git-send-email 1.9.3
In-Reply-To: <1511288957-68599-1-git-send-email-mark.b.kavanagh@intel.com>
References: <1511288957-68599-1-git-send-email-mark.b.kavanagh@intel.com>
X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED,
T_RP_MATCHES_RCVD autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
smtp1.linux-foundation.org
Subject: [ovs-dev] [RFC PATCH v3 2/8] lib/dp-packet: init specific mbuf
fields to 0
X-BeenThere: ovs-dev@openvswitch.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Sender: ovs-dev-bounces@openvswitch.org
Errors-To: ovs-dev-bounces@openvswitch.org
dp_packets are created using xmalloc(); in the case of OvS-DPDK, it's
possible that the resultant mbuf portion of the dp_packet contains
random data. For some mbuf fields, specifically those related to
multi-segment mbufs and/or offload features, random values may cause
unexpected behaviour, should the dp_packet's contents be later copied
to a DPDK mbuf. It is therefore critical that these fields be
initialized to 0.
This patch ensures that the following mbuf fields are initialized to 0,
on creation of a new dp_packet:
- ol_flags
- nb_segs
- tx_offload
- packet_type
Adapted from an idea by Michael Qiu:
https://patchwork.ozlabs.org/patch/777570/
Signed-off-by: Mark Kavanagh
Acked-by: Michael Qiu
---
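To illustrate why the explicit initialization matters, here is a minimal
sketch using a hypothetical stand-in structure (struct sketch_mbuf is not
the real rte_mbuf, and its layout is an assumption). It only shows that
memory that was not zeroed leaves the named fields holding arbitrary values
until they are reset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the handful of rte_mbuf fields named above. */
struct sketch_mbuf {
    uint64_t ol_flags;
    uint64_t tx_offload;
    uint32_t packet_type;
    uint16_t nb_segs;
    uint16_t data_len;      /* Deliberately left alone by the helper. */
};

/* Reset only the fields related to multi-segment mbufs and offloads. */
static inline void
sketch_mbuf_init(struct sketch_mbuf *m)
{
    assert(m != NULL);
    m->ol_flags = 0;
    m->nb_segs = 0;
    m->tx_offload = 0;
    m->packet_type = 0;
}
```

Filling a struct sketch_mbuf with 0xff bytes (to mimic uninitialized
memory) and then calling sketch_mbuf_init() leaves the four named fields at
0 while data_len keeps its previous contents.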
lib/dp-packet.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index b4b721c..7aa440f 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -626,13 +626,13 @@ dp_packet_mbuf_rss_flag_reset(struct dp_packet *p OVS_UNUSED)
/* This initialization is needed for packets that do not come
* from DPDK interfaces, when vswitchd is built with --with-dpdk.
- * The DPDK rte library will still otherwise manage the mbuf.
- * We only need to initialize the mbuf ol_flags. */
+ * The DPDK rte library will still otherwise manage the mbuf. */
static inline void
dp_packet_mbuf_init(struct dp_packet *p OVS_UNUSED)
{
#ifdef DPDK_NETDEV
- p->mbuf.ol_flags = 0;
+ struct rte_mbuf *mbuf = &(p->mbuf);
+ mbuf->ol_flags = mbuf->nb_segs = mbuf->tx_offload = mbuf->packet_type = 0;
#endif
}
From patchwork Tue Nov 21 18:29:12 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mark Kavanagh
X-Patchwork-Id: 840150
X-Patchwork-Delegate: ian.stokes@intel.com
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org;
spf=pass (mailfrom) smtp.mailfrom=openvswitch.org
(client-ip=140.211.169.12; helo=mail.linuxfoundation.org;
envelope-from=ovs-dev-bounces@openvswitch.org;
receiver=)
Received: from mail.linuxfoundation.org (mail.linuxfoundation.org
[140.211.169.12])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
bits)) (No client certificate requested)
by ozlabs.org (Postfix) with ESMTPS id 3yhDgR3k3tz9t31
for ;
Wed, 22 Nov 2017 05:31:15 +1100 (AEDT)
Received: from mail.linux-foundation.org (localhost [127.0.0.1])
by mail.linuxfoundation.org (Postfix) with ESMTP id E0EB1B9E;
Tue, 21 Nov 2017 18:29:26 +0000 (UTC)
X-Original-To: dev@openvswitch.org
Delivered-To: ovs-dev@mail.linuxfoundation.org
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
[172.17.192.35])
by mail.linuxfoundation.org (Postfix) with ESMTPS id 01E00B08
for ; Tue, 21 Nov 2017 18:29:25 +0000 (UTC)
X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6
Received: from mga07.intel.com (mga07.intel.com [134.134.136.100])
by smtp1.linuxfoundation.org (Postfix) with ESMTPS id ABF488A
for ; Tue, 21 Nov 2017 18:29:24 +0000 (UTC)
Received: from orsmga001.jf.intel.com ([10.7.209.18])
by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
21 Nov 2017 10:29:24 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.44,432,1505804400"; d="scan'208";a="7930106"
Received: from silpixa00380299.ir.intel.com ([10.237.222.17])
by orsmga001.jf.intel.com with ESMTP; 21 Nov 2017 10:29:23 -0800
From: Mark Kavanagh
To: dev@openvswitch.org,
qiudayu@chinac.com
Date: Tue, 21 Nov 2017 18:29:12 +0000
Message-Id: <1511288957-68599-4-git-send-email-mark.b.kavanagh@intel.com>
X-Mailer: git-send-email 1.9.3
In-Reply-To: <1511288957-68599-1-git-send-email-mark.b.kavanagh@intel.com>
References: <1511288957-68599-1-git-send-email-mark.b.kavanagh@intel.com>
X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED,
T_RP_MATCHES_RCVD autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
smtp1.linux-foundation.org
Subject: [ovs-dev] [RFC PATCH v3 3/8] lib/dp-packet: copy mbuf info for
packet copy
X-BeenThere: ovs-dev@openvswitch.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Sender: ovs-dev-bounces@openvswitch.org
Errors-To: ovs-dev-bounces@openvswitch.org
From: Michael Qiu
Currently, when a packet is copied, much of the DPDK mbuf's metadata
(packet type, ol_flags, etc.) is lost. This information is essential
for DPDK packet processing, so copy it along with the packet data.
Signed-off-by: Michael Qiu
[mark.b.kavanagh@intel.com rebased]
Signed-off-by: Mark Kavanagh
---
lib/dp-packet.c | 3 +++
lib/netdev-dpdk.c | 4 ++++
2 files changed, 7 insertions(+)
diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index 443c225..5078211 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -178,6 +178,9 @@ dp_packet_clone_with_headroom(const struct dp_packet *buffer, size_t headroom)
#ifdef DPDK_NETDEV
new_buffer->mbuf.ol_flags = buffer->mbuf.ol_flags;
+ new_buffer->mbuf.tx_offload = buffer->mbuf.tx_offload;
+ new_buffer->mbuf.packet_type = buffer->mbuf.packet_type;
+ new_buffer->mbuf.nb_segs = buffer->mbuf.nb_segs;
#else
new_buffer->rss_hash_valid = buffer->rss_hash_valid;
#endif
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index c5eb851..61a0dca 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1860,6 +1860,10 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch)
memcpy(rte_pktmbuf_mtod(pkts[txcnt], void *),
dp_packet_data(packet), size);
dp_packet_set_size((struct dp_packet *)pkts[txcnt], size);
+ pkts[txcnt]->nb_segs = packet->mbuf.nb_segs;
+ pkts[txcnt]->ol_flags = packet->mbuf.ol_flags;
+ pkts[txcnt]->packet_type = packet->mbuf.packet_type;
+ pkts[txcnt]->tx_offload = packet->mbuf.tx_offload;
txcnt++;
}
From patchwork Tue Nov 21 18:29:13 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mark Kavanagh
X-Patchwork-Id: 840151
X-Patchwork-Delegate: ian.stokes@intel.com
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org;
spf=pass (mailfrom) smtp.mailfrom=openvswitch.org
(client-ip=140.211.169.12; helo=mail.linuxfoundation.org;
envelope-from=ovs-dev-bounces@openvswitch.org;
receiver=)
Received: from mail.linuxfoundation.org (mail.linuxfoundation.org
[140.211.169.12])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
bits)) (No client certificate requested)
by ozlabs.org (Postfix) with ESMTPS id 3yhDh122dNz9t2f
for ;
Wed, 22 Nov 2017 05:31:45 +1100 (AEDT)
Received: from mail.linux-foundation.org (localhost [127.0.0.1])
by mail.linuxfoundation.org (Postfix) with ESMTP id C75DFBC1;
Tue, 21 Nov 2017 18:29:28 +0000 (UTC)
X-Original-To: dev@openvswitch.org
Delivered-To: ovs-dev@mail.linuxfoundation.org
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
[172.17.192.35])
by mail.linuxfoundation.org (Postfix) with ESMTPS id 5BBC9BBD
for ; Tue, 21 Nov 2017 18:29:27 +0000 (UTC)
X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6
Received: from mga07.intel.com (mga07.intel.com [134.134.136.100])
by smtp1.linuxfoundation.org (Postfix) with ESMTPS id EAC214E9
for ; Tue, 21 Nov 2017 18:29:26 +0000 (UTC)
Received: from orsmga001.jf.intel.com ([10.7.209.18])
by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
21 Nov 2017 10:29:26 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.44,432,1505804400"; d="scan'208";a="7930112"
Received: from silpixa00380299.ir.intel.com ([10.237.222.17])
by orsmga001.jf.intel.com with ESMTP; 21 Nov 2017 10:29:24 -0800
From: Mark Kavanagh
To: dev@openvswitch.org,
qiudayu@chinac.com
Date: Tue, 21 Nov 2017 18:29:13 +0000
Message-Id: <1511288957-68599-5-git-send-email-mark.b.kavanagh@intel.com>
X-Mailer: git-send-email 1.9.3
In-Reply-To: <1511288957-68599-1-git-send-email-mark.b.kavanagh@intel.com>
References: <1511288957-68599-1-git-send-email-mark.b.kavanagh@intel.com>
X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED,
T_RP_MATCHES_RCVD autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
smtp1.linux-foundation.org
Cc: Marcin Ksiadz ,
Przemyslaw Lal
Subject: [ovs-dev] [RFC PATCH v3 4/8] lib/dp-packet: Fix data_len issue with
multi-segs
X-BeenThere: ovs-dev@openvswitch.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Sender: ovs-dev-bounces@openvswitch.org
Errors-To: ovs-dev-bounces@openvswitch.org
From: Michael Qiu
When a packet originates from a DPDK source and contains multiple
segments, data_len is not equal to the packet size. This patch fixes
this by capping data_len at the space available in the first segment.
Co-authored-by: Mark Kavanagh
Co-authored-by: Przemyslaw Lal
Co-authored-by: Marcin Ksiadz
Co-authored-by: Yuanhan Liu
Signed-off-by: Michael Qiu
Signed-off-by: Mark Kavanagh
Signed-off-by: Przemyslaw Lal
Signed-off-by: Marcin Ksiadz
Signed-off-by: Yuanhan Liu
---
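The relationship between data_len and pkt_len can be sketched as follows.
This is a minimal standalone illustration, not the OVS implementation: the
struct and function names are hypothetical, and only the four fields used
by the calculation are modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the rte_mbuf fields used when sizing. */
struct sketch_mbuf {
    uint16_t buf_len;   /* Size of this segment's buffer. */
    uint16_t data_off;  /* Offset of packet data within the buffer. */
    uint16_t data_len;  /* Bytes of packet data held in this segment. */
    uint32_t pkt_len;   /* Total bytes across all chained segments. */
};

/* Set the total packet length to 'v', capping the first segment's length
 * at the room available after 'data_off'. */
static inline void
sketch_set_size(struct sketch_mbuf *m, uint32_t v)
{
    uint32_t seg_room;

    assert(m->data_off <= m->buf_len);
    seg_room = (uint32_t) m->buf_len - m->data_off;
    m->data_len = (uint16_t) (v < seg_room ? v : seg_room);
    m->pkt_len = v;
}
```

For a segment with buf_len 2176 and data_off 128, a 9000-byte packet yields
data_len 2048 and pkt_len 9000, whereas a 100-byte packet yields 100 for
both.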
lib/dp-packet.h | 18 ++++++++----------
1 file changed, 8 insertions(+), 10 deletions(-)
diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 7aa440f..c2736d3 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -23,6 +23,7 @@
#ifdef DPDK_NETDEV
#include
#include
+#include "rte_ether.h"
#endif
#include "netdev-dpdk.h"
@@ -429,17 +430,14 @@ dp_packet_size(const struct dp_packet *b)
static inline void
dp_packet_set_size(struct dp_packet *b, uint32_t v)
{
- /* netdev-dpdk does not currently support segmentation; consequently, for
- * all intents and purposes, 'data_len' (16 bit) and 'pkt_len' (32 bit) may
- * be used interchangably.
- *
- * On the datapath, it is expected that the size of packets
- * (and thus 'v') will always be <= UINT16_MAX; this means that there is no
- * loss of accuracy in assigning 'v' to 'data_len'.
+ /*
+ * Assign current segment length. If total length is greater than
+ * max data length in a segment, additional calculation is needed
*/
- b->mbuf.data_len = (uint16_t)v; /* Current seg length. */
- b->mbuf.pkt_len = v; /* Total length of all segments linked to
- * this segment. */
+ b->mbuf.data_len = MIN(v, b->mbuf.buf_len - b->mbuf.data_off);
+
+ /* Total length of all segments linked to this segment. */
+ b->mbuf.pkt_len = v;
}
static inline uint16_t
From patchwork Tue Nov 21 18:29:14 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mark Kavanagh
X-Patchwork-Id: 840152
X-Patchwork-Delegate: ian.stokes@intel.com
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org;
spf=pass (mailfrom) smtp.mailfrom=openvswitch.org
(client-ip=140.211.169.12; helo=mail.linuxfoundation.org;
envelope-from=ovs-dev-bounces@openvswitch.org;
receiver=)
Received: from mail.linuxfoundation.org (mail.linuxfoundation.org
[140.211.169.12])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
bits)) (No client certificate requested)
by ozlabs.org (Postfix) with ESMTPS id 3yhDhd64gSz9sRW
for ;
Wed, 22 Nov 2017 05:32:17 +1100 (AEDT)
Received: from mail.linux-foundation.org (localhost [127.0.0.1])
by mail.linuxfoundation.org (Postfix) with ESMTP id C7CDFBD5;
Tue, 21 Nov 2017 18:29:30 +0000 (UTC)
X-Original-To: dev@openvswitch.org
Delivered-To: ovs-dev@mail.linuxfoundation.org
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
[172.17.192.35])
by mail.linuxfoundation.org (Postfix) with ESMTPS id EDF80BC5
for ; Tue, 21 Nov 2017 18:29:28 +0000 (UTC)
X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6
Received: from mga07.intel.com (mga07.intel.com [134.134.136.100])
by smtp1.linuxfoundation.org (Postfix) with ESMTPS id A18414E9
for ; Tue, 21 Nov 2017 18:29:28 +0000 (UTC)
Received: from orsmga001.jf.intel.com ([10.7.209.18])
by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
21 Nov 2017 10:29:28 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.44,432,1505804400"; d="scan'208";a="7930127"
Received: from silpixa00380299.ir.intel.com ([10.237.222.17])
by orsmga001.jf.intel.com with ESMTP; 21 Nov 2017 10:29:27 -0800
From: Mark Kavanagh
To: dev@openvswitch.org,
qiudayu@chinac.com
Date: Tue, 21 Nov 2017 18:29:14 +0000
Message-Id: <1511288957-68599-6-git-send-email-mark.b.kavanagh@intel.com>
X-Mailer: git-send-email 1.9.3
In-Reply-To: <1511288957-68599-1-git-send-email-mark.b.kavanagh@intel.com>
References: <1511288957-68599-1-git-send-email-mark.b.kavanagh@intel.com>
X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED,
T_RP_MATCHES_RCVD autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
smtp1.linux-foundation.org
Subject: [ovs-dev] [RFC PATCH v3 5/8] lib/dp-packet: fix
dp_packet_put_uninit for multi-seg mbufs
X-BeenThere: ovs-dev@openvswitch.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Sender: ovs-dev-bounces@openvswitch.org
Errors-To: ovs-dev-bounces@openvswitch.org
dp_packet_put_uninit(dp_packet, size) appends 'size' bytes to the tail
of a dp_packet. In the case of multi-segment mbufs, it is the data
length of the last mbuf in the mbuf chain that should be adjusted by
'size' bytes.
In its current implementation, dp_packet_put_uninit() adjusts the
dp_packet's size via a call to dp_packet_set_size(); however, this
adjusts the data length of the first mbuf in the chain, which is
incorrect in the case of multi-segment mbufs. Instead, traverse the
mbuf chain to locate the final mbuf of said chain, and update its
data_len [1]. To finish, increase the packet length of the entire
mbuf [2] by 'size'.
[1] In the case of a single-segment mbuf, this is the mbuf itself.
[2] This is stored in the first mbuf of an mbuf chain.
Signed-off-by: Mark Kavanagh
---
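The chain traversal described in the commit message can be sketched in
isolation as below. The names are hypothetical and the structure models
only the three fields involved; it is meant as an illustration of the idea
rather than a drop-in for dp_packet_put_uninit().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a chained mbuf. */
struct sketch_mbuf {
    struct sketch_mbuf *next;   /* Next segment, or NULL for the last. */
    uint16_t data_len;          /* Bytes of data in this segment. */
    uint32_t pkt_len;           /* Meaningful in the first segment only. */
};

/* Account for 'size' bytes appended to the tail of the packet: the last
 * segment's data_len grows, as does the first segment's pkt_len. */
static inline void
sketch_grow_tail(struct sketch_mbuf *head, uint16_t size)
{
    struct sketch_mbuf *last = head;

    assert(head != NULL);
    while (last->next) {
        last = last->next;
    }
    last->data_len += size;
    head->pkt_len += size;
}
```

With a single-segment packet, 'head' and 'last' are the same mbuf, so both
fields of that one segment are updated, matching footnote [1].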
lib/dp-packet.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index 5078211..5c590e5 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -325,6 +325,22 @@ dp_packet_put_uninit(struct dp_packet *b, size_t size)
void *p;
dp_packet_prealloc_tailroom(b, size);
p = dp_packet_tail(b);
+#ifdef DPDK_NETDEV
+ if (b->source == DPBUF_DPDK) {
+ struct rte_mbuf *buf = &(b->mbuf);
+ /* In the case of multi-segment mbufs, the data length of the last mbuf
+ * should be adjusted by 'size' bytes. A call to dp_packet_set_size()
+ * would adjust the data length of the first mbuf in the chain, so we
+ * avoid invoking it; as a result, the packet length of the entire mbuf
+ * chain (stored in the first mbuf of said chain) must be adjusted here
+ * instead.
+ */
+ while (buf->next)
+ buf = buf->next;
+ buf->data_len += size;
+ b->mbuf.pkt_len += size;
+ } else
+#endif
dp_packet_set_size(b, dp_packet_size(b) + size);
return p;
}
From patchwork Tue Nov 21 18:29:15 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mark Kavanagh
X-Patchwork-Id: 840153
X-Patchwork-Delegate: ian.stokes@intel.com
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org;
spf=pass (mailfrom) smtp.mailfrom=openvswitch.org
(client-ip=140.211.169.12; helo=mail.linuxfoundation.org;
envelope-from=ovs-dev-bounces@openvswitch.org;
receiver=)
Received: from mail.linuxfoundation.org (mail.linuxfoundation.org
[140.211.169.12])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
bits)) (No client certificate requested)
by ozlabs.org (Postfix) with ESMTPS id 3yhDjM6m95z9sRW
for ;
Wed, 22 Nov 2017 05:32:55 +1100 (AEDT)
Received: from mail.linux-foundation.org (localhost [127.0.0.1])
by mail.linuxfoundation.org (Postfix) with ESMTP id 223D0C00;
Tue, 21 Nov 2017 18:29:33 +0000 (UTC)
X-Original-To: dev@openvswitch.org
Delivered-To: ovs-dev@mail.linuxfoundation.org
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
[172.17.192.35])
by mail.linuxfoundation.org (Postfix) with ESMTPS id A029CAF7
for ; Tue, 21 Nov 2017 18:29:30 +0000 (UTC)
X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6
Received: from mga07.intel.com (mga07.intel.com [134.134.136.100])
by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 570F88A
for ; Tue, 21 Nov 2017 18:29:30 +0000 (UTC)
Received: from orsmga001.jf.intel.com ([10.7.209.18])
by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
21 Nov 2017 10:29:30 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.44,432,1505804400"; d="scan'208";a="7930139"
Received: from silpixa00380299.ir.intel.com ([10.237.222.17])
by orsmga001.jf.intel.com with ESMTP; 21 Nov 2017 10:29:28 -0800
From: Mark Kavanagh
To: dev@openvswitch.org,
qiudayu@chinac.com
Date: Tue, 21 Nov 2017 18:29:15 +0000
Message-Id: <1511288957-68599-7-git-send-email-mark.b.kavanagh@intel.com>
X-Mailer: git-send-email 1.9.3
In-Reply-To: <1511288957-68599-1-git-send-email-mark.b.kavanagh@intel.com>
References: <1511288957-68599-1-git-send-email-mark.b.kavanagh@intel.com>
X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED,
T_RP_MATCHES_RCVD autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
smtp1.linux-foundation.org
Subject: [ovs-dev] [RFC PATCH v3 6/8] lib/dp-packet: copy data from
multi-seg. DPDK mbuf
X-BeenThere: ovs-dev@openvswitch.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Sender: ovs-dev-bounces@openvswitch.org
Errors-To: ovs-dev-bounces@openvswitch.org
From: Michael Qiu
When cloning a packet whose source is a DPDK driver, multiple
segments must be taken into account: copy each segment's data in
turn.
Signed-off-by: Michael Qiu
Signed-off-by: Mark Kavanagh
---
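Copying segment by segment into one contiguous buffer can be sketched as
follows. This is a standalone illustration with hypothetical names; it uses
plain memcpy() and a simplified segment structure instead of rte_memcpy()
and rte_mbuf, and adds a destination bounds check that is an assumption of
the sketch rather than something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one segment of a multi-segment mbuf. */
struct sketch_seg {
    const struct sketch_seg *next;  /* Next segment, or NULL. */
    const char *data;               /* Start of this segment's data. */
    uint16_t data_len;              /* Bytes of data in this segment. */
};

/* Copy every segment's data, in order, into 'dst'. Returns the number of
 * bytes copied, or 0 if 'dst' is too small to hold the whole chain. */
static inline size_t
sketch_linearize(const struct sketch_seg *head, char *dst, size_t dst_len)
{
    size_t offset = 0;

    assert(dst != NULL);
    for (const struct sketch_seg *s = head; s; s = s->next) {
        if (s->data_len > dst_len - offset) {
            return 0;
        }
        memcpy(dst + offset, s->data, s->data_len);
        offset += s->data_len;
    }
    return offset;
}
```

The running 'offset' plays the same role as in the patch: each segment's
data lands immediately after the previous segment's data in the clone.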
lib/dp-packet.c | 24 ++++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)
diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index 5c590e5..26fff02 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -166,10 +166,30 @@ struct dp_packet *
dp_packet_clone_with_headroom(const struct dp_packet *buffer, size_t headroom)
{
struct dp_packet *new_buffer;
+ uint32_t pkt_len = dp_packet_size(buffer);
+#ifdef DPDK_NETDEV
+ /* copy multi-seg data */
+ if (buffer->source == DPBUF_DPDK && buffer->mbuf.nb_segs > 1) {
+ uint32_t offset = 0;
+ void *dst = NULL;
+ struct rte_mbuf *tmbuf = CONST_CAST(struct rte_mbuf *, &(buffer->mbuf));
+
+ new_buffer = dp_packet_new_with_headroom(pkt_len, headroom);
+ dp_packet_set_size(new_buffer, pkt_len + headroom);
+ dst = dp_packet_tail(new_buffer);
+
+ while (tmbuf) {
+ rte_memcpy((char *)dst + offset,
+ rte_pktmbuf_mtod(tmbuf, void *), tmbuf->data_len);
+ offset += tmbuf->data_len;
+ tmbuf = tmbuf->next;
+ }
+ }
+ else
+#endif
new_buffer = dp_packet_clone_data_with_headroom(dp_packet_data(buffer),
- dp_packet_size(buffer),
- headroom);
+ pkt_len, headroom);
/* Copy the following fields into the returned buffer: l2_pad_size,
* l2_5_ofs, l3_ofs, l4_ofs, cutlen, packet_type and md. */
memcpy(&new_buffer->l2_pad_size, &buffer->l2_pad_size,
From patchwork Tue Nov 21 18:29:16 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mark Kavanagh
X-Patchwork-Id: 840154
X-Patchwork-Delegate: ian.stokes@intel.com
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org;
spf=pass (mailfrom) smtp.mailfrom=openvswitch.org
(client-ip=140.211.169.12; helo=mail.linuxfoundation.org;
envelope-from=ovs-dev-bounces@openvswitch.org;
receiver=)
Received: from mail.linuxfoundation.org (mail.linuxfoundation.org
[140.211.169.12])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
bits)) (No client certificate requested)
by ozlabs.org (Postfix) with ESMTPS id 3yhDjy30K6z9sRW
for ;
Wed, 22 Nov 2017 05:33:26 +1100 (AEDT)
Received: from mail.linux-foundation.org (localhost [127.0.0.1])
by mail.linuxfoundation.org (Postfix) with ESMTP id 06F5AC13;
Tue, 21 Nov 2017 18:29:35 +0000 (UTC)
X-Original-To: dev@openvswitch.org
Delivered-To: ovs-dev@mail.linuxfoundation.org
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
[172.17.192.35])
by mail.linuxfoundation.org (Postfix) with ESMTPS id 82C1DAF3
for ; Tue, 21 Nov 2017 18:29:32 +0000 (UTC)
X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6
Received: from mga07.intel.com (mga07.intel.com [134.134.136.100])
by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 0C4B6478
for ; Tue, 21 Nov 2017 18:29:32 +0000 (UTC)
Received: from orsmga001.jf.intel.com ([10.7.209.18])
by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
21 Nov 2017 10:29:31 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.44,432,1505804400"; d="scan'208";a="7930159"
Received: from silpixa00380299.ir.intel.com ([10.237.222.17])
by orsmga001.jf.intel.com with ESMTP; 21 Nov 2017 10:29:30 -0800
From: Mark Kavanagh
To: dev@openvswitch.org,
qiudayu@chinac.com
Date: Tue, 21 Nov 2017 18:29:16 +0000
Message-Id: <1511288957-68599-8-git-send-email-mark.b.kavanagh@intel.com>
X-Mailer: git-send-email 1.9.3
In-Reply-To: <1511288957-68599-1-git-send-email-mark.b.kavanagh@intel.com>
References: <1511288957-68599-1-git-send-email-mark.b.kavanagh@intel.com>
X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED,
T_RP_MATCHES_RCVD autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
smtp1.linux-foundation.org
Subject: [ovs-dev] [RFC PATCH v3 7/8] netdev-dpdk: copy large packet to
multi-seg. mbufs
X-BeenThere: ovs-dev@openvswitch.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Sender: ovs-dev-bounces@openvswitch.org
Errors-To: ovs-dev-bounces@openvswitch.org
From: Michael Qiu
Currently, packets are only copied to a single segment in
the function dpdk_do_tx_copy(). This could be an issue in
the case of jumbo frames, particularly when multi-segment
mbufs are involved.
This patch calculates the number of segments needed by a
packet and copies the data to each segment.
Signed-off-by: Michael Qiu
Signed-off-by: Mark Kavanagh
---
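The segment-count calculation can be sketched on its own as below. The
helper names are hypothetical, and the per-segment length helper is an
assumption about how the data would be distributed (each segment filled to
capacity before the next is used); treat it as an illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Number of segments needed to hold 'size' bytes when each segment can
 * carry at most 'max_data_len' bytes (i.e. the ceiling of the division). */
static inline uint32_t
sketch_nb_segs(uint32_t size, uint16_t max_data_len)
{
    assert(max_data_len != 0);
    return size / max_data_len + (size % max_data_len ? 1 : 0);
}

/* Bytes stored in segment 'idx' (0-based), assuming every segment is
 * filled to capacity before the next one is used. */
static inline uint16_t
sketch_seg_len(uint32_t size, uint16_t max_data_len, uint32_t idx)
{
    uint32_t consumed = idx * (uint32_t) max_data_len;
    uint32_t left;

    if (consumed >= size) {
        return 0;
    }
    left = size - consumed;
    return (uint16_t) (left < max_data_len ? left : max_data_len);
}
```

For example, a 9000-byte packet with 2048 bytes of room per segment needs
five segments: four full ones and a final segment carrying 808 bytes.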
lib/netdev-dpdk.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 51 insertions(+), 4 deletions(-)
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 61a0dca..36275bd 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1824,8 +1824,10 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch)
#endif
struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
struct rte_mbuf *pkts[PKT_ARRAY_SIZE];
+ struct rte_mbuf *temp, *head = NULL;
uint32_t cnt = batch_cnt;
uint32_t dropped = 0;
+ uint32_t i, j, nb_segs;
if (dev->type != DPDK_DEV_VHOST) {
/* Check if QoS has been configured for this netdev. */
@@ -1838,9 +1840,10 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch)
uint32_t txcnt = 0;
- for (uint32_t i = 0; i < cnt; i++) {
+ for (i = 0; i < cnt; i++) {
struct dp_packet *packet = batch->packets[i];
uint32_t size = dp_packet_size(packet);
+ uint16_t max_data_len, data_len;
if (OVS_UNLIKELY(size > dev->max_packet_len)) {
VLOG_WARN_RL(&rl, "Too big size %u max_packet_len %d",
@@ -1850,15 +1853,59 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch)
continue;
}
- pkts[txcnt] = rte_pktmbuf_alloc(dev->mp);
+ temp = pkts[txcnt] = rte_pktmbuf_alloc(dev->mp);
if (OVS_UNLIKELY(!pkts[txcnt])) {
dropped += cnt - i;
break;
}
+ /* All newly allocated mbufs have the same max data length. */
+ max_data_len = temp->buf_len - temp->data_off;
+
+ /* Calculate # of output mbufs. */
+ nb_segs = size / max_data_len;
+ if (size % max_data_len)
+ nb_segs = nb_segs + 1;
+
+ /* Allocate additional mbufs when multiple output mbufs required. */
+ for (j = 1; j < nb_segs; j++) {
+ temp->next = rte_pktmbuf_alloc(dev->mp);
+ if (!temp->next) {
+ rte_pktmbuf_free(pkts[txcnt]);
+ pkts[txcnt] = NULL;
+ break;
+ }
+ temp = temp->next;
+ }
/* We have to do a copy for now */
- memcpy(rte_pktmbuf_mtod(pkts[txcnt], void *),
- dp_packet_data(packet), size);
+ rte_pktmbuf_pkt_len(pkts[txcnt]) = size;
+ temp = pkts[txcnt];
+
+ data_len = size < max_data_len ? size: max_data_len;
+ if (packet->source == DPBUF_DPDK) {
+ head = &(packet->mbuf);
+ while (temp && head && size > 0) {
+ rte_memcpy(rte_pktmbuf_mtod(temp, void*),
+ dp_packet_data((struct dp_packet *)head), data_len);
+ rte_pktmbuf_data_len(temp) = data_len;
+ head = head->next;
+ size = size - data_len;
+ data_len = size < max_data_len ? size: max_data_len;
+ temp = temp->next;
+ }
+ } else {
+ int offset = 0;
+ while (temp && size > 0) {
+ memcpy(rte_pktmbuf_mtod(temp, void *),
+ dp_packet_at(packet, offset, data_len), data_len);
+ rte_pktmbuf_data_len(temp) = data_len;
+ temp = temp->next;
+ size = size - data_len;
+ offset += data_len;
+ data_len = size < max_data_len ? size: max_data_len;
+ }
+ }
+
dp_packet_set_size((struct dp_packet *)pkts[txcnt], size);
pkts[txcnt]->nb_segs = packet->mbuf.nb_segs;
pkts[txcnt]->ol_flags = packet->mbuf.ol_flags;
From patchwork Tue Nov 21 18:29:17 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mark Kavanagh
X-Patchwork-Id: 840155
X-Patchwork-Delegate: ian.stokes@intel.com
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org;
spf=pass (mailfrom) smtp.mailfrom=openvswitch.org
(client-ip=140.211.169.12; helo=mail.linuxfoundation.org;
envelope-from=ovs-dev-bounces@openvswitch.org;
receiver=)
Received: from mail.linuxfoundation.org (mail.linuxfoundation.org
[140.211.169.12])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
bits)) (No client certificate requested)
by ozlabs.org (Postfix) with ESMTPS id 3yhDkd4y0bz9sRW
for ;
Wed, 22 Nov 2017 05:34:01 +1100 (AEDT)
Received: from mail.linux-foundation.org (localhost [127.0.0.1])
by mail.linuxfoundation.org (Postfix) with ESMTP id 18861C26;
Tue, 21 Nov 2017 18:29:37 +0000 (UTC)
X-Original-To: dev@openvswitch.org
Delivered-To: ovs-dev@mail.linuxfoundation.org
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
[172.17.192.35])
by mail.linuxfoundation.org (Postfix) with ESMTPS id 364C0BF2
for ; Tue, 21 Nov 2017 18:29:34 +0000 (UTC)
X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6
Received: from mga07.intel.com (mga07.intel.com [134.134.136.100])
by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 8BF064E9
for ; Tue, 21 Nov 2017 18:29:33 +0000 (UTC)
Received: from orsmga001.jf.intel.com ([10.7.209.18])
by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
21 Nov 2017 10:29:33 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.44,432,1505804400"; d="scan'208";a="7930169"
Received: from silpixa00380299.ir.intel.com ([10.237.222.17])
by orsmga001.jf.intel.com with ESMTP; 21 Nov 2017 10:29:32 -0800
From: Mark Kavanagh
To: dev@openvswitch.org,
qiudayu@chinac.com
Date: Tue, 21 Nov 2017 18:29:17 +0000
Message-Id: <1511288957-68599-9-git-send-email-mark.b.kavanagh@intel.com>
X-Mailer: git-send-email 1.9.3
In-Reply-To: <1511288957-68599-1-git-send-email-mark.b.kavanagh@intel.com>
References: <1511288957-68599-1-git-send-email-mark.b.kavanagh@intel.com>
X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED,
T_RP_MATCHES_RCVD autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
smtp1.linux-foundation.org
Subject: [ovs-dev] [RFC PATCH v3 8/8] netdev-dpdk: support multi-segment
jumbo frames
X-BeenThere: ovs-dev@openvswitch.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Sender: ovs-dev-bounces@openvswitch.org
Errors-To: ovs-dev-bounces@openvswitch.org
Currently, jumbo frame support for OvS-DPDK is implemented by
increasing the size of mbufs within a mempool, such that each mbuf
within the pool is large enough to contain an entire jumbo frame of
a user-defined size. Typically, for each user-defined MTU,
'requested_mtu', a new mempool is created, containing mbufs of size
~requested_mtu.
With the multi-segment approach, a port uses a single mempool
(containing standard/default-sized mbufs of ~2k bytes), irrespective
of the user-requested MTU value. To accommodate jumbo frames, mbufs
are chained together, where each mbuf in the chain stores a portion of
the jumbo frame. Each mbuf in the chain is termed a segment, hence the
name.
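To make the chained representation concrete, here is a minimal sketch
(an illustration only, not OvS or DPDK code) of how a frame spread over
several segments can be measured and gathered back into one contiguous
buffer. 'struct frame_seg', 'chain_len()' and 'chain_gather()' are
hypothetical names; in DPDK the equivalent fields live in struct
rte_mbuf.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one segment of a chained frame. */
struct frame_seg {
    const uint8_t *data;           /* This segment's portion of the frame. */
    uint16_t data_len;             /* Bytes held by this segment. */
    const struct frame_seg *next;  /* Next segment, or NULL at the tail. */
};

/* Total frame length: the sum of every segment's data length. */
static uint32_t
chain_len(const struct frame_seg *head)
{
    uint32_t total = 0;

    for (const struct frame_seg *s = head; s; s = s->next) {
        total += s->data_len;
    }
    return total;
}

/* Copies the whole chain into 'dst' in order.  Returns the number of bytes
 * written, or 0 if the frame does not fit in 'dst_size' bytes. */
static uint32_t
chain_gather(const struct frame_seg *head, uint8_t *dst, uint32_t dst_size)
{
    uint32_t off = 0;

    if (chain_len(head) > dst_size) {
        return 0;
    }
    for (const struct frame_seg *s = head; s; s = s->next) {
        memcpy(dst + off, s->data, s->data_len);
        off += s->data_len;
    }
    return off;
}
```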
== Enabling multi-segment mbufs ==
Multi-segment and single-segment mbufs are mutually exclusive, and the
user must decide on which approach to adopt on init. The introduction
of a new OVSDB field, 'dpdk-multi-seg-mbufs', facilitates this. This
is a global boolean value, which determines how jumbo frames are
represented across all DPDK ports. In the absence of a user-supplied
value, 'dpdk-multi-seg-mbufs' defaults to false, i.e. multi-segment
mbufs must be explicitly enabled / single-segment mbufs remain the
default.
Setting the field is identical to setting existing DPDK-specific OVSDB
fields:
ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=0x10
ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-mem=4096,0
==> ovs-vsctl set Open_vSwitch . other_config:dpdk-multi-seg-mbufs=true
Signed-off-by: Mark Kavanagh
---
NEWS | 1 +
lib/dpdk.c | 7 +++++++
lib/netdev-dpdk.c | 43 ++++++++++++++++++++++++++++++++++++++++---
lib/netdev-dpdk.h | 1 +
vswitchd/vswitch.xml | 20 ++++++++++++++++++++
5 files changed, 69 insertions(+), 3 deletions(-)
diff --git a/NEWS b/NEWS
index c15dc24..657b598 100644
--- a/NEWS
+++ b/NEWS
@@ -15,6 +15,7 @@ Post-v2.8.0
- DPDK:
* Add support for DPDK v17.11
* Add support for vHost IOMMU feature
+ * Add support for multi-segment mbufs
v2.8.0 - 31 Aug 2017
--------------------
diff --git a/lib/dpdk.c b/lib/dpdk.c
index 8da6c32..4c28bd0 100644
--- a/lib/dpdk.c
+++ b/lib/dpdk.c
@@ -450,6 +450,13 @@ dpdk_init__(const struct smap *ovs_other_config)
/* Finally, register the dpdk classes */
netdev_dpdk_register();
+
+ bool multi_seg_mbufs_enable = smap_get_bool(ovs_other_config,
+ "dpdk-multi-seg-mbufs", false);
+ if (multi_seg_mbufs_enable) {
+ VLOG_INFO("DPDK multi-segment mbufs enabled");
+ netdev_dpdk_multi_segment_mbufs_enable();
+ }
}
void
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 36275bd..293edad 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -65,6 +65,7 @@ enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
VLOG_DEFINE_THIS_MODULE(netdev_dpdk);
static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
+static bool dpdk_multi_segment_mbufs = false;
#define DPDK_PORT_WATCHDOG_INTERVAL 5
@@ -500,6 +501,7 @@ dpdk_mp_create(struct netdev_dpdk *dev, uint16_t frame_len)
+ dev->requested_n_txq * dev->requested_txq_size
+ MIN(RTE_MAX_LCORE, dev->requested_n_rxq) * NETDEV_MAX_BURST
+ MIN_NB_MBUF;
+ /* XXX (RFC) - should n_mbufs be increased if multi-seg mbufs are used? */
ovs_mutex_lock(&dpdk_mp_mutex);
do {
@@ -568,7 +570,13 @@ dpdk_mp_free(struct rte_mempool *mp)
/* Tries to allocate a new mempool - or re-use an existing one where
* appropriate - on requested_socket_id with a size determined by
- * requested_mtu and requested Rx/Tx queues.
+ * requested_mtu and requested Rx/Tx queues. Some properties of the mempool's
+ * elements are dependent on the value of 'dpdk_multi_segment_mbufs':
+ * - if 'true', then the mempool contains standard-sized mbufs that are chained
+ * together to accommodate packets of size 'requested_mtu'.
+ * - if 'false', then the members of the allocated mempool are
+ * non-standard-sized mbufs. Each mbuf in the mempool is large enough to fully
+ * accommodate packets of size 'requested_mtu'.
* On success - or when re-using an existing mempool - the new configuration
* will be applied.
* On error, device will be left unchanged. */
@@ -576,10 +584,18 @@ static int
netdev_dpdk_mempool_configure(struct netdev_dpdk *dev)
OVS_REQUIRES(dev->mutex)
{
- uint16_t buf_size = dpdk_buf_size(dev->requested_mtu);
+ uint16_t buf_size = 0;
struct rte_mempool *mp;
int ret = 0;
+ /* Contiguous mbufs in use - permit oversized mbufs */
+ if (!dpdk_multi_segment_mbufs) {
+ buf_size = dpdk_buf_size(dev->requested_mtu);
+ } else {
+ /* multi-segment mbufs - use standard mbuf size */
+ buf_size = dpdk_buf_size(ETHER_MTU);
+ }
+
mp = dpdk_mp_create(dev, buf_size);
if (!mp) {
VLOG_ERR("Failed to create memory pool for netdev "
@@ -657,6 +673,7 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)
int diag = 0;
int i;
struct rte_eth_conf conf = port_conf;
+ struct rte_eth_txconf txconf;
/* For some NICs (e.g. Niantic), scatter_rx mode needs to be explicitly
* enabled. */
@@ -690,9 +707,23 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)
break;
}
+ /* DPDK PMDs typically attempt to use simple or vectorized
+ * transmit functions, neither of which are compatible with
+ * multi-segment mbufs. Ensure that these are disabled when
+ * multi-segment mbufs are enabled.
+ */
+ if (dpdk_multi_segment_mbufs) {
+ struct rte_eth_dev_info dev_info;
+ rte_eth_dev_info_get(dev->port_id, &dev_info);
+ txconf = dev_info.default_txconf;
+ txconf.txq_flags &= ~ETH_TXQ_FLAGS_NOMULTSEGS;
+ }
+
for (i = 0; i < n_txq; i++) {
diag = rte_eth_tx_queue_setup(dev->port_id, i, dev->txq_size,
- dev->socket_id, NULL);
+ dev->socket_id,
+ dpdk_multi_segment_mbufs ? &txconf
+ : NULL);
if (diag) {
VLOG_INFO("Interface %s txq(%d) setup error: %s",
dev->up.name, i, rte_strerror(-diag));
@@ -3380,6 +3411,12 @@ unlock:
return err;
}
+void
+netdev_dpdk_multi_segment_mbufs_enable(void)
+{
+ dpdk_multi_segment_mbufs = true;
+}
+
#define NETDEV_DPDK_CLASS(NAME, INIT, CONSTRUCT, DESTRUCT, \
SET_CONFIG, SET_TX_MULTIQ, SEND, \
GET_CARRIER, GET_STATS, \
diff --git a/lib/netdev-dpdk.h b/lib/netdev-dpdk.h
index b7d02a7..a3339fe 100644
--- a/lib/netdev-dpdk.h
+++ b/lib/netdev-dpdk.h
@@ -25,6 +25,7 @@ struct dp_packet;
#ifdef DPDK_NETDEV
+void netdev_dpdk_multi_segment_mbufs_enable(void);
void netdev_dpdk_register(void);
void free_dpdk_buf(struct dp_packet *);
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index a633226..2b71c4a 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -331,6 +331,26 @@
+
+
+ Specifies if DPDK uses multi-segment mbufs for handling jumbo frames.
+
+
+ If true, DPDK allocates a single mempool per port, irrespective
+ of the ports' requested MTU sizes. The elements of this mempool are
+ 'standard'-sized mbufs (typically ~2k bytes), which may be chained
+ together to accommodate jumbo frames. In this approach, each mbuf
+ typically stores a fragment of the overall jumbo frame.
+
+
+ If not specified, defaults to false, in which case, the size
+ of each mbuf within a DPDK port's mempool will be grown to accommodate
+ jumbo frames within a single mbuf.
+
+
+
+