From patchwork Tue Nov 21 18:29:09 2017
X-Patchwork-Submitter: Mark Kavanagh
X-Patchwork-Id: 840147
From: Mark Kavanagh <mark.b.kavanagh@intel.com>
To: dev@openvswitch.org, qiudayu@chinac.com
Date: Tue, 21 Nov 2017 18:29:09 +0000
Message-Id: <1511288957-68599-1-git-send-email-mark.b.kavanagh@intel.com>
Subject: [ovs-dev] [RFC PATCH v3 0/8] netdev-dpdk: support multi-segment mbufs

Overview
========
This patchset introduces support for multi-segment mbufs to OvS-DPDK.
Multi-segment mbufs are typically used when the size of a single mbuf is
insufficient to contain the entirety of a packet's data; instead, the data
is split across multiple mbufs, each carrying a portion, or 'segment', of
the packet data. The mbufs are chained to one another via their 'next'
attribute (an mbuf pointer), as the sketch below illustrates.
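
For illustration only (this helper is not part of the patchset; its name and
shape are assumptions), a minimal sketch of walking such a chain using the
standard DPDK rte_mbuf fields:

    #include <rte_mbuf.h>

    /* Illustrative only: walk each segment of a (possibly multi-segment)
     * mbuf chain, summing the per-segment data lengths. For a well-formed
     * chain, the total equals the packet length recorded in the head
     * segment (head->pkt_len), and head->nb_segs counts the segments. */
    static uint32_t
    walk_mbuf_segments(const struct rte_mbuf *head)
    {
        uint32_t total = 0;
        const struct rte_mbuf *seg;

        for (seg = head; seg != NULL; seg = seg->next) {
            /* seg->data_len is the payload carried by this segment only;
             * rte_pktmbuf_mtod(seg, ...) would yield its data pointer. */
            total += seg->data_len;
        }

        return total;   /* == head->pkt_len for a consistent chain */
    }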
Use Cases
=========
i.  Handling oversized (guest-originated) frames, which are marked for
    hardware acceleration/offload (TSO, for example).

    Packets which originate from a non-DPDK source may be marked for
    offload; as such, they may be larger than the ingress interface's MTU
    permits, and may be stored in an oversized dp-packet. In order to
    transmit such packets over a DPDK port, their contents must be copied
    to a DPDK mbuf (via dpdk_do_tx_copy). However, in its current
    implementation, that function only copies data into a single mbuf; if
    the space available in that mbuf is exhausted before all of the packet
    data has been copied, the remainder is lost.

    Similarly, when cloning a DPDK mbuf, it must be considered whether
    that mbuf contains multiple segments.

    Both issues are resolved within this patchset; a rough sketch of a
    segment-aware copy is included at the end of this mail.

ii. Handling jumbo frames.

    While OvS already supports jumbo frames, it does so by increasing the
    mbuf size, such that the entirety of a jumbo frame may be handled in a
    single mbuf. This is certainly the preferred, and most performant,
    approach (and remains the default). However, it places high demands on
    system memory; multi-segment mbufs may be preferable for systems which
    are memory-constrained.

Enabling multi-segment mbufs
============================
Multi-segment and single-segment mbufs are mutually exclusive, and the
user must decide on which approach to adopt on init. The introduction of
a new OVSDB field, 'dpdk-multi-seg-mbufs', facilitates this.

This is a global boolean value, which determines how jumbo frames are
represented across all DPDK ports. In the absence of a user-supplied
value, 'dpdk-multi-seg-mbufs' defaults to false, i.e. multi-segment mbufs
must be explicitly enabled; single-segment mbufs remain the default.

Setting the field is identical to setting existing DPDK-specific OVSDB
fields:

    ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
    ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=0x10
    ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-mem=4096,0
==> ovs-vsctl set Open_vSwitch . other_config:dpdk-multi-seg-mbufs=true

Notes
=====
This patchset was generated and built against:
- Current HEAD of OvS master [1] + DPDK v17.11 support patchset [2]
- DPDK v17.11

[1] 3728b3b ("Merge branch 'dpdk_merge' of https://github.com/ist...")
[2] https://patchwork.ozlabs.org/series/13829/mbox/

Mark Kavanagh (4):
  netdev-dpdk: simplify mbuf sizing
  lib/dp-packet: init specific mbuf fields to 0
  lib/dp-packet: fix dp_packet_put_uninit for multi-seg mbufs
  netdev-dpdk: support multi-segment jumbo frames

Michael Qiu (4):
  lib/dp-packet: copy mbuf info for packet copy
  lib/dp-packet: Fix data_len issue with multi-segs
  lib/dp-packet: copy data from multi-seg. DPDK mbuf
  netdev-dpdk: copy large packet to multi-seg. mbufs

 NEWS                 |   1 +
 lib/dp-packet.c      |  43 +++++++++++++++++-
 lib/dp-packet.h      |  24 +++++-----
 lib/dpdk.c           |   7 +++
 lib/netdev-dpdk.c    | 122 ++++++++++++++++++++++++++++++++++++++++++---------
 lib/netdev-dpdk.h    |   1 +
 vswitchd/vswitch.xml |  20 +++++++++
 7 files changed, 183 insertions(+), 35 deletions(-)
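
Appendix (illustrative only): as referenced in use case i above, a rough
sketch of copying a flat buffer into a chain of mbufs. This is not the
patchset's actual dpdk_do_tx_copy change; the function name and flow are
assumptions, using only standard DPDK mbuf APIs:

    #include <string.h>
    #include <rte_mbuf.h>
    #include <rte_mempool.h>

    /* Illustrative only: copy 'size' bytes (size > 0) from 'data' into a
     * chain of mbufs allocated from 'mp', linking segments via 'next' as
     * needed. Returns the head of the chain, or NULL on alloc failure. */
    static struct rte_mbuf *
    copy_to_mbuf_chain(struct rte_mempool *mp, const void *data,
                       uint32_t size)
    {
        struct rte_mbuf *head = NULL, *prev = NULL;
        const char *src = data;
        uint32_t left = size;

        while (left > 0) {
            struct rte_mbuf *seg = rte_pktmbuf_alloc(mp);
            if (!seg) {
                rte_pktmbuf_free(head);   /* frees the entire chain */
                return NULL;
            }

            uint16_t room = rte_pktmbuf_tailroom(seg);
            uint16_t len = (uint16_t) (left < room ? left : room);

            memcpy(rte_pktmbuf_mtod(seg, char *), src, len);
            seg->data_len = len;

            if (!head) {
                head = seg;
            } else {
                prev->next = seg;
                head->nb_segs++;
            }
            prev = seg;

            src += len;
            left -= len;
            head->pkt_len += len;   /* total length lives in the head */
        }

        return head;
    }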