From patchwork Mon Nov 16 11:12:43 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Magnus Karlsson
X-Patchwork-Id: 1400848
Return-Path:
X-Original-To: patchwork-incoming-netdev@ozlabs.org
Delivered-To: patchwork-incoming-netdev@ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized)
 smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18;
 helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org;
 receiver=)
Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none)
 header.from=gmail.com
Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected)
 header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256
 header.s=20161025 header.b=dm4Vnmqr; dkim-atps=neutral
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by ozlabs.org (Postfix) with ESMTP id 4CZRKS3TH1z9sTc
 for ; Mon, 16 Nov 2020 22:14:04 +1100 (AEDT)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1729485AbgKPLNq (ORCPT );
 Mon, 16 Nov 2020 06:13:46 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42418 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1728603AbgKPLNp (ORCPT );
 Mon, 16 Nov 2020 06:13:45 -0500
Received: from mail-pg1-x543.google.com (mail-pg1-x543.google.com
 [IPv6:2607:f8b0:4864:20::543]) by lindbergh.monkeyblade.net (Postfix)
 with ESMTPS id B5E98C0613CF; Mon, 16 Nov 2020 03:13:45 -0800 (PST)
Received: by mail-pg1-x543.google.com with SMTP id j19so6107682pgg.5;
 Mon, 16 Nov 2020 03:13:45 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=from:to:cc:subject:date:message-id:in-reply-to:references;
 bh=+SbxyZahfMfXnU/FChOm2quTfC2gyIuFdUfnHAtZnoI=;
 b=dm4VnmqrFU23MKn/TKtIS4x/rNBKQpkzivbC6Xa+MPM04MO8nbeTUZY8HTgfvYys1j
 EOYiVrhB58A2akTYPFjGXf7c2SMKcPDfdAOiRAGjSu+oOvaTYIfJ+Dk+od/gIVIu2D8r
 cbAXyVwzvuDEhe9GUS0IqSfIbRiAgw61hQ6AMbBpv7o2B07EiT2/VFrTgsIiM9Kj4YMs
 fMj06Fw8BwmdGHUA0QAA8rCG19HY4PSZu+iB0zY4NfNI9ZV19WWE+YrumAo01AqbuRyQ
 bNu465+g6EAYR8Rx3rjh4x/Z8Y+Y4BRhvMbr2o7xLXlnZC8QqcDlOa18ogeYc9eqYMMT
 MikQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references;
 bh=+SbxyZahfMfXnU/FChOm2quTfC2gyIuFdUfnHAtZnoI=;
 b=sDj24V1ySvPnldSVJlt8pftQephcGpQF/gsi4ZIM2tEPeWVkEgx7h1JYSDXQe0vyyo
 Z3zcWrvxJCszVVAMHhidpgctKZf4s978qk2ZQve8CI0raby1P0bTMApwv7Gh0h87Brf1
 oWjYqvu8SvJwdCRJelOfSB8wYA9FLzPzv7iEz6XNWIOpeXcdo988zF3ADNQCHds4h87B
 DIyE4zqKJ0jjL7Sf16s6KKcZdIJcdZVbjmP67bqW8INMp1u0G4n3+AdaYeF2Z8DJL6Ha
 1r7xl/53d5S5Qx9AA/My/wvsG4CaZ7hy0XmFtn4BjT9pcICwkS4/xOKNnn0cIbhBbjTh
 9yKw==
X-Gm-Message-State: AOAM533VFtaIm/ajeQh+mraYWIouPUBtt0+AHx/QPs6XK43XxUOCbWR+
 pjxPwSTdsfWF7+EYAyrumNM=
X-Google-Smtp-Source: ABdhPJybPSvfextFY3UUIP4zrPkriv1bO8XhL0wD6bS3IGeBhbnbQ10ErLFdZ26Ko4b4QBOR6o+FWg==
X-Received: by 2002:a17:90a:fd0d:: with SMTP id cv13mr15802905pjb.124.1605525225333;
 Mon, 16 Nov 2020 03:13:45 -0800 (PST)
Received: from localhost.localdomain ([192.55.54.40])
 by smtp.gmail.com with ESMTPSA id u24sm19486826pfm.81.2020.11.16.03.13.40
 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128);
 Mon, 16 Nov 2020 03:13:44 -0800 (PST)
From: Magnus Karlsson
To: magnus.karlsson@intel.com, bjorn.topel@intel.com, ast@kernel.org,
 daniel@iogearbox.net, netdev@vger.kernel.org, jonathan.lemon@gmail.com,
 kuba@kernel.org, john.fastabend@gmail.com
Cc: bpf@vger.kernel.org, jeffrey.t.kirsher@intel.com,
 anthony.l.nguyen@intel.com, maciej.fijalkowski@intel.com,
 maciejromanfijalkowski@gmail.com, intel-wired-lan@lists.osuosl.org
Subject: [PATCH bpf-next v3 1/5] samples/bpf: increment Tx stats at sending
Date: Mon, 16 Nov 2020 12:12:43 +0100
Message-Id: <1605525167-14450-2-git-send-email-magnus.karlsson@gmail.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1605525167-14450-1-git-send-email-magnus.karlsson@gmail.com>
References: <1605525167-14450-1-git-send-email-magnus.karlsson@gmail.com>
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

From: Magnus Karlsson

Increment the statistics over how many Tx packets have been sent at
the time of sending instead of at the time of completion. This is
because a completion event means that the buffer has been sent AND
returned to user space. The packet always gets sent shortly after
sendto() is called. The kernel might, for performance reasons, decide
not to return every single buffer to user space immediately after
sending; for example, it might return them only after a batch of
packets has been transmitted. Incrementing the number of packets sent
at completion will in that case be confusing: if you send a single
packet, the counter might show zero for a while even though the packet
has been transmitted.

Signed-off-by: Magnus Karlsson
Acked-by: John Fastabend
---
 samples/bpf/xdpsock_user.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
index 1149e94..2567f0d 100644
--- a/samples/bpf/xdpsock_user.c
+++ b/samples/bpf/xdpsock_user.c
@@ -1146,7 +1146,6 @@ static inline void complete_tx_l2fwd(struct xsk_socket_info *xsk,
 		xsk_ring_prod__submit(&xsk->umem->fq, rcvd);
 		xsk_ring_cons__release(&xsk->umem->cq, rcvd);
 		xsk->outstanding_tx -= rcvd;
-		xsk->ring_stats.tx_npkts += rcvd;
 	}
 }
 
@@ -1168,7 +1167,6 @@ static inline void complete_tx_only(struct xsk_socket_info *xsk,
 	if (rcvd > 0) {
 		xsk_ring_cons__release(&xsk->umem->cq, rcvd);
 		xsk->outstanding_tx -= rcvd;
-		xsk->ring_stats.tx_npkts += rcvd;
 	}
 }
 
@@ -1260,6 +1258,7 @@ static void tx_only(struct xsk_socket_info *xsk, u32 *frame_nb, int batch_size)
 	}
 
 	xsk_ring_prod__submit(&xsk->tx, batch_size);
+	xsk->ring_stats.tx_npkts += batch_size;
 	xsk->outstanding_tx += batch_size;
 	*frame_nb += batch_size;
 	*frame_nb %= NUM_FRAMES;
@@ -1348,6 +1347,7 @@ static void l2fwd(struct xsk_socket_info *xsk, struct pollfd *fds)
 		}
 		return;
 	}
+	xsk->ring_stats.rx_npkts += rcvd;
 
 	ret = xsk_ring_prod__reserve(&xsk->tx, rcvd, &idx_tx);
 	while (ret != rcvd) {
@@ -1379,7 +1379,7 @@ static void l2fwd(struct xsk_socket_info *xsk, struct pollfd *fds)
 	xsk_ring_prod__submit(&xsk->tx, rcvd);
 	xsk_ring_cons__release(&xsk->rx, rcvd);
 
-	xsk->ring_stats.rx_npkts += rcvd;
+	xsk->ring_stats.tx_npkts += rcvd;
 	xsk->outstanding_tx += rcvd;
 }
 

From patchwork Mon Nov 16 11:12:44 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Magnus Karlsson
X-Patchwork-Id: 1400851
Return-Path:
X-Original-To: patchwork-incoming-netdev@ozlabs.org
Delivered-To: patchwork-incoming-netdev@ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized)
 smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18;
 helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org;
 receiver=)
Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none)
 header.from=gmail.com
Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected)
 header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256
 header.s=20161025 header.b=TIcuSgUW; dkim-atps=neutral
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by ozlabs.org (Postfix) with ESMTP id 4CZRKT19t0z9sRK
 for ; Mon, 16 Nov 2020 22:14:05 +1100 (AEDT)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1729633AbgKPLNv (ORCPT );
 Mon, 16 Nov 2020 06:13:51 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42432 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1728211AbgKPLNu (ORCPT );
 Mon, 16 Nov 2020 06:13:50 -0500
Received: from mail-pl1-x643.google.com (mail-pl1-x643.google.com
 [IPv6:2607:f8b0:4864:20::643]) by lindbergh.monkeyblade.net (Postfix)
 with ESMTPS id ABF46C0613CF; Mon, 16 Nov 2020 03:13:50 -0800 (PST)
Received: by mail-pl1-x643.google.com with SMTP id
 b3so8196887pls.11; Mon, 16 Nov 2020 03:13:50 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=from:to:cc:subject:date:message-id:in-reply-to:references;
 bh=44RUaPFcwFiB3nkeuTIXUu39UgaMmkBucMAXuU99ZwQ=;
 b=TIcuSgUWSWuwn2Jjx1KnB6sHrdeMikkoB9olf3FVb3jeqC3BzKywc7h6cKpbHMBeX3
 +NEr1FzsgcBQBmtEwqaNPYoOya73w9zL+Q90R2ToYPhHOsPcvJesPpVSyjwmnxm0LpJN
 1QXDy+xlDUxI9Nu8Sw+nfSJ5trm8df+m7JZhLzjAQknzTVRjM3byAfcJqtOeDYoHg7+9
 HTK20aO3/V2kwQWHCWudepK15mqqNhP/wJwxPayM6HtQGGDZskemvInKngX5aDttsBdP
 iw8ua714FhhRDHCdNCl3+rbbG7pUQk/XSXEnBP/DjOucqNkTTo0OTG+T7Oj0b1sQUR1p
 /rmw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references;
 bh=44RUaPFcwFiB3nkeuTIXUu39UgaMmkBucMAXuU99ZwQ=;
 b=hnC4DjkcSCuLH/YToLtSAh3QFkBqC341dPv7XWSZSR26QBYvwhIub7EAkFW7qCOPD5
 Xho9Wu2Z3C+F8wWFPNYrXpXhS6ho34WqhYCHpL7xuIi+dzkd1RScnpBmpcSvWSM7pHbZ
 11F+d6RMMR6PRnBgSi9GRUUQS8yuM9f7xnbflCOOu/OIuwtyc+8XDx2u9R8Gyn4LFB3z
 /7igP7cqvD1BAudzz5j6YOozvwtfnVioa7kFjibO1v9Lcy9sDjSiC//48Eu9wtdz1UAK
 kYkNS3QEheEBrKsAskunY3VrH76SeP3IWiPtalCHGVKcjbiTS3bvUZzy3oyJudKmhTc+
 oDpA==
X-Gm-Message-State: AOAM533O/AHffv50KY2ZUYqsSL9unQsyHnAXBXWfnllLQGAG6njDSE5v
 Bb+h2E14duFwe1/vLYHllUY=
X-Google-Smtp-Source: ABdhPJyZfCFm7Pec1GQujbx90ewEdjqUNaZF5VJN51QlLDhc5+u0TOr06c1NI6pNZtLBSC/TZGogQQ==
X-Received: by 2002:a17:902:361:b029:d7:cd0b:e6f2 with SMTP id
 88-20020a1709020361b02900d7cd0be6f2mr12816425pld.77.1605525230298;
 Mon, 16 Nov 2020 03:13:50 -0800 (PST)
Received: from localhost.localdomain ([192.55.54.40])
 by smtp.gmail.com with ESMTPSA id u24sm19486826pfm.81.2020.11.16.03.13.45
 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128);
 Mon, 16 Nov 2020 03:13:49 -0800 (PST)
From: Magnus Karlsson
To: magnus.karlsson@intel.com, bjorn.topel@intel.com, ast@kernel.org,
 daniel@iogearbox.net, netdev@vger.kernel.org, jonathan.lemon@gmail.com,
 kuba@kernel.org, john.fastabend@gmail.com
Cc: bpf@vger.kernel.org, jeffrey.t.kirsher@intel.com,
 anthony.l.nguyen@intel.com, maciej.fijalkowski@intel.com,
 maciejromanfijalkowski@gmail.com, intel-wired-lan@lists.osuosl.org
Subject: [PATCH bpf-next v3 2/5] i40e: remove unnecessary sw_ring access from xsk Tx
Date: Mon, 16 Nov 2020 12:12:44 +0100
Message-Id: <1605525167-14450-3-git-send-email-magnus.karlsson@gmail.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1605525167-14450-1-git-send-email-magnus.karlsson@gmail.com>
References: <1605525167-14450-1-git-send-email-magnus.karlsson@gmail.com>
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

From: Magnus Karlsson

Remove the unnecessary access to the software ring for the AF_XDP
zero-copy driver. This was used to record the length of the packet so
that the driver Tx completion code could sum this up to produce the
total bytes sent. This is now performed during the transmission of the
packet, so there is no need to record this in the software ring.

Signed-off-by: Magnus Karlsson
Acked-by: John Fastabend
---
 drivers/net/ethernet/intel/i40e/i40e_xsk.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
index 567fd67..20d2632 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
@@ -392,7 +392,6 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
 {
 	unsigned int sent_frames = 0, total_bytes = 0;
 	struct i40e_tx_desc *tx_desc = NULL;
-	struct i40e_tx_buffer *tx_bi;
 	struct xdp_desc desc;
 	dma_addr_t dma;
 
@@ -404,9 +403,6 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
 		xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_pool, dma,
 						 desc.len);
 
-		tx_bi = &xdp_ring->tx_bi[xdp_ring->next_to_use];
-		tx_bi->bytecount = desc.len;
-
 		tx_desc = I40E_TX_DESC(xdp_ring, xdp_ring->next_to_use);
 		tx_desc->buffer_addr = cpu_to_le64(dma);
 		tx_desc->cmd_type_offset_bsz =
@@ -415,7 +411,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
 				   0, desc.len, 0);
 
 		sent_frames++;
-		total_bytes += tx_bi->bytecount;
+		total_bytes += desc.len;
 
 		xdp_ring->next_to_use++;
 		if (xdp_ring->next_to_use == xdp_ring->count)

From patchwork Mon Nov 16 11:12:45 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Magnus Karlsson
X-Patchwork-Id: 1400852
Return-Path:
X-Original-To: patchwork-incoming-netdev@ozlabs.org
Delivered-To: patchwork-incoming-netdev@ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized)
 smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18;
 helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org;
 receiver=)
Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none)
 header.from=gmail.com
Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected)
 header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256
 header.s=20161025 header.b=HRCZiGhf; dkim-atps=neutral
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by ozlabs.org (Postfix) with ESMTP id 4CZRKV0K0wz9sVC
 for ; Mon, 16 Nov 2020 22:14:06 +1100 (AEDT)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1729660AbgKPLN4 (ORCPT );
 Mon, 16 Nov 2020 06:13:56 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42448 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1728211AbgKPLNz (ORCPT );
 Mon, 16 Nov 2020 06:13:55 -0500
Received: from mail-pg1-x541.google.com (mail-pg1-x541.google.com
 [IPv6:2607:f8b0:4864:20::541]) by lindbergh.monkeyblade.net (Postfix)
 with ESMTPS id 9B577C0613CF; Mon, 16 Nov 2020 03:13:55 -0800 (PST)
Received: by mail-pg1-x541.google.com with SMTP id p68so2448357pga.6;
 Mon, 16 Nov 2020 03:13:55 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=from:to:cc:subject:date:message-id:in-reply-to:references;
 bh=bAeSUpBLbjOCEiLqRz8yHChict8yQeBE89i4qMRCUYE=;
 b=HRCZiGhfMze/cT5wkg+jlzsWbS+rxF+M74VHbs97vSlv4BcozFiJz4X5bULSA4eYGX
 oXA9P2hKK6KgmqvqgtJqqY0kXLZw0eX4QZpq8jDHudx3xROAALtPd9eRsYYVOyX92kIL
 ipHf9kmhuFfB2tYdh7dGrbY63rgvcPAYWqaUQJP2TLY0VsBJ6OeNaR9XpVWXYx45Upa0
 VlPceHOcu5NldcF798EeqbU4hh19dGE1AuMzsrLl6Ja/nT+zdtKMhRrWKSJ8XmYBzoXB
 onxai0vJ6XujhSgUcUcSeXY+yEBOPNTfxOzdQxMUu/fa284GHOB/cZn+eDAGaSjp7ZpT
 vr1g==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references;
 bh=bAeSUpBLbjOCEiLqRz8yHChict8yQeBE89i4qMRCUYE=;
 b=TC4SrMHIjwXb06qcWAgfUTLKOGizbQCJKNICiLIWm+9MSRLTGtkjCz2kHV5ohn3gcm
 jdrzHujrOhjPPv/+07yFdd3uI65bAO1pck8SJwL0iZjTJsr8JKZCjsPAv9+tAO32Onj+
 d2zFpCgJuDK35AEz1fjK3rhTZqHOWKh0dy86zDVbW8xu+nn9FAbf2tYiFyVifjdZEHkn
 grh2l51H3SMjC18hpilGbj3Ubpchiqy91UabhjKo/IFKnDwXje+lTpq1zPE7O++epmPY
 5o2RfMBYNHBf5eP5J2R4WUx+shxcO1TO3QGwu7SAkiXI/EPibg10RwgdFqDvsO0D8fe2
 O//A==
X-Gm-Message-State: AOAM532LOVgRd2Q1EEqZ4ak0rlDgp5Mu0RTCBjeLc+rHxCTWCBArwzn4
 dsybp186VweCmQgOt9LUZao=
X-Google-Smtp-Source: ABdhPJx783S1CRCdIOA53Z4HGpeqjGAjBbOjl5cElVHcw6E5vtxhwsc0gOevd4EBaRZuUaGtKPawbw==
X-Received: by 2002:a17:90a:bc4c:: with SMTP id t12mr14790488pjv.163.1605525235215;
 Mon, 16 Nov 2020 03:13:55 -0800 (PST)
Received: from localhost.localdomain ([192.55.54.40])
 by smtp.gmail.com with ESMTPSA id u24sm19486826pfm.81.2020.11.16.03.13.50
 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128);
 Mon, 16 Nov 2020 03:13:54 -0800 (PST)
From: Magnus Karlsson
To: magnus.karlsson@intel.com, bjorn.topel@intel.com, ast@kernel.org,
 daniel@iogearbox.net, netdev@vger.kernel.org, jonathan.lemon@gmail.com,
 kuba@kernel.org, john.fastabend@gmail.com
Cc: bpf@vger.kernel.org, jeffrey.t.kirsher@intel.com,
 anthony.l.nguyen@intel.com, maciej.fijalkowski@intel.com,
 maciejromanfijalkowski@gmail.com, intel-wired-lan@lists.osuosl.org
Subject: [PATCH bpf-next v3 3/5] xsk: introduce padding between more ring pointers
Date: Mon, 16 Nov 2020 12:12:45 +0100
Message-Id: <1605525167-14450-4-git-send-email-magnus.karlsson@gmail.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1605525167-14450-1-git-send-email-magnus.karlsson@gmail.com>
References: <1605525167-14450-1-git-send-email-magnus.karlsson@gmail.com>
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

From: Magnus Karlsson

Introduce one cache line worth of padding between the consumer pointer
and the flags field as well as between the flags field and the start
of the descriptors in all the lockless rings. This is so that the x86
HW adjacency prefetcher will not prefetch the adjacent pointer/field
when only one pointer/field is going to be used. This improves
throughput for the l2fwd sample app by 1% on my machine with HW
prefetching turned on in the BIOS.

Signed-off-by: Magnus Karlsson
Acked-by: John Fastabend
---
 net/xdp/xsk_queue.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index cdb9cf3..74fac80 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -18,9 +18,11 @@ struct xdp_ring {
 	/* Hinder the adjacent cache prefetcher to prefetch the consumer
 	 * pointer if the producer pointer is touched and vice versa.
 	 */
-	u32 pad ____cacheline_aligned_in_smp;
+	u32 pad1 ____cacheline_aligned_in_smp;
 	u32 consumer ____cacheline_aligned_in_smp;
+	u32 pad2 ____cacheline_aligned_in_smp;
 	u32 flags;
+	u32 pad3 ____cacheline_aligned_in_smp;
 };
 
 /* Used for the RX and TX queues for packets */

From patchwork Mon Nov 16 11:12:46 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Magnus Karlsson
X-Patchwork-Id: 1400853
Return-Path:
X-Original-To: patchwork-incoming-netdev@ozlabs.org
Delivered-To: patchwork-incoming-netdev@ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized)
 smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18;
 helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org;
 receiver=)
Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none)
 header.from=gmail.com
Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected)
 header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256
 header.s=20161025 header.b=o8zOWKeX; dkim-atps=neutral
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by ozlabs.org (Postfix) with ESMTP id 4CZRKV6zg2z9sSn
 for ; Mon, 16 Nov 2020 22:14:06 +1100 (AEDT)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1729674AbgKPLOB (ORCPT );
 Mon, 16 Nov 2020 06:14:01 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42460 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1728211AbgKPLOA (ORCPT );
 Mon, 16 Nov 2020 06:14:00 -0500
Received: from mail-pf1-x442.google.com (mail-pf1-x442.google.com
 [IPv6:2607:f8b0:4864:20::442]) by lindbergh.monkeyblade.net (Postfix)
 with ESMTPS id 8E2CCC0613CF; Mon, 16 Nov 2020 03:14:00 -0800 (PST)
Received: by mail-pf1-x442.google.com with SMTP id y7so13759052pfq.11;
 Mon, 16 Nov 2020 03:14:00 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=from:to:cc:subject:date:message-id:in-reply-to:references;
 bh=pZBIYR9Vsszru6gWE0NAdBHVbZCybbTOMnigiUKuqKA=;
 b=o8zOWKeXFeroKrE1LWWiYr/9QCAEfXcCx8ROTvETgOfOo3ACYv9TdiscoMRs/+Mzi+
 WSM6edIDjaHPrjuygOrchcoGvpnl+YqBaBsQaNHwHmwhLjePMQMzIuBoPbWZlH2QnxVB
 hdMl6b3/j0kBcmfYlTGZXD2LBQGJCfLbVkrjT5Lm4uOzURbTfgNy5/zgMTNXy2LY3GPl
 GmTe9PNXPlpryY7CWzt3mVHsicLZjU4Bb6TI0KHcauBGJgZY/O8/Iq0DYdY/KYt7F6LZ
 9Wn8fcgnKzNwUGM2JxBgUSd2ETlYalwQn16kG2Xgne7c6m84BqfjFSlHaMkWmlTPe0RH
 xlwA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references;
 bh=pZBIYR9Vsszru6gWE0NAdBHVbZCybbTOMnigiUKuqKA=;
 b=LdpCoME2ZT5S0oCQ0d1FdUmzF3QIoUF+JeQhaZJeUruCuaTSOs5H3+MXNiz7U3tyFA
 2zlAvjjmaOn4LK3Ui86J6HbAtPv+3SD1OXIW2f1ufFn6II3oiv6FXozagmohJcTZ4Cul
 TnxA/iYC7bRJTnGRM76nHJjpYWqt0UQr67GbIjPqivAHgNtIHvTKeKt+QwaXhsQZTbVG
 Jf/eRm4sB5LLu9gn7k6+w5cpKjBBEZYQybotR/OBsao7DlHKYCE+EA3Nu5xWUIncZjVR
 giNQB7oJsVcbP4kDMwIkeb8PPGUa+ZG/BnURPGbgw3I3iIG6xyUWIIHhCfnsZeg2a0ZH
 ZQjQ==
X-Gm-Message-State: AOAM533ZaaLYID+NxLswGbWu1Wh7BjHr/fPKIy3U+BZkrtta9URUJrkw
 bUKJhHSIq1o1itOF/9Ycrns=
X-Google-Smtp-Source: ABdhPJz8KgiVNYOz4K9DL7/5E04urJP/WDBKsJKwE6qHP6R2zsukEMXFyd9SaRlF4eBgvGyx3BJ9wA==
X-Received: by 2002:a17:90a:ea92:: with SMTP id h18mr3020886pjz.14.1605525240162;
 Mon, 16 Nov 2020 03:14:00 -0800 (PST)
Received: from localhost.localdomain ([192.55.54.40])
 by smtp.gmail.com with ESMTPSA id u24sm19486826pfm.81.2020.11.16.03.13.55
 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128);
 Mon, 16 Nov 2020 03:13:59 -0800 (PST)
From: Magnus Karlsson
To: magnus.karlsson@intel.com, bjorn.topel@intel.com, ast@kernel.org,
 daniel@iogearbox.net, netdev@vger.kernel.org, jonathan.lemon@gmail.com,
 kuba@kernel.org, john.fastabend@gmail.com
Cc: bpf@vger.kernel.org, jeffrey.t.kirsher@intel.com,
 anthony.l.nguyen@intel.com, maciej.fijalkowski@intel.com,
 maciejromanfijalkowski@gmail.com, intel-wired-lan@lists.osuosl.org
Subject: [PATCH bpf-next v3 4/5] xsk: introduce batched Tx descriptor interfaces
Date: Mon, 16 Nov 2020 12:12:46 +0100
Message-Id: <1605525167-14450-5-git-send-email-magnus.karlsson@gmail.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1605525167-14450-1-git-send-email-magnus.karlsson@gmail.com>
References: <1605525167-14450-1-git-send-email-magnus.karlsson@gmail.com>
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

From: Magnus Karlsson

Introduce batched descriptor interfaces in the xsk core code for the
Tx path, to be used by drivers to write a higher-performance code
path. This interface will be used by the i40e driver in the next
patch, though other drivers would likely benefit from it too.

Note that batching is only implemented for the common case when there
is only one socket bound to the same device and queue id. When this is
not the case, we fall back to the old non-batched version of the
function.

Signed-off-by: Magnus Karlsson
Acked-by: John Fastabend
---
 include/net/xdp_sock_drv.h |  7 ++++
 net/xdp/xsk.c              | 57 +++++++++++++++++++++++++++++
 net/xdp/xsk_queue.h        | 89 +++++++++++++++++++++++++++++++++++++++-------
 3 files changed, 140 insertions(+), 13 deletions(-)

diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
index 5b1ee8a..4e295541 100644
--- a/include/net/xdp_sock_drv.h
+++ b/include/net/xdp_sock_drv.h
@@ -13,6 +13,7 @@
 
 void xsk_tx_completed(struct xsk_buff_pool *pool, u32 nb_entries);
 bool xsk_tx_peek_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc);
+u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, struct xdp_desc *desc, u32 max);
 void xsk_tx_release(struct xsk_buff_pool *pool);
 struct xsk_buff_pool *xsk_get_pool_from_qid(struct net_device *dev,
 					    u16 queue_id);
@@ -128,6 +129,12 @@ static inline bool xsk_tx_peek_desc(struct xsk_buff_pool *pool,
 	return false;
 }
 
+static inline u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, struct xdp_desc *desc,
+						 u32 max)
+{
+	return 0;
+}
+
 static inline void xsk_tx_release(struct xsk_buff_pool *pool)
 {
 }
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index cfbec39..b014197 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -332,6 +332,63 @@ bool xsk_tx_peek_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc)
 }
 EXPORT_SYMBOL(xsk_tx_peek_desc);
 
+static u32 xsk_tx_peek_release_fallback(struct xsk_buff_pool *pool, struct xdp_desc *descs,
+					u32 max_entries)
+{
+	u32 nb_pkts = 0;
+
+	while (nb_pkts < max_entries && xsk_tx_peek_desc(pool, &descs[nb_pkts]))
+		nb_pkts++;
+
+	xsk_tx_release(pool);
+	return nb_pkts;
+}
+
+u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, struct xdp_desc *descs,
+				   u32 max_entries)
+{
+	struct xdp_sock *xs;
+	u32 nb_pkts;
+
+	rcu_read_lock();
+	if (!list_is_singular(&pool->xsk_tx_list)) {
+		/* Fallback to the non-batched version */
+		rcu_read_unlock();
+		return xsk_tx_peek_release_fallback(pool, descs, max_entries);
+	}
+
+	xs = list_first_or_null_rcu(&pool->xsk_tx_list, struct xdp_sock, tx_list);
+	if (!xs) {
+		nb_pkts = 0;
+		goto out;
+	}
+
+	nb_pkts = xskq_cons_peek_desc_batch(xs->tx, descs, pool, max_entries);
+	if (!nb_pkts) {
+		xs->tx->queue_empty_descs++;
+		goto out;
+	}
+
+	/* This is the backpressure mechanism for the Tx path. Try to
+	 * reserve space in the completion queue for all packets, but
+	 * if there are fewer slots available, just process that many
+	 * packets. This avoids having to implement any buffering in
+	 * the Tx path.
+	 */
+	nb_pkts = xskq_prod_reserve_addr_batch(pool->cq, descs, nb_pkts);
+	if (!nb_pkts)
+		goto out;
+
+	xskq_cons_release_n(xs->tx, nb_pkts);
+	__xskq_cons_release(xs->tx);
+	xs->sk.sk_write_space(&xs->sk);
+
+out:
+	rcu_read_unlock();
+	return nb_pkts;
+}
+EXPORT_SYMBOL(xsk_tx_peek_release_desc_batch);
+
 static int xsk_wakeup(struct xdp_sock *xs, u8 flags)
 {
 	struct net_device *dev = xs->dev;
diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index 74fac80..b936c46 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -199,6 +199,30 @@ static inline bool xskq_cons_read_desc(struct xsk_queue *q,
 	return false;
 }
 
+static inline u32 xskq_cons_read_desc_batch(struct xsk_queue *q,
+					    struct xdp_desc *descs,
+					    struct xsk_buff_pool *pool, u32 max)
+{
+	u32 cached_cons = q->cached_cons, nb_entries = 0;
+
+	while (cached_cons != q->cached_prod && nb_entries < max) {
+		struct xdp_rxtx_ring *ring = (struct xdp_rxtx_ring *)q->ring;
+		u32 idx = cached_cons & q->ring_mask;
+
+		descs[nb_entries] = ring->desc[idx];
+		if (unlikely(!xskq_cons_is_valid_desc(q, &descs[nb_entries], pool))) {
+			/* Skip the entry */
+			cached_cons++;
+			continue;
+		}
+
+		nb_entries++;
+		cached_cons++;
+	}
+
+	return nb_entries;
+}
+
 /* Functions for consumers */
 
 static inline void __xskq_cons_release(struct xsk_queue *q)
@@ -220,17 +244,22 @@ static inline void xskq_cons_get_entries(struct xsk_queue *q)
 	__xskq_cons_peek(q);
 }
 
-static inline bool xskq_cons_has_entries(struct xsk_queue *q, u32 cnt)
+static inline u32 xskq_cons_nb_entries(struct xsk_queue *q, u32 max)
 {
 	u32 entries = q->cached_prod - q->cached_cons;
 
-	if (entries >= cnt)
-		return true;
+	if (entries >= max)
+		return max;
 
 	__xskq_cons_peek(q);
 	entries = q->cached_prod - q->cached_cons;
 
-	return entries >= cnt;
+	return entries >= max ? max : entries;
+}
+
+static inline bool xskq_cons_has_entries(struct xsk_queue *q, u32 cnt)
+{
+	return xskq_cons_nb_entries(q, cnt) >= cnt ? true : false;
 }
 
 static inline bool xskq_cons_peek_addr_unchecked(struct xsk_queue *q, u64 *addr)
@@ -249,16 +278,28 @@ static inline bool xskq_cons_peek_desc(struct xsk_queue *q,
 	return xskq_cons_read_desc(q, desc, pool);
 }
 
+static inline u32 xskq_cons_peek_desc_batch(struct xsk_queue *q, struct xdp_desc *descs,
+					    struct xsk_buff_pool *pool, u32 max)
+{
+	u32 entries = xskq_cons_nb_entries(q, max);
+
+	return xskq_cons_read_desc_batch(q, descs, pool, entries);
+}
+
+/* To improve performance in the xskq_cons_release functions, only update local state here.
+ * Reflect this to global state when we get new entries from the ring in
+ * xskq_cons_get_entries() and whenever Rx or Tx processing are completed in the NAPI loop.
+ */
 static inline void xskq_cons_release(struct xsk_queue *q)
 {
-	/* To improve performance, only update local state here.
-	 * Reflect this to global state when we get new entries
-	 * from the ring in xskq_cons_get_entries() and whenever
-	 * Rx or Tx processing are completed in the NAPI loop.
-	 */
 	q->cached_cons++;
 }
 
+static inline void xskq_cons_release_n(struct xsk_queue *q, u32 cnt)
+{
+	q->cached_cons += cnt;
+}
+
 static inline bool xskq_cons_is_full(struct xsk_queue *q)
 {
 	/* No barriers needed since data is not accessed */
@@ -268,18 +309,23 @@ static inline bool xskq_cons_is_full(struct xsk_queue *q)
 
 /* Functions for producers */
 
-static inline bool xskq_prod_is_full(struct xsk_queue *q)
+static inline u32 xskq_prod_nb_free(struct xsk_queue *q, u32 max)
 {
 	u32 free_entries = q->nentries - (q->cached_prod - q->cached_cons);
 
-	if (free_entries)
-		return false;
+	if (free_entries >= max)
+		return max;
 
 	/* Refresh the local tail pointer */
 	q->cached_cons = READ_ONCE(q->ring->consumer);
 	free_entries = q->nentries - (q->cached_prod - q->cached_cons);
 
-	return !free_entries;
+	return free_entries >= max ? max : free_entries;
+}
+
+static inline bool xskq_prod_is_full(struct xsk_queue *q)
+{
+	return xskq_prod_nb_free(q, 1) ? false : true;
 }
 
 static inline int xskq_prod_reserve(struct xsk_queue *q)
@@ -304,6 +350,23 @@ static inline int xskq_prod_reserve_addr(struct xsk_queue *q, u64 addr)
 	return 0;
 }
 
+static inline u32 xskq_prod_reserve_addr_batch(struct xsk_queue *q, struct xdp_desc *descs,
+					       u32 max)
+{
+	struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring;
+	u32 nb_entries, i, cached_prod;
+
+	nb_entries = xskq_prod_nb_free(q, max);
+
+	/* A, matches D */
+	cached_prod = q->cached_prod;
+	for (i = 0; i < nb_entries; i++)
+		ring->desc[cached_prod++ & q->ring_mask] = descs[i].addr;
+	q->cached_prod = cached_prod;
+
+	return nb_entries;
+}
+
 static inline int xskq_prod_reserve_desc(struct xsk_queue *q,
 					 u64 addr, u32 len)
 {

From patchwork Mon Nov 16 11:12:47 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Magnus Karlsson
X-Patchwork-Id: 1400854
Return-Path:
X-Original-To: patchwork-incoming-netdev@ozlabs.org
Delivered-To: patchwork-incoming-netdev@ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized)
 smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18;
 helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org;
 receiver=)
Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none)
 header.from=gmail.com
Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected)
 header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256
 header.s=20161025 header.b=ZLwHPUQ4; dkim-atps=neutral
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by ozlabs.org (Postfix) with ESMTP id 4CZRKW6tRVz9sTL
 for ; Mon, 16 Nov 2020 22:14:07 +1100 (AEDT)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1729695AbgKPLOH (ORCPT );
 Mon, 16 Nov 2020 06:14:07 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42476 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1728211AbgKPLOH (ORCPT );
 Mon, 16 Nov
2020 06:14:07 -0500 Received: from mail-pf1-x441.google.com (mail-pf1-x441.google.com [IPv6:2607:f8b0:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 940CCC0613CF; Mon, 16 Nov 2020 03:14:05 -0800 (PST) Received: by mail-pf1-x441.google.com with SMTP id 10so13796179pfp.5; Mon, 16 Nov 2020 03:14:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=ERjlSJo/a6lM7+vUC4rxT4rHbrdNDAXqh1n1PrY0dJA=; b=ZLwHPUQ4wh4H2wJIO3q4ypIajQTGmAjeh9a4Zn7DopWiMmuWGwWXvfT+ttzXPRxgzD bgX/a1q487QdDtWcgMVnJI7xcbfKT3Gw0lh0y8pHrGZspw8iL+iGqIU/KaIMjVlrkUif U7GlhGYatXBEnUodHy5XpA29hl3Ex+Q8DNsvjcsfL3u9S5Jcdfv19yG8WLKGNvwRidQR auyHNeMzv4zJEGsnrFpattFewo18Dz37AnzVCvJVKi7eTrL2u97OJ7XBEmCNdGczRJ/3 ZCmMPd078YkS9qePKFR9IqdsFN8CHK42RwRbszeHXIA0qVAwTuXZxGXDnfjvSFIwuMV2 jPIg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=ERjlSJo/a6lM7+vUC4rxT4rHbrdNDAXqh1n1PrY0dJA=; b=eE2qTwPvIlUKCNrDN6Me9Mhn29LCxUW3i68yKyPYw++BCB/ul4Sat4KDGxvBIbetdG BD+gj90xlqnAHOEJHHLKYKs9r50C2akn+fTuaOrwIPHeTLIwsoB8OmU8ui3SUpZOqPqE 6qBHFjzy4QnexxeNck/a9OD2gZUMolJbdmmdgNAplbKyuTF6QMYwl+SE87psu7K7XPif xQ2mHD+OfS7Z27Na4XaOmUDfTo/Le3ATI20pDHM3WBAakultfdDIA9WmojDgSzAW6Mgo OjG8wRGretiBc0okTD0I2HksobHFFI5+/moM6Nv+7AJZrgRnP+kUtYWTTFROhnyBK1bE 0p4g== X-Gm-Message-State: AOAM530gGo61n1N6/U/sa4SmFhQnvYjrp1I+qUdtkB/i6RVB+h7bx/e6 J00LaUnaMyzz51dhDymfb7I5TXd+5Xas2YP0NzQ= X-Google-Smtp-Source: ABdhPJxQUQhAG6+YxxqyBDAplM6pCfT6+l4DiHqgtDmZnPAP6iUPaDNtJ1H3uT/1zScSykPybKJxtA== X-Received: by 2002:a17:90a:f0c7:: with SMTP id fa7mr15887151pjb.3.1605525245177; Mon, 16 Nov 2020 03:14:05 -0800 (PST) Received: from localhost.localdomain ([192.55.54.40]) by smtp.gmail.com with ESMTPSA id u24sm19486826pfm.81.2020.11.16.03.14.00 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Mon, 16 Nov 2020 
03:14:04 -0800 (PST) From: Magnus Karlsson To: magnus.karlsson@intel.com, bjorn.topel@intel.com, ast@kernel.org, daniel@iogearbox.net, netdev@vger.kernel.org, jonathan.lemon@gmail.com, kuba@kernel.org, john.fastabend@gmail.com Cc: bpf@vger.kernel.org, jeffrey.t.kirsher@intel.com, anthony.l.nguyen@intel.com, maciej.fijalkowski@intel.com, maciejromanfijalkowski@gmail.com, intel-wired-lan@lists.osuosl.org Subject: [PATCH bpf-next v3 5/5] i40e: use batched xsk Tx interfaces to increase performance Date: Mon, 16 Nov 2020 12:12:47 +0100 Message-Id: <1605525167-14450-6-git-send-email-magnus.karlsson@gmail.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1605525167-14450-1-git-send-email-magnus.karlsson@gmail.com> References: <1605525167-14450-1-git-send-email-magnus.karlsson@gmail.com> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Magnus Karlsson Use the new batched xsk interfaces for the Tx path in the i40e driver to improve performance. On my machine, this yields a throughput increase of 4% for the l2fwd sample app in xdpsock. If we instead look only at the Tx part, this patch set increases throughput by more than 20%. Note that I had to explicitly unroll the inner loop with a pragma to reach this performance level. The pragma is honored by both clang and gcc and should be ignored by versions that do not support it. Using the -funroll-loops compiler command-line switch on the source file instead unrolled loops at a higher level, which led to a performance decrease rather than an increase. 
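For readers who want to try the pragma approach outside the driver, here is a minimal userspace sketch of it. The macro mirrors the one this patch adds to i40e_xsk.h, with defined() guards added for standalone use; struct demo_desc and demo_batch_bytes() are hypothetical names made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PKTS_PER_BATCH 4

/* Compiler-specific unroll hint; falls back to a plain for loop when
 * neither clang nor gcc >= 8 is in use.
 */
#if defined(__clang__)
#define loop_unrolled_for _Pragma("clang loop unroll_count(4)") for
#elif defined(__GNUC__) && __GNUC__ >= 8
#define loop_unrolled_for _Pragma("GCC unroll 4") for
#else
#define loop_unrolled_for for
#endif

/* Hypothetical stand-in for struct xdp_desc; only the fields used here. */
struct demo_desc {
	uint64_t addr;
	uint32_t len;
};

/* Sum the lengths of exactly PKTS_PER_BATCH descriptors. The fixed trip
 * count is what lets the compiler unroll the loop completely.
 */
unsigned int demo_batch_bytes(const struct demo_desc *desc)
{
	unsigned int total = 0;
	uint32_t i;

	assert(desc != NULL);

	loop_unrolled_for(i = 0; i < PKTS_PER_BATCH; i++)
		total += desc[i].len;

	return total;
}
```

The unroll count in the pragma strings is written out as a literal 4, so it has to be kept in sync with PKTS_PER_BATCH by hand.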
Signed-off-by: Magnus Karlsson Acked-by: John Fastabend --- drivers/net/ethernet/intel/i40e/i40e_txrx.c | 11 +++ drivers/net/ethernet/intel/i40e/i40e_txrx.h | 1 + drivers/net/ethernet/intel/i40e/i40e_xsk.c | 119 ++++++++++++++++++++-------- drivers/net/ethernet/intel/i40e/i40e_xsk.h | 16 ++++ 4 files changed, 112 insertions(+), 35 deletions(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c index d43ce13..c21548c 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c @@ -676,6 +676,8 @@ void i40e_free_tx_resources(struct i40e_ring *tx_ring) i40e_clean_tx_ring(tx_ring); kfree(tx_ring->tx_bi); tx_ring->tx_bi = NULL; + kfree(tx_ring->xsk_descs); + tx_ring->xsk_descs = NULL; if (tx_ring->desc) { dma_free_coherent(tx_ring->dev, tx_ring->size, @@ -1277,6 +1279,13 @@ int i40e_setup_tx_descriptors(struct i40e_ring *tx_ring) if (!tx_ring->tx_bi) goto err; + if (ring_is_xdp(tx_ring)) { + tx_ring->xsk_descs = kcalloc(I40E_MAX_NUM_DESCRIPTORS, sizeof(*tx_ring->xsk_descs), + GFP_KERNEL); + if (!tx_ring->xsk_descs) + goto err; + } + u64_stats_init(&tx_ring->syncp); /* round up to nearest 4K */ @@ -1300,6 +1309,8 @@ int i40e_setup_tx_descriptors(struct i40e_ring *tx_ring) return 0; err: + kfree(tx_ring->xsk_descs); + tx_ring->xsk_descs = NULL; kfree(tx_ring->tx_bi); tx_ring->tx_bi = NULL; return -ENOMEM; diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h index 2feed92..5f531b1 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h @@ -389,6 +389,7 @@ struct i40e_ring { struct i40e_channel *ch; struct xdp_rxq_info xdp_rxq; struct xsk_buff_pool *xsk_pool; + struct xdp_desc *xsk_descs; /* For storing descriptors in the AF_XDP ZC path */ } ____cacheline_internodealigned_in_smp; static inline bool ring_uses_build_skb(struct i40e_ring *ring) diff --git 
a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c index 20d2632..4c44f49 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c @@ -2,6 +2,7 @@ /* Copyright(c) 2018 Intel Corporation. */ #include +#include #include #include @@ -381,6 +382,69 @@ int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget) return failure ? budget : (int)total_rx_packets; } +static void i40e_xmit_pkt(struct i40e_ring *xdp_ring, struct xdp_desc *desc, + unsigned int *total_bytes) +{ + struct i40e_tx_desc *tx_desc; + dma_addr_t dma; + + dma = xsk_buff_raw_get_dma(xdp_ring->xsk_pool, desc->addr); + xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_pool, dma, desc->len); + + tx_desc = I40E_TX_DESC(xdp_ring, xdp_ring->next_to_use++); + tx_desc->buffer_addr = cpu_to_le64(dma); + tx_desc->cmd_type_offset_bsz = build_ctob(I40E_TX_DESC_CMD_ICRC | I40E_TX_DESC_CMD_EOP, + 0, desc->len, 0); + + *total_bytes += desc->len; +} + +static void i40e_xmit_pkt_batch(struct i40e_ring *xdp_ring, struct xdp_desc *desc, + unsigned int *total_bytes) +{ + u16 ntu = xdp_ring->next_to_use; + struct i40e_tx_desc *tx_desc; + dma_addr_t dma; + u32 i; + + loop_unrolled_for(i = 0; i < PKTS_PER_BATCH; i++) { + dma = xsk_buff_raw_get_dma(xdp_ring->xsk_pool, desc[i].addr); + xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_pool, dma, desc[i].len); + + tx_desc = I40E_TX_DESC(xdp_ring, ntu++); + tx_desc->buffer_addr = cpu_to_le64(dma); + tx_desc->cmd_type_offset_bsz = build_ctob(I40E_TX_DESC_CMD_ICRC | + I40E_TX_DESC_CMD_EOP, + 0, desc[i].len, 0); + + *total_bytes += desc[i].len; + } + + xdp_ring->next_to_use = ntu; +} + +static void i40e_fill_tx_hw_ring(struct i40e_ring *xdp_ring, struct xdp_desc *descs, u32 nb_pkts, + unsigned int *total_bytes) +{ + u32 batched, leftover, i; + + batched = nb_pkts & ~(PKTS_PER_BATCH - 1); + leftover = nb_pkts & (PKTS_PER_BATCH - 1); + for (i = 0; i < batched; i += PKTS_PER_BATCH) + 
i40e_xmit_pkt_batch(xdp_ring, &descs[i], total_bytes); + for (i = batched; i < batched + leftover; i++) + i40e_xmit_pkt(xdp_ring, &descs[i], total_bytes); +} + +static void i40e_set_rs_bit(struct i40e_ring *xdp_ring) +{ + u16 ntu = xdp_ring->next_to_use ? xdp_ring->next_to_use - 1 : xdp_ring->count - 1; + struct i40e_tx_desc *tx_desc; + + tx_desc = I40E_TX_DESC(xdp_ring, ntu); + tx_desc->cmd_type_offset_bsz |= (I40E_TX_DESC_CMD_RS << I40E_TXD_QW1_CMD_SHIFT); +} + /** * i40e_xmit_zc - Performs zero-copy Tx AF_XDP * @xdp_ring: XDP Tx ring @@ -390,45 +454,30 @@ int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget) **/ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget) { - unsigned int sent_frames = 0, total_bytes = 0; - struct i40e_tx_desc *tx_desc = NULL; - struct xdp_desc desc; - dma_addr_t dma; - - while (budget-- > 0) { - if (!xsk_tx_peek_desc(xdp_ring->xsk_pool, &desc)) - break; - - dma = xsk_buff_raw_get_dma(xdp_ring->xsk_pool, desc.addr); - xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_pool, dma, - desc.len); - - tx_desc = I40E_TX_DESC(xdp_ring, xdp_ring->next_to_use); - tx_desc->buffer_addr = cpu_to_le64(dma); - tx_desc->cmd_type_offset_bsz = - build_ctob(I40E_TX_DESC_CMD_ICRC - | I40E_TX_DESC_CMD_EOP, - 0, desc.len, 0); - - sent_frames++; - total_bytes += desc.len; - - xdp_ring->next_to_use++; - if (xdp_ring->next_to_use == xdp_ring->count) - xdp_ring->next_to_use = 0; + struct xdp_desc *descs = xdp_ring->xsk_descs; + u32 nb_pkts, nb_processed = 0; + unsigned int total_bytes = 0; + + nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, descs, budget); + if (!nb_pkts) + return false; + + if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) { + nb_processed = xdp_ring->count - xdp_ring->next_to_use; + i40e_fill_tx_hw_ring(xdp_ring, descs, nb_processed, &total_bytes); + xdp_ring->next_to_use = 0; } - if (tx_desc) { - /* Request an interrupt for the last frame and bump tail ptr. 
*/ - tx_desc->cmd_type_offset_bsz |= (I40E_TX_DESC_CMD_RS << - I40E_TXD_QW1_CMD_SHIFT); - i40e_xdp_ring_update_tail(xdp_ring); + i40e_fill_tx_hw_ring(xdp_ring, &descs[nb_processed], nb_pkts - nb_processed, + &total_bytes); - xsk_tx_release(xdp_ring->xsk_pool); - i40e_update_tx_stats(xdp_ring, sent_frames, total_bytes); - } + /* Request an interrupt for the last frame and bump tail ptr. */ + i40e_set_rs_bit(xdp_ring); + i40e_xdp_ring_update_tail(xdp_ring); + + i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes); - return !!budget; + return true; } /** diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.h b/drivers/net/ethernet/intel/i40e/i40e_xsk.h index 7adfd85..ea88f45 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.h +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.h @@ -4,6 +4,22 @@ #ifndef _I40E_XSK_H_ #define _I40E_XSK_H_ +/* This value should match the pragma in the loop_unrolled_for + * macro. Why 4? It is strictly empirical. It seems to be a good + * compromise between the advantage of having simultaneous outstanding + * reads to the DMA array that can hide each other's latency and the + * disadvantage of having a larger code path. + */ +#define PKTS_PER_BATCH 4 + +#ifdef __clang__ +#define loop_unrolled_for _Pragma("clang loop unroll_count(4)") for +#elif __GNUC__ >= 8 +#define loop_unrolled_for _Pragma("GCC unroll 4") for +#else +#define loop_unrolled_for for +#endif + struct i40e_vsi; struct xsk_buff_pool; struct zero_copy_allocator;
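
The index arithmetic in i40e_fill_tx_hw_ring() and i40e_xmit_zc() can be checked in isolation. The sketch below is a userspace illustration under the assumption that PKTS_PER_BATCH is a power of two, which the masks rely on. struct demo_split, demo_split_pkts() and demo_pkts_before_wrap() are hypothetical helpers invented for this example and do not exist in the driver.

```c
#include <assert.h>
#include <stdint.h>

#define PKTS_PER_BATCH 4 /* must be a power of two for the masks below */

/* Hypothetical result type, used only for illustration. */
struct demo_split {
	uint32_t batched;  /* packets handled in full batches */
	uint32_t leftover; /* packets handled one at a time */
};

/* Same split as in i40e_fill_tx_hw_ring(): round nb_pkts down to a
 * multiple of PKTS_PER_BATCH and keep the remainder separately.
 */
struct demo_split demo_split_pkts(uint32_t nb_pkts)
{
	struct demo_split s;

	s.batched = nb_pkts & ~(uint32_t)(PKTS_PER_BATCH - 1);
	s.leftover = nb_pkts & (PKTS_PER_BATCH - 1);
	assert(s.batched + s.leftover == nb_pkts);

	return s;
}

/* Same condition as in i40e_xmit_zc(): if the burst reaches the end of
 * the ring, return how many descriptors go in before next_to_use is
 * reset to 0; otherwise return 0 and the burst is written in one pass.
 */
uint32_t demo_pkts_before_wrap(uint32_t next_to_use, uint32_t count,
			       uint32_t nb_pkts)
{
	if (next_to_use + nb_pkts >= count)
		return count - next_to_use;

	return 0;
}
```

For example, 11 packets split into 8 batched and 3 leftover, and a burst of 5 starting at index 510 of a 512-entry ring places 2 descriptors before the wrap and 3 after it.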