From patchwork Tue Sep 13 07:00:06 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Marcin Wojtas <mw@semihalf.com>
X-Patchwork-Id: 669200
X-Patchwork-Delegate: davem@davemloft.net
From: Marcin Wojtas <mw@semihalf.com>
To: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	netdev@vger.kernel.org
Cc: davem@davemloft.net, linux@arm.linux.org.uk,
	sebastian.hesselbarth@gmail.com, andrew@lunn.ch, jason@lakedaemon.net,
	thomas.petazzoni@free-electrons.com, gregory.clement@free-electrons.com,
	nadavh@marvell.com, alior@marvell.com, simon.guinot@sequanux.org,
	nitroshift@yahoo.com, mw@semihalf.com, jaz@semihalf.com
Subject: [PATCH net-next 2/2] net: mvneta: add BQL support
Date: Tue, 13 Sep 2016 09:00:06 +0200
Message-Id: <1473750006-21199-3-git-send-email-mw@semihalf.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1473750006-21199-1-git-send-email-mw@semihalf.com>
References: <1473750006-21199-1-git-send-email-mw@semihalf.com>

Tests showed that when the whole bandwidth is consumed, the latency of
various kinds of traffic can reach high values. With a saturated link
(e.g. with iperf running from target to host), a simple ping could take
a significant amount of time. BQL proved to improve this situation when
implemented in the mvneta driver. Measurements of ping latency for
three link speeds:

Speed (Mbps) | Latency w/o BQL | Latency with BQL
          10 |         7-14 ms |           3.5 ms
         100 |         2-12 ms |           0.6 ms
        1000 |   often timeout |        up to 2 ms

The latency decrease above comes at a slight performance cost: -4 kpps
(-1.4%) when pushing 64B packets via two bridged interfaces of Armada
38x. For 1500B packets in the same setup, the mpstat tool showed +8% of
CPU occupation (default affinity, second CPU idle). This cost seems
reasonable to accept, considering the latency improvement.

This commit adds the byte queue limit mechanism to the mvneta driver.
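For reference, the BQL driver pattern this patch instantiates boils down
to three calls. Below is a minimal sketch using hypothetical foo_*
functions and a single-queue simplification (queue index 0) - not the
mvneta code itself; only netdev_tx_sent_queue(),
netdev_tx_completed_queue() and netdev_tx_reset_queue() from
<linux/netdevice.h> are the real kernel API. The actual mvneta hook
points are visible in the diff that follows.

#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* Transmit path: tell BQL how many bytes were queued to the HW. */
static netdev_tx_t foo_xmit(struct sk_buff *skb, struct net_device *dev)
{
	struct netdev_queue *nq =
		netdev_get_tx_queue(dev, skb_get_queue_mapping(skb));

	/* ... fill TX descriptors, map DMA, kick the hardware ... */

	netdev_tx_sent_queue(nq, skb->len);
	return NETDEV_TX_OK;
}

/* Completion path: report how many packets/bytes the HW finished. */
static void foo_tx_done(struct net_device *dev, unsigned int pkts_compl,
			unsigned int bytes_compl)
{
	struct netdev_queue *nq = netdev_get_tx_queue(dev, 0);

	netdev_tx_completed_queue(nq, pkts_compl, bytes_compl);
}

/* Queue teardown: clear BQL state, otherwise the next use of the
 * queue would start with stale in-flight accounting.
 */
static void foo_txq_deinit(struct net_device *dev)
{
	netdev_tx_reset_queue(netdev_get_tx_queue(dev, 0));
}

Once these hooks are in place, the BQL algorithm sizes the queue
automatically; its state can be inspected at runtime under
/sys/class/net/<iface>/queues/tx-<n>/byte_queue_limits/.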
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
---
 drivers/net/ethernet/marvell/mvneta.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index b9dccea..bb5df35 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -1719,8 +1719,10 @@ static struct mvneta_tx_queue *mvneta_tx_done_policy(struct mvneta_port *pp,
 
 /* Free tx queue skbuffs */
 static void mvneta_txq_bufs_free(struct mvneta_port *pp,
-				 struct mvneta_tx_queue *txq, int num)
+				 struct mvneta_tx_queue *txq, int num,
+				 struct netdev_queue *nq)
 {
+	unsigned int bytes_compl = 0, pkts_compl = 0;
 	int i;
 
 	for (i = 0; i < num; i++) {
@@ -1728,6 +1730,11 @@ static void mvneta_txq_bufs_free(struct mvneta_port *pp,
 			txq->txq_get_index;
 		struct sk_buff *skb = txq->tx_skb[txq->txq_get_index];
 
+		if (skb) {
+			bytes_compl += skb->len;
+			pkts_compl++;
+		}
+
 		mvneta_txq_inc_get(txq);
 
 		if (!IS_TSO_HEADER(txq, tx_desc->buf_phys_addr))
@@ -1738,6 +1745,8 @@ static void mvneta_txq_bufs_free(struct mvneta_port *pp,
 			continue;
 		dev_kfree_skb_any(skb);
 	}
+
+	netdev_tx_completed_queue(nq, pkts_compl, bytes_compl);
 }
 
 /* Handle end of transmission */
@@ -1751,7 +1760,7 @@ static void mvneta_txq_done(struct mvneta_port *pp,
 	if (!tx_done)
 		return;
 
-	mvneta_txq_bufs_free(pp, txq, tx_done);
+	mvneta_txq_bufs_free(pp, txq, tx_done, nq);
 
 	txq->count -= tx_done;
 
@@ -2358,6 +2367,8 @@ out:
 		struct mvneta_pcpu_stats *stats = this_cpu_ptr(pp->stats);
 		struct netdev_queue *nq = netdev_get_tx_queue(dev, txq_id);
 
+		netdev_tx_sent_queue(nq, len);
+
 		txq->count += frags;
 		if (txq->count >= txq->tx_stop_threshold)
 			netif_tx_stop_queue(nq);
@@ -2385,9 +2396,10 @@ static void mvneta_txq_done_force(struct mvneta_port *pp,
 				  struct mvneta_tx_queue *txq)
 
 {
+	struct netdev_queue *nq = netdev_get_tx_queue(pp->dev, txq->id);
 	int tx_done = txq->count;
 
-	mvneta_txq_bufs_free(pp, txq, tx_done);
+	mvneta_txq_bufs_free(pp, txq, tx_done, nq);
 
 	/* reset txq */
 	txq->count = 0;
@@ -2884,6 +2896,8 @@ static int mvneta_txq_init(struct mvneta_port *pp,
 static void mvneta_txq_deinit(struct mvneta_port *pp,
 			      struct mvneta_tx_queue *txq)
 {
+	struct netdev_queue *nq = netdev_get_tx_queue(pp->dev, txq->id);
+
 	kfree(txq->tx_skb);
 
 	if (txq->tso_hdrs)
@@ -2895,6 +2909,8 @@ static void mvneta_txq_deinit(struct mvneta_port *pp,
 			  txq->size * MVNETA_DESC_ALIGNED_SIZE,
 			  txq->descs, txq->descs_phys);
 
+	netdev_tx_reset_queue(nq);
+
 	txq->descs = NULL;
 	txq->last_desc = 0;
 	txq->next_desc_to_proc = 0;