From patchwork Wed Dec  7 14:26:49 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: solomon <shanwei88@gmail.com>
X-Patchwork-Id: 129971
X-Patchwork-Delegate: davem@davemloft.net
Return-Path: <netdev-owner@vger.kernel.org>
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by ozlabs.org (Postfix) with ESMTP id 98C9F1007D1
	for <patchwork-incoming@ozlabs.org>;
	Thu,  8 Dec 2011 01:24:51 +1100 (EST)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756087Ab1LGOYq (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);
	Wed, 7 Dec 2011 09:24:46 -0500
Received: from mail-qy0-f174.google.com ([209.85.216.174]:47965 "EHLO
	mail-qy0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1755798Ab1LGOYo (ORCPT
	<rfc822;netdev@vger.kernel.org>); Wed, 7 Dec 2011 09:24:44 -0500
Received: by qcqz2 with SMTP id z2so364009qcq.19
	for <multiple recipients>; Wed, 07 Dec 2011 06:24:44 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=message-id:date:from:user-agent:mime-version:to:cc:subject
	:content-type:content-transfer-encoding;
	bh=tKH3JeQLTH58cTGyRGmXGEAqlU98wqVoB5kzDqmrApw=;
	b=p3YpXzuqm9vRBG3WtAc4Wq88zUXo8QCwMlddINk9RQbb6+MgdgmK0NkrQ8tjDnj3x/
	vgHjtDbStu9CXgkB4jIOHEbcpORwMycQ0N62MC3dUZ+uJDK/Z1bjhFBclhviJhufypKX
	1bYGrHqc0pYcPcc+zrrnJBmq/W1rmv+bpxMEU=
Received: by 10.50.106.226 with SMTP id gx2mr20173777igb.13.1323267883917;
	Wed, 07 Dec 2011 06:24:43 -0800 (PST)
Received: from [127.0.0.1] ([121.14.96.125])
	by mx.google.com with ESMTPS id l28sm7390624ibc.3.2011.12.07.06.24.39
	(version=SSLv3 cipher=OTHER); Wed, 07 Dec 2011 06:24:42 -0800 (PST)
Message-ID: <4EDF77A9.4040808@gmail.com>
Date: Wed, 07 Dec 2011 22:26:49 +0800
From: Shan Wei <shanwei88@gmail.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1;
	rv:8.0) Gecko/20111105 Thunderbird/8.0
MIME-Version: 1.0
To: "Randy Dunlap (maintainer:DOCUMENTATION)" <rdunlap@xenotime.net>,
	David Miller <davem@davemloft.net>, willemb@google.com,
	benjamin.poirier@gmail.com,
	=?GB2312?B?taXOwA==?= <shanwei88@gmail.com>, therbert@google.com
CC: linux-doc@vger.kernel.org,
	Network Developer Mailing List <netdev@vger.kernel.org>
Subject: [Patch V2 ] net: doc: cleanup Documentation/networking/scaling.txt
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID: <netdev.vger.kernel.org>
X-Mailing-List: netdev@vger.kernel.org

1) Fix some typos.
2) Convert full-width punctuation (e.g. ’ and “) to half-width ASCII
   so that it displays correctly on a console.

Signed-off-by: Shan Wei <shanwei88@gmail.com>
---
v2: fix the patch, which was broken in patchwork.

I am unsure about the following passage. As far as I can tell, no variable
in rps_dev_flow_table or softnet_data records the length of the current
backlog; only the last_qtail variable points to the tail of the backlog.

"The counter in rps_dev_flow_table values records the length of the current
 CPU's backlog when a packet in this flow was last enqueued. "

If I am missing something, please correct me.
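
For reference, the counter scheme that the documentation text describes
(a head counter incremented on dequeue, a tail counter computed as head
counter + queue length, and a per-flow record of the tail at last enqueue)
can be sketched roughly as below. This is only an illustration of the
documented behaviour, not kernel code: the struct names, function names and
the NO_CPU marker are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define NO_CPU 0xffffu /* hypothetical "unset" marker, not a kernel symbol */

/* Hypothetical per-CPU backlog: head counts dequeues, qlen is the length. */
struct backlog {
    uint32_t head;
    uint32_t qlen;
    bool online;
};

/* Hypothetical per-flow entry: the CPU currently handling the flow and the
 * tail counter recorded when a packet of the flow was last enqueued. */
struct flow_entry {
    uint16_t cpu;
    uint32_t last_tail;
};

/* Tail counter as described in the text: head counter + queue length. */
static uint32_t backlog_tail(const struct backlog *b)
{
    return b->head + b->qlen;
}

/* Enqueue one packet of the flow on its current CPU and record the tail. */
static void flow_enqueue(struct flow_entry *f, struct backlog *queues)
{
    struct backlog *b = &queues[f->cpu];

    b->qlen++;
    f->last_tail = backlog_tail(b);
}

/* Dequeue one packet: the head counter is incremented on dequeue. */
static void backlog_dequeue(struct backlog *b)
{
    if (b->qlen > 0) {
        b->qlen--;
        b->head++;
    }
}

/* The three documented conditions under which the flow may move to the
 * desired CPU: current CPU unset, offline, or already drained past the
 * last packet of this flow. */
static bool flow_may_migrate(const struct flow_entry *f,
                             const struct backlog *queues)
{
    if (f->cpu == NO_CPU)
        return true;
    if (!queues[f->cpu].online)
        return true;
    return queues[f->cpu].head >= f->last_tail;
}
```

In this reading the per-flow value is a position in the queue (a tail
counter), not a queue length, and the flow may only move once the old
CPU's head counter has caught up with it.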
---
 Documentation/networking/scaling.txt |   26 +++++++++++++-------------
 1 files changed, 13 insertions(+), 13 deletions(-)

 that is not the focus of these techniques.
@@ -42,7 +42,7 @@ indirection table and reading the corresponding value.
 
 Some advanced NICs allow steering packets to queues based on
 programmable filters. For example, webserver bound TCP port 80 packets
-can be directed to their own receive queue. Such “n-tuple” filters can
+can be directed to their own receive queue. Such "n-tuple" filters can
 be configured from ethtool (--config-ntuple).
 
 ==== RSS Configuration
@@ -104,7 +104,7 @@ RSS. Being in software, it is necessarily called later in the datapath.
 Whereas RSS selects the queue and hence CPU that will run the hardware
 interrupt handler, RPS selects the CPU to perform protocol processing
 above the interrupt handler. This is accomplished by placing the packet
-on the desired CPU’s backlog queue and waking up the CPU for processing.
+on the desired CPU's backlog queue and waking up the CPU for processing.
 RPS has some advantages over RSS: 1) it can be used with any NIC,
 2) software filters can easily be added to hash over new protocols,
 3) it does not increase hardware device interrupt rate (although it does
@@ -116,20 +116,20 @@ netif_receive_skb(). These call the get_rps_cpu() function, which
 selects the queue that should process a packet.
 
 The first step in determining the target CPU for RPS is to calculate a
-flow hash over the packet’s addresses or ports (2-tuple or 4-tuple hash
+flow hash over the packet's addresses or ports (2-tuple or 4-tuple hash
 depending on the protocol). This serves as a consistent hash of the
 associated flow of the packet. The hash is either provided by hardware
 or will be computed in the stack. Capable hardware can pass the hash in
 the receive descriptor for the packet; this would usually be the same
 hash used for RSS (e.g. computed Toeplitz hash). The hash is saved in
 skb->rx_hash and can be used elsewhere in the stack as a hash of the
-packet’s flow.
+packet's flow.
 
 Each receive hardware queue has an associated list of CPUs to which
 RPS may enqueue packets for processing. For each received packet,
 an index into the list is computed from the flow hash modulo the size
 of the list. The indexed CPU is the target for processing the packet,
-and the packet is queued to the tail of that CPU’s backlog queue. At
+and the packet is queued to the tail of that CPU's backlog queue. At
 the end of the bottom half routine, IPIs are sent to any CPUs for which
 packets have been queued to their backlog queue. The IPI wakes backlog
 processing on the remote CPU, and any queued packets are then processed
@@ -208,7 +208,7 @@ The counter in rps_dev_flow_table values records the length of the current
 CPU's backlog when a packet in this flow was last enqueued. Each backlog
 queue has a head counter that is incremented on dequeue. A tail counter
 is computed as head counter + queue length. In other words, the counter
-in rps_dev_flow_table[i] records the last element in flow i that has
+in rps_dev_flow[i] records the last element in flow i that has
 been enqueued onto the currently designated CPU for flow i (of course,
 entry i is actually selected by hash and multiple flows may hash to the
 same entry i).
@@ -218,13 +218,13 @@ CPU for packet processing (from get_rps_cpu()) the rps_sock_flow table
 and the rps_dev_flow table of the queue that the packet was received on
 are compared. If the desired CPU for the flow (found in the
 rps_sock_flow table) matches the current CPU (found in the rps_dev_flow
-table), the packet is enqueued onto that CPU’s backlog. If they differ,
+table), the packet is enqueued onto that CPU's backlog. If they differ,
 the current CPU is updated to match the desired CPU if one of the
 following is true:
 
 - The current CPU's queue head counter >= the recorded tail counter
   value in rps_dev_flow[i]
-- The current CPU is unset (equal to NR_CPUS)
+- The current CPU is unset (equal to RPS_NO_CPU)
 - The current CPU is offline
 
 After this check, the packet is sent to the (possibly updated) current
@@ -235,7 +235,7 @@ CPU.
 
 ==== RFS Configuration
 
-RFS is only available if the kconfig symbol CONFIG_RFS is enabled (on
+RFS is only available if the kconfig symbol CONFIG_RPS is enabled (on
 by default for SMP). The functionality remains disabled until explicitly
 configured. The number of entries in the global flow table is set through:
 
@@ -258,7 +258,7 @@ For a single queue device, the rps_flow_cnt value for the single queue
 would normally be configured to the same value as rps_sock_flow_entries.
 For a multi-queue device, the rps_flow_cnt for each queue might be
 configured as rps_sock_flow_entries / N, where N is the number of
-queues. So for instance, if rps_flow_entries is set to 32768 and there
+queues. So for instance, if rps_sock_flow_entries is set to 32768 and there
 are 16 configured receive queues, rps_flow_cnt for each queue might be
 configured as 2048.
 
@@ -272,7 +272,7 @@ the application thread consuming the packets of each flow is running.
 Accelerated RFS should perform better than RFS since packets are sent
 directly to a CPU local to the thread consuming the data. The target CPU
 will either be the same CPU where the application runs, or at least a CPU
-which is local to the application thread’s CPU in the cache hierarchy.
+which is local to the application thread's CPU in the cache hierarchy.
 
 To enable accelerated RFS, the networking stack calls the
 ndo_rx_flow_steer driver function to communicate the desired hardware
@@ -285,7 +285,7 @@ The hardware queue for a flow is derived from the CPU recorded in
 rps_dev_flow_table. The stack consults a CPU to hardware queue map which
 is maintained by the NIC driver. This is an auto-generated reverse map of
 the IRQ affinity table shown by /proc/interrupts. Drivers can use
-functions in the cpu_rmap (“CPU affinity reverse map”) kernel library
+functions in the cpu_rmap ("CPU affinity reverse map") kernel library
 to populate the map. For each CPU, the corresponding queue in the map is
 set to be one whose processing CPU is closest in cache locality.

diff --git a/Documentation/networking/scaling.txt b/Documentation/networking/scaling.txt
index a177de2..1215fcc 100644
--- a/Documentation/networking/scaling.txt
+++ b/Documentation/networking/scaling.txt
@@ -26,7 +26,7 @@ queues to distribute processing among CPUs. The NIC distributes packets by
 applying a filter to each packet that assigns it to one of a small number
 of logical flows. Packets for each flow are steered to a separate receive
 queue, which in turn can be processed by separate CPUs. This mechanism is
-generally known as “Receive-side Scaling” (RSS). The goal of RSS and
+generally known as "Receive-side Scaling" (RSS). The goal of RSS and
 the other scaling techniques is to increase performance uniformly.
 Multi-queue distribution can also be used for traffic prioritization, but