Message ID | 20130611142428.17879.33582.stgit@ladj378.jer.intel.com
---|---
State | Changes Requested, archived
Delegated to: | David Miller
From: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Date: Tue, 11 Jun 2013 17:24:28 +0300

> depends on X86_TSC

Wait a second, I didn't notice this before. There needs to be a better way to test for the accuracy you need, or if the issue is lack of a proper API for cycle counter reading, fix that rather than add ugly arch specific dependencies to generic networking code.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 12 Jun 2013 15:12:05 -0700 (PDT), David Miller <davem@davemloft.net> wrote:

> From: Eliezer Tamir <eliezer.tamir@linux.intel.com>
> Date: Tue, 11 Jun 2013 17:24:28 +0300
>
> > depends on X86_TSC
>
> Wait a second, I didn't notice this before. There needs to be a better
> way to test for the accuracy you need, or if the issue is lack of a proper
> API for cycle counter reading, fix that rather than add ugly arch
> specific dependencies to generic networking code.

This should be sched_clock(), rather than direct TSC access. Also, any code using the TSC or sched_clock() has to be carefully audited to deal with clocks running at different rates on different CPUs. Basically, a value is only meaningful on the same CPU.
On 13/06/2013 05:01, Stephen Hemminger wrote:

> This should be sched_clock(), rather than direct TSC access.
> Also any code using TSC or sched_clock has to be carefully audited to deal with
> clocks running at different rates on different CPUs. Basically a value is only
> meaningful on the same CPU.

OK,

If we convert to sched_clock(), would adding a define such as HAVE_HIGH_PRECISION_CLOCK to architectures that have both a high-precision clock and a 64-bit cycles_t be a good solution?

(If not, any other suggestion?)
On 06/13/2013 04:13 AM, Eliezer Tamir wrote:

> If we convert to sched_clock(), would adding a define such as
> HAVE_HIGH_PRECISION_CLOCK to architectures that have both a high
> precision clock and a 64 bit cycles_t be a good solution?
>
> (if not any other suggestion?)

Hm, probably cpu_clock() and similar might be better, since they use sched_clock() in the background when !CONFIG_HAVE_UNSTABLE_SCHED_CLOCK (meaning when sched_clock() provides a synchronized high-resolution time source from the architecture), and, quoting the kernel source:

  Otherwise it tries to create a semi stable clock from a mixture of other
  clocks, including:
   - GTOD (clock monotonic)
   - sched_clock()
   - explicit idle events

But yeah, it needs to be evaluated regarding the drift between CPUs in general. Then, eventually, you could get rid of the entire NET_LL_RX_POLL config option plus related ifdefs in the code and have it built in in general?
On 13/06/2013 11:00, Daniel Borkmann wrote:

> Hm, probably cpu_clock() and similar might be better, since they use
> sched_clock() in the background when !CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
> (meaning when sched_clock() provides synchronized highres time source from
> the architecture), and, quoting ....

I don't think we want the overhead of disabling IRQs that cpu_clock() adds. We don't really care about precise measurement. All we need is a sane cutoff for busy polling. It's no big deal if on a rare occasion we poll less, or even poll twice the time. As long as it's rare it should not matter.

Maybe the answer is not to use cycle counting at all? Maybe just wait the full sk_rcvtimeo? (Reschedule when proper, bail out if a signal is pending, etc.) This could only be a safe/sane thing to do after we add a socket option, because this can't be a global setting.

This would of course turn the option into a flag. If it's set (and !nonblock), busy wait up to sk_rcvtimeo.

Opinions?
diff --git a/net/Kconfig b/net/Kconfig
index d6a9ce6..8fe8845 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -244,16 +244,9 @@ config NETPRIO_CGROUP
 	  a per-interface basis
 
 config NET_LL_RX_POLL
-	bool "Low Latency Receive Poll"
+	boolean
 	depends on X86_TSC
-	default n
-	---help---
-	  Support Low Latency Receive Queue Poll.
-	  (For network card drivers which support this option.)
-	  When waiting for data in read or poll call directly into the the device driver
-	  to flush packets which may be pending on the device queues into the stack.
-
-	  If unsure, say N.
+	default y
 
 config BQL
 	boolean
Remove NET_LL_RX_POLL from the config menu. Change default to y. Busy polling still needs to be enabled at runtime via sysctl.

Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
---
 net/Kconfig | 11 ++---------
 1 files changed, 2 insertions(+), 9 deletions(-)