Message ID | 20130611142428.17879.33582.stgit@ladj378.jer.intel.com
---|---
State | Changes Requested, archived
Delegated to: | David Miller
From: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Date: Tue, 11 Jun 2013 17:24:28 +0300

> depends on X86_TSC

Wait a second, I didn't notice this before. There needs to be a better way to test for the accuracy you need, or if the issue is lack of a proper API for cycle counter reading, fix that rather than add ugly arch specific dependencies to generic networking code.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 12 Jun 2013 15:12:05 -0700 (PDT), David Miller <davem@davemloft.net> wrote:

> From: Eliezer Tamir <eliezer.tamir@linux.intel.com>
> Date: Tue, 11 Jun 2013 17:24:28 +0300
>
> > depends on X86_TSC
>
> Wait a second, I didn't notice this before. There needs to be a better
> way to test for the accuracy you need, or if the issue is lack of a proper
> API for cycle counter reading, fix that rather than add ugly arch
> specific dependencies to generic networking code.

This should be sched_clock(), rather than direct TSC access. Also, any code using the TSC or sched_clock() has to be carefully audited to deal with clocks running at different rates on different CPUs. Basically, a value is only meaningful on the same CPU.
On 13/06/2013 05:01, Stephen Hemminger wrote:

> This should be sched_clock(), rather than direct TSC access.
> Also any code using TSC or sched_clock has to be carefully audited to deal with
> clocks running at different rates on different CPUs. Basically a value is only
> meaningful on the same CPU.

OK,

If we convert to sched_clock(), would adding a define such as HAVE_HIGH_PRECISION_CLOCK to architectures that have both a high-precision clock and a 64-bit cycles_t be a good solution?

(If not, any other suggestion?)
On 06/13/2013 04:13 AM, Eliezer Tamir wrote:

> If we convert to sched_clock(), would adding a define such as
> HAVE_HIGH_PRECISION_CLOCK to architectures that have both a high
> precision clock and a 64 bit cycles_t be a good solution?
>
> (if not any other suggestion?)

Hm, probably cpu_clock() and similar might be better, since they use sched_clock() in the background when !CONFIG_HAVE_UNSTABLE_SCHED_CLOCK (meaning when sched_clock() provides a synchronized high-resolution time source from the architecture), and, quoting the kernel source:

  Otherwise it tries to create a semi stable clock from a mixture of other
  clocks, including:
   - GTOD (clock monotonic)
   - sched_clock()
   - explicit idle events

But yeah, it needs to be evaluated regarding the drift between CPUs in general. Then, eventually, you could get rid of the entire NET_LL_RX_POLL config option plus related ifdefs in the code and have it built in in general?
On 13/06/2013 11:00, Daniel Borkmann wrote:

> Hm, probably cpu_clock() and similar might be better, since they use
> sched_clock() in the background when !CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
> (meaning when sched_clock() provides synchronized highres time source from
> the architecture), and, quoting ....

I don't think we want the overhead of disabling IRQs that cpu_clock() adds. We don't really care about precise measurement. All we need is a sane cutoff for busy polling. It's no big deal if on a rare occasion we poll less, or even poll twice the time. As long as it's rare it should not matter.

Maybe the answer is not to use cycle counting at all? Maybe just wait the full sk_rcvtimeo? (Reschedule when proper, bail out if a signal is pending, etc.) This could only be a safe/sane thing to do after we add a socket option, because this can't be a global setting.

This would of course turn the option into a flag. If it's set (and !nonblock), busy wait up to sk_rcvtimeo.

Opinions?
diff --git a/net/Kconfig b/net/Kconfig
index d6a9ce6..8fe8845 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -244,16 +244,9 @@ config NETPRIO_CGROUP
 	  a per-interface basis
 
 config NET_LL_RX_POLL
-	bool "Low Latency Receive Poll"
+	boolean
 	depends on X86_TSC
-	default n
-	---help---
-	  Support Low Latency Receive Queue Poll.
-	  (For network card drivers which support this option.)
-	  When waiting for data in read or poll call directly into the the device driver
-	  to flush packets which may be pending on the device queues into the stack.
-
-	  If unsure, say N.
+	default y
 
 config BQL
 	boolean
Remove NET_LL_RX_POLL from the config menu. Change default to y. Busy polling still needs to be enabled at runtime via sysctl.

Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
---
 net/Kconfig | 11 ++---------
 1 files changed, 2 insertions(+), 9 deletions(-)