| Message ID | 20130628125926.14419.89905.stgit@ladj378.jer.intel.com |
|---|---|
| State | Accepted, archived |
| Delegated to: | David Miller |
On Fri, 2013-06-28 at 15:59 +0300, Eliezer Tamir wrote:
> Our use of sched_clock is OK because we don't mind the side effects
> of calling it and occasionally waking up on a different CPU.

Sure about that?  Jitter matters too.

> When CONFIG_DEBUG_PREEMPT is on, disable preempt before calling
> sched_clock() so we don't trigger a debug_smp_processor_id() warning.
[...]

I think this is papering over a problem.  The warning is there for a
good reason.

Would functions like these make it possible to use sched_clock() safely
for polling?  (I didn't spend much time thinking about the names.)

struct sched_timestamp {
	int cpu;
	unsigned long long clock;
};

static inline struct sched_timestamp sched_get_timestamp(void)
{
	struct sched_timestamp ret;

	preempt_disable_notrace();
	ret.cpu = smp_processor_id();
	ret.clock = sched_clock();
	preempt_enable_no_resched_notrace();

	return ret;
}

static inline bool sched_timeout_or_moved(struct sched_timestamp start,
					  unsigned long long timeout)
{
	bool ret;

	preempt_disable_notrace();
	ret = start.cpu != smp_processor_id() ||
	      (sched_clock() - start.clock) > timeout;
	preempt_enable_no_resched_notrace();

	return ret;
}

Ben.
On 28/06/2013 19:51, Ben Hutchings wrote:
> On Fri, 2013-06-28 at 15:59 +0300, Eliezer Tamir wrote:
>> Our use of sched_clock is OK because we don't mind the side effects
>> of calling it and occasionally waking up on a different CPU.
>
> Sure about that?  Jitter matters too.

Pretty sure. This is a limit on how long we poll; it's there for
fairness to the rest of the system, not for the performance of this
code.

What matters is that on average you are bounded by something close to
what the user specified. If once in a while you run less because of
clock jitter, or even twice the specified time, it's no big deal.

So I don't see how jitter would matter. And if your workload is
jitter sensitive, you should probably be pinning tasks to CPUs anyway.

>> When CONFIG_DEBUG_PREEMPT is on, disable preempt before calling
>> sched_clock() so we don't trigger a debug_smp_processor_id() warning.
> [...]
>
> I think this is papering over a problem.  The warning is there for a
> good reason.

I think we understand the warning, and that we are OK with the effects.

Looking at how other users in the kernel solved this issue, it seems
like this is what they do; for example
trace/ring_buffer.c:ring_buffer_time_stamp().

Also kvm_clock_read() and xen_clocksource_read() seem to disable
preempt just to silence this warning. If they really cared about
reading the value on one CPU, then being scheduled on another, they
should have disabled interrupts. Or am I missing something?

> Would functions like these make it possible to use sched_clock() safely
> for polling?  (I didn't spend much time thinking about the names.)
>
> struct sched_timestamp {
> 	int cpu;
> 	unsigned long long clock;
> };
>
> static inline struct sched_timestamp sched_get_timestamp(void)
> {
> 	struct sched_timestamp ret;
>
> 	preempt_disable_notrace();
> 	ret.cpu = smp_processor_id();
> 	ret.clock = sched_clock();
> 	preempt_enable_no_resched_notrace();
>
> 	return ret;
> }

I don't understand; preempt_disable() only prevents preemption from
taking the CPU away from you, you could still lose it for other
reasons. You would really need to disable interrupts in this region to
be sure that it all executed on the same CPU.

-Eliezer
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, 2013-06-29 at 21:50 +0300, Eliezer Tamir wrote:
> On 28/06/2013 19:51, Ben Hutchings wrote:
> > On Fri, 2013-06-28 at 15:59 +0300, Eliezer Tamir wrote:
> >> Our use of sched_clock is OK because we don't mind the side effects
> >> of calling it and occasionally waking up on a different CPU.
> >
> > Sure about that?  Jitter matters too.
>
> Pretty sure, this is a limit on how long we poll, it's for fairness to
> the rest of the system not for performance of this code.
>
> What matters is that on average you are bounded by something close to
> what the user specified. If once in a while you run less because of
> clock jitter, or even twice the specified time, it's no big deal.

So what is the maximum time difference in sched_clock() values between
CPUs in the same system?

> So I don't see how jitter would matter.
>
> And if your workload is jitter sensitive, you should probably be
> pinning tasks to CPUs anyway.

Yes, they should be pinned.  But I think a single task that isn't
pinned could poll for significantly longer than intended, and the
effect wouldn't be limited to that one task.

> >> When CONFIG_DEBUG_PREEMPT is on, disable preempt before calling
> >> sched_clock() so we don't trigger a debug_smp_processor_id() warning.
> > [...]
> >
> > I think this is papering over a problem.  The warning is there for a
> > good reason.
>
> I think we understand the warning, and that we are OK with the effects.
>
> Looking at how other users in the kernel solved this issue, it seems
> like this is what they do; for example
> trace/ring_buffer.c:ring_buffer_time_stamp().

Well, this seems to be debug information, not actually used for timing.

> Also kvm_clock_read() and xen_clocksource_read() seem to disable
> preempt just to silence this warning.

I can't see what you're referring to.

> If they really cared about reading the value on one CPU, then being
> scheduled on another, they should have disabled interrupts.
> Or am I missing something?
>
> > Would functions like these make it possible to use sched_clock()
> > safely for polling?  (I didn't spend much time thinking about the
> > names.)
[...]
> I don't understand, preempt_disable() only prevents preempt
> from taking the CPU away from you, you could still lose it for
> other reasons.
> You would really need to disable interrupts in this region to be
> sure that it all executed on the same CPU.

Hmm, yes you're right.

Anyway, what I was trying to suggest was that you would record the
current CPU along with the initial timestamp, and then abort
low-latency polling if either the task is moved onto a different CPU
or the time limit was reached.  Checking the CPU first means that the
sched_clock() comparison is valid.

Ben.
diff --git a/include/net/ll_poll.h b/include/net/ll_poll.h
index 5bf2b3a..6d45e6f 100644
--- a/include/net/ll_poll.h
+++ b/include/net/ll_poll.h
@@ -37,20 +37,40 @@ extern unsigned int sysctl_net_ll_poll __read_mostly;
 #define LL_FLUSH_FAILED		-1
 #define LL_FLUSH_BUSY		-2
 
-/* we can use sched_clock() because we don't care much about precision
+/* a wrapper to make debug_smp_processor_id() happy
+ * we can use sched_clock() because we don't care much about precision
  * we only care that the average is bounded
- * we don't mind a ~2.5% imprecision so <<10 instead of *1000
+ */
+#ifdef CONFIG_DEBUG_PREEMPT
+static inline u64 ll_sched_clock(void)
+{
+	u64 rc;
+
+	preempt_disable_notrace();
+	rc = sched_clock();
+	preempt_enable_no_resched_notrace();
+
+	return rc;
+}
+#else /* CONFIG_DEBUG_PREEMPT */
+static inline u64 ll_sched_clock(void)
+{
+	return sched_clock();
+}
+#endif /* CONFIG_DEBUG_PREEMPT */
+
+/* we don't mind a ~2.5% imprecision so <<10 instead of *1000
  * sk->sk_ll_usec is a u_int so this can't overflow
  */
 static inline u64 ll_sk_end_time(struct sock *sk)
 {
-	return ((u64)ACCESS_ONCE(sk->sk_ll_usec) << 10) + sched_clock();
+	return ((u64)ACCESS_ONCE(sk->sk_ll_usec) << 10) + ll_sched_clock();
 }
 
 /* in poll/select we use the global sysctl_net_ll_poll value */
 static inline u64 ll_end_time(void)
 {
-	return ((u64)ACCESS_ONCE(sysctl_net_ll_poll) << 10) + sched_clock();
+	return ((u64)ACCESS_ONCE(sysctl_net_ll_poll) << 10) + ll_sched_clock();
 }
 
 static inline bool sk_valid_ll(struct sock *sk)
@@ -61,7 +81,7 @@ static inline bool sk_valid_ll(struct sock *sk)
 
 static inline bool can_poll_ll(u64 end_time)
 {
-	return !time_after64(sched_clock(), end_time);
+	return !time_after64(ll_sched_clock(), end_time);
 }
 
 /* when used in sock_poll() nonblock is known at compile time to be true
Our use of sched_clock is OK because we don't mind the side effects
of calling it and occasionally waking up on a different CPU.

When CONFIG_DEBUG_PREEMPT is on, disable preempt before calling
sched_clock() so we don't trigger a debug_smp_processor_id() warning.

Reported-by: Cody P Schafer <devel-lists@codyps.com>
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
---
 include/net/ll_poll.h |   30 +++++++++++++++++++++++++-----
 1 files changed, 25 insertions(+), 5 deletions(-)