Message ID | 20240911-devel-anna-maria-b4-timers-flseep-v2-13-b0d3f33ccfe0@linutronix.de (mailing list archive) |
---|---|
State | Handled Elsewhere |
Headers | show |
Series | timers: Cleanup delay/sleep related mess | expand |
Le Wed, Sep 11, 2024 at 07:13:39AM +0200, Anna-Maria Behnsen a écrit : > When commit 38f7b7067dae ("powerpc/rtas: rtas_busy_delay() improvements") > was introduced, documentation about proper usage of sleep realted functions related* > was outdated. > > The commit message references the usage of a HZ=100 system. When using a > 20ms sleep duration on such a system and therefore using msleep(), the > possible additional slack will be +10ms. > > When the system is configured with HZ=100 the granularity of a jiffy and of > a bucket of the lowest timer wheel level is 10ms. To make sure a timer will > not expire early (when queueing of the timer races with an concurrent > update of jiffies), timers are always queued into the next bucket. This is > the reason for the maximal possible slack of 10ms. > > fsleep() limits the maximal possible slack to 25% by making threshold > between usleep_range() and msleep() HZ dependent. As soon as the accuracy > of msleep() is sufficient, the less expensive timer list timer based > sleeping function is used instead of the more expensive hrtimer based > usleep_range() function. The udelay() will not be used in this specific > usecase as the lowest sleep length is larger than 1 microsecond. Isn't udelay() for everything below 10us ? > > Use fsleep() directly instead of using an own heuristic for the best > sleeping mechanism to use.. > > Cc: Michael Ellerman <mpe@ellerman.id.au> > Cc: Nathan Lynch <nathanl@linux.ibm.com> > Cc: linuxppc-dev@lists.ozlabs.org > Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> > Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Frederic Weisbecker <frederic@kernel.org> writes: > Le Wed, Sep 11, 2024 at 07:13:39AM +0200, Anna-Maria Behnsen a écrit : >> When commit 38f7b7067dae ("powerpc/rtas: rtas_busy_delay() improvements") >> was introduced, documentation about proper usage of sleep realted functions > > related* > >> was outdated. >> >> The commit message references the usage of a HZ=100 system. When using a >> 20ms sleep duration on such a system and therefore using msleep(), the >> possible additional slack will be +10ms. >> >> When the system is configured with HZ=100 the granularity of a jiffy and of >> a bucket of the lowest timer wheel level is 10ms. To make sure a timer will >> not expire early (when queueing of the timer races with an concurrent >> update of jiffies), timers are always queued into the next bucket. This is >> the reason for the maximal possible slack of 10ms. >> >> fsleep() limits the maximal possible slack to 25% by making threshold >> between usleep_range() and msleep() HZ dependent. As soon as the accuracy >> of msleep() is sufficient, the less expensive timer list timer based >> sleeping function is used instead of the more expensive hrtimer based >> usleep_range() function. The udelay() will not be used in this specific >> usecase as the lowest sleep length is larger than 1 microsecond. > > Isn't udelay() for everything below 10us ? It's larger than 1 millisecond... > >> >> Use fsleep() directly instead of using an own heuristic for the best >> sleeping mechanism to use.. >> >> Cc: Michael Ellerman <mpe@ellerman.id.au> >> Cc: Nathan Lynch <nathanl@linux.ibm.com> >> Cc: linuxppc-dev@lists.ozlabs.org >> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> >> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) > > Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c index f7e86e09c49f..d31c9799cab2 100644 --- a/arch/powerpc/kernel/rtas.c +++ b/arch/powerpc/kernel/rtas.c @@ -1390,21 +1390,14 @@ bool __ref rtas_busy_delay(int status) */ ms = clamp(ms, 1U, 1000U); /* - * The delay hint is an order-of-magnitude suggestion, not - * a minimum. It is fine, possibly even advantageous, for - * us to pause for less time than hinted. For small values, - * use usleep_range() to ensure we don't sleep much longer - * than actually needed. - * - * See Documentation/timers/timers-howto.rst for - * explanation of the threshold used here. In effect we use - * usleep_range() for 9900 and 9901, msleep() for - * 9902-9905. + * The delay hint is an order-of-magnitude suggestion, not a + * minimum. It is fine, possibly even advantageous, for us to + * pause for less time than hinted. To make sure pause time will + * not be way longer than requested independent of HZ + * configuration, use fsleep(). See fsleep() for details of + * used sleeping functions. */ - if (ms <= 20) - usleep_range(ms * 100, ms * 1000); - else - msleep(ms); + fsleep(ms * 1000); break; case RTAS_BUSY: ret = true;