[0/7] cpufreq: Call transition notifier only once for each policy

Message ID cover.1552545525.git.viresh.kumar@linaro.org

Message

Viresh Kumar March 14, 2019, 6:42 a.m. UTC
Currently we call the cpufreq transition notifiers once for each CPU of
the policy->cpus cpumask, which isn't that efficient. This patchset
tries to simplify that by adding another field in struct cpufreq_freqs,
cpus, so the callback has all the information available with a single
call for each policy.
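
To illustrate, here is a rough sketch of what a transition notifier can
then look like. This assumes struct cpufreq_freqs gains a cpumask pointer
("cpus") that the core sets to policy->related_cpus; the exact definition
is in patch 1/7, and the callback below is made up purely for
illustration, it is not code from this series:

#include <linux/cpufreq.h>
#include <linux/cpumask.h>
#include <linux/kernel.h>
#include <linux/notifier.h>

static int example_transition_notifier(struct notifier_block *nb,
                                       unsigned long state, void *data)
{
        struct cpufreq_freqs *freqs = data;
        int cpu;

        if (state != CPUFREQ_POSTCHANGE)
                return NOTIFY_OK;

        /* One call per policy; iterate over the policy's CPUs here. */
        for_each_cpu(cpu, freqs->cpus)
                pr_debug("cpu%d: %u kHz -> %u kHz\n",
                         cpu, freqs->old, freqs->new);

        return NOTIFY_OK;
}

static struct notifier_block example_nb = {
        .notifier_call = example_transition_notifier,
};

/* Registration is unchanged:
 *      cpufreq_register_notifier(&example_nb, CPUFREQ_TRANSITION_NOTIFIER);
 */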

I have tested this on an arm64 platform, and it is compile-tested for
other platforms. This has gone through 0-day testing as well; I pushed
my branch to the public tree, which gets tested by the 0-day bot, over a
week ago.

FWIW, it may be possible to make the callback implementations more
efficient now that they are called only once for each policy, but this
patchset makes only the minimal changes needed to make sure we don't
end up breaking otherwise working code.

--
viresh

Viresh Kumar (7):
  cpufreq: Pass policy->related_cpus to transition notifiers
  ARM: smp: Update cpufreq transition notifier to handle multiple CPUs
  ARM: twd: Update cpufreq transition notifier to handle multiple CPUs
  sparc64: Update cpufreq transition notifier to handle multiple CPUs
  x86/tsc: Update cpufreq transition notifier to handle multiple CPUs
  KVM: x86: Update cpufreq transition notifier to handle multiple CPUs
  cpufreq: Call transition notifiers only once for each policy

 arch/arm/kernel/smp.c       | 23 ++++++++++++++---------
 arch/arm/kernel/smp_twd.c   |  9 ++++++---
 arch/sparc/kernel/time_64.c | 28 ++++++++++++++++------------
 arch/x86/kernel/tsc.c       | 31 ++++++++++++++++++++-----------
 arch/x86/kvm/x86.c          | 31 ++++++++++++++++++++-----------
 drivers/cpufreq/cpufreq.c   | 19 ++++++++++---------
 include/linux/cpufreq.h     |  2 +-
 7 files changed, 87 insertions(+), 56 deletions(-)

Comments

Rafael J. Wysocki March 14, 2019, 9:28 a.m. UTC | #1
On Thu, Mar 14, 2019 at 7:43 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> Currently we call the cpufreq transition notifiers once for each CPU of
> the policy->cpus cpumask, which isn't that efficient.

Why isn't it efficient?

Transitions are per-policy anyway, so if something needs to be done
for each CPU in the policy, it doesn't matter too much which part of
the code carries out the iteration.

I guess some notifiers need to know what other CPUs there are in the
policy?  If so, then why?

> This patchset tries to simplify that by adding another field in struct cpufreq_freqs,
> cpus, so the callback has all the information available with a single
> call for each policy.

Well, you can argue that the core is simplified by it somewhat, but
the notifiers aren't.  They actually get more complex, conceptually
too, because they now need to worry about offline vs online CPUs etc.

Also I wonder why you decided to pass a cpumask in freqs instead of
just passing a policy pointer.  If you change things from per-CPU to
per-policy, passing the whole policy seems more natural.
Rafael J. Wysocki March 14, 2019, 9:40 a.m. UTC | #2
On Thu, Mar 14, 2019 at 10:28 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Thu, Mar 14, 2019 at 7:43 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >
> > Currently we call the cpufreq transition notifiers once for each CPU of
> > the policy->cpus cpumask, which isn't that efficient.
>
> Why isn't it efficient?
>
> Transitions are per-policy anyway, so if something needs to be done
> for each CPU in the policy, it doesn't matter too much which part of
> the code carries out the iteration.
>
> I guess some notifiers need to know what other CPUs there are in the
> policy?  If so, then why?
>
> > This patchset tries to simplify that by adding another field in struct cpufreq_freqs,
> > cpus, so the callback has all the information available with a single
> > call for each policy.
>
> Well, you can argue that the core is simplified by it somewhat, but
> the notifiers aren't.  They actually get more complex, conceptually
> too, because they now need to worry about offline vs online CPUs etc.
>
> Also I wonder why you decided to pass a cpumask in freqs instead of
> just passing a policy pointer.  If you change things from per-CPU to
> per-policy, passing the whole policy seems more natural.

It also looks to me like all that needs to be one patch, or you have
the ugly transition situation in which notifiers are still invoked for
each CPU, but they assume they are invoked once per policy.
Viresh Kumar March 14, 2019, 10:16 a.m. UTC | #3
On 14-03-19, 10:28, Rafael J. Wysocki wrote:
> On Thu, Mar 14, 2019 at 7:43 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >
> > Currently we call the cpufreq transition notifiers once for each CPU of
> > the policy->cpus cpumask, which isn't that efficient.
> 
> Why isn't it efficient?
> 
> Transitions are per-policy anyway, so if something needs to be done
> for each CPU in the policy, it doesn't matter too much which part of
> the code carries out the iteration.

Even if per-CPU iteration has to be done somewhere, we avoid the
repeated function calls here, as well as the code/locking in the
notifier layer. I will add more of this information to the changelog.
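
For example, on the core side the change is roughly of this shape (just
a sketch, not the actual diff from patch 7/7):

/* Before (roughly): the notifier chain was hit once per CPU of the policy. */
for_each_cpu(freqs->cpu, policy->cpus)
        srcu_notifier_call_chain(&cpufreq_transition_notifier_list,
                                 CPUFREQ_PRECHANGE, freqs);

/* After (roughly): a single notifier-chain invocation per policy. */
freqs->cpus = policy->related_cpus;
srcu_notifier_call_chain(&cpufreq_transition_notifier_list,
                         CPUFREQ_PRECHANGE, freqs);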

> I guess some notifiers need to know what other CPUs there are in the
> policy?  If so, then why?

You mean about the offline CPUs? I mentioned the rationale in 1/7. It
is to avoid bugs where we may end up using a stale value if the CPUs
are offlined/onlined regularly.

> > This patchset tries to simplify that by adding another field in struct cpufreq_freqs,
> > cpus, so the callback has all the information available with a single
> > call for each policy.
> 
> Well, you can argue that the core is simplified by it somewhat, but
> the notifiers aren't.  They actually get more complex, conceptually
> too, because they now need to worry about offline vs online CPUs etc.

24 different parts of the kernel register transition notifiers and only
5 of them required an update here; the other 19 don't need to do per-CPU
work and they also benefit from this change. Those routines will now get
called only once, instead of once for every CPU of the policy.
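
For example, a notifier of that kind, which only cares about the new
frequency and not about individual CPUs, needs no change at all and is
now simply invoked once per transition (purely illustrative;
update_board_bus_clock() is a made-up helper):

static int board_clk_notifier(struct notifier_block *nb,
                              unsigned long state, void *data)
{
        struct cpufreq_freqs *freqs = data;

        if (state != CPUFREQ_POSTCHANGE)
                return NOTIFY_OK;

        /*
         * No per-CPU work here; this now runs once per frequency change
         * instead of once for every CPU of the policy.
         */
        update_board_bus_clock(freqs->new);

        return NOTIFY_OK;
}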

> Also I wonder why you decided to pass a cpumask in freqs instead of
> just passing a policy pointer.  If you change things from per-CPU to
> per-policy, passing the whole policy seems more natural.

I did that because they don't need to use the other fields of the
policy today, and that doesn't look likely in the near future either.

I can do that if you want, but I am not sure why more information should
be provided than required.
Viresh Kumar March 14, 2019, 10:18 a.m. UTC | #4
On 14-03-19, 10:40, Rafael J. Wysocki wrote:
> On Thu, Mar 14, 2019 at 10:28 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >
> > On Thu, Mar 14, 2019 at 7:43 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > >
> > > Currently we call the cpufreq transition notifiers once for each CPU of
> > > the policy->cpus cpumask, which isn't that efficient.
> >
> > Why isn't it efficient?
> >
> > Transitions are per-policy anyway, so if something needs to be done
> > for each CPU in the policy, it doesn't matter too much which part of
> > the code carries out the iteration.
> >
> > I guess some notifiers need to know what other CPUs there are in the
> > policy?  If so, then why?
> >
> > > This patchset tries to simplify that by adding another field in struct cpufreq_freqs,
> > > cpus, so the callback has all the information available with a single
> > > call for each policy.
> >
> > Well, you can argue that the core is simplified by it somewhat, but
> > the notifiers aren't.  They actually get more complex, conceptually
> > too, because they now need to worry about offline vs online CPUs etc.
> >
> > Also I wonder why you decided to pass a cpumask in freqs instead of
> > just passing a policy pointer.  If you change things from per-CPU to
> > per-policy, passing the whole policy seems more natural.
> 
> It also looks to me like all that needs to be one patch, or you have
> the ugly transition situation in which notifiers are still invoked for
> each CPU, but they assume they are invoked once per policy.

I assumed that calling something like set_cyc2ns_scale() on x86 multiple
times, once for each CPU, shouldn't be that bad even if the frequency
changes only once, but such things may actually have side effects. I
should have merged them all into a single patch.
Rafael J. Wysocki March 14, 2019, 10:55 a.m. UTC | #5
On Thu, Mar 14, 2019 at 11:16 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 14-03-19, 10:28, Rafael J. Wysocki wrote:
> > On Thu, Mar 14, 2019 at 7:43 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > >
> > > Currently we call the cpufreq transition notifiers once for each CPU of
> > > the policy->cpus cpumask, which isn't that efficient.
> >
> > Why isn't it efficient?
> >
> > Transitions are per-policy anyway, so if something needs to be done
> > for each CPU in the policy, it doesn't matter too much which part of
> > the code carries out the iteration.
>
> Even if per-CPU iteration has to be done somewhere, we avoid the
> repeated function calls here, as well as the code/locking in the
> notifier layer. I will add more of this information to the changelog.
>
> > I guess some notifiers need to know what other CPUs there are in the
> > policy?  If so, then why?
>
> You mean about the offline CPUs? I mentioned the rationale in 1/7. It
> is to avoid bugs where we may end up using a stale value if the CPUs
> are offlined/onlined regularly.

I'm not really convinced about this.  CPU online really should take
care of updating everything anyway.

> > > This patchset tries to simplify that by adding another field in struct cpufreq_freqs,
> > > cpus, so the callback has all the information available with a single
> > > call for each policy.
> >
> > Well, you can argue that the core is simplified by it somewhat, but
> > the notifiers aren't.  They actually get more complex, conceptually
> > too, because they now need to worry about offline vs online CPUs etc.
>
> 24 different parts of the kernel register transition notifiers and only
> 5 of them required an update here; the other 19 don't need to do per-CPU
> work and they also benefit from this change. Those routines will now get
> called only once, instead of once for every CPU of the policy.

This is a much better rationale for the change than the one given
originally IMO. :-)

> > Also I wonder why you decided to pass a cpumask in freqs instead of
> > just passing a policy pointer.  If you change things from per-CPU to
> > per-policy, passing the whole policy seems more natural.
>
> I did that because they don't need to use the other fields of the
> policy today, and that doesn't look likely in the near future either.

But some of them need to combine the new cpumask with cpu_online_mask
to get what would effectively be policy->cpus.  That would be avoidable
if you passed the policy pointer to them.
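
I.e. something like this (a sketch; update_per_cpu_state() is a made-up
helper), where the explicit filtering against cpu_online_mask is exactly
what passing the policy would avoid:

static int example_notifier(struct notifier_block *nb,
                            unsigned long state, void *data)
{
        struct cpufreq_freqs *freqs = data;
        int cpu;

        if (state != CPUFREQ_POSTCHANGE)
                return NOTIFY_OK;

        /*
         * The notifier has to reduce the policy mask to the online CPUs
         * by hand; with a policy pointer it could use policy->cpus (or
         * policy->related_cpus) directly, as appropriate.
         */
        for_each_cpu_and(cpu, freqs->cpus, cpu_online_mask)
                update_per_cpu_state(cpu, freqs->new);

        return NOTIFY_OK;
}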
Viresh Kumar March 14, 2019, 11:03 a.m. UTC | #6
On 14-03-19, 11:55, Rafael J. Wysocki wrote:
> On Thu, Mar 14, 2019 at 11:16 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> But some of them need to combine the new cpumask with cpu_online_mask
> to get what would effectively be policy->cpus.  That would be avoidable
> if you passed the policy pointer to them.

Right, that's what I also thought after your previous email. Will pass
the policy pointer instead.