diff mbox series

[RFC,4/5] sched/topology: Annotate RCU pointers properly

Message ID 20190221054942.132388-5-joel@joelfernandes.org
State RFC
Delegated to: David Miller
Series RCU fixes for rcu_assign_pointer() usage

Commit Message

Joel Fernandes Feb. 21, 2019, 5:49 a.m. UTC
The scheduler's topology code uses rcu_assign_pointer() to initialize
various pointers.

Let us annotate the pointers correctly, which also helps avoid future
bugs. This suppresses the new sparse errors caused by an annotation
check I added to rcu_assign_pointer().

Also replace the rcu_assign_pointer() call on rq->sd with WRITE_ONCE().
This should be sufficient for the rq->sd initialization.

This fixes sparse errors:
kernel//sched/topology.c:378:9: sparse: error: incompatible types in
comparison expression (different address spaces)
kernel//sched/topology.c:387:9: sparse: error: incompatible types in
comparison expression (different address spaces)
kernel//sched/topology.c:612:9: sparse: error: incompatible types in
comparison expression (different address spaces)
kernel//sched/topology.c:615:9: sparse: error: incompatible types in
comparison expression (different address spaces)
kernel//sched/topology.c:618:9: sparse: error: incompatible types in
comparison expression (different address spaces)
kernel//sched/topology.c:621:9: sparse: error: incompatible types in
comparison expression (different address spaces)

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/sched/sched.h    | 12 ++++++------
 kernel/sched/topology.c | 12 ++++++------
 2 files changed, 12 insertions(+), 12 deletions(-)

Comments

Peter Zijlstra Feb. 21, 2019, 9:19 a.m. UTC | #1
On Thu, Feb 21, 2019 at 12:49:41AM -0500, Joel Fernandes (Google) wrote:

> Also replace rcu_assign_pointer call on rq->sd with WRITE_ONCE. This
> should be sufficient for the rq->sd initialization.

> @@ -668,7 +668,7 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
>  
>  	rq_attach_root(rq, rd);
>  	tmp = rq->sd;
> -	rcu_assign_pointer(rq->sd, sd);
> +	WRITE_ONCE(rq->sd, sd);
>  	dirty_sched_domain_sysctl(cpu);
>  	destroy_sched_domains(tmp);

Where did the RELEASE barrier go?

That was a publish operation, now it is not.
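
Editor's note: the difference Peter points at can be sketched in userspace with C11 atomics. This is only an analogy, not kernel code. The struct, the variable and the function names below are hypothetical stand-ins, a release store stands in for rcu_assign_pointer(), a relaxed store stands in for WRITE_ONCE(), and an acquire load is used as a conservative stand-in for rcu_dereference().

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sched_domain; not the kernel type. */
struct domain {
	int span_weight;
	struct domain *parent;
};

/* Hypothetical stand-in for rq->sd. */
static _Atomic(struct domain *) published_sd;

/*
 * Analogue of rcu_assign_pointer(): the release store orders every
 * earlier initialization of *sd before the pointer becomes visible.
 */
static void publish_release(struct domain *sd)
{
	atomic_store_explicit(&published_sd, sd, memory_order_release);
}

/*
 * Analogue of WRITE_ONCE(): the store itself is not torn, but nothing
 * orders the earlier initialization of *sd before it, so a concurrent
 * reader could observe the pointer and still see stale fields.
 */
static void publish_relaxed(struct domain *sd)
{
	atomic_store_explicit(&published_sd, sd, memory_order_relaxed);
}

/* Reader side: pairs with the release store above. */
static struct domain *read_published(void)
{
	return atomic_load_explicit(&published_sd, memory_order_acquire);
}
```

In a single thread both stores behave the same. The difference only shows up when another CPU reads the pointer while the writer is still initializing the object, which is the situation a publish operation has to cover.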
Joel Fernandes Feb. 21, 2019, 3:10 p.m. UTC | #2
Hi Peter,

Thanks for taking a look.

On Thu, Feb 21, 2019 at 10:19:44AM +0100, Peter Zijlstra wrote:
> On Thu, Feb 21, 2019 at 12:49:41AM -0500, Joel Fernandes (Google) wrote:
> 
> > Also replace rcu_assign_pointer call on rq->sd with WRITE_ONCE. This
> > should be sufficient for the rq->sd initialization.
> 
> > @@ -668,7 +668,7 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
> >  
> >  	rq_attach_root(rq, rd);
> >  	tmp = rq->sd;
> > -	rcu_assign_pointer(rq->sd, sd);
> > +	WRITE_ONCE(rq->sd, sd);
> >  	dirty_sched_domain_sysctl(cpu);
> >  	destroy_sched_domains(tmp);
> 
> Where did the RELEASE barrier go?
> 
> That was a publish operation, now it is not.

Funny thing is, initially I had written this patch with smp_store_release()
instead of WRITE_ONCE(), but checkpatch complains about that since it needs a
comment on top of it, and I wasn't sure if a RELEASE barrier was the intent of
using rcu_assign_pointer() (all the more reason to replace it with something
more explicit).

I will replace it with the following and resubmit it then:

/* Release barrier */
smp_store_release(&rq->sd, sd);

Or do we want to just drop the "Release barrier" comment and live with the
checkpatch warning?

(The same response applies to patch 5/5.)

thanks,

 - Joel
Peter Zijlstra Feb. 21, 2019, 3:29 p.m. UTC | #3
On Thu, Feb 21, 2019 at 10:10:57AM -0500, Joel Fernandes wrote:
> Hi Peter,
> 
> Thanks for taking a look.
> 
> On Thu, Feb 21, 2019 at 10:19:44AM +0100, Peter Zijlstra wrote:
> > On Thu, Feb 21, 2019 at 12:49:41AM -0500, Joel Fernandes (Google) wrote:
> > 
> > > Also replace rcu_assign_pointer call on rq->sd with WRITE_ONCE. This
> > > should be sufficient for the rq->sd initialization.
> > 
> > > @@ -668,7 +668,7 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
> > >  
> > >  	rq_attach_root(rq, rd);
> > >  	tmp = rq->sd;
> > > -	rcu_assign_pointer(rq->sd, sd);
> > > +	WRITE_ONCE(rq->sd, sd);
> > >  	dirty_sched_domain_sysctl(cpu);
> > >  	destroy_sched_domains(tmp);
> > 
> > Where did the RELEASE barrier go?
> > 
> > That was a publish operation, now it is not.
> 
> Funny thing is, initially I had written this patch with smp_store_release()
> instead of WRITE_ONCE, but checkpatch complaints with that since it needs a
> comment on top of it, and I wasn't sure if RELEASE barrier was the intent of
> using rcu_assign_pointer (all the more reason to replace it with something
> more explicit).
> 
> I will replace it with the following and resubmit it then:
> 
> /* Release barrier */
> smp_store_release(&rq->sd, sd);
> 
> Or do we want to just drop the "Release barrier" comment and live with the
> checkpatch warning?

How about we keep using rcu_assign_pointer()? The whole sched domain
tree is under RCU; peruse that destroy_sched_domains() function, for
instance.

Also check how for_each_domain() uses rcu_dereference().
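
Editor's note: the pattern Peter refers to (publish a new domain chain, let readers walk it from the base pointer upward, tear the old chain down afterwards) can be sketched in userspace as below. All names are hypothetical. The sketch frees the old chain immediately because it has no concurrent readers; as far as I can tell, the kernel's destroy_sched_domains() instead defers the teardown until after an RCU grace period.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sched_domain. */
struct domain {
	int level;
	struct domain *parent;
};

/* Hypothetical stand-in for rq->sd: the lowest-level domain of one CPU. */
static _Atomic(struct domain *) base_sd;

/* Loosely modeled on for_each_domain(): load the base once, then follow ->parent. */
#define for_each_domain_sketch(sd) \
	for ((sd) = atomic_load_explicit(&base_sd, memory_order_acquire); \
	     (sd); (sd) = (sd)->parent)

/* Build a chain of 'levels' domains and return its lowest level. */
static struct domain *build_chain(int levels)
{
	struct domain *parent = NULL;

	for (int level = levels - 1; level >= 0; level--) {
		struct domain *sd = malloc(sizeof(*sd));

		if (!sd)
			abort();
		sd->level = level;
		sd->parent = parent;
		parent = sd;
	}
	return parent;
}

/* Publish a fully built chain and hand the old one back to the caller. */
static struct domain *attach_chain(struct domain *sd)
{
	struct domain *old = atomic_load_explicit(&base_sd, memory_order_relaxed);

	atomic_store_explicit(&base_sd, sd, memory_order_release);
	return old;
}

/* Free a chain; only safe here because the sketch has no concurrent readers. */
static void destroy_chain(struct domain *sd)
{
	while (sd) {
		struct domain *parent = sd->parent;

		free(sd);
		sd = parent;
	}
}

/* Example reader: count the levels reachable from the published base. */
static int count_levels(void)
{
	struct domain *sd;
	int n = 0;

	for_each_domain_sketch(sd)
		n++;
	return n;
}
```

The sketch assumes a single writer at a time. The point it tries to make is that the store in attach_chain() is what makes the freshly built chain reachable to readers, so it is the publish step of the update.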
Joel Fernandes Feb. 21, 2019, 5:17 p.m. UTC | #4
On Thu, Feb 21, 2019 at 04:29:44PM +0100, Peter Zijlstra wrote:
> On Thu, Feb 21, 2019 at 10:10:57AM -0500, Joel Fernandes wrote:
> > Hi Peter,
> > 
> > Thanks for taking a look.
> > 
> > On Thu, Feb 21, 2019 at 10:19:44AM +0100, Peter Zijlstra wrote:
> > > On Thu, Feb 21, 2019 at 12:49:41AM -0500, Joel Fernandes (Google) wrote:
> > > 
> > > > Also replace rcu_assign_pointer call on rq->sd with WRITE_ONCE. This
> > > > should be sufficient for the rq->sd initialization.
> > > 
> > > > @@ -668,7 +668,7 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
> > > >  
> > > >  	rq_attach_root(rq, rd);
> > > >  	tmp = rq->sd;
> > > > -	rcu_assign_pointer(rq->sd, sd);
> > > > +	WRITE_ONCE(rq->sd, sd);
> > > >  	dirty_sched_domain_sysctl(cpu);
> > > >  	destroy_sched_domains(tmp);
> > > 
> > > Where did the RELEASE barrier go?
> > > 
> > > That was a publish operation, now it is not.
> > 
> > Funny thing is, initially I had written this patch with smp_store_release()
> > instead of WRITE_ONCE, but checkpatch complaints with that since it needs a
> > comment on top of it, and I wasn't sure if RELEASE barrier was the intent of
> > using rcu_assign_pointer (all the more reason to replace it with something
> > more explicit).
> > 
> > I will replace it with the following and resubmit it then:
> > 
> > /* Release barrier */
> > smp_store_release(&rq->sd, sd);
> > 
> > Or do we want to just drop the "Release barrier" comment and live with the
> > checkpatch warning?
> 
> How about we keep using rcu_assign_pointer(), the whole sched domain
> tree is under rcu; peruse that destroy_sched_domains() function for
> instance.
> 
> Also check how for_each_domain() uses rcu_dereference().

Maybe then all those pointers should be made __rcu as well. Then we can use
rcu_assign_pointer() here. I will look into it more and study these functions
as you are suggesting.

thanks,

- Joel
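
Editor's note: a rough sketch of how an __rcu-style annotation works may help here. To my understanding, the kernel defines __rcu as a noderef address-space attribute when sparse is running and as nothing for the real compiler, so mixing annotated and plain pointers without an explicit cast is what produces the "different address spaces" errors. SKETCH_RCU and SKETCH_FORCE below are hypothetical stand-ins for __rcu and __force, the address-space number is an assumption, and the accessors deliberately leave out the memory ordering that rcu_assign_pointer() and rcu_dereference() provide.

```c
#include <stddef.h>

/*
 * Hypothetical stand-ins for the kernel's __rcu and __force. They only
 * mean something to sparse (which defines __CHECKER__); for a normal
 * compiler they expand to nothing.
 */
#ifdef __CHECKER__
# define SKETCH_RCU	__attribute__((noderef, address_space(4)))
# define SKETCH_FORCE	__attribute__((force))
#else
# define SKETCH_RCU
# define SKETCH_FORCE
#endif

/* Hypothetical stand-ins for struct sched_domain and struct rq. */
struct domain {
	int id;
};

struct runqueue {
	struct domain SKETCH_RCU *sd;	/* annotated: no direct dereference */
};

/* Writer side: the cast moves a plain pointer into the annotated space. */
static void assign_sd(struct runqueue *rq, struct domain *sd)
{
	rq->sd = (struct domain SKETCH_FORCE SKETCH_RCU *)sd;
}

/* Reader side: the cast strips the annotation before the pointer is used. */
static struct domain *deref_sd(struct runqueue *rq)
{
	return (struct domain SKETCH_FORCE *)rq->sd;
}
```

With the annotation in place, every access has to go through an accessor of this kind, which is the reason annotating rq->sd and its relatives would let rcu_assign_pointer() stay without tripping the new check.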

Patch

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 2ab545d40381..806703afd4b0 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -780,7 +780,7 @@  struct root_domain {
 	 * NULL-terminated list of performance domains intersecting with the
 	 * CPUs of the rd. Protected by RCU.
 	 */
-	struct perf_domain	*pd;
+	struct perf_domain __rcu *pd;
 };
 
 extern struct root_domain def_root_domain;
@@ -1305,13 +1305,13 @@  static inline struct sched_domain *lowest_flag_domain(int cpu, int flag)
 	return sd;
 }
 
-DECLARE_PER_CPU(struct sched_domain *, sd_llc);
+DECLARE_PER_CPU(struct sched_domain __rcu *, sd_llc);
 DECLARE_PER_CPU(int, sd_llc_size);
 DECLARE_PER_CPU(int, sd_llc_id);
-DECLARE_PER_CPU(struct sched_domain_shared *, sd_llc_shared);
-DECLARE_PER_CPU(struct sched_domain *, sd_numa);
-DECLARE_PER_CPU(struct sched_domain *, sd_asym_packing);
-DECLARE_PER_CPU(struct sched_domain *, sd_asym_cpucapacity);
+DECLARE_PER_CPU(struct sched_domain_shared __rcu *, sd_llc_shared);
+DECLARE_PER_CPU(struct sched_domain __rcu *, sd_numa);
+DECLARE_PER_CPU(struct sched_domain __rcu *, sd_asym_packing);
+DECLARE_PER_CPU(struct sched_domain __rcu *, sd_asym_cpucapacity);
 extern struct static_key_false sched_asym_cpucapacity;
 
 struct sched_group_capacity {
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 3f35ba1d8fde..2eab2e16ded5 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -586,13 +586,13 @@  static void destroy_sched_domains(struct sched_domain *sd)
  * the cpumask of the domain), this allows us to quickly tell if
  * two CPUs are in the same cache domain, see cpus_share_cache().
  */
-DEFINE_PER_CPU(struct sched_domain *, sd_llc);
+DEFINE_PER_CPU(struct sched_domain __rcu *, sd_llc);
 DEFINE_PER_CPU(int, sd_llc_size);
 DEFINE_PER_CPU(int, sd_llc_id);
-DEFINE_PER_CPU(struct sched_domain_shared *, sd_llc_shared);
-DEFINE_PER_CPU(struct sched_domain *, sd_numa);
-DEFINE_PER_CPU(struct sched_domain *, sd_asym_packing);
-DEFINE_PER_CPU(struct sched_domain *, sd_asym_cpucapacity);
+DEFINE_PER_CPU(struct sched_domain_shared __rcu *, sd_llc_shared);
+DEFINE_PER_CPU(struct sched_domain __rcu *, sd_numa);
+DEFINE_PER_CPU(struct sched_domain __rcu *, sd_asym_packing);
+DEFINE_PER_CPU(struct sched_domain __rcu *, sd_asym_cpucapacity);
 DEFINE_STATIC_KEY_FALSE(sched_asym_cpucapacity);
 
 static void update_top_cache_domain(int cpu)
@@ -668,7 +668,7 @@  cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
 
 	rq_attach_root(rq, rd);
 	tmp = rq->sd;
-	rcu_assign_pointer(rq->sd, sd);
+	WRITE_ONCE(rq->sd, sd);
 	dirty_sched_domain_sysctl(cpu);
 	destroy_sched_domains(tmp);