diff mbox

netns: revert "netns: avoid disabling irq for netns id"

Message ID 147710095406.17836.12069093425270216381.stgit@localhost
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Paul Moore Oct. 22, 2016, 1:49 a.m. UTC
From: Paul Moore <paul@paul-moore.com>

This reverts commit bc51dddf98c9 ("netns: avoid disabling irq for
netns id") as it was found to cause problems with systems running
SELinux/audit, see the mailing list thread below:

 * http://marc.info/?t=147694653900002&r=1&w=2

Eventually we should be able to reintroduce this code once we have
rewritten the audit multicast code to queue messages much the same
way we do for unicast messages.  A tracking issue for this can be
found below:

 * https://github.com/linux-audit/audit-kernel/issues/23

Reported-by: Stephen Smalley <sds@tycho.nsa.gov>
Reported-by: Elad Raz <e@eladraz.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
---
 net/core/net_namespace.c |   35 ++++++++++++++++++++---------------
 1 file changed, 20 insertions(+), 15 deletions(-)

Comments

Cong Wang Oct. 22, 2016, 3:38 a.m. UTC | #1
On Fri, Oct 21, 2016 at 6:49 PM, Paul Moore <pmoore@redhat.com> wrote:
> Eventually we should be able to reintroduce this code once we have
> rewritten the audit multicast code to queue messages much the same
> way we do for unicast messages.  A tracking issue for this can be
> found below:

NAK.

1) This will be forgotten by Paul.
2) There is already a fix which is considered as a rework by Paul.
3) -rc2 is Paul's personal deadline, not ours.
Paul Moore Oct. 22, 2016, 7:29 p.m. UTC | #2
On Fri, Oct 21, 2016 at 11:38 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Fri, Oct 21, 2016 at 6:49 PM, Paul Moore <pmoore@redhat.com> wrote:
>> Eventually we should be able to reintroduce this code once we have
>> rewritten the audit multicast code to queue messages much the same
>> way we do for unicast messages.  A tracking issue for this can be
>> found below:
>
> NAK.
>
> 1) This will be forgotten by Paul.

The way things are going right now, this argument is going to devolve
into a "yes he will"/"no I won't" so I'll just repeat that I've
created a tracking issue for this so I won't forget (and included a
link, repeated below, in the commit description) and I think I have a
reasonable history of following through on things.

* https://github.com/linux-audit/audit-kernel/issues/23

> 2) There is already a fix which is considered as a rework by Paul.

Already discussed this in the other thread, I'm not going to go into
detail here, just a quick summary: the fix provided by Cong Wang
doubles the message queue's memory consumption and changes some
fundamentals in how multicast messages are handled.  The memory
issues, while still an objectionable blocker, are easily resolved, but
moving the netlink multicast send is something I want to make sure is
tested/baked for a bit (it's 4.10 merge window material as far as I'm
concerned).

At this point I think it is worth mentioning that we are in this
position due to a lack of testing; if Cong Wang had tested his
original patch with SELinux we might not be dealing with this
regression now.  A more measured approach seems very reasonable.

> 3) -rc2 is Paul's personal deadline, not ours.

The current 4.9-rc kernels are broken and cause errors when SELinux is
enabled, while I understand SELinux is not a priority (or a secondary,
or tertiary, or N-ary concern) for many on the netdev list, it is
still an important part of the kernel and this regression needs to be
treated seriously and corrected soon.

SELinux/audit has run into interaction issues with the network stack
before, and we've worked together to sort things out; I'm hopeful
cooler heads will prevail and we can do the same here.
David Miller Oct. 22, 2016, 8:16 p.m. UTC | #3
I'm tired of hearing you guys go back and forth with each other and
will simply apply the revert.

Thanks.
Cong Wang Oct. 24, 2016, 5:52 p.m. UTC | #4
On Sat, Oct 22, 2016 at 12:29 PM, Paul Moore <pmoore@redhat.com> wrote:
> On Fri, Oct 21, 2016 at 11:38 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> On Fri, Oct 21, 2016 at 6:49 PM, Paul Moore <pmoore@redhat.com> wrote:
>>> Eventually we should be able to reintroduce this code once we have
>>> rewritten the audit multicast code to queue messages much the same
>>> way we do for unicast messages.  A tracking issue for this can be
>>> found below:
>>
>> NAK.
>>
>> 1) This will be forgotten by Paul.
>
> The way things are going right now, this argument is going to devolve
> into a "yes he will"/"no I won't" so I'll just repeat that I've
> created a tracking issue for this so I won't forget (and included a
> link, repeated below, in the commit description) and I think I have a
> reasonable history of following through on things.
>
> * https://github.com/linux-audit/audit-kernel/issues/23

I never doubt you will remember to do the audit part, what you will
forget is the revert to your revert. We will see.

Also, you make git log history much uglier.

>
>> 2) There is already a fix which is considered as a rework by Paul.
>
> Already discussed this in the other thread, I'm not going to go into
> detail here, just a quick summary: the fix provided by Cong Wang
> doubles the message queue's memory consumption and changes some
> fundamentals in how multicast messages are handled.  The memory
> issues, while still an objectionable blocker, are easily resolved, but
> moving the netlink multicast send is something I want to make sure is
> tested/baked for a bit (it's 4.10 merge window material as far as I'm
> concerned).

Sounds like you don't have the capacity to get it reviewed and tested
within 5 weeks (assuming -rc7 will be the final RC), as a maintainer.


>
> At this point I think it is worth mentioning that we are in this
> position due to a lack of testing; if Cong Wang had tested his
> original patch with SELinux we might not be dealing with this
> regression now.  A more measured approach seems very reasonable.
>

My SELinux is silently disabled because CONFIG_DEFAULT_SECURITY_SELINUX=y
was missing in my kernel config. The change is a cross-subsystem one,
I definitely can't guarantee I can cover all subsystems. This is exactly
why we need -rc1...-rc7, the moment you close the door for -rc2,
the moment you lose the opportunity to get it tested more widely.
I am sure you will revert the revert of revert again for the next merge
window if you continue to work in this style.


>> 3) -rc2 is Paul's personal deadline, not ours.
>
> The current 4.9-rc kernels are broken and cause errors when SELinux is
> enabled, while I understand SELinux is not a priority (or a secondary,
> or tertiary, or N-ary concern) for many on the netdev list, it is
> still an important part of the kernel and this regression needs to be
> treated seriously and corrected soon.

You get it wrong, it is never because SELinux is not important, every
part of Linux kernel is important. You need to realize we as a whole
community don't work in this way, -rc2 is NOT late for a bug fix of any
part of Linux kernel. If you can't review and test a 30-line patch in 5 weeks,
it is very likely your problem.


>
> SELinux/audit has run into interaction issues with the network stack
> before, and we've worked together to sort things out; I'm hopeful
> cooler heads will prevail and we can do the same here.

I am trying my best to help (by providing 3 possible patches), you refused
them because of _your_ -rc2 deadline. Let people judge who is the one
doesn't work together.

I am tired of explaining why we have -rc7 to you.
diff mbox

Patch

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 989434f..f61c0e0 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -215,13 +215,14 @@  static void rtnl_net_notifyid(struct net *net, int cmd, int id);
  */
 int peernet2id_alloc(struct net *net, struct net *peer)
 {
+	unsigned long flags;
 	bool alloc;
 	int id;
 
-	spin_lock_bh(&net->nsid_lock);
+	spin_lock_irqsave(&net->nsid_lock, flags);
 	alloc = atomic_read(&peer->count) == 0 ? false : true;
 	id = __peernet2id_alloc(net, peer, &alloc);
-	spin_unlock_bh(&net->nsid_lock);
+	spin_unlock_irqrestore(&net->nsid_lock, flags);
 	if (alloc && id >= 0)
 		rtnl_net_notifyid(net, RTM_NEWNSID, id);
 	return id;
@@ -230,11 +231,12 @@  int peernet2id_alloc(struct net *net, struct net *peer)
 /* This function returns, if assigned, the id of a peer netns. */
 int peernet2id(struct net *net, struct net *peer)
 {
+	unsigned long flags;
 	int id;
 
-	spin_lock_bh(&net->nsid_lock);
+	spin_lock_irqsave(&net->nsid_lock, flags);
 	id = __peernet2id(net, peer);
-	spin_unlock_bh(&net->nsid_lock);
+	spin_unlock_irqrestore(&net->nsid_lock, flags);
 	return id;
 }
 EXPORT_SYMBOL(peernet2id);
@@ -249,17 +251,18 @@  bool peernet_has_id(struct net *net, struct net *peer)
 
 struct net *get_net_ns_by_id(struct net *net, int id)
 {
+	unsigned long flags;
 	struct net *peer;
 
 	if (id < 0)
 		return NULL;
 
 	rcu_read_lock();
-	spin_lock_bh(&net->nsid_lock);
+	spin_lock_irqsave(&net->nsid_lock, flags);
 	peer = idr_find(&net->netns_ids, id);
 	if (peer)
 		get_net(peer);
-	spin_unlock_bh(&net->nsid_lock);
+	spin_unlock_irqrestore(&net->nsid_lock, flags);
 	rcu_read_unlock();
 
 	return peer;
@@ -422,17 +425,17 @@  static void cleanup_net(struct work_struct *work)
 		for_each_net(tmp) {
 			int id;
 
-			spin_lock_bh(&tmp->nsid_lock);
+			spin_lock_irq(&tmp->nsid_lock);
 			id = __peernet2id(tmp, net);
 			if (id >= 0)
 				idr_remove(&tmp->netns_ids, id);
-			spin_unlock_bh(&tmp->nsid_lock);
+			spin_unlock_irq(&tmp->nsid_lock);
 			if (id >= 0)
 				rtnl_net_notifyid(tmp, RTM_DELNSID, id);
 		}
-		spin_lock_bh(&net->nsid_lock);
+		spin_lock_irq(&net->nsid_lock);
 		idr_destroy(&net->netns_ids);
-		spin_unlock_bh(&net->nsid_lock);
+		spin_unlock_irq(&net->nsid_lock);
 
 	}
 	rtnl_unlock();
@@ -561,6 +564,7 @@  static int rtnl_net_newid(struct sk_buff *skb, struct nlmsghdr *nlh)
 {
 	struct net *net = sock_net(skb->sk);
 	struct nlattr *tb[NETNSA_MAX + 1];
+	unsigned long flags;
 	struct net *peer;
 	int nsid, err;
 
@@ -581,15 +585,15 @@  static int rtnl_net_newid(struct sk_buff *skb, struct nlmsghdr *nlh)
 	if (IS_ERR(peer))
 		return PTR_ERR(peer);
 
-	spin_lock_bh(&net->nsid_lock);
+	spin_lock_irqsave(&net->nsid_lock, flags);
 	if (__peernet2id(net, peer) >= 0) {
-		spin_unlock_bh(&net->nsid_lock);
+		spin_unlock_irqrestore(&net->nsid_lock, flags);
 		err = -EEXIST;
 		goto out;
 	}
 
 	err = alloc_netid(net, peer, nsid);
-	spin_unlock_bh(&net->nsid_lock);
+	spin_unlock_irqrestore(&net->nsid_lock, flags);
 	if (err >= 0) {
 		rtnl_net_notifyid(net, RTM_NEWNSID, err);
 		err = 0;
@@ -711,10 +715,11 @@  static int rtnl_net_dumpid(struct sk_buff *skb, struct netlink_callback *cb)
 		.idx = 0,
 		.s_idx = cb->args[0],
 	};
+	unsigned long flags;
 
-	spin_lock_bh(&net->nsid_lock);
+	spin_lock_irqsave(&net->nsid_lock, flags);
 	idr_for_each(&net->netns_ids, rtnl_net_dumpid_one, &net_cb);
-	spin_unlock_bh(&net->nsid_lock);
+	spin_unlock_irqrestore(&net->nsid_lock, flags);
 
 	cb->args[0] = net_cb.idx;
 	return skb->len;