Message ID | 1269871964-5412-2-git-send-email-timo.teras@iki.fi |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
On Mon, Mar 29, 2010 at 05:12:38PM +0300, Timo Teras wrote: > > @@ -1132,7 +1119,7 @@ int xfrm_sk_policy_insert(struct sock *sk, int dir, struct xfrm_policy *pol) > __xfrm_policy_link(pol, XFRM_POLICY_MAX+dir); > } > if (old_pol) > - __xfrm_policy_unlink(old_pol, XFRM_POLICY_MAX+dir); > + old_pol = __xfrm_policy_unlink(old_pol, XFRM_POLICY_MAX+dir); > write_unlock_bh(&xfrm_policy_lock); > > if (old_pol) { So when can this actually fail? Otherwise this patch looks good to me. Thanks,
Herbert Xu wrote: > On Mon, Mar 29, 2010 at 05:12:38PM +0300, Timo Teras wrote: >> @@ -1132,7 +1119,7 @@ int xfrm_sk_policy_insert(struct sock *sk, int dir, struct xfrm_policy *pol) >> __xfrm_policy_link(pol, XFRM_POLICY_MAX+dir); >> } >> if (old_pol) >> - __xfrm_policy_unlink(old_pol, XFRM_POLICY_MAX+dir); >> + old_pol = __xfrm_policy_unlink(old_pol, XFRM_POLICY_MAX+dir); >> write_unlock_bh(&xfrm_policy_lock); >> >> if (old_pol) { > > So when can this actually fail? Considering that the socket reference is received from the sk->sk_policy, and the hash bucket we use is "XFRM_POLICY_MAX+dir", it's non-obvious if it can fail or not. It would look like the timer can kill a policy and unlink it, but it would still be found from sk_policy. It probably doesn't really make sense to insert per-socket policy that expires. But in case someone does something like that, I'd think we need the above just to be sure. Considering this, xfrm_sk_policy_lookup() should probably check the dead flag, and cleanup sk_policy if it was killed by a timer. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Mar 30, 2010 at 07:55:07AM +0300, Timo Teräs wrote: > Herbert Xu wrote: >> On Mon, Mar 29, 2010 at 05:12:38PM +0300, Timo Teras wrote: >>> @@ -1132,7 +1119,7 @@ int xfrm_sk_policy_insert(struct sock *sk, int dir, struct xfrm_policy *pol) >>> __xfrm_policy_link(pol, XFRM_POLICY_MAX+dir); >>> } >>> if (old_pol) >>> - __xfrm_policy_unlink(old_pol, XFRM_POLICY_MAX+dir); >>> + old_pol = __xfrm_policy_unlink(old_pol, XFRM_POLICY_MAX+dir); >>> write_unlock_bh(&xfrm_policy_lock); >>> if (old_pol) { >> >> So when can this actually fail? > > Considering that the socket reference is received from the sk->sk_policy, > and the hash bucket we use is "XFRM_POLICY_MAX+dir", it's non-obvious if > it can fail or not. > > It would look like the timer can kill a policy and unlink it, but it > would still be found from sk_policy. Socket policies cannot expire. In fact, they probably shouldn't even be on the bydst or any other hash table. I think the only reason they're there at all is because the hash table was added to __xfrm_policy_link which happens to be used by socket policies. Cheers,
Herbert Xu wrote: > On Tue, Mar 30, 2010 at 07:55:07AM +0300, Timo Teräs wrote: >> Herbert Xu wrote: >>> On Mon, Mar 29, 2010 at 05:12:38PM +0300, Timo Teras wrote: >>>> @@ -1132,7 +1119,7 @@ int xfrm_sk_policy_insert(struct sock *sk, int dir, struct xfrm_policy *pol) >>>> __xfrm_policy_link(pol, XFRM_POLICY_MAX+dir); >>>> } >>>> if (old_pol) >>>> - __xfrm_policy_unlink(old_pol, XFRM_POLICY_MAX+dir); >>>> + old_pol = __xfrm_policy_unlink(old_pol, XFRM_POLICY_MAX+dir); >>>> write_unlock_bh(&xfrm_policy_lock); >>>> if (old_pol) { >>> So when can this actually fail? >> Considering that the socket reference is received from the sk->sk_policy, >> and the hash bucket we use is "XFRM_POLICY_MAX+dir", it's non-obvious if >> it can fail or not. >> >> It would look like the timer can kill a policy and unlink it, but it >> would still be found from sk_policy. > > Socket policies cannot expire. Was not aware of that. The above is not needed then. > In fact, they probably shouldn't even be on the bydst or any other > hash table. I think the only reason they're there at all is because > the hash table was added to __xfrm_policy_link which happens to be > used by socket policies. I think it's hashed so socket policies are included in the policy db dumps and counts. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Mar 30, 2010 at 03:04:01PM +0300, Timo Teräs wrote: >> I think it's hashed so socket policies are included in the policy > db dumps and counts. No we use a linked list for that. Cheers,
Herbert Xu wrote: > On Tue, Mar 30, 2010 at 03:04:01PM +0300, Timo Teräs wrote: >>> I think it's hashed so socket policies are included in the policy >> db dumps and counts. > > No we use a linked list for that. Ah, yes. It's just all done together in the __xfrm_link_policy(). Hmm... is it possible to modify/delete per-socket policies from userland via xfrm_user? That would be also another race why we'd need to check the unlinking result. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Mar 30, 2010 at 03:21:19PM +0300, Timo Teräs wrote: > > Hmm... is it possible to modify/delete per-socket policies from > userland via xfrm_user? That would be also another race why > we'd need to check the unlinking result. How would the user be able to specify the socket? The answer is no. Cheers,
Herbert Xu wrote: > On Tue, Mar 30, 2010 at 03:21:19PM +0300, Timo Teräs wrote: >> Hmm... is it possible to modify/delete per-socket policies from >> userland via xfrm_user? That would be also another race why >> we'd need to check the unlinking result. > > How would the user be able to specify the socket? > > The answer is no. I thought it was possible since they can be dumped. But yes, I now see that the policy direction check does not allow the per-socket policies to be specified. So it'd make more sense to nuke the hashes entirely for per-socket policies? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Mar 30, 2010 at 03:41:02PM +0300, Timo Teräs wrote: > > So it'd make more sense to nuke the hashes entirely for > per-socket policies? Absolutely.
Herbert Xu wrote: > On Tue, Mar 30, 2010 at 03:41:02PM +0300, Timo Teräs wrote: >> So it'd make more sense to nuke the hashes entirely for >> per-socket policies? > > Absolutely. I checked now the xfrm_user, and mostly it seems to prevent modification to per-socket policies. The only exception is XFRM_MSG_POLEXPIRE handler xfrm_add_pol_expire(). It calls xfrm_policy_byid() without verifying the direction, and can thus complete successfully on a per-socket policy. This can actually result in per-socket policy deletion via netlink. I guess the proper thing is to add the direction check there. It also seems that the by-index hash is also used when generating new index. It's to double check that the index is unique. So deleting the by-index hash from per-socket policies seems tricky. Removing bydst hashing should be trivial. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Timo Teräs wrote: > Herbert Xu wrote: >> On Tue, Mar 30, 2010 at 07:55:07AM +0300, Timo Teräs wrote: >>> Herbert Xu wrote: >>>> On Mon, Mar 29, 2010 at 05:12:38PM +0300, Timo Teras wrote: >>>>> @@ -1132,7 +1119,7 @@ int xfrm_sk_policy_insert(struct sock *sk, >>>>> int dir, struct xfrm_policy *pol) >>>>> __xfrm_policy_link(pol, XFRM_POLICY_MAX+dir); >>>>> } >>>>> if (old_pol) >>>>> - __xfrm_policy_unlink(old_pol, XFRM_POLICY_MAX+dir); >>>>> + old_pol = __xfrm_policy_unlink(old_pol, XFRM_POLICY_MAX+dir); >>>>> write_unlock_bh(&xfrm_policy_lock); >>>>> if (old_pol) { >>>> So when can this actually fail? >>> Considering that the socket reference is received from the >>> sk->sk_policy, >>> and the hash bucket we use is "XFRM_POLICY_MAX+dir", it's non-obvious if >>> it can fail or not. >>> >>> It would look like the timer can kill a policy and unlink it, but it >>> would still be found from sk_policy. >> >> Socket policies cannot expire. > > Was not aware of that. The above is not needed then. Since the exported function xfrm_policy_byid() can result in deletion of socket policy, it's safer to leave this change in. This is can be even triggered via xfrm_user since it does not check 'dir' for the policy expired message it handles. Any custom module could do similar harm. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Mar 30, 2010 at 05:01:47PM +0300, Timo Teräs wrote: > > Since the exported function xfrm_policy_byid() can result in deletion > of socket policy, it's safer to leave this change in. This is can be > even triggered via xfrm_user since it does not check 'dir' for the > policy expired message it handles. Any custom module could do similar > harm. That's a bug. Socket policies should never be deleted by anyone other than the socket owner through a setsockopt. Cheers,
On Tue, Mar 30, 2010 at 04:33:52PM +0300, Timo Teräs wrote: > Herbert Xu wrote: >> On Tue, Mar 30, 2010 at 03:41:02PM +0300, Timo Teräs wrote: >>> So it'd make more sense to nuke the hashes entirely for >>> per-socket policies? >> >> Absolutely. > > I checked now the xfrm_user, and mostly it seems to prevent > modification to per-socket policies. > > The only exception is XFRM_MSG_POLEXPIRE handler > xfrm_add_pol_expire(). It calls xfrm_policy_byid() without > verifying the direction, and can thus complete successfully on > a per-socket policy. This can actually result in per-socket > policy deletion via netlink. That shouldn't be possible since the directions used by socket policies cannot be set through xfrm_user. Cheers,
On Tue, Mar 30, 2010 at 10:30:48PM +0800, Herbert Xu wrote: > On Tue, Mar 30, 2010 at 04:33:52PM +0300, Timo Teräs wrote: > > Herbert Xu wrote: > >> On Tue, Mar 30, 2010 at 03:41:02PM +0300, Timo Teräs wrote: > >>> So it'd make more sense to nuke the hashes entirely for > >>> per-socket policies? > >> > >> Absolutely. > > > > I checked now the xfrm_user, and mostly it seems to prevent > > modification to per-socket policies. > > > > The only exception is XFRM_MSG_POLEXPIRE handler > > xfrm_add_pol_expire(). It calls xfrm_policy_byid() without > > verifying the direction, and can thus complete successfully on > > a per-socket policy. This can actually result in per-socket > > policy deletion via netlink. > > That shouldn't be possible since the directions used by socket > policies cannot be set through xfrm_user. If there is a missing direction check then we should add it.
On Tue, Mar 30, 2010 at 04:33:52PM +0300, Timo Teräs wrote: > > It also seems that the by-index hash is also used when > generating new index. It's to double check that the index > is unique. So deleting the by-index hash from per-socket > policies seems tricky. > > Removing bydst hashing should be trivial. Yes that makes sense.
Herbert Xu wrote: > On Tue, Mar 30, 2010 at 05:01:47PM +0300, Timo Teräs wrote: >> Since the exported function xfrm_policy_byid() can result in deletion >> of socket policy, it's safer to leave this change in. This is can be >> even triggered via xfrm_user since it does not check 'dir' for the >> policy expired message it handles. Any custom module could do similar >> harm. > > That's a bug. Socket policies should never be deleted by anyone > other than the socket owner through a setsockopt. Ok. I can remove the change to xfrm_sk_policy_insert() if you want. I only added it because it's non-trivial to figure out if there's any code path that could race. It's a great help for reader of the code to see that it's correct even if it's not strictly needed. My initial feeling is that the change produces better code as the original 'old_pol' does not have to get stored and restored. I'm fine either way. I will also include a patch to fix the missing 'dir' check in xfrm_user. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Mar 30, 2010 at 06:36:37PM +0300, Timo Teräs wrote: > > I only added it because it's non-trivial to figure out if there's > any code path that could race. It's a great help for reader of the > code to see that it's correct even if it's not strictly needed. No that's bad because you're misleading people into thinking something that isn't allowed can happen. It's much better to add a comment instead. > I will also include a patch to fix the missing 'dir' check in > xfrm_user. Thanks,
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index 843e066..595d347 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -156,7 +156,7 @@ static void xfrm_policy_timer(unsigned long data) read_lock(&xp->lock); - if (xp->walk.dead) + if (unlikely(xp->walk.dead)) goto out; dir = xfrm_policy_id2dir(xp->index); @@ -297,17 +297,7 @@ static DECLARE_WORK(xfrm_policy_gc_work, xfrm_policy_gc_task); static void xfrm_policy_kill(struct xfrm_policy *policy) { - int dead; - - write_lock_bh(&policy->lock); - dead = policy->walk.dead; policy->walk.dead = 1; - write_unlock_bh(&policy->lock); - - if (unlikely(dead)) { - WARN_ON(1); - return; - } spin_lock_bh(&xfrm_policy_gc_lock); hlist_add_head(&policy->bydst, &xfrm_policy_gc_list); @@ -776,7 +766,6 @@ xfrm_policy_flush_secctx_check(struct net *net, u8 type, struct xfrm_audit *audi int xfrm_policy_flush(struct net *net, u8 type, struct xfrm_audit *audit_info) { int dir, err = 0, cnt = 0; - struct xfrm_policy *dp; write_lock_bh(&xfrm_policy_lock); @@ -794,10 +783,9 @@ int xfrm_policy_flush(struct net *net, u8 type, struct xfrm_audit *audit_info) &net->xfrm.policy_inexact[dir], bydst) { if (pol->type != type) continue; - dp = __xfrm_policy_unlink(pol, dir); + __xfrm_policy_unlink(pol, dir); write_unlock_bh(&xfrm_policy_lock); - if (dp) - cnt++; + cnt++; xfrm_audit_policy_delete(pol, 1, audit_info->loginuid, audit_info->sessionid, @@ -816,10 +804,9 @@ int xfrm_policy_flush(struct net *net, u8 type, struct xfrm_audit *audit_info) bydst) { if (pol->type != type) continue; - dp = __xfrm_policy_unlink(pol, dir); + __xfrm_policy_unlink(pol, dir); write_unlock_bh(&xfrm_policy_lock); - if (dp) - cnt++; + cnt++; xfrm_audit_policy_delete(pol, 1, audit_info->loginuid, @@ -1132,7 +1119,7 @@ int xfrm_sk_policy_insert(struct sock *sk, int dir, struct xfrm_policy *pol) __xfrm_policy_link(pol, XFRM_POLICY_MAX+dir); } if (old_pol) - __xfrm_policy_unlink(old_pol, XFRM_POLICY_MAX+dir); + old_pol = __xfrm_policy_unlink(old_pol, XFRM_POLICY_MAX+dir); write_unlock_bh(&xfrm_policy_lock); if (old_pol) { @@ -1737,11 +1724,8 @@ restart: goto error; } - for (pi = 0; pi < npols; pi++) { - read_lock_bh(&pols[pi]->lock); + for (pi = 0; pi < npols; pi++) pol_dead |= pols[pi]->walk.dead; - read_unlock_bh(&pols[pi]->lock); - } write_lock_bh(&policy->lock); if (unlikely(pol_dead || stale_bundle(dst))) { diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index 6106b72..778a6ec 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -1766,13 +1766,9 @@ static int xfrm_add_pol_expire(struct sk_buff *skb, struct nlmsghdr *nlh, if (xp == NULL) return -ENOENT; - read_lock(&xp->lock); - if (xp->walk.dead) { - read_unlock(&xp->lock); + if (unlikely(xp->walk.dead)) goto out; - } - read_unlock(&xp->lock); err = 0; if (up->hard) { uid_t loginuid = NETLINK_CB(skb).loginuid;
All of the code considers ->dead as a hint that the cached policy needs to get refreshed. The read side can just drop the read lock without any side effects. The write side needs to make sure that it's written only exactly once. Only possible race is at xfrm_policy_kill(). This is fixed by checking result of __xfrm_policy_unlink() when needed. It will always succeed if the policy object is looked up from the hash list (so some checks are removed), but it needs to be checked if we are trying to unlink policy via a reference (appropriate checks added). Since policy->walk.dead is written exactly once, it no longer needs to be protected with a write lock. Signed-off-by: Timo Teras <timo.teras@iki.fi> --- net/xfrm/xfrm_policy.c | 30 +++++++----------------------- net/xfrm/xfrm_user.c | 6 +----- 2 files changed, 8 insertions(+), 28 deletions(-)