Message ID | 1514662599-14491-1-git-send-email-santosh.shilimkar@oracle.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
Series | rds: fix use-after-free read in rds_find_bound | expand |
On (12/30/17 11:36), Santosh Shilimkar wrote: > > socket buffer can get freed as part of sock_close > callback so before adding reference check underneath > socket validity. I'm not sure I understand this fix- struct rds_sock is: struct rds_sock { struct sock rs_sk; : } How can rs be non-null but rds_rs_to_sk() is null? (Note that rds_rs_to_sk just returns &rs->rs_sk) so the changed line is identical to the original line. > - if (rs && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD)) > + if (rs && rds_rs_to_sk(rs) && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD)) I think the real issue is refcount bug somewhere, Was the syzbot test run with http://patchwork.ozlabs.org/patch/852492/ this sounds like that type of bug. --Sowmini
On 12/30/17 12:26 PM, Sowmini Varadhan wrote: > On (12/30/17 11:36), Santosh Shilimkar wrote: >> >> socket buffer can get freed as part of sock_close >> callback so before adding reference check underneath >> socket validity. > > I'm not sure I understand this fix- > > struct rds_sock is: > struct rds_sock { > struct sock rs_sk; > : > } > > How can rs be non-null but rds_rs_to_sk() is null? (Note that > rds_rs_to_sk just returns &rs->rs_sk) so the changed line is > identical to the original line. > Well thats what the report says o.w flag test wouldn't have been attempted. >> - if (rs && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD)) >> + if (rs && rds_rs_to_sk(rs) && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD)) > > I think the real issue is refcount bug somewhere, > Thats what I thought as well initially but since the reported case, the rs seems to be valid where as sk seems to be freed up as part of sock_release callback. > Was the syzbot test run with http://patchwork.ozlabs.org/patch/852492/ > this sounds like that type of bug. > That fix scenario, the rs don't get inserted in hash table and in this particular bug, the lookup was successful so am not sure if these two bugs are related. But since bound address fix was still not part of the build reproduced use after free bug, $subject fix can wait for next reproduction. Unfortunately as per the report, there is no reproducer for it to test if other fix fixes this issue. Regards, Santosh
On (12/30/17 13:37), santosh.shilimkar@oracle.com wrote: > Well thats what the report says o.w flag test wouldn't have > been attempted. the bug report says "use-after-free". It doesnt say that rds_rs_to_sk(rs) is null (if rds_rs_to_sk(rs) was null, rs would also be null, please cscope struct rds_sock) What the bug report says is " The buggy address belongs to the object at ffff8801c09a6080 which belongs to the cache RDS of size 1472 The buggy address is located 96 bytes inside of .." 96 is the offset of sk->sk_flags. so yes, there is a socket refcount issue. But the patch you sent (see next two lines) will not solve that. > >>- if (rs && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD)) > >>+ if (rs && rds_rs_to_sk(rs) && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD)) Sowmini>I think the real issue is refcount bug somewhere, > Thats what I thought as well initially but since the reported case, > the rs seems to be valid where as sk seems to be freed up as part of > sock_release callback. I dont understand the statement above- how can "rs be valid, and sk be freed"? rs_sk is embedded in the struct rds_sock, it is not a pointer. let's find and fix the refcount bug. See stack trace in commit comment. The socket release is happening prematurely and existing WARN_ONs are not catching it. > >Was the syzbot test run with http://patchwork.ozlabs.org/patch/852492/ > >this sounds like that type of bug. --Sowmini
On 12/30/17 2:32 PM, Sowmini Varadhan wrote: > On (12/30/17 13:37), santosh.shilimkar@oracle.com wrote: [...] >> Thats what I thought as well initially but since the reported case, >> the rs seems to be valid where as sk seems to be freed up as part of >> sock_release callback. > > I dont understand the statement above- how can "rs be valid, and sk > be freed"? > > rs_sk is embedded in the struct rds_sock, it is not a pointer. > I was going with order of evaluation of if () but you made good point. rs_sk isn't a pointer so sk can't be null. > let's find and fix the refcount bug. See stack trace in commit comment. > The socket release is happening prematurely and existing WARN_ONs > are not catching it. > Right. This was loop transport in action so xmit will just flip the direction with receive. And rds_recv_incoming() can race with socket_release. rds_find_bound() is suppose to add ref count on socket for rds_recv_incoming() but by that time socket is DEAD & freed by socket release callback. And rds_release is suppose to be synced with rs_recv_lock. But release callback is marking the sk orphan before syncing up with receive path and updating the bind table. Probably it can pushed down further after the socket clean up buut need to think bit more. diff --git a/net/rds/af_rds.c b/net/rds/af_rds.c index b405f77..11e1426 100644 --- a/net/rds/af_rds.c +++ b/net/rds/af_rds.c @@ -65,7 +65,6 @@ static int rds_release(struct socket *sock) rs = rds_sk_to_rs(sk); - sock_orphan(sk); /* Note - rds_clear_recv_queue grabs rs_recv_lock, so * that ensures the recv path has completed messing * with the socket. */ @@ -85,6 +84,7 @@ static int rds_release(struct socket *sock) rds_trans_put(rs->rs_transport); + sock_orphan(sk); sock->sk = NULL; sock_put(sk); out:
On (12/30/17 21:09), santosh.shilimkar@oracle.com wrote: > Right. This was loop transport in action so xmit will just flip > the direction with receive. And rds_recv_incoming() can race with > socket_release. rds_find_bound() is suppose to add ref count on > socket for rds_recv_incoming() but by that time socket is DEAD & > freed by socket release callback. correct, that makes sense. > And rds_release is suppose to be synced with rs_recv_lock. But > release callback is marking the sk orphan before syncing > up with receive path and updating the bind table. Probably it > can pushed down further after the socket clean up buut need > to think bit more. However, I'm not sure this seals the race.. according to the bug report rds_recv_incoming->rds_find_bound is being called in rds_send_worker context and the rds_find_bound code is 63 rs = rhashtable_lookup_fast(&bind_hash_table, &key, ht_parms); 64 if (rs && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD)) 65 rds_sock_addref(rs); 66 else 67 rs = NULL; 68 Since the entire logic of rds_release can interleave between line 63 and 64, (whereas we only addref at line 65), moving the sock_orphan will not help. I see that there was an explicic synchornization via the bucket->lock before 7b5654349e. I think you need something like that, or some type or rcu-based hash list. Patch below may make race-window smaller, but race window is still there. > > > diff --git a/net/rds/af_rds.c b/net/rds/af_rds.c > index b405f77..11e1426 100644 > --- a/net/rds/af_rds.c > +++ b/net/rds/af_rds.c > @@ -65,7 +65,6 @@ static int rds_release(struct socket *sock) > > rs = rds_sk_to_rs(sk); > > - sock_orphan(sk); > /* Note - rds_clear_recv_queue grabs rs_recv_lock, so > * that ensures the recv path has completed messing > * with the socket. */ > @@ -85,6 +84,7 @@ static int rds_release(struct socket *sock) > > rds_trans_put(rs->rs_transport); > > + sock_orphan(sk); > sock->sk = NULL; > sock_put(sk); > out:
On 12/31/17 4:33 AM, Sowmini Varadhan wrote: > On (12/30/17 21:09), santosh.shilimkar@oracle.com wrote: >> Right. This was loop transport in action so xmit will just flip >> the direction with receive. And rds_recv_incoming() can race with >> socket_release. rds_find_bound() is suppose to add ref count on >> socket for rds_recv_incoming() but by that time socket is DEAD & >> freed by socket release callback. > > correct, that makes sense. > Yea. In fact the earlier point of sk being null and rs not is also possible because socket release explicitly marks its NULL ("sock->sk = NULL"). But it just side effect of the actual race. >> And rds_release is suppose to be synced with rs_recv_lock. But >> release callback is marking the sk orphan before syncing >> up with receive path and updating the bind table. Probably it >> can pushed down further after the socket clean up buut need >> to think bit more. > > However, I'm not sure this seals the race.. according to the > bug report rds_recv_incoming->rds_find_bound is being called > in rds_send_worker context and the rds_find_bound code is > > 63 rs = rhashtable_lookup_fast(&bind_hash_table, &key, ht_parms); > 64 if (rs && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD)) > 65 rds_sock_addref(rs); > 66 else > 67 rs = NULL; > 68 > > Since the entire logic of rds_release can interleave between line 63 > and 64, (whereas we only addref at line 65), moving the sock_orphan > will not help. > > I see that there was an explicic synchornization via the bucket->lock > before 7b5654349e. I think you need something like that, or some type > or rcu-based hash list. > The rhashtable already has internal bucket lock so those operation like add/remove are synced up. But yes reference addition can still race with receive since receive lock is taken after find bound. > Patch below may make race-window smaller, but race window is still there. >If the receive lock is taken ahead then with sock_orphan change socket release will get synchronized with receive. But preventing socket release to be getting triggered while in receive path by means ref count is better option. Moving socket_orphan down is anyway a good change but its not enough. Will think bit more about it. Thanks for the good discussion. Regards, Santosh
diff --git a/net/rds/bind.c b/net/rds/bind.c index 75d43dc..8dec06e 100644 --- a/net/rds/bind.c +++ b/net/rds/bind.c @@ -61,7 +61,7 @@ struct rds_sock *rds_find_bound(__be32 addr, __be16 port) struct rds_sock *rs; rs = rhashtable_lookup_fast(&bind_hash_table, &key, ht_parms); - if (rs && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD)) + if (rs && rds_rs_to_sk(rs) && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD)) rds_sock_addref(rs); else rs = NULL;
socket buffer can get freed as part of sock_close callback so before adding reference check underneath socket validity. Reported-by: syzbot+93a5839deb355537440f@syzkaller.appspotmail.com Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> --- net/rds/bind.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)