Message ID | 1432225541-28498-1-git-send-email-salyzyn@android.com |
---|---|
State | Superseded, archived |
Delegated to: | David Miller |
Headers | show |
On Do, 2015-05-21 at 09:25 -0700, Mark Salyzyn wrote: > got a rare NULL pointer dereference in clear_bit > > Signed-off-by: Mark Salyzyn <salyzyn@android.com> > --- > net/unix/af_unix.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c > index 5266ea7..37a8925 100644 > --- a/net/unix/af_unix.c > +++ b/net/unix/af_unix.c > @@ -1880,6 +1880,11 @@ static long unix_stream_data_wait(struct sock *sk, long timeo, > unix_state_unlock(sk); > timeo = freezable_schedule_timeout(timeo); > unix_state_lock(sk); > + > + /* sk_socket may have been killed while unlocked */ > + if (!sk->sk_socket) > + break; > + > clear_bit(SOCK_ASYNC_WAITDATA, &sk->sk_socket->flags); > } > Canonical way is to test for sock_flag(sk, SOCK_DEAD). Also it does not seem like we are returning an error to user space but are still looping to try to dequeue skbs from sk_receive_queue, which is concurrently emptied by unix_release (maybe, without holding unix_state_lock). Bye, Hannes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 05/22/2015 02:50 AM, Hannes Frederic Sowa wrote: > On Do, 2015-05-21 at 09:25 -0700, Mark Salyzyn wrote: >> got a rare NULL pointer dereference in clear_bit >> >> Signed-off-by: Mark Salyzyn <salyzyn@android.com> >> --- >> net/unix/af_unix.c | 5 +++++ >> 1 file changed, 5 insertions(+) >> >> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c >> index 5266ea7..37a8925 100644 >> --- a/net/unix/af_unix.c >> +++ b/net/unix/af_unix.c >> @@ -1880,6 +1880,11 @@ static long unix_stream_data_wait(struct sock *sk, long timeo, >> unix_state_unlock(sk); >> timeo = freezable_schedule_timeout(timeo); >> unix_state_lock(sk); >> + >> + /* sk_socket may have been killed while unlocked */ >> + if (!sk->sk_socket) >> + break; >> + >> clear_bit(SOCK_ASYNC_WAITDATA, &sk->sk_socket->flags); >> } >> > Canonical way is to test for sock_flag(sk, SOCK_DEAD). Also it does not > seem like we are returning an error to user space but are still looping > to try to dequeue skbs from sk_receive_queue, which is concurrently > emptied by unix_release (maybe, without holding unix_state_lock). > > Bye, > Hannes > I will send an updated patch shortly. It may be acceptable given the expectation that sk_set_socket(sk, NULL) occurs after SOCK_DEAD flag is set since we would not be here during the socket initialization/connection phases. As such, for all phases (and I re-iterate, we can only be here if in connected state), it is not a generic guarantee of sk_socket != NULL. But I only saw one apparent example (in net/decnet/dn_nsp_in.c) of using sock_flag(sk, SOCK_DEAD) as protection against a possible deference NULL access with sk_socket, and many KISS examples of checking sk_socket for NULL to protect against thus. Thanks for making me look though, it appears that I missed the same problem in net/caif/caif_socket.c and will add it! Sincerely -- Mark Salyzyn -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fr, 2015-05-22 at 07:51 -0700, Mark Salyzyn wrote: > On 05/22/2015 02:50 AM, Hannes Frederic Sowa wrote: > > On Do, 2015-05-21 at 09:25 -0700, Mark Salyzyn wrote: > >> got a rare NULL pointer dereference in clear_bit > >> > >> Signed-off-by: Mark Salyzyn <salyzyn@android.com> > >> --- > >> net/unix/af_unix.c | 5 +++++ > >> 1 file changed, 5 insertions(+) > >> > >> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c > >> index 5266ea7..37a8925 100644 > >> --- a/net/unix/af_unix.c > >> +++ b/net/unix/af_unix.c > >> @@ -1880,6 +1880,11 @@ static long unix_stream_data_wait(struct sock *sk, long timeo, > >> unix_state_unlock(sk); > >> timeo = freezable_schedule_timeout(timeo); > >> unix_state_lock(sk); > >> + > >> + /* sk_socket may have been killed while unlocked */ > >> + if (!sk->sk_socket) > >> + break; > >> + > >> clear_bit(SOCK_ASYNC_WAITDATA, &sk->sk_socket->flags); > >> } > >> > > Canonical way is to test for sock_flag(sk, SOCK_DEAD). Also it does not > > seem like we are returning an error to user space but are still looping > > to try to dequeue skbs from sk_receive_queue, which is concurrently > > emptied by unix_release (maybe, without holding unix_state_lock). > > > > Bye, > > Hannes > > > I will send an updated patch shortly. > > It may be acceptable given the expectation that sk_set_socket(sk, NULL) > occurs after SOCK_DEAD flag is set since we would not be here during the > socket initialization/connection phases. As such, for all phases (and I > re-iterate, we can only be here if in connected state), it is not a > generic guarantee of sk_socket != NULL. But I only saw one apparent > example (in net/decnet/dn_nsp_in.c) of using sock_flag(sk, SOCK_DEAD) as > protection against a possible deference NULL access with sk_socket, and > many KISS examples of checking sk_socket for NULL to protect against thus. > > Thanks for making me look though, it appears that I missed the same > problem in net/caif/caif_socket.c and will add it! Thank you for v2 of the patch. I still wonder if we need to actually recheck the condition and not simply break out of unix_stream_data_wait: We return to the unix_stream_recvmsg loop and recheck the sk_receive_queue. At this point sk_receive_queue is not really protected with unix_state_lock against concurrent modification with unix_release, as such we could end up concurrently dequeueing packets if socket is DEAD. Does that make sense? Thanks, Hannes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 05/22/2015 08:35 AM, Hannes Frederic Sowa wrote: > I still wonder if we need to actually recheck the condition and not > simply break out of unix_stream_data_wait: > > We return to the unix_stream_recvmsg loop and recheck the > sk_receive_queue. At this point sk_receive_queue is not really protected > with unix_state_lock against concurrent modification with unix_release, > as such we could end up concurrently dequeueing packets if socket is > DEAD. sock destroy(sic) is called before sock_orphan which sets SOCK_DEAD, so the receive queue has already been drained. Sincerely -- Mark Salyzyn -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, May 22, 2015, at 18:24, Mark Salyzyn wrote: > On 05/22/2015 08:35 AM, Hannes Frederic Sowa wrote: > > I still wonder if we need to actually recheck the condition and not > > simply break out of unix_stream_data_wait: > > > > We return to the unix_stream_recvmsg loop and recheck the > > sk_receive_queue. At this point sk_receive_queue is not really protected > > with unix_state_lock against concurrent modification with unix_release, > > as such we could end up concurrently dequeueing packets if socket is > > DEAD. > sock destroy(sic) is called before sock_orphan which sets SOCK_DEAD, so > the receive queue has already been drained. I am still afraid that there is a race: When we break out in unix_stream_data_wait we most of the time hit the continue statement in unix_stream_recvmsg. Albeit we acquired state lock again, we could end up in a situation where the sk_receive_queue is not completely drained. We would miss the recheck of the sk_shutdown mask, because it is possible we dequeue a non-null skb from the receive queue. This is because unix_release_sock acquires state lock, sets appropriate flags but the draining of the receive queue does happen without locks, state lock is unlocked before that. So theoretically both, release_sock and recvmsg could dequeue skbs concurrently in nondeterministic behavior. The fix would be to recheck SOCK_DEAD or even better, sk_shutdown right after we reacquired state_lock and break out of the loop altogether, maybe with -ECONNRESET. Thanks, Hannes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 05/22/2015 11:16 AM, Hannes Frederic Sowa wrote: > On Fri, May 22, 2015, at 18:24, Mark Salyzyn wrote: >> On 05/22/2015 08:35 AM, Hannes Frederic Sowa wrote: >>> I still wonder if we need to actually recheck the condition and not >>> simply break out of unix_stream_data_wait: >>> >>> We return to the unix_stream_recvmsg loop and recheck the >>> sk_receive_queue. At this point sk_receive_queue is not really protected >>> with unix_state_lock against concurrent modification with unix_release, >>> as such we could end up concurrently dequeueing packets if socket is >>> DEAD. >> sock destroy(sic) is called before sock_orphan which sets SOCK_DEAD, so >> the receive queue has already been drained. > I am still afraid that there is a race: > > When we break out in unix_stream_data_wait we most of the time hit the > continue statement in unix_stream_recvmsg. Albeit we acquired state lock > again, we could end up in a situation where the sk_receive_queue is not > completely drained. We would miss the recheck of the sk_shutdown mask, > because it is possible we dequeue a non-null skb from the receive queue. > This is because unix_release_sock acquires state lock, sets appropriate > flags but the draining of the receive queue does happen without locks, > state lock is unlocked before that. So theoretically both, release_sock > and recvmsg could dequeue skbs concurrently in nondeterministic > behavior. > > The fix would be to recheck SOCK_DEAD or even better, sk_shutdown right > after we reacquired state_lock and break out of the loop altogether, > maybe with -ECONNRESET. > > Thanks, > Hannes I am trying to figure out _how_ to appease your worries. Keep in mind what I hit was rare already, and resulted in a panic. Nondeterministic packet delivery during shutdown is a given, but if I buy that one can receive another frame after packet flush and RCV_SHUTDOWN, and SOCK_DEAD is set under lock then returning to the thread in wait, would you be more comfortable with: do { int chunk; struct sk_buff *skb, *last; unix_state_lock(sk); last = skb = skb_peek(&sk->sk_receive_queue); again: - if (skb == NULL) { + if (!skb || sock_flag(sk, SOCK_DEAD)) { unix_sk(sk)->recursion_level = 0; if (copied >= target) goto unlock; - or - + skb = NULL; + if (!sock_flag(sk, SOCK_DEAD)) // check after loop, but not in again loop? + skb = skb_peek(&sk->sk_receive_queue + last = skb; I know this does not give you -ECONNRESET (but we will we get sock_error(sk) disposition, another check for sock_flag if err == 0 could fix that) Too far to deal with nondeterministic packet flow? getting a last packet or not does not seem worth the cycles of CPU trouble? Sincerely -- Mark Salyzyn -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 5266ea7..37a8925 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -1880,6 +1880,11 @@ static long unix_stream_data_wait(struct sock *sk, long timeo, unix_state_unlock(sk); timeo = freezable_schedule_timeout(timeo); unix_state_lock(sk); + + /* sk_socket may have been killed while unlocked */ + if (!sk->sk_socket) + break; + clear_bit(SOCK_ASYNC_WAITDATA, &sk->sk_socket->flags); }
got a rare NULL pointer dereference in clear_bit Signed-off-by: Mark Salyzyn <salyzyn@android.com> --- net/unix/af_unix.c | 5 +++++ 1 file changed, 5 insertions(+)