Message ID | 20190312172301.590390-1-kafai@fb.com |
---|---|
Headers | show |
Series | Fix bpf_tcp_sock and bpf_sk_fullsock issue related to bpf_sk_release | expand |
On Tue, Mar 12, 2019 at 10:23:01AM -0700, Martin KaFai Lau wrote: > This set addresses issue about accessing invalid > ptr returned from bpf_tcp_sock() and bpf_sk_fullsock() > after bpf_sk_release(). > > v4: > - Tried the one "id" approach. It does not work well and the reason is in > the Patch 1 commit message. > - Rename refcount_id to ref_obj_id. > - With ref_obj_id, resetting reg->id to 0 is fine in mark_ptr_or_null_reg() > because ref_obj_id is passed to release_reference() instead of reg->id. > - Also reset reg->ref_obj_id in mark_ptr_or_null_reg() when is_null == true > - sk_to_full_sk() is removed from bpf_sk_fullsock() and bpf_tcp_sock(). > - bpf_get_listener_sock() is added to do sk_to_full_sk() in Patch 2. > - If tp is from bpf_tcp_sock(sk) and sk is a refcounted ptr, > bpf_sk_release(tp) is also allowed. Thanks for the hard work on this fix. I applied the set to bpf tree. I have a small question regarding patch 2: +BPF_CALL_1(bpf_get_listener_sock, struct sock *, sk) +{ + sk = sk_to_full_sk(sk); + + if (sk->sk_state == TCP_LISTEN && sock_flag(sk, SOCK_RCU_FREE)) + return (unsigned long)sk; + + return (unsigned long)NULL; +} My understanding that sk->sk_state == TCP_LISTEN is enough for correctness and additional sock_flag(sk, SOCK_RCU_FREE) check is to prevent silent breakage in case listener socks will be re-implemented without rcu_free? In such case should we add warn_on_once(!sock_flag(sk, SOCK_RCU_FREE)) too? I mean in the very unlikely scenario that kernel will have such drastic implementation change the bpf progs will not work correctly because NULL is returned, but it will be harder for humans to debug? I think it's also ok without warn, since such huge re-implemenation will require all sorts of efforts, so highly unlikely we will miss this spot. warn_on_once feels like too much paranoia. May be just a comment ? Or I'm overthinking? I guess I'm fine with the code as-is.
On Wed, Mar 13, 2019 at 12:20:27PM -0700, Alexei Starovoitov wrote: > On Tue, Mar 12, 2019 at 10:23:01AM -0700, Martin KaFai Lau wrote: > > This set addresses issue about accessing invalid > > ptr returned from bpf_tcp_sock() and bpf_sk_fullsock() > > after bpf_sk_release(). > > > > v4: > > - Tried the one "id" approach. It does not work well and the reason is in > > the Patch 1 commit message. > > - Rename refcount_id to ref_obj_id. > > - With ref_obj_id, resetting reg->id to 0 is fine in mark_ptr_or_null_reg() > > because ref_obj_id is passed to release_reference() instead of reg->id. > > - Also reset reg->ref_obj_id in mark_ptr_or_null_reg() when is_null == true > > - sk_to_full_sk() is removed from bpf_sk_fullsock() and bpf_tcp_sock(). > > - bpf_get_listener_sock() is added to do sk_to_full_sk() in Patch 2. > > - If tp is from bpf_tcp_sock(sk) and sk is a refcounted ptr, > > bpf_sk_release(tp) is also allowed. > > Thanks for the hard work on this fix. > I applied the set to bpf tree. Thanks for reviewing. > > I have a small question regarding patch 2: > +BPF_CALL_1(bpf_get_listener_sock, struct sock *, sk) > +{ > + sk = sk_to_full_sk(sk); > + > + if (sk->sk_state == TCP_LISTEN && sock_flag(sk, SOCK_RCU_FREE)) > + return (unsigned long)sk; > + > + return (unsigned long)NULL; > +} > > My understanding that sk->sk_state == TCP_LISTEN is enough for correctness > and additional sock_flag(sk, SOCK_RCU_FREE) check is to prevent > silent breakage in case listener socks will be re-implemented without rcu_free? > In such case should we add warn_on_once(!sock_flag(sk, SOCK_RCU_FREE)) too? > I mean in the very unlikely scenario that kernel will have such drastic > implementation change the bpf progs will not work correctly because NULL > is returned, but it will be harder for humans to debug? > I think it's also ok without warn, since such huge re-implemenation will > require all sorts of efforts, so highly unlikely we will miss this spot. > warn_on_once feels like too much paranoia. May be just a comment ? > Or I'm overthinking? No. It is a valid question. Sorry that I missed details in this patch. Here is what I had thought about in the patch: 1. inet_reqsk(sk)->rsk_listener is only set for TCP (and dccp). Hence, sk_to_full_sk(sk) is only useful for TCP (and dccp). 2. The TCP_LISTEN sock has SOCK_RCU_FREE set when it is added to the listening_hash in __inet_hash(). It is done for TCP (and dccp). 3. However, not all TCP_LISTEN socket is in SOCK_RCU_FREE. e.g. af_unix.c. It may be accessible through skb->sk. If WARN_ON_ONCE() was used, it would be a false alarm. For those non TCP sk, sk_to_full_sk(sk) is a no-op. If its passed-in sk is not already in TCP_LISTEN, bpf_get_listener_sock() has to return NULL anyway. It is not ideal for those sk because this helper still returns NULL if its passed-in sk is already TCP_LISTEN because of the sock_flag(sk, SOCK_RCU_FREE) check. On the other hand, there is sk->sk_state for the bpf_prog to check first before calling this helper. I have re-thought about (3). How about this: 1. Limit this helper to "sk->sk_protocol == IPPROTO_TCP". Return NULL for others because this helper cannot do anything extra for those sk anyway. 2. Rename this helper to bpf_get_tcp_listener_sock(). It still returns "struct bpf_sock *" because I think the send_cwnd/srtt_us/rtt_min/... of the listener's "struct bpf_tcp_sock *" is not very useful. If needed, bpf_tcp_sock() can be used to get "struct bpf_tcp_sock *". 3. Add WARN_ON_ONCE() on the SOCK_RCU_FREE flag.
On 3/13/19 10:38 PM, Martin Lau wrote: > On Wed, Mar 13, 2019 at 12:20:27PM -0700, Alexei Starovoitov wrote: >> On Tue, Mar 12, 2019 at 10:23:01AM -0700, Martin KaFai Lau wrote: >>> This set addresses issue about accessing invalid >>> ptr returned from bpf_tcp_sock() and bpf_sk_fullsock() >>> after bpf_sk_release(). >>> >>> v4: >>> - Tried the one "id" approach. It does not work well and the reason is in >>> the Patch 1 commit message. >>> - Rename refcount_id to ref_obj_id. >>> - With ref_obj_id, resetting reg->id to 0 is fine in mark_ptr_or_null_reg() >>> because ref_obj_id is passed to release_reference() instead of reg->id. >>> - Also reset reg->ref_obj_id in mark_ptr_or_null_reg() when is_null == true >>> - sk_to_full_sk() is removed from bpf_sk_fullsock() and bpf_tcp_sock(). >>> - bpf_get_listener_sock() is added to do sk_to_full_sk() in Patch 2. >>> - If tp is from bpf_tcp_sock(sk) and sk is a refcounted ptr, >>> bpf_sk_release(tp) is also allowed. >> >> Thanks for the hard work on this fix. >> I applied the set to bpf tree. > Thanks for reviewing. > >> >> I have a small question regarding patch 2: >> +BPF_CALL_1(bpf_get_listener_sock, struct sock *, sk) >> +{ >> + sk = sk_to_full_sk(sk); >> + >> + if (sk->sk_state == TCP_LISTEN && sock_flag(sk, SOCK_RCU_FREE)) >> + return (unsigned long)sk; >> + >> + return (unsigned long)NULL; >> +} >> >> My understanding that sk->sk_state == TCP_LISTEN is enough for correctness >> and additional sock_flag(sk, SOCK_RCU_FREE) check is to prevent >> silent breakage in case listener socks will be re-implemented without rcu_free? >> In such case should we add warn_on_once(!sock_flag(sk, SOCK_RCU_FREE)) too? >> I mean in the very unlikely scenario that kernel will have such drastic >> implementation change the bpf progs will not work correctly because NULL >> is returned, but it will be harder for humans to debug? >> I think it's also ok without warn, since such huge re-implemenation will >> require all sorts of efforts, so highly unlikely we will miss this spot. >> warn_on_once feels like too much paranoia. May be just a comment ? >> Or I'm overthinking? > No. It is a valid question. Sorry that I missed details in this patch. > > Here is what I had thought about in the patch: > 1. inet_reqsk(sk)->rsk_listener is only set for TCP (and dccp). > Hence, sk_to_full_sk(sk) is only useful for TCP (and dccp). > > 2. The TCP_LISTEN sock has SOCK_RCU_FREE set when it is added > to the listening_hash in __inet_hash(). It is done for > TCP (and dccp). > > 3. However, not all TCP_LISTEN socket is in SOCK_RCU_FREE. > e.g. af_unix.c. It may be accessible through skb->sk. > If WARN_ON_ONCE() was used, it would be a false alarm. > > For those non TCP sk, sk_to_full_sk(sk) is a no-op. > If its passed-in sk is not already in TCP_LISTEN, > bpf_get_listener_sock() has to return NULL anyway. > > It is not ideal for those sk because this helper > still returns NULL if its passed-in sk is already TCP_LISTEN > because of the sock_flag(sk, SOCK_RCU_FREE) check. > > On the other hand, there is sk->sk_state for the bpf_prog > to check first before calling this helper. all of the above makes sense to me. thanks for explaining. With that I think it's best to leave it as-is. I don't see how any of the below options make it better. > I have re-thought about (3). How about this: > 1. Limit this helper to "sk->sk_protocol == IPPROTO_TCP". > Return NULL for others because this helper cannot > do anything extra for those sk anyway. protocol is already in bpf_sock and prog can check it if necessary. > 2. Rename this helper to bpf_get_tcp_listener_sock(). It still returns > "struct bpf_sock *" because I think the send_cwnd/srtt_us/rtt_min/... > of the listener's "struct bpf_tcp_sock *" is not very useful. > > If needed, bpf_tcp_sock() can be used to get "struct bpf_tcp_sock *". > > 3. Add WARN_ON_ONCE() on the SOCK_RCU_FREE flag. not needed as you explained it will be false alarm.