diff mbox series

skbuff: fix a data race in skb_queue_len()

Message ID 1580756190-3541-1-git-send-email-cai@lca.pw
State Changes Requested
Delegated to: David Miller
Headers show
Series skbuff: fix a data race in skb_queue_len() | expand

Commit Message

Qian Cai Feb. 3, 2020, 6:56 p.m. UTC
sk_buff.qlen can be accessed concurrently as noticed by KCSAN,

 BUG: KCSAN: data-race in __skb_try_recv_from_queue / unix_dgram_sendmsg

 read to 0xffff8a1b1d8a81c0 of 4 bytes by task 5371 on cpu 96:
  unix_dgram_sendmsg+0x9a9/0xb70 include/linux/skbuff.h:1821
				 net/unix/af_unix.c:1761
  ____sys_sendmsg+0x33e/0x370
  ___sys_sendmsg+0xa6/0xf0
  __sys_sendmsg+0x69/0xf0
  __x64_sys_sendmsg+0x51/0x70
  do_syscall_64+0x91/0xb47
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

 write to 0xffff8a1b1d8a81c0 of 4 bytes by task 1 on cpu 99:
  __skb_try_recv_from_queue+0x327/0x410 include/linux/skbuff.h:2029
  __skb_try_recv_datagram+0xbe/0x220
  unix_dgram_recvmsg+0xee/0x850
  ____sys_recvmsg+0x1fb/0x210
  ___sys_recvmsg+0xa2/0xf0
  __sys_recvmsg+0x66/0xf0
  __x64_sys_recvmsg+0x51/0x70
  do_syscall_64+0x91/0xb47
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Since only the read is operating as lockless, it could introduce a logic
bug in unix_recvq_full() due to the load tearing. Fix it by adding
a READ_ONCE() there.

Signed-off-by: Qian Cai <cai@lca.pw>
---
 include/linux/skbuff.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Eric Dumazet Feb. 3, 2020, 7:41 p.m. UTC | #1
On 2/3/20 10:56 AM, Qian Cai wrote:
> sk_buff.qlen can be accessed concurrently as noticed by KCSAN,
> 
>  BUG: KCSAN: data-race in __skb_try_recv_from_queue / unix_dgram_sendmsg
> 
>  read to 0xffff8a1b1d8a81c0 of 4 bytes by task 5371 on cpu 96:
>   unix_dgram_sendmsg+0x9a9/0xb70 include/linux/skbuff.h:1821
> 				 net/unix/af_unix.c:1761
>   ____sys_sendmsg+0x33e/0x370
>   ___sys_sendmsg+0xa6/0xf0
>   __sys_sendmsg+0x69/0xf0
>   __x64_sys_sendmsg+0x51/0x70
>   do_syscall_64+0x91/0xb47
>   entry_SYSCALL_64_after_hwframe+0x49/0xbe
> 
>  write to 0xffff8a1b1d8a81c0 of 4 bytes by task 1 on cpu 99:
>   __skb_try_recv_from_queue+0x327/0x410 include/linux/skbuff.h:2029
>   __skb_try_recv_datagram+0xbe/0x220
>   unix_dgram_recvmsg+0xee/0x850
>   ____sys_recvmsg+0x1fb/0x210
>   ___sys_recvmsg+0xa2/0xf0
>   __sys_recvmsg+0x66/0xf0
>   __x64_sys_recvmsg+0x51/0x70
>   do_syscall_64+0x91/0xb47
>   entry_SYSCALL_64_after_hwframe+0x49/0xbe
> 
> Since only the read is operating as lockless, it could introduce a logic
> bug in unix_recvq_full() due to the load tearing. Fix it by adding
> a READ_ONCE() there.
> 
> Signed-off-by: Qian Cai <cai@lca.pw>
> ---
>  include/linux/skbuff.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 3d13a4b717e9..4b5157164f3e 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -1818,7 +1818,7 @@ static inline struct sk_buff *skb_peek_tail(const struct sk_buff_head *list_)
>   */
>  static inline __u32 skb_queue_len(const struct sk_buff_head *list_)
>  {
> -	return list_->qlen;
> +	return READ_ONCE(list_->qlen);
>  }

We do not want to add READ_ONCE() for all uses of skb_queue_len()

This could hide some real bugs, and could generate slightly less
efficient code in the cases we have the lock held.
Qian Cai Feb. 3, 2020, 8:19 p.m. UTC | #2
> On Feb 3, 2020, at 2:42 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> We do not want to add READ_ONCE() for all uses of skb_queue_len()
> 
> This could hide some real bugs, and could generate slightly less
> efficient code in the cases we have the lock held.

Good point. I should have thought about that. How about introducing 2 new helpers.

skb_queue_len_once()
unix_recvq_full_once()

which will have a READ_ONCE() there, and then unix_dgram_sendmsg() could use that instead?
Eric Dumazet Feb. 3, 2020, 8:28 p.m. UTC | #3
On 2/3/20 12:19 PM, Qian Cai wrote:
> 
> 
>> On Feb 3, 2020, at 2:42 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>
>> We do not want to add READ_ONCE() for all uses of skb_queue_len()
>>
>> This could hide some real bugs, and could generate slightly less
>> efficient code in the cases we have the lock held.
> 
> Good point. I should have thought about that. How about introducing 2 new helpers.
> 
> skb_queue_len_once()
> unix_recvq_full_once()
> 
> which will have a READ_ONCE() there, and then unix_dgram_sendmsg() could use that instead?
> 

We added recently skb_queue_empty_lockless() helper, to use in these contexts.

The fact that we use READ_ONCE() is more of an implementation detail I think.

Also, addressing load-stearing issues without making sure the write side
is using WRITE_ONCE() might be not enough (even if KCSAN warnings disappear)
Qian Cai Feb. 3, 2020, 10:34 p.m. UTC | #4
> On Feb 3, 2020, at 3:28 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> We added recently skb_queue_empty_lockless() helper, to use in these contexts.
> 
> The fact that we use READ_ONCE() is more of an implementation detail I think.
> 

Make sense. I’ll use lockless in naming instead.

> Also, addressing load-stearing issues without making sure the write side
> is using WRITE_ONCE() might be not enough (even if KCSAN warnings disappear)

I suppose that could be a case. I’ll have,

WRITE_ONCE(list->qlen, list->qlen - 1);

in __skb_unlink() where it had already had a few WRITE_ONCE() for other variables.
diff mbox series

Patch

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 3d13a4b717e9..4b5157164f3e 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1818,7 +1818,7 @@  static inline struct sk_buff *skb_peek_tail(const struct sk_buff_head *list_)
  */
 static inline __u32 skb_queue_len(const struct sk_buff_head *list_)
 {
-	return list_->qlen;
+	return READ_ONCE(list_->qlen);
 }
 
 /**