Message ID | MW2PR2101MB11164C6EEAA5C511B395EF3AC0EC0@MW2PR2101MB1116.namprd21.prod.outlook.com |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Headers | show |
Series | [net] hvsock: fix epollout hang from race condition | expand |
> From: Sunil Muthuswamy <sunilmut@microsoft.com> > Sent: Wednesday, June 12, 2019 2:19 PM > ... > The fix is to set the pending size to the default size and never change it. > This way the host will always notify the guest whenever the writable space > is bigger than the pending size. The host is already optimized to *only* > notify the guest when the pending size threshold boundary is crossed and > not everytime. > > This change also reduces the cpu usage somewhat since > hv_stream_has_space() > is in the hotpath of send: > vsock_stream_sendmsg()->hv_stream_has_space() > Earlier hv_stream_has_space was setting/clearing the pending size on every > call. > > Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com> Hi Sunil, thanks for the fix! It looks good. Reviewed-by: Dexuan Cui <decui@microsoft.com>
This adds lots of new warnings: net/vmw_vsock/hyperv_transport.c: In function ‘hvs_probe’: net/vmw_vsock/hyperv_transport.c:205:20: warning: ‘vnew’ may be used uninitialized in this function [-Wmaybe-uninitialized] remote->svm_port = host_ephemeral_port++; ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~ net/vmw_vsock/hyperv_transport.c:332:21: note: ‘vnew’ was declared here struct vsock_sock *vnew; ^~~~ net/vmw_vsock/hyperv_transport.c:406:22: warning: ‘hvs_new’ may be used uninitialized in this function [-Wmaybe-uninitialized] hvs_new->vm_srv_id = *if_type; ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~ net/vmw_vsock/hyperv_transport.c:333:23: note: ‘hvs_new’ was declared here struct hvsock *hvs, *hvs_new; ^~~~~~~
> From: linux-hyperv-owner@vger.kernel.org > <linux-hyperv-owner@vger.kernel.org> On Behalf Of David Miller > Sent: Friday, June 14, 2019 7:15 PM > To: Sunil Muthuswamy <sunilmut@microsoft.com> > > This adds lots of new warnings: > > net/vmw_vsock/hyperv_transport.c: In function ‘hvs_probe’: > net/vmw_vsock/hyperv_transport.c:205:20: warning: ‘vnew’ may be used > uninitialized in this function [-Wmaybe-uninitialized] > remote->svm_port = host_ephemeral_port++; > ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~ > net/vmw_vsock/hyperv_transport.c:332:21: note: ‘vnew’ was declared here > struct vsock_sock *vnew; > ^~~~ > net/vmw_vsock/hyperv_transport.c:406:22: warning: ‘hvs_new’ may be > used uninitialized in this function [-Wmaybe-uninitialized] > hvs_new->vm_srv_id = *if_type; > ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~ > net/vmw_vsock/hyperv_transport.c:333:23: note: ‘hvs_new’ was declared > here > struct hvsock *hvs, *hvs_new; > ^~~~~~~ Hi David, These warnings are not introduced by this patch from Sunil. I'm not sure why I didn't notice these warnings before. Probably my gcc version is not new eought? Actually these warnings are bogus, as I checked the related functions, which may confuse the compiler's static analysis. I'm going to make a patch to initialize the pointers to NULL to suppress the warnings. My patch will be based on the latest's net.git + this patch from Sunil. Thanks, -- Dexuan
> From: linux-hyperv-owner@vger.kernel.org > <linux-hyperv-owner@vger.kernel.org> On Behalf Of Dexuan Cui > Sent: Friday, June 14, 2019 8:23 PM > To: David Miller <davem@davemloft.net>; Sunil Muthuswamy > <sunilmut@microsoft.com> > Cc: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang > <haiyangz@microsoft.com>; Stephen Hemminger > <sthemmin@microsoft.com>; sashal@kernel.org; Michael Kelley > <mikelley@microsoft.com>; netdev@vger.kernel.org; > linux-hyperv@vger.kernel.org; linux-kernel@vger.kernel.org > Subject: RE: [PATCH net] hvsock: fix epollout hang from race condition > > > From: linux-hyperv-owner@vger.kernel.org > > <linux-hyperv-owner@vger.kernel.org> On Behalf Of David Miller > > Sent: Friday, June 14, 2019 7:15 PM > > To: Sunil Muthuswamy <sunilmut@microsoft.com> > > > > This adds lots of new warnings: > > > > net/vmw_vsock/hyperv_transport.c: In function ‘hvs_probe’: > > net/vmw_vsock/hyperv_transport.c:205:20: warning: ‘vnew’ may be used > > uninitialized in this function [-Wmaybe-uninitialized] > > remote->svm_port = host_ephemeral_port++; > > ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~ > > net/vmw_vsock/hyperv_transport.c:332:21: note: ‘vnew’ was declared > here > > struct vsock_sock *vnew; > > ^~~~ > > net/vmw_vsock/hyperv_transport.c:406:22: warning: ‘hvs_new’ may be > > used uninitialized in this function [-Wmaybe-uninitialized] > > hvs_new->vm_srv_id = *if_type; > > ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~ > > net/vmw_vsock/hyperv_transport.c:333:23: note: ‘hvs_new’ was declared > > here > > struct hvsock *hvs, *hvs_new; > > ^~~~~~~ > > Hi David, > These warnings are not introduced by this patch from Sunil. Well, technically speaking, the warnings are caused by Suni's patch, but I think it should be a bug of gcc (I'm using "gcc (Ubuntu 8.2.0-12ubuntu1) 8.2.0"). As you can see, the line numbers given by gcc, i.e. line 205, line 406, are not related to the warnings. Actually, the same warnings can repro with the below one-line patch on this commit of today's net.git: 9a33629ba6b2 ("hv_netvsc: Set probe mode to sync") --- a/net/vmw_vsock/hyperv_transport.c +++ b/net/vmw_vsock/hyperv_transport.c @@ -403,6 +403,7 @@ static void hvs_open_connection(struct vmbus_channel *chan) set_per_channel_state(chan, conn_from_host ? new : sk); vmbus_set_chn_rescind_callback(chan, hvs_close_connection); + asm ("nop"); if (conn_from_host) { new->sk_state = TCP_ESTABLISHED; It looks a simple inline assembly code can confuse gcc. I'm not sure if I should report a bug for gcc... I posted a patch to suppress these bogus warnings just now. The Subject is: [PATCH net] hv_sock: Suppress bogus "may be used uninitialized" warnings Thanks, -- Dexuan
> -----Original Message----- > From: Dexuan Cui <decui@microsoft.com> > Sent: Friday, June 14, 2019 10:04 PM > To: David Miller <davem@davemloft.net>; Sunil Muthuswamy <sunilmut@microsoft.com> > Cc: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Stephen Hemminger > <sthemmin@microsoft.com>; sashal@kernel.org; Michael Kelley <mikelley@microsoft.com>; netdev@vger.kernel.org; linux- > hyperv@vger.kernel.org; linux-kernel@vger.kernel.org > Subject: RE: [PATCH net] hvsock: fix epollout hang from race condition > > > From: linux-hyperv-owner@vger.kernel.org > > <linux-hyperv-owner@vger.kernel.org> On Behalf Of Dexuan Cui > > Sent: Friday, June 14, 2019 8:23 PM > > To: David Miller <davem@davemloft.net>; Sunil Muthuswamy > > <sunilmut@microsoft.com> > > Cc: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang > > <haiyangz@microsoft.com>; Stephen Hemminger > > <sthemmin@microsoft.com>; sashal@kernel.org; Michael Kelley > > <mikelley@microsoft.com>; netdev@vger.kernel.org; > > linux-hyperv@vger.kernel.org; linux-kernel@vger.kernel.org > > Subject: RE: [PATCH net] hvsock: fix epollout hang from race condition > > > > > From: linux-hyperv-owner@vger.kernel.org > > > <linux-hyperv-owner@vger.kernel.org> On Behalf Of David Miller > > > Sent: Friday, June 14, 2019 7:15 PM > > > To: Sunil Muthuswamy <sunilmut@microsoft.com> > > > > > > This adds lots of new warnings: > > > > > > net/vmw_vsock/hyperv_transport.c: In function ‘hvs_probe’: > > > net/vmw_vsock/hyperv_transport.c:205:20: warning: ‘vnew’ may be used > > > uninitialized in this function [-Wmaybe-uninitialized] > > > remote->svm_port = host_ephemeral_port++; > > > ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~ > > > net/vmw_vsock/hyperv_transport.c:332:21: note: ‘vnew’ was declared > > here > > > struct vsock_sock *vnew; > > > ^~~~ > > > net/vmw_vsock/hyperv_transport.c:406:22: warning: ‘hvs_new’ may be > > > used uninitialized in this function [-Wmaybe-uninitialized] > > > hvs_new->vm_srv_id = *if_type; > > > ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~ > > > net/vmw_vsock/hyperv_transport.c:333:23: note: ‘hvs_new’ was declared > > > here > > > struct hvsock *hvs, *hvs_new; > > > ^~~~~~~ > > > > Hi David, > > These warnings are not introduced by this patch from Sunil. > > Well, technically speaking, the warnings are caused by Suni's patch, but I think it should > be a bug of gcc (I'm using "gcc (Ubuntu 8.2.0-12ubuntu1) 8.2.0"). As you can see, the > line numbers given by gcc, i.e. line 205, line 406, are not related to the warnings. > > Actually, the same warnings can repro with the below one-line patch on this commit of > today's net.git: > 9a33629ba6b2 ("hv_netvsc: Set probe mode to sync") > > --- a/net/vmw_vsock/hyperv_transport.c > +++ b/net/vmw_vsock/hyperv_transport.c > @@ -403,6 +403,7 @@ static void hvs_open_connection(struct vmbus_channel *chan) > > set_per_channel_state(chan, conn_from_host ? new : sk); > vmbus_set_chn_rescind_callback(chan, hvs_close_connection); > + asm ("nop"); > > if (conn_from_host) { > new->sk_state = TCP_ESTABLISHED; > > It looks a simple inline assembly code can confuse gcc. I'm not sure if I should > report a bug for gcc... > > I posted a patch to suppress these bogus warnings just now. The Subject is: > > [PATCH net] hv_sock: Suppress bogus "may be used uninitialized" warnings > > Thanks, > -- Dexuan Yes, these warnings are not specific to this patch. And, additionally these should already addressed in the commit https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=ac383f58f3c98de37fa67452acc5bd677396e9f3 I was trying to avoid making the same changes here to avoid merge conflicts when 'net-next' merges with 'net' next.
> From: Sunil Muthuswamy <sunilmut@microsoft.com> > Sent: Saturday, June 15, 2019 12:23 AM > To: Dexuan Cui <decui@microsoft.com>; David Miller <davem@davemloft.net> > > ... > > It looks a simple inline assembly code can confuse gcc. I'm not sure if I should > > report a bug for gcc... > > > > I posted a patch to suppress these bogus warnings just now. The Subject is: > > > > [PATCH net] hv_sock: Suppress bogus "may be used uninitialized" warnings David, as I described, these warnings are spurious and can be safely ignored. Please consider not applying my two-line patch to avoid merge conflict with Sunil's another patch in net-next.git. > Yes, these warnings are not specific to this patch. And, additionally these > should already addressed in the commit ... > I was trying to avoid making the same changes here to avoid merge > conflicts when 'net-next' merges with 'net' next. Yeah, I agree. Thanks, -- Dexuan
From: Dexuan Cui <decui@microsoft.com> Date: Sat, 15 Jun 2019 03:22:32 +0000 > These warnings are not introduced by this patch from Sunil. > > I'm not sure why I didn't notice these warnings before. > Probably my gcc version is not new eought? > > Actually these warnings are bogus, as I checked the related functions, > which may confuse the compiler's static analysis. > > I'm going to make a patch to initialize the pointers to NULL to suppress > the warnings. My patch will be based on the latest's net.git + this patch > from Sunil. Sunil should then resubmit his patch against something that has the warning suppression patch applied.
> -----Original Message----- > From: linux-hyperv-owner@vger.kernel.org <linux-hyperv-owner@vger.kernel.org> On Behalf Of David Miller > Sent: Sunday, June 16, 2019 1:55 PM > To: Dexuan Cui <decui@microsoft.com> > Cc: Sunil Muthuswamy <sunilmut@microsoft.com>; KY Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; > Stephen Hemminger <sthemmin@microsoft.com>; sashal@kernel.org; Michael Kelley <mikelley@microsoft.com>; > netdev@vger.kernel.org; linux-hyperv@vger.kernel.org; linux-kernel@vger.kernel.org > Subject: Re: [PATCH net] hvsock: fix epollout hang from race condition > > From: Dexuan Cui <decui@microsoft.com> > Date: Sat, 15 Jun 2019 03:22:32 +0000 > > > These warnings are not introduced by this patch from Sunil. > > > > I'm not sure why I didn't notice these warnings before. > > Probably my gcc version is not new eought? > > > > Actually these warnings are bogus, as I checked the related functions, > > which may confuse the compiler's static analysis. > > > > I'm going to make a patch to initialize the pointers to NULL to suppress > > the warnings. My patch will be based on the latest's net.git + this patch > > from Sunil. > > Sunil should then resubmit his patch against something that has the > warning suppression patch applied. David, Dexuan's patch to suppress the warnings seems to be applied now to the 'net' branch. Can we please get this patch applied as well?
From: Sunil Muthuswamy <sunilmut@microsoft.com> Date: Mon, 17 Jun 2019 18:47:08 +0000 > > >> -----Original Message----- >> From: linux-hyperv-owner@vger.kernel.org <linux-hyperv-owner@vger.kernel.org> On Behalf Of David Miller >> Sent: Sunday, June 16, 2019 1:55 PM >> To: Dexuan Cui <decui@microsoft.com> >> Cc: Sunil Muthuswamy <sunilmut@microsoft.com>; KY Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; >> Stephen Hemminger <sthemmin@microsoft.com>; sashal@kernel.org; Michael Kelley <mikelley@microsoft.com>; >> netdev@vger.kernel.org; linux-hyperv@vger.kernel.org; linux-kernel@vger.kernel.org >> Subject: Re: [PATCH net] hvsock: fix epollout hang from race condition >> >> From: Dexuan Cui <decui@microsoft.com> >> Date: Sat, 15 Jun 2019 03:22:32 +0000 >> >> > These warnings are not introduced by this patch from Sunil. >> > >> > I'm not sure why I didn't notice these warnings before. >> > Probably my gcc version is not new eought? >> > >> > Actually these warnings are bogus, as I checked the related functions, >> > which may confuse the compiler's static analysis. >> > >> > I'm going to make a patch to initialize the pointers to NULL to suppress >> > the warnings. My patch will be based on the latest's net.git + this patch >> > from Sunil. >> >> Sunil should then resubmit his patch against something that has the >> warning suppression patch applied. > > David, Dexuan's patch to suppress the warnings seems to be applied now > to the 'net' branch. Can we please get this patch applied as well? I don't know how else to say "Suni should then resubmit his patch" Please just resubmit it!
> -----Original Message----- > From: David Miller <davem@davemloft.net> > Sent: Monday, June 17, 2019 11:56 AM > To: Sunil Muthuswamy <sunilmut@microsoft.com> > Cc: Dexuan Cui <decui@microsoft.com>; KY Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Stephen > Hemminger <sthemmin@microsoft.com>; sashal@kernel.org; Michael Kelley <mikelley@microsoft.com>; netdev@vger.kernel.org; > linux-hyperv@vger.kernel.org; linux-kernel@vger.kernel.org > Subject: Re: [PATCH net] hvsock: fix epollout hang from race condition > > From: Sunil Muthuswamy <sunilmut@microsoft.com> > Date: Mon, 17 Jun 2019 18:47:08 +0000 > > > > > > >> -----Original Message----- > >> From: linux-hyperv-owner@vger.kernel.org <linux-hyperv-owner@vger.kernel.org> On Behalf Of David Miller > >> Sent: Sunday, June 16, 2019 1:55 PM > >> To: Dexuan Cui <decui@microsoft.com> > >> Cc: Sunil Muthuswamy <sunilmut@microsoft.com>; KY Srinivasan <kys@microsoft.com>; Haiyang Zhang > <haiyangz@microsoft.com>; > >> Stephen Hemminger <sthemmin@microsoft.com>; sashal@kernel.org; Michael Kelley <mikelley@microsoft.com>; > >> netdev@vger.kernel.org; linux-hyperv@vger.kernel.org; linux-kernel@vger.kernel.org > >> Subject: Re: [PATCH net] hvsock: fix epollout hang from race condition > >> > >> From: Dexuan Cui <decui@microsoft.com> > >> Date: Sat, 15 Jun 2019 03:22:32 +0000 > >> > >> > These warnings are not introduced by this patch from Sunil. > >> > > >> > I'm not sure why I didn't notice these warnings before. > >> > Probably my gcc version is not new eought? > >> > > >> > Actually these warnings are bogus, as I checked the related functions, > >> > which may confuse the compiler's static analysis. > >> > > >> > I'm going to make a patch to initialize the pointers to NULL to suppress > >> > the warnings. My patch will be based on the latest's net.git + this patch > >> > from Sunil. > >> > >> Sunil should then resubmit his patch against something that has the > >> warning suppression patch applied. > > > > David, Dexuan's patch to suppress the warnings seems to be applied now > > to the 'net' branch. Can we please get this patch applied as well? > > I don't know how else to say "Suni should then resubmit his patch" > > Please just resubmit it! The patch does not change at all. So, I was hoping we could reapply it. But, I have resubmitted the patch. Thanks.
From: Sunil Muthuswamy <sunilmut@microsoft.com> Date: Mon, 17 Jun 2019 19:27:45 +0000 > The patch does not change at all. So, I was hoping we could reapply > it. But, I have resubmitted the patch. Thanks. It's easy for me to track things if you just resubmit the patch. That's why I ask for things to be done this way, it helps my workflow a lot. Thank you.
diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c index 982a8dc..8d1ea9e 100644 --- a/net/vmw_vsock/hyperv_transport.c +++ b/net/vmw_vsock/hyperv_transport.c @@ -220,18 +220,6 @@ static void hvs_set_channel_pending_send_size(struct vmbus_channel *chan) set_channel_pending_send_size(chan, HVS_PKT_LEN(HVS_SEND_BUF_SIZE)); - /* See hvs_stream_has_space(): we must make sure the host has seen - * the new pending send size, before we can re-check the writable - * bytes. - */ - virt_mb(); -} - -static void hvs_clear_channel_pending_send_size(struct vmbus_channel *chan) -{ - set_channel_pending_send_size(chan, 0); - - /* Ditto */ virt_mb(); } @@ -301,9 +289,6 @@ static void hvs_channel_cb(void *ctx) if (hvs_channel_readable(chan)) sk->sk_data_ready(sk); - /* See hvs_stream_has_space(): when we reach here, the writable bytes - * may be already less than HVS_PKT_LEN(HVS_SEND_BUF_SIZE). - */ if (hv_get_bytes_to_write(&chan->outbound) > 0) sk->sk_write_space(sk); } @@ -404,6 +389,13 @@ static void hvs_open_connection(struct vmbus_channel *chan) set_per_channel_state(chan, conn_from_host ? new : sk); vmbus_set_chn_rescind_callback(chan, hvs_close_connection); + /* Set the pending send size to max packet size to always get + * notifications from the host when there is enough writable space. + * The host is optimized to send notifications only when the pending + * size boundary is crossed, and not always. + */ + hvs_set_channel_pending_send_size(chan); + if (conn_from_host) { new->sk_state = TCP_ESTABLISHED; sk->sk_ack_backlog++; @@ -697,23 +689,8 @@ static s64 hvs_stream_has_data(struct vsock_sock *vsk) static s64 hvs_stream_has_space(struct vsock_sock *vsk) { struct hvsock *hvs = vsk->trans; - struct vmbus_channel *chan = hvs->chan; - s64 ret; - - ret = hvs_channel_writable_bytes(chan); - if (ret > 0) { - hvs_clear_channel_pending_send_size(chan); - } else { - /* See hvs_channel_cb() */ - hvs_set_channel_pending_send_size(chan); - - /* Re-check the writable bytes to avoid race */ - ret = hvs_channel_writable_bytes(chan); - if (ret > 0) - hvs_clear_channel_pending_send_size(chan); - } - return ret; + return hvs_channel_writable_bytes(hvs->chan); } static u64 hvs_stream_rcvhiwat(struct vsock_sock *vsk)
Currently, hvsock can enter into a state where epoll_wait on EPOLLOUT will not return even when the hvsock socket is writable, under some race condition. This can happen under the following sequence: - fd = socket(hvsocket) - fd_out = dup(fd) - fd_in = dup(fd) - start a writer thread that writes data to fd_out with a combination of epoll_wait(fd_out, EPOLLOUT) and - start a reader thread that reads data from fd_in with a combination of epoll_wait(fd_in, EPOLLIN) - On the host, there are two threads that are reading/writing data to the hvsocket stack: hvs_stream_has_space hvs_notify_poll_out vsock_poll sock_poll ep_poll Race condition: check for epollout from ep_poll(): assume no writable space in the socket hvs_stream_has_space() returns 0 check for epollin from ep_poll(): assume socket has some free space < HVS_PKT_LEN(HVS_SEND_BUF_SIZE) hvs_stream_has_space() will clear the channel pending send size host will not notify the guest because the pending send size has been cleared and so the hvsocket will never mark the socket writable Now, the EPOLLOUT will never return even if the socket write buffer is empty. The fix is to set the pending size to the default size and never change it. This way the host will always notify the guest whenever the writable space is bigger than the pending size. The host is already optimized to *only* notify the guest when the pending size threshold boundary is crossed and not everytime. This change also reduces the cpu usage somewhat since hv_stream_has_space() is in the hotpath of send: vsock_stream_sendmsg()->hv_stream_has_space() Earlier hv_stream_has_space was setting/clearing the pending size on every call. Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com> --- net/vmw_vsock/hyperv_transport.c | 39 ++++++++------------------------------- 1 file changed, 8 insertions(+), 31 deletions(-)