diff mbox series

[net] tipc: fix successful connect() but timed out

Message ID 20200210083544.31501-1-tuong.t.lien@dektech.com.au
State Accepted
Delegated to: David Miller
Headers show
Series [net] tipc: fix successful connect() but timed out | expand

Commit Message

Tuong Lien Feb. 10, 2020, 8:35 a.m. UTC
In commit 9546a0b7ce00 ("tipc: fix wrong connect() return code"), we
fixed the issue with the 'connect()' that returns zero even though the
connecting has failed by waiting for the connection to be 'ESTABLISHED'
really. However, the approach has one drawback in conjunction with our
'lightweight' connection setup mechanism that the following scenario
can happen:

          (server)                        (client)

   +- accept()|                      |             wait_for_conn()
   |          |                      |connect() -------+
   |          |<-------[SYN]---------|                 > sleeping
   |          |                      *CONNECTING       |
   |--------->*ESTABLISHED           |                 |
              |--------[ACK]-------->*ESTABLISHED      > wakeup()
        send()|--------[DATA]------->|\                > wakeup()
        send()|--------[DATA]------->| |               > wakeup()
          .   .          .           . |-> recvq       .
          .   .          .           . |               .
        send()|--------[DATA]------->|/                > wakeup()
       close()|--------[FIN]-------->*DISCONNECTING    |
              *DISCONNECTING         |                 |
              |                      ~~~~~~~~~~~~~~~~~~> schedule()
                                                       | wait again
                                                       .
                                                       .
                                                       | ETIMEDOUT

Upon the receipt of the server 'ACK', the client becomes 'ESTABLISHED'
and the 'wait_for_conn()' process is woken up but not run. Meanwhile,
the server starts to send a number of data following by a 'close()'
shortly without waiting any response from the client, which then forces
the client socket to be 'DISCONNECTING' immediately. When the wait
process is switched to be running, it continues to wait until the timer
expires because of the unexpected socket state. The client 'connect()'
will finally get ‘-ETIMEDOUT’ and force to release the socket whereas
there remains the messages in its receive queue.

Obviously the issue would not happen if the server had some delay prior
to its 'close()' (or the number of 'DATA' messages is large enough),
but any kind of delay would make the connection setup/shutdown "heavy".
We solve this by simply allowing the 'connect()' returns zero in this
particular case. The socket is already 'DISCONNECTING', so any further
write will get '-EPIPE' but the socket is still able to read the
messages existing in its receive queue.

Note: This solution doesn't break the previous one as it deals with a
different situation that the socket state is 'DISCONNECTING' but has no
error (i.e. sk->sk_err = 0).

Fixes: 9546a0b7ce00 ("tipc: fix wrong connect() return code")
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
---
 net/tipc/socket.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

David Miller Feb. 10, 2020, 9:25 a.m. UTC | #1
From: Tuong Lien <tuong.t.lien@dektech.com.au>
Date: Mon, 10 Feb 2020 15:35:44 +0700

> In commit 9546a0b7ce00 ("tipc: fix wrong connect() return code"), we
> fixed the issue with the 'connect()' that returns zero even though the
> connecting has failed by waiting for the connection to be 'ESTABLISHED'
> really. However, the approach has one drawback in conjunction with our
> 'lightweight' connection setup mechanism that the following scenario
> can happen:
 ...
> Upon the receipt of the server 'ACK', the client becomes 'ESTABLISHED'
> and the 'wait_for_conn()' process is woken up but not run. Meanwhile,
> the server starts to send a number of data following by a 'close()'
> shortly without waiting any response from the client, which then forces
> the client socket to be 'DISCONNECTING' immediately. When the wait
> process is switched to be running, it continues to wait until the timer
> expires because of the unexpected socket state. The client 'connect()'
> will finally get ‘-ETIMEDOUT’ and force to release the socket whereas
> there remains the messages in its receive queue.
> 
> Obviously the issue would not happen if the server had some delay prior
> to its 'close()' (or the number of 'DATA' messages is large enough),
> but any kind of delay would make the connection setup/shutdown "heavy".
> We solve this by simply allowing the 'connect()' returns zero in this
> particular case. The socket is already 'DISCONNECTING', so any further
> write will get '-EPIPE' but the socket is still able to read the
> messages existing in its receive queue.
> 
> Note: This solution doesn't break the previous one as it deals with a
> different situation that the socket state is 'DISCONNECTING' but has no
> error (i.e. sk->sk_err = 0).
> 
> Fixes: 9546a0b7ce00 ("tipc: fix wrong connect() return code")
> Acked-by: Ying Xue <ying.xue@windriver.com>
> Acked-by: Jon Maloy <jon.maloy@ericsson.com>
> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>

Applied.
diff mbox series

Patch

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index f9b4fb92c0b1..693e8902161e 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -2441,6 +2441,8 @@  static int tipc_wait_for_connect(struct socket *sock, long *timeo_p)
 			return -ETIMEDOUT;
 		if (signal_pending(current))
 			return sock_intr_errno(*timeo_p);
+		if (sk->sk_state == TIPC_DISCONNECTING)
+			break;
 
 		add_wait_queue(sk_sleep(sk), &wait);
 		done = sk_wait_event(sk, timeo_p, tipc_sk_connected(sk),