diff mbox series

[mptcp-next,9/9] mptcp: sockopt: handle TCP_INFO

Message ID 20210317163828.27406-10-fw@strlen.de
State Superseded, archived
Delegated to: Paolo Abeni
Headers show
Series initial SOL_SOCKET support | expand

Commit Message

Florian Westphal March 17, 2021, 4:38 p.m. UTC
Tries to follow the subflow most recently used from transmit.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/mptcp/sockopt.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

Comments

Paolo Abeni March 19, 2021, noon UTC | #1
On Wed, 2021-03-17 at 17:38 +0100, Florian Westphal wrote:
> Tries to follow the subflow most recently used from transmit.

I guess this one is the most tricky so far ;)

One downside with picking the most recent one, is that different
subflows could be picked in successive getsockopt, possibly fooling the
userspace with e.g. decreasing seg_in/seg_out.

What if we always pick the first subflow?

/P
Mat Martineau March 20, 2021, 12:04 a.m. UTC | #2
On Fri, 19 Mar 2021, Paolo Abeni wrote:

> On Wed, 2021-03-17 at 17:38 +0100, Florian Westphal wrote:
>> Tries to follow the subflow most recently used from transmit.
>
> I guess this one is the most tricky so far ;)
>
> One downside with picking the most recent one, is that different
> subflows could be picked in successive getsockopt, possibly fooling the
> userspace with e.g. decreasing seg_in/seg_out.
>
> What if we always pick the first subflow?
>

My possibly silly suggestion: when there are multiple open subflows, what 
about aggregating information from all the subflows?

Different values in tcp_info could be derived in different ways:

  * Add up values across all the subflows (packet or byte counts)

  * Maximum or minimum values would be the largest/smallest seen on all 
flows

  * Some values would be from the msk (tcpi_state, maybe info for 
mptcp-level resends?)

  * For values that don't make sense for an msk, just leave them 0 (or some 
other believable value).


This doesn't make for elegant or satisfying code, but it may be the best 
we can do.

If the returned value jumps around different subflows I'm not sure what 
value userspace can get out of that. Using the initial subflow is at least 
predictable, but it's possible that the initial subflow gets closed and 
other subflows are handling all traffic.


Another twist with aggregated tcp_info: when subflows go away, it could 
make some of the aggregated values decrease when userspace is expecting 
them to only increase. For example, bytes_received and bytes_acked only 
increase on a TCP socket.

--
Mat Martineau
Intel
Matthieu Baerts March 20, 2021, 10:22 a.m. UTC | #3
Hi Mat, Paolo, Florian,

On 20/03/2021 01:04, Mat Martineau wrote:
> On Fri, 19 Mar 2021, Paolo Abeni wrote:
> 
>> On Wed, 2021-03-17 at 17:38 +0100, Florian Westphal wrote:
>>> Tries to follow the subflow most recently used from transmit.
>>
>> I guess this one is the most tricky so far ;)
>>
>> One downside with picking the most recent one, is that different
>> subflows could be picked in successive getsockopt, possibly fooling the
>> userspace with e.g. decreasing seg_in/seg_out.
>>
>> What if we always pick the first subflow?
>>
> 
> My possibly silly suggestion: when there are multiple open subflows, 
> what about aggregating information from all the subflows?

I think always picking the first subflow of the list is a good 
compromise, no?

For the rest, I think we should add support for MPTCP_INFO like it is 
done in mptcp.org. If an app asks to use MPTCP, it should use MPTCP_INFO 
except if a fallback happened (EINVAL).

MPTCP_INFO is explained there:

   https://multipath-tcp.org/pmwiki.php/Users/ConfigureMPTCP
   Section: Get detailed information of your subflows

WDYT?

Cheers,
Matt
diff mbox series

Patch

diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
index c1ffc3603b4f..27ea5b1ea57e 100644
--- a/net/mptcp/sockopt.c
+++ b/net/mptcp/sockopt.c
@@ -657,6 +657,28 @@  static int mptcp_getsockopt_first_sf_only(struct mptcp_sock *msk, int level, int
 	return ret;
 }
 
+static int mptcp_getsockopt_tcp_info(struct mptcp_sock *msk, char __user *optval, int __user *optlen)
+{
+	struct mptcp_subflow_context *subflow;
+	struct sock *sk = (struct sock *)msk;
+	struct sock *ssk = NULL;
+	int ret = -EOPNOTSUPP;
+
+	lock_sock(sk);
+	mptcp_for_each_subflow(msk, subflow) {
+		ssk = mptcp_subflow_tcp_sock(subflow);
+
+		if (ssk == msk->last_snd)
+			break;
+	}
+
+	if (ssk)
+		ret = tcp_getsockopt(ssk, SOL_TCP, TCP_INFO, optval, optlen);
+
+	release_sock(sk);
+	return ret;
+}
+
 static int mptcp_getsockopt_sol_tcp(struct mptcp_sock *msk, int optname,
 				    char __user *optval, int __user *optlen)
 {
@@ -665,6 +687,8 @@  static int mptcp_getsockopt_sol_tcp(struct mptcp_sock *msk, int optname,
 	case TCP_CONGESTION:
 		return mptcp_getsockopt_first_sf_only(msk, SOL_TCP, optname,
 						      optval, optlen);
+	case TCP_INFO:
+		return mptcp_getsockopt_tcp_info(msk, optval, optlen);
 	}
 	return -EOPNOTSUPP;
 }