From patchwork Thu Feb 1 00:07:03 2018
X-Patchwork-Submitter: Christoph Paasch <cpaasch@apple.com>
X-Patchwork-Id: 868100
X-Patchwork-Delegate: davem@davemloft.net
From: Christoph Paasch <cpaasch@apple.com>
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau
Subject: [RFC v2 01/14] tcp: Write options after the header has been fully done
Date: Wed, 31 Jan 2018 16:07:03 -0800
Message-id: <20180201000716.69301-2-cpaasch@apple.com>
X-Mailer: git-send-email 2.16.1
In-reply-to: <20180201000716.69301-1-cpaasch@apple.com>
References: <20180201000716.69301-1-cpaasch@apple.com>

The generic TCP-option framework will need to have access to the full
TCP-header (e.g., if we want to compute a checksum for TCP-MD5). Thus,
we move the call to tcp_options_write() to after all the fields in the
header have been filled out.
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Mat Martineau
---
 net/ipv4/tcp_output.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index e9f985e42405..df50c7dc1a43 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1126,7 +1126,6 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 		}
 	}

-	tcp_options_write((__be32 *)(th + 1), tp, &opts);
 	skb_shinfo(skb)->gso_type = sk->sk_gso_type;
 	if (likely(!(tcb->tcp_flags & TCPHDR_SYN))) {
 		th->window      = htons(tcp_select_window(sk));
@@ -1137,6 +1136,7 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 		 */
 		th->window	= htons(min(tp->rcv_wnd, 65535U));
 	}
+	tcp_options_write((__be32 *)(th + 1), tp, &opts);
 #ifdef CONFIG_TCP_MD5SIG
 	/* Calculate the MD5 hash, as we have all we need now */
 	if (md5) {
@@ -3247,8 +3247,8 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
 	/* RFC1323: The window in SYN & SYN/ACK segments is never scaled.
 	 */
 	th->window = htons(min(req->rsk_rcv_wnd, 65535U));
-	tcp_options_write((__be32 *)(th + 1), NULL, &opts);
 	th->doff = (tcp_header_size >> 2);
+	tcp_options_write((__be32 *)(th + 1), NULL, &opts);
 	__TCP_INC_STATS(sock_net(sk), TCP_MIB_OUTSEGS);

 #ifdef CONFIG_TCP_MD5SIG

From patchwork Thu Feb 1 00:07:04 2018
X-Patchwork-Submitter: Christoph Paasch <cpaasch@apple.com>
X-Patchwork-Id: 868111
X-Patchwork-Delegate: davem@davemloft.net
From: Christoph Paasch <cpaasch@apple.com>
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau
Subject: [RFC v2 02/14] tcp: Pass sock and skb to tcp_options_write
Date: Wed, 31 Jan 2018 16:07:04 -0800
Message-id: <20180201000716.69301-3-cpaasch@apple.com>
In-reply-to: <20180201000716.69301-1-cpaasch@apple.com>
References: <20180201000716.69301-1-cpaasch@apple.com>

An upcoming patch adds a configurable, per-socket list of TCP options to
populate in the TCP header. This requires tcp_options_write() to know the
socket (to use the options list) and the skb (to provide visibility to
the packet data for options like TCP_MD5SIG).
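The sk_fullsock() guard this patch introduces has a simple shape: only downcast to the full structure when the socket's state says the cast is valid, and hand back NULL otherwise so callers can cope with request/time-wait sockets. A minimal userspace analog — all names here (`common`, `full_sock`, `to_full`) are invented for illustration, not the kernel's:

```c
#include <stddef.h>

/* Minimal analog of the sock/tcp_sock relationship: a request or
 * time-wait "socket" shares the common header but is not a full socket,
 * so downcasting to the full structure is only valid for full sockets. */
enum state { STATE_REQUEST, STATE_FULL };

struct common { enum state st; };

struct full_sock {
	struct common c;	/* must be first, like the kernel's sock_common */
	unsigned int rcv_wnd;
};

static int is_fullsock(const struct common *c)
{
	return c->st == STATE_FULL;
}

/* Mirrors the patch: derive the full socket only when valid, keep NULL
 * otherwise so option writers can handle request/time-wait sockets. */
static struct full_sock *to_full(struct common *c)
{
	return is_fullsock(c) ? (struct full_sock *)c : NULL;
}
```

This is why tcp_options_write() can now be called with req_to_sk(req) in the SYN/ACK path: the NULL tp simply means "no full-socket state available".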
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Mat Martineau
---
 net/ipv4/tcp_output.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index df50c7dc1a43..e598bf54e3fb 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -444,10 +444,14 @@ struct tcp_out_options {
  * At least SACK_PERM as the first option is known to lead to a disaster
  * (but it may well be that other scenarios fail similarly).
  */
-static void tcp_options_write(__be32 *ptr, struct tcp_sock *tp,
+static void tcp_options_write(__be32 *ptr, struct sk_buff *skb, struct sock *sk,
 			      struct tcp_out_options *opts)
 {
 	u16 options = opts->options;	/* mungable copy */
+	struct tcp_sock *tp = NULL;
+
+	if (sk_fullsock(sk))
+		tp = tcp_sk(sk);

 	if (unlikely(OPTION_MD5 & options)) {
 		*ptr++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) |
@@ -1136,7 +1140,7 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 		 */
 		th->window	= htons(min(tp->rcv_wnd, 65535U));
 	}
-	tcp_options_write((__be32 *)(th + 1), tp, &opts);
+	tcp_options_write((__be32 *)(th + 1), skb, sk, &opts);
 #ifdef CONFIG_TCP_MD5SIG
 	/* Calculate the MD5 hash, as we have all we need now */
 	if (md5) {
@@ -3248,7 +3252,7 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
 	/* RFC1323: The window in SYN & SYN/ACK segments is never scaled.
 	 */
 	th->window = htons(min(req->rsk_rcv_wnd, 65535U));
 	th->doff = (tcp_header_size >> 2);
-	tcp_options_write((__be32 *)(th + 1), NULL, &opts);
+	tcp_options_write((__be32 *)(th + 1), skb, req_to_sk(req), &opts);
 	__TCP_INC_STATS(sock_net(sk), TCP_MIB_OUTSEGS);

 #ifdef CONFIG_TCP_MD5SIG

From patchwork Thu Feb 1 00:07:05 2018
X-Patchwork-Submitter: Christoph Paasch <cpaasch@apple.com>
X-Patchwork-Id: 868099
X-Patchwork-Delegate: davem@davemloft.net
From: Christoph Paasch <cpaasch@apple.com>
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau
Subject: [RFC v2 03/14] tcp: Allow tcp_fast_parse_options to drop segments
Date: Wed, 31 Jan 2018 16:07:05 -0800
Message-id: <20180201000716.69301-4-cpaasch@apple.com>
In-reply-to: <20180201000716.69301-1-cpaasch@apple.com>
References: <20180201000716.69301-1-cpaasch@apple.com>

After parsing the TCP options, some option kinds might trigger a drop of
the segment (e.g., as is the case for TCP_MD5). As we are moving to
consolidate the TCP_MD5 code in follow-up patches, we need to add the
capability to drop a segment right after parsing the options in
tcp_fast_parse_options().

Originally, tcp_fast_parse_options() returned false when there is no
timestamp option, except in the slow-path processing through
tcp_parse_options(), where it always returns true. The return value of
tcp_fast_parse_options() was thus inconsistent.
With this patch, we make it return true when the segment should get
dropped based on the parsed options, and false otherwise. In
tcp_validate_incoming, we will then just check for tp->rx_opt.saw_tstamp
to see if we should verify PAWS. The goto will be used in a follow-up
patch to check whether one of the options triggers a drop of the segment.

Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Mat Martineau
---
 net/ipv4/tcp_input.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index cfa51cfd2d99..1fbabcc99b62 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3847,6 +3847,8 @@ static bool tcp_parse_aligned_timestamp(struct tcp_sock *tp, const struct tcphdr
 /* Fast parse options. This hopes to only see timestamps.
  * If it is wrong it falls back on tcp_parse_options().
+ *
+ * Returns true if we should drop this packet based on present TCP-options.
  */
 static bool tcp_fast_parse_options(const struct net *net,
 				   const struct sk_buff *skb,
@@ -3857,18 +3859,19 @@ static bool tcp_fast_parse_options(const struct net *net,
 	 */
 	if (th->doff == (sizeof(*th) / 4)) {
 		tp->rx_opt.saw_tstamp = 0;
-		return false;
+		goto extra_opt_check;
 	} else if (tp->rx_opt.tstamp_ok &&
 		   th->doff == ((sizeof(*th) + TCPOLEN_TSTAMP_ALIGNED) / 4)) {
 		if (tcp_parse_aligned_timestamp(tp, th))
-			return true;
+			goto extra_opt_check;
 	}

 	tcp_parse_options(net, skb, &tp->rx_opt, 1, NULL);
 	if (tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr)
 		tp->rx_opt.rcv_tsecr -= tp->tsoffset;

-	return true;
+extra_opt_check:
+	return false;
 }

 #ifdef CONFIG_TCP_MD5SIG
@@ -5188,9 +5191,11 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
 	struct tcp_sock *tp = tcp_sk(sk);
 	bool rst_seq_match = false;

+	if (tcp_fast_parse_options(sock_net(sk), skb, th, tp))
+		goto discard;
+
 	/* RFC1323: H1. Apply PAWS check first.
 	 */
-	if (tcp_fast_parse_options(sock_net(sk), skb, th, tp) &&
-	    tp->rx_opt.saw_tstamp &&
+	if (tp->rx_opt.saw_tstamp &&
 	    tcp_paws_discard(sk, skb)) {
 		if (!th->rst) {
 			NET_INC_STATS(sock_net(sk), LINUX_MIB_PAWSESTABREJECTED);

From patchwork Thu Feb 1 00:07:06 2018
X-Patchwork-Submitter: Christoph Paasch <cpaasch@apple.com>
X-Patchwork-Id: 868112
X-Patchwork-Delegate: davem@davemloft.net
From: Christoph Paasch <cpaasch@apple.com>
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Ursula Braun
Subject: [RFC v2 04/14] tcp_smc: Make smc_parse_options return 1 on success
Date: Wed, 31 Jan 2018 16:07:06 -0800
Message-id: <20180201000716.69301-5-cpaasch@apple.com>
In-reply-to: <20180201000716.69301-1-cpaasch@apple.com>
References: <20180201000716.69301-1-cpaasch@apple.com>

As we allow a generic TCP-option parser that also parses experimental
TCP options, we need to add a return value to smc_parse_options() that
indicates whether the option actually matched or not.
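The value of that return code is easiest to see from the caller's side: once each sub-parser reports whether it consumed the experimental option, a generic dispatcher can fall through to the next handler instead of silently swallowing it. A hedged userspace sketch — the function names and magic values below are invented for illustration, not the kernel's:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical experimental-option magics, for illustration only. */
#define MAGIC_A 0xE2D4C3A9u
#define MAGIC_B 0xF989AD59u

static int parse_a(uint32_t magic, int *a_seen)
{
	if (magic != MAGIC_A)
		return 0;	/* not ours: let another parser try */
	*a_seen = 1;
	return 1;		/* matched and consumed */
}

static int parse_b(uint32_t magic, int *b_seen)
{
	if (magic != MAGIC_B)
		return 0;
	*b_seen = 1;
	return 1;
}

/* Generic dispatch: stop at the first parser that claims the option. */
static int dispatch(uint32_t magic, int *a_seen, int *b_seen)
{
	if (parse_a(magic, a_seen))
		return 1;
	if (parse_b(magic, b_seen))
		return 1;
	return 0;		/* unknown experimental option */
}
```

A void-returning smc_parse_options() would leave the dispatcher unable to tell "SMC took it" from "nobody recognized it", which is exactly what the signature change fixes.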
Cc: Ursula Braun
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Mat Martineau
---
 net/ipv4/tcp_input.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 1fbabcc99b62..94ba88b2246b 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3691,19 +3691,22 @@ static void tcp_parse_fastopen_option(int len, const unsigned char *cookie,
 	foc->exp = exp_opt;
 }

-static void smc_parse_options(const struct tcphdr *th,
-			      struct tcp_options_received *opt_rx,
-			      const unsigned char *ptr,
-			      int opsize)
+static int smc_parse_options(const struct tcphdr *th,
+			     struct tcp_options_received *opt_rx,
+			     const unsigned char *ptr,
+			     int opsize)
 {
 #if IS_ENABLED(CONFIG_SMC)
 	if (static_branch_unlikely(&tcp_have_smc)) {
 		if (th->syn && !(opsize & 1) &&
 		    opsize >= TCPOLEN_EXP_SMC_BASE &&
-		    get_unaligned_be32(ptr) == TCPOPT_SMC_MAGIC)
+		    get_unaligned_be32(ptr) == TCPOPT_SMC_MAGIC) {
 			opt_rx->smc_ok = 1;
+			return 1;
+		}
 	}
 #endif
+	return 0;
 }

 /* Look for tcp options. Normally only called on SYN and SYNACK packets.
From patchwork Thu Feb 1 00:07:07 2018
X-Patchwork-Submitter: Christoph Paasch <cpaasch@apple.com>
X-Patchwork-Id: 868101
X-Patchwork-Delegate: davem@davemloft.net
From: Christoph Paasch <cpaasch@apple.com>
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau
Subject: [RFC v2 05/14] tcp: Register handlers for extra TCP options
Date: Wed, 31 Jan 2018 16:07:07 -0800
Message-id: <20180201000716.69301-6-cpaasch@apple.com>
In-reply-to: <20180201000716.69301-1-cpaasch@apple.com>
References: <20180201000716.69301-1-cpaasch@apple.com>

From: Mat Martineau

Allow additional TCP options to be handled by registered hook functions.

Registered options have a priority that determines the order in which
options are prepared and written. Lower priority numbers are handled
first.

Option parsing will call the provided 'parse' function when a TCP option
number is not recognized by the normal option parsing code.

After parsing, there are two places where we post-process the options.
First, a 'check' callback that allows dropping the packet based on the
parsed options (e.g., useful for TCP_MD5SIG). Then, a 'post_process'
function that gets called after the other validity checks (i.e.,
in-window, PAWS, ...). This post_process function can then update other
state for this particular extra option.

In the output path, the 'prepare' function determines the required space
for registered options and stores associated data. 'write' adds the
option to the TCP header.

These additional TCP options are stored in hlists of the TCP socket.
To pass the state and options around during the 3-way handshake and in time-wait state, the hlists are also on the tcp_request_sock and tcp_timewait_sock. The list is copied from the listener to the request-socket (calling into the 'copy' callback), then moved from the request-socket to the TCP socket, and finally to the time-wait socket.

Signed-off-by: Mat Martineau
Signed-off-by: Christoph Paasch
---

Notes:
    v2:
    * Fix a compiler error in tcp_twsk_destructor when TCP_MD5SIG is disabled

 drivers/infiniband/hw/cxgb4/cm.c |   2 +-
 include/linux/tcp.h              |  28 ++++
 include/net/tcp.h                | 110 ++++++++++++-
 net/ipv4/syncookies.c            |   6 +-
 net/ipv4/tcp.c                   | 327 ++++++++++++++++++++++++++++++++++++++-
 net/ipv4/tcp_input.c             |  49 +++++-
 net/ipv4/tcp_ipv4.c              |  98 +++++++++---
 net/ipv4/tcp_minisocks.c         |  34 +++-
 net/ipv4/tcp_output.c            |  40 ++---
 net/ipv6/syncookies.c            |   6 +-
 net/ipv6/tcp_ipv6.c              |  32 ++++
 11 files changed, 677 insertions(+), 55 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c index 21db3b48a617..a1ea5583f07b 100644 --- a/drivers/infiniband/hw/cxgb4/cm.c +++ b/drivers/infiniband/hw/cxgb4/cm.c @@ -3746,7 +3746,7 @@ static void build_cpl_pass_accept_req(struct sk_buff *skb, int stid , u8 tos) */ memset(&tmp_opt, 0, sizeof(tmp_opt)); tcp_clear_options(&tmp_opt); - tcp_parse_options(&init_net, skb, &tmp_opt, 0, NULL); + tcp_parse_options(&init_net, skb, &tmp_opt, 0, NULL, NULL); req = __skb_push(skb, sizeof(*req)); memset(req, 0, sizeof(*req)); diff --git a/include/linux/tcp.h b/include/linux/tcp.h index 8f4c54986f97..6e1f0f29bf24 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -115,6 +115,24 @@ static inline void tcp_clear_options(struct tcp_options_received *rx_opt) #endif } +#define OPTION_SACK_ADVERTISE (1 << 0) +#define OPTION_TS (1 << 1) +#define OPTION_MD5 (1 << 2) +#define OPTION_WSCALE (1 << 3) +#define OPTION_FAST_OPEN_COOKIE (1 << 8) +#define OPTION_SMC (1 << 9) + +struct tcp_out_options { + u16 options; /* bit
field of OPTION_* */ + u16 mss; /* 0 to disable */ + u8 ws; /* window scale, 0 to disable */ + u8 num_sack_blocks; /* number of SACK blocks to include */ + u8 hash_size; /* bytes in hash_location */ + __u8 *hash_location; /* temporary pointer, overloaded */ + __u32 tsval, tsecr; /* need to include OPTION_TS */ + struct tcp_fastopen_cookie *fastopen_cookie; /* Fast open cookie */ +}; + /* This is the max number of SACKS that we'll generate and process. It's safe * to increase this, although since: * size = TCPOLEN_SACK_BASE_ALIGNED (4) + n * TCPOLEN_SACK_PERBLOCK (8) @@ -137,6 +155,7 @@ struct tcp_request_sock { * FastOpen it's the seq# * after data-in-SYN. */ + struct hlist_head tcp_option_list; }; static inline struct tcp_request_sock *tcp_rsk(const struct request_sock *req) @@ -384,6 +403,8 @@ struct tcp_sock { */ struct request_sock *fastopen_rsk; u32 *saved_syn; + + struct hlist_head tcp_option_list; }; enum tsq_enum { @@ -411,6 +432,11 @@ static inline struct tcp_sock *tcp_sk(const struct sock *sk) return (struct tcp_sock *)sk; } +static inline struct sock *tcp_to_sk(const struct tcp_sock *tp) +{ + return (struct sock *)tp; +} + struct tcp_timewait_sock { struct inet_timewait_sock tw_sk; #define tw_rcv_nxt tw_sk.__tw_common.skc_tw_rcv_nxt @@ -423,6 +449,8 @@ struct tcp_timewait_sock { u32 tw_last_oow_ack_time; long tw_ts_recent_stamp; + + struct hlist_head tcp_option_list; #ifdef CONFIG_TCP_MD5SIG struct tcp_md5sig_key *tw_md5_key; #endif diff --git a/include/net/tcp.h b/include/net/tcp.h index 093e967a2960..be6709e380a6 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -202,6 +202,7 @@ void tcp_time_wait(struct sock *sk, int state, int timeo); #define TCPOLEN_FASTOPEN_BASE 2 #define TCPOLEN_EXP_FASTOPEN_BASE 4 #define TCPOLEN_EXP_SMC_BASE 6 +#define TCPOLEN_EXP_BASE 6 /* But this is what stacks really send out. 
*/ #define TCPOLEN_TSTAMP_ALIGNED 12 @@ -403,7 +404,8 @@ int tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int nonblock, int flags, int *addr_len); void tcp_parse_options(const struct net *net, const struct sk_buff *skb, struct tcp_options_received *opt_rx, - int estab, struct tcp_fastopen_cookie *foc); + int estab, struct tcp_fastopen_cookie *foc, + struct sock *sk); const u8 *tcp_parse_md5sig_option(const struct tcphdr *th); /* @@ -2094,4 +2096,110 @@ static inline bool tcp_bpf_ca_needs_ecn(struct sock *sk) #if IS_ENABLED(CONFIG_SMC) extern struct static_key_false tcp_have_smc; #endif + +struct tcp_extopt_store; + +struct tcp_extopt_ops { + u32 option_kind; + unsigned char priority; + void (*parse)(int opsize, const unsigned char *opptr, + const struct sk_buff *skb, + struct tcp_options_received *opt_rx, + struct sock *sk, + struct tcp_extopt_store *store); + bool (*check)(struct sock *sk, + const struct sk_buff *skb, + struct tcp_options_received *opt_rx, + struct tcp_extopt_store *store); + void (*post_process)(struct sock *sk, + struct tcp_options_received *opt_rx, + struct tcp_extopt_store *store); + /* Return the number of bytes consumed */ + unsigned int (*prepare)(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store); + __be32 *(*write)(__be32 *ptr, struct sk_buff *skb, + struct tcp_out_options *opts, struct sock *sk, + struct tcp_extopt_store *store); + int (*response_prepare)(struct sk_buff *orig, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store); + __be32 *(*response_write)(__be32 *ptr, struct sk_buff *orig, + struct tcphdr *th, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store); + int (*add_header_len)(const struct sock *orig, + const struct sock *sk, + struct tcp_extopt_store *store); + struct tcp_extopt_store *(*copy)(struct 
sock *listener, + struct request_sock *req, + struct tcp_options_received *opt, + struct tcp_extopt_store *from); + struct tcp_extopt_store *(*move)(struct sock *from, struct sock *to, + struct tcp_extopt_store *store); + void (*destroy)(struct tcp_extopt_store *store); + struct module *owner; +}; + +/* The tcp_extopt_store is the generic structure that will be added to the + * list of TCP extra-options. + * + * Protocols using the framework can create a wrapper structure around it that + * stores protocol-specific state. The tcp_extopt-functions will provide + * tcp_extopt_store though, so the protocol can use container_of to get + * access to the wrapper structure containing the state. + */ +struct tcp_extopt_store { + struct hlist_node list; + const struct tcp_extopt_ops *ops; +}; + +struct hlist_head *tcp_extopt_get_list(const struct sock *sk); + +struct tcp_extopt_store *tcp_extopt_find_kind(u32 kind, const struct sock *sk); + +void tcp_extopt_parse(u32 opcode, int opsize, const unsigned char *opptr, + const struct sk_buff *skb, + struct tcp_options_received *opt_rx, struct sock *sk); + +bool tcp_extopt_check(struct sock *sk, const struct sk_buff *skb, + struct tcp_options_received *opt_rx); + +void tcp_extopt_post_process(struct sock *sk, + struct tcp_options_received *opt_rx); + +unsigned int tcp_extopt_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk); + +void tcp_extopt_write(__be32 *ptr, struct sk_buff *skb, + struct tcp_out_options *opts, struct sock *sk); + +int tcp_extopt_response_prepare(struct sk_buff *orig, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk); + +void tcp_extopt_response_write(__be32 *ptr, struct sk_buff *orig, + struct tcphdr *th, struct tcp_out_options *opts, + const struct sock *sk); + +int tcp_extopt_add_header(const struct sock *orig, const struct sock *sk); + +/* Socket lock must be held when calling this function */ 
+int tcp_register_extopt(struct tcp_extopt_store *store, struct sock *sk); + +void tcp_extopt_copy(struct sock *listener, struct request_sock *req, + struct tcp_options_received *opt); + +void tcp_extopt_move(struct sock *from, struct sock *to); + +void tcp_extopt_destroy(struct sock *sk); + #endif /* _TCP_H */ diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c index fda37f2862c9..8373abf19440 100644 --- a/net/ipv4/syncookies.c +++ b/net/ipv4/syncookies.c @@ -313,7 +313,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb) /* check for timestamp cookie support */ memset(&tcp_opt, 0, sizeof(tcp_opt)); - tcp_parse_options(sock_net(sk), skb, &tcp_opt, 0, NULL); + tcp_parse_options(sock_net(sk), skb, &tcp_opt, 0, NULL, sk); if (tcp_opt.saw_tstamp && tcp_opt.rcv_tsecr) { tsoff = secure_tcp_ts_off(sock_net(sk), @@ -325,6 +325,10 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb) if (!cookie_timestamp_decode(sock_net(sk), &tcp_opt)) goto out; + if (unlikely(!hlist_empty(&tp->tcp_option_list)) && + tcp_extopt_check(sk, skb, &tcp_opt)) + goto out; + ret = NULL; req = inet_reqsk_alloc(&tcp_request_sock_ops, sk, false); /* for safety */ if (!req) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 874c9317b8df..ffb5f4fbd935 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -414,6 +414,7 @@ void tcp_init_sock(struct sock *sk) tcp_init_xmit_timers(sk); INIT_LIST_HEAD(&tp->tsq_node); INIT_LIST_HEAD(&tp->tsorted_sent_queue); + INIT_HLIST_HEAD(&tp->tcp_option_list); icsk->icsk_rto = TCP_TIMEOUT_INIT; tp->mdev_us = jiffies_to_usecs(TCP_TIMEOUT_INIT); @@ -3506,6 +3507,331 @@ EXPORT_SYMBOL(tcp_md5_hash_key); #endif +struct hlist_head *tcp_extopt_get_list(const struct sock *sk) +{ + if (sk_fullsock(sk)) + return &tcp_sk(sk)->tcp_option_list; + else if (sk->sk_state == TCP_NEW_SYN_RECV) + return &tcp_rsk(inet_reqsk(sk))->tcp_option_list; + else if (sk->sk_state == TCP_TIME_WAIT) + return &tcp_twsk(sk)->tcp_option_list; + + return NULL; +} 
+EXPORT_SYMBOL_GPL(tcp_extopt_get_list); + +/* Caller must ensure that rcu is locked */ +struct tcp_extopt_store *tcp_extopt_find_kind(u32 kind, const struct sock *sk) +{ + struct tcp_extopt_store *entry; + struct hlist_head *lhead; + + lhead = tcp_extopt_get_list(sk); + + hlist_for_each_entry_rcu(entry, lhead, list) { + if (entry->ops->option_kind == kind) + return entry; + } + + return NULL; +} +EXPORT_SYMBOL_GPL(tcp_extopt_find_kind); + +void tcp_extopt_parse(u32 opcode, int opsize, const unsigned char *opptr, + const struct sk_buff *skb, + struct tcp_options_received *opt_rx, struct sock *sk) +{ + struct tcp_extopt_store *entry; + + rcu_read_lock(); + entry = tcp_extopt_find_kind(opcode, sk); + + if (entry && entry->ops->parse) + entry->ops->parse(opsize, opptr, skb, opt_rx, sk, entry); + rcu_read_unlock(); +} + +bool tcp_extopt_check(struct sock *sk, const struct sk_buff *skb, + struct tcp_options_received *opt_rx) +{ + struct tcp_extopt_store *entry; + struct hlist_head *lhead; + bool drop = false; + + lhead = tcp_extopt_get_list(sk); + + rcu_read_lock(); + hlist_for_each_entry_rcu(entry, lhead, list) { + bool ret = false; + + if (entry->ops->check) + ret = entry->ops->check(sk, skb, opt_rx, entry); + + if (ret) + drop = true; + } + rcu_read_unlock(); + + return drop; +} +EXPORT_SYMBOL_GPL(tcp_extopt_check); + +void tcp_extopt_post_process(struct sock *sk, + struct tcp_options_received *opt_rx) +{ + struct tcp_extopt_store *entry; + struct hlist_head *lhead; + + lhead = tcp_extopt_get_list(sk); + + rcu_read_lock(); + hlist_for_each_entry_rcu(entry, lhead, list) { + if (entry->ops->post_process) + entry->ops->post_process(sk, opt_rx, entry); + } + rcu_read_unlock(); +} + +unsigned int tcp_extopt_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk) +{ + struct tcp_extopt_store *entry; + struct hlist_head *lhead; + unsigned int used = 0; + + if (!sk) + return 0; + + lhead = 
tcp_extopt_get_list(sk); + + rcu_read_lock(); + hlist_for_each_entry_rcu(entry, lhead, list) { + if (unlikely(!entry->ops->prepare)) + continue; + + used += entry->ops->prepare(skb, flags, remaining - used, opts, + sk, entry); + } + rcu_read_unlock(); + + return roundup(used, 4); +} + +void tcp_extopt_write(__be32 *ptr, struct sk_buff *skb, + struct tcp_out_options *opts, struct sock *sk) +{ + struct tcp_extopt_store *entry; + struct hlist_head *lhead; + + if (!sk) + return; + + lhead = tcp_extopt_get_list(sk); + + rcu_read_lock(); + hlist_for_each_entry_rcu(entry, lhead, list) { + if (unlikely(!entry->ops->write)) + continue; + + ptr = entry->ops->write(ptr, skb, opts, sk, entry); + } + rcu_read_unlock(); +} +EXPORT_SYMBOL_GPL(tcp_extopt_write); + +int tcp_extopt_response_prepare(struct sk_buff *orig, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk) +{ + struct tcp_extopt_store *entry; + struct hlist_head *lhead; + unsigned int used = 0; + + if (!sk) + return 0; + + lhead = tcp_extopt_get_list(sk); + + rcu_read_lock(); + hlist_for_each_entry_rcu(entry, lhead, list) { + int ret; + + if (unlikely(!entry->ops->response_prepare)) + continue; + + ret = entry->ops->response_prepare(orig, flags, + remaining - used, opts, + sk, entry); + + used += ret; + } + rcu_read_unlock(); + + return roundup(used, 4); +} +EXPORT_SYMBOL_GPL(tcp_extopt_response_prepare); + +void tcp_extopt_response_write(__be32 *ptr, struct sk_buff *orig, + struct tcphdr *th, struct tcp_out_options *opts, + const struct sock *sk) +{ + struct tcp_extopt_store *entry; + struct hlist_head *lhead; + + if (!sk) + return; + + lhead = tcp_extopt_get_list(sk); + + rcu_read_lock(); + hlist_for_each_entry_rcu(entry, lhead, list) { + if (unlikely(!entry->ops->response_write)) + continue; + + ptr = entry->ops->response_write(ptr, orig, th, opts, sk, entry); + } + rcu_read_unlock(); +} +EXPORT_SYMBOL_GPL(tcp_extopt_response_write); + +int tcp_extopt_add_header(const struct 
sock *orig, const struct sock *sk) +{ + struct tcp_extopt_store *entry; + struct hlist_head *lhead; + int tcp_header_len = 0; + + lhead = tcp_extopt_get_list(sk); + + rcu_read_lock(); + hlist_for_each_entry_rcu(entry, lhead, list) { + if (unlikely(!entry->ops->add_header_len)) + continue; + + tcp_header_len += entry->ops->add_header_len(orig, sk, entry); + } + rcu_read_unlock(); + + return tcp_header_len; +} + +/* Socket lock must be held when calling this function */ +int tcp_register_extopt(struct tcp_extopt_store *store, struct sock *sk) +{ + struct hlist_node *add_before = NULL; + struct tcp_extopt_store *entry; + struct hlist_head *lhead; + int ret = 0; + + lhead = tcp_extopt_get_list(sk); + + if (!store->ops->option_kind) + return -EINVAL; + + if (!try_module_get(store->ops->owner)) + return -ENOENT; + + hlist_for_each_entry_rcu(entry, lhead, list) { + if (entry->ops->option_kind == store->ops->option_kind) { + pr_notice("Option kind %u already registered\n", + store->ops->option_kind); + module_put(store->ops->owner); + return -EEXIST; + } + + if (entry->ops->priority <= store->ops->priority) + add_before = &entry->list; + } + + if (add_before) + hlist_add_behind_rcu(&store->list, add_before); + else + hlist_add_head_rcu(&store->list, lhead); + + pr_debug("Option kind %u registered\n", store->ops->option_kind); + + return ret; +} +EXPORT_SYMBOL_GPL(tcp_register_extopt); + +void tcp_extopt_copy(struct sock *listener, struct request_sock *req, + struct tcp_options_received *opt) +{ + struct tcp_extopt_store *entry; + struct hlist_head *from, *to; + + from = tcp_extopt_get_list(listener); + to = tcp_extopt_get_list(req_to_sk(req)); + + rcu_read_lock(); + hlist_for_each_entry_rcu(entry, from, list) { + struct tcp_extopt_store *new; + + if (!try_module_get(entry->ops->owner)) { + pr_err("%s Module get failed while copying\n", __func__); + continue; + } + + new = entry->ops->copy(listener, req, opt, entry); + if (!new) { + module_put(entry->ops->owner); + 
continue; + } + + hlist_add_tail_rcu(&new->list, to); + } + rcu_read_unlock(); +} + +void tcp_extopt_move(struct sock *from, struct sock *to) +{ + struct tcp_extopt_store *entry; + struct hlist_head *lfrom, *lto; + struct hlist_node *tmp; + + lfrom = tcp_extopt_get_list(from); + lto = tcp_extopt_get_list(to); + + rcu_read_lock(); + hlist_for_each_entry_safe(entry, tmp, lfrom, list) { + hlist_del_rcu(&entry->list); + + if (entry->ops->move) { + entry = entry->ops->move(from, to, entry); + if (!entry) + continue; + } + + hlist_add_tail_rcu(&entry->list, lto); + } + rcu_read_unlock(); +} +EXPORT_SYMBOL_GPL(tcp_extopt_move); + +void tcp_extopt_destroy(struct sock *sk) +{ + struct tcp_extopt_store *entry; + struct hlist_head *lhead; + struct hlist_node *tmp; + + lhead = tcp_extopt_get_list(sk); + + rcu_read_lock(); + hlist_for_each_entry_safe(entry, tmp, lhead, list) { + struct module *owner = entry->ops->owner; + + hlist_del_rcu(&entry->list); + + entry->ops->destroy(entry); + + module_put(owner); + } + rcu_read_unlock(); +} +EXPORT_SYMBOL_GPL(tcp_extopt_destroy); + void tcp_done(struct sock *sk) { struct request_sock *req = tcp_sk(sk)->fastopen_rsk; @@ -3655,7 +3981,6 @@ void __init tcp_init(void) INIT_HLIST_HEAD(&tcp_hashinfo.bhash[i].chain); } - cnt = tcp_hashinfo.ehash_mask + 1; sysctl_tcp_max_orphans = cnt / 2; diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 94ba88b2246b..187e3fa761c8 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -3716,7 +3716,7 @@ static int smc_parse_options(const struct tcphdr *th, void tcp_parse_options(const struct net *net, const struct sk_buff *skb, struct tcp_options_received *opt_rx, int estab, - struct tcp_fastopen_cookie *foc) + struct tcp_fastopen_cookie *foc, struct sock *sk) { const unsigned char *ptr; const struct tcphdr *th = tcp_hdr(skb); @@ -3816,9 +3816,18 @@ void tcp_parse_options(const struct net *net, tcp_parse_fastopen_option(opsize - TCPOLEN_EXP_FASTOPEN_BASE, ptr + 2, th->syn, foc, true); - 
else - smc_parse_options(th, opt_rx, ptr, - opsize); + else if (smc_parse_options(th, opt_rx, ptr, + opsize)) + break; + else if (opsize >= TCPOLEN_EXP_BASE) + tcp_extopt_parse(get_unaligned_be32(ptr), + opsize, ptr, skb, + opt_rx, sk); + break; + + default: + tcp_extopt_parse(opcode, opsize, ptr, skb, + opt_rx, sk); break; } @@ -3869,11 +3878,13 @@ static bool tcp_fast_parse_options(const struct net *net, goto extra_opt_check; } - tcp_parse_options(net, skb, &tp->rx_opt, 1, NULL); + tcp_parse_options(net, skb, &tp->rx_opt, 1, NULL, tcp_to_sk(tp)); if (tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr) tp->rx_opt.rcv_tsecr -= tp->tsoffset; extra_opt_check: + if (unlikely(!hlist_empty(&tp->tcp_option_list))) + return tcp_extopt_check(tcp_to_sk(tp), skb, &tp->rx_opt); return false; } @@ -5350,6 +5361,9 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb, tp->rx_opt.saw_tstamp = 0; + if (!hlist_empty(&tp->tcp_option_list)) + goto slow_path; + /* pred_flags is 0xS?10 << 16 + snd_wnd * if header_prediction is to be made * 'S' will always be tp->tcp_header_len >> 2 @@ -5537,7 +5551,7 @@ static bool tcp_rcv_fastopen_synack(struct sock *sk, struct sk_buff *synack, /* Get original SYNACK MSS value if user MSS sets mss_clamp */ tcp_clear_options(&opt); opt.user_mss = opt.mss_clamp = 0; - tcp_parse_options(sock_net(sk), synack, &opt, 0, NULL); + tcp_parse_options(sock_net(sk), synack, &opt, 0, NULL, sk); mss = opt.mss_clamp; } @@ -5600,10 +5614,14 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb, int saved_clamp = tp->rx_opt.mss_clamp; bool fastopen_fail; - tcp_parse_options(sock_net(sk), skb, &tp->rx_opt, 0, &foc); + tcp_parse_options(sock_net(sk), skb, &tp->rx_opt, 0, &foc, sk); if (tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr) tp->rx_opt.rcv_tsecr -= tp->tsoffset; + if (unlikely(!hlist_empty(&tp->tcp_option_list)) && + tcp_extopt_check(sk, skb, &tp->rx_opt)) + goto discard; + if (th->ack) { /* rfc793: * "If the state is SYN-SENT 
then @@ -5686,6 +5704,9 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb, tp->tcp_header_len = sizeof(struct tcphdr); } + if (unlikely(!hlist_empty(&tp->tcp_option_list))) + tcp_extopt_post_process(sk, &tp->rx_opt); + tcp_sync_mss(sk, icsk->icsk_pmtu_cookie); tcp_initialize_rcv_mss(sk); @@ -5779,6 +5800,9 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb, tcp_ecn_rcv_syn(tp, th); + if (unlikely(!hlist_empty(&tp->tcp_option_list))) + tcp_extopt_post_process(sk, &tp->rx_opt); + tcp_mtup_init(sk); tcp_sync_mss(sk, icsk->icsk_pmtu_cookie); tcp_initialize_rcv_mss(sk); @@ -6262,12 +6286,17 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops, tcp_rsk(req)->af_specific = af_ops; tcp_rsk(req)->ts_off = 0; + INIT_HLIST_HEAD(&tcp_rsk(req)->tcp_option_list); tcp_clear_options(&tmp_opt); tmp_opt.mss_clamp = af_ops->mss_clamp; tmp_opt.user_mss = tp->rx_opt.user_mss; tcp_parse_options(sock_net(sk), skb, &tmp_opt, 0, - want_cookie ? NULL : &foc); + want_cookie ? 
NULL : &foc, sk); + + if (unlikely(!hlist_empty(&tp->tcp_option_list)) && + tcp_extopt_check(sk, skb, &tmp_opt)) + goto drop_and_free; if (want_cookie && !tmp_opt.saw_tstamp) tcp_clear_options(&tmp_opt); @@ -6328,6 +6357,10 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops, tcp_reqsk_record_syn(sk, req, skb); fastopen_sk = tcp_try_fastopen(sk, skb, req, &foc, dst); } + + if (unlikely(!hlist_empty(&tp->tcp_option_list))) + tcp_extopt_copy(sk, req, &tmp_opt); + if (fastopen_sk) { af_ops->send_synack(fastopen_sk, dst, &fl, req, &foc, TCP_SYNACK_FASTOPEN); diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 95738aa0d8a6..4112594d04be 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -600,10 +600,9 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) const struct tcphdr *th = tcp_hdr(skb); struct { struct tcphdr th; -#ifdef CONFIG_TCP_MD5SIG - __be32 opt[(TCPOLEN_MD5SIG_ALIGNED >> 2)]; -#endif + __be32 opt[(MAX_TCP_OPTION_SPACE >> 2)]; } rep; + struct hlist_head *extopt_list = NULL; struct ip_reply_arg arg; #ifdef CONFIG_TCP_MD5SIG struct tcp_md5sig_key *key = NULL; @@ -613,6 +612,7 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) struct sock *sk1 = NULL; #endif struct net *net; + int offset = 0; /* Never send a reset in response to a reset. */ if (th->rst) @@ -624,6 +624,9 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) if (!sk && skb_rtable(skb)->rt_type != RTN_LOCAL) return; + if (sk) + extopt_list = tcp_extopt_get_list(sk); + /* Swap the send and the receive. 
*/ memset(&rep, 0, sizeof(rep)); rep.th.dest = th->source; @@ -678,19 +681,44 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) goto out; } +#endif + + if (unlikely(extopt_list && !hlist_empty(extopt_list))) { + unsigned int remaining; + struct tcp_out_options opts; + int used; + remaining = sizeof(rep.opt); +#ifdef CONFIG_TCP_MD5SIG + if (key) + remaining -= TCPOLEN_MD5SIG_ALIGNED; +#endif + + memset(&opts, 0, sizeof(opts)); + + used = tcp_extopt_response_prepare(skb, TCPHDR_RST, remaining, + &opts, sk); + + arg.iov[0].iov_len += used; + rep.th.doff = arg.iov[0].iov_len / 4; + + tcp_extopt_response_write(&rep.opt[0], skb, &rep.th, &opts, sk); + offset += used / 4; + } + +#ifdef CONFIG_TCP_MD5SIG if (key) { - rep.opt[0] = htonl((TCPOPT_NOP << 24) | - (TCPOPT_NOP << 16) | - (TCPOPT_MD5SIG << 8) | - TCPOLEN_MD5SIG); + rep.opt[offset++] = htonl((TCPOPT_NOP << 24) | + (TCPOPT_NOP << 16) | + (TCPOPT_MD5SIG << 8) | + TCPOLEN_MD5SIG); /* Update length and the length the header thinks exists */ arg.iov[0].iov_len += TCPOLEN_MD5SIG_ALIGNED; rep.th.doff = arg.iov[0].iov_len / 4; - tcp_v4_md5_hash_hdr((__u8 *) &rep.opt[1], - key, ip_hdr(skb)->saddr, - ip_hdr(skb)->daddr, &rep.th); + tcp_v4_md5_hash_hdr((__u8 *)&rep.opt[offset], + key, ip_hdr(skb)->saddr, + ip_hdr(skb)->daddr, &rep.th); } #endif arg.csum = csum_tcpudp_nofold(ip_hdr(skb)->daddr, @@ -742,14 +770,14 @@ static void tcp_v4_send_ack(const struct sock *sk, const struct tcphdr *th = tcp_hdr(skb); struct { struct tcphdr th; - __be32 opt[(TCPOLEN_TSTAMP_ALIGNED >> 2) -#ifdef CONFIG_TCP_MD5SIG - + (TCPOLEN_MD5SIG_ALIGNED >> 2) -#endif - ]; + __be32 opt[(MAX_TCP_OPTION_SPACE >> 2)]; } rep; + struct hlist_head *extopt_list = NULL; struct net *net = sock_net(sk); struct ip_reply_arg arg; + int offset = 0; + + extopt_list = tcp_extopt_get_list(sk); memset(&rep.th, 0, sizeof(struct tcphdr)); memset(&arg, 0, sizeof(arg)); @@ -763,6 +791,7 @@ static void tcp_v4_send_ack(const struct sock *sk, 
rep.opt[1] = htonl(tsval); rep.opt[2] = htonl(tsecr); arg.iov[0].iov_len += TCPOLEN_TSTAMP_ALIGNED; + offset += 3; } /* Swap the send and the receive. */ @@ -774,22 +803,45 @@ static void tcp_v4_send_ack(const struct sock *sk, rep.th.ack = 1; rep.th.window = htons(win); + if (unlikely(extopt_list && !hlist_empty(extopt_list))) { + unsigned int remaining; + struct tcp_out_options opts; + int used; + + remaining = sizeof(rep.th) + sizeof(rep.opt) - arg.iov[0].iov_len; + #ifdef CONFIG_TCP_MD5SIG - if (key) { - int offset = (tsecr) ? 3 : 0; + if (key) + remaining -= TCPOLEN_MD5SIG_ALIGNED; +#endif + + memset(&opts, 0, sizeof(opts)); + used = tcp_extopt_response_prepare(skb, TCPHDR_ACK, remaining, + &opts, sk); + + arg.iov[0].iov_len += used; + rep.th.doff = arg.iov[0].iov_len / 4; + tcp_extopt_response_write(&rep.opt[offset], skb, &rep.th, &opts, sk); + + offset += used / 4; + } + +#ifdef CONFIG_TCP_MD5SIG + if (key) { rep.opt[offset++] = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) | (TCPOPT_MD5SIG << 8) | TCPOLEN_MD5SIG); arg.iov[0].iov_len += TCPOLEN_MD5SIG_ALIGNED; - rep.th.doff = arg.iov[0].iov_len/4; + rep.th.doff = arg.iov[0].iov_len / 4; tcp_v4_md5_hash_hdr((__u8 *) &rep.opt[offset], key, ip_hdr(skb)->saddr, ip_hdr(skb)->daddr, &rep.th); } #endif + arg.flags = reply_flags; arg.csum = csum_tcpudp_nofold(ip_hdr(skb)->daddr, ip_hdr(skb)->saddr, /* XXX */ @@ -893,6 +945,9 @@ static int tcp_v4_send_synack(const struct sock *sk, struct dst_entry *dst, */ static void tcp_v4_reqsk_destructor(struct request_sock *req) { + if (unlikely(!hlist_empty(&tcp_rsk(req)->tcp_option_list))) + tcp_extopt_destroy(req_to_sk(req)); + kfree(rcu_dereference_protected(inet_rsk(req)->ireq_opt, 1)); } @@ -1410,6 +1465,11 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb, if (likely(*own_req)) { tcp_move_syn(newtp, req); ireq->ireq_opt = NULL; + + if (unlikely(!hlist_empty(&tcp_rsk(req)->tcp_option_list))) { + tcp_extopt_move(req_to_sk(req), newsk); + 
INIT_HLIST_HEAD(&tcp_rsk(req)->tcp_option_list); + } } else { newinet->inet_opt = NULL; } @@ -1907,6 +1967,8 @@ void tcp_v4_destroy_sock(struct sock *sk) /* Cleans up our, hopefully empty, out_of_order_queue. */ skb_rbtree_purge(&tp->out_of_order_queue); + if (unlikely(!hlist_empty(&tp->tcp_option_list))) + tcp_extopt_destroy(sk); #ifdef CONFIG_TCP_MD5SIG /* Clean up the MD5 key list, if any */ if (tp->md5sig_info) { diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index a8384b0c11f8..46eb5a33aec1 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -95,9 +95,10 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb, struct tcp_timewait_sock *tcptw = tcp_twsk((struct sock *)tw); bool paws_reject = false; - tmp_opt.saw_tstamp = 0; + tcp_clear_options(&tmp_opt); if (th->doff > (sizeof(*th) >> 2) && tcptw->tw_ts_recent_stamp) { - tcp_parse_options(twsk_net(tw), skb, &tmp_opt, 0, NULL); + tcp_parse_options(twsk_net(tw), skb, &tmp_opt, 0, NULL, + (struct sock *)tw); if (tmp_opt.saw_tstamp) { if (tmp_opt.rcv_tsecr) @@ -108,6 +109,10 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb, } } + if (unlikely(!hlist_empty(&tcptw->tcp_option_list)) && + tcp_extopt_check((struct sock *)tw, skb, &tmp_opt)) + return TCP_TW_SUCCESS; + if (tw->tw_substate == TCP_FIN_WAIT2) { /* Just repeat all the checks of tcp_rcv_state_process() */ @@ -251,7 +256,7 @@ EXPORT_SYMBOL(tcp_timewait_state_process); void tcp_time_wait(struct sock *sk, int state, int timeo) { const struct inet_connection_sock *icsk = inet_csk(sk); - const struct tcp_sock *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct inet_timewait_sock *tw; struct inet_timewait_death_row *tcp_death_row = &sock_net(sk)->ipv4.tcp_death_row; @@ -271,6 +276,7 @@ void tcp_time_wait(struct sock *sk, int state, int timeo) tcptw->tw_ts_recent_stamp = tp->rx_opt.ts_recent_stamp; tcptw->tw_ts_offset = tp->tsoffset; tcptw->tw_last_oow_ack_time = 
0; + INIT_HLIST_HEAD(&tcptw->tcp_option_list); #if IS_ENABLED(CONFIG_IPV6) if (tw->tw_family == PF_INET6) { @@ -284,6 +290,10 @@ void tcp_time_wait(struct sock *sk, int state, int timeo) } #endif + if (unlikely(!hlist_empty(&tp->tcp_option_list))) { + tcp_extopt_move(sk, (struct sock *)tw); + INIT_HLIST_HEAD(&tp->tcp_option_list); + } #ifdef CONFIG_TCP_MD5SIG /* * The timewait bucket does not have the key DB from the @@ -335,12 +345,15 @@ void tcp_time_wait(struct sock *sk, int state, int timeo) void tcp_twsk_destructor(struct sock *sk) { -#ifdef CONFIG_TCP_MD5SIG struct tcp_timewait_sock *twsk = tcp_twsk(sk); +#ifdef CONFIG_TCP_MD5SIG if (twsk->tw_md5_key) kfree_rcu(twsk->tw_md5_key, rcu); #endif + + if (unlikely(!hlist_empty(&twsk->tcp_option_list))) + tcp_extopt_destroy(sk); } EXPORT_SYMBOL_GPL(tcp_twsk_destructor); @@ -470,6 +483,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk, INIT_LIST_HEAD(&newtp->tsq_node); INIT_LIST_HEAD(&newtp->tsorted_sent_queue); + INIT_HLIST_HEAD(&newtp->tcp_option_list); tcp_init_wl(newtp, treq->rcv_isn); @@ -545,6 +559,9 @@ struct sock *tcp_create_openreq_child(const struct sock *sk, if (newtp->af_specific->md5_lookup(sk, newsk)) newtp->tcp_header_len += TCPOLEN_MD5SIG_ALIGNED; #endif + if (unlikely(!hlist_empty(&treq->tcp_option_list))) + newtp->tcp_header_len += tcp_extopt_add_header(req_to_sk(req), newsk); + if (skb->len >= TCP_MSS_DEFAULT + newtp->tcp_header_len) newicsk->icsk_ack.last_seg_size = skb->len - newtp->tcp_header_len; newtp->rx_opt.mss_clamp = req->mss; @@ -587,9 +604,10 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, bool paws_reject = false; bool own_req; - tmp_opt.saw_tstamp = 0; + tcp_clear_options(&tmp_opt); if (th->doff > (sizeof(struct tcphdr)>>2)) { - tcp_parse_options(sock_net(sk), skb, &tmp_opt, 0, NULL); + tcp_parse_options(sock_net(sk), skb, &tmp_opt, 0, NULL, + req_to_sk(req)); if (tmp_opt.saw_tstamp) { tmp_opt.ts_recent = req->ts_recent; @@ -604,6 +622,10 @@ struct 
sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, } } + if (unlikely(!hlist_empty(&tcp_rsk(req)->tcp_option_list)) && + tcp_extopt_check(req_to_sk(req), skb, &tmp_opt)) + return NULL; + /* Check for pure retransmitted SYN. */ if (TCP_SKB_CB(skb)->seq == tcp_rsk(req)->rcv_isn && flg == TCP_FLAG_SYN && diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index e598bf54e3fb..6d418ce06b59 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -398,13 +398,6 @@ static inline bool tcp_urg_mode(const struct tcp_sock *tp) return tp->snd_una != tp->snd_up; } -#define OPTION_SACK_ADVERTISE (1 << 0) -#define OPTION_TS (1 << 1) -#define OPTION_MD5 (1 << 2) -#define OPTION_WSCALE (1 << 3) -#define OPTION_FAST_OPEN_COOKIE (1 << 8) -#define OPTION_SMC (1 << 9) - static void smc_options_write(__be32 *ptr, u16 *options) { #if IS_ENABLED(CONFIG_SMC) @@ -420,17 +413,6 @@ static void smc_options_write(__be32 *ptr, u16 *options) #endif } -struct tcp_out_options { - u16 options; /* bit field of OPTION_* */ - u16 mss; /* 0 to disable */ - u8 ws; /* window scale, 0 to disable */ - u8 num_sack_blocks; /* number of SACK blocks to include */ - u8 hash_size; /* bytes in hash_location */ - __u8 *hash_location; /* temporary pointer, overloaded */ - __u32 tsval, tsecr; /* need to include OPTION_TS */ - struct tcp_fastopen_cookie *fastopen_cookie; /* Fast open cookie */ -}; - /* Write previously computed TCP options to the packet. 
* * Beware: Something in the Internet is very sensitive to the ordering of @@ -447,12 +429,15 @@ struct tcp_out_options { static void tcp_options_write(__be32 *ptr, struct sk_buff *skb, struct sock *sk, struct tcp_out_options *opts) { + struct hlist_head *extopt_list; u16 options = opts->options; /* mungable copy */ struct tcp_sock *tp = NULL; if (sk_fullsock(sk)) tp = tcp_sk(sk); + extopt_list = tcp_extopt_get_list(sk); + if (unlikely(OPTION_MD5 & options)) { *ptr++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) | (TCPOPT_MD5SIG << 8) | TCPOLEN_MD5SIG); @@ -543,6 +528,9 @@ static void tcp_options_write(__be32 *ptr, struct sk_buff *skb, struct sock *sk, } smc_options_write(ptr, &options); + + if (unlikely(!hlist_empty(extopt_list))) + tcp_extopt_write(ptr, skb, opts, sk); } static void smc_set_option(const struct tcp_sock *tp, @@ -645,6 +633,10 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb, smc_set_option(tp, opts, &remaining); + if (unlikely(!hlist_empty(&tp->tcp_option_list))) + remaining -= tcp_extopt_prepare(skb, TCPHDR_SYN, remaining, + opts, tcp_to_sk(tp)); + return MAX_TCP_OPTION_SPACE - remaining; } @@ -708,6 +700,11 @@ static unsigned int tcp_synack_options(const struct sock *sk, smc_set_option_cond(tcp_sk(sk), ireq, opts, &remaining); + if (unlikely(!hlist_empty(&tcp_rsk(req)->tcp_option_list))) + remaining -= tcp_extopt_prepare(skb, TCPHDR_SYN | TCPHDR_ACK, + remaining, opts, + req_to_sk(req)); + return MAX_TCP_OPTION_SPACE - remaining; } @@ -741,6 +738,10 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb size += TCPOLEN_TSTAMP_ALIGNED; } + if (unlikely(!hlist_empty(&tp->tcp_option_list))) + size += tcp_extopt_prepare(skb, 0, MAX_TCP_OPTION_SPACE - size, + opts, tcp_to_sk(tp)); + eff_sacks = tp->rx_opt.num_sacks + tp->rx_opt.dsack; if (unlikely(eff_sacks)) { const unsigned int remaining = MAX_TCP_OPTION_SPACE - size; @@ -3308,6 +3309,9 @@ static void tcp_connect_init(struct sock *sk) 
tp->tcp_header_len += TCPOLEN_MD5SIG_ALIGNED; #endif + if (unlikely(!hlist_empty(&tp->tcp_option_list))) + tp->tcp_header_len += tcp_extopt_add_header(sk, sk); + /* If user gave his TCP_MAXSEG, record it to clamp */ if (tp->rx_opt.user_mss) tp->rx_opt.mss_clamp = tp->rx_opt.user_mss; diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c index e7a3a6b6cf56..d0716c7e9390 100644 --- a/net/ipv6/syncookies.c +++ b/net/ipv6/syncookies.c @@ -162,7 +162,7 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb) /* check for timestamp cookie support */ memset(&tcp_opt, 0, sizeof(tcp_opt)); - tcp_parse_options(sock_net(sk), skb, &tcp_opt, 0, NULL); + tcp_parse_options(sock_net(sk), skb, &tcp_opt, 0, NULL, sk); if (tcp_opt.saw_tstamp && tcp_opt.rcv_tsecr) { tsoff = secure_tcpv6_ts_off(sock_net(sk), @@ -174,6 +174,10 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb) if (!cookie_timestamp_decode(sock_net(sk), &tcp_opt)) goto out; + if (unlikely(!hlist_empty(&tp->tcp_option_list)) && + tcp_extopt_check(sk, skb, &tcp_opt)) + goto out; + ret = NULL; req = inet_reqsk_alloc(&tcp6_request_sock_ops, sk, false); if (!req) diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index a1ab29e2ab3b..202bf011f462 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -499,6 +499,9 @@ static int tcp_v6_send_synack(const struct sock *sk, struct dst_entry *dst, static void tcp_v6_reqsk_destructor(struct request_sock *req) { + if (unlikely(!hlist_empty(&tcp_rsk(req)->tcp_option_list))) + tcp_extopt_destroy(req_to_sk(req)); + kfree(inet_rsk(req)->ipv6_opt); kfree_skb(inet_rsk(req)->pktopts); } @@ -788,6 +791,8 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 unsigned int tot_len = sizeof(struct tcphdr); struct dst_entry *dst; __be32 *topt; + struct hlist_head *extopt_list = NULL; + struct tcp_out_options extraopts; if (tsecr) tot_len += TCPOLEN_TSTAMP_ALIGNED; @@ -796,6 +801,25 @@ static void 
tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 tot_len += TCPOLEN_MD5SIG_ALIGNED; #endif + if (sk) + extopt_list = tcp_extopt_get_list(sk); + + if (unlikely(extopt_list && !hlist_empty(extopt_list))) { + unsigned int remaining = MAX_TCP_OPTION_SPACE - tot_len; + u8 extraflags = rst ? TCPHDR_RST : 0; + int used; + + if (!rst || !th->ack) + extraflags |= TCPHDR_ACK; + + memset(&extraopts, 0, sizeof(extraopts)); + + used = tcp_extopt_response_prepare(skb, extraflags, remaining, + &extraopts, sk); + + tot_len += used; + } + buff = alloc_skb(MAX_HEADER + sizeof(struct ipv6hdr) + tot_len, GFP_ATOMIC); if (!buff) @@ -836,6 +860,9 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 } #endif + if (unlikely(extopt_list && !hlist_empty(extopt_list))) + tcp_extopt_response_write(topt, skb, t1, &extraopts, sk); + memset(&fl6, 0, sizeof(fl6)); fl6.daddr = ipv6_hdr(skb)->saddr; fl6.saddr = ipv6_hdr(skb)->daddr; @@ -1230,6 +1257,11 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff * skb_set_owner_r(newnp->pktoptions, newsk); } } + + if (unlikely(!hlist_empty(&tcp_rsk(req)->tcp_option_list))) { + tcp_extopt_move(req_to_sk(req), newsk); + INIT_HLIST_HEAD(&tcp_rsk(req)->tcp_option_list); + } } return newsk; From patchwork Thu Feb 1 00:07:08 2018 X-Patchwork-Submitter: Christoph Paasch X-Patchwork-Id: 868113
From: Christoph Paasch To: netdev@vger.kernel.org Cc: Eric Dumazet , Mat Martineau , Ursula Braun Subject: [RFC v2 06/14] tcp_smc: Make SMC use TCP extra-option framework Date: Wed, 31 Jan 2018 16:07:08 -0800 Message-id: <20180201000716.69301-7-cpaasch@apple.com> X-Mailer: git-send-email 2.16.1 In-reply-to: <20180201000716.69301-1-cpaasch@apple.com> References: <20180201000716.69301-1-cpaasch@apple.com>
Adopt the extra-option framework for SMC. This allows us to remove the SMC code entirely from the TCP stack. The static key is gone, as it is now covered by the extra-option framework's static key. We allocate state (struct tcp_smc_opt) that indicates whether SMC was successfully negotiated and check this state in the relevant functions. Cc: Ursula Braun Signed-off-by: Christoph Paasch Reviewed-by: Mat Martineau --- include/linux/tcp.h | 3 +- include/net/inet_sock.h | 3 +- include/net/tcp.h | 4 - net/ipv4/tcp.c | 5 -- net/ipv4/tcp_input.c | 36 --------- net/ipv4/tcp_minisocks.c | 18 ----- net/ipv4/tcp_output.c | 54 -------------- net/smc/af_smc.c | 190 +++++++++++++++++++++++++++++++++++++++++++++-- 8 files changed, 186 insertions(+), 127 deletions(-) diff --git a/include/linux/tcp.h b/include/linux/tcp.h index 6e1f0f29bf24..0958b3760cfc 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -257,8 +257,7 @@ struct tcp_sock { syn_fastopen_ch:1, /* Active TFO re-enabling probe */ syn_data_acked:1,/* data in SYN is acked by SYN-ACK */ save_syn:1, /* Save headers of SYN packet */ - is_cwnd_limited:1,/* forward progress limited by snd_cwnd? */ - syn_smc:1; /* SYN includes SMC */ + is_cwnd_limited:1;/* forward progress limited by snd_cwnd? */ u32 tlp_high_seq; /* snd_nxt at the time of TLP retransmit.
*/ /* RTT measurement */ diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h index 0a671c32d6b9..4efa6cb14705 100644 --- a/include/net/inet_sock.h +++ b/include/net/inet_sock.h @@ -90,8 +90,7 @@ struct inet_request_sock { wscale_ok : 1, ecn_ok : 1, acked : 1, - no_srccheck: 1, - smc_ok : 1; + no_srccheck: 1; u32 ir_mark; union { struct ip_options_rcu __rcu *ireq_opt; diff --git a/include/net/tcp.h b/include/net/tcp.h index be6709e380a6..2a565883e2ef 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -2093,10 +2093,6 @@ static inline bool tcp_bpf_ca_needs_ecn(struct sock *sk) return (tcp_call_bpf(sk, BPF_SOCK_OPS_NEEDS_ECN, 0, NULL) == 1); } -#if IS_ENABLED(CONFIG_SMC) -extern struct static_key_false tcp_have_smc; -#endif - struct tcp_extopt_store; struct tcp_extopt_ops { diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index ffb5f4fbd935..f08542d91e1c 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -292,11 +292,6 @@ EXPORT_SYMBOL(sysctl_tcp_mem); atomic_long_t tcp_memory_allocated; /* Current allocated memory. */ EXPORT_SYMBOL(tcp_memory_allocated); -#if IS_ENABLED(CONFIG_SMC) -DEFINE_STATIC_KEY_FALSE(tcp_have_smc); -EXPORT_SYMBOL(tcp_have_smc); -#endif - /* * Current number of TCP sockets. */ diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 187e3fa761c8..fd2693baee4a 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -3691,24 +3691,6 @@ static void tcp_parse_fastopen_option(int len, const unsigned char *cookie, foc->exp = exp_opt; } -static int smc_parse_options(const struct tcphdr *th, - struct tcp_options_received *opt_rx, - const unsigned char *ptr, - int opsize) -{ -#if IS_ENABLED(CONFIG_SMC) - if (static_branch_unlikely(&tcp_have_smc)) { - if (th->syn && !(opsize & 1) && - opsize >= TCPOLEN_EXP_SMC_BASE && - get_unaligned_be32(ptr) == TCPOPT_SMC_MAGIC) { - opt_rx->smc_ok = 1; - return 1; - } - } -#endif - return 0; -} - /* Look for tcp options. Normally only called on SYN and SYNACK packets. 
* But, this can also be called on packets in the established flow when * the fast version below fails. @@ -3816,9 +3798,6 @@ void tcp_parse_options(const struct net *net, tcp_parse_fastopen_option(opsize - TCPOLEN_EXP_FASTOPEN_BASE, ptr + 2, th->syn, foc, true); - else if (smc_parse_options(th, opt_rx, ptr, - opsize)) - break; else if (opsize >= TCPOLEN_EXP_BASE) tcp_extopt_parse(get_unaligned_be32(ptr), opsize, ptr, skb, @@ -5595,16 +5574,6 @@ static bool tcp_rcv_fastopen_synack(struct sock *sk, struct sk_buff *synack, return false; } -static void smc_check_reset_syn(struct tcp_sock *tp) -{ -#if IS_ENABLED(CONFIG_SMC) - if (static_branch_unlikely(&tcp_have_smc)) { - if (tp->syn_smc && !tp->rx_opt.smc_ok) - tp->syn_smc = 0; - } -#endif -} - static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb, const struct tcphdr *th) { @@ -5715,8 +5684,6 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb, * is initialized. */ tp->copied_seq = tp->rcv_nxt; - smc_check_reset_syn(tp); - smp_mb(); tcp_finish_connect(sk, skb); @@ -6173,9 +6140,6 @@ static void tcp_openreq_init(struct request_sock *req, ireq->ir_rmt_port = tcp_hdr(skb)->source; ireq->ir_num = ntohs(tcp_hdr(skb)->dest); ireq->ir_mark = inet_request_mark(sk, skb); -#if IS_ENABLED(CONFIG_SMC) - ireq->smc_ok = rx_opt->smc_ok; -#endif } struct request_sock *inet_reqsk_alloc(const struct request_sock_ops *ops, diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 46eb5a33aec1..5e08dce49a00 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -435,21 +435,6 @@ void tcp_ca_openreq_child(struct sock *sk, const struct dst_entry *dst) } EXPORT_SYMBOL_GPL(tcp_ca_openreq_child); -static void smc_check_reset_syn_req(struct tcp_sock *oldtp, - struct request_sock *req, - struct tcp_sock *newtp) -{ -#if IS_ENABLED(CONFIG_SMC) - struct inet_request_sock *ireq; - - if (static_branch_unlikely(&tcp_have_smc)) { - ireq = inet_rsk(req); - if 
(oldtp->syn_smc && !ireq->smc_ok) - newtp->syn_smc = 0; - } -#endif -} - /* This is not only more efficient than what we used to do, it eliminates * a lot of code duplication between IPv4/IPv6 SYN recv processing. -DaveM * @@ -467,9 +452,6 @@ struct sock *tcp_create_openreq_child(const struct sock *sk, struct tcp_request_sock *treq = tcp_rsk(req); struct inet_connection_sock *newicsk = inet_csk(newsk); struct tcp_sock *newtp = tcp_sk(newsk); - struct tcp_sock *oldtp = tcp_sk(sk); - - smc_check_reset_syn_req(oldtp, req, newtp); /* Now setup tcp_sock */ newtp->pred_flags = 0; diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 6d418ce06b59..549e33a30b41 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -398,21 +398,6 @@ static inline bool tcp_urg_mode(const struct tcp_sock *tp) return tp->snd_una != tp->snd_up; } -static void smc_options_write(__be32 *ptr, u16 *options) -{ -#if IS_ENABLED(CONFIG_SMC) - if (static_branch_unlikely(&tcp_have_smc)) { - if (unlikely(OPTION_SMC & *options)) { - *ptr++ = htonl((TCPOPT_NOP << 24) | - (TCPOPT_NOP << 16) | - (TCPOPT_EXP << 8) | - (TCPOLEN_EXP_SMC_BASE)); - *ptr++ = htonl(TCPOPT_SMC_MAGIC); - } - } -#endif -} - /* Write previously computed TCP options to the packet. 
* * Beware: Something in the Internet is very sensitive to the ordering of @@ -527,45 +512,10 @@ static void tcp_options_write(__be32 *ptr, struct sk_buff *skb, struct sock *sk, ptr += (len + 3) >> 2; } - smc_options_write(ptr, &options); - if (unlikely(!hlist_empty(extopt_list))) tcp_extopt_write(ptr, skb, opts, sk); } -static void smc_set_option(const struct tcp_sock *tp, - struct tcp_out_options *opts, - unsigned int *remaining) -{ -#if IS_ENABLED(CONFIG_SMC) - if (static_branch_unlikely(&tcp_have_smc)) { - if (tp->syn_smc) { - if (*remaining >= TCPOLEN_EXP_SMC_BASE_ALIGNED) { - opts->options |= OPTION_SMC; - *remaining -= TCPOLEN_EXP_SMC_BASE_ALIGNED; - } - } - } -#endif -} - -static void smc_set_option_cond(const struct tcp_sock *tp, - const struct inet_request_sock *ireq, - struct tcp_out_options *opts, - unsigned int *remaining) -{ -#if IS_ENABLED(CONFIG_SMC) - if (static_branch_unlikely(&tcp_have_smc)) { - if (tp->syn_smc && ireq->smc_ok) { - if (*remaining >= TCPOLEN_EXP_SMC_BASE_ALIGNED) { - opts->options |= OPTION_SMC; - *remaining -= TCPOLEN_EXP_SMC_BASE_ALIGNED; - } - } - } -#endif -} - /* Compute TCP options for SYN packets. This is not the final * network wire format yet. 
*/ @@ -631,8 +581,6 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb, } } - smc_set_option(tp, opts, &remaining); - if (unlikely(!hlist_empty(&tp->tcp_option_list))) remaining -= tcp_extopt_prepare(skb, TCPHDR_SYN, remaining, opts, tcp_to_sk(tp)); @@ -698,8 +646,6 @@ static unsigned int tcp_synack_options(const struct sock *sk, } } - smc_set_option_cond(tcp_sk(sk), ireq, opts, &remaining); - if (unlikely(!hlist_empty(&tcp_rsk(req)->tcp_option_list))) remaining -= tcp_extopt_prepare(skb, TCPHDR_SYN | TCPHDR_ACK, remaining, opts, diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index 267e68379110..1b942a73609e 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -44,6 +44,149 @@ #include "smc_rx.h" #include "smc_close.h" +static unsigned int tcp_smc_opt_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store); +static __be32 *tcp_smc_opt_write(__be32 *ptr, struct sk_buff *skb, + struct tcp_out_options *opts, + struct sock *sk, + struct tcp_extopt_store *store); +static void tcp_smc_opt_parse(int opsize, const unsigned char *opptr, + const struct sk_buff *skb, + struct tcp_options_received *opt_rx, + struct sock *sk, + struct tcp_extopt_store *store); +static void tcp_smc_opt_post_process(struct sock *sk, + struct tcp_options_received *opt, + struct tcp_extopt_store *store); +static struct tcp_extopt_store *tcp_smc_opt_copy(struct sock *listener, + struct request_sock *req, + struct tcp_options_received *opt, + struct tcp_extopt_store *store); +static void tcp_smc_opt_destroy(struct tcp_extopt_store *store); + +struct tcp_smc_opt { + struct tcp_extopt_store store; + int smc_ok:1; /* SMC supported on this connection */ + struct rcu_head rcu; +}; + +static const struct tcp_extopt_ops tcp_smc_extra_ops = { + .option_kind = TCPOPT_SMC_MAGIC, + .parse = tcp_smc_opt_parse, + .post_process = tcp_smc_opt_post_process, + .prepare = 
tcp_smc_opt_prepare, + .write = tcp_smc_opt_write, + .copy = tcp_smc_opt_copy, + .destroy = tcp_smc_opt_destroy, + .owner = THIS_MODULE, +}; + +static struct tcp_smc_opt *tcp_extopt_to_smc(struct tcp_extopt_store *store) +{ + return container_of(store, struct tcp_smc_opt, store); +} + +static struct tcp_smc_opt *tcp_smc_opt_find(struct sock *sk) +{ + struct tcp_extopt_store *ext_opt; + + ext_opt = tcp_extopt_find_kind(TCPOPT_SMC_MAGIC, sk); + + return tcp_extopt_to_smc(ext_opt); +} + +static unsigned int tcp_smc_opt_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store) +{ + if (!(flags & TCPHDR_SYN)) + return 0; + + if (remaining >= TCPOLEN_EXP_SMC_BASE_ALIGNED) { + opts->options |= OPTION_SMC; + return TCPOLEN_EXP_SMC_BASE_ALIGNED; + } + + return 0; +} + +static __be32 *tcp_smc_opt_write(__be32 *ptr, struct sk_buff *skb, + struct tcp_out_options *opts, + struct sock *sk, + struct tcp_extopt_store *store) +{ + if (unlikely(OPTION_SMC & opts->options)) { + *ptr++ = htonl((TCPOPT_NOP << 24) | + (TCPOPT_NOP << 16) | + (TCPOPT_EXP << 8) | + (TCPOLEN_EXP_SMC_BASE)); + *ptr++ = htonl(TCPOPT_SMC_MAGIC); + } + + return ptr; +} + +static void tcp_smc_opt_parse(int opsize, const unsigned char *opptr, + const struct sk_buff *skb, + struct tcp_options_received *opt_rx, + struct sock *sk, + struct tcp_extopt_store *store) +{ + struct tcphdr *th = tcp_hdr(skb); + + if (th->syn && !(opsize & 1) && opsize >= TCPOLEN_EXP_SMC_BASE) + opt_rx->smc_ok = 1; +} + +static void tcp_smc_opt_post_process(struct sock *sk, + struct tcp_options_received *opt, + struct tcp_extopt_store *store) +{ + struct tcp_smc_opt *smc_opt = tcp_extopt_to_smc(store); + + if (sk->sk_state != TCP_SYN_SENT) + return; + + if (opt->smc_ok) + smc_opt->smc_ok = 1; + else + smc_opt->smc_ok = 0; +} + +static struct tcp_extopt_store *tcp_smc_opt_copy(struct sock *listener, + struct request_sock *req, + struct 
tcp_options_received *opt, + struct tcp_extopt_store *store) +{ + struct tcp_smc_opt *smc_opt; + + /* First, check if the peer sent us the smc-opt */ + if (!opt->smc_ok) + return NULL; + + smc_opt = kzalloc(sizeof(*smc_opt), GFP_ATOMIC); + if (!smc_opt) + return NULL; + + smc_opt->store.ops = &tcp_smc_extra_ops; + + smc_opt->smc_ok = 1; + + return (struct tcp_extopt_store *)smc_opt; +} + +static void tcp_smc_opt_destroy(struct tcp_extopt_store *store) +{ + struct tcp_smc_opt *smc_opt = tcp_extopt_to_smc(store); + + kfree_rcu(smc_opt, rcu); +} + static DEFINE_MUTEX(smc_create_lgr_pending); /* serialize link group * creation */ @@ -389,6 +532,7 @@ static int smc_connect_rdma(struct smc_sock *smc) struct smc_clc_msg_accept_confirm aclc; int local_contact = SMC_FIRST_CONTACT; struct smc_ib_device *smcibdev; + struct tcp_smc_opt *smc_opt; struct smc_link *link; u8 srv_first_contact; int reason_code = 0; @@ -397,7 +541,8 @@ static int smc_connect_rdma(struct smc_sock *smc) sock_hold(&smc->sk); /* sock put in passive closing */ - if (!tcp_sk(smc->clcsock->sk)->syn_smc) { + smc_opt = tcp_smc_opt_find(smc->clcsock->sk); + if (!smc_opt || !smc_opt->smc_ok) { /* peer has not signalled SMC-capability */ smc->use_fallback = true; goto out_connected; @@ -548,6 +693,7 @@ static int smc_connect_rdma(struct smc_sock *smc) static int smc_connect(struct socket *sock, struct sockaddr *addr, int alen, int flags) { + struct tcp_smc_opt *smc_opt; struct sock *sk = sock->sk; struct smc_sock *smc; int rc = -EINVAL; @@ -561,9 +707,17 @@ static int smc_connect(struct socket *sock, struct sockaddr *addr, goto out_err; smc->addr = addr; /* needed for nonblocking connect */ + smc_opt = kzalloc(sizeof(*smc_opt), GFP_KERNEL); + if (!smc_opt) { + rc = -ENOMEM; + goto out_err; + } + smc_opt->store.ops = &tcp_smc_extra_ops; + lock_sock(sk); switch (sk->sk_state) { default: + rc = -EINVAL; goto out; case SMC_ACTIVE: rc = -EISCONN; @@ -573,8 +727,15 @@ static int smc_connect(struct socket *sock, 
struct sockaddr *addr, break; } + /* We are the only owner of smc->clcsock->sk, so we can be lockless */ + rc = tcp_register_extopt(&smc_opt->store, smc->clcsock->sk); + if (rc) { + release_sock(smc->clcsock->sk); + kfree(smc_opt); + goto out_err; + } + smc_copy_sock_settings_to_clc(smc); - tcp_sk(smc->clcsock->sk)->syn_smc = 1; rc = kernel_connect(smc->clcsock, addr, alen, flags); if (rc) goto out; @@ -768,6 +929,7 @@ static void smc_listen_work(struct work_struct *work) struct smc_clc_msg_proposal *pclc; struct smc_ib_device *smcibdev; struct sockaddr_in peeraddr; + struct tcp_smc_opt *smc_opt; u8 buf[SMC_CLC_MAX_LEN]; struct smc_link *link; int reason_code = 0; @@ -777,7 +939,8 @@ static void smc_listen_work(struct work_struct *work) u8 ibport; /* check if peer is smc capable */ - if (!tcp_sk(newclcsock->sk)->syn_smc) { + smc_opt = tcp_smc_opt_find(newclcsock->sk); + if (!smc_opt || !smc_opt->smc_ok) { new_smc->use_fallback = true; goto out_connected; } @@ -987,10 +1150,18 @@ static void smc_tcp_listen_work(struct work_struct *work) static int smc_listen(struct socket *sock, int backlog) { + struct tcp_smc_opt *smc_opt; struct sock *sk = sock->sk; struct smc_sock *smc; int rc; + smc_opt = kzalloc(sizeof(*smc_opt), GFP_KERNEL); + if (!smc_opt) { + rc = -ENOMEM; + goto out_err; + } + smc_opt->store.ops = &tcp_smc_extra_ops; + smc = smc_sk(sk); lock_sock(sk); @@ -1003,11 +1174,19 @@ static int smc_listen(struct socket *sock, int backlog) sk->sk_max_ack_backlog = backlog; goto out; } + + /* We are the only owner of smc->clcsock->sk, so we can be lockless */ + rc = tcp_register_extopt(&smc_opt->store, smc->clcsock->sk); + if (rc) { + release_sock(smc->clcsock->sk); + kfree(smc_opt); + goto out_err; + } + /* some socket options are handled in core, so we could not apply * them to the clc socket -- copy smc socket options to clc socket */ smc_copy_sock_settings_to_clc(smc); - tcp_sk(smc->clcsock->sk)->syn_smc = 1; rc = kernel_listen(smc->clcsock, backlog); if (rc) @@ 
-1022,6 +1201,7 @@ static int smc_listen(struct socket *sock, int backlog) out: release_sock(sk); +out_err: return rc; } @@ -1460,7 +1640,6 @@ static int __init smc_init(void) goto out_sock; } - static_branch_enable(&tcp_have_smc); return 0; out_sock: @@ -1485,7 +1664,6 @@ static void __exit smc_exit(void) list_del_init(&lgr->list); smc_lgr_free(lgr); /* free link group */ } - static_branch_disable(&tcp_have_smc); smc_ib_unregister_client(); sock_unregister(PF_SMC); proto_unregister(&smc_proto); From patchwork Thu Feb 1 00:07:09 2018 X-Patchwork-Submitter: Christoph Paasch X-Patchwork-Id: 868103
From: Christoph Paasch To: netdev@vger.kernel.org Cc: Eric Dumazet , Mat Martineau , Ivan Delalande Subject: [RFC v2 07/14] tcp_md5: Don't pass along md5-key Date: Wed, 31 Jan 2018 16:07:09 -0800 Message-id: <20180201000716.69301-8-cpaasch@apple.com> X-Mailer: git-send-email 2.16.1 In-reply-to: <20180201000716.69301-1-cpaasch@apple.com> References: <20180201000716.69301-1-cpaasch@apple.com>
It is much cleaner to store the key pointer in tcp_out_options. It allows us to remove some MD5-specific code from the function arguments and paves the way to adopting the TCP-option framework for TCP-MD5.
Cc: Ivan Delalande Signed-off-by: Christoph Paasch Reviewed-by: Mat Martineau --- include/linux/tcp.h | 1 + net/ipv4/tcp_output.c | 46 +++++++++++++++++++--------------------------- 2 files changed, 20 insertions(+), 27 deletions(-) diff --git a/include/linux/tcp.h b/include/linux/tcp.h index 0958b3760cfc..ef0279194ef9 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -131,6 +131,7 @@ struct tcp_out_options { __u8 *hash_location; /* temporary pointer, overloaded */ __u32 tsval, tsecr; /* need to include OPTION_TS */ struct tcp_fastopen_cookie *fastopen_cookie; /* Fast open cookie */ + struct tcp_md5sig_key *md5; /* TCP_MD5 signature key */ }; /* This is the max number of SACKS that we'll generate and process. It's safe diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 549e33a30b41..facbdf4fe9be 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -520,21 +520,18 @@ static void tcp_options_write(__be32 *ptr, struct sk_buff *skb, struct sock *sk, * network wire format yet. */ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb, - struct tcp_out_options *opts, - struct tcp_md5sig_key **md5) + struct tcp_out_options *opts) { struct tcp_sock *tp = tcp_sk(sk); unsigned int remaining = MAX_TCP_OPTION_SPACE; struct tcp_fastopen_request *fastopen = tp->fastopen_req; #ifdef CONFIG_TCP_MD5SIG - *md5 = tp->af_specific->md5_lookup(sk, sk); - if (*md5) { + opts->md5 = tp->af_specific->md5_lookup(sk, sk); + if (opts->md5) { opts->options |= OPTION_MD5; remaining -= TCPOLEN_MD5SIG_ALIGNED; } -#else - *md5 = NULL; #endif /* We always get an MSS option. 
The option bytes which will be seen in @@ -549,7 +546,7 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb, opts->mss = tcp_advertise_mss(sk); remaining -= TCPOLEN_MSS_ALIGNED; - if (likely(sock_net(sk)->ipv4.sysctl_tcp_timestamps && !*md5)) { + if (likely(sock_net(sk)->ipv4.sysctl_tcp_timestamps && !opts->md5)) { opts->options |= OPTION_TS; opts->tsval = tcp_skb_timestamp(skb) + tp->tsoffset; opts->tsecr = tp->rx_opt.ts_recent; @@ -593,14 +590,13 @@ static unsigned int tcp_synack_options(const struct sock *sk, struct request_sock *req, unsigned int mss, struct sk_buff *skb, struct tcp_out_options *opts, - const struct tcp_md5sig_key *md5, struct tcp_fastopen_cookie *foc) { struct inet_request_sock *ireq = inet_rsk(req); unsigned int remaining = MAX_TCP_OPTION_SPACE; #ifdef CONFIG_TCP_MD5SIG - if (md5) { + if (opts->md5) { opts->options |= OPTION_MD5; remaining -= TCPOLEN_MD5SIG_ALIGNED; @@ -658,8 +654,7 @@ static unsigned int tcp_synack_options(const struct sock *sk, * final wire format yet. 
*/ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb, - struct tcp_out_options *opts, - struct tcp_md5sig_key **md5) + struct tcp_out_options *opts) { struct tcp_sock *tp = tcp_sk(sk); unsigned int size = 0; @@ -668,13 +663,13 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb opts->options = 0; #ifdef CONFIG_TCP_MD5SIG - *md5 = tp->af_specific->md5_lookup(sk, sk); - if (unlikely(*md5)) { + opts->md5 = tp->af_specific->md5_lookup(sk, sk); + if (unlikely(opts->md5)) { opts->options |= OPTION_MD5; size += TCPOLEN_MD5SIG_ALIGNED; } #else - *md5 = NULL; + opts->md5 = NULL; #endif if (likely(tp->rx_opt.tstamp_ok)) { @@ -992,7 +987,6 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it, struct tcp_out_options opts; unsigned int tcp_options_size, tcp_header_size; struct sk_buff *oskb = NULL; - struct tcp_md5sig_key *md5; struct tcphdr *th; int err; @@ -1021,10 +1015,9 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it, memset(&opts, 0, sizeof(opts)); if (unlikely(tcb->tcp_flags & TCPHDR_SYN)) - tcp_options_size = tcp_syn_options(sk, skb, &opts, &md5); + tcp_options_size = tcp_syn_options(sk, skb, &opts); else - tcp_options_size = tcp_established_options(sk, skb, &opts, - &md5); + tcp_options_size = tcp_established_options(sk, skb, &opts); tcp_header_size = tcp_options_size + sizeof(struct tcphdr); /* if no packet is in qdisc/device queue, then allow XPS to select @@ -1090,10 +1083,10 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it, tcp_options_write((__be32 *)(th + 1), skb, sk, &opts); #ifdef CONFIG_TCP_MD5SIG /* Calculate the MD5 hash, as we have all we need now */ - if (md5) { + if (opts.md5) { sk_nocaps_add(sk, NETIF_F_GSO_MASK); tp->af_specific->calc_md5_hash(opts.hash_location, - md5, sk, skb); + opts.md5, sk, skb); } #endif @@ -1537,7 +1530,6 @@ unsigned int tcp_current_mss(struct sock *sk) u32 mss_now; 
unsigned int header_len; struct tcp_out_options opts; - struct tcp_md5sig_key *md5; mss_now = tp->mss_cache; @@ -1547,7 +1539,7 @@ unsigned int tcp_current_mss(struct sock *sk) mss_now = tcp_sync_mss(sk, mtu); } - header_len = tcp_established_options(sk, NULL, &opts, &md5) + + header_len = tcp_established_options(sk, NULL, &opts) + sizeof(struct tcphdr); /* The mss_cache is sized based on tp->tcp_header_len, which assumes * some common options. If this is an odd packet (because we have SACK @@ -3128,7 +3120,6 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst, { struct inet_request_sock *ireq = inet_rsk(req); const struct tcp_sock *tp = tcp_sk(sk); - struct tcp_md5sig_key *md5 = NULL; struct tcp_out_options opts; struct sk_buff *skb; int tcp_header_size; @@ -3174,10 +3165,10 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst, #ifdef CONFIG_TCP_MD5SIG rcu_read_lock(); - md5 = tcp_rsk(req)->af_specific->req_md5_lookup(sk, req_to_sk(req)); + opts.md5 = tcp_rsk(req)->af_specific->req_md5_lookup(sk, req_to_sk(req)); #endif skb_set_hash(skb, tcp_rsk(req)->txhash, PKT_HASH_TYPE_L4); - tcp_header_size = tcp_synack_options(sk, req, mss, skb, &opts, md5, + tcp_header_size = tcp_synack_options(sk, req, mss, skb, &opts, foc) + sizeof(*th); skb_push(skb, tcp_header_size); @@ -3204,9 +3195,10 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst, #ifdef CONFIG_TCP_MD5SIG /* Okay, we have all we need - do the md5 hash if needed */ - if (md5) + if (opts.md5) tcp_rsk(req)->af_specific->calc_md5_hash(opts.hash_location, - md5, req_to_sk(req), skb); + opts.md5, + req_to_sk(req), skb); rcu_read_unlock(); #endif
From patchwork Thu Feb 1 00:07:10 2018
X-Patchwork-Submitter: Christoph Paasch
X-Patchwork-Id: 868107
From: Christoph Paasch
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Ivan Delalande
Subject: [RFC v2 08/14] tcp_md5: Detect key inside tcp_v4_send_ack instead of passing it as an argument
Date: Wed, 31 Jan 2018 16:07:10 -0800
Message-id: <20180201000716.69301-9-cpaasch@apple.com>
This will make it simpler to consolidate the TCP_MD5 code into a single place.

Cc: Ivan Delalande Signed-off-by: Christoph Paasch Reviewed-by: Mat Martineau --- net/ipv4/tcp_ipv4.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 4112594d04be..4211f8e38ef9 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -764,7 +764,6 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) static void tcp_v4_send_ack(const struct sock *sk, struct sk_buff *skb, u32 seq, u32 ack, u32 win, u32 tsval, u32 tsecr, int oif, - struct tcp_md5sig_key *key, int reply_flags, u8 tos) { const struct tcphdr *th = tcp_hdr(skb); @@ -773,6 +772,9 @@ static void tcp_v4_send_ack(const struct sock *sk, __be32 opt[(MAX_TCP_OPTION_SPACE >> 2)]; } rep; struct hlist_head *extopt_list = NULL; +#ifdef CONFIG_TCP_MD5SIG + struct tcp_md5sig_key *key; +#endif struct net *net = sock_net(sk); struct ip_reply_arg arg; int offset = 0; @@ -803,6 +805,17 @@ static void tcp_v4_send_ack(const struct sock *sk, rep.th.ack = 1; rep.th.window = htons(win); +#ifdef CONFIG_TCP_MD5SIG + if (sk->sk_state == TCP_TIME_WAIT) { + key = tcp_twsk_md5_key(tcp_twsk(sk)); + } else if
(sk->sk_state == TCP_NEW_SYN_RECV) { + key = tcp_md5_do_lookup(sk, (union tcp_md5_addr *)&ip_hdr(skb)->saddr, + AF_INET); + } else { + key = NULL; /* Should not happen */ + } +#endif + if (unlikely(extopt_list && !hlist_empty(extopt_list))) { unsigned int remaining; struct tcp_out_options opts; @@ -872,7 +885,6 @@ static void tcp_v4_timewait_ack(struct sock *sk, struct sk_buff *skb) tcp_time_stamp_raw() + tcptw->tw_ts_offset, tcptw->tw_ts_recent, tw->tw_bound_dev_if, - tcp_twsk_md5_key(tcptw), tw->tw_transparent ? IP_REPLY_ARG_NOSRCCHECK : 0, tw->tw_tos ); @@ -900,8 +912,6 @@ static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, tcp_time_stamp_raw() + tcp_rsk(req)->ts_off, req->ts_recent, 0, - tcp_md5_do_lookup(sk, (union tcp_md5_addr *)&ip_hdr(skb)->saddr, - AF_INET), inet_rsk(req)->no_srccheck ? IP_REPLY_ARG_NOSRCCHECK : 0, ip_hdr(skb)->tos); }
From patchwork Thu Feb 1 00:07:11 2018
X-Patchwork-Submitter: Christoph Paasch
X-Patchwork-Id: 868105
From: Christoph Paasch
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Ivan Delalande
Subject: [RFC v2 09/14] tcp_md5: Detect key inside tcp_v6_send_response instead of passing it as an argument
Date: Wed, 31 Jan 2018 16:07:11 -0800
Message-id: <20180201000716.69301-10-cpaasch@apple.com>

We want to move all the TCP-MD5 code to a single place which enables us to factor
the TCP-MD5 code out of the TCP-stack into the extra-option framework. Detection of whether or not to drop the segment (as done in tcp_v6_send_reset()) has now been moved to tcp_v6_send_response(). So we needed to adapt the latter so that it can handle the case where we want to exit without sending anything. Cc: Ivan Delalande Signed-off-by: Christoph Paasch Reviewed-by: Mat Martineau --- net/ipv6/tcp_ipv6.c | 119 +++++++++++++++++++++++++--------------------------- 1 file changed, 57 insertions(+), 62 deletions(-) diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 202bf011f462..8c6d0362299e 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -82,12 +82,6 @@ static const struct inet_connection_sock_af_ops ipv6_specific; #ifdef CONFIG_TCP_MD5SIG static const struct tcp_sock_af_ops tcp_sock_ipv6_specific; static const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific; -#else -static struct tcp_md5sig_key *tcp_v6_md5_do_lookup(const struct sock *sk, - const struct in6_addr *addr) -{ - return NULL; -} #endif static void inet6_sk_rx_dst_set(struct sock *sk, const struct sk_buff *skb) @@ -779,12 +773,11 @@ static const struct tcp_request_sock_ops tcp_request_sock_ipv6_ops = { static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 seq, u32 ack, u32 win, u32 tsval, u32 tsecr, - int oif, struct tcp_md5sig_key *key, int rst, - u8 tclass, __be32 label) + int oif, int rst, u8 tclass, __be32 label) { const struct tcphdr *th = tcp_hdr(skb); struct tcphdr *t1; - struct sk_buff *buff; + struct sk_buff *buff = NULL; struct flowi6 fl6; struct net *net = sk ? 
sock_net(sk) : dev_net(skb_dst(skb)->dev); struct sock *ctl_sk = net->ipv6.tcp_sk; @@ -793,10 +786,54 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 __be32 *topt; struct hlist_head *extopt_list = NULL; struct tcp_out_options extraopts; +#ifdef CONFIG_TCP_MD5SIG + struct tcp_md5sig_key *key = NULL; + const __u8 *hash_location = NULL; + struct ipv6hdr *ipv6h = ipv6_hdr(skb); +#endif if (tsecr) tot_len += TCPOLEN_TSTAMP_ALIGNED; #ifdef CONFIG_TCP_MD5SIG + rcu_read_lock(); + hash_location = tcp_parse_md5sig_option(th); + if (sk && sk_fullsock(sk)) { + key = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr); + } else if (sk && sk->sk_state == TCP_TIME_WAIT) { + struct tcp_timewait_sock *tcptw = tcp_twsk(sk); + + key = tcp_twsk_md5_key(tcptw); + } else if (sk && sk->sk_state == TCP_NEW_SYN_RECV) { + key = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr); + } else if (hash_location) { + unsigned char newhash[16]; + struct sock *sk1 = NULL; + int genhash; + + /* active side is lost. Try to find listening socket through + * source port, and then find md5 key through listening socket. + * we are not loose security here: + * Incoming packet is checked with md5 hash with finding key, + * no RST generated if md5 hash doesn't match. 
+ */ + sk1 = inet6_lookup_listener(dev_net(skb_dst(skb)->dev), + &tcp_hashinfo, NULL, 0, + &ipv6h->saddr, + th->source, &ipv6h->daddr, + ntohs(th->source), tcp_v6_iif(skb), + tcp_v6_sdif(skb)); + if (!sk1) + goto out; + + key = tcp_v6_md5_do_lookup(sk1, &ipv6h->saddr); + if (!key) + goto out; + + genhash = tcp_v6_md5_hash_skb(newhash, key, NULL, skb); + if (genhash || memcmp(hash_location, newhash, 16) != 0) + goto out; + } + if (key) tot_len += TCPOLEN_MD5SIG_ALIGNED; #endif @@ -823,7 +860,7 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 buff = alloc_skb(MAX_HEADER + sizeof(struct ipv6hdr) + tot_len, GFP_ATOMIC); if (!buff) - return; + goto out; skb_reserve(buff, MAX_HEADER + sizeof(struct ipv6hdr) + tot_len); @@ -900,24 +937,21 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 TCP_INC_STATS(net, TCP_MIB_OUTSEGS); if (rst) TCP_INC_STATS(net, TCP_MIB_OUTRSTS); - return; + buff = NULL; } +out: kfree_skb(buff); + +#ifdef CONFIG_TCP_MD5SIG + rcu_read_unlock(); +#endif } static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) { const struct tcphdr *th = tcp_hdr(skb); u32 seq = 0, ack_seq = 0; - struct tcp_md5sig_key *key = NULL; -#ifdef CONFIG_TCP_MD5SIG - const __u8 *hash_location = NULL; - struct ipv6hdr *ipv6h = ipv6_hdr(skb); - unsigned char newhash[16]; - int genhash; - struct sock *sk1 = NULL; -#endif int oif = 0; if (th->rst) @@ -929,38 +963,6 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) if (!sk && !ipv6_unicast_destination(skb)) return; -#ifdef CONFIG_TCP_MD5SIG - rcu_read_lock(); - hash_location = tcp_parse_md5sig_option(th); - if (sk && sk_fullsock(sk)) { - key = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr); - } else if (hash_location) { - /* - * active side is lost. Try to find listening socket through - * source port, and then find md5 key through listening socket. 
- * we are not loose security here: - * Incoming packet is checked with md5 hash with finding key, - * no RST generated if md5 hash doesn't match. - */ - sk1 = inet6_lookup_listener(dev_net(skb_dst(skb)->dev), - &tcp_hashinfo, NULL, 0, - &ipv6h->saddr, - th->source, &ipv6h->daddr, - ntohs(th->source), tcp_v6_iif(skb), - tcp_v6_sdif(skb)); - if (!sk1) - goto out; - - key = tcp_v6_md5_do_lookup(sk1, &ipv6h->saddr); - if (!key) - goto out; - - genhash = tcp_v6_md5_hash_skb(newhash, key, NULL, skb); - if (genhash || memcmp(hash_location, newhash, 16) != 0) - goto out; - } -#endif - if (th->ack) seq = ntohl(th->ack_seq); else @@ -972,20 +974,14 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) trace_tcp_send_reset(sk, skb); } - tcp_v6_send_response(sk, skb, seq, ack_seq, 0, 0, 0, oif, key, 1, 0, 0); - -#ifdef CONFIG_TCP_MD5SIG -out: - rcu_read_unlock(); -#endif + tcp_v6_send_response(sk, skb, seq, ack_seq, 0, 0, 0, oif, 1, 0, 0); } static void tcp_v6_send_ack(const struct sock *sk, struct sk_buff *skb, u32 seq, u32 ack, u32 win, u32 tsval, u32 tsecr, int oif, - struct tcp_md5sig_key *key, u8 tclass, - __be32 label) + u8 tclass, __be32 label) { - tcp_v6_send_response(sk, skb, seq, ack, win, tsval, tsecr, oif, key, 0, + tcp_v6_send_response(sk, skb, seq, ack, win, tsval, tsecr, oif, 0, tclass, label); } @@ -997,7 +993,7 @@ static void tcp_v6_timewait_ack(struct sock *sk, struct sk_buff *skb) tcp_v6_send_ack(sk, skb, tcptw->tw_snd_nxt, tcptw->tw_rcv_nxt, tcptw->tw_rcv_wnd >> tw->tw_rcv_wscale, tcp_time_stamp_raw() + tcptw->tw_ts_offset, - tcptw->tw_ts_recent, tw->tw_bound_dev_if, tcp_twsk_md5_key(tcptw), + tcptw->tw_ts_recent, tw->tw_bound_dev_if, tw->tw_tclass, cpu_to_be32(tw->tw_flowlabel)); inet_twsk_put(tw); @@ -1020,7 +1016,6 @@ static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, req->rsk_rcv_wnd >> inet_rsk(req)->rcv_wscale, tcp_time_stamp_raw() + tcp_rsk(req)->ts_off, req->ts_recent, sk->sk_bound_dev_if, - 
tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->saddr), 0, 0); }
From patchwork Thu Feb 1 00:07:12 2018
X-Patchwork-Submitter: Christoph Paasch
X-Patchwork-Id: 868106
From: Christoph Paasch
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Ivan Delalande
Subject: [RFC v2 10/14] tcp_md5: Check for TCP_MD5 after TCP Timestamps in tcp_established_options
Date: Wed, 31 Jan 2018 16:07:12 -0800
Message-id: <20180201000716.69301-11-cpaasch@apple.com>
The ordering really does not matter, because we never use TCP timestamps when TCP_MD5 is enabled (see tcp_syn_options). Moving the TCP_MD5 check a bit lower allows for easier adoption of the tcp_extra_option framework.

Cc: Ivan Delalande Signed-off-by: Christoph Paasch Reviewed-by: Mat Martineau --- net/ipv4/tcp_output.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index facbdf4fe9be..97e6aecc03eb 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -662,6 +662,13 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb opts->options = 0; + if (likely(tp->rx_opt.tstamp_ok)) { + opts->options |= OPTION_TS; + opts->tsval = skb ?
tcp_skb_timestamp(skb) + tp->tsoffset : 0; + opts->tsecr = tp->rx_opt.ts_recent; + size += TCPOLEN_TSTAMP_ALIGNED; + } + #ifdef CONFIG_TCP_MD5SIG opts->md5 = tp->af_specific->md5_lookup(sk, sk); if (unlikely(opts->md5)) { @@ -672,13 +679,6 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb opts->md5 = NULL; #endif - if (likely(tp->rx_opt.tstamp_ok)) { - opts->options |= OPTION_TS; - opts->tsval = skb ? tcp_skb_timestamp(skb) + tp->tsoffset : 0; - opts->tsecr = tp->rx_opt.ts_recent; - size += TCPOLEN_TSTAMP_ALIGNED; - } - if (unlikely(!hlist_empty(&tp->tcp_option_list))) size += tcp_extopt_prepare(skb, 0, MAX_TCP_OPTION_SPACE - size, opts, tcp_to_sk(tp));
From patchwork Thu Feb 1 00:07:13 2018
X-Patchwork-Submitter: Christoph Paasch
X-Patchwork-Id: 868104
From: Christoph Paasch
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Ivan Delalande
Subject: [RFC v2 11/14] tcp_md5: Move TCP-MD5 code out of TCP itself
Date: Wed, 31 Jan 2018 16:07:13 -0800
Message-id: <20180201000716.69301-12-cpaasch@apple.com>

This is all just copy-pasting of the TCP_MD5 code into functions that are placed in net/ipv4/tcp_md5.c.
Cc: Ivan Delalande
Signed-off-by: Christoph Paasch
Reviewed-by: Mat Martineau
---

Notes:
    v2: * Add SPDX-identifier (Mat Martineau's feedback)

 include/linux/inet_diag.h |    1 +
 include/linux/tcp_md5.h   |  137 ++++
 include/net/tcp.h         |   77 ----
 net/ipv4/Makefile         |    1 +
 net/ipv4/tcp.c            |  133 +----
 net/ipv4/tcp_diag.c       |   81 +---
 net/ipv4/tcp_input.c      |   38 --
 net/ipv4/tcp_ipv4.c       |  520 ++-------
 net/ipv4/tcp_md5.c        | 1103 +++++++++++++++++++++++++++++++++++++++++++++
 net/ipv4/tcp_minisocks.c  |   23 +-
 net/ipv4/tcp_output.c     |    4 +-
 net/ipv6/tcp_ipv6.c       |  318 +------
 12 files changed, 1304 insertions(+), 1132 deletions(-)
 create mode 100644 include/linux/tcp_md5.h
 create mode 100644 net/ipv4/tcp_md5.c

diff --git a/include/linux/inet_diag.h b/include/linux/inet_diag.h
index 39faaaf843e1..1ef6727e41c9 100644
--- a/include/linux/inet_diag.h
+++ b/include/linux/inet_diag.h
@@ -2,6 +2,7 @@
 #ifndef _INET_DIAG_H_
 #define _INET_DIAG_H_ 1
 
+#include
 #include
 
 struct net;
diff --git a/include/linux/tcp_md5.h b/include/linux/tcp_md5.h
new file mode 100644
index 000000000000..d4a2175030d0
--- /dev/null
+++ b/include/linux/tcp_md5.h
@@ -0,0 +1,137 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_TCP_MD5_H
+#define _LINUX_TCP_MD5_H
+
+#include
+
+#ifdef CONFIG_TCP_MD5SIG
+#include
+
+#include
+
+union tcp_md5_addr {
+	struct in_addr a4;
+#if IS_ENABLED(CONFIG_IPV6)
+	struct in6_addr a6;
+#endif
+};
+
+/* - key database */
+struct tcp_md5sig_key {
+	struct hlist_node node;
+	u8 keylen;
+	u8 family; /* AF_INET or AF_INET6 */
+	union tcp_md5_addr addr;
+	u8 prefixlen;
+	u8 key[TCP_MD5SIG_MAXKEYLEN];
+	struct rcu_head rcu;
+};
+
+/* - sock block */
+struct tcp_md5sig_info {
+	struct hlist_head head;
+	struct rcu_head rcu;
+};
+
+union tcp_md5sum_block {
+	struct tcp4_pseudohdr ip4;
+#if IS_ENABLED(CONFIG_IPV6)
+	struct tcp6_pseudohdr ip6;
+#endif
+};
+
+/* - pool: digest algorithm, hash description and scratch buffer */
+struct tcp_md5sig_pool {
+	struct ahash_request
*md5_req; + void *scratch; +}; + +extern const struct tcp_sock_af_ops tcp_sock_ipv4_specific; +extern const struct tcp_sock_af_ops tcp_sock_ipv6_specific; +extern const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific; + +/* - functions */ +int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key, + const struct sock *sk, const struct sk_buff *skb); + +struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk, + const struct sock *addr_sk); + +void tcp_v4_md5_destroy_sock(struct sock *sk); + +int tcp_v4_md5_send_response_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk); + +void tcp_v4_md5_send_response_write(__be32 *topt, struct sk_buff *skb, + struct tcphdr *t1, + struct tcp_out_options *opts, + const struct sock *sk); + +int tcp_v6_md5_send_response_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk); + +void tcp_v6_md5_send_response_write(__be32 *topt, struct sk_buff *skb, + struct tcphdr *t1, + struct tcp_out_options *opts, + const struct sock *sk); + +bool tcp_v4_inbound_md5_hash(const struct sock *sk, + const struct sk_buff *skb); + +void tcp_v4_md5_syn_recv_sock(const struct sock *listener, struct sock *sk); + +void tcp_v6_md5_syn_recv_sock(const struct sock *listener, struct sock *sk); + +void tcp_md5_time_wait(struct sock *sk, struct inet_timewait_sock *tw); + +struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk, + const struct sock *addr_sk); + +int tcp_v6_md5_hash_skb(char *md5_hash, + const struct tcp_md5sig_key *key, + const struct sock *sk, + const struct sk_buff *skb); + +bool tcp_v6_inbound_md5_hash(const struct sock *sk, + const struct sk_buff *skb); + +static inline void tcp_md5_twsk_destructor(struct tcp_timewait_sock *twsk) +{ + if (twsk->tw_md5_key) + kfree_rcu(twsk->tw_md5_key, rcu); +} + +static inline void tcp_md5_add_header_len(const struct sock *listener, + struct 
sock *sk) +{ + struct tcp_sock *tp = tcp_sk(sk); + + if (tp->af_specific->md5_lookup(listener, sk)) + tp->tcp_header_len += TCPOLEN_MD5SIG_ALIGNED; +} + +int tcp_md5_diag_get_aux(struct sock *sk, bool net_admin, struct sk_buff *skb); + +int tcp_md5_diag_get_aux_size(struct sock *sk, bool net_admin); + +#else + +static inline bool tcp_v4_inbound_md5_hash(const struct sock *sk, + const struct sk_buff *skb) +{ + return false; +} + +static inline bool tcp_v6_inbound_md5_hash(const struct sock *sk, + const struct sk_buff *skb) +{ + return false; +} + +#endif + +#endif /* _LINUX_TCP_MD5_H */ diff --git a/include/net/tcp.h b/include/net/tcp.h index 2a565883e2ef..d2738cb01cf2 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -406,7 +406,6 @@ void tcp_parse_options(const struct net *net, const struct sk_buff *skb, struct tcp_options_received *opt_rx, int estab, struct tcp_fastopen_cookie *foc, struct sock *sk); -const u8 *tcp_parse_md5sig_option(const struct tcphdr *th); /* * TCP v4 functions exported for the inet6 API @@ -1416,30 +1415,6 @@ static inline void tcp_clear_all_retrans_hints(struct tcp_sock *tp) tp->retransmit_skb_hint = NULL; } -union tcp_md5_addr { - struct in_addr a4; -#if IS_ENABLED(CONFIG_IPV6) - struct in6_addr a6; -#endif -}; - -/* - key database */ -struct tcp_md5sig_key { - struct hlist_node node; - u8 keylen; - u8 family; /* AF_INET or AF_INET6 */ - union tcp_md5_addr addr; - u8 prefixlen; - u8 key[TCP_MD5SIG_MAXKEYLEN]; - struct rcu_head rcu; -}; - -/* - sock block */ -struct tcp_md5sig_info { - struct hlist_head head; - struct rcu_head rcu; -}; - /* - pseudo header */ struct tcp4_pseudohdr { __be32 saddr; @@ -1456,58 +1431,6 @@ struct tcp6_pseudohdr { __be32 protocol; /* including padding */ }; -union tcp_md5sum_block { - struct tcp4_pseudohdr ip4; -#if IS_ENABLED(CONFIG_IPV6) - struct tcp6_pseudohdr ip6; -#endif -}; - -/* - pool: digest algorithm, hash description and scratch buffer */ -struct tcp_md5sig_pool { - struct ahash_request 
*md5_req; - void *scratch; -}; - -/* - functions */ -int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key, - const struct sock *sk, const struct sk_buff *skb); -int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr, - int family, u8 prefixlen, const u8 *newkey, u8 newkeylen, - gfp_t gfp); -int tcp_md5_do_del(struct sock *sk, const union tcp_md5_addr *addr, - int family, u8 prefixlen); -struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk, - const struct sock *addr_sk); - -#ifdef CONFIG_TCP_MD5SIG -struct tcp_md5sig_key *tcp_md5_do_lookup(const struct sock *sk, - const union tcp_md5_addr *addr, - int family); -#define tcp_twsk_md5_key(twsk) ((twsk)->tw_md5_key) -#else -static inline struct tcp_md5sig_key *tcp_md5_do_lookup(const struct sock *sk, - const union tcp_md5_addr *addr, - int family) -{ - return NULL; -} -#define tcp_twsk_md5_key(twsk) NULL -#endif - -bool tcp_alloc_md5sig_pool(void); - -struct tcp_md5sig_pool *tcp_get_md5sig_pool(void); -static inline void tcp_put_md5sig_pool(void) -{ - local_bh_enable(); -} - -int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *, const struct sk_buff *, - unsigned int header_len); -int tcp_md5_hash_key(struct tcp_md5sig_pool *hp, - const struct tcp_md5sig_key *key); - /* From tcp_fastopen.c */ void tcp_fastopen_cache_get(struct sock *sk, u16 *mss, struct tcp_fastopen_cookie *cookie); diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile index 47a0a6649a9d..dd6bd3b29f5c 100644 --- a/net/ipv4/Makefile +++ b/net/ipv4/Makefile @@ -60,6 +60,7 @@ obj-$(CONFIG_TCP_CONG_LP) += tcp_lp.o obj-$(CONFIG_TCP_CONG_YEAH) += tcp_yeah.o obj-$(CONFIG_TCP_CONG_ILLINOIS) += tcp_illinois.o obj-$(CONFIG_NETLABEL) += cipso_ipv4.o +obj-$(CONFIG_TCP_MD5SIG) += tcp_md5.o obj-$(CONFIG_XFRM) += xfrm4_policy.o xfrm4_state.o xfrm4_input.o \ xfrm4_output.o xfrm4_protocol.o diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index f08542d91e1c..fc5c9cb19b9b 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -271,6 +271,7 
@@ #include #include #include +#include #include #include @@ -3370,138 +3371,6 @@ int compat_tcp_getsockopt(struct sock *sk, int level, int optname, EXPORT_SYMBOL(compat_tcp_getsockopt); #endif -#ifdef CONFIG_TCP_MD5SIG -static DEFINE_PER_CPU(struct tcp_md5sig_pool, tcp_md5sig_pool); -static DEFINE_MUTEX(tcp_md5sig_mutex); -static bool tcp_md5sig_pool_populated = false; - -static void __tcp_alloc_md5sig_pool(void) -{ - struct crypto_ahash *hash; - int cpu; - - hash = crypto_alloc_ahash("md5", 0, CRYPTO_ALG_ASYNC); - if (IS_ERR(hash)) - return; - - for_each_possible_cpu(cpu) { - void *scratch = per_cpu(tcp_md5sig_pool, cpu).scratch; - struct ahash_request *req; - - if (!scratch) { - scratch = kmalloc_node(sizeof(union tcp_md5sum_block) + - sizeof(struct tcphdr), - GFP_KERNEL, - cpu_to_node(cpu)); - if (!scratch) - return; - per_cpu(tcp_md5sig_pool, cpu).scratch = scratch; - } - if (per_cpu(tcp_md5sig_pool, cpu).md5_req) - continue; - - req = ahash_request_alloc(hash, GFP_KERNEL); - if (!req) - return; - - ahash_request_set_callback(req, 0, NULL, NULL); - - per_cpu(tcp_md5sig_pool, cpu).md5_req = req; - } - /* before setting tcp_md5sig_pool_populated, we must commit all writes - * to memory. See smp_rmb() in tcp_get_md5sig_pool() - */ - smp_wmb(); - tcp_md5sig_pool_populated = true; -} - -bool tcp_alloc_md5sig_pool(void) -{ - if (unlikely(!tcp_md5sig_pool_populated)) { - mutex_lock(&tcp_md5sig_mutex); - - if (!tcp_md5sig_pool_populated) - __tcp_alloc_md5sig_pool(); - - mutex_unlock(&tcp_md5sig_mutex); - } - return tcp_md5sig_pool_populated; -} -EXPORT_SYMBOL(tcp_alloc_md5sig_pool); - - -/** - * tcp_get_md5sig_pool - get md5sig_pool for this user - * - * We use percpu structure, so if we succeed, we exit with preemption - * and BH disabled, to make sure another thread or softirq handling - * wont try to get same context. 
- */ -struct tcp_md5sig_pool *tcp_get_md5sig_pool(void) -{ - local_bh_disable(); - - if (tcp_md5sig_pool_populated) { - /* coupled with smp_wmb() in __tcp_alloc_md5sig_pool() */ - smp_rmb(); - return this_cpu_ptr(&tcp_md5sig_pool); - } - local_bh_enable(); - return NULL; -} -EXPORT_SYMBOL(tcp_get_md5sig_pool); - -int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp, - const struct sk_buff *skb, unsigned int header_len) -{ - struct scatterlist sg; - const struct tcphdr *tp = tcp_hdr(skb); - struct ahash_request *req = hp->md5_req; - unsigned int i; - const unsigned int head_data_len = skb_headlen(skb) > header_len ? - skb_headlen(skb) - header_len : 0; - const struct skb_shared_info *shi = skb_shinfo(skb); - struct sk_buff *frag_iter; - - sg_init_table(&sg, 1); - - sg_set_buf(&sg, ((u8 *) tp) + header_len, head_data_len); - ahash_request_set_crypt(req, &sg, NULL, head_data_len); - if (crypto_ahash_update(req)) - return 1; - - for (i = 0; i < shi->nr_frags; ++i) { - const struct skb_frag_struct *f = &shi->frags[i]; - unsigned int offset = f->page_offset; - struct page *page = skb_frag_page(f) + (offset >> PAGE_SHIFT); - - sg_set_page(&sg, page, skb_frag_size(f), - offset_in_page(offset)); - ahash_request_set_crypt(req, &sg, NULL, skb_frag_size(f)); - if (crypto_ahash_update(req)) - return 1; - } - - skb_walk_frags(skb, frag_iter) - if (tcp_md5_hash_skb_data(hp, frag_iter, 0)) - return 1; - - return 0; -} -EXPORT_SYMBOL(tcp_md5_hash_skb_data); - -int tcp_md5_hash_key(struct tcp_md5sig_pool *hp, const struct tcp_md5sig_key *key) -{ - struct scatterlist sg; - - sg_init_one(&sg, key->key, key->keylen); - ahash_request_set_crypt(hp->md5_req, &sg, NULL, key->keylen); - return crypto_ahash_update(hp->md5_req); -} -EXPORT_SYMBOL(tcp_md5_hash_key); - -#endif - struct hlist_head *tcp_extopt_get_list(const struct sock *sk) { if (sk_fullsock(sk)) diff --git a/net/ipv4/tcp_diag.c b/net/ipv4/tcp_diag.c index 81148f7a2323..82097a58976a 100644 --- a/net/ipv4/tcp_diag.c +++ 
b/net/ipv4/tcp_diag.c @@ -15,6 +15,7 @@ #include #include +#include #include #include @@ -37,70 +38,14 @@ static void tcp_diag_get_info(struct sock *sk, struct inet_diag_msg *r, tcp_get_info(sk, info); } -#ifdef CONFIG_TCP_MD5SIG -static void tcp_diag_md5sig_fill(struct tcp_diag_md5sig *info, - const struct tcp_md5sig_key *key) -{ - info->tcpm_family = key->family; - info->tcpm_prefixlen = key->prefixlen; - info->tcpm_keylen = key->keylen; - memcpy(info->tcpm_key, key->key, key->keylen); - - if (key->family == AF_INET) - info->tcpm_addr[0] = key->addr.a4.s_addr; - #if IS_ENABLED(CONFIG_IPV6) - else if (key->family == AF_INET6) - memcpy(&info->tcpm_addr, &key->addr.a6, - sizeof(info->tcpm_addr)); - #endif -} - -static int tcp_diag_put_md5sig(struct sk_buff *skb, - const struct tcp_md5sig_info *md5sig) -{ - const struct tcp_md5sig_key *key; - struct tcp_diag_md5sig *info; - struct nlattr *attr; - int md5sig_count = 0; - - hlist_for_each_entry_rcu(key, &md5sig->head, node) - md5sig_count++; - if (md5sig_count == 0) - return 0; - - attr = nla_reserve(skb, INET_DIAG_MD5SIG, - md5sig_count * sizeof(struct tcp_diag_md5sig)); - if (!attr) - return -EMSGSIZE; - - info = nla_data(attr); - memset(info, 0, md5sig_count * sizeof(struct tcp_diag_md5sig)); - hlist_for_each_entry_rcu(key, &md5sig->head, node) { - tcp_diag_md5sig_fill(info++, key); - if (--md5sig_count == 0) - break; - } - - return 0; -} -#endif - static int tcp_diag_get_aux(struct sock *sk, bool net_admin, struct sk_buff *skb) { #ifdef CONFIG_TCP_MD5SIG - if (net_admin) { - struct tcp_md5sig_info *md5sig; - int err = 0; - - rcu_read_lock(); - md5sig = rcu_dereference(tcp_sk(sk)->md5sig_info); - if (md5sig) - err = tcp_diag_put_md5sig(skb, md5sig); - rcu_read_unlock(); - if (err < 0) - return err; - } + int err = tcp_md5_diag_get_aux(sk, net_admin, skb); + + if (err < 0) + return err; #endif return 0; @@ -111,21 +56,7 @@ static size_t tcp_diag_get_aux_size(struct sock *sk, bool net_admin) size_t size = 0; #ifdef 
CONFIG_TCP_MD5SIG - if (net_admin && sk_fullsock(sk)) { - const struct tcp_md5sig_info *md5sig; - const struct tcp_md5sig_key *key; - size_t md5sig_count = 0; - - rcu_read_lock(); - md5sig = rcu_dereference(tcp_sk(sk)->md5sig_info); - if (md5sig) { - hlist_for_each_entry_rcu(key, &md5sig->head, node) - md5sig_count++; - } - rcu_read_unlock(); - size += nla_total_size(md5sig_count * - sizeof(struct tcp_diag_md5sig)); - } + size += tcp_md5_diag_get_aux_size(sk, net_admin); #endif return size; diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index fd2693baee4a..1ac1d8d431ad 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -3867,44 +3867,6 @@ static bool tcp_fast_parse_options(const struct net *net, return false; } -#ifdef CONFIG_TCP_MD5SIG -/* - * Parse MD5 Signature option - */ -const u8 *tcp_parse_md5sig_option(const struct tcphdr *th) -{ - int length = (th->doff << 2) - sizeof(*th); - const u8 *ptr = (const u8 *)(th + 1); - - /* If the TCP option is too short, we can short cut */ - if (length < TCPOLEN_MD5SIG) - return NULL; - - while (length > 0) { - int opcode = *ptr++; - int opsize; - - switch (opcode) { - case TCPOPT_EOL: - return NULL; - case TCPOPT_NOP: - length--; - continue; - default: - opsize = *ptr++; - if (opsize < 2 || opsize > length) - return NULL; - if (opcode == TCPOPT_MD5SIG) - return opsize == TCPOLEN_MD5SIG ? ptr : NULL; - } - ptr += opsize - 2; - length -= opsize; - } - return NULL; -} -EXPORT_SYMBOL(tcp_parse_md5sig_option); -#endif - /* Sorry, PAWS as specified is broken wrt. pure-ACKs -DaveM * * It is not fatal. 
If this ACK does _not_ change critical state (seqs, window) diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 4211f8e38ef9..2446a4cb1749 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -62,6 +62,7 @@ #include #include #include +#include #include #include @@ -87,11 +88,6 @@ #include -#ifdef CONFIG_TCP_MD5SIG -static int tcp_v4_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key, - __be32 daddr, __be32 saddr, const struct tcphdr *th); -#endif - struct inet_hashinfo tcp_hashinfo; EXPORT_SYMBOL(tcp_hashinfo); @@ -603,16 +599,13 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) __be32 opt[(MAX_TCP_OPTION_SPACE >> 2)]; } rep; struct hlist_head *extopt_list = NULL; + struct tcp_out_options opts; struct ip_reply_arg arg; -#ifdef CONFIG_TCP_MD5SIG - struct tcp_md5sig_key *key = NULL; - const __u8 *hash_location = NULL; - unsigned char newhash[16]; - int genhash; - struct sock *sk1 = NULL; -#endif struct net *net; int offset = 0; +#ifdef CONFIG_TCP_MD5SIG + int ret; +#endif /* Never send a reset in response to a reset. */ if (th->rst) @@ -627,6 +620,8 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) if (sk) extopt_list = tcp_extopt_get_list(sk); + memset(&opts, 0, sizeof(opts)); + /* Swap the send and the receive. */ memset(&rep, 0, sizeof(rep)); rep.th.dest = th->source; @@ -647,55 +642,28 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) arg.iov[0].iov_len = sizeof(rep.th); net = sk ? sock_net(sk) : dev_net(skb_dst(skb)->dev); -#ifdef CONFIG_TCP_MD5SIG - rcu_read_lock(); - hash_location = tcp_parse_md5sig_option(th); - if (sk && sk_fullsock(sk)) { - key = tcp_md5_do_lookup(sk, (union tcp_md5_addr *) - &ip_hdr(skb)->saddr, AF_INET); - } else if (hash_location) { - /* - * active side is lost. Try to find listening socket through - * source port, and then find md5 key through listening socket. 
- * we are not loose security here: - * Incoming packet is checked with md5 hash with finding key, - * no RST generated if md5 hash doesn't match. - */ - sk1 = __inet_lookup_listener(net, &tcp_hashinfo, NULL, 0, - ip_hdr(skb)->saddr, - th->source, ip_hdr(skb)->daddr, - ntohs(th->source), inet_iif(skb), - tcp_v4_sdif(skb)); - /* don't send rst if it can't find key */ - if (!sk1) - goto out; - - key = tcp_md5_do_lookup(sk1, (union tcp_md5_addr *) - &ip_hdr(skb)->saddr, AF_INET); - if (!key) - goto out; +#ifdef CONFIG_TCP_MD5SIG + ret = tcp_v4_md5_send_response_prepare(skb, 0, + MAX_TCP_OPTION_SPACE - arg.iov[0].iov_len, + &opts, sk); - genhash = tcp_v4_md5_hash_skb(newhash, key, NULL, skb); - if (genhash || memcmp(hash_location, newhash, 16) != 0) - goto out; + if (ret == -1) + return; - } + arg.iov[0].iov_len += ret; #endif if (unlikely(extopt_list && !hlist_empty(extopt_list))) { unsigned int remaining; - struct tcp_out_options opts; int used; remaining = sizeof(rep.opt); #ifdef CONFIG_TCP_MD5SIG - if (key) + if (opts.md5) remaining -= TCPOLEN_MD5SIG_ALIGNED; #endif - memset(&opts, 0, sizeof(opts)); - used = tcp_extopt_response_prepare(skb, TCPHDR_RST, remaining, &opts, sk); @@ -707,19 +675,7 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) } #ifdef CONFIG_TCP_MD5SIG - if (key) { - rep.opt[offset++] = htonl((TCPOPT_NOP << 24) | - (TCPOPT_NOP << 16) | - (TCPOPT_MD5SIG << 8) | - TCPOLEN_MD5SIG); - /* Update length and the length the header thinks exists */ - arg.iov[0].iov_len += TCPOLEN_MD5SIG_ALIGNED; - rep.th.doff = arg.iov[0].iov_len / 4; - - tcp_v4_md5_hash_hdr((__u8 *)&rep.opt[offset], - key, ip_hdr(skb)->saddr, - ip_hdr(skb)->daddr, &rep.th); - } + tcp_v4_md5_send_response_write(&rep.opt[offset], skb, &rep.th, &opts, sk); #endif arg.csum = csum_tcpudp_nofold(ip_hdr(skb)->daddr, ip_hdr(skb)->saddr, /* XXX */ @@ -750,11 +706,6 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) __TCP_INC_STATS(net, 
TCP_MIB_OUTSEGS); __TCP_INC_STATS(net, TCP_MIB_OUTRSTS); local_bh_enable(); - -#ifdef CONFIG_TCP_MD5SIG -out: - rcu_read_unlock(); -#endif } /* The code following below sending ACKs in SYN-RECV and TIME-WAIT states @@ -772,17 +723,19 @@ static void tcp_v4_send_ack(const struct sock *sk, __be32 opt[(MAX_TCP_OPTION_SPACE >> 2)]; } rep; struct hlist_head *extopt_list = NULL; -#ifdef CONFIG_TCP_MD5SIG - struct tcp_md5sig_key *key; -#endif + struct tcp_out_options opts; struct net *net = sock_net(sk); struct ip_reply_arg arg; int offset = 0; +#ifdef CONFIG_TCP_MD5SIG + int ret; +#endif extopt_list = tcp_extopt_get_list(sk); memset(&rep.th, 0, sizeof(struct tcphdr)); memset(&arg, 0, sizeof(arg)); + memset(&opts, 0, sizeof(opts)); arg.iov[0].iov_base = (unsigned char *)&rep; arg.iov[0].iov_len = sizeof(rep.th); @@ -806,25 +759,24 @@ static void tcp_v4_send_ack(const struct sock *sk, rep.th.window = htons(win); #ifdef CONFIG_TCP_MD5SIG - if (sk->sk_state == TCP_TIME_WAIT) { - key = tcp_twsk_md5_key(tcp_twsk(sk)); - } else if (sk->sk_state == TCP_NEW_SYN_RECV) { - key = tcp_md5_do_lookup(sk, (union tcp_md5_addr *)&ip_hdr(skb)->saddr, - AF_INET); - } else { - key = NULL; /* Should not happen */ - } + ret = tcp_v4_md5_send_response_prepare(skb, 0, + MAX_TCP_OPTION_SPACE - arg.iov[0].iov_len, + &opts, sk); + + if (ret == -1) + return; + + arg.iov[0].iov_len += ret; #endif if (unlikely(extopt_list && !hlist_empty(extopt_list))) { unsigned int remaining; - struct tcp_out_options opts; int used; remaining = sizeof(rep.th) + sizeof(rep.opt) - arg.iov[0].iov_len; #ifdef CONFIG_TCP_MD5SIG - if (key) + if (opts.md5) remaining -= TCPOLEN_MD5SIG_ALIGNED; #endif @@ -841,18 +793,11 @@ static void tcp_v4_send_ack(const struct sock *sk, } #ifdef CONFIG_TCP_MD5SIG - if (key) { - rep.opt[offset++] = htonl((TCPOPT_NOP << 24) | - (TCPOPT_NOP << 16) | - (TCPOPT_MD5SIG << 8) | - TCPOLEN_MD5SIG); + if (opts.md5) { arg.iov[0].iov_len += TCPOLEN_MD5SIG_ALIGNED; rep.th.doff = arg.iov[0].iov_len / 4; 
- - tcp_v4_md5_hash_hdr((__u8 *) &rep.opt[offset], - key, ip_hdr(skb)->saddr, - ip_hdr(skb)->daddr, &rep.th); } + tcp_v4_md5_send_response_write(&rep.opt[offset], skb, &rep.th, &opts, sk); #endif arg.flags = reply_flags; @@ -961,374 +906,6 @@ static void tcp_v4_reqsk_destructor(struct request_sock *req) kfree(rcu_dereference_protected(inet_rsk(req)->ireq_opt, 1)); } -#ifdef CONFIG_TCP_MD5SIG -/* - * RFC2385 MD5 checksumming requires a mapping of - * IP address->MD5 Key. - * We need to maintain these in the sk structure. - */ - -/* Find the Key structure for an address. */ -struct tcp_md5sig_key *tcp_md5_do_lookup(const struct sock *sk, - const union tcp_md5_addr *addr, - int family) -{ - const struct tcp_sock *tp = tcp_sk(sk); - struct tcp_md5sig_key *key; - const struct tcp_md5sig_info *md5sig; - __be32 mask; - struct tcp_md5sig_key *best_match = NULL; - bool match; - - /* caller either holds rcu_read_lock() or socket lock */ - md5sig = rcu_dereference_check(tp->md5sig_info, - lockdep_sock_is_held(sk)); - if (!md5sig) - return NULL; - - hlist_for_each_entry_rcu(key, &md5sig->head, node) { - if (key->family != family) - continue; - - if (family == AF_INET) { - mask = inet_make_mask(key->prefixlen); - match = (key->addr.a4.s_addr & mask) == - (addr->a4.s_addr & mask); -#if IS_ENABLED(CONFIG_IPV6) - } else if (family == AF_INET6) { - match = ipv6_prefix_equal(&key->addr.a6, &addr->a6, - key->prefixlen); -#endif - } else { - match = false; - } - - if (match && (!best_match || - key->prefixlen > best_match->prefixlen)) - best_match = key; - } - return best_match; -} -EXPORT_SYMBOL(tcp_md5_do_lookup); - -static struct tcp_md5sig_key *tcp_md5_do_lookup_exact(const struct sock *sk, - const union tcp_md5_addr *addr, - int family, u8 prefixlen) -{ - const struct tcp_sock *tp = tcp_sk(sk); - struct tcp_md5sig_key *key; - unsigned int size = sizeof(struct in_addr); - const struct tcp_md5sig_info *md5sig; - - /* caller either holds rcu_read_lock() or socket lock */ - md5sig = 
rcu_dereference_check(tp->md5sig_info, - lockdep_sock_is_held(sk)); - if (!md5sig) - return NULL; -#if IS_ENABLED(CONFIG_IPV6) - if (family == AF_INET6) - size = sizeof(struct in6_addr); -#endif - hlist_for_each_entry_rcu(key, &md5sig->head, node) { - if (key->family != family) - continue; - if (!memcmp(&key->addr, addr, size) && - key->prefixlen == prefixlen) - return key; - } - return NULL; -} - -struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk, - const struct sock *addr_sk) -{ - const union tcp_md5_addr *addr; - - addr = (const union tcp_md5_addr *)&addr_sk->sk_daddr; - return tcp_md5_do_lookup(sk, addr, AF_INET); -} -EXPORT_SYMBOL(tcp_v4_md5_lookup); - -/* This can be called on a newly created socket, from other files */ -int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr, - int family, u8 prefixlen, const u8 *newkey, u8 newkeylen, - gfp_t gfp) -{ - /* Add Key to the list */ - struct tcp_md5sig_key *key; - struct tcp_sock *tp = tcp_sk(sk); - struct tcp_md5sig_info *md5sig; - - key = tcp_md5_do_lookup_exact(sk, addr, family, prefixlen); - if (key) { - /* Pre-existing entry - just update that one. */ - memcpy(key->key, newkey, newkeylen); - key->keylen = newkeylen; - return 0; - } - - md5sig = rcu_dereference_protected(tp->md5sig_info, - lockdep_sock_is_held(sk)); - if (!md5sig) { - md5sig = kmalloc(sizeof(*md5sig), gfp); - if (!md5sig) - return -ENOMEM; - - sk_nocaps_add(sk, NETIF_F_GSO_MASK); - INIT_HLIST_HEAD(&md5sig->head); - rcu_assign_pointer(tp->md5sig_info, md5sig); - } - - key = sock_kmalloc(sk, sizeof(*key), gfp); - if (!key) - return -ENOMEM; - if (!tcp_alloc_md5sig_pool()) { - sock_kfree_s(sk, key, sizeof(*key)); - return -ENOMEM; - } - - memcpy(key->key, newkey, newkeylen); - key->keylen = newkeylen; - key->family = family; - key->prefixlen = prefixlen; - memcpy(&key->addr, addr, - (family == AF_INET6) ? 
sizeof(struct in6_addr) : - sizeof(struct in_addr)); - hlist_add_head_rcu(&key->node, &md5sig->head); - return 0; -} -EXPORT_SYMBOL(tcp_md5_do_add); - -int tcp_md5_do_del(struct sock *sk, const union tcp_md5_addr *addr, int family, - u8 prefixlen) -{ - struct tcp_md5sig_key *key; - - key = tcp_md5_do_lookup_exact(sk, addr, family, prefixlen); - if (!key) - return -ENOENT; - hlist_del_rcu(&key->node); - atomic_sub(sizeof(*key), &sk->sk_omem_alloc); - kfree_rcu(key, rcu); - return 0; -} -EXPORT_SYMBOL(tcp_md5_do_del); - -static void tcp_clear_md5_list(struct sock *sk) -{ - struct tcp_sock *tp = tcp_sk(sk); - struct tcp_md5sig_key *key; - struct hlist_node *n; - struct tcp_md5sig_info *md5sig; - - md5sig = rcu_dereference_protected(tp->md5sig_info, 1); - - hlist_for_each_entry_safe(key, n, &md5sig->head, node) { - hlist_del_rcu(&key->node); - atomic_sub(sizeof(*key), &sk->sk_omem_alloc); - kfree_rcu(key, rcu); - } -} - -static int tcp_v4_parse_md5_keys(struct sock *sk, int optname, - char __user *optval, int optlen) -{ - struct tcp_md5sig cmd; - struct sockaddr_in *sin = (struct sockaddr_in *)&cmd.tcpm_addr; - u8 prefixlen = 32; - - if (optlen < sizeof(cmd)) - return -EINVAL; - - if (copy_from_user(&cmd, optval, sizeof(cmd))) - return -EFAULT; - - if (sin->sin_family != AF_INET) - return -EINVAL; - - if (optname == TCP_MD5SIG_EXT && - cmd.tcpm_flags & TCP_MD5SIG_FLAG_PREFIX) { - prefixlen = cmd.tcpm_prefixlen; - if (prefixlen > 32) - return -EINVAL; - } - - if (!cmd.tcpm_keylen) - return tcp_md5_do_del(sk, (union tcp_md5_addr *)&sin->sin_addr.s_addr, - AF_INET, prefixlen); - - if (cmd.tcpm_keylen > TCP_MD5SIG_MAXKEYLEN) - return -EINVAL; - - return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin->sin_addr.s_addr, - AF_INET, prefixlen, cmd.tcpm_key, cmd.tcpm_keylen, - GFP_KERNEL); -} - -static int tcp_v4_md5_hash_headers(struct tcp_md5sig_pool *hp, - __be32 daddr, __be32 saddr, - const struct tcphdr *th, int nbytes) -{ - struct tcp4_pseudohdr *bp; - struct scatterlist 
sg; - struct tcphdr *_th; - - bp = hp->scratch; - bp->saddr = saddr; - bp->daddr = daddr; - bp->pad = 0; - bp->protocol = IPPROTO_TCP; - bp->len = cpu_to_be16(nbytes); - - _th = (struct tcphdr *)(bp + 1); - memcpy(_th, th, sizeof(*th)); - _th->check = 0; - - sg_init_one(&sg, bp, sizeof(*bp) + sizeof(*th)); - ahash_request_set_crypt(hp->md5_req, &sg, NULL, - sizeof(*bp) + sizeof(*th)); - return crypto_ahash_update(hp->md5_req); -} - -static int tcp_v4_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key, - __be32 daddr, __be32 saddr, const struct tcphdr *th) -{ - struct tcp_md5sig_pool *hp; - struct ahash_request *req; - - hp = tcp_get_md5sig_pool(); - if (!hp) - goto clear_hash_noput; - req = hp->md5_req; - - if (crypto_ahash_init(req)) - goto clear_hash; - if (tcp_v4_md5_hash_headers(hp, daddr, saddr, th, th->doff << 2)) - goto clear_hash; - if (tcp_md5_hash_key(hp, key)) - goto clear_hash; - ahash_request_set_crypt(req, NULL, md5_hash, 0); - if (crypto_ahash_final(req)) - goto clear_hash; - - tcp_put_md5sig_pool(); - return 0; - -clear_hash: - tcp_put_md5sig_pool(); -clear_hash_noput: - memset(md5_hash, 0, 16); - return 1; -} - -int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key, - const struct sock *sk, - const struct sk_buff *skb) -{ - struct tcp_md5sig_pool *hp; - struct ahash_request *req; - const struct tcphdr *th = tcp_hdr(skb); - __be32 saddr, daddr; - - if (sk) { /* valid for establish/request sockets */ - saddr = sk->sk_rcv_saddr; - daddr = sk->sk_daddr; - } else { - const struct iphdr *iph = ip_hdr(skb); - saddr = iph->saddr; - daddr = iph->daddr; - } - - hp = tcp_get_md5sig_pool(); - if (!hp) - goto clear_hash_noput; - req = hp->md5_req; - - if (crypto_ahash_init(req)) - goto clear_hash; - - if (tcp_v4_md5_hash_headers(hp, daddr, saddr, th, skb->len)) - goto clear_hash; - if (tcp_md5_hash_skb_data(hp, skb, th->doff << 2)) - goto clear_hash; - if (tcp_md5_hash_key(hp, key)) - goto clear_hash; - ahash_request_set_crypt(req, 
NULL, md5_hash, 0); - if (crypto_ahash_final(req)) - goto clear_hash; - - tcp_put_md5sig_pool(); - return 0; - -clear_hash: - tcp_put_md5sig_pool(); -clear_hash_noput: - memset(md5_hash, 0, 16); - return 1; -} -EXPORT_SYMBOL(tcp_v4_md5_hash_skb); - -#endif - -/* Called with rcu_read_lock() */ -static bool tcp_v4_inbound_md5_hash(const struct sock *sk, - const struct sk_buff *skb) -{ -#ifdef CONFIG_TCP_MD5SIG - /* - * This gets called for each TCP segment that arrives - * so we want to be efficient. - * We have 3 drop cases: - * o No MD5 hash and one expected. - * o MD5 hash and we're not expecting one. - * o MD5 hash and its wrong. - */ - const __u8 *hash_location = NULL; - struct tcp_md5sig_key *hash_expected; - const struct iphdr *iph = ip_hdr(skb); - const struct tcphdr *th = tcp_hdr(skb); - int genhash; - unsigned char newhash[16]; - - hash_expected = tcp_md5_do_lookup(sk, (union tcp_md5_addr *)&iph->saddr, - AF_INET); - hash_location = tcp_parse_md5sig_option(th); - - /* We've parsed the options - do we have a hash? */ - if (!hash_expected && !hash_location) - return false; - - if (hash_expected && !hash_location) { - NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND); - return true; - } - - if (!hash_expected && hash_location) { - NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5UNEXPECTED); - return true; - } - - /* Okay, so this is hash_expected and hash_location - - * so we need to calculate the checksum. - */ - genhash = tcp_v4_md5_hash_skb(newhash, - hash_expected, - NULL, skb); - - if (genhash || memcmp(hash_location, newhash, 16) != 0) { - NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5FAILURE); - net_info_ratelimited("MD5 Hash failed for (%pI4, %d)->(%pI4, %d)%s\n", - &iph->saddr, ntohs(th->source), - &iph->daddr, ntohs(th->dest), - genhash ? 
" tcp_v4_calc_md5_hash failed" - : ""); - return true; - } - return false; -#endif - return false; -} - static void tcp_v4_init_req(struct request_sock *req, const struct sock *sk_listener, struct sk_buff *skb) @@ -1404,9 +981,6 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb, struct inet_sock *newinet; struct tcp_sock *newtp; struct sock *newsk; -#ifdef CONFIG_TCP_MD5SIG - struct tcp_md5sig_key *key; -#endif struct ip_options_rcu *inet_opt; if (sk_acceptq_is_full(sk)) @@ -1453,20 +1027,7 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb, tcp_initialize_rcv_mss(newsk); #ifdef CONFIG_TCP_MD5SIG - /* Copy over the MD5 key from the original socket */ - key = tcp_md5_do_lookup(sk, (union tcp_md5_addr *)&newinet->inet_daddr, - AF_INET); - if (key) { - /* - * We're using one, so create a matching key - * on the newsk structure. If we fail to get - * memory, then we end up not copying the key - * across. Shucks. - */ - tcp_md5_do_add(newsk, (union tcp_md5_addr *)&newinet->inet_daddr, - AF_INET, 32, key->key, key->keylen, GFP_ATOMIC); - sk_nocaps_add(newsk, NETIF_F_GSO_MASK); - } + tcp_v4_md5_syn_recv_sock(sk, newsk); #endif if (__inet_inherit_port(sk, newsk) < 0) @@ -1930,14 +1491,6 @@ const struct inet_connection_sock_af_ops ipv4_specific = { }; EXPORT_SYMBOL(ipv4_specific); -#ifdef CONFIG_TCP_MD5SIG -static const struct tcp_sock_af_ops tcp_sock_ipv4_specific = { - .md5_lookup = tcp_v4_md5_lookup, - .calc_md5_hash = tcp_v4_md5_hash_skb, - .md5_parse = tcp_v4_parse_md5_keys, -}; -#endif - /* NOTE: A lot of things set to zero explicitly by call to * sk_alloc() so need not be done here. 
*/ @@ -1980,12 +1533,7 @@ void tcp_v4_destroy_sock(struct sock *sk) if (unlikely(!hlist_empty(&tp->tcp_option_list))) tcp_extopt_destroy(sk); #ifdef CONFIG_TCP_MD5SIG - /* Clean up the MD5 key list, if any */ - if (tp->md5sig_info) { - tcp_clear_md5_list(sk); - kfree_rcu(rcu_dereference_protected(tp->md5sig_info, 1), rcu); - tp->md5sig_info = NULL; - } + tcp_v4_md5_destroy_sock(sk); #endif /* Clean up a referenced TCP bind bucket. */ diff --git a/net/ipv4/tcp_md5.c b/net/ipv4/tcp_md5.c new file mode 100644 index 000000000000..d50580536978 --- /dev/null +++ b/net/ipv4/tcp_md5.c @@ -0,0 +1,1103 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#include +#include +#include +#include + +#include + +#include + +static DEFINE_PER_CPU(struct tcp_md5sig_pool, tcp_md5sig_pool); +static DEFINE_MUTEX(tcp_md5sig_mutex); +static bool tcp_md5sig_pool_populated; + +#define tcp_twsk_md5_key(twsk) ((twsk)->tw_md5_key) + +static void __tcp_alloc_md5sig_pool(void) +{ + struct crypto_ahash *hash; + int cpu; + + hash = crypto_alloc_ahash("md5", 0, CRYPTO_ALG_ASYNC); + if (IS_ERR(hash)) + return; + + for_each_possible_cpu(cpu) { + void *scratch = per_cpu(tcp_md5sig_pool, cpu).scratch; + struct ahash_request *req; + + if (!scratch) { + scratch = kmalloc_node(sizeof(union tcp_md5sum_block) + + sizeof(struct tcphdr), + GFP_KERNEL, + cpu_to_node(cpu)); + if (!scratch) + return; + per_cpu(tcp_md5sig_pool, cpu).scratch = scratch; + } + if (per_cpu(tcp_md5sig_pool, cpu).md5_req) + continue; + + req = ahash_request_alloc(hash, GFP_KERNEL); + if (!req) + return; + + ahash_request_set_callback(req, 0, NULL, NULL); + + per_cpu(tcp_md5sig_pool, cpu).md5_req = req; + } + /* before setting tcp_md5sig_pool_populated, we must commit all writes + * to memory. 
See smp_rmb() in tcp_get_md5sig_pool() + */ + smp_wmb(); + tcp_md5sig_pool_populated = true; +} + +static bool tcp_alloc_md5sig_pool(void) +{ + if (unlikely(!tcp_md5sig_pool_populated)) { + mutex_lock(&tcp_md5sig_mutex); + + if (!tcp_md5sig_pool_populated) + __tcp_alloc_md5sig_pool(); + + mutex_unlock(&tcp_md5sig_mutex); + } + return tcp_md5sig_pool_populated; +} + +static void tcp_put_md5sig_pool(void) +{ + local_bh_enable(); +} + +/** + * tcp_get_md5sig_pool - get md5sig_pool for this user + * + * We use percpu structure, so if we succeed, we exit with preemption + * and BH disabled, to make sure another thread or softirq handling + * won't try to get the same context. + */ +static struct tcp_md5sig_pool *tcp_get_md5sig_pool(void) +{ + local_bh_disable(); + + if (tcp_md5sig_pool_populated) { + /* coupled with smp_wmb() in __tcp_alloc_md5sig_pool() */ + smp_rmb(); + return this_cpu_ptr(&tcp_md5sig_pool); + } + local_bh_enable(); + return NULL; +} + +static struct tcp_md5sig_key *tcp_md5_do_lookup_exact(const struct sock *sk, + const union tcp_md5_addr *addr, + int family, u8 prefixlen) +{ + const struct tcp_sock *tp = tcp_sk(sk); + struct tcp_md5sig_key *key; + unsigned int size = sizeof(struct in_addr); + const struct tcp_md5sig_info *md5sig; + + /* caller either holds rcu_read_lock() or socket lock */ + md5sig = rcu_dereference_check(tp->md5sig_info, + lockdep_sock_is_held(sk)); + if (!md5sig) + return NULL; +#if IS_ENABLED(CONFIG_IPV6) + if (family == AF_INET6) + size = sizeof(struct in6_addr); +#endif + hlist_for_each_entry_rcu(key, &md5sig->head, node) { + if (key->family != family) + continue; + if (!memcmp(&key->addr, addr, size) && + key->prefixlen == prefixlen) + return key; + } + return NULL; +} + +/* This can be called on a newly created socket, from other files */ +static int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr, + int family, u8 prefixlen, const u8 *newkey, + u8 newkeylen, gfp_t gfp) +{ + /* Add Key to the list */ + struct
tcp_md5sig_key *key; + struct tcp_sock *tp = tcp_sk(sk); + struct tcp_md5sig_info *md5sig; + + key = tcp_md5_do_lookup_exact(sk, addr, family, prefixlen); + if (key) { + /* Pre-existing entry - just update that one. */ + memcpy(key->key, newkey, newkeylen); + key->keylen = newkeylen; + return 0; + } + + md5sig = rcu_dereference_protected(tp->md5sig_info, + lockdep_sock_is_held(sk)); + if (!md5sig) { + md5sig = kmalloc(sizeof(*md5sig), gfp); + if (!md5sig) + return -ENOMEM; + + sk_nocaps_add(sk, NETIF_F_GSO_MASK); + INIT_HLIST_HEAD(&md5sig->head); + rcu_assign_pointer(tp->md5sig_info, md5sig); + } + + key = sock_kmalloc(sk, sizeof(*key), gfp); + if (!key) + return -ENOMEM; + if (!tcp_alloc_md5sig_pool()) { + sock_kfree_s(sk, key, sizeof(*key)); + return -ENOMEM; + } + + memcpy(key->key, newkey, newkeylen); + key->keylen = newkeylen; + key->family = family; + key->prefixlen = prefixlen; + memcpy(&key->addr, addr, + (family == AF_INET6) ? sizeof(struct in6_addr) : + sizeof(struct in_addr)); + hlist_add_head_rcu(&key->node, &md5sig->head); + return 0; +} + +static void tcp_clear_md5_list(struct sock *sk) +{ + struct tcp_sock *tp = tcp_sk(sk); + struct tcp_md5sig_key *key; + struct hlist_node *n; + struct tcp_md5sig_info *md5sig; + + md5sig = rcu_dereference_protected(tp->md5sig_info, 1); + + hlist_for_each_entry_safe(key, n, &md5sig->head, node) { + hlist_del_rcu(&key->node); + atomic_sub(sizeof(*key), &sk->sk_omem_alloc); + kfree_rcu(key, rcu); + } +} + +static int tcp_md5_do_del(struct sock *sk, const union tcp_md5_addr *addr, + int family, u8 prefixlen) +{ + struct tcp_md5sig_key *key; + + key = tcp_md5_do_lookup_exact(sk, addr, family, prefixlen); + if (!key) + return -ENOENT; + hlist_del_rcu(&key->node); + atomic_sub(sizeof(*key), &sk->sk_omem_alloc); + kfree_rcu(key, rcu); + return 0; +} + +static int tcp_md5_hash_key(struct tcp_md5sig_pool *hp, + const struct tcp_md5sig_key *key) +{ + struct scatterlist sg; + + sg_init_one(&sg, key->key, key->keylen); + 
ahash_request_set_crypt(hp->md5_req, &sg, NULL, key->keylen); + return crypto_ahash_update(hp->md5_req); +} + +static int tcp_v4_parse_md5_keys(struct sock *sk, int optname, + char __user *optval, int optlen) +{ + struct tcp_md5sig cmd; + struct sockaddr_in *sin = (struct sockaddr_in *)&cmd.tcpm_addr; + u8 prefixlen = 32; + + if (optlen < sizeof(cmd)) + return -EINVAL; + + if (copy_from_user(&cmd, optval, sizeof(cmd))) + return -EFAULT; + + if (sin->sin_family != AF_INET) + return -EINVAL; + + if (optname == TCP_MD5SIG_EXT && + cmd.tcpm_flags & TCP_MD5SIG_FLAG_PREFIX) { + prefixlen = cmd.tcpm_prefixlen; + if (prefixlen > 32) + return -EINVAL; + } + + if (!cmd.tcpm_keylen) + return tcp_md5_do_del(sk, (union tcp_md5_addr *)&sin->sin_addr.s_addr, + AF_INET, prefixlen); + + if (cmd.tcpm_keylen > TCP_MD5SIG_MAXKEYLEN) + return -EINVAL; + + return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin->sin_addr.s_addr, + AF_INET, prefixlen, cmd.tcpm_key, cmd.tcpm_keylen, + GFP_KERNEL); +} + +#if IS_ENABLED(CONFIG_IPV6) +static int tcp_v6_parse_md5_keys(struct sock *sk, int optname, + char __user *optval, int optlen) +{ + struct tcp_md5sig cmd; + struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&cmd.tcpm_addr; + u8 prefixlen; + + if (optlen < sizeof(cmd)) + return -EINVAL; + + if (copy_from_user(&cmd, optval, sizeof(cmd))) + return -EFAULT; + + if (sin6->sin6_family != AF_INET6) + return -EINVAL; + + if (optname == TCP_MD5SIG_EXT && + cmd.tcpm_flags & TCP_MD5SIG_FLAG_PREFIX) { + prefixlen = cmd.tcpm_prefixlen; + if (prefixlen > 128 || (ipv6_addr_v4mapped(&sin6->sin6_addr) && + prefixlen > 32)) + return -EINVAL; + } else { + prefixlen = ipv6_addr_v4mapped(&sin6->sin6_addr) ? 
32 : 128; + } + + if (!cmd.tcpm_keylen) { + if (ipv6_addr_v4mapped(&sin6->sin6_addr)) + return tcp_md5_do_del(sk, (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3], + AF_INET, prefixlen); + return tcp_md5_do_del(sk, (union tcp_md5_addr *)&sin6->sin6_addr, + AF_INET6, prefixlen); + } + + if (cmd.tcpm_keylen > TCP_MD5SIG_MAXKEYLEN) + return -EINVAL; + + if (ipv6_addr_v4mapped(&sin6->sin6_addr)) + return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3], + AF_INET, prefixlen, cmd.tcpm_key, + cmd.tcpm_keylen, GFP_KERNEL); + + return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin6->sin6_addr, + AF_INET6, prefixlen, cmd.tcpm_key, + cmd.tcpm_keylen, GFP_KERNEL); +} +#endif + +static int tcp_v4_md5_hash_headers(struct tcp_md5sig_pool *hp, + __be32 daddr, __be32 saddr, + const struct tcphdr *th, int nbytes) +{ + struct tcp4_pseudohdr *bp; + struct scatterlist sg; + struct tcphdr *_th; + + bp = hp->scratch; + bp->saddr = saddr; + bp->daddr = daddr; + bp->pad = 0; + bp->protocol = IPPROTO_TCP; + bp->len = cpu_to_be16(nbytes); + + _th = (struct tcphdr *)(bp + 1); + memcpy(_th, th, sizeof(*th)); + _th->check = 0; + + sg_init_one(&sg, bp, sizeof(*bp) + sizeof(*th)); + ahash_request_set_crypt(hp->md5_req, &sg, NULL, + sizeof(*bp) + sizeof(*th)); + return crypto_ahash_update(hp->md5_req); +} + +#if IS_ENABLED(CONFIG_IPV6) +static int tcp_v6_md5_hash_headers(struct tcp_md5sig_pool *hp, + const struct in6_addr *daddr, + const struct in6_addr *saddr, + const struct tcphdr *th, int nbytes) +{ + struct tcp6_pseudohdr *bp; + struct scatterlist sg; + struct tcphdr *_th; + + bp = hp->scratch; + /* 1. 
TCP pseudo-header (RFC2460) */ + bp->saddr = *saddr; + bp->daddr = *daddr; + bp->protocol = cpu_to_be32(IPPROTO_TCP); + bp->len = cpu_to_be32(nbytes); + + _th = (struct tcphdr *)(bp + 1); + memcpy(_th, th, sizeof(*th)); + _th->check = 0; + + sg_init_one(&sg, bp, sizeof(*bp) + sizeof(*th)); + ahash_request_set_crypt(hp->md5_req, &sg, NULL, + sizeof(*bp) + sizeof(*th)); + return crypto_ahash_update(hp->md5_req); +} +#endif + +static int tcp_v4_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key, + __be32 daddr, __be32 saddr, + const struct tcphdr *th) +{ + struct tcp_md5sig_pool *hp; + struct ahash_request *req; + + hp = tcp_get_md5sig_pool(); + if (!hp) + goto clear_hash_noput; + req = hp->md5_req; + + if (crypto_ahash_init(req)) + goto clear_hash; + if (tcp_v4_md5_hash_headers(hp, daddr, saddr, th, th->doff << 2)) + goto clear_hash; + if (tcp_md5_hash_key(hp, key)) + goto clear_hash; + ahash_request_set_crypt(req, NULL, md5_hash, 0); + if (crypto_ahash_final(req)) + goto clear_hash; + + tcp_put_md5sig_pool(); + return 0; + +clear_hash: + tcp_put_md5sig_pool(); +clear_hash_noput: + memset(md5_hash, 0, 16); + return 1; +} + +#if IS_ENABLED(CONFIG_IPV6) +static int tcp_v6_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key, + const struct in6_addr *daddr, + struct in6_addr *saddr, const struct tcphdr *th) +{ + struct tcp_md5sig_pool *hp; + struct ahash_request *req; + + hp = tcp_get_md5sig_pool(); + if (!hp) + goto clear_hash_noput; + req = hp->md5_req; + + if (crypto_ahash_init(req)) + goto clear_hash; + if (tcp_v6_md5_hash_headers(hp, daddr, saddr, th, th->doff << 2)) + goto clear_hash; + if (tcp_md5_hash_key(hp, key)) + goto clear_hash; + ahash_request_set_crypt(req, NULL, md5_hash, 0); + if (crypto_ahash_final(req)) + goto clear_hash; + + tcp_put_md5sig_pool(); + return 0; + +clear_hash: + tcp_put_md5sig_pool(); +clear_hash_noput: + memset(md5_hash, 0, 16); + return 1; +} +#endif + +/* RFC2385 MD5 checksumming requires a mapping of + * IP 
address->MD5 Key. + * We need to maintain these in the sk structure. + */ + +/* Find the Key structure for an address. */ +static struct tcp_md5sig_key *tcp_md5_do_lookup(const struct sock *sk, + const union tcp_md5_addr *addr, + int family) +{ + const struct tcp_sock *tp = tcp_sk(sk); + struct tcp_md5sig_key *key; + const struct tcp_md5sig_info *md5sig; + __be32 mask; + struct tcp_md5sig_key *best_match = NULL; + bool match; + + /* caller either holds rcu_read_lock() or socket lock */ + md5sig = rcu_dereference_check(tp->md5sig_info, + lockdep_sock_is_held(sk)); + if (!md5sig) + return NULL; + + hlist_for_each_entry_rcu(key, &md5sig->head, node) { + if (key->family != family) + continue; + + if (family == AF_INET) { + mask = inet_make_mask(key->prefixlen); + match = (key->addr.a4.s_addr & mask) == + (addr->a4.s_addr & mask); +#if IS_ENABLED(CONFIG_IPV6) + } else if (family == AF_INET6) { + match = ipv6_prefix_equal(&key->addr.a6, &addr->a6, + key->prefixlen); +#endif + } else { + match = false; + } + + if (match && (!best_match || + key->prefixlen > best_match->prefixlen)) + best_match = key; + } + return best_match; +} + +/* Parse MD5 Signature option */ +static const u8 *tcp_parse_md5sig_option(const struct tcphdr *th) +{ + int length = (th->doff << 2) - sizeof(*th); + const u8 *ptr = (const u8 *)(th + 1); + + /* If the TCP option is too short, we can short cut */ + if (length < TCPOLEN_MD5SIG) + return NULL; + + while (length > 0) { + int opcode = *ptr++; + int opsize; + + switch (opcode) { + case TCPOPT_EOL: + return NULL; + case TCPOPT_NOP: + length--; + continue; + default: + opsize = *ptr++; + if (opsize < 2 || opsize > length) + return NULL; + if (opcode == TCPOPT_MD5SIG) + return opsize == TCPOLEN_MD5SIG ? 
ptr : NULL; + } + ptr += opsize - 2; + length -= opsize; + } + return NULL; +} + +#if IS_ENABLED(CONFIG_IPV6) +static struct tcp_md5sig_key *tcp_v6_md5_do_lookup(const struct sock *sk, + const struct in6_addr *addr) +{ + return tcp_md5_do_lookup(sk, (union tcp_md5_addr *)addr, AF_INET6); +} +#endif + +static int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp, + const struct sk_buff *skb, + unsigned int header_len) +{ + struct scatterlist sg; + const struct tcphdr *tp = tcp_hdr(skb); + struct ahash_request *req = hp->md5_req; + unsigned int i; + const unsigned int head_data_len = skb_headlen(skb) > header_len ? + skb_headlen(skb) - header_len : 0; + const struct skb_shared_info *shi = skb_shinfo(skb); + struct sk_buff *frag_iter; + + sg_init_table(&sg, 1); + + sg_set_buf(&sg, ((u8 *)tp) + header_len, head_data_len); + ahash_request_set_crypt(req, &sg, NULL, head_data_len); + if (crypto_ahash_update(req)) + return 1; + + for (i = 0; i < shi->nr_frags; ++i) { + const struct skb_frag_struct *f = &shi->frags[i]; + unsigned int offset = f->page_offset; + struct page *page = skb_frag_page(f) + (offset >> PAGE_SHIFT); + + sg_set_page(&sg, page, skb_frag_size(f), + offset_in_page(offset)); + ahash_request_set_crypt(req, &sg, NULL, skb_frag_size(f)); + if (crypto_ahash_update(req)) + return 1; + } + + skb_walk_frags(skb, frag_iter) + if (tcp_md5_hash_skb_data(hp, frag_iter, 0)) + return 1; + + return 0; +} + +int tcp_v4_md5_send_response_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk) +{ + const struct tcphdr *th = tcp_hdr(skb); + const struct iphdr *iph = ip_hdr(skb); + const __u8 *hash_location = NULL; + + rcu_read_lock(); + hash_location = tcp_parse_md5sig_option(th); + if (sk && sk_fullsock(sk)) { + opts->md5 = tcp_md5_do_lookup(sk, + (union tcp_md5_addr *)&iph->saddr, + AF_INET); + } else if (sk && sk->sk_state == TCP_TIME_WAIT) { + struct tcp_timewait_sock *tcptw = tcp_twsk(sk); + + opts->md5 = 
tcp_twsk_md5_key(tcptw); + } else if (sk && sk->sk_state == TCP_NEW_SYN_RECV) { + opts->md5 = tcp_md5_do_lookup(sk, + (union tcp_md5_addr *)&iph->saddr, + AF_INET); + } else if (hash_location) { + unsigned char newhash[16]; + struct sock *sk1; + int genhash; + + /* Active side is lost. Try to find the listening socket via the + * source port, and then find the md5 key through that socket. + * We do not lose security here: + * the incoming packet is checked against the md5 hash of the found + * key, and no RST is generated if the md5 hash doesn't match. + */ + sk1 = __inet_lookup_listener(dev_net(skb_dst(skb)->dev), + &tcp_hashinfo, NULL, 0, + iph->saddr, + th->source, iph->daddr, + ntohs(th->source), inet_iif(skb), + tcp_v4_sdif(skb)); + /* don't send a RST if we can't find a key */ + if (!sk1) + goto out_err; + + opts->md5 = tcp_md5_do_lookup(sk1, (union tcp_md5_addr *) + &iph->saddr, AF_INET); + if (!opts->md5) + goto out_err; + + genhash = tcp_v4_md5_hash_skb(newhash, opts->md5, NULL, skb); + if (genhash || memcmp(hash_location, newhash, 16) != 0) + goto out_err; + } + + if (opts->md5) + return TCPOLEN_MD5SIG_ALIGNED; + + rcu_read_unlock(); + return 0; + +out_err: + rcu_read_unlock(); + return -1; +} + +void tcp_v4_md5_send_response_write(__be32 *topt, struct sk_buff *skb, + struct tcphdr *t1, + struct tcp_out_options *opts, + const struct sock *sk) +{ + if (opts->md5) { + *topt++ = htonl((TCPOPT_NOP << 24) | + (TCPOPT_NOP << 16) | + (TCPOPT_MD5SIG << 8) | + TCPOLEN_MD5SIG); + + tcp_v4_md5_hash_hdr((__u8 *)topt, opts->md5, + ip_hdr(skb)->saddr, + ip_hdr(skb)->daddr, t1); + rcu_read_unlock(); + } +} + +#if IS_ENABLED(CONFIG_IPV6) +int tcp_v6_md5_send_response_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk) +{ + const struct tcphdr *th = tcp_hdr(skb); + struct ipv6hdr *ipv6h = ipv6_hdr(skb); + const __u8 *hash_location = NULL; + + rcu_read_lock(); + hash_location = tcp_parse_md5sig_option(th); + if (sk &&
sk_fullsock(sk)) { + opts->md5 = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr); + } else if (sk && sk->sk_state == TCP_TIME_WAIT) { + struct tcp_timewait_sock *tcptw = tcp_twsk(sk); + + opts->md5 = tcp_twsk_md5_key(tcptw); + } else if (sk && sk->sk_state == TCP_NEW_SYN_RECV) { + opts->md5 = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr); + } else if (hash_location) { + unsigned char newhash[16]; + struct sock *sk1; + int genhash; + + /* Active side is lost. Try to find the listening socket via the + * source port, and then find the md5 key through that socket. + * We do not lose security here: + * the incoming packet is checked against the md5 hash of the found + * key, and no RST is generated if the md5 hash doesn't match. + */ + sk1 = inet6_lookup_listener(dev_net(skb_dst(skb)->dev), + &tcp_hashinfo, NULL, 0, + &ipv6h->saddr, + th->source, &ipv6h->daddr, + ntohs(th->source), tcp_v6_iif(skb), + tcp_v6_sdif(skb)); + if (!sk1) + goto out_err; + + opts->md5 = tcp_v6_md5_do_lookup(sk1, &ipv6h->saddr); + if (!opts->md5) + goto out_err; + + genhash = tcp_v6_md5_hash_skb(newhash, opts->md5, NULL, skb); + if (genhash || memcmp(hash_location, newhash, 16) != 0) + goto out_err; + } + + if (opts->md5) + return TCPOLEN_MD5SIG_ALIGNED; + + rcu_read_unlock(); + return 0; + +out_err: + rcu_read_unlock(); + return -1; +} +EXPORT_SYMBOL_GPL(tcp_v6_md5_send_response_prepare); + +void tcp_v6_md5_send_response_write(__be32 *topt, struct sk_buff *skb, + struct tcphdr *t1, + struct tcp_out_options *opts, + const struct sock *sk) +{ + if (opts->md5) { + *topt++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) | + (TCPOPT_MD5SIG << 8) | TCPOLEN_MD5SIG); + tcp_v6_md5_hash_hdr((__u8 *)topt, opts->md5, + &ipv6_hdr(skb)->saddr, + &ipv6_hdr(skb)->daddr, t1); + + rcu_read_unlock(); + } +} +EXPORT_SYMBOL_GPL(tcp_v6_md5_send_response_write); +#endif + +struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk, + const struct sock *addr_sk) +{ + const union tcp_md5_addr *addr; + + addr = (const union tcp_md5_addr
*)&addr_sk->sk_daddr; + return tcp_md5_do_lookup(sk, addr, AF_INET); +} +EXPORT_SYMBOL(tcp_v4_md5_lookup); + +int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key, + const struct sock *sk, + const struct sk_buff *skb) +{ + struct tcp_md5sig_pool *hp; + struct ahash_request *req; + const struct tcphdr *th = tcp_hdr(skb); + __be32 saddr, daddr; + + if (sk) { /* valid for establish/request sockets */ + saddr = sk->sk_rcv_saddr; + daddr = sk->sk_daddr; + } else { + const struct iphdr *iph = ip_hdr(skb); + + saddr = iph->saddr; + daddr = iph->daddr; + } + + hp = tcp_get_md5sig_pool(); + if (!hp) + goto clear_hash_noput; + req = hp->md5_req; + + if (crypto_ahash_init(req)) + goto clear_hash; + + if (tcp_v4_md5_hash_headers(hp, daddr, saddr, th, skb->len)) + goto clear_hash; + if (tcp_md5_hash_skb_data(hp, skb, th->doff << 2)) + goto clear_hash; + if (tcp_md5_hash_key(hp, key)) + goto clear_hash; + ahash_request_set_crypt(req, NULL, md5_hash, 0); + if (crypto_ahash_final(req)) + goto clear_hash; + + tcp_put_md5sig_pool(); + return 0; + +clear_hash: + tcp_put_md5sig_pool(); +clear_hash_noput: + memset(md5_hash, 0, 16); + return 1; +} +EXPORT_SYMBOL(tcp_v4_md5_hash_skb); + +#if IS_ENABLED(CONFIG_IPV6) +int tcp_v6_md5_hash_skb(char *md5_hash, + const struct tcp_md5sig_key *key, + const struct sock *sk, + const struct sk_buff *skb) +{ + const struct in6_addr *saddr, *daddr; + struct tcp_md5sig_pool *hp; + struct ahash_request *req; + const struct tcphdr *th = tcp_hdr(skb); + + if (sk) { /* valid for establish/request sockets */ + saddr = &sk->sk_v6_rcv_saddr; + daddr = &sk->sk_v6_daddr; + } else { + const struct ipv6hdr *ip6h = ipv6_hdr(skb); + + saddr = &ip6h->saddr; + daddr = &ip6h->daddr; + } + + hp = tcp_get_md5sig_pool(); + if (!hp) + goto clear_hash_noput; + req = hp->md5_req; + + if (crypto_ahash_init(req)) + goto clear_hash; + + if (tcp_v6_md5_hash_headers(hp, daddr, saddr, th, skb->len)) + goto clear_hash; + if (tcp_md5_hash_skb_data(hp, skb, 
th->doff << 2)) + goto clear_hash; + if (tcp_md5_hash_key(hp, key)) + goto clear_hash; + ahash_request_set_crypt(req, NULL, md5_hash, 0); + if (crypto_ahash_final(req)) + goto clear_hash; + + tcp_put_md5sig_pool(); + return 0; + +clear_hash: + tcp_put_md5sig_pool(); +clear_hash_noput: + memset(md5_hash, 0, 16); + return 1; +} +EXPORT_SYMBOL_GPL(tcp_v6_md5_hash_skb); +#endif + +/* Called with rcu_read_lock() */ +bool tcp_v4_inbound_md5_hash(const struct sock *sk, + const struct sk_buff *skb) +{ + /* This gets called for each TCP segment that arrives, + * so we want to be efficient. + * We have 3 drop cases: + * o No MD5 hash and one expected. + * o MD5 hash and we're not expecting one. + * o MD5 hash and it's wrong. + */ + const __u8 *hash_location = NULL; + struct tcp_md5sig_key *hash_expected; + const struct iphdr *iph = ip_hdr(skb); + const struct tcphdr *th = tcp_hdr(skb); + int genhash; + unsigned char newhash[16]; + + hash_expected = tcp_md5_do_lookup(sk, (union tcp_md5_addr *)&iph->saddr, + AF_INET); + hash_location = tcp_parse_md5sig_option(th); + + /* We've parsed the options - do we have a hash? */ + if (!hash_expected && !hash_location) + return false; + + if (hash_expected && !hash_location) { + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND); + return true; + } + + if (!hash_expected && hash_location) { + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5UNEXPECTED); + return true; + } + + /* Okay, so this is hash_expected and hash_location - + * so we need to calculate the checksum. + */ + genhash = tcp_v4_md5_hash_skb(newhash, + hash_expected, + NULL, skb); + + if (genhash || memcmp(hash_location, newhash, 16) != 0) { + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5FAILURE); + net_info_ratelimited("MD5 Hash failed for (%pI4, %d)->(%pI4, %d)%s\n", + &iph->saddr, ntohs(th->source), + &iph->daddr, ntohs(th->dest), + genhash ?
" tcp_v4_calc_md5_hash failed" + : ""); + return true; + } + return false; +} + +#if IS_ENABLED(CONFIG_IPV6) +bool tcp_v6_inbound_md5_hash(const struct sock *sk, + const struct sk_buff *skb) +{ + const __u8 *hash_location = NULL; + struct tcp_md5sig_key *hash_expected; + const struct ipv6hdr *ip6h = ipv6_hdr(skb); + const struct tcphdr *th = tcp_hdr(skb); + int genhash; + u8 newhash[16]; + + hash_expected = tcp_v6_md5_do_lookup(sk, &ip6h->saddr); + hash_location = tcp_parse_md5sig_option(th); + + /* We've parsed the options - do we have a hash? */ + if (!hash_expected && !hash_location) + return false; + + if (hash_expected && !hash_location) { + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND); + return true; + } + + if (!hash_expected && hash_location) { + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5UNEXPECTED); + return true; + } + + /* check the signature */ + genhash = tcp_v6_md5_hash_skb(newhash, + hash_expected, + NULL, skb); + + if (genhash || memcmp(hash_location, newhash, 16) != 0) { + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5FAILURE); + net_info_ratelimited("MD5 Hash %s for [%pI6c]:%u->[%pI6c]:%u\n", + genhash ? "failed" : "mismatch", + &ip6h->saddr, ntohs(th->source), + &ip6h->daddr, ntohs(th->dest)); + return true; + } + + return false; +} +EXPORT_SYMBOL_GPL(tcp_v6_inbound_md5_hash); +#endif + +void tcp_v4_md5_destroy_sock(struct sock *sk) +{ + struct tcp_sock *tp = tcp_sk(sk); + + /* Clean up the MD5 key list, if any */ + if (tp->md5sig_info) { + tcp_clear_md5_list(sk); + kfree_rcu(rcu_dereference_protected(tp->md5sig_info, 1), rcu); + tp->md5sig_info = NULL; + } +} + +void tcp_v4_md5_syn_recv_sock(const struct sock *listener, struct sock *sk) +{ + struct inet_sock *inet = inet_sk(sk); + struct tcp_md5sig_key *key; + + /* Copy over the MD5 key from the original socket */ + key = tcp_md5_do_lookup(listener, (union tcp_md5_addr *)&inet->inet_daddr, + AF_INET); + if (key) { + /* We're using one, so create a matching key + * on the sk structure. 
If we fail to get + memory, then we end up not copying the key + across. Shucks. + */ + tcp_md5_do_add(sk, (union tcp_md5_addr *)&inet->inet_daddr, + AF_INET, 32, key->key, key->keylen, GFP_ATOMIC); + sk_nocaps_add(sk, NETIF_F_GSO_MASK); + } +} + +#if IS_ENABLED(CONFIG_IPV6) +void tcp_v6_md5_syn_recv_sock(const struct sock *listener, struct sock *sk) +{ + struct tcp_md5sig_key *key; + + /* Copy over the MD5 key from the original socket */ + key = tcp_v6_md5_do_lookup(listener, &sk->sk_v6_daddr); + if (key) { + /* We're using one, so create a matching key + * on the sk structure. If we fail to get + * memory, then we end up not copying the key + * across. Shucks. + */ + tcp_md5_do_add(sk, (union tcp_md5_addr *)&sk->sk_v6_daddr, + AF_INET6, 128, key->key, key->keylen, + sk_gfp_mask(sk, GFP_ATOMIC)); + } +} +EXPORT_SYMBOL_GPL(tcp_v6_md5_syn_recv_sock); + +struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk, + const struct sock *addr_sk) +{ + return tcp_v6_md5_do_lookup(sk, &addr_sk->sk_v6_daddr); +} +EXPORT_SYMBOL_GPL(tcp_v6_md5_lookup); +#endif + +void tcp_md5_time_wait(struct sock *sk, struct inet_timewait_sock *tw) +{ + struct tcp_timewait_sock *tcptw = tcp_twsk((struct sock *)tw); + struct tcp_sock *tp = tcp_sk(sk); + struct tcp_md5sig_key *key; + + /* The timewait bucket does not have the key DB from the + * sock structure. We just make a quick copy of the + * md5 key being used (if indeed we are using one) + * so the timewait ack generating code has the key.
+ */ + tcptw->tw_md5_key = NULL; + key = tp->af_specific->md5_lookup(sk, sk); + if (key) { + tcptw->tw_md5_key = kmemdup(key, sizeof(*key), GFP_ATOMIC); + BUG_ON(tcptw->tw_md5_key && !tcp_alloc_md5sig_pool()); + } +} + +static void tcp_diag_md5sig_fill(struct tcp_diag_md5sig *info, + const struct tcp_md5sig_key *key) +{ + info->tcpm_family = key->family; + info->tcpm_prefixlen = key->prefixlen; + info->tcpm_keylen = key->keylen; + memcpy(info->tcpm_key, key->key, key->keylen); + + if (key->family == AF_INET) + info->tcpm_addr[0] = key->addr.a4.s_addr; + #if IS_ENABLED(CONFIG_IPV6) + else if (key->family == AF_INET6) + memcpy(&info->tcpm_addr, &key->addr.a6, + sizeof(info->tcpm_addr)); + #endif +} + +static int tcp_diag_put_md5sig(struct sk_buff *skb, + const struct tcp_md5sig_info *md5sig) +{ + const struct tcp_md5sig_key *key; + struct tcp_diag_md5sig *info; + struct nlattr *attr; + int md5sig_count = 0; + + hlist_for_each_entry_rcu(key, &md5sig->head, node) + md5sig_count++; + if (md5sig_count == 0) + return 0; + + attr = nla_reserve(skb, INET_DIAG_MD5SIG, + md5sig_count * sizeof(struct tcp_diag_md5sig)); + if (!attr) + return -EMSGSIZE; + + info = nla_data(attr); + memset(info, 0, md5sig_count * sizeof(struct tcp_diag_md5sig)); + hlist_for_each_entry_rcu(key, &md5sig->head, node) { + tcp_diag_md5sig_fill(info++, key); + if (--md5sig_count == 0) + break; + } + + return 0; +} + +int tcp_md5_diag_get_aux(struct sock *sk, bool net_admin, struct sk_buff *skb) +{ + if (net_admin) { + struct tcp_md5sig_info *md5sig; + int err = 0; + + rcu_read_lock(); + md5sig = rcu_dereference(tcp_sk(sk)->md5sig_info); + if (md5sig) + err = tcp_diag_put_md5sig(skb, md5sig); + rcu_read_unlock(); + if (err < 0) + return err; + } + + return 0; +} +EXPORT_SYMBOL_GPL(tcp_md5_diag_get_aux); + +int tcp_md5_diag_get_aux_size(struct sock *sk, bool net_admin) +{ + int size = 0; + + if (net_admin && sk_fullsock(sk)) { + const struct tcp_md5sig_info *md5sig; + const struct tcp_md5sig_key *key; + 
size_t md5sig_count = 0; + + rcu_read_lock(); + md5sig = rcu_dereference(tcp_sk(sk)->md5sig_info); + if (md5sig) { + hlist_for_each_entry_rcu(key, &md5sig->head, node) + md5sig_count++; + } + rcu_read_unlock(); + size += nla_total_size(md5sig_count * + sizeof(struct tcp_diag_md5sig)); + } + + return size; +} +EXPORT_SYMBOL_GPL(tcp_md5_diag_get_aux_size); + +const struct tcp_sock_af_ops tcp_sock_ipv4_specific = { + .md5_lookup = tcp_v4_md5_lookup, + .calc_md5_hash = tcp_v4_md5_hash_skb, + .md5_parse = tcp_v4_parse_md5_keys, +}; + +#if IS_ENABLED(CONFIG_IPV6) +const struct tcp_sock_af_ops tcp_sock_ipv6_specific = { + .md5_lookup = tcp_v6_md5_lookup, + .calc_md5_hash = tcp_v6_md5_hash_skb, + .md5_parse = tcp_v6_parse_md5_keys, +}; +EXPORT_SYMBOL_GPL(tcp_sock_ipv6_specific); + +const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific = { + .md5_lookup = tcp_v4_md5_lookup, + .calc_md5_hash = tcp_v4_md5_hash_skb, + .md5_parse = tcp_v6_parse_md5_keys, +}; +EXPORT_SYMBOL_GPL(tcp_sock_ipv6_mapped_specific); +#endif diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 5e08dce49a00..072dbcebfbaf 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -22,6 +22,7 @@ #include #include #include +#include #include #include #include @@ -295,21 +296,7 @@ void tcp_time_wait(struct sock *sk, int state, int timeo) INIT_HLIST_HEAD(&tp->tcp_option_list); } #ifdef CONFIG_TCP_MD5SIG - /* - * The timewait bucket does not have the key DB from the - * sock structure. We just make a quick copy of the - * md5 key being used (if indeed we are using one) - * so the timewait ack generating code has the key. - */ - do { - struct tcp_md5sig_key *key; - tcptw->tw_md5_key = NULL; - key = tp->af_specific->md5_lookup(sk, sk); - if (key) { - tcptw->tw_md5_key = kmemdup(key, sizeof(*key), GFP_ATOMIC); - BUG_ON(tcptw->tw_md5_key && !tcp_alloc_md5sig_pool()); - } - } while (0); + tcp_md5_time_wait(sk, tw); #endif /* Get the TIME_WAIT timeout firing. 
*/ @@ -348,8 +335,7 @@ void tcp_twsk_destructor(struct sock *sk) struct tcp_timewait_sock *twsk = tcp_twsk(sk); #ifdef CONFIG_TCP_MD5SIG - if (twsk->tw_md5_key) - kfree_rcu(twsk->tw_md5_key, rcu); + tcp_md5_twsk_destructor(twsk); #endif if (unlikely(!hlist_empty(&twsk->tcp_option_list))) @@ -538,8 +524,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk, newtp->tsoffset = treq->ts_off; #ifdef CONFIG_TCP_MD5SIG newtp->md5sig_info = NULL; /*XXX*/ - if (newtp->af_specific->md5_lookup(sk, newsk)) - newtp->tcp_header_len += TCPOLEN_MD5SIG_ALIGNED; + tcp_md5_add_header_len(sk, newsk); #endif if (unlikely(!hlist_empty(&treq->tcp_option_list))) newtp->tcp_header_len += tcp_extopt_add_header(req_to_sk(req), newsk); diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 97e6aecc03eb..c7fb7a0e1610 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -42,6 +42,7 @@ #include #include #include +#include #include @@ -3243,8 +3244,7 @@ static void tcp_connect_init(struct sock *sk) tp->tcp_header_len += TCPOLEN_TSTAMP_ALIGNED; #ifdef CONFIG_TCP_MD5SIG - if (tp->af_specific->md5_lookup(sk, sk)) - tp->tcp_header_len += TCPOLEN_MD5SIG_ALIGNED; + tcp_md5_add_header_len(sk, sk); #endif if (unlikely(!hlist_empty(&tp->tcp_option_list))) diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 8c6d0362299e..26b19475d91c 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -43,6 +43,7 @@ #include #include #include +#include #include #include @@ -79,10 +80,6 @@ static int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb); static const struct inet_connection_sock_af_ops ipv6_mapped; static const struct inet_connection_sock_af_ops ipv6_specific; -#ifdef CONFIG_TCP_MD5SIG -static const struct tcp_sock_af_ops tcp_sock_ipv6_specific; -static const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific; -#endif static void inet6_sk_rx_dst_set(struct sock *sk, const struct sk_buff *skb) { @@ -500,218 +497,6 @@ static void 
tcp_v6_reqsk_destructor(struct request_sock *req) kfree_skb(inet_rsk(req)->pktopts); } -#ifdef CONFIG_TCP_MD5SIG -static struct tcp_md5sig_key *tcp_v6_md5_do_lookup(const struct sock *sk, - const struct in6_addr *addr) -{ - return tcp_md5_do_lookup(sk, (union tcp_md5_addr *)addr, AF_INET6); -} - -static struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk, - const struct sock *addr_sk) -{ - return tcp_v6_md5_do_lookup(sk, &addr_sk->sk_v6_daddr); -} - -static int tcp_v6_parse_md5_keys(struct sock *sk, int optname, - char __user *optval, int optlen) -{ - struct tcp_md5sig cmd; - struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&cmd.tcpm_addr; - u8 prefixlen; - - if (optlen < sizeof(cmd)) - return -EINVAL; - - if (copy_from_user(&cmd, optval, sizeof(cmd))) - return -EFAULT; - - if (sin6->sin6_family != AF_INET6) - return -EINVAL; - - if (optname == TCP_MD5SIG_EXT && - cmd.tcpm_flags & TCP_MD5SIG_FLAG_PREFIX) { - prefixlen = cmd.tcpm_prefixlen; - if (prefixlen > 128 || (ipv6_addr_v4mapped(&sin6->sin6_addr) && - prefixlen > 32)) - return -EINVAL; - } else { - prefixlen = ipv6_addr_v4mapped(&sin6->sin6_addr) ? 
32 : 128; - } - - if (!cmd.tcpm_keylen) { - if (ipv6_addr_v4mapped(&sin6->sin6_addr)) - return tcp_md5_do_del(sk, (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3], - AF_INET, prefixlen); - return tcp_md5_do_del(sk, (union tcp_md5_addr *)&sin6->sin6_addr, - AF_INET6, prefixlen); - } - - if (cmd.tcpm_keylen > TCP_MD5SIG_MAXKEYLEN) - return -EINVAL; - - if (ipv6_addr_v4mapped(&sin6->sin6_addr)) - return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3], - AF_INET, prefixlen, cmd.tcpm_key, - cmd.tcpm_keylen, GFP_KERNEL); - - return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin6->sin6_addr, - AF_INET6, prefixlen, cmd.tcpm_key, - cmd.tcpm_keylen, GFP_KERNEL); -} - -static int tcp_v6_md5_hash_headers(struct tcp_md5sig_pool *hp, - const struct in6_addr *daddr, - const struct in6_addr *saddr, - const struct tcphdr *th, int nbytes) -{ - struct tcp6_pseudohdr *bp; - struct scatterlist sg; - struct tcphdr *_th; - - bp = hp->scratch; - /* 1. TCP pseudo-header (RFC2460) */ - bp->saddr = *saddr; - bp->daddr = *daddr; - bp->protocol = cpu_to_be32(IPPROTO_TCP); - bp->len = cpu_to_be32(nbytes); - - _th = (struct tcphdr *)(bp + 1); - memcpy(_th, th, sizeof(*th)); - _th->check = 0; - - sg_init_one(&sg, bp, sizeof(*bp) + sizeof(*th)); - ahash_request_set_crypt(hp->md5_req, &sg, NULL, - sizeof(*bp) + sizeof(*th)); - return crypto_ahash_update(hp->md5_req); -} - -static int tcp_v6_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key, - const struct in6_addr *daddr, struct in6_addr *saddr, - const struct tcphdr *th) -{ - struct tcp_md5sig_pool *hp; - struct ahash_request *req; - - hp = tcp_get_md5sig_pool(); - if (!hp) - goto clear_hash_noput; - req = hp->md5_req; - - if (crypto_ahash_init(req)) - goto clear_hash; - if (tcp_v6_md5_hash_headers(hp, daddr, saddr, th, th->doff << 2)) - goto clear_hash; - if (tcp_md5_hash_key(hp, key)) - goto clear_hash; - ahash_request_set_crypt(req, NULL, md5_hash, 0); - if (crypto_ahash_final(req)) - goto clear_hash; - - 
tcp_put_md5sig_pool(); - return 0; - -clear_hash: - tcp_put_md5sig_pool(); -clear_hash_noput: - memset(md5_hash, 0, 16); - return 1; -} - -static int tcp_v6_md5_hash_skb(char *md5_hash, - const struct tcp_md5sig_key *key, - const struct sock *sk, - const struct sk_buff *skb) -{ - const struct in6_addr *saddr, *daddr; - struct tcp_md5sig_pool *hp; - struct ahash_request *req; - const struct tcphdr *th = tcp_hdr(skb); - - if (sk) { /* valid for establish/request sockets */ - saddr = &sk->sk_v6_rcv_saddr; - daddr = &sk->sk_v6_daddr; - } else { - const struct ipv6hdr *ip6h = ipv6_hdr(skb); - saddr = &ip6h->saddr; - daddr = &ip6h->daddr; - } - - hp = tcp_get_md5sig_pool(); - if (!hp) - goto clear_hash_noput; - req = hp->md5_req; - - if (crypto_ahash_init(req)) - goto clear_hash; - - if (tcp_v6_md5_hash_headers(hp, daddr, saddr, th, skb->len)) - goto clear_hash; - if (tcp_md5_hash_skb_data(hp, skb, th->doff << 2)) - goto clear_hash; - if (tcp_md5_hash_key(hp, key)) - goto clear_hash; - ahash_request_set_crypt(req, NULL, md5_hash, 0); - if (crypto_ahash_final(req)) - goto clear_hash; - - tcp_put_md5sig_pool(); - return 0; - -clear_hash: - tcp_put_md5sig_pool(); -clear_hash_noput: - memset(md5_hash, 0, 16); - return 1; -} - -#endif - -static bool tcp_v6_inbound_md5_hash(const struct sock *sk, - const struct sk_buff *skb) -{ -#ifdef CONFIG_TCP_MD5SIG - const __u8 *hash_location = NULL; - struct tcp_md5sig_key *hash_expected; - const struct ipv6hdr *ip6h = ipv6_hdr(skb); - const struct tcphdr *th = tcp_hdr(skb); - int genhash; - u8 newhash[16]; - - hash_expected = tcp_v6_md5_do_lookup(sk, &ip6h->saddr); - hash_location = tcp_parse_md5sig_option(th); - - /* We've parsed the options - do we have a hash? 
*/ - if (!hash_expected && !hash_location) - return false; - - if (hash_expected && !hash_location) { - NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND); - return true; - } - - if (!hash_expected && hash_location) { - NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5UNEXPECTED); - return true; - } - - /* check the signature */ - genhash = tcp_v6_md5_hash_skb(newhash, - hash_expected, - NULL, skb); - - if (genhash || memcmp(hash_location, newhash, 16) != 0) { - NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5FAILURE); - net_info_ratelimited("MD5 Hash %s for [%pI6c]:%u->[%pI6c]:%u\n", - genhash ? "failed" : "mismatch", - &ip6h->saddr, ntohs(th->source), - &ip6h->daddr, ntohs(th->dest)); - return true; - } -#endif - return false; -} - static void tcp_v6_init_req(struct request_sock *req, const struct sock *sk_listener, struct sk_buff *skb) @@ -786,56 +571,24 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 __be32 *topt; struct hlist_head *extopt_list = NULL; struct tcp_out_options extraopts; -#ifdef CONFIG_TCP_MD5SIG - struct tcp_md5sig_key *key = NULL; - const __u8 *hash_location = NULL; - struct ipv6hdr *ipv6h = ipv6_hdr(skb); -#endif + + memset(&extraopts, 0, sizeof(extraopts)); if (tsecr) tot_len += TCPOLEN_TSTAMP_ALIGNED; #ifdef CONFIG_TCP_MD5SIG - rcu_read_lock(); - hash_location = tcp_parse_md5sig_option(th); - if (sk && sk_fullsock(sk)) { - key = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr); - } else if (sk && sk->sk_state == TCP_TIME_WAIT) { - struct tcp_timewait_sock *tcptw = tcp_twsk(sk); - - key = tcp_twsk_md5_key(tcptw); - } else if (sk && sk->sk_state == TCP_NEW_SYN_RECV) { - key = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr); - } else if (hash_location) { - unsigned char newhash[16]; - struct sock *sk1 = NULL; - int genhash; - - /* active side is lost. Try to find listening socket through - * source port, and then find md5 key through listening socket. 
- * we are not loose security here: - * Incoming packet is checked with md5 hash with finding key, - * no RST generated if md5 hash doesn't match. - */ - sk1 = inet6_lookup_listener(dev_net(skb_dst(skb)->dev), - &tcp_hashinfo, NULL, 0, - &ipv6h->saddr, - th->source, &ipv6h->daddr, - ntohs(th->source), tcp_v6_iif(skb), - tcp_v6_sdif(skb)); - if (!sk1) - goto out; +{ + int ret; - key = tcp_v6_md5_do_lookup(sk1, &ipv6h->saddr); - if (!key) - goto out; + ret = tcp_v6_md5_send_response_prepare(skb, 0, + MAX_TCP_OPTION_SPACE - tot_len, + &extraopts, sk); - genhash = tcp_v6_md5_hash_skb(newhash, key, NULL, skb); - if (genhash || memcmp(hash_location, newhash, 16) != 0) - goto out; - } + if (ret == -1) + goto out; - if (key) - tot_len += TCPOLEN_MD5SIG_ALIGNED; + tot_len += ret; +} #endif if (sk) @@ -849,8 +602,6 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 if (!rst || !th->ack) extraflags |= TCPHDR_ACK; - memset(&extraopts, 0, sizeof(extraopts)); - used = tcp_extopt_response_prepare(skb, extraflags, remaining, &extraopts, sk); @@ -888,13 +639,8 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 } #ifdef CONFIG_TCP_MD5SIG - if (key) { - *topt++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) | - (TCPOPT_MD5SIG << 8) | TCPOLEN_MD5SIG); - tcp_v6_md5_hash_hdr((__u8 *)topt, key, - &ipv6_hdr(skb)->saddr, - &ipv6_hdr(skb)->daddr, t1); - } + if (extraopts.md5) + tcp_v6_md5_send_response_write(topt, skb, t1, &extraopts, sk); #endif if (unlikely(extopt_list && !hlist_empty(extopt_list))) @@ -942,10 +688,6 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 out: kfree_skb(buff); - -#ifdef CONFIG_TCP_MD5SIG - rcu_read_unlock(); -#endif } static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) @@ -1071,9 +813,6 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff * struct inet_sock *newinet; struct tcp_sock *newtp; struct sock *newsk; 
-#ifdef CONFIG_TCP_MD5SIG - struct tcp_md5sig_key *key; -#endif struct flowi6 fl6; if (skb->protocol == htons(ETH_P_IP)) { @@ -1218,18 +957,7 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff * newinet->inet_rcv_saddr = LOOPBACK4_IPV6; #ifdef CONFIG_TCP_MD5SIG - /* Copy over the MD5 key from the original socket */ - key = tcp_v6_md5_do_lookup(sk, &newsk->sk_v6_daddr); - if (key) { - /* We're using one, so create a matching key - * on the newsk structure. If we fail to get - * memory, then we end up not copying the key - * across. Shucks. - */ - tcp_md5_do_add(newsk, (union tcp_md5_addr *)&newsk->sk_v6_daddr, - AF_INET6, 128, key->key, key->keylen, - sk_gfp_mask(sk, GFP_ATOMIC)); - } + tcp_v6_md5_syn_recv_sock(sk, newsk); #endif if (__inet_inherit_port(sk, newsk) < 0) { @@ -1691,14 +1419,6 @@ static const struct inet_connection_sock_af_ops ipv6_specific = { .mtu_reduced = tcp_v6_mtu_reduced, }; -#ifdef CONFIG_TCP_MD5SIG -static const struct tcp_sock_af_ops tcp_sock_ipv6_specific = { - .md5_lookup = tcp_v6_md5_lookup, - .calc_md5_hash = tcp_v6_md5_hash_skb, - .md5_parse = tcp_v6_parse_md5_keys, -}; -#endif - /* * TCP over IPv4 via INET6 API */ @@ -1721,14 +1441,6 @@ static const struct inet_connection_sock_af_ops ipv6_mapped = { .mtu_reduced = tcp_v4_mtu_reduced, }; -#ifdef CONFIG_TCP_MD5SIG -static const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific = { - .md5_lookup = tcp_v4_md5_lookup, - .calc_md5_hash = tcp_v4_md5_hash_skb, - .md5_parse = tcp_v6_parse_md5_keys, -}; -#endif - /* NOTE: A lot of things set to zero explicitly by call to * sk_alloc() so need not be done here. 
*/

From patchwork Thu Feb 1 00:07:14 2018
X-Patchwork-Submitter: Christoph Paasch
X-Patchwork-Id: 868108
X-Patchwork-Delegate: davem@davemloft.net
From: Christoph Paasch
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Ivan Delalande
Subject: [RFC v2 12/14] tcp_md5: Use tcp_extra_options in output path
Date: Wed, 31 Jan 2018 16:07:14 -0800
Message-id: <20180201000716.69301-13-cpaasch@apple.com>
In-reply-to: <20180201000716.69301-1-cpaasch@apple.com>
References: <20180201000716.69301-1-cpaasch@apple.com>

This patch starts making use of the extra_option framework for TCP_MD5.

One tricky part is that extra_options are called at the end of
tcp_syn_options(), while TCP_MD5 is called at the beginning. TCP_MD5 is
called at the beginning because it wants to disable TCP-timestamps (for
option-space reasons). So, in the _prepare-function of the extra options
we need to undo the work that was done when enabling TCP timestamps.

Another thing to note is that in tcp_v4_send_reset (and its IPv6
counterpart), we were previously looking for the listening socket (if
sk == NULL) in case there was an MD5 signature in the TCP-option space
of the incoming packet. With the extra-option framework we can't do this
anymore, because extra-options are part of the TCP-socket's
tcp_option_list. If there is no socket, it means we can't parse the
option.
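As background for readers unfamiliar with the framework, the prepare/write split it relies on can be modeled in a few lines of user-space C. All names below (extopt_ops, extopt_store, extopt_prepare_all) are hypothetical stand-ins for the kernel types this series adds, not the actual kernel API:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Minimal user-space model of the extra-option framework: each
 * registered option first reports how many option bytes it needs
 * (prepare), and is later asked to emit them (write).
 */
struct extopt_store;

struct extopt_ops {
	uint8_t option_kind;
	unsigned int (*prepare)(unsigned int remaining,
				const struct extopt_store *store);
	uint8_t *(*write)(uint8_t *ptr, const struct extopt_store *store);
};

struct extopt_store {
	const struct extopt_ops *ops;
	struct extopt_store *next;	/* per-socket option list */
};

/* Walk the per-socket list, asking each option how much space it
 * wants; an option that does not fit contributes nothing.
 */
static unsigned int extopt_prepare_all(struct extopt_store *head,
				       unsigned int remaining)
{
	unsigned int used = 0;

	for (; head; head = head->next) {
		unsigned int len = head->ops->prepare(remaining - used, head);

		if (len <= remaining - used)
			used += len;
	}
	return used;
}

/* MD5 option: kind 19, length 18, padded with two NOPs to 20 bytes. */
static unsigned int md5_prepare(unsigned int remaining,
				const struct extopt_store *store)
{
	(void)store;
	return remaining >= 20 ? 20 : 0;	/* TCPOLEN_MD5SIG_ALIGNED */
}

static uint8_t *md5_write(uint8_t *ptr, const struct extopt_store *store)
{
	(void)store;
	*ptr++ = 1; *ptr++ = 1;		/* two NOPs for alignment */
	*ptr++ = 19; *ptr++ = 18;	/* TCPOPT_MD5SIG, TCPOLEN_MD5SIG */
	memset(ptr, 0, 16);		/* the digest would go here */
	return ptr + 16;
}
```

This mirrors the kernel flow where option space is first budgeted (tcp_established_options() and friends) and the bytes are emitted later (tcp_options_write()), which is what lets the MD5 code live behind the generic prepare/write callbacks instead of open-coded #ifdef blocks.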
This shouldn't have an impact, because when we receive a segment and
there is no established socket, we will match on the listening socket
(if it's still there). Then, when we decide to respond with a RST in
tcp_rcv_state_process, we will give tcp_v4_send_reset() the listening
socket and thus will parse the TCP_MD5 option.

Cc: Ivan Delalande
Signed-off-by: Christoph Paasch
Reviewed-by: Mat Martineau
---

Notes:
    v2: * Fix compiler warning about unused variable when RCU-debugging
          is disabled

 include/linux/tcp.h      |  10 +-
 include/linux/tcp_md5.h  |  62 -----
 net/ipv4/tcp_ipv4.c      |  55 ----
 net/ipv4/tcp_md5.c       | 695 +++++++++++++++++++++++++++++++++--------------
 net/ipv4/tcp_minisocks.c |  12 -
 net/ipv4/tcp_output.c    |  68 +----
 net/ipv6/tcp_ipv6.c      |  23 --
 7 files changed, 487 insertions(+), 438 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index ef0279194ef9..d4d22b9c19be 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -127,11 +127,11 @@ struct tcp_out_options {
 	u16 mss;		/* 0 to disable */
 	u8 ws;			/* window scale, 0 to disable */
 	u8 num_sack_blocks;	/* number of SACK blocks to include */
-	u8 hash_size;		/* bytes in hash_location */
-	__u8 *hash_location;	/* temporary pointer, overloaded */
 	__u32 tsval, tsecr;	/* need to include OPTION_TS */
 	struct tcp_fastopen_cookie *fastopen_cookie;	/* Fast open cookie */
+#ifdef CONFIG_TCP_MD5SIG
 	struct tcp_md5sig_key *md5;	/* TCP_MD5 signature key */
+#endif
 };

 /* This is the max number of SACKS that we'll generate and process.
It's safe @@ -391,9 +391,6 @@ struct tcp_sock { #ifdef CONFIG_TCP_MD5SIG /* TCP AF-Specific parts; only used by MD5 Signature support so far */ const struct tcp_sock_af_ops *af_specific; - -/* TCP MD5 Signature Option information */ - struct tcp_md5sig_info __rcu *md5sig_info; #endif /* TCP fastopen related information */ @@ -451,9 +448,6 @@ struct tcp_timewait_sock { long tw_ts_recent_stamp; struct hlist_head tcp_option_list; -#ifdef CONFIG_TCP_MD5SIG - struct tcp_md5sig_key *tw_md5_key; -#endif }; static inline struct tcp_timewait_sock *tcp_twsk(const struct sock *sk) diff --git a/include/linux/tcp_md5.h b/include/linux/tcp_md5.h index d4a2175030d0..94a29c4f6fd1 100644 --- a/include/linux/tcp_md5.h +++ b/include/linux/tcp_md5.h @@ -27,25 +27,6 @@ struct tcp_md5sig_key { struct rcu_head rcu; }; -/* - sock block */ -struct tcp_md5sig_info { - struct hlist_head head; - struct rcu_head rcu; -}; - -union tcp_md5sum_block { - struct tcp4_pseudohdr ip4; -#if IS_ENABLED(CONFIG_IPV6) - struct tcp6_pseudohdr ip6; -#endif -}; - -/* - pool: digest algorithm, hash description and scratch buffer */ -struct tcp_md5sig_pool { - struct ahash_request *md5_req; - void *scratch; -}; - extern const struct tcp_sock_af_ops tcp_sock_ipv4_specific; extern const struct tcp_sock_af_ops tcp_sock_ipv6_specific; extern const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific; @@ -57,37 +38,9 @@ int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key, struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk, const struct sock *addr_sk); -void tcp_v4_md5_destroy_sock(struct sock *sk); - -int tcp_v4_md5_send_response_prepare(struct sk_buff *skb, u8 flags, - unsigned int remaining, - struct tcp_out_options *opts, - const struct sock *sk); - -void tcp_v4_md5_send_response_write(__be32 *topt, struct sk_buff *skb, - struct tcphdr *t1, - struct tcp_out_options *opts, - const struct sock *sk); - -int tcp_v6_md5_send_response_prepare(struct sk_buff *skb, u8 flags, - unsigned 
int remaining, - struct tcp_out_options *opts, - const struct sock *sk); - -void tcp_v6_md5_send_response_write(__be32 *topt, struct sk_buff *skb, - struct tcphdr *t1, - struct tcp_out_options *opts, - const struct sock *sk); - bool tcp_v4_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb); -void tcp_v4_md5_syn_recv_sock(const struct sock *listener, struct sock *sk); - -void tcp_v6_md5_syn_recv_sock(const struct sock *listener, struct sock *sk); - -void tcp_md5_time_wait(struct sock *sk, struct inet_timewait_sock *tw); - struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk, const struct sock *addr_sk); @@ -99,21 +52,6 @@ int tcp_v6_md5_hash_skb(char *md5_hash, bool tcp_v6_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb); -static inline void tcp_md5_twsk_destructor(struct tcp_timewait_sock *twsk) -{ - if (twsk->tw_md5_key) - kfree_rcu(twsk->tw_md5_key, rcu); -} - -static inline void tcp_md5_add_header_len(const struct sock *listener, - struct sock *sk) -{ - struct tcp_sock *tp = tcp_sk(sk); - - if (tp->af_specific->md5_lookup(listener, sk)) - tp->tcp_header_len += TCPOLEN_MD5SIG_ALIGNED; -} - int tcp_md5_diag_get_aux(struct sock *sk, bool net_admin, struct sk_buff *skb); int tcp_md5_diag_get_aux_size(struct sock *sk, bool net_admin); diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 2446a4cb1749..694089b0536b 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -603,9 +603,6 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) struct ip_reply_arg arg; struct net *net; int offset = 0; -#ifdef CONFIG_TCP_MD5SIG - int ret; -#endif /* Never send a reset in response to a reset. */ if (th->rst) @@ -643,26 +640,11 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) net = sk ? 
sock_net(sk) : dev_net(skb_dst(skb)->dev); -#ifdef CONFIG_TCP_MD5SIG - ret = tcp_v4_md5_send_response_prepare(skb, 0, - MAX_TCP_OPTION_SPACE - arg.iov[0].iov_len, - &opts, sk); - - if (ret == -1) - return; - - arg.iov[0].iov_len += ret; -#endif - if (unlikely(extopt_list && !hlist_empty(extopt_list))) { unsigned int remaining; int used; remaining = sizeof(rep.opt); -#ifdef CONFIG_TCP_MD5SIG - if (opts.md5) - remaining -= TCPOLEN_MD5SIG_ALIGNED; -#endif used = tcp_extopt_response_prepare(skb, TCPHDR_RST, remaining, &opts, sk); @@ -674,9 +656,6 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) offset += used / 4; } -#ifdef CONFIG_TCP_MD5SIG - tcp_v4_md5_send_response_write(&rep.opt[offset], skb, &rep.th, &opts, sk); -#endif arg.csum = csum_tcpudp_nofold(ip_hdr(skb)->daddr, ip_hdr(skb)->saddr, /* XXX */ arg.iov[0].iov_len, IPPROTO_TCP, 0); @@ -727,9 +706,6 @@ static void tcp_v4_send_ack(const struct sock *sk, struct net *net = sock_net(sk); struct ip_reply_arg arg; int offset = 0; -#ifdef CONFIG_TCP_MD5SIG - int ret; -#endif extopt_list = tcp_extopt_get_list(sk); @@ -758,28 +734,12 @@ static void tcp_v4_send_ack(const struct sock *sk, rep.th.ack = 1; rep.th.window = htons(win); -#ifdef CONFIG_TCP_MD5SIG - ret = tcp_v4_md5_send_response_prepare(skb, 0, - MAX_TCP_OPTION_SPACE - arg.iov[0].iov_len, - &opts, sk); - - if (ret == -1) - return; - - arg.iov[0].iov_len += ret; -#endif - if (unlikely(extopt_list && !hlist_empty(extopt_list))) { unsigned int remaining; int used; remaining = sizeof(rep.th) + sizeof(rep.opt) - arg.iov[0].iov_len; -#ifdef CONFIG_TCP_MD5SIG - if (opts.md5) - remaining -= TCPOLEN_MD5SIG_ALIGNED; -#endif - memset(&opts, 0, sizeof(opts)); used = tcp_extopt_response_prepare(skb, TCPHDR_ACK, remaining, &opts, sk); @@ -792,14 +752,6 @@ static void tcp_v4_send_ack(const struct sock *sk, offset += used / 4; } -#ifdef CONFIG_TCP_MD5SIG - if (opts.md5) { - arg.iov[0].iov_len += TCPOLEN_MD5SIG_ALIGNED; - rep.th.doff = 
arg.iov[0].iov_len / 4; - } - tcp_v4_md5_send_response_write(&rep.opt[offset], skb, &rep.th, &opts, sk); -#endif - arg.flags = reply_flags; arg.csum = csum_tcpudp_nofold(ip_hdr(skb)->daddr, ip_hdr(skb)->saddr, /* XXX */ @@ -1026,10 +978,6 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb, tcp_initialize_rcv_mss(newsk); -#ifdef CONFIG_TCP_MD5SIG - tcp_v4_md5_syn_recv_sock(sk, newsk); -#endif - if (__inet_inherit_port(sk, newsk) < 0) goto put_and_exit; *own_req = inet_ehash_nolisten(newsk, req_to_sk(req_unhash)); @@ -1532,9 +1480,6 @@ void tcp_v4_destroy_sock(struct sock *sk) if (unlikely(!hlist_empty(&tp->tcp_option_list))) tcp_extopt_destroy(sk); -#ifdef CONFIG_TCP_MD5SIG - tcp_v4_md5_destroy_sock(sk); -#endif /* Clean up a referenced TCP bind bucket. */ if (inet_csk(sk)->icsk_bind_hash) diff --git a/net/ipv4/tcp_md5.c b/net/ipv4/tcp_md5.c index d50580536978..2c238c853a56 100644 --- a/net/ipv4/tcp_md5.c +++ b/net/ipv4/tcp_md5.c @@ -8,11 +8,119 @@ #include +struct tcp_md5sig_info { + struct hlist_head head; + struct rcu_head rcu; +}; + +union tcp_md5sum_block { + struct tcp4_pseudohdr ip4; +#if IS_ENABLED(CONFIG_IPV6) + struct tcp6_pseudohdr ip6; +#endif +}; + +/* - pool: digest algorithm, hash description and scratch buffer */ +struct tcp_md5sig_pool { + struct ahash_request *md5_req; + void *scratch; +}; + static DEFINE_PER_CPU(struct tcp_md5sig_pool, tcp_md5sig_pool); static DEFINE_MUTEX(tcp_md5sig_mutex); static bool tcp_md5sig_pool_populated; -#define tcp_twsk_md5_key(twsk) ((twsk)->tw_md5_key) +static unsigned int tcp_md5_extopt_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store); + +static __be32 *tcp_md5_extopt_write(__be32 *ptr, struct sk_buff *skb, + struct tcp_out_options *opts, + struct sock *sk, + struct tcp_extopt_store *store); + +static int tcp_md5_send_response_prepare(struct sk_buff *orig, u8 flags, + unsigned int 
remaining, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store); + +static __be32 *tcp_md5_send_response_write(__be32 *ptr, struct sk_buff *orig, + struct tcphdr *th, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store); + +static int tcp_md5_extopt_add_header_len(const struct sock *orig, + const struct sock *sk, + struct tcp_extopt_store *store); + +static struct tcp_extopt_store *tcp_md5_extopt_copy(struct sock *listener, + struct request_sock *req, + struct tcp_options_received *opt, + struct tcp_extopt_store *store); + +static struct tcp_extopt_store *tcp_md5_extopt_move(struct sock *from, + struct sock *to, + struct tcp_extopt_store *store); + +static void tcp_md5_extopt_destroy(struct tcp_extopt_store *store); + +struct tcp_md5_extopt { + struct tcp_extopt_store store; + struct tcp_md5sig_info __rcu *md5sig_info; + struct sock *sk; + struct rcu_head rcu; +}; + +static const struct tcp_extopt_ops tcp_md5_extra_ops = { + .option_kind = TCPOPT_MD5SIG, + .prepare = tcp_md5_extopt_prepare, + .write = tcp_md5_extopt_write, + .response_prepare = tcp_md5_send_response_prepare, + .response_write = tcp_md5_send_response_write, + .add_header_len = tcp_md5_extopt_add_header_len, + .copy = tcp_md5_extopt_copy, + .move = tcp_md5_extopt_move, + .destroy = tcp_md5_extopt_destroy, + .owner = THIS_MODULE, +}; + +static struct tcp_md5_extopt *tcp_extopt_to_md5(struct tcp_extopt_store *store) +{ + return container_of(store, struct tcp_md5_extopt, store); +} + +static struct tcp_md5_extopt *tcp_md5_opt_find(const struct sock *sk) +{ + struct tcp_extopt_store *ext_opt; + + ext_opt = tcp_extopt_find_kind(TCPOPT_MD5SIG, sk); + + return tcp_extopt_to_md5(ext_opt); +} + +static int tcp_md5_register(struct sock *sk, + struct tcp_md5_extopt *md5_opt) +{ + return tcp_register_extopt(&md5_opt->store, sk); +} + +static struct tcp_md5_extopt *tcp_md5_alloc_store(struct sock *sk) +{ + struct tcp_md5_extopt *md5_opt; 
+ + md5_opt = kzalloc(sizeof(*md5_opt), GFP_ATOMIC); + if (!md5_opt) + return NULL; + + md5_opt->store.ops = &tcp_md5_extra_ops; + md5_opt->sk = sk; + + return md5_opt; +} static void __tcp_alloc_md5sig_pool(void) { @@ -92,18 +200,17 @@ static struct tcp_md5sig_pool *tcp_get_md5sig_pool(void) return NULL; } -static struct tcp_md5sig_key *tcp_md5_do_lookup_exact(const struct sock *sk, +static struct tcp_md5sig_key *tcp_md5_do_lookup_exact(const struct tcp_md5_extopt *md5_opt, const union tcp_md5_addr *addr, int family, u8 prefixlen) { - const struct tcp_sock *tp = tcp_sk(sk); struct tcp_md5sig_key *key; unsigned int size = sizeof(struct in_addr); const struct tcp_md5sig_info *md5sig; /* caller either holds rcu_read_lock() or socket lock */ - md5sig = rcu_dereference_check(tp->md5sig_info, - lockdep_sock_is_held(sk)); + md5sig = rcu_dereference_check(md5_opt->md5sig_info, + sk_fullsock(md5_opt->sk) && lockdep_sock_is_held(md5_opt->sk)); if (!md5sig) return NULL; #if IS_ENABLED(CONFIG_IPV6) @@ -126,11 +233,26 @@ static int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr, u8 newkeylen, gfp_t gfp) { /* Add Key to the list */ - struct tcp_md5sig_key *key; - struct tcp_sock *tp = tcp_sk(sk); struct tcp_md5sig_info *md5sig; + struct tcp_md5_extopt *md5_opt; + struct tcp_md5sig_key *key; - key = tcp_md5_do_lookup_exact(sk, addr, family, prefixlen); + md5_opt = tcp_md5_opt_find(sk); + if (!md5_opt) { + int ret; + + md5_opt = tcp_md5_alloc_store(sk); + if (!md5_opt) + return -ENOMEM; + + ret = tcp_md5_register(sk, md5_opt); + if (ret) { + kfree(md5_opt); + return ret; + } + } + + key = tcp_md5_do_lookup_exact(md5_opt, addr, family, prefixlen); if (key) { /* Pre-existing entry - just update that one. 
*/ memcpy(key->key, newkey, newkeylen); @@ -138,8 +260,8 @@ static int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr, return 0; } - md5sig = rcu_dereference_protected(tp->md5sig_info, - lockdep_sock_is_held(sk)); + md5sig = rcu_dereference_protected(md5_opt->md5sig_info, + sk_fullsock(sk) && lockdep_sock_is_held(sk)); if (!md5sig) { md5sig = kmalloc(sizeof(*md5sig), gfp); if (!md5sig) @@ -147,7 +269,7 @@ static int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr, sk_nocaps_add(sk, NETIF_F_GSO_MASK); INIT_HLIST_HEAD(&md5sig->head); - rcu_assign_pointer(tp->md5sig_info, md5sig); + rcu_assign_pointer(md5_opt->md5sig_info, md5sig); } key = sock_kmalloc(sk, sizeof(*key), gfp); @@ -169,18 +291,18 @@ static int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr, return 0; } -static void tcp_clear_md5_list(struct sock *sk) +static void tcp_clear_md5_list(struct tcp_md5_extopt *md5_opt) { - struct tcp_sock *tp = tcp_sk(sk); + struct tcp_md5sig_info *md5sig; struct tcp_md5sig_key *key; struct hlist_node *n; - struct tcp_md5sig_info *md5sig; - md5sig = rcu_dereference_protected(tp->md5sig_info, 1); + md5sig = rcu_dereference_protected(md5_opt->md5sig_info, 1); hlist_for_each_entry_safe(key, n, &md5sig->head, node) { hlist_del_rcu(&key->node); - atomic_sub(sizeof(*key), &sk->sk_omem_alloc); + if (md5_opt->sk && sk_fullsock(md5_opt->sk)) + atomic_sub(sizeof(*key), &md5_opt->sk->sk_omem_alloc); kfree_rcu(key, rcu); } } @@ -188,9 +310,14 @@ static void tcp_clear_md5_list(struct sock *sk) static int tcp_md5_do_del(struct sock *sk, const union tcp_md5_addr *addr, int family, u8 prefixlen) { + struct tcp_md5_extopt *md5_opt; struct tcp_md5sig_key *key; - key = tcp_md5_do_lookup_exact(sk, addr, family, prefixlen); + md5_opt = tcp_md5_opt_find(sk); + if (!md5_opt) + return -ENOENT; + + key = tcp_md5_do_lookup_exact(md5_opt, addr, family, prefixlen); if (!key) return -ENOENT; hlist_del_rcu(&key->node); @@ -422,16 +549,20 @@ static struct 
tcp_md5sig_key *tcp_md5_do_lookup(const struct sock *sk, const union tcp_md5_addr *addr, int family) { - const struct tcp_sock *tp = tcp_sk(sk); - struct tcp_md5sig_key *key; + struct tcp_md5sig_key *best_match = NULL; const struct tcp_md5sig_info *md5sig; + struct tcp_md5_extopt *md5_opt; + struct tcp_md5sig_key *key; __be32 mask; - struct tcp_md5sig_key *best_match = NULL; bool match; + md5_opt = tcp_md5_opt_find(sk); + if (!md5_opt) + return NULL; + /* caller either holds rcu_read_lock() or socket lock */ - md5sig = rcu_dereference_check(tp->md5sig_info, - lockdep_sock_is_held(sk)); + md5sig = rcu_dereference_check(md5_opt->md5sig_info, + sk_fullsock(sk) && lockdep_sock_is_held(sk)); if (!md5sig) return NULL; @@ -539,75 +670,30 @@ static int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp, return 0; } -int tcp_v4_md5_send_response_prepare(struct sk_buff *skb, u8 flags, - unsigned int remaining, - struct tcp_out_options *opts, - const struct sock *sk) +static int tcp_v4_md5_send_response_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk) { - const struct tcphdr *th = tcp_hdr(skb); const struct iphdr *iph = ip_hdr(skb); - const __u8 *hash_location = NULL; rcu_read_lock(); - hash_location = tcp_parse_md5sig_option(th); - if (sk && sk_fullsock(sk)) { - opts->md5 = tcp_md5_do_lookup(sk, - (union tcp_md5_addr *)&iph->saddr, - AF_INET); - } else if (sk && sk->sk_state == TCP_TIME_WAIT) { - struct tcp_timewait_sock *tcptw = tcp_twsk(sk); - - opts->md5 = tcp_twsk_md5_key(tcptw); - } else if (sk && sk->sk_state == TCP_NEW_SYN_RECV) { - opts->md5 = tcp_md5_do_lookup(sk, - (union tcp_md5_addr *)&iph->saddr, - AF_INET); - } else if (hash_location) { - unsigned char newhash[16]; - struct sock *sk1; - int genhash; - - /* active side is lost. Try to find listening socket through - * source port, and then find md5 key through listening socket. 
- * we are not loose security here: - * Incoming packet is checked with md5 hash with finding key, - * no RST generated if md5 hash doesn't match. - */ - sk1 = __inet_lookup_listener(dev_net(skb_dst(skb)->dev), - &tcp_hashinfo, NULL, 0, - iph->saddr, - th->source, iph->daddr, - ntohs(th->source), inet_iif(skb), - tcp_v4_sdif(skb)); - /* don't send rst if it can't find key */ - if (!sk1) - goto out_err; - - opts->md5 = tcp_md5_do_lookup(sk1, (union tcp_md5_addr *) - &iph->saddr, AF_INET); - if (!opts->md5) - goto out_err; - - genhash = tcp_v4_md5_hash_skb(newhash, opts->md5, NULL, skb); - if (genhash || memcmp(hash_location, newhash, 16) != 0) - goto out_err; - } + opts->md5 = tcp_md5_do_lookup(sk, + (union tcp_md5_addr *)&iph->saddr, + AF_INET); if (opts->md5) + /* rcu_read_unlock() is in _response_write */ return TCPOLEN_MD5SIG_ALIGNED; rcu_read_unlock(); return 0; - -out_err: - rcu_read_unlock(); - return -1; } -void tcp_v4_md5_send_response_write(__be32 *topt, struct sk_buff *skb, - struct tcphdr *t1, - struct tcp_out_options *opts, - const struct sock *sk) +static __be32 *tcp_v4_md5_send_response_write(__be32 *topt, struct sk_buff *skb, + struct tcphdr *t1, + struct tcp_out_options *opts, + const struct sock *sk) { if (opts->md5) { *topt++ = htonl((TCPOPT_NOP << 24) | @@ -618,75 +704,39 @@ void tcp_v4_md5_send_response_write(__be32 *topt, struct sk_buff *skb, tcp_v4_md5_hash_hdr((__u8 *)topt, opts->md5, ip_hdr(skb)->saddr, ip_hdr(skb)->daddr, t1); + + topt += 4; + + /* Unlocking from _response_prepare */ rcu_read_unlock(); } + + return topt; } #if IS_ENABLED(CONFIG_IPV6) -int tcp_v6_md5_send_response_prepare(struct sk_buff *skb, u8 flags, - unsigned int remaining, - struct tcp_out_options *opts, - const struct sock *sk) +static int tcp_v6_md5_send_response_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk) { - const struct tcphdr *th = tcp_hdr(skb); struct ipv6hdr *ipv6h = ipv6_hdr(skb); - 
const __u8 *hash_location = NULL; rcu_read_lock(); - hash_location = tcp_parse_md5sig_option(th); - if (sk && sk_fullsock(sk)) { - opts->md5 = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr); - } else if (sk && sk->sk_state == TCP_TIME_WAIT) { - struct tcp_timewait_sock *tcptw = tcp_twsk(sk); - - opts->md5 = tcp_twsk_md5_key(tcptw); - } else if (sk && sk->sk_state == TCP_NEW_SYN_RECV) { - opts->md5 = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr); - } else if (hash_location) { - unsigned char newhash[16]; - struct sock *sk1; - int genhash; - - /* active side is lost. Try to find listening socket through - * source port, and then find md5 key through listening socket. - * we are not loose security here: - * Incoming packet is checked with md5 hash with finding key, - * no RST generated if md5 hash doesn't match. - */ - sk1 = inet6_lookup_listener(dev_net(skb_dst(skb)->dev), - &tcp_hashinfo, NULL, 0, - &ipv6h->saddr, - th->source, &ipv6h->daddr, - ntohs(th->source), tcp_v6_iif(skb), - tcp_v6_sdif(skb)); - if (!sk1) - goto out_err; - - opts->md5 = tcp_v6_md5_do_lookup(sk1, &ipv6h->saddr); - if (!opts->md5) - goto out_err; - - genhash = tcp_v6_md5_hash_skb(newhash, opts->md5, NULL, skb); - if (genhash || memcmp(hash_location, newhash, 16) != 0) - goto out_err; - } + opts->md5 = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr); if (opts->md5) + /* rcu_read_unlock() is in _response_write */ return TCPOLEN_MD5SIG_ALIGNED; rcu_read_unlock(); return 0; - -out_err: - rcu_read_unlock(); - return -1; } -EXPORT_SYMBOL_GPL(tcp_v6_md5_send_response_prepare); -void tcp_v6_md5_send_response_write(__be32 *topt, struct sk_buff *skb, - struct tcphdr *t1, - struct tcp_out_options *opts, - const struct sock *sk) +static __be32 *tcp_v6_md5_send_response_write(__be32 *topt, struct sk_buff *skb, + struct tcphdr *t1, + struct tcp_out_options *opts, + const struct sock *sk) { if (opts->md5) { *topt++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) | @@ -695,12 +745,45 @@ void tcp_v6_md5_send_response_write(__be32 
*topt, struct sk_buff *skb, &ipv6_hdr(skb)->saddr, &ipv6_hdr(skb)->daddr, t1); + topt += 4; + + /* Unlocking from _response_prepare */ rcu_read_unlock(); } + + return topt; } -EXPORT_SYMBOL_GPL(tcp_v6_md5_send_response_write); #endif +static int tcp_md5_send_response_prepare(struct sk_buff *orig, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store) +{ +#if IS_ENABLED(CONFIG_IPV6) + if (orig->protocol != htons(ETH_P_IP)) + return tcp_v6_md5_send_response_prepare(orig, flags, remaining, + opts, sk); + else +#endif + return tcp_v4_md5_send_response_prepare(orig, flags, remaining, + opts, sk); +} + +static __be32 *tcp_md5_send_response_write(__be32 *ptr, struct sk_buff *orig, + struct tcphdr *th, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store) +{ +#if IS_ENABLED(CONFIG_IPV6) + if (orig->protocol != htons(ETH_P_IP)) + return tcp_v6_md5_send_response_write(ptr, orig, th, opts, sk); +#endif + return tcp_v4_md5_send_response_write(ptr, orig, th, opts, sk); +} + struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk, const struct sock *addr_sk) { @@ -910,59 +993,6 @@ bool tcp_v6_inbound_md5_hash(const struct sock *sk, return false; } EXPORT_SYMBOL_GPL(tcp_v6_inbound_md5_hash); -#endif - -void tcp_v4_md5_destroy_sock(struct sock *sk) -{ - struct tcp_sock *tp = tcp_sk(sk); - - /* Clean up the MD5 key list, if any */ - if (tp->md5sig_info) { - tcp_clear_md5_list(sk); - kfree_rcu(rcu_dereference_protected(tp->md5sig_info, 1), rcu); - tp->md5sig_info = NULL; - } -} - -void tcp_v4_md5_syn_recv_sock(const struct sock *listener, struct sock *sk) -{ - struct inet_sock *inet = inet_sk(sk); - struct tcp_md5sig_key *key; - - /* Copy over the MD5 key from the original socket */ - key = tcp_md5_do_lookup(listener, (union tcp_md5_addr *)&inet->inet_daddr, - AF_INET); - if (key) { - /* We're using one, so create a matching key - * on the sk structure. 
If we fail to get - * memory, then we end up not copying the key - * across. Shucks. - */ - tcp_md5_do_add(sk, (union tcp_md5_addr *)&inet->inet_daddr, - AF_INET, 32, key->key, key->keylen, GFP_ATOMIC); - sk_nocaps_add(sk, NETIF_F_GSO_MASK); - } -} - -#if IS_ENABLED(CONFIG_IPV6) -void tcp_v6_md5_syn_recv_sock(const struct sock *listener, struct sock *sk) -{ - struct tcp_md5sig_key *key; - - /* Copy over the MD5 key from the original socket */ - key = tcp_v6_md5_do_lookup(listener, &sk->sk_v6_daddr); - if (key) { - /* We're using one, so create a matching key - * on the newsk structure. If we fail to get - * memory, then we end up not copying the key - * across. Shucks. - */ - tcp_md5_do_add(sk, (union tcp_md5_addr *)&sk->sk_v6_daddr, - AF_INET6, 128, key->key, key->keylen, - sk_gfp_mask(sk, GFP_ATOMIC)); - } -} -EXPORT_SYMBOL_GPL(tcp_v6_md5_syn_recv_sock); struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk, const struct sock *addr_sk) @@ -972,25 +1002,6 @@ struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk, EXPORT_SYMBOL_GPL(tcp_v6_md5_lookup); #endif -void tcp_md5_time_wait(struct sock *sk, struct inet_timewait_sock *tw) -{ - struct tcp_timewait_sock *tcptw = tcp_twsk((struct sock *)tw); - struct tcp_sock *tp = tcp_sk(sk); - struct tcp_md5sig_key *key; - - /* The timewait bucket does not have the key DB from the - * sock structure. We just make a quick copy of the - * md5 key being used (if indeed we are using one) - * so the timewait ack generating code has the key. 
- */ - tcptw->tw_md5_key = NULL; - key = tp->af_specific->md5_lookup(sk, sk); - if (key) { - tcptw->tw_md5_key = kmemdup(key, sizeof(*key), GFP_ATOMIC); - BUG_ON(tcptw->tw_md5_key && !tcp_alloc_md5sig_pool()); - } -} - static void tcp_diag_md5sig_fill(struct tcp_diag_md5sig *info, const struct tcp_md5sig_key *key) { @@ -1040,13 +1051,17 @@ static int tcp_diag_put_md5sig(struct sk_buff *skb, int tcp_md5_diag_get_aux(struct sock *sk, bool net_admin, struct sk_buff *skb) { if (net_admin) { + struct tcp_md5_extopt *md5_opt; struct tcp_md5sig_info *md5sig; int err = 0; rcu_read_lock(); - md5sig = rcu_dereference(tcp_sk(sk)->md5sig_info); - if (md5sig) - err = tcp_diag_put_md5sig(skb, md5sig); + md5_opt = tcp_md5_opt_find(sk); + if (md5_opt) { + md5sig = rcu_dereference(md5_opt->md5sig_info); + if (md5sig) + err = tcp_diag_put_md5sig(skb, md5sig); + } rcu_read_unlock(); if (err < 0) return err; @@ -1061,15 +1076,19 @@ int tcp_md5_diag_get_aux_size(struct sock *sk, bool net_admin) int size = 0; if (net_admin && sk_fullsock(sk)) { + struct tcp_md5_extopt *md5_opt; const struct tcp_md5sig_info *md5sig; const struct tcp_md5sig_key *key; size_t md5sig_count = 0; rcu_read_lock(); - md5sig = rcu_dereference(tcp_sk(sk)->md5sig_info); - if (md5sig) { - hlist_for_each_entry_rcu(key, &md5sig->head, node) - md5sig_count++; + md5_opt = tcp_md5_opt_find(sk); + if (md5_opt) { + md5sig = rcu_dereference(md5_opt->md5sig_info); + if (md5sig) { + hlist_for_each_entry_rcu(key, &md5sig->head, node) + md5sig_count++; + } } rcu_read_unlock(); size += nla_total_size(md5sig_count * @@ -1080,6 +1099,260 @@ int tcp_md5_diag_get_aux_size(struct sock *sk, bool net_admin) } EXPORT_SYMBOL_GPL(tcp_md5_diag_get_aux_size); +static int tcp_md5_extopt_add_header_len(const struct sock *orig, + const struct sock *sk, + struct tcp_extopt_store *store) +{ + struct tcp_sock *tp = tcp_sk(sk); + + if (tp->af_specific->md5_lookup(orig, sk)) + return TCPOLEN_MD5SIG_ALIGNED; + + return 0; +} + +static unsigned int 
tcp_md5_extopt_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store) +{ + int ret = 0; + + if (sk_fullsock(sk)) { + struct tcp_sock *tp = tcp_sk(sk); + + opts->md5 = tp->af_specific->md5_lookup(sk, sk); + } else { + struct request_sock *req = inet_reqsk(sk); + struct sock *listener = req->rsk_listener; + + /* Coming from tcp_make_synack, unlock is in + * tcp_md5_extopt_write + */ + rcu_read_lock(); + + opts->md5 = tcp_rsk(req)->af_specific->req_md5_lookup(listener, sk); + + if (!opts->md5) + rcu_read_unlock(); + } + + if (unlikely(opts->md5)) { + ret = TCPOLEN_MD5SIG_ALIGNED; + opts->options |= OPTION_MD5; + + /* Don't use TCP timestamps with TCP_MD5 */ + if ((opts->options & OPTION_TS)) { + ret -= TCPOLEN_TSTAMP_ALIGNED; + + /* When TS are enabled, Linux puts the SACK_OK + * next to the timestamp option, thus not accounting + * for its space. Here, we disable timestamps, thus + * we need to account for the space. 
+ */ + if (opts->options & OPTION_SACK_ADVERTISE) + ret += TCPOLEN_SACKPERM_ALIGNED; + } + + opts->options &= ~OPTION_TS; + opts->tsval = 0; + opts->tsecr = 0; + + if (!sk_fullsock(sk)) { + struct request_sock *req = inet_reqsk(sk); + + inet_rsk(req)->tstamp_ok = 0; + } + } + + return ret; +} + +static __be32 *tcp_md5_extopt_write(__be32 *ptr, struct sk_buff *skb, + struct tcp_out_options *opts, + struct sock *sk, + struct tcp_extopt_store *store) +{ + if (unlikely(OPTION_MD5 & opts->options)) { +#if IS_ENABLED(CONFIG_IPV6) + const struct in6_addr *addr6; + + if (sk_fullsock(sk)) { + addr6 = &sk->sk_v6_daddr; + } else { + BUG_ON(sk->sk_state != TCP_NEW_SYN_RECV); + addr6 = &inet_rsk(inet_reqsk(sk))->ir_v6_rmt_addr; + } +#endif + + *ptr++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) | + (TCPOPT_MD5SIG << 8) | TCPOLEN_MD5SIG); + + if (sk_fullsock(sk)) + sk_nocaps_add(sk, NETIF_F_GSO_MASK); + + /* Calculate the MD5 hash, as we have all we need now */ +#if IS_ENABLED(CONFIG_IPV6) + if (sk->sk_family == AF_INET6 && !ipv6_addr_v4mapped(addr6)) + tcp_v6_md5_hash_skb((__u8 *)ptr, opts->md5, sk, skb); + else +#endif + tcp_v4_md5_hash_skb((__u8 *)ptr, opts->md5, sk, skb); + + ptr += 4; + + /* Coming from tcp_make_synack */ + if (!sk_fullsock(sk)) + rcu_read_unlock(); + } + + return ptr; +} + +static struct tcp_md5_extopt *__tcp_md5_extopt_copy(struct request_sock *req, + const struct tcp_md5sig_key *key, + const union tcp_md5_addr *addr, + int family) +{ + struct tcp_md5_extopt *md5_opt = NULL; + struct tcp_md5sig_info *md5sig; + struct tcp_md5sig_key *newkey; + + md5_opt = tcp_md5_alloc_store(req_to_sk(req)); + if (!md5_opt) + goto err; + + md5sig = kmalloc(sizeof(*md5sig), GFP_ATOMIC); + if (!md5sig) + goto err_md5sig; + + INIT_HLIST_HEAD(&md5sig->head); + rcu_assign_pointer(md5_opt->md5sig_info, md5sig); + + newkey = kmalloc(sizeof(*newkey), GFP_ATOMIC); + if (!newkey) + goto err_newkey; + + memcpy(newkey->key, key->key, key->keylen); + newkey->keylen = key->keylen; + 
newkey->family = family; + newkey->prefixlen = 32; + memcpy(&newkey->addr, addr, + (family == AF_INET6) ? sizeof(struct in6_addr) : + sizeof(struct in_addr)); + hlist_add_head_rcu(&newkey->node, &md5sig->head); + + return md5_opt; + +err_newkey: + kfree(md5sig); +err_md5sig: + kfree_rcu(md5_opt, rcu); +err: + return NULL; +} + +static struct tcp_extopt_store *tcp_md5_v4_extopt_copy(const struct sock *listener, + struct request_sock *req) +{ + struct inet_request_sock *ireq = inet_rsk(req); + struct tcp_md5sig_key *key; + + /* Copy over the MD5 key from the original socket */ + key = tcp_md5_do_lookup(listener, + (union tcp_md5_addr *)&ireq->ir_rmt_addr, + AF_INET); + if (!key) + return NULL; + + return (struct tcp_extopt_store *)__tcp_md5_extopt_copy(req, key, + (union tcp_md5_addr *)&ireq->ir_rmt_addr, + AF_INET); +} + +#if IS_ENABLED(CONFIG_IPV6) +static struct tcp_extopt_store *tcp_md5_v6_extopt_copy(const struct sock *listener, + struct request_sock *req) +{ + struct inet_request_sock *ireq = inet_rsk(req); + struct tcp_md5sig_key *key; + + /* Copy over the MD5 key from the original socket */ + key = tcp_v6_md5_do_lookup(listener, &ireq->ir_v6_rmt_addr); + if (!key) + return NULL; + + return (struct tcp_extopt_store *)__tcp_md5_extopt_copy(req, key, + (union tcp_md5_addr *)&ireq->ir_v6_rmt_addr, + AF_INET6); +} +#endif + +/* We are creating a new request-socket, based on the listener's key that + * matches the IP-address. Thus, we need to create a new tcp_extopt_store, and + * store the matching key in there for the request-sock. 
+ */ +static struct tcp_extopt_store *tcp_md5_extopt_copy(struct sock *listener, + struct request_sock *req, + struct tcp_options_received *opt, + struct tcp_extopt_store *store) +{ +#if IS_ENABLED(CONFIG_IPV6) + struct inet_request_sock *ireq = inet_rsk(req); + + if (ireq->ireq_family == AF_INET6) + return tcp_md5_v6_extopt_copy(listener, req); +#endif + return tcp_md5_v4_extopt_copy(listener, req); +} + +/* Moving from a request-sock to a full socket means we need to account for + * the memory and set GSO-flags. When moving from a full socket to a time-wait + * socket we also need to adjust the memory accounting. + */ +static struct tcp_extopt_store *tcp_md5_extopt_move(struct sock *from, + struct sock *to, + struct tcp_extopt_store *store) +{ + struct tcp_md5_extopt *md5_opt = tcp_extopt_to_md5(store); + unsigned int size = sizeof(struct tcp_md5sig_key); + + if (sk_fullsock(to)) { + /* From request-sock to full socket */ + + if (size > sysctl_optmem_max || + atomic_read(&to->sk_omem_alloc) + size >= sysctl_optmem_max) { + tcp_md5_extopt_destroy(store); + return NULL; + } + + sk_nocaps_add(to, NETIF_F_GSO_MASK); + atomic_add(size, &to->sk_omem_alloc); + } else if (sk_fullsock(from)) { + /* From full socket to time-wait-socket */ + atomic_sub(size, &from->sk_omem_alloc); + } + + md5_opt->sk = to; + + return store; +} + +static void tcp_md5_extopt_destroy(struct tcp_extopt_store *store) +{ + struct tcp_md5_extopt *md5_opt = tcp_extopt_to_md5(store); + + /* Clean up the MD5 key list, if any */ + if (md5_opt) { + tcp_clear_md5_list(md5_opt); + kfree_rcu(rcu_dereference_protected(md5_opt->md5sig_info, 1), rcu); + md5_opt->md5sig_info = NULL; + + kfree_rcu(md5_opt, rcu); + } +} + const struct tcp_sock_af_ops tcp_sock_ipv4_specific = { .md5_lookup = tcp_v4_md5_lookup, .calc_md5_hash = tcp_v4_md5_hash_skb, diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 072dbcebfbaf..bf2f3ac5c167 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c
@@ -22,7 +22,6 @@ #include #include #include -#include #include #include #include @@ -295,9 +294,6 @@ void tcp_time_wait(struct sock *sk, int state, int timeo) tcp_extopt_move(sk, (struct sock *)tw); INIT_HLIST_HEAD(&tp->tcp_option_list); } -#ifdef CONFIG_TCP_MD5SIG - tcp_md5_time_wait(sk, tw); -#endif /* Get the TIME_WAIT timeout firing. */ if (timeo < rto) @@ -334,10 +330,6 @@ void tcp_twsk_destructor(struct sock *sk) { struct tcp_timewait_sock *twsk = tcp_twsk(sk); -#ifdef CONFIG_TCP_MD5SIG - tcp_md5_twsk_destructor(twsk); -#endif - if (unlikely(!hlist_empty(&twsk->tcp_option_list))) tcp_extopt_destroy(sk); } @@ -522,10 +514,6 @@ struct sock *tcp_create_openreq_child(const struct sock *sk, newtp->tcp_header_len = sizeof(struct tcphdr); } newtp->tsoffset = treq->ts_off; -#ifdef CONFIG_TCP_MD5SIG - newtp->md5sig_info = NULL; /*XXX*/ - tcp_md5_add_header_len(sk, newsk); -#endif if (unlikely(!hlist_empty(&treq->tcp_option_list))) newtp->tcp_header_len += tcp_extopt_add_header(req_to_sk(req), newsk); diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index c7fb7a0e1610..66f10af98a2e 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -42,7 +42,6 @@ #include #include #include -#include #include @@ -424,14 +423,6 @@ static void tcp_options_write(__be32 *ptr, struct sk_buff *skb, struct sock *sk, extopt_list = tcp_extopt_get_list(sk); - if (unlikely(OPTION_MD5 & options)) { - *ptr++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) | - (TCPOPT_MD5SIG << 8) | TCPOLEN_MD5SIG); - /* overload cookie hash location */ - opts->hash_location = (__u8 *)ptr; - ptr += 4; - } - if (unlikely(opts->mss)) { *ptr++ = htonl((TCPOPT_MSS << 24) | (TCPOLEN_MSS << 16) | @@ -527,14 +518,6 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb, unsigned int remaining = MAX_TCP_OPTION_SPACE; struct tcp_fastopen_request *fastopen = tp->fastopen_req; -#ifdef CONFIG_TCP_MD5SIG - opts->md5 = tp->af_specific->md5_lookup(sk, sk); - if (opts->md5) { - 
opts->options |= OPTION_MD5; - remaining -= TCPOLEN_MD5SIG_ALIGNED; - } -#endif - /* We always get an MSS option. The option bytes which will be seen in * normal data packets should timestamps be used, must be in the MSS * advertised. But we subtract them from tp->mss_cache so that @@ -547,7 +530,7 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb, opts->mss = tcp_advertise_mss(sk); remaining -= TCPOLEN_MSS_ALIGNED; - if (likely(sock_net(sk)->ipv4.sysctl_tcp_timestamps && !opts->md5)) { + if (likely(sock_net(sk)->ipv4.sysctl_tcp_timestamps)) { opts->options |= OPTION_TS; opts->tsval = tcp_skb_timestamp(skb) + tp->tsoffset; opts->tsecr = tp->rx_opt.ts_recent; @@ -596,20 +579,6 @@ static unsigned int tcp_synack_options(const struct sock *sk, struct inet_request_sock *ireq = inet_rsk(req); unsigned int remaining = MAX_TCP_OPTION_SPACE; -#ifdef CONFIG_TCP_MD5SIG - if (opts->md5) { - opts->options |= OPTION_MD5; - remaining -= TCPOLEN_MD5SIG_ALIGNED; - - /* We can't fit any SACK blocks in a packet with MD5 + TS - * options. There was discussion about disabling SACK - * rather than TS in order to fit in better with old, - * buggy kernels, but that was deemed to be unnecessary. - */ - ireq->tstamp_ok &= !ireq->sack_ok; - } -#endif - /* We always send an MSS option. 
*/ opts->mss = mss; remaining -= TCPOLEN_MSS_ALIGNED; @@ -670,16 +639,6 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb size += TCPOLEN_TSTAMP_ALIGNED; } -#ifdef CONFIG_TCP_MD5SIG - opts->md5 = tp->af_specific->md5_lookup(sk, sk); - if (unlikely(opts->md5)) { - opts->options |= OPTION_MD5; - size += TCPOLEN_MD5SIG_ALIGNED; - } -#else - opts->md5 = NULL; -#endif - if (unlikely(!hlist_empty(&tp->tcp_option_list))) size += tcp_extopt_prepare(skb, 0, MAX_TCP_OPTION_SPACE - size, opts, tcp_to_sk(tp)); @@ -1082,14 +1041,6 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it, th->window = htons(min(tp->rcv_wnd, 65535U)); } tcp_options_write((__be32 *)(th + 1), skb, sk, &opts); -#ifdef CONFIG_TCP_MD5SIG - /* Calculate the MD5 hash, as we have all we need now */ - if (opts.md5) { - sk_nocaps_add(sk, NETIF_F_GSO_MASK); - tp->af_specific->calc_md5_hash(opts.hash_location, - opts.md5, sk, skb); - } -#endif icsk->icsk_af_ops->send_check(sk, skb); @@ -3164,10 +3115,6 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst, #endif skb->skb_mstamp = tcp_clock_us(); -#ifdef CONFIG_TCP_MD5SIG - rcu_read_lock(); - opts.md5 = tcp_rsk(req)->af_specific->req_md5_lookup(sk, req_to_sk(req)); -#endif skb_set_hash(skb, tcp_rsk(req)->txhash, PKT_HASH_TYPE_L4); tcp_header_size = tcp_synack_options(sk, req, mss, skb, &opts, foc) + sizeof(*th); @@ -3194,15 +3141,6 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst, tcp_options_write((__be32 *)(th + 1), skb, req_to_sk(req), &opts); __TCP_INC_STATS(sock_net(sk), TCP_MIB_OUTSEGS); -#ifdef CONFIG_TCP_MD5SIG - /* Okay, we have all we need - do the md5 hash if needed */ - if (opts.md5) - tcp_rsk(req)->af_specific->calc_md5_hash(opts.hash_location, - opts.md5, - req_to_sk(req), skb); - rcu_read_unlock(); -#endif - /* Do not fool tcpdump (if any), clean our debris */ skb->tstamp = 0; return skb; @@ -3243,10 +3181,6 @@ static void 
tcp_connect_init(struct sock *sk) if (sock_net(sk)->ipv4.sysctl_tcp_timestamps) tp->tcp_header_len += TCPOLEN_TSTAMP_ALIGNED; -#ifdef CONFIG_TCP_MD5SIG - tcp_md5_add_header_len(sk, sk); -#endif - if (unlikely(!hlist_empty(&tp->tcp_option_list))) tp->tcp_header_len += tcp_extopt_add_header(sk, sk); diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 26b19475d91c..3c48283a76c1 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -576,20 +576,6 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 if (tsecr) tot_len += TCPOLEN_TSTAMP_ALIGNED; -#ifdef CONFIG_TCP_MD5SIG -{ - int ret; - - ret = tcp_v6_md5_send_response_prepare(skb, 0, - MAX_TCP_OPTION_SPACE - tot_len, - &extraopts, sk); - - if (ret == -1) - goto out; - - tot_len += ret; -} -#endif if (sk) extopt_list = tcp_extopt_get_list(sk); @@ -638,11 +624,6 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 *topt++ = htonl(tsecr); } -#ifdef CONFIG_TCP_MD5SIG - if (extraopts.md5) - tcp_v6_md5_send_response_write(topt, skb, t1, &extraopts, sk); -#endif - if (unlikely(extopt_list && !hlist_empty(extopt_list))) tcp_extopt_response_write(topt, skb, t1, &extraopts, sk); @@ -956,10 +937,6 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff * newinet->inet_daddr = newinet->inet_saddr = LOOPBACK4_IPV6; newinet->inet_rcv_saddr = LOOPBACK4_IPV6; -#ifdef CONFIG_TCP_MD5SIG - tcp_v6_md5_syn_recv_sock(sk, newsk); -#endif - if (__inet_inherit_port(sk, newsk) < 0) { inet_csk_prepare_forced_close(newsk); tcp_done(newsk); From patchwork Thu Feb 1 00:07:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Paasch X-Patchwork-Id: 868114 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) 
From: Christoph Paasch
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Ivan Delalande
Subject: [RFC v2 13/14] tcp_md5: Cleanup TCP-code
Date: Wed, 31 Jan 2018 16:07:15 -0800
Message-id: <20180201000716.69301-14-cpaasch@apple.com>
X-Mailer: git-send-email 2.16.1
In-reply-to: <20180201000716.69301-1-cpaasch@apple.com>
References: <20180201000716.69301-1-cpaasch@apple.com>
Now that we have consolidated the TCP_MD5 output path, we can clean up TCP and its callbacks into the MD5 code. These callbacks are solely there to handle the different address families (v4, v6 and v4-mapped). Now that we have isolated the TCP_MD5 code, it is acceptable to add a bit more complexity inside tcp_md5.c to handle these address families, with the benefit of getting rid of these callbacks in tcp_sock, together with their assignments in tcp_v4/6_connect,... Cc: Ivan Delalande Signed-off-by: Christoph Paasch Reviewed-by: Mat Martineau --- include/linux/tcp.h | 5 - include/linux/tcp_md5.h | 18 +-- include/net/tcp.h | 24 ---- net/ipv4/tcp.c | 2 +- net/ipv4/tcp_ipv4.c | 8 -- net/ipv4/tcp_md5.c | 340 ++++++++++++++++++++++-------------------- net/ipv6/tcp_ipv6.c | 17 --- 7 files changed, 155 insertions(+), 259 deletions(-) diff --git a/include/linux/tcp.h b/include/linux/tcp.h index d4d22b9c19be..36f9bedeb6b1 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -388,11 +388,6 @@ struct tcp_sock { * while socket was owned by user.
  */
-#ifdef CONFIG_TCP_MD5SIG
-/* TCP AF-Specific parts; only used by MD5 Signature support so far */
-	const struct tcp_sock_af_ops	*af_specific;
-#endif
-
 	/* TCP fastopen related information */
 	struct tcp_fastopen_request *fastopen_req;
 	/* fastopen_rsk points to request_sock that resulted in this big
diff --git a/include/linux/tcp_md5.h b/include/linux/tcp_md5.h
index 94a29c4f6fd1..441be65ec893 100644
--- a/include/linux/tcp_md5.h
+++ b/include/linux/tcp_md5.h
@@ -27,28 +27,14 @@ struct tcp_md5sig_key {
 	struct rcu_head rcu;
 };
 
-extern const struct tcp_sock_af_ops tcp_sock_ipv4_specific;
-extern const struct tcp_sock_af_ops tcp_sock_ipv6_specific;
-extern const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific;
-
 /* - functions */
-int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key,
-			const struct sock *sk, const struct sk_buff *skb);
-struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk,
-					 const struct sock *addr_sk);
+int tcp_md5_parse_keys(struct sock *sk, int optname, char __user *optval,
+		       int optlen);
 
 bool tcp_v4_inbound_md5_hash(const struct sock *sk,
 			     const struct sk_buff *skb);
 
-struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk,
-					 const struct sock *addr_sk);
-
-int tcp_v6_md5_hash_skb(char *md5_hash,
-			const struct tcp_md5sig_key *key,
-			const struct sock *sk,
-			const struct sk_buff *skb);
-
 bool tcp_v6_inbound_md5_hash(const struct sock *sk,
 			     const struct sk_buff *skb);
diff --git a/include/net/tcp.h b/include/net/tcp.h
index d2738cb01cf2..ceb8ac1e17bd 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1730,32 +1730,8 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 		     const struct tcp_request_sock_ops *af_ops,
 		     struct sock *sk, struct sk_buff *skb);
 
-/* TCP af-specific functions */
-struct tcp_sock_af_ops {
-#ifdef CONFIG_TCP_MD5SIG
-	struct tcp_md5sig_key	*(*md5_lookup) (const struct sock *sk,
-						const struct sock *addr_sk);
-	int		(*calc_md5_hash)(char *location,
-					 const struct tcp_md5sig_key *md5,
-					 const struct sock *sk,
-					 const struct sk_buff *skb);
-	int		(*md5_parse)(struct sock *sk,
-				     int optname,
-				     char __user *optval,
-				     int optlen);
-#endif
-};
-
 struct tcp_request_sock_ops {
 	u16 mss_clamp;
-#ifdef CONFIG_TCP_MD5SIG
-	struct tcp_md5sig_key *(*req_md5_lookup)(const struct sock *sk,
-						 const struct sock *addr_sk);
-	int		(*calc_md5_hash) (char *location,
-					  const struct tcp_md5sig_key *md5,
-					  const struct sock *sk,
-					  const struct sk_buff *skb);
-#endif
 	void (*init_req)(struct request_sock *req,
 			 const struct sock *sk_listener,
 			 struct sk_buff *skb);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index fc5c9cb19b9b..ffff795f2dfb 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2828,7 +2828,7 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
 	case TCP_MD5SIG:
 	case TCP_MD5SIG_EXT:
 		/* Read the IP->Key mappings from userspace */
-		err = tp->af_specific->md5_parse(sk, optname, optval, optlen);
+		err = tcp_md5_parse_keys(sk, optname, optval, optlen);
 		break;
 #endif
 	case TCP_USER_TIMEOUT:
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 694089b0536b..6a839c1280b3 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -889,10 +889,6 @@ struct request_sock_ops tcp_request_sock_ops __read_mostly = {
 static const struct tcp_request_sock_ops tcp_request_sock_ipv4_ops = {
 	.mss_clamp	=	TCP_MSS_DEFAULT,
-#ifdef CONFIG_TCP_MD5SIG
-	.req_md5_lookup	=	tcp_v4_md5_lookup,
-	.calc_md5_hash	=	tcp_v4_md5_hash_skb,
-#endif
 	.init_req	=	tcp_v4_init_req,
 #ifdef CONFIG_SYN_COOKIES
 	.cookie_init_seq =	cookie_v4_init_sequence,
@@ -1450,10 +1446,6 @@ static int tcp_v4_init_sock(struct sock *sk)
 
 	icsk->icsk_af_ops = &ipv4_specific;
 
-#ifdef CONFIG_TCP_MD5SIG
-	tcp_sk(sk)->af_specific = &tcp_sock_ipv4_specific;
-#endif
-
 	return 0;
 }
 
diff --git a/net/ipv4/tcp_md5.c b/net/ipv4/tcp_md5.c
index 2c238c853a56..e05db5af06ee 100644
--- a/net/ipv4/tcp_md5.c
+++ b/net/ipv4/tcp_md5.c
@@ -336,12 +336,13 @@ static int tcp_md5_hash_key(struct tcp_md5sig_pool *hp,
 	return crypto_ahash_update(hp->md5_req);
 }
 
-static int tcp_v4_parse_md5_keys(struct sock *sk, int optname,
-				 char __user *optval, int optlen)
+int tcp_md5_parse_keys(struct sock *sk, int optname, char __user *optval,
+		       int optlen)
 {
+	u8 prefixlen = 32, maxprefixlen;
+	union tcp_md5_addr *tcpmd5addr;
 	struct tcp_md5sig cmd;
-	struct sockaddr_in *sin = (struct sockaddr_in *)&cmd.tcpm_addr;
-	u8 prefixlen = 32;
+	unsigned short family;
 
 	if (optlen < sizeof(cmd))
 		return -EINVAL;
@@ -349,76 +350,48 @@ static int tcp_v4_parse_md5_keys(struct sock *sk, int optname,
 	if (copy_from_user(&cmd, optval, sizeof(cmd)))
 		return -EFAULT;
 
-	if (sin->sin_family != AF_INET)
-		return -EINVAL;
-
-	if (optname == TCP_MD5SIG_EXT &&
-	    cmd.tcpm_flags & TCP_MD5SIG_FLAG_PREFIX) {
-		prefixlen = cmd.tcpm_prefixlen;
-		if (prefixlen > 32)
-			return -EINVAL;
-	}
+	family = cmd.tcpm_addr.ss_family;
 
-	if (!cmd.tcpm_keylen)
-		return tcp_md5_do_del(sk, (union tcp_md5_addr *)&sin->sin_addr.s_addr,
-				      AF_INET, prefixlen);
-
-	if (cmd.tcpm_keylen > TCP_MD5SIG_MAXKEYLEN)
+	if (family != AF_INET && family != AF_INET6)
 		return -EINVAL;
 
-	return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin->sin_addr.s_addr,
-			      AF_INET, prefixlen, cmd.tcpm_key, cmd.tcpm_keylen,
-			      GFP_KERNEL);
-}
-
-#if IS_ENABLED(CONFIG_IPV6)
-static int tcp_v6_parse_md5_keys(struct sock *sk, int optname,
-				 char __user *optval, int optlen)
-{
-	struct tcp_md5sig cmd;
-	struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&cmd.tcpm_addr;
-	u8 prefixlen;
-
-	if (optlen < sizeof(cmd))
+	if (sk->sk_family != family)
 		return -EINVAL;
 
-	if (copy_from_user(&cmd, optval, sizeof(cmd)))
-		return -EFAULT;
+	if (family == AF_INET6) {
+		struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&cmd.tcpm_addr;
 
-	if (sin6->sin6_family != AF_INET6)
-		return -EINVAL;
+		if (!ipv6_addr_v4mapped(&sin6->sin6_addr)) {
+			tcpmd5addr = (union tcp_md5_addr *)&sin6->sin6_addr;
+			maxprefixlen = 128;
+		} else {
+			tcpmd5addr = (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3];
+			family = AF_INET;
+			maxprefixlen = 32;
+		}
+	} else {
+		struct sockaddr_in *sin = (struct sockaddr_in *)&cmd.tcpm_addr;
+
+		tcpmd5addr = (union tcp_md5_addr *)&sin->sin_addr;
+		maxprefixlen = 32;
+	}
 
 	if (optname == TCP_MD5SIG_EXT &&
 	    cmd.tcpm_flags & TCP_MD5SIG_FLAG_PREFIX) {
 		prefixlen = cmd.tcpm_prefixlen;
-		if (prefixlen > 128 || (ipv6_addr_v4mapped(&sin6->sin6_addr) &&
-					prefixlen > 32))
+		if (prefixlen > maxprefixlen)
 			return -EINVAL;
-	} else {
-		prefixlen = ipv6_addr_v4mapped(&sin6->sin6_addr) ? 32 : 128;
 	}
 
-	if (!cmd.tcpm_keylen) {
-		if (ipv6_addr_v4mapped(&sin6->sin6_addr))
-			return tcp_md5_do_del(sk, (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3],
-					      AF_INET, prefixlen);
-		return tcp_md5_do_del(sk, (union tcp_md5_addr *)&sin6->sin6_addr,
-				      AF_INET6, prefixlen);
-	}
+	if (!cmd.tcpm_keylen)
+		return tcp_md5_do_del(sk, tcpmd5addr, family, prefixlen);
 
 	if (cmd.tcpm_keylen > TCP_MD5SIG_MAXKEYLEN)
 		return -EINVAL;
 
-	if (ipv6_addr_v4mapped(&sin6->sin6_addr))
-		return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3],
-				      AF_INET, prefixlen, cmd.tcpm_key,
-				      cmd.tcpm_keylen, GFP_KERNEL);
-
-	return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin6->sin6_addr,
-			      AF_INET6, prefixlen, cmd.tcpm_key,
+	return tcp_md5_do_add(sk, tcpmd5addr, family, prefixlen, cmd.tcpm_key,
 			      cmd.tcpm_keylen, GFP_KERNEL);
 }
-#endif
 
 static int tcp_v4_md5_hash_headers(struct tcp_md5sig_pool *hp,
 				   __be32 daddr, __be32 saddr,
@@ -670,6 +643,102 @@ static int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp,
 	return 0;
 }
 
+static int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key,
+			       const struct sock *sk, const struct sk_buff *skb)
+{
+	struct tcp_md5sig_pool *hp;
+	struct ahash_request *req;
+	const struct tcphdr *th = tcp_hdr(skb);
+	__be32 saddr, daddr;
+
+	if (sk) { /* valid for establish/request sockets */
+		saddr = sk->sk_rcv_saddr;
+		daddr = sk->sk_daddr;
+	} else {
+		const struct iphdr *iph = ip_hdr(skb);
+
+		saddr = iph->saddr;
+		daddr = iph->daddr;
+	}
+
+	hp = tcp_get_md5sig_pool();
+	if (!hp)
+		goto clear_hash_noput;
+	req = hp->md5_req;
+
+	if (crypto_ahash_init(req))
+		goto clear_hash;
+
+	if (tcp_v4_md5_hash_headers(hp, daddr, saddr, th, skb->len))
+		goto clear_hash;
+	if (tcp_md5_hash_skb_data(hp, skb, th->doff << 2))
+		goto clear_hash;
+	if (tcp_md5_hash_key(hp, key))
+		goto clear_hash;
+	ahash_request_set_crypt(req, NULL, md5_hash, 0);
+	if (crypto_ahash_final(req))
+		goto clear_hash;
+
+	tcp_put_md5sig_pool();
+	return 0;
+
+clear_hash:
+	tcp_put_md5sig_pool();
+clear_hash_noput:
+	memset(md5_hash, 0, 16);
+	return 1;
+}
+
+#if IS_ENABLED(CONFIG_IPV6)
+static int tcp_v6_md5_hash_skb(char *md5_hash,
+			       const struct tcp_md5sig_key *key,
+			       const struct sock *sk,
+			       const struct sk_buff *skb)
+{
+	const struct in6_addr *saddr, *daddr;
+	struct tcp_md5sig_pool *hp;
+	struct ahash_request *req;
+	const struct tcphdr *th = tcp_hdr(skb);
+
+	if (sk) { /* valid for establish/request sockets */
+		saddr = &sk->sk_v6_rcv_saddr;
+		daddr = &sk->sk_v6_daddr;
+	} else {
+		const struct ipv6hdr *ip6h = ipv6_hdr(skb);
+
+		saddr = &ip6h->saddr;
+		daddr = &ip6h->daddr;
+	}
+
+	hp = tcp_get_md5sig_pool();
+	if (!hp)
+		goto clear_hash_noput;
+	req = hp->md5_req;
+
+	if (crypto_ahash_init(req))
+		goto clear_hash;
+
+	if (tcp_v6_md5_hash_headers(hp, daddr, saddr, th, skb->len))
+		goto clear_hash;
+	if (tcp_md5_hash_skb_data(hp, skb, th->doff << 2))
+		goto clear_hash;
+	if (tcp_md5_hash_key(hp, key))
+		goto clear_hash;
+	ahash_request_set_crypt(req, NULL, md5_hash, 0);
+	if (crypto_ahash_final(req))
+		goto clear_hash;
+
+	tcp_put_md5sig_pool();
+	return 0;
+
+clear_hash:
+	tcp_put_md5sig_pool();
+clear_hash_noput:
+	memset(md5_hash, 0, 16);
+	return 1;
+}
+#endif
+
 static int tcp_v4_md5_send_response_prepare(struct sk_buff *skb, u8 flags,
 					    unsigned int remaining,
 					    struct tcp_out_options *opts,
@@ -784,114 +853,14 @@ static __be32 *tcp_md5_send_response_write(__be32 *ptr, struct sk_buff *orig,
 	return tcp_v4_md5_send_response_write(ptr, orig, th, opts, sk);
 }
 
-struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk,
-					 const struct sock *addr_sk)
+static struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk,
+						const struct sock *addr_sk)
 {
 	const union tcp_md5_addr *addr;
 
 	addr = (const union tcp_md5_addr *)&addr_sk->sk_daddr;
 	return tcp_md5_do_lookup(sk, addr, AF_INET);
 }
-EXPORT_SYMBOL(tcp_v4_md5_lookup);
-
-int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key,
-			const struct sock *sk,
-			const struct sk_buff *skb)
-{
-	struct tcp_md5sig_pool *hp;
-	struct ahash_request *req;
-	const struct tcphdr *th = tcp_hdr(skb);
-	__be32 saddr, daddr;
-
-	if (sk) { /* valid for establish/request sockets */
-		saddr = sk->sk_rcv_saddr;
-		daddr = sk->sk_daddr;
-	} else {
-		const struct iphdr *iph = ip_hdr(skb);
-
-		saddr = iph->saddr;
-		daddr = iph->daddr;
-	}
-
-	hp = tcp_get_md5sig_pool();
-	if (!hp)
-		goto clear_hash_noput;
-	req = hp->md5_req;
-
-	if (crypto_ahash_init(req))
-		goto clear_hash;
-
-	if (tcp_v4_md5_hash_headers(hp, daddr, saddr, th, skb->len))
-		goto clear_hash;
-	if (tcp_md5_hash_skb_data(hp, skb, th->doff << 2))
-		goto clear_hash;
-	if (tcp_md5_hash_key(hp, key))
-		goto clear_hash;
-	ahash_request_set_crypt(req, NULL, md5_hash, 0);
-	if (crypto_ahash_final(req))
-		goto clear_hash;
-
-	tcp_put_md5sig_pool();
-	return 0;
-
-clear_hash:
-	tcp_put_md5sig_pool();
-clear_hash_noput:
-	memset(md5_hash, 0, 16);
-	return 1;
-}
-EXPORT_SYMBOL(tcp_v4_md5_hash_skb);
-
-#if IS_ENABLED(CONFIG_IPV6)
-int tcp_v6_md5_hash_skb(char *md5_hash,
-			const struct tcp_md5sig_key *key,
-			const struct sock *sk,
-			const struct sk_buff *skb)
-{
-	const struct in6_addr *saddr, *daddr;
-	struct tcp_md5sig_pool *hp;
-	struct ahash_request *req;
-	const struct tcphdr *th = tcp_hdr(skb);
-
-	if (sk) { /* valid for establish/request sockets */
-		saddr = &sk->sk_v6_rcv_saddr;
-		daddr = &sk->sk_v6_daddr;
-	} else {
-		const struct ipv6hdr *ip6h = ipv6_hdr(skb);
-
-		saddr = &ip6h->saddr;
-		daddr = &ip6h->daddr;
-	}
-
-	hp = tcp_get_md5sig_pool();
-	if (!hp)
-		goto clear_hash_noput;
-	req = hp->md5_req;
-
-	if (crypto_ahash_init(req))
-		goto clear_hash;
-
-	if (tcp_v6_md5_hash_headers(hp, daddr, saddr, th, skb->len))
-		goto clear_hash;
-	if (tcp_md5_hash_skb_data(hp, skb, th->doff << 2))
-		goto clear_hash;
-	if (tcp_md5_hash_key(hp, key))
-		goto clear_hash;
-	ahash_request_set_crypt(req, NULL, md5_hash, 0);
-	if (crypto_ahash_final(req))
-		goto clear_hash;
-
-	tcp_put_md5sig_pool();
-	return 0;
-
-clear_hash:
-	tcp_put_md5sig_pool();
-clear_hash_noput:
-	memset(md5_hash, 0, 16);
-	return 1;
-}
-EXPORT_SYMBOL_GPL(tcp_v6_md5_hash_skb);
-#endif
 
 /* Called with rcu_read_lock() */
 bool tcp_v4_inbound_md5_hash(const struct sock *sk,
@@ -994,8 +963,8 @@ bool tcp_v6_inbound_md5_hash(const struct sock *sk,
 }
 EXPORT_SYMBOL_GPL(tcp_v6_inbound_md5_hash);
 
-struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk,
-					 const struct sock *addr_sk)
+static struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk,
+						const struct sock *addr_sk)
 {
 	return tcp_v6_md5_do_lookup(sk, &addr_sk->sk_v6_daddr);
 }
@@ -1103,10 +1072,17 @@ static int tcp_md5_extopt_add_header_len(const struct sock *orig,
 					 const struct sock *sk,
 					 struct tcp_extopt_store *store)
 {
-	struct tcp_sock *tp = tcp_sk(sk);
-
-	if (tp->af_specific->md5_lookup(orig, sk))
+#if IS_ENABLED(CONFIG_IPV6)
+	if (sk->sk_family == AF_INET6 &&
+	    !ipv6_addr_v4mapped(&sk->sk_v6_daddr)) {
+		if (tcp_v6_md5_lookup(orig, sk))
+			return TCPOLEN_MD5SIG_ALIGNED;
+	} else
+#endif
+	{
+		if (tcp_v4_md5_lookup(orig, sk))
 			return TCPOLEN_MD5SIG_ALIGNED;
+	}
 
 	return 0;
 }
@@ -1120,19 +1096,29 @@ static unsigned int tcp_md5_extopt_prepare(struct sk_buff *skb, u8 flags,
 	int ret = 0;
 
 	if (sk_fullsock(sk)) {
-		struct tcp_sock *tp = tcp_sk(sk);
-
-		opts->md5 = tp->af_specific->md5_lookup(sk, sk);
+#if IS_ENABLED(CONFIG_IPV6)
+		if (sk->sk_family == AF_INET6 && !ipv6_addr_v4mapped(&sk->sk_v6_daddr))
+			opts->md5 = tcp_v6_md5_lookup(sk, sk);
+		else
+#endif
+			opts->md5 = tcp_v4_md5_lookup(sk, sk);
 	} else {
 		struct request_sock *req = inet_reqsk(sk);
 		struct sock *listener = req->rsk_listener;
+		struct inet_request_sock *ireq = inet_rsk(req);
 
 		/* Coming from tcp_make_synack, unlock is in
 		 * tcp_md5_extopt_write
 		 */
 		rcu_read_lock();
-		opts->md5 = tcp_rsk(req)->af_specific->req_md5_lookup(listener, sk);
+#if IS_ENABLED(CONFIG_IPV6)
+		if (ireq->ireq_family == AF_INET6 &&
+		    !ipv6_addr_v4mapped(&ireq->ir_v6_rmt_addr))
+			opts->md5 = tcp_v6_md5_lookup(listener, sk);
+		else
+#endif
+			opts->md5 = tcp_v4_md5_lookup(listener, sk);
 
 		if (!opts->md5)
 			rcu_read_unlock();
@@ -1352,25 +1338,3 @@ static void tcp_md5_extopt_destroy(struct tcp_extopt_store *store)
 		kfree_rcu(md5_opt, rcu);
 	}
 }
-
-const struct tcp_sock_af_ops tcp_sock_ipv4_specific = {
-	.md5_lookup	= tcp_v4_md5_lookup,
-	.calc_md5_hash	= tcp_v4_md5_hash_skb,
-	.md5_parse	= tcp_v4_parse_md5_keys,
-};
-
-#if IS_ENABLED(CONFIG_IPV6)
-const struct tcp_sock_af_ops tcp_sock_ipv6_specific = {
-	.md5_lookup	= tcp_v6_md5_lookup,
-	.calc_md5_hash	= tcp_v6_md5_hash_skb,
-	.md5_parse	= tcp_v6_parse_md5_keys,
-};
-EXPORT_SYMBOL_GPL(tcp_sock_ipv6_specific);
-
-const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific = {
-	.md5_lookup	= tcp_v4_md5_lookup,
-	.calc_md5_hash	= tcp_v4_md5_hash_skb,
-	.md5_parse	= tcp_v6_parse_md5_keys,
-};
-EXPORT_SYMBOL_GPL(tcp_sock_ipv6_mapped_specific);
-#endif
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 3c48283a76c1..8800e5d75677 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -207,9 +207,6 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 		icsk->icsk_af_ops = &ipv6_mapped;
 		sk->sk_backlog_rcv = tcp_v4_do_rcv;
-#ifdef CONFIG_TCP_MD5SIG
-		tp->af_specific = &tcp_sock_ipv6_mapped_specific;
-#endif
 
 		err = tcp_v4_connect(sk, (struct sockaddr *)&sin, sizeof(sin));
 
@@ -217,9 +214,6 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 			icsk->icsk_ext_hdr_len = exthdrlen;
 			icsk->icsk_af_ops = &ipv6_specific;
 			sk->sk_backlog_rcv = tcp_v6_do_rcv;
-#ifdef CONFIG_TCP_MD5SIG
-			tp->af_specific = &tcp_sock_ipv6_specific;
-#endif
 			goto failure;
 		}
 		np->saddr = sk->sk_v6_rcv_saddr;
@@ -542,10 +536,6 @@ struct request_sock_ops tcp6_request_sock_ops __read_mostly = {
 static const struct tcp_request_sock_ops tcp_request_sock_ipv6_ops = {
 	.mss_clamp	=	IPV6_MIN_MTU - sizeof(struct tcphdr) -
				sizeof(struct ipv6hdr),
-#ifdef CONFIG_TCP_MD5SIG
-	.req_md5_lookup	=	tcp_v6_md5_lookup,
-	.calc_md5_hash	=	tcp_v6_md5_hash_skb,
-#endif
 	.init_req	=	tcp_v6_init_req,
 #ifdef CONFIG_SYN_COOKIES
 	.cookie_init_seq =	cookie_v6_init_sequence,
@@ -820,9 +810,6 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
 		inet_csk(newsk)->icsk_af_ops = &ipv6_mapped;
 		newsk->sk_backlog_rcv = tcp_v4_do_rcv;
-#ifdef CONFIG_TCP_MD5SIG
-		newtp->af_specific = &tcp_sock_ipv6_mapped_specific;
-#endif
 
 		newnp->ipv6_mc_list = NULL;
 		newnp->ipv6_ac_list = NULL;
@@ -1429,10 +1416,6 @@ static int tcp_v6_init_sock(struct sock *sk)
 
 	icsk->icsk_af_ops = &ipv6_specific;
 
-#ifdef CONFIG_TCP_MD5SIG
-	tcp_sk(sk)->af_specific = &tcp_sock_ipv6_specific;
-#endif
-
 	return 0;
 }

From patchwork Thu Feb  1 00:07:16 2018
From: Christoph Paasch
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Ivan Delalande
Subject: [RFC v2 14/14] tcp_md5: Use TCP extra-options on the input path
Date: Wed, 31 Jan 2018 16:07:16 -0800
Message-id: <20180201000716.69301-15-cpaasch@apple.com>
X-Mailer: git-send-email 2.16.1
In-reply-to: <20180201000716.69301-1-cpaasch@apple.com>
References: <20180201000716.69301-1-cpaasch@apple.com>

The checks are now being done through the extra-option framework. For
TCP MD5 this means that the check happens a bit later than usual.

Cc: Ivan Delalande
Signed-off-by: Christoph Paasch
Reviewed-by: Mat Martineau
---
 include/linux/tcp_md5.h | 23 +----------------------
 net/ipv4/tcp_input.c    |  8 --------
 net/ipv4/tcp_ipv4.c     |  9 ---------
 net/ipv4/tcp_md5.c      | 29 ++++++++++++++++++++++-----
 net/ipv6/tcp_ipv6.c     |  9 ---------
 5 files changed, 25 insertions(+), 53 deletions(-)

diff --git a/include/linux/tcp_md5.h b/include/linux/tcp_md5.h
index 441be65ec893..fe84c706299c 100644
--- a/include/linux/tcp_md5.h
+++ b/include/linux/tcp_md5.h
@@ -32,30 +32,9 @@ struct tcp_md5sig_key {
 int tcp_md5_parse_keys(struct sock *sk, int optname, char __user *optval,
 		       int optlen);
 
-bool tcp_v4_inbound_md5_hash(const struct sock *sk,
-			     const struct sk_buff *skb);
-
-bool tcp_v6_inbound_md5_hash(const struct sock *sk,
-			     const struct sk_buff *skb);
-
 int tcp_md5_diag_get_aux(struct sock *sk, bool net_admin, struct sk_buff *skb);
 
 int tcp_md5_diag_get_aux_size(struct sock *sk, bool net_admin);
 
-#else
-
-static inline bool tcp_v4_inbound_md5_hash(const struct sock *sk,
-					   const struct sk_buff *skb)
-{
-	return false;
-}
-
-static inline bool tcp_v6_inbound_md5_hash(const struct sock *sk,
-					   const struct sk_buff *skb)
-{
-	return false;
-}
-
-#endif
-
+#endif /* CONFIG_TCP_MD5SIG */
 #endif /* _LINUX_TCP_MD5_H */
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 1ac1d8d431ad..56cdc3093d6a 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3774,14 +3774,6 @@ void tcp_parse_options(const struct net *net,
 					TCP_SKB_CB(skb)->sacked = (ptr - 2) - (unsigned char *)th;
 				}
 				break;
-#ifdef CONFIG_TCP_MD5SIG
-			case TCPOPT_MD5SIG:
-				/*
-				 * The MD5 Hash has already been
-				 * checked (see tcp_v{4,6}_do_rcv()).
-				 */
-				break;
-#endif
 			case TCPOPT_FASTOPEN:
 				tcp_parse_fastopen_option(
 					opsize - TCPOLEN_FASTOPEN_BASE,
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 6a839c1280b3..c5405bd62322 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -62,7 +62,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
@@ -1249,11 +1248,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
 		struct sock *nsk;
 
 		sk = req->rsk_listener;
-		if (unlikely(tcp_v4_inbound_md5_hash(sk, skb))) {
-			sk_drops_add(sk, skb);
-			reqsk_put(req);
-			goto discard_it;
-		}
 		if (unlikely(sk->sk_state != TCP_LISTEN)) {
 			inet_csk_reqsk_queue_drop_and_put(sk, req);
 			goto lookup;
@@ -1293,9 +1287,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb))
 		goto discard_and_relse;
 
-	if (tcp_v4_inbound_md5_hash(sk, skb))
-		goto discard_and_relse;
-
 	nf_reset(skb);
 
 	if (tcp_filter(sk, skb))
diff --git a/net/ipv4/tcp_md5.c b/net/ipv4/tcp_md5.c
index e05db5af06ee..ad41b9fd6f88 100644
--- a/net/ipv4/tcp_md5.c
+++ b/net/ipv4/tcp_md5.c
@@ -30,6 +30,10 @@ static DEFINE_PER_CPU(struct tcp_md5sig_pool, tcp_md5sig_pool);
 static DEFINE_MUTEX(tcp_md5sig_mutex);
 static bool tcp_md5sig_pool_populated;
 
+static bool tcp_inbound_md5_hash(struct sock *sk, const struct sk_buff *skb,
+				 struct tcp_options_received *opt_rx,
+				 struct tcp_extopt_store *store);
+
 static unsigned int tcp_md5_extopt_prepare(struct sk_buff *skb, u8 flags,
 					   unsigned int remaining,
 					   struct tcp_out_options *opts,
@@ -77,6 +81,7 @@ struct tcp_md5_extopt {
 
 static const struct tcp_extopt_ops tcp_md5_extra_ops = {
 	.option_kind		= TCPOPT_MD5SIG,
+	.check			= tcp_inbound_md5_hash,
 	.prepare		= tcp_md5_extopt_prepare,
 	.write			= tcp_md5_extopt_write,
 	.response_prepare	= tcp_md5_send_response_prepare,
@@ -863,8 +868,8 @@ static struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk,
 }
 
 /* Called with rcu_read_lock() */
-bool tcp_v4_inbound_md5_hash(const struct sock *sk,
-			     const struct sk_buff *skb)
+static bool tcp_v4_inbound_md5_hash(const struct sock *sk,
+				    const struct sk_buff *skb)
 {
 	/* This gets called for each TCP segment that arrives
 	 * so we want to be efficient.
@@ -918,8 +923,8 @@ bool tcp_v4_inbound_md5_hash(const struct sock *sk,
 }
 
 #if IS_ENABLED(CONFIG_IPV6)
-bool tcp_v6_inbound_md5_hash(const struct sock *sk,
-			     const struct sk_buff *skb)
+static bool tcp_v6_inbound_md5_hash(const struct sock *sk,
+				    const struct sk_buff *skb)
 {
 	const __u8 *hash_location = NULL;
 	struct tcp_md5sig_key *hash_expected;
@@ -961,7 +966,6 @@ bool tcp_v6_inbound_md5_hash(const struct sock *sk,
 
 	return false;
 }
-EXPORT_SYMBOL_GPL(tcp_v6_inbound_md5_hash);
 
 static struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk,
 						const struct sock *addr_sk)
@@ -971,6 +975,21 @@ static struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk,
 EXPORT_SYMBOL_GPL(tcp_v6_md5_lookup);
 #endif
 
+static bool tcp_inbound_md5_hash(struct sock *sk, const struct sk_buff *skb,
+				 struct tcp_options_received *opt_rx,
+				 struct tcp_extopt_store *store)
+{
+	if (skb->protocol == htons(ETH_P_IP)) {
+		return tcp_v4_inbound_md5_hash(sk, skb);
+#if IS_ENABLED(CONFIG_IPV6)
+	} else {
+		return tcp_v6_inbound_md5_hash(sk, skb);
+#endif
+	}
+
+	return false;
+}
+
 static void tcp_diag_md5sig_fill(struct tcp_diag_md5sig *info,
 				 const struct tcp_md5sig_key *key)
 {
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 8800e5d75677..ab3a77a95cff 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -43,7 +43,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
@@ -1172,11 +1171,6 @@ static int tcp_v6_rcv(struct sk_buff *skb)
 		struct sock *nsk;
 
 		sk = req->rsk_listener;
-		if (tcp_v6_inbound_md5_hash(sk, skb)) {
-			sk_drops_add(sk, skb);
-			reqsk_put(req);
-			goto discard_it;
-		}
 		if (unlikely(sk->sk_state != TCP_LISTEN)) {
 			inet_csk_reqsk_queue_drop_and_put(sk, req);
 			goto lookup;
@@ -1213,9 +1207,6 @@ static int tcp_v6_rcv(struct sk_buff *skb)
 	if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb))
 		goto discard_and_relse;
 
-	if (tcp_v6_inbound_md5_hash(sk, skb))
-		goto discard_and_relse;
-
 	if (tcp_filter(sk, skb))
 		goto discard_and_relse;
 
 	th = (const struct tcphdr *)skb->data;
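[Editor's illustration, not part of the patch series] The unified `tcp_md5_parse_keys()` in this series collapses the old IPv4 and IPv6 key parsers into one function: it derives the effective address family from `tcpm_addr.ss_family`, treats v4-mapped IPv6 addresses as IPv4 keys, and caps `tcpm_prefixlen` at 32 or 128 accordingly. The following user-space C sketch models just those validation rules; `md5_key_limits()` is a hypothetical helper, and it stands in for the kernel's `-EINVAL` paths with a plain `-1`.

```c
#include <netinet/in.h>

/* Hypothetical user-space model of the family/prefix validation in the
 * unified tcp_md5_parse_keys(): reject unknown or mismatched families,
 * fold v4-mapped IPv6 addresses into AF_INET, and cap the prefix length
 * at 32 (IPv4) or 128 (IPv6). Returns 0 if accepted, -1 if rejected.
 */
static int md5_key_limits(const struct sockaddr_storage *ss, int sk_family,
			  unsigned int prefixlen, int *family_out)
{
	int family = ss->ss_family;
	unsigned int maxprefixlen;

	if (family != AF_INET && family != AF_INET6)
		return -1;
	if (sk_family != family)
		return -1;

	if (family == AF_INET6) {
		const struct sockaddr_in6 *sin6 =
			(const struct sockaddr_in6 *)ss;

		if (IN6_IS_ADDR_V4MAPPED(&sin6->sin6_addr)) {
			family = AF_INET;	/* key is stored as IPv4 */
			maxprefixlen = 32;
		} else {
			maxprefixlen = 128;
		}
	} else {
		maxprefixlen = 32;
	}

	if (prefixlen > maxprefixlen)
		return -1;

	*family_out = family;
	return 0;
}
```

This mirrors why the kernel patch can drop the duplicated v4-mapped special cases from `tcp_v6_parse_md5_keys()`: once `family` and `maxprefixlen` are computed up front, the delete/add paths share one `tcp_md5_do_del()`/`tcp_md5_do_add()` call each.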