From patchwork Mon Dec 18 21:50:56 2017
X-Patchwork-Submitter: Christoph Paasch
X-Patchwork-Id: 850421
X-Patchwork-Delegate: davem@davemloft.net
From: Christoph Paasch <cpaasch@apple.com>
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Alexei Starovoitov
Subject: [RFC 01/14] tcp: Write options after the header has been fully done
Date: Mon, 18 Dec 2017 13:50:56 -0800
Message-id: <20171218215109.38700-2-cpaasch@apple.com>
In-reply-to: <20171218215109.38700-1-cpaasch@apple.com>
References: <20171218215109.38700-1-cpaasch@apple.com>

The generic TCP-option framework will need to have access to the full
TCP header (e.g., if we want to compute a checksum for TCP-MD5). Thus,
we move the call to tcp_options_write() to after all the fields in the
header have been filled out.
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Mat Martineau
---
 net/ipv4/tcp_output.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 04be9f833927..0f66d101d0ca 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1126,7 +1126,6 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 		}
 	}
 
-	tcp_options_write((__be32 *)(th + 1), tp, &opts);
 	skb_shinfo(skb)->gso_type = sk->sk_gso_type;
 	if (likely(!(tcb->tcp_flags & TCPHDR_SYN))) {
 		th->window = htons(tcp_select_window(sk));
@@ -1137,6 +1136,7 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 		 */
 		th->window = htons(min(tp->rcv_wnd, 65535U));
 	}
+	tcp_options_write((__be32 *)(th + 1), tp, &opts);
 #ifdef CONFIG_TCP_MD5SIG
 	/* Calculate the MD5 hash, as we have all we need now */
 	if (md5) {
@@ -3242,8 +3242,8 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
 	/* RFC1323: The window in SYN & SYN/ACK segments is never scaled.
 	 */
 	th->window = htons(min(req->rsk_rcv_wnd, 65535U));
-	tcp_options_write((__be32 *)(th + 1), NULL, &opts);
 	th->doff = (tcp_header_size >> 2);
+	tcp_options_write((__be32 *)(th + 1), NULL, &opts);
 	__TCP_INC_STATS(sock_net(sk), TCP_MIB_OUTSEGS);
 
 #ifdef CONFIG_TCP_MD5SIG

From patchwork Mon Dec 18 21:50:57 2017
X-Patchwork-Submitter: Christoph Paasch
X-Patchwork-Id: 850424
From: Christoph Paasch <cpaasch@apple.com>
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Alexei Starovoitov
Subject: [RFC 02/14] tcp: Pass sock and skb to tcp_options_write
Date: Mon, 18 Dec 2017 13:50:57 -0800
Message-id: <20171218215109.38700-3-cpaasch@apple.com>
In-reply-to: <20171218215109.38700-1-cpaasch@apple.com>
References: <20171218215109.38700-1-cpaasch@apple.com>

An upcoming patch adds a configurable, per-socket list of TCP options to
populate in the TCP header. This requires tcp_options_write() to know the
socket (to use the options list) and the skb (to provide visibility into
the packet data for options like TCP_MD5SIG).
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Mat Martineau
---
 net/ipv4/tcp_output.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 0f66d101d0ca..efe599a41e36 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -444,10 +444,14 @@ struct tcp_out_options {
  * At least SACK_PERM as the first option is known to lead to a disaster
  * (but it may well be that other scenarios fail similarly).
  */
-static void tcp_options_write(__be32 *ptr, struct tcp_sock *tp,
+static void tcp_options_write(__be32 *ptr, struct sk_buff *skb, struct sock *sk,
 			      struct tcp_out_options *opts)
 {
 	u16 options = opts->options;	/* mungable copy */
+	struct tcp_sock *tp = NULL;
+
+	if (sk_fullsock(sk))
+		tp = tcp_sk(sk);
 
 	if (unlikely(OPTION_MD5 & options)) {
 		*ptr++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) |
@@ -1136,7 +1140,7 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 		 */
 		th->window = htons(min(tp->rcv_wnd, 65535U));
 	}
-	tcp_options_write((__be32 *)(th + 1), tp, &opts);
+	tcp_options_write((__be32 *)(th + 1), skb, sk, &opts);
 #ifdef CONFIG_TCP_MD5SIG
 	/* Calculate the MD5 hash, as we have all we need now */
 	if (md5) {
@@ -3243,7 +3247,7 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
 	/* RFC1323: The window in SYN & SYN/ACK segments is never scaled.
 	 */
 	th->window = htons(min(req->rsk_rcv_wnd, 65535U));
 	th->doff = (tcp_header_size >> 2);
-	tcp_options_write((__be32 *)(th + 1), NULL, &opts);
+	tcp_options_write((__be32 *)(th + 1), skb, req_to_sk(req), &opts);
 	__TCP_INC_STATS(sock_net(sk), TCP_MIB_OUTSEGS);
 
 #ifdef CONFIG_TCP_MD5SIG

From patchwork Mon Dec 18 21:50:58 2017
X-Patchwork-Submitter: Christoph Paasch
X-Patchwork-Id: 850423
From: Christoph Paasch <cpaasch@apple.com>
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Alexei Starovoitov
Subject: [RFC 03/14] tcp: Allow tcp_fast_parse_options to drop segments
Date: Mon, 18 Dec 2017 13:50:58 -0800
Message-id: <20171218215109.38700-4-cpaasch@apple.com>
In-reply-to: <20171218215109.38700-1-cpaasch@apple.com>
References: <20171218215109.38700-1-cpaasch@apple.com>

After parsing the TCP options, some option kinds might trigger a drop of
the segment (e.g., as is the case for TCP_MD5). As we are moving to
consolidate the TCP_MD5 code in follow-up patches, we need to add the
capability to drop a segment right after parsing the options in
tcp_fast_parse_options().

Originally, tcp_fast_parse_options() returned false when there was no
timestamp option, except in the slow-path processing through
tcp_parse_options(), where it always returned true. So the return value
of tcp_fast_parse_options() was inconsistent.

With this patch, we make it return true when the segment should get
dropped based on the parsed options, and false otherwise.
In tcp_validate_incoming, we will then just check for
tp->rx_opt.saw_tstamp to see if we should verify PAWS. The goto will be
used in a follow-up patch to check whether one of the options triggers a
drop of the segment.

Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Mat Martineau
---
 net/ipv4/tcp_input.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 4d55c4b338ee..eb97ee24c601 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3827,6 +3827,8 @@ static bool tcp_parse_aligned_timestamp(struct tcp_sock *tp, const struct tcphdr
 
 /* Fast parse options. This hopes to only see timestamps.
  * If it is wrong it falls back on tcp_parse_options().
+ *
+ * Returns true if we should drop this packet based on present TCP-options.
  */
 static bool tcp_fast_parse_options(const struct net *net,
 				   const struct sk_buff *skb,
@@ -3837,18 +3839,19 @@ static bool tcp_fast_parse_options(const struct net *net,
 	 */
 	if (th->doff == (sizeof(*th) / 4)) {
 		tp->rx_opt.saw_tstamp = 0;
-		return false;
+		goto extra_opt_check;
 	} else if (tp->rx_opt.tstamp_ok &&
 		   th->doff == ((sizeof(*th) + TCPOLEN_TSTAMP_ALIGNED) / 4)) {
 		if (tcp_parse_aligned_timestamp(tp, th))
-			return true;
+			goto extra_opt_check;
 	}
 
 	tcp_parse_options(net, skb, &tp->rx_opt, 1, NULL);
 	if (tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr)
 		tp->rx_opt.rcv_tsecr -= tp->tsoffset;
 
-	return true;
+extra_opt_check:
+	return false;
 }
 
 #ifdef CONFIG_TCP_MD5SIG
@@ -5168,9 +5171,11 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
 	struct tcp_sock *tp = tcp_sk(sk);
 	bool rst_seq_match = false;
 
+	if (tcp_fast_parse_options(sock_net(sk), skb, th, tp))
+		goto discard;
+
 	/* RFC1323: H1. Apply PAWS check first.
 	 */
-	if (tcp_fast_parse_options(sock_net(sk), skb, th, tp) &&
-	    tp->rx_opt.saw_tstamp &&
+	if (tp->rx_opt.saw_tstamp &&
 	    tcp_paws_discard(sk, skb)) {
 		if (!th->rst) {
 			NET_INC_STATS(sock_net(sk), LINUX_MIB_PAWSESTABREJECTED);

From patchwork Mon Dec 18 21:50:59 2017
X-Patchwork-Submitter: Christoph Paasch
X-Patchwork-Id: 850432
From: Christoph Paasch <cpaasch@apple.com>
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Alexei Starovoitov, Ursula Braun
Subject: [RFC 04/14] tcp_smc: Make smc_parse_options return 1 on success
Date: Mon, 18 Dec 2017 13:50:59 -0800
Message-id: <20171218215109.38700-5-cpaasch@apple.com>
In-reply-to: <20171218215109.38700-1-cpaasch@apple.com>
References: <20171218215109.38700-1-cpaasch@apple.com>

As we allow a generic TCP-option parser that also parses experimental
TCP options, we need to add a return value to smc_parse_options() that
indicates whether the option actually matched or not.
Cc: Ursula Braun
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Mat Martineau
---
 net/ipv4/tcp_input.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index eb97ee24c601..5c35fd568b13 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3671,19 +3671,22 @@ static void tcp_parse_fastopen_option(int len, const unsigned char *cookie,
 	foc->exp = exp_opt;
 }
 
-static void smc_parse_options(const struct tcphdr *th,
-			      struct tcp_options_received *opt_rx,
-			      const unsigned char *ptr,
-			      int opsize)
+static int smc_parse_options(const struct tcphdr *th,
+			     struct tcp_options_received *opt_rx,
+			     const unsigned char *ptr,
+			     int opsize)
 {
 #if IS_ENABLED(CONFIG_SMC)
 	if (static_branch_unlikely(&tcp_have_smc)) {
 		if (th->syn && !(opsize & 1) &&
 		    opsize >= TCPOLEN_EXP_SMC_BASE &&
-		    get_unaligned_be32(ptr) == TCPOPT_SMC_MAGIC)
+		    get_unaligned_be32(ptr) == TCPOPT_SMC_MAGIC) {
 			opt_rx->smc_ok = 1;
+			return 1;
+		}
 	}
 #endif
+	return 0;
 }
 
 /* Look for tcp options. Normally only called on SYN and SYNACK packets.
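The match/no-match return value this patch introduces matters because shared experimental options (kind 254, per RFC 6994) are distinguished only by the magic/ExID that follows the kind and length bytes, so a generic parser must learn whether a handler consumed the option before trying the next one. A minimal userspace sketch of that check follows; all `toy_*` names are invented for illustration, and the magic value is only assumed to mirror the kernel's TCPOPT_SMC_MAGIC:

```c
#include <stdint.h>
#include <stddef.h>

#define TOY_OPT_EXP   254          /* experimental TCP option kind */
#define TOY_SMC_MAGIC 0xE2D4C3D9u /* assumed value of TCPOPT_SMC_MAGIC */

/* Read a big-endian 32-bit value, like get_unaligned_be32(). */
uint32_t toy_be32(const uint8_t *p)
{
    return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
           (uint32_t)p[2] << 8 | p[3];
}

/* Returns 1 when the experimental option carries this handler's magic,
 * 0 otherwise — the same contract this patch gives smc_parse_options().
 * opt points at the kind byte; the magic follows kind and length. */
int toy_parse_exp_option(const uint8_t *opt, size_t opsize, uint32_t magic)
{
    if (opt[0] != TOY_OPT_EXP || opsize < 6)
        return 0;
    return toy_be32(opt + 2) == magic;
}
```

With this contract, the generic parser can walk its list of registered handlers and stop at the first one whose parse hook returns 1.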
From patchwork Mon Dec 18 21:51:00 2017
X-Patchwork-Submitter: Christoph Paasch
X-Patchwork-Id: 850445
From: Christoph Paasch <cpaasch@apple.com>
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Alexei Starovoitov
Subject: [RFC 05/14] tcp: Register handlers for extra TCP options
Date: Mon, 18 Dec 2017 13:51:00 -0800
Message-id: <20171218215109.38700-6-cpaasch@apple.com>
In-reply-to: <20171218215109.38700-1-cpaasch@apple.com>
References: <20171218215109.38700-1-cpaasch@apple.com>

From: Mat Martineau

Allow additional TCP options to be handled by registered hook functions.
Registered options have a priority that determines the order in which
options are prepared and written. Lower priority numbers are handled
first.

Option parsing will call the provided 'parse' function when a TCP option
number is not recognized by the normal option parsing code.

After parsing, there are two places where we post-process the options.
First, a 'check' callback that allows dropping the packet based on the
parsed options (e.g., useful for TCP MD5SIG). Then, a 'post_process'
function that gets called after the other validity checks (i.e.,
in-window, PAWS, ...). This post_process function can then update other
state for this particular extra option.
In the output path, the 'prepare' function determines the required space for registered options and stores associated data. 'write' adds the option to the TCP header. These additional TCP options are stored in hlists of the TCP socket. To pass the state and options around during the 3-way handshake and in time-wait state, the hlists are also on the tcp_request_sock and tcp_timewait_sock. The list is copied from the listener to the request-socket (calling into the 'copy' callback). It is then moved from the request-socket to the TCP socket and finally to the time-wait socket. Signed-off-by: Mat Martineau Signed-off-by: Christoph Paasch --- drivers/infiniband/hw/cxgb4/cm.c | 2 +- include/linux/tcp.h | 28 ++++ include/net/tcp.h | 110 ++++++++++++- net/ipv4/syncookies.c | 6 +- net/ipv4/tcp.c | 327 ++++++++++++++++++++++++++++++++++++++- net/ipv4/tcp_input.c | 49 +++++- net/ipv4/tcp_ipv4.c | 98 +++++++++--- net/ipv4/tcp_minisocks.c | 32 +++- net/ipv4/tcp_output.c | 40 ++--- net/ipv6/syncookies.c | 6 +- net/ipv6/tcp_ipv6.c | 32 ++++ 11 files changed, 676 insertions(+), 54 deletions(-) diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c index 21db3b48a617..a1ea5583f07b 100644 --- a/drivers/infiniband/hw/cxgb4/cm.c +++ b/drivers/infiniband/hw/cxgb4/cm.c @@ -3746,7 +3746,7 @@ static void build_cpl_pass_accept_req(struct sk_buff *skb, int stid , u8 tos) */ memset(&tmp_opt, 0, sizeof(tmp_opt)); tcp_clear_options(&tmp_opt); - tcp_parse_options(&init_net, skb, &tmp_opt, 0, NULL); + tcp_parse_options(&init_net, skb, &tmp_opt, 0, NULL, NULL); req = __skb_push(skb, sizeof(*req)); memset(req, 0, sizeof(*req)); diff --git a/include/linux/tcp.h b/include/linux/tcp.h index 4f93f0953c41..4756bd2c4b54 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -115,6 +115,24 @@ static inline void tcp_clear_options(struct tcp_options_received *rx_opt) #endif } +#define OPTION_SACK_ADVERTISE (1 << 0) +#define OPTION_TS (1 << 1) +#define OPTION_MD5 (1 << 2) 
+#define OPTION_WSCALE (1 << 3) +#define OPTION_FAST_OPEN_COOKIE (1 << 8) +#define OPTION_SMC (1 << 9) + +struct tcp_out_options { + u16 options; /* bit field of OPTION_* */ + u16 mss; /* 0 to disable */ + u8 ws; /* window scale, 0 to disable */ + u8 num_sack_blocks; /* number of SACK blocks to include */ + u8 hash_size; /* bytes in hash_location */ + __u8 *hash_location; /* temporary pointer, overloaded */ + __u32 tsval, tsecr; /* need to include OPTION_TS */ + struct tcp_fastopen_cookie *fastopen_cookie; /* Fast open cookie */ +}; + /* This is the max number of SACKS that we'll generate and process. It's safe * to increase this, although since: * size = TCPOLEN_SACK_BASE_ALIGNED (4) + n * TCPOLEN_SACK_PERBLOCK (8) @@ -137,6 +155,7 @@ struct tcp_request_sock { * FastOpen it's the seq# * after data-in-SYN. */ + struct hlist_head tcp_option_list; }; static inline struct tcp_request_sock *tcp_rsk(const struct request_sock *req) @@ -373,6 +392,8 @@ struct tcp_sock { */ struct request_sock *fastopen_rsk; u32 *saved_syn; + + struct hlist_head tcp_option_list; }; enum tsq_enum { @@ -400,6 +421,11 @@ static inline struct tcp_sock *tcp_sk(const struct sock *sk) return (struct tcp_sock *)sk; } +static inline struct sock *tcp_to_sk(const struct tcp_sock *tp) +{ + return (struct sock *)tp; +} + struct tcp_timewait_sock { struct inet_timewait_sock tw_sk; #define tw_rcv_nxt tw_sk.__tw_common.skc_tw_rcv_nxt @@ -412,6 +438,8 @@ struct tcp_timewait_sock { u32 tw_last_oow_ack_time; long tw_ts_recent_stamp; + + struct hlist_head tcp_option_list; #ifdef CONFIG_TCP_MD5SIG struct tcp_md5sig_key *tw_md5_key; #endif diff --git a/include/net/tcp.h b/include/net/tcp.h index 6939e69d3c37..ac62ceff9815 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -202,6 +202,7 @@ void tcp_time_wait(struct sock *sk, int state, int timeo); #define TCPOLEN_FASTOPEN_BASE 2 #define TCPOLEN_EXP_FASTOPEN_BASE 4 #define TCPOLEN_EXP_SMC_BASE 6 +#define TCPOLEN_EXP_BASE 6 /* But this is what stacks really 
send out. */ #define TCPOLEN_TSTAMP_ALIGNED 12 @@ -403,7 +404,8 @@ int tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int nonblock, int flags, int *addr_len); void tcp_parse_options(const struct net *net, const struct sk_buff *skb, struct tcp_options_received *opt_rx, - int estab, struct tcp_fastopen_cookie *foc); + int estab, struct tcp_fastopen_cookie *foc, + struct sock *sk); const u8 *tcp_parse_md5sig_option(const struct tcphdr *th); /* @@ -2063,4 +2065,110 @@ static inline bool tcp_bpf_ca_needs_ecn(struct sock *sk) #if IS_ENABLED(CONFIG_SMC) extern struct static_key_false tcp_have_smc; #endif + +struct tcp_extopt_store; + +struct tcp_extopt_ops { + u32 option_kind; + unsigned char priority; + void (*parse)(int opsize, const unsigned char *opptr, + const struct sk_buff *skb, + struct tcp_options_received *opt_rx, + struct sock *sk, + struct tcp_extopt_store *store); + bool (*check)(struct sock *sk, + const struct sk_buff *skb, + struct tcp_options_received *opt_rx, + struct tcp_extopt_store *store); + void (*post_process)(struct sock *sk, + struct tcp_options_received *opt_rx, + struct tcp_extopt_store *store); + /* Return the number of bytes consumed */ + unsigned int (*prepare)(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store); + __be32 *(*write)(__be32 *ptr, struct sk_buff *skb, + struct tcp_out_options *opts, struct sock *sk, + struct tcp_extopt_store *store); + int (*response_prepare)(struct sk_buff *orig, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store); + __be32 *(*response_write)(__be32 *ptr, struct sk_buff *orig, + struct tcphdr *th, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store); + int (*add_header_len)(const struct sock *orig, + const struct sock *sk, + struct tcp_extopt_store *store); + struct tcp_extopt_store 
*(*copy)(struct sock *listener, + struct request_sock *req, + struct tcp_options_received *opt, + struct tcp_extopt_store *from); + struct tcp_extopt_store *(*move)(struct sock *from, struct sock *to, + struct tcp_extopt_store *store); + void (*destroy)(struct tcp_extopt_store *store); + struct module *owner; +}; + +/* The tcp_extopt_store is the generic structure that will be added to the + * list of TCP extra-options. + * + * Protocols using the framework can create a wrapper structure around it that + * stores protocol-specific state. The tcp_extopt-functions will provide + * tcp_extopt_store though, so the protocol can use container_of to get + * access to the wrapper structure containing the state. + */ +struct tcp_extopt_store { + struct hlist_node list; + const struct tcp_extopt_ops *ops; +}; + +struct hlist_head *tcp_extopt_get_list(const struct sock *sk); + +struct tcp_extopt_store *tcp_extopt_find_kind(u32 kind, const struct sock *sk); + +void tcp_extopt_parse(u32 opcode, int opsize, const unsigned char *opptr, + const struct sk_buff *skb, + struct tcp_options_received *opt_rx, struct sock *sk); + +bool tcp_extopt_check(struct sock *sk, const struct sk_buff *skb, + struct tcp_options_received *opt_rx); + +void tcp_extopt_post_process(struct sock *sk, + struct tcp_options_received *opt_rx); + +unsigned int tcp_extopt_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk); + +void tcp_extopt_write(__be32 *ptr, struct sk_buff *skb, + struct tcp_out_options *opts, struct sock *sk); + +int tcp_extopt_response_prepare(struct sk_buff *orig, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk); + +void tcp_extopt_response_write(__be32 *ptr, struct sk_buff *orig, + struct tcphdr *th, struct tcp_out_options *opts, + const struct sock *sk); + +int tcp_extopt_add_header(const struct sock *orig, const struct sock *sk); + +/* Socket lock must be held when calling 
this function */ +int tcp_register_extopt(struct tcp_extopt_store *store, struct sock *sk); + +void tcp_extopt_copy(struct sock *listener, struct request_sock *req, + struct tcp_options_received *opt); + +void tcp_extopt_move(struct sock *from, struct sock *to); + +void tcp_extopt_destroy(struct sock *sk); + #endif /* _TCP_H */ diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c index fda37f2862c9..8373abf19440 100644 --- a/net/ipv4/syncookies.c +++ b/net/ipv4/syncookies.c @@ -313,7 +313,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb) /* check for timestamp cookie support */ memset(&tcp_opt, 0, sizeof(tcp_opt)); - tcp_parse_options(sock_net(sk), skb, &tcp_opt, 0, NULL); + tcp_parse_options(sock_net(sk), skb, &tcp_opt, 0, NULL, sk); if (tcp_opt.saw_tstamp && tcp_opt.rcv_tsecr) { tsoff = secure_tcp_ts_off(sock_net(sk), @@ -325,6 +325,10 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb) if (!cookie_timestamp_decode(sock_net(sk), &tcp_opt)) goto out; + if (unlikely(!hlist_empty(&tp->tcp_option_list)) && + tcp_extopt_check(sk, skb, &tcp_opt)) + goto out; + ret = NULL; req = inet_reqsk_alloc(&tcp_request_sock_ops, sk, false); /* for safety */ if (!req) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index c470fec9062f..17f38afb4212 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -416,6 +416,7 @@ void tcp_init_sock(struct sock *sk) tcp_init_xmit_timers(sk); INIT_LIST_HEAD(&tp->tsq_node); INIT_LIST_HEAD(&tp->tsorted_sent_queue); + INIT_HLIST_HEAD(&tp->tcp_option_list); icsk->icsk_rto = TCP_TIMEOUT_INIT; tp->mdev_us = jiffies_to_usecs(TCP_TIMEOUT_INIT); @@ -3473,6 +3474,331 @@ EXPORT_SYMBOL(tcp_md5_hash_key); #endif +struct hlist_head *tcp_extopt_get_list(const struct sock *sk) +{ + if (sk_fullsock(sk)) + return &tcp_sk(sk)->tcp_option_list; + else if (sk->sk_state == TCP_NEW_SYN_RECV) + return &tcp_rsk(inet_reqsk(sk))->tcp_option_list; + else if (sk->sk_state == TCP_TIME_WAIT) + return &tcp_twsk(sk)->tcp_option_list; + + 
return NULL; +} +EXPORT_SYMBOL_GPL(tcp_extopt_get_list); + +/* Caller must ensure that rcu is locked */ +struct tcp_extopt_store *tcp_extopt_find_kind(u32 kind, const struct sock *sk) +{ + struct tcp_extopt_store *entry; + struct hlist_head *lhead; + + lhead = tcp_extopt_get_list(sk); + + hlist_for_each_entry_rcu(entry, lhead, list) { + if (entry->ops->option_kind == kind) + return entry; + } + + return NULL; +} +EXPORT_SYMBOL_GPL(tcp_extopt_find_kind); + +void tcp_extopt_parse(u32 opcode, int opsize, const unsigned char *opptr, + const struct sk_buff *skb, + struct tcp_options_received *opt_rx, struct sock *sk) +{ + struct tcp_extopt_store *entry; + + rcu_read_lock(); + entry = tcp_extopt_find_kind(opcode, sk); + + if (entry && entry->ops->parse) + entry->ops->parse(opsize, opptr, skb, opt_rx, sk, entry); + rcu_read_unlock(); +} + +bool tcp_extopt_check(struct sock *sk, const struct sk_buff *skb, + struct tcp_options_received *opt_rx) +{ + struct tcp_extopt_store *entry; + struct hlist_head *lhead; + bool drop = false; + + lhead = tcp_extopt_get_list(sk); + + rcu_read_lock(); + hlist_for_each_entry_rcu(entry, lhead, list) { + bool ret = false; + + if (entry->ops->check) + ret = entry->ops->check(sk, skb, opt_rx, entry); + + if (ret) + drop = true; + } + rcu_read_unlock(); + + return drop; +} +EXPORT_SYMBOL_GPL(tcp_extopt_check); + +void tcp_extopt_post_process(struct sock *sk, + struct tcp_options_received *opt_rx) +{ + struct tcp_extopt_store *entry; + struct hlist_head *lhead; + + lhead = tcp_extopt_get_list(sk); + + rcu_read_lock(); + hlist_for_each_entry_rcu(entry, lhead, list) { + if (entry->ops->post_process) + entry->ops->post_process(sk, opt_rx, entry); + } + rcu_read_unlock(); +} + +unsigned int tcp_extopt_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk) +{ + struct tcp_extopt_store *entry; + struct hlist_head *lhead; + unsigned int used = 0; + + if (!sk) + return 0; + + lhead = 
tcp_extopt_get_list(sk); + + rcu_read_lock(); + hlist_for_each_entry_rcu(entry, lhead, list) { + if (unlikely(!entry->ops->prepare)) + continue; + + used += entry->ops->prepare(skb, flags, remaining - used, opts, + sk, entry); + } + rcu_read_unlock(); + + return roundup(used, 4); +} + +void tcp_extopt_write(__be32 *ptr, struct sk_buff *skb, + struct tcp_out_options *opts, struct sock *sk) +{ + struct tcp_extopt_store *entry; + struct hlist_head *lhead; + + if (!sk) + return; + + lhead = tcp_extopt_get_list(sk); + + rcu_read_lock(); + hlist_for_each_entry_rcu(entry, lhead, list) { + if (unlikely(!entry->ops->write)) + continue; + + ptr = entry->ops->write(ptr, skb, opts, sk, entry); + } + rcu_read_unlock(); +} +EXPORT_SYMBOL_GPL(tcp_extopt_write); + +int tcp_extopt_response_prepare(struct sk_buff *orig, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk) +{ + struct tcp_extopt_store *entry; + struct hlist_head *lhead; + unsigned int used = 0; + + if (!sk) + return 0; + + lhead = tcp_extopt_get_list(sk); + + rcu_read_lock(); + hlist_for_each_entry_rcu(entry, lhead, list) { + int ret; + + if (unlikely(!entry->ops->response_prepare)) + continue; + + ret = entry->ops->response_prepare(orig, flags, + remaining - used, opts, + sk, entry); + + used += ret; + } + rcu_read_unlock(); + + return roundup(used, 4); +} +EXPORT_SYMBOL_GPL(tcp_extopt_response_prepare); + +void tcp_extopt_response_write(__be32 *ptr, struct sk_buff *orig, + struct tcphdr *th, struct tcp_out_options *opts, + const struct sock *sk) +{ + struct tcp_extopt_store *entry; + struct hlist_head *lhead; + + if (!sk) + return; + + lhead = tcp_extopt_get_list(sk); + + rcu_read_lock(); + hlist_for_each_entry_rcu(entry, lhead, list) { + if (unlikely(!entry->ops->response_write)) + continue; + + ptr = entry->ops->response_write(ptr, orig, th, opts, sk, entry); + } + rcu_read_unlock(); +} +EXPORT_SYMBOL_GPL(tcp_extopt_response_write); + +int tcp_extopt_add_header(const struct 
sock *orig, const struct sock *sk) +{ + struct tcp_extopt_store *entry; + struct hlist_head *lhead; + int tcp_header_len = 0; + + lhead = tcp_extopt_get_list(sk); + + rcu_read_lock(); + hlist_for_each_entry_rcu(entry, lhead, list) { + if (unlikely(!entry->ops->add_header_len)) + continue; + + tcp_header_len += entry->ops->add_header_len(orig, sk, entry); + } + rcu_read_unlock(); + + return tcp_header_len; +} + +/* Socket lock must be held when calling this function */ +int tcp_register_extopt(struct tcp_extopt_store *store, struct sock *sk) +{ + struct hlist_node *add_before = NULL; + struct tcp_extopt_store *entry; + struct hlist_head *lhead; + int ret = 0; + + lhead = tcp_extopt_get_list(sk); + + if (!store->ops->option_kind) + return -EINVAL; + + if (!try_module_get(store->ops->owner)) + return -ENOENT; + + hlist_for_each_entry_rcu(entry, lhead, list) { + if (entry->ops->option_kind == store->ops->option_kind) { + pr_notice("Option kind %u already registered\n", + store->ops->option_kind); + module_put(store->ops->owner); + return -EEXIST; + } + + if (entry->ops->priority <= store->ops->priority) + add_before = &entry->list; + } + + if (add_before) + hlist_add_behind_rcu(&store->list, add_before); + else + hlist_add_head_rcu(&store->list, lhead); + + pr_debug("Option kind %u registered\n", store->ops->option_kind); + + return ret; +} +EXPORT_SYMBOL_GPL(tcp_register_extopt); + +void tcp_extopt_copy(struct sock *listener, struct request_sock *req, + struct tcp_options_received *opt) +{ + struct tcp_extopt_store *entry; + struct hlist_head *from, *to; + + from = tcp_extopt_get_list(listener); + to = tcp_extopt_get_list(req_to_sk(req)); + + rcu_read_lock(); + hlist_for_each_entry_rcu(entry, from, list) { + struct tcp_extopt_store *new; + + if (!try_module_get(entry->ops->owner)) { + pr_err("%s Module get failed while copying\n", __func__); + continue; + } + + new = entry->ops->copy(listener, req, opt, entry); + if (!new) { + module_put(entry->ops->owner); + 
continue; + } + + hlist_add_tail_rcu(&new->list, to); + } + rcu_read_unlock(); +} + +void tcp_extopt_move(struct sock *from, struct sock *to) +{ + struct tcp_extopt_store *entry; + struct hlist_head *lfrom, *lto; + struct hlist_node *tmp; + + lfrom = tcp_extopt_get_list(from); + lto = tcp_extopt_get_list(to); + + rcu_read_lock(); + hlist_for_each_entry_safe(entry, tmp, lfrom, list) { + hlist_del_rcu(&entry->list); + + if (entry->ops->move) { + entry = entry->ops->move(from, to, entry); + if (!entry) + continue; + } + + hlist_add_tail_rcu(&entry->list, lto); + } + rcu_read_unlock(); +} +EXPORT_SYMBOL_GPL(tcp_extopt_move); + +void tcp_extopt_destroy(struct sock *sk) +{ + struct tcp_extopt_store *entry; + struct hlist_head *lhead; + struct hlist_node *tmp; + + lhead = tcp_extopt_get_list(sk); + + rcu_read_lock(); + hlist_for_each_entry_safe(entry, tmp, lhead, list) { + struct module *owner = entry->ops->owner; + + hlist_del_rcu(&entry->list); + + entry->ops->destroy(entry); + + module_put(owner); + } + rcu_read_unlock(); +} +EXPORT_SYMBOL_GPL(tcp_extopt_destroy); + void tcp_done(struct sock *sk) { struct request_sock *req = tcp_sk(sk)->fastopen_rsk; @@ -3622,7 +3948,6 @@ void __init tcp_init(void) INIT_HLIST_HEAD(&tcp_hashinfo.bhash[i].chain); } - cnt = tcp_hashinfo.ehash_mask + 1; sysctl_tcp_max_orphans = cnt / 2; diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 5c35fd568b13..1950ff80fb3f 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -3696,7 +3696,7 @@ static int smc_parse_options(const struct tcphdr *th, void tcp_parse_options(const struct net *net, const struct sk_buff *skb, struct tcp_options_received *opt_rx, int estab, - struct tcp_fastopen_cookie *foc) + struct tcp_fastopen_cookie *foc, struct sock *sk) { const unsigned char *ptr; const struct tcphdr *th = tcp_hdr(skb); @@ -3796,9 +3796,18 @@ void tcp_parse_options(const struct net *net, tcp_parse_fastopen_option(opsize - TCPOLEN_EXP_FASTOPEN_BASE, ptr + 2, th->syn, foc, true); - 
else - smc_parse_options(th, opt_rx, ptr, - opsize); + else if (smc_parse_options(th, opt_rx, ptr, + opsize)) + break; + else if (opsize >= TCPOLEN_EXP_BASE) + tcp_extopt_parse(get_unaligned_be32(ptr), + opsize, ptr, skb, + opt_rx, sk); + break; + + default: + tcp_extopt_parse(opcode, opsize, ptr, skb, + opt_rx, sk); break; } @@ -3849,11 +3858,13 @@ static bool tcp_fast_parse_options(const struct net *net, goto extra_opt_check; } - tcp_parse_options(net, skb, &tp->rx_opt, 1, NULL); + tcp_parse_options(net, skb, &tp->rx_opt, 1, NULL, tcp_to_sk(tp)); if (tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr) tp->rx_opt.rcv_tsecr -= tp->tsoffset; extra_opt_check: + if (unlikely(!hlist_empty(&tp->tcp_option_list))) + return tcp_extopt_check(tcp_to_sk(tp), skb, &tp->rx_opt); return false; } @@ -5327,6 +5338,9 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb, tp->rx_opt.saw_tstamp = 0; + if (!hlist_empty(&tp->tcp_option_list)) + goto slow_path; + /* pred_flags is 0xS?10 << 16 + snd_wnd * if header_prediction is to be made * 'S' will always be tp->tcp_header_len >> 2 @@ -5514,7 +5528,7 @@ static bool tcp_rcv_fastopen_synack(struct sock *sk, struct sk_buff *synack, /* Get original SYNACK MSS value if user MSS sets mss_clamp */ tcp_clear_options(&opt); opt.user_mss = opt.mss_clamp = 0; - tcp_parse_options(sock_net(sk), synack, &opt, 0, NULL); + tcp_parse_options(sock_net(sk), synack, &opt, 0, NULL, sk); mss = opt.mss_clamp; } @@ -5577,10 +5591,14 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb, int saved_clamp = tp->rx_opt.mss_clamp; bool fastopen_fail; - tcp_parse_options(sock_net(sk), skb, &tp->rx_opt, 0, &foc); + tcp_parse_options(sock_net(sk), skb, &tp->rx_opt, 0, &foc, sk); if (tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr) tp->rx_opt.rcv_tsecr -= tp->tsoffset; + if (unlikely(!hlist_empty(&tp->tcp_option_list)) && + tcp_extopt_check(sk, skb, &tp->rx_opt)) + goto discard; + if (th->ack) { /* rfc793: * "If the state is SYN-SENT 
then @@ -5663,6 +5681,9 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb, tp->tcp_header_len = sizeof(struct tcphdr); } + if (unlikely(!hlist_empty(&tp->tcp_option_list))) + tcp_extopt_post_process(sk, &tp->rx_opt); + tcp_sync_mss(sk, icsk->icsk_pmtu_cookie); tcp_initialize_rcv_mss(sk); @@ -5756,6 +5777,9 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb, tcp_ecn_rcv_syn(tp, th); + if (unlikely(!hlist_empty(&tp->tcp_option_list))) + tcp_extopt_post_process(sk, &tp->rx_opt); + tcp_mtup_init(sk); tcp_sync_mss(sk, icsk->icsk_pmtu_cookie); tcp_initialize_rcv_mss(sk); @@ -6239,12 +6263,17 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops, tcp_rsk(req)->af_specific = af_ops; tcp_rsk(req)->ts_off = 0; + INIT_HLIST_HEAD(&tcp_rsk(req)->tcp_option_list); tcp_clear_options(&tmp_opt); tmp_opt.mss_clamp = af_ops->mss_clamp; tmp_opt.user_mss = tp->rx_opt.user_mss; tcp_parse_options(sock_net(sk), skb, &tmp_opt, 0, - want_cookie ? NULL : &foc); + want_cookie ? 
NULL : &foc, sk); + + if (unlikely(!hlist_empty(&tp->tcp_option_list)) && + tcp_extopt_check(sk, skb, &tmp_opt)) + goto drop_and_free; if (want_cookie && !tmp_opt.saw_tstamp) tcp_clear_options(&tmp_opt); @@ -6305,6 +6334,10 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops, tcp_reqsk_record_syn(sk, req, skb); fastopen_sk = tcp_try_fastopen(sk, skb, req, &foc, dst); } + + if (unlikely(!hlist_empty(&tp->tcp_option_list))) + tcp_extopt_copy(sk, req, &tmp_opt); + if (fastopen_sk) { af_ops->send_synack(fastopen_sk, dst, &fl, req, &foc, TCP_SYNACK_FASTOPEN); diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 94e28350f420..dee296097b8f 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -600,10 +600,9 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) const struct tcphdr *th = tcp_hdr(skb); struct { struct tcphdr th; -#ifdef CONFIG_TCP_MD5SIG - __be32 opt[(TCPOLEN_MD5SIG_ALIGNED >> 2)]; -#endif + __be32 opt[(MAX_TCP_OPTION_SPACE >> 2)]; } rep; + struct hlist_head *extopt_list = NULL; struct ip_reply_arg arg; #ifdef CONFIG_TCP_MD5SIG struct tcp_md5sig_key *key = NULL; @@ -613,6 +612,7 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) struct sock *sk1 = NULL; #endif struct net *net; + int offset = 0; /* Never send a reset in response to a reset. */ if (th->rst) @@ -624,6 +624,9 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) if (!sk && skb_rtable(skb)->rt_type != RTN_LOCAL) return; + if (sk) + extopt_list = tcp_extopt_get_list(sk); + /* Swap the send and the receive. 
*/ memset(&rep, 0, sizeof(rep)); rep.th.dest = th->source; @@ -678,19 +681,44 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) goto out; } +#endif + + if (unlikely(extopt_list && !hlist_empty(extopt_list))) { + unsigned int remaining; + struct tcp_out_options opts; + int used; + remaining = sizeof(rep.opt); +#ifdef CONFIG_TCP_MD5SIG + if (key) + remaining -= TCPOLEN_MD5SIG_ALIGNED; +#endif + + memset(&opts, 0, sizeof(opts)); + + used = tcp_extopt_response_prepare(skb, TCPHDR_RST, remaining, + &opts, sk); + + arg.iov[0].iov_len += used; + rep.th.doff = arg.iov[0].iov_len / 4; + + tcp_extopt_response_write(&rep.opt[0], skb, &rep.th, &opts, sk); + offset += used / 4; + } + +#ifdef CONFIG_TCP_MD5SIG if (key) { - rep.opt[0] = htonl((TCPOPT_NOP << 24) | - (TCPOPT_NOP << 16) | - (TCPOPT_MD5SIG << 8) | - TCPOLEN_MD5SIG); + rep.opt[offset++] = htonl((TCPOPT_NOP << 24) | + (TCPOPT_NOP << 16) | + (TCPOPT_MD5SIG << 8) | + TCPOLEN_MD5SIG); /* Update length and the length the header thinks exists */ arg.iov[0].iov_len += TCPOLEN_MD5SIG_ALIGNED; rep.th.doff = arg.iov[0].iov_len / 4; - tcp_v4_md5_hash_hdr((__u8 *) &rep.opt[1], - key, ip_hdr(skb)->saddr, - ip_hdr(skb)->daddr, &rep.th); + tcp_v4_md5_hash_hdr((__u8 *)&rep.opt[offset], + key, ip_hdr(skb)->saddr, + ip_hdr(skb)->daddr, &rep.th); } #endif arg.csum = csum_tcpudp_nofold(ip_hdr(skb)->daddr, @@ -742,14 +770,14 @@ static void tcp_v4_send_ack(const struct sock *sk, const struct tcphdr *th = tcp_hdr(skb); struct { struct tcphdr th; - __be32 opt[(TCPOLEN_TSTAMP_ALIGNED >> 2) -#ifdef CONFIG_TCP_MD5SIG - + (TCPOLEN_MD5SIG_ALIGNED >> 2) -#endif - ]; + __be32 opt[(MAX_TCP_OPTION_SPACE >> 2)]; } rep; + struct hlist_head *extopt_list = NULL; struct net *net = sock_net(sk); struct ip_reply_arg arg; + int offset = 0; + + extopt_list = tcp_extopt_get_list(sk); memset(&rep.th, 0, sizeof(struct tcphdr)); memset(&arg, 0, sizeof(arg)); @@ -763,6 +791,7 @@ static void tcp_v4_send_ack(const struct sock *sk, 
rep.opt[1] = htonl(tsval); rep.opt[2] = htonl(tsecr); arg.iov[0].iov_len += TCPOLEN_TSTAMP_ALIGNED; + offset += 3; } /* Swap the send and the receive. */ @@ -774,22 +803,45 @@ static void tcp_v4_send_ack(const struct sock *sk, rep.th.ack = 1; rep.th.window = htons(win); + if (unlikely(extopt_list && !hlist_empty(extopt_list))) { + unsigned int remaining; + struct tcp_out_options opts; + int used; + + remaining = sizeof(rep.th) + sizeof(rep.opt) - arg.iov[0].iov_len; + #ifdef CONFIG_TCP_MD5SIG - if (key) { - int offset = (tsecr) ? 3 : 0; + if (key) + remaining -= TCPOLEN_MD5SIG_ALIGNED; +#endif + + memset(&opts, 0, sizeof(opts)); + used = tcp_extopt_response_prepare(skb, TCPHDR_ACK, remaining, + &opts, sk); + + arg.iov[0].iov_len += used; + rep.th.doff = arg.iov[0].iov_len / 4; + tcp_extopt_response_write(&rep.opt[offset], skb, &rep.th, &opts, sk); + + offset += used / 4; + } + +#ifdef CONFIG_TCP_MD5SIG + if (key) { rep.opt[offset++] = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) | (TCPOPT_MD5SIG << 8) | TCPOLEN_MD5SIG); arg.iov[0].iov_len += TCPOLEN_MD5SIG_ALIGNED; - rep.th.doff = arg.iov[0].iov_len/4; + rep.th.doff = arg.iov[0].iov_len / 4; tcp_v4_md5_hash_hdr((__u8 *) &rep.opt[offset], key, ip_hdr(skb)->saddr, ip_hdr(skb)->daddr, &rep.th); } #endif + arg.flags = reply_flags; arg.csum = csum_tcpudp_nofold(ip_hdr(skb)->daddr, ip_hdr(skb)->saddr, /* XXX */ @@ -893,6 +945,9 @@ static int tcp_v4_send_synack(const struct sock *sk, struct dst_entry *dst, */ static void tcp_v4_reqsk_destructor(struct request_sock *req) { + if (unlikely(!hlist_empty(&tcp_rsk(req)->tcp_option_list))) + tcp_extopt_destroy(req_to_sk(req)); + kfree(rcu_dereference_protected(inet_rsk(req)->ireq_opt, 1)); } @@ -1410,6 +1465,11 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb, if (likely(*own_req)) { tcp_move_syn(newtp, req); ireq->ireq_opt = NULL; + + if (unlikely(!hlist_empty(&tcp_rsk(req)->tcp_option_list))) { + tcp_extopt_move(req_to_sk(req), newsk); + 
INIT_HLIST_HEAD(&tcp_rsk(req)->tcp_option_list); + } } else { newinet->inet_opt = NULL; } @@ -1907,6 +1967,8 @@ void tcp_v4_destroy_sock(struct sock *sk) /* Cleans up our, hopefully empty, out_of_order_queue. */ skb_rbtree_purge(&tp->out_of_order_queue); + if (unlikely(!hlist_empty(&tp->tcp_option_list))) + tcp_extopt_destroy(sk); #ifdef CONFIG_TCP_MD5SIG /* Clean up the MD5 key list, if any */ if (tp->md5sig_info) { diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index a8384b0c11f8..676ad7ca13ad 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -95,9 +95,10 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb, struct tcp_timewait_sock *tcptw = tcp_twsk((struct sock *)tw); bool paws_reject = false; - tmp_opt.saw_tstamp = 0; + tcp_clear_options(&tmp_opt); if (th->doff > (sizeof(*th) >> 2) && tcptw->tw_ts_recent_stamp) { - tcp_parse_options(twsk_net(tw), skb, &tmp_opt, 0, NULL); + tcp_parse_options(twsk_net(tw), skb, &tmp_opt, 0, NULL, + (struct sock *)tw); if (tmp_opt.saw_tstamp) { if (tmp_opt.rcv_tsecr) @@ -108,6 +109,10 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb, } } + if (unlikely(!hlist_empty(&tcptw->tcp_option_list)) && + tcp_extopt_check((struct sock *)tw, skb, &tmp_opt)) + return TCP_TW_SUCCESS; + if (tw->tw_substate == TCP_FIN_WAIT2) { /* Just repeat all the checks of tcp_rcv_state_process() */ @@ -251,7 +256,7 @@ EXPORT_SYMBOL(tcp_timewait_state_process); void tcp_time_wait(struct sock *sk, int state, int timeo) { const struct inet_connection_sock *icsk = inet_csk(sk); - const struct tcp_sock *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct inet_timewait_sock *tw; struct inet_timewait_death_row *tcp_death_row = &sock_net(sk)->ipv4.tcp_death_row; @@ -271,6 +276,7 @@ void tcp_time_wait(struct sock *sk, int state, int timeo) tcptw->tw_ts_recent_stamp = tp->rx_opt.ts_recent_stamp; tcptw->tw_ts_offset = tp->tsoffset; tcptw->tw_last_oow_ack_time = 
0; + INIT_HLIST_HEAD(&tcptw->tcp_option_list); #if IS_ENABLED(CONFIG_IPV6) if (tw->tw_family == PF_INET6) { @@ -284,6 +290,10 @@ void tcp_time_wait(struct sock *sk, int state, int timeo) } #endif + if (unlikely(!hlist_empty(&tp->tcp_option_list))) { + tcp_extopt_move(sk, (struct sock *)tw); + INIT_HLIST_HEAD(&tp->tcp_option_list); + } #ifdef CONFIG_TCP_MD5SIG /* * The timewait bucket does not have the key DB from the @@ -341,6 +351,9 @@ void tcp_twsk_destructor(struct sock *sk) if (twsk->tw_md5_key) kfree_rcu(twsk->tw_md5_key, rcu); #endif + + if (unlikely(!hlist_empty(&twsk->tcp_option_list))) + tcp_extopt_destroy(sk); } EXPORT_SYMBOL_GPL(tcp_twsk_destructor); @@ -470,6 +483,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk, INIT_LIST_HEAD(&newtp->tsq_node); INIT_LIST_HEAD(&newtp->tsorted_sent_queue); + INIT_HLIST_HEAD(&newtp->tcp_option_list); tcp_init_wl(newtp, treq->rcv_isn); @@ -545,6 +559,9 @@ struct sock *tcp_create_openreq_child(const struct sock *sk, if (newtp->af_specific->md5_lookup(sk, newsk)) newtp->tcp_header_len += TCPOLEN_MD5SIG_ALIGNED; #endif + if (unlikely(!hlist_empty(&treq->tcp_option_list))) + newtp->tcp_header_len += tcp_extopt_add_header(req_to_sk(req), newsk); + if (skb->len >= TCP_MSS_DEFAULT + newtp->tcp_header_len) newicsk->icsk_ack.last_seg_size = skb->len - newtp->tcp_header_len; newtp->rx_opt.mss_clamp = req->mss; @@ -587,9 +604,10 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, bool paws_reject = false; bool own_req; - tmp_opt.saw_tstamp = 0; + tcp_clear_options(&tmp_opt); if (th->doff > (sizeof(struct tcphdr)>>2)) { - tcp_parse_options(sock_net(sk), skb, &tmp_opt, 0, NULL); + tcp_parse_options(sock_net(sk), skb, &tmp_opt, 0, NULL, + req_to_sk(req)); if (tmp_opt.saw_tstamp) { tmp_opt.ts_recent = req->ts_recent; @@ -604,6 +622,10 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, } } + if (unlikely(!hlist_empty(&tcp_rsk(req)->tcp_option_list)) && + tcp_extopt_check(req_to_sk(req), 
skb, &tmp_opt)) + return NULL; + /* Check for pure retransmitted SYN. */ if (TCP_SKB_CB(skb)->seq == tcp_rsk(req)->rcv_isn && flg == TCP_FLAG_SYN && diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index efe599a41e36..6804a9325107 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -398,13 +398,6 @@ static inline bool tcp_urg_mode(const struct tcp_sock *tp) return tp->snd_una != tp->snd_up; } -#define OPTION_SACK_ADVERTISE (1 << 0) -#define OPTION_TS (1 << 1) -#define OPTION_MD5 (1 << 2) -#define OPTION_WSCALE (1 << 3) -#define OPTION_FAST_OPEN_COOKIE (1 << 8) -#define OPTION_SMC (1 << 9) - static void smc_options_write(__be32 *ptr, u16 *options) { #if IS_ENABLED(CONFIG_SMC) @@ -420,17 +413,6 @@ static void smc_options_write(__be32 *ptr, u16 *options) #endif } -struct tcp_out_options { - u16 options; /* bit field of OPTION_* */ - u16 mss; /* 0 to disable */ - u8 ws; /* window scale, 0 to disable */ - u8 num_sack_blocks; /* number of SACK blocks to include */ - u8 hash_size; /* bytes in hash_location */ - __u8 *hash_location; /* temporary pointer, overloaded */ - __u32 tsval, tsecr; /* need to include OPTION_TS */ - struct tcp_fastopen_cookie *fastopen_cookie; /* Fast open cookie */ -}; - /* Write previously computed TCP options to the packet. 
* * Beware: Something in the Internet is very sensitive to the ordering of @@ -447,12 +429,15 @@ struct tcp_out_options { static void tcp_options_write(__be32 *ptr, struct sk_buff *skb, struct sock *sk, struct tcp_out_options *opts) { + struct hlist_head *extopt_list; u16 options = opts->options; /* mungable copy */ struct tcp_sock *tp = NULL; if (sk_fullsock(sk)) tp = tcp_sk(sk); + extopt_list = tcp_extopt_get_list(sk); + if (unlikely(OPTION_MD5 & options)) { *ptr++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) | (TCPOPT_MD5SIG << 8) | TCPOLEN_MD5SIG); @@ -543,6 +528,9 @@ static void tcp_options_write(__be32 *ptr, struct sk_buff *skb, struct sock *sk, } smc_options_write(ptr, &options); + + if (unlikely(!hlist_empty(extopt_list))) + tcp_extopt_write(ptr, skb, opts, sk); } static void smc_set_option(const struct tcp_sock *tp, @@ -645,6 +633,10 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb, smc_set_option(tp, opts, &remaining); + if (unlikely(!hlist_empty(&tp->tcp_option_list))) + remaining -= tcp_extopt_prepare(skb, TCPHDR_SYN, remaining, + opts, tcp_to_sk(tp)); + return MAX_TCP_OPTION_SPACE - remaining; } @@ -708,6 +700,11 @@ static unsigned int tcp_synack_options(const struct sock *sk, smc_set_option_cond(tcp_sk(sk), ireq, opts, &remaining); + if (unlikely(!hlist_empty(&tcp_rsk(req)->tcp_option_list))) + remaining -= tcp_extopt_prepare(skb, TCPHDR_SYN | TCPHDR_ACK, + remaining, opts, + req_to_sk(req)); + return MAX_TCP_OPTION_SPACE - remaining; } @@ -741,6 +738,10 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb size += TCPOLEN_TSTAMP_ALIGNED; } + if (unlikely(!hlist_empty(&tp->tcp_option_list))) + size += tcp_extopt_prepare(skb, 0, MAX_TCP_OPTION_SPACE - size, + opts, tcp_to_sk(tp)); + eff_sacks = tp->rx_opt.num_sacks + tp->rx_opt.dsack; if (unlikely(eff_sacks)) { const unsigned int remaining = MAX_TCP_OPTION_SPACE - size; @@ -3303,6 +3304,9 @@ static void tcp_connect_init(struct sock *sk) 
tp->tcp_header_len += TCPOLEN_MD5SIG_ALIGNED; #endif + if (unlikely(!hlist_empty(&tp->tcp_option_list))) + tp->tcp_header_len += tcp_extopt_add_header(sk, sk); + /* If user gave his TCP_MAXSEG, record it to clamp */ if (tp->rx_opt.user_mss) tp->rx_opt.mss_clamp = tp->rx_opt.user_mss; diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c index e7a3a6b6cf56..d0716c7e9390 100644 --- a/net/ipv6/syncookies.c +++ b/net/ipv6/syncookies.c @@ -162,7 +162,7 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb) /* check for timestamp cookie support */ memset(&tcp_opt, 0, sizeof(tcp_opt)); - tcp_parse_options(sock_net(sk), skb, &tcp_opt, 0, NULL); + tcp_parse_options(sock_net(sk), skb, &tcp_opt, 0, NULL, sk); if (tcp_opt.saw_tstamp && tcp_opt.rcv_tsecr) { tsoff = secure_tcpv6_ts_off(sock_net(sk), @@ -174,6 +174,10 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb) if (!cookie_timestamp_decode(sock_net(sk), &tcp_opt)) goto out; + if (unlikely(!hlist_empty(&tp->tcp_option_list)) && + tcp_extopt_check(sk, skb, &tcp_opt)) + goto out; + ret = NULL; req = inet_reqsk_alloc(&tcp6_request_sock_ops, sk, false); if (!req) diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 7178476b3d2f..5af5dcc1ac83 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -500,6 +500,9 @@ static int tcp_v6_send_synack(const struct sock *sk, struct dst_entry *dst, static void tcp_v6_reqsk_destructor(struct request_sock *req) { + if (unlikely(!hlist_empty(&tcp_rsk(req)->tcp_option_list))) + tcp_extopt_destroy(req_to_sk(req)); + kfree(inet_rsk(req)->ipv6_opt); kfree_skb(inet_rsk(req)->pktopts); } @@ -789,6 +792,8 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 unsigned int tot_len = sizeof(struct tcphdr); struct dst_entry *dst; __be32 *topt; + struct hlist_head *extopt_list = NULL; + struct tcp_out_options extraopts; if (tsecr) tot_len += TCPOLEN_TSTAMP_ALIGNED; @@ -797,6 +802,25 @@ static void 
tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 tot_len += TCPOLEN_MD5SIG_ALIGNED; #endif + if (sk) + extopt_list = tcp_extopt_get_list(sk); + + if (unlikely(extopt_list && !hlist_empty(extopt_list))) { + unsigned int remaining = MAX_TCP_OPTION_SPACE - tot_len; + u8 extraflags = rst ? TCPHDR_RST : 0; + int used; + + if (!rst || !th->ack) + extraflags |= TCPHDR_ACK; + + memset(&extraopts, 0, sizeof(extraopts)); + + used = tcp_extopt_response_prepare(skb, extraflags, remaining, + &extraopts, sk); + + tot_len += used; + } + buff = alloc_skb(MAX_HEADER + sizeof(struct ipv6hdr) + tot_len, GFP_ATOMIC); if (!buff) @@ -837,6 +861,9 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 } #endif + if (unlikely(extopt_list && !hlist_empty(extopt_list))) + tcp_extopt_response_write(topt, skb, t1, &extraopts, sk); + memset(&fl6, 0, sizeof(fl6)); fl6.daddr = ipv6_hdr(skb)->saddr; fl6.saddr = ipv6_hdr(skb)->daddr; @@ -1231,6 +1258,11 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff * skb_set_owner_r(newnp->pktoptions, newsk); } } + + if (unlikely(!hlist_empty(&tcp_rsk(req)->tcp_option_list))) { + tcp_extopt_move(req_to_sk(req), newsk); + INIT_HLIST_HEAD(&tcp_rsk(req)->tcp_option_list); + } } return newsk;

From patchwork Mon Dec 18 21:51:01 2017
X-Patchwork-Submitter: Christoph Paasch
X-Patchwork-Id: 850433
X-Patchwork-Delegate: davem@davemloft.net
From: Christoph Paasch
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Alexei Starovoitov, Ursula Braun
Subject: [RFC 06/14] tcp_smc: Make SMC use TCP extra-option framework
Date: Mon, 18 Dec 2017 13:51:01 -0800
Message-id: <20171218215109.38700-7-cpaasch@apple.com>
In-reply-to: <20171218215109.38700-1-cpaasch@apple.com>
References: <20171218215109.38700-1-cpaasch@apple.com>

Adopt the extra-option framework for SMC. It allows us to remove the SMC code entirely from the TCP stack. The static key is gone, as this is now covered by the static key of the extra-option framework.

We allocate state (struct tcp_smc_opt) that indicates whether SMC was successfully negotiated, and check this state in the relevant functions.

Cc: Ursula Braun
Signed-off-by: Christoph Paasch
Reviewed-by: Mat Martineau
---
 include/linux/tcp.h      |   3 +-
 include/net/inet_sock.h  |   3 +-
 include/net/tcp.h        |   4 -
 net/ipv4/tcp.c           |   5 --
 net/ipv4/tcp_input.c     |  36 ---------
 net/ipv4/tcp_minisocks.c |  18 -----
 net/ipv4/tcp_output.c    |  54 --------------
 net/smc/af_smc.c         | 190 +++++++++++++++++++++++++++++++++++++++++++++--
 8 files changed, 186 insertions(+), 127 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h index 4756bd2c4b54..231b352f587f 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -257,8 +257,7 @@ struct tcp_sock { syn_fastopen_ch:1, /* Active TFO re-enabling probe */ syn_data_acked:1,/* data in SYN is acked by SYN-ACK */ save_syn:1, /* Save headers of SYN packet */ - is_cwnd_limited:1,/* forward progress limited by snd_cwnd? */ - syn_smc:1; /* SYN includes SMC */ + is_cwnd_limited:1;/* forward progress limited by snd_cwnd? */ u32 tlp_high_seq; /* snd_nxt at the time of TLP retransmit. 
*/ /* RTT measurement */ diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h index 39efb968b7a4..8e51b4a69088 100644 --- a/include/net/inet_sock.h +++ b/include/net/inet_sock.h @@ -90,8 +90,7 @@ struct inet_request_sock { wscale_ok : 1, ecn_ok : 1, acked : 1, - no_srccheck: 1, - smc_ok : 1; + no_srccheck: 1; u32 ir_mark; union { struct ip_options_rcu __rcu *ireq_opt; diff --git a/include/net/tcp.h b/include/net/tcp.h index ac62ceff9815..a5c4856e25c7 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -2062,10 +2062,6 @@ static inline bool tcp_bpf_ca_needs_ecn(struct sock *sk) return (tcp_call_bpf(sk, BPF_SOCK_OPS_NEEDS_ECN) == 1); } -#if IS_ENABLED(CONFIG_SMC) -extern struct static_key_false tcp_have_smc; -#endif - struct tcp_extopt_store; struct tcp_extopt_ops { diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 17f38afb4212..0a1cabee6d5e 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -294,11 +294,6 @@ EXPORT_SYMBOL(sysctl_tcp_mem); atomic_long_t tcp_memory_allocated; /* Current allocated memory. */ EXPORT_SYMBOL(tcp_memory_allocated); -#if IS_ENABLED(CONFIG_SMC) -DEFINE_STATIC_KEY_FALSE(tcp_have_smc); -EXPORT_SYMBOL(tcp_have_smc); -#endif - /* * Current number of TCP sockets. */ diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 1950ff80fb3f..af8f4f9fd098 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -3671,24 +3671,6 @@ static void tcp_parse_fastopen_option(int len, const unsigned char *cookie, foc->exp = exp_opt; } -static int smc_parse_options(const struct tcphdr *th, - struct tcp_options_received *opt_rx, - const unsigned char *ptr, - int opsize) -{ -#if IS_ENABLED(CONFIG_SMC) - if (static_branch_unlikely(&tcp_have_smc)) { - if (th->syn && !(opsize & 1) && - opsize >= TCPOLEN_EXP_SMC_BASE && - get_unaligned_be32(ptr) == TCPOPT_SMC_MAGIC) { - opt_rx->smc_ok = 1; - return 1; - } - } -#endif - return 0; -} - /* Look for tcp options. Normally only called on SYN and SYNACK packets. 
* But, this can also be called on packets in the established flow when * the fast version below fails. @@ -3796,9 +3778,6 @@ void tcp_parse_options(const struct net *net, tcp_parse_fastopen_option(opsize - TCPOLEN_EXP_FASTOPEN_BASE, ptr + 2, th->syn, foc, true); - else if (smc_parse_options(th, opt_rx, ptr, - opsize)) - break; else if (opsize >= TCPOLEN_EXP_BASE) tcp_extopt_parse(get_unaligned_be32(ptr), opsize, ptr, skb, @@ -5572,16 +5551,6 @@ static bool tcp_rcv_fastopen_synack(struct sock *sk, struct sk_buff *synack, return false; } -static void smc_check_reset_syn(struct tcp_sock *tp) -{ -#if IS_ENABLED(CONFIG_SMC) - if (static_branch_unlikely(&tcp_have_smc)) { - if (tp->syn_smc && !tp->rx_opt.smc_ok) - tp->syn_smc = 0; - } -#endif -} - static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb, const struct tcphdr *th) { @@ -5692,8 +5661,6 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb, * is initialized. */ tp->copied_seq = tp->rcv_nxt; - smc_check_reset_syn(tp); - smp_mb(); tcp_finish_connect(sk, skb); @@ -6150,9 +6117,6 @@ static void tcp_openreq_init(struct request_sock *req, ireq->ir_rmt_port = tcp_hdr(skb)->source; ireq->ir_num = ntohs(tcp_hdr(skb)->dest); ireq->ir_mark = inet_request_mark(sk, skb); -#if IS_ENABLED(CONFIG_SMC) - ireq->smc_ok = rx_opt->smc_ok; -#endif } struct request_sock *inet_reqsk_alloc(const struct request_sock_ops *ops, diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 676ad7ca13ad..aa2ff9aadad0 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -435,21 +435,6 @@ void tcp_ca_openreq_child(struct sock *sk, const struct dst_entry *dst) } EXPORT_SYMBOL_GPL(tcp_ca_openreq_child); -static void smc_check_reset_syn_req(struct tcp_sock *oldtp, - struct request_sock *req, - struct tcp_sock *newtp) -{ -#if IS_ENABLED(CONFIG_SMC) - struct inet_request_sock *ireq; - - if (static_branch_unlikely(&tcp_have_smc)) { - ireq = inet_rsk(req); - if 
(oldtp->syn_smc && !ireq->smc_ok) - newtp->syn_smc = 0; - } -#endif -} - /* This is not only more efficient than what we used to do, it eliminates * a lot of code duplication between IPv4/IPv6 SYN recv processing. -DaveM * @@ -467,9 +452,6 @@ struct sock *tcp_create_openreq_child(const struct sock *sk, struct tcp_request_sock *treq = tcp_rsk(req); struct inet_connection_sock *newicsk = inet_csk(newsk); struct tcp_sock *newtp = tcp_sk(newsk); - struct tcp_sock *oldtp = tcp_sk(sk); - - smc_check_reset_syn_req(oldtp, req, newtp); /* Now setup tcp_sock */ newtp->pred_flags = 0; diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 6804a9325107..baf1c913ca7f 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -398,21 +398,6 @@ static inline bool tcp_urg_mode(const struct tcp_sock *tp) return tp->snd_una != tp->snd_up; } -static void smc_options_write(__be32 *ptr, u16 *options) -{ -#if IS_ENABLED(CONFIG_SMC) - if (static_branch_unlikely(&tcp_have_smc)) { - if (unlikely(OPTION_SMC & *options)) { - *ptr++ = htonl((TCPOPT_NOP << 24) | - (TCPOPT_NOP << 16) | - (TCPOPT_EXP << 8) | - (TCPOLEN_EXP_SMC_BASE)); - *ptr++ = htonl(TCPOPT_SMC_MAGIC); - } - } -#endif -} - /* Write previously computed TCP options to the packet. 
* * Beware: Something in the Internet is very sensitive to the ordering of @@ -527,45 +512,10 @@ static void tcp_options_write(__be32 *ptr, struct sk_buff *skb, struct sock *sk, ptr += (len + 3) >> 2; } - smc_options_write(ptr, &options); - if (unlikely(!hlist_empty(extopt_list))) tcp_extopt_write(ptr, skb, opts, sk); } -static void smc_set_option(const struct tcp_sock *tp, - struct tcp_out_options *opts, - unsigned int *remaining) -{ -#if IS_ENABLED(CONFIG_SMC) - if (static_branch_unlikely(&tcp_have_smc)) { - if (tp->syn_smc) { - if (*remaining >= TCPOLEN_EXP_SMC_BASE_ALIGNED) { - opts->options |= OPTION_SMC; - *remaining -= TCPOLEN_EXP_SMC_BASE_ALIGNED; - } - } - } -#endif -} - -static void smc_set_option_cond(const struct tcp_sock *tp, - const struct inet_request_sock *ireq, - struct tcp_out_options *opts, - unsigned int *remaining) -{ -#if IS_ENABLED(CONFIG_SMC) - if (static_branch_unlikely(&tcp_have_smc)) { - if (tp->syn_smc && ireq->smc_ok) { - if (*remaining >= TCPOLEN_EXP_SMC_BASE_ALIGNED) { - opts->options |= OPTION_SMC; - *remaining -= TCPOLEN_EXP_SMC_BASE_ALIGNED; - } - } - } -#endif -} - /* Compute TCP options for SYN packets. This is not the final * network wire format yet. 
*/ @@ -631,8 +581,6 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb, } } - smc_set_option(tp, opts, &remaining); - if (unlikely(!hlist_empty(&tp->tcp_option_list))) remaining -= tcp_extopt_prepare(skb, TCPHDR_SYN, remaining, opts, tcp_to_sk(tp)); @@ -698,8 +646,6 @@ static unsigned int tcp_synack_options(const struct sock *sk, } } - smc_set_option_cond(tcp_sk(sk), ireq, opts, &remaining); - if (unlikely(!hlist_empty(&tcp_rsk(req)->tcp_option_list))) remaining -= tcp_extopt_prepare(skb, TCPHDR_SYN | TCPHDR_ACK, remaining, opts, diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index daf8075f5a4c..14bb84f81a50 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -44,6 +44,149 @@ #include "smc_rx.h" #include "smc_close.h" +static unsigned int tcp_smc_opt_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store); +static __be32 *tcp_smc_opt_write(__be32 *ptr, struct sk_buff *skb, + struct tcp_out_options *opts, + struct sock *sk, + struct tcp_extopt_store *store); +static void tcp_smc_opt_parse(int opsize, const unsigned char *opptr, + const struct sk_buff *skb, + struct tcp_options_received *opt_rx, + struct sock *sk, + struct tcp_extopt_store *store); +static void tcp_smc_opt_post_process(struct sock *sk, + struct tcp_options_received *opt, + struct tcp_extopt_store *store); +static struct tcp_extopt_store *tcp_smc_opt_copy(struct sock *listener, + struct request_sock *req, + struct tcp_options_received *opt, + struct tcp_extopt_store *store); +static void tcp_smc_opt_destroy(struct tcp_extopt_store *store); + +struct tcp_smc_opt { + struct tcp_extopt_store store; + int smc_ok:1; /* SMC supported on this connection */ + struct rcu_head rcu; +}; + +static const struct tcp_extopt_ops tcp_smc_extra_ops = { + .option_kind = TCPOPT_SMC_MAGIC, + .parse = tcp_smc_opt_parse, + .post_process = tcp_smc_opt_post_process, + .prepare = 
tcp_smc_opt_prepare, + .write = tcp_smc_opt_write, + .copy = tcp_smc_opt_copy, + .destroy = tcp_smc_opt_destroy, + .owner = THIS_MODULE, +}; + +static struct tcp_smc_opt *tcp_extopt_to_smc(struct tcp_extopt_store *store) +{ + return container_of(store, struct tcp_smc_opt, store); +} + +static struct tcp_smc_opt *tcp_smc_opt_find(struct sock *sk) +{ + struct tcp_extopt_store *ext_opt; + + ext_opt = tcp_extopt_find_kind(TCPOPT_SMC_MAGIC, sk); + + return tcp_extopt_to_smc(ext_opt); +} + +static unsigned int tcp_smc_opt_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store) +{ + if (!(flags & TCPHDR_SYN)) + return 0; + + if (remaining >= TCPOLEN_EXP_SMC_BASE_ALIGNED) { + opts->options |= OPTION_SMC; + return TCPOLEN_EXP_SMC_BASE_ALIGNED; + } + + return 0; +} + +static __be32 *tcp_smc_opt_write(__be32 *ptr, struct sk_buff *skb, + struct tcp_out_options *opts, + struct sock *sk, + struct tcp_extopt_store *store) +{ + if (unlikely(OPTION_SMC & opts->options)) { + *ptr++ = htonl((TCPOPT_NOP << 24) | + (TCPOPT_NOP << 16) | + (TCPOPT_EXP << 8) | + (TCPOLEN_EXP_SMC_BASE)); + *ptr++ = htonl(TCPOPT_SMC_MAGIC); + } + + return ptr; +} + +static void tcp_smc_opt_parse(int opsize, const unsigned char *opptr, + const struct sk_buff *skb, + struct tcp_options_received *opt_rx, + struct sock *sk, + struct tcp_extopt_store *store) +{ + struct tcphdr *th = tcp_hdr(skb); + + if (th->syn && !(opsize & 1) && opsize >= TCPOLEN_EXP_SMC_BASE) + opt_rx->smc_ok = 1; +} + +static void tcp_smc_opt_post_process(struct sock *sk, + struct tcp_options_received *opt, + struct tcp_extopt_store *store) +{ + struct tcp_smc_opt *smc_opt = tcp_extopt_to_smc(store); + + if (sk->sk_state != TCP_SYN_SENT) + return; + + if (opt->smc_ok) + smc_opt->smc_ok = 1; + else + smc_opt->smc_ok = 0; +} + +static struct tcp_extopt_store *tcp_smc_opt_copy(struct sock *listener, + struct request_sock *req, + struct 
tcp_options_received *opt, + struct tcp_extopt_store *store) +{ + struct tcp_smc_opt *smc_opt; + + /* First, check if the peer sent us the smc-opt */ + if (!opt->smc_ok) + return NULL; + + smc_opt = kzalloc(sizeof(*smc_opt), GFP_ATOMIC); + if (!smc_opt) + return NULL; + + smc_opt->store.ops = &tcp_smc_extra_ops; + + smc_opt->smc_ok = 1; + + return (struct tcp_extopt_store *)smc_opt; +} + +static void tcp_smc_opt_destroy(struct tcp_extopt_store *store) +{ + struct tcp_smc_opt *smc_opt = tcp_extopt_to_smc(store); + + kfree_rcu(smc_opt, rcu); +} + static DEFINE_MUTEX(smc_create_lgr_pending); /* serialize link group * creation */ @@ -384,13 +527,15 @@ static int smc_connect_rdma(struct smc_sock *smc) struct smc_clc_msg_accept_confirm aclc; int local_contact = SMC_FIRST_CONTACT; struct smc_ib_device *smcibdev; + struct tcp_smc_opt *smc_opt; struct smc_link *link; u8 srv_first_contact; int reason_code = 0; int rc = 0; u8 ibport; - if (!tcp_sk(smc->clcsock->sk)->syn_smc) { + smc_opt = tcp_smc_opt_find(smc->clcsock->sk); + if (!smc_opt || !smc_opt->smc_ok) { /* peer has not signalled SMC-capability */ smc->use_fallback = true; goto out_connected; @@ -535,6 +680,7 @@ static int smc_connect_rdma(struct smc_sock *smc) static int smc_connect(struct socket *sock, struct sockaddr *addr, int alen, int flags) { + struct tcp_smc_opt *smc_opt; struct sock *sk = sock->sk; struct smc_sock *smc; int rc = -EINVAL; @@ -548,9 +694,17 @@ static int smc_connect(struct socket *sock, struct sockaddr *addr, goto out_err; smc->addr = addr; /* needed for nonblocking connect */ + smc_opt = kzalloc(sizeof(*smc_opt), GFP_KERNEL); + if (!smc_opt) { + rc = -ENOMEM; + goto out_err; + } + smc_opt->store.ops = &tcp_smc_extra_ops; + lock_sock(sk); switch (sk->sk_state) { default: + rc = -EINVAL; goto out; case SMC_ACTIVE: rc = -EISCONN; @@ -560,8 +714,15 @@ static int smc_connect(struct socket *sock, struct sockaddr *addr, break; } + /* We are the only owner of smc->clcsock->sk, so we can be lockless */ 
+ rc = tcp_register_extopt(&smc_opt->store, smc->clcsock->sk); + if (rc) { + release_sock(smc->clcsock->sk); + kfree(smc_opt); + goto out_err; + } + smc_copy_sock_settings_to_clc(smc); - tcp_sk(smc->clcsock->sk)->syn_smc = 1; rc = kernel_connect(smc->clcsock, addr, alen, flags); if (rc) goto out; @@ -760,6 +921,7 @@ static void smc_listen_work(struct work_struct *work) struct smc_clc_msg_proposal *pclc; struct smc_ib_device *smcibdev; struct sockaddr_in peeraddr; + struct tcp_smc_opt *smc_opt; u8 buf[SMC_CLC_MAX_LEN]; struct smc_link *link; int reason_code = 0; @@ -769,7 +931,8 @@ static void smc_listen_work(struct work_struct *work) u8 ibport; /* check if peer is smc capable */ - if (!tcp_sk(newclcsock->sk)->syn_smc) { + smc_opt = tcp_smc_opt_find(newclcsock->sk); + if (!smc_opt || !smc_opt->smc_ok) { new_smc->use_fallback = true; goto out_connected; } @@ -962,10 +1125,18 @@ static void smc_tcp_listen_work(struct work_struct *work) static int smc_listen(struct socket *sock, int backlog) { + struct tcp_smc_opt *smc_opt; struct sock *sk = sock->sk; struct smc_sock *smc; int rc; + smc_opt = kzalloc(sizeof(*smc_opt), GFP_KERNEL); + if (!smc_opt) { + rc = -ENOMEM; + goto out_err; + } + smc_opt->store.ops = &tcp_smc_extra_ops; + smc = smc_sk(sk); lock_sock(sk); @@ -978,11 +1149,19 @@ static int smc_listen(struct socket *sock, int backlog) sk->sk_max_ack_backlog = backlog; goto out; } + + /* We are the only owner of smc->clcsock->sk, so we can be lockless */ + rc = tcp_register_extopt(&smc_opt->store, smc->clcsock->sk); + if (rc) { + release_sock(smc->clcsock->sk); + kfree(smc_opt); + goto out_err; + } + /* some socket options are handled in core, so we could not apply * them to the clc socket -- copy smc socket options to clc socket */ smc_copy_sock_settings_to_clc(smc); - tcp_sk(smc->clcsock->sk)->syn_smc = 1; rc = kernel_listen(smc->clcsock, backlog); if (rc) @@ -995,6 +1174,7 @@ static int smc_listen(struct socket *sock, int backlog) out: release_sock(sk); +out_err: 
return rc; } @@ -1425,7 +1605,6 @@ static int __init smc_init(void) goto out_sock; } - static_branch_enable(&tcp_have_smc); return 0; out_sock: @@ -1450,7 +1629,6 @@ static void __exit smc_exit(void) list_del_init(&lgr->list); smc_lgr_free(lgr); /* free link group */ } - static_branch_disable(&tcp_have_smc); smc_ib_unregister_client(); sock_unregister(PF_SMC); proto_unregister(&smc_proto);

From patchwork Mon Dec 18 21:51:02 2017
X-Patchwork-Submitter: Christoph Paasch
X-Patchwork-Id: 850422
X-Patchwork-Delegate: davem@davemloft.net
From: Christoph Paasch
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Alexei Starovoitov
Subject: [RFC 07/14] tcp_md5: Don't pass along md5-key
Date: Mon, 18 Dec 2017 13:51:02 -0800
Message-id: <20171218215109.38700-8-cpaasch@apple.com>
In-reply-to: <20171218215109.38700-1-cpaasch@apple.com>
References: <20171218215109.38700-1-cpaasch@apple.com>

It is much cleaner to store the key pointer in tcp_out_options. It allows us to remove some MD5-specific code from the function arguments and paves the way to adopting the TCP-option framework for TCP-MD5.
Signed-off-by: Christoph Paasch
Reviewed-by: Mat Martineau
---
 include/linux/tcp.h   |  1 +
 net/ipv4/tcp_output.c | 46 +++++++++++++++++++---------------------------
 2 files changed, 20 insertions(+), 27 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 231b352f587f..b0b38f7100a4 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -131,6 +131,7 @@ struct tcp_out_options {
 	__u8 *hash_location;	/* temporary pointer, overloaded */
 	__u32 tsval, tsecr;	/* need to include OPTION_TS */
 	struct tcp_fastopen_cookie *fastopen_cookie;	/* Fast open cookie */
+	struct tcp_md5sig_key *md5;	/* TCP_MD5 signature key */
 };

 /* This is the max number of SACKS that we'll generate and process. It's safe
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index baf1c913ca7f..43849ed73b03 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -520,21 +520,18 @@ static void tcp_options_write(__be32 *ptr, struct sk_buff *skb, struct sock *sk,
  * network wire format yet.
  */
 static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,
-				    struct tcp_out_options *opts,
-				    struct tcp_md5sig_key **md5)
+				    struct tcp_out_options *opts)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	unsigned int remaining = MAX_TCP_OPTION_SPACE;
 	struct tcp_fastopen_request *fastopen = tp->fastopen_req;

 #ifdef CONFIG_TCP_MD5SIG
-	*md5 = tp->af_specific->md5_lookup(sk, sk);
-	if (*md5) {
+	opts->md5 = tp->af_specific->md5_lookup(sk, sk);
+	if (opts->md5) {
 		opts->options |= OPTION_MD5;
 		remaining -= TCPOLEN_MD5SIG_ALIGNED;
 	}
-#else
-	*md5 = NULL;
 #endif

 	/* We always get an MSS option.  The option bytes which will be seen in
@@ -549,7 +546,7 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,
 	opts->mss = tcp_advertise_mss(sk);
 	remaining -= TCPOLEN_MSS_ALIGNED;

-	if (likely(sock_net(sk)->ipv4.sysctl_tcp_timestamps && !*md5)) {
+	if (likely(sock_net(sk)->ipv4.sysctl_tcp_timestamps && !opts->md5)) {
 		opts->options |= OPTION_TS;
 		opts->tsval = tcp_skb_timestamp(skb) + tp->tsoffset;
 		opts->tsecr = tp->rx_opt.ts_recent;
@@ -593,14 +590,13 @@ static unsigned int tcp_synack_options(const struct sock *sk,
 				       struct request_sock *req,
 				       unsigned int mss, struct sk_buff *skb,
 				       struct tcp_out_options *opts,
-				       const struct tcp_md5sig_key *md5,
 				       struct tcp_fastopen_cookie *foc)
 {
 	struct inet_request_sock *ireq = inet_rsk(req);
 	unsigned int remaining = MAX_TCP_OPTION_SPACE;

 #ifdef CONFIG_TCP_MD5SIG
-	if (md5) {
+	if (opts->md5) {
 		opts->options |= OPTION_MD5;
 		remaining -= TCPOLEN_MD5SIG_ALIGNED;

@@ -658,8 +654,7 @@ static unsigned int tcp_synack_options(const struct sock *sk,
  * final wire format yet.
  */
 static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb,
-					    struct tcp_out_options *opts,
-					    struct tcp_md5sig_key **md5)
+					    struct tcp_out_options *opts)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	unsigned int size = 0;
@@ -668,13 +663,13 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
 	opts->options = 0;

 #ifdef CONFIG_TCP_MD5SIG
-	*md5 = tp->af_specific->md5_lookup(sk, sk);
-	if (unlikely(*md5)) {
+	opts->md5 = tp->af_specific->md5_lookup(sk, sk);
+	if (unlikely(opts->md5)) {
 		opts->options |= OPTION_MD5;
 		size += TCPOLEN_MD5SIG_ALIGNED;
 	}
 #else
-	*md5 = NULL;
+	opts->md5 = NULL;
 #endif

 	if (likely(tp->rx_opt.tstamp_ok)) {
@@ -992,7 +987,6 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 	struct tcp_out_options opts;
 	unsigned int tcp_options_size, tcp_header_size;
 	struct sk_buff *oskb = NULL;
-	struct tcp_md5sig_key *md5;
 	struct tcphdr *th;
 	int err;

@@ -1021,10 +1015,9 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 	memset(&opts, 0, sizeof(opts));

 	if (unlikely(tcb->tcp_flags & TCPHDR_SYN))
-		tcp_options_size = tcp_syn_options(sk, skb, &opts, &md5);
+		tcp_options_size = tcp_syn_options(sk, skb, &opts);
 	else
-		tcp_options_size = tcp_established_options(sk, skb, &opts,
-							   &md5);
+		tcp_options_size = tcp_established_options(sk, skb, &opts);
 	tcp_header_size = tcp_options_size + sizeof(struct tcphdr);

 	/* if no packet is in qdisc/device queue, then allow XPS to select
@@ -1090,10 +1083,10 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 	tcp_options_write((__be32 *)(th + 1), skb, sk, &opts);
 #ifdef CONFIG_TCP_MD5SIG
 	/* Calculate the MD5 hash, as we have all we need now */
-	if (md5) {
+	if (opts.md5) {
 		sk_nocaps_add(sk, NETIF_F_GSO_MASK);
 		tp->af_specific->calc_md5_hash(opts.hash_location,
-					       md5, sk, skb);
+					       opts.md5, sk, skb);
 	}
 #endif

@@ -1537,7 +1530,6 @@ unsigned int tcp_current_mss(struct sock *sk)
 	u32 mss_now;
 	unsigned int header_len;
 	struct tcp_out_options opts;
-	struct tcp_md5sig_key *md5;

 	mss_now = tp->mss_cache;

@@ -1547,7 +1539,7 @@ unsigned int tcp_current_mss(struct sock *sk)
 		mss_now = tcp_sync_mss(sk, mtu);
 	}

-	header_len = tcp_established_options(sk, NULL, &opts, &md5) +
+	header_len = tcp_established_options(sk, NULL, &opts) +
 		     sizeof(struct tcphdr);
 	/* The mss_cache is sized based on tp->tcp_header_len, which assumes
 	 * some common options. If this is an odd packet (because we have SACK
@@ -3123,7 +3115,6 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
 {
 	struct inet_request_sock *ireq = inet_rsk(req);
 	const struct tcp_sock *tp = tcp_sk(sk);
-	struct tcp_md5sig_key *md5 = NULL;
 	struct tcp_out_options opts;
 	struct sk_buff *skb;
 	int tcp_header_size;
@@ -3169,10 +3160,10 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,

 #ifdef CONFIG_TCP_MD5SIG
 	rcu_read_lock();
-	md5 = tcp_rsk(req)->af_specific->req_md5_lookup(sk, req_to_sk(req));
+	opts.md5 = tcp_rsk(req)->af_specific->req_md5_lookup(sk, req_to_sk(req));
 #endif
 	skb_set_hash(skb, tcp_rsk(req)->txhash, PKT_HASH_TYPE_L4);
-	tcp_header_size = tcp_synack_options(sk, req, mss, skb, &opts, md5,
+	tcp_header_size = tcp_synack_options(sk, req, mss, skb, &opts,
 					     foc) + sizeof(*th);

 	skb_push(skb, tcp_header_size);
@@ -3199,9 +3190,10 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,

 #ifdef CONFIG_TCP_MD5SIG
 	/* Okay, we have all we need - do the md5 hash if needed */
-	if (md5)
+	if (opts.md5)
 		tcp_rsk(req)->af_specific->calc_md5_hash(opts.hash_location,
-							 md5, req_to_sk(req), skb);
+							 opts.md5,
+							 req_to_sk(req), skb);
 	rcu_read_unlock();
 #endif

From patchwork Mon Dec 18 21:51:03 2017
X-Patchwork-Submitter: Christoph Paasch
X-Patchwork-Id: 850427
X-Patchwork-Delegate: davem@davemloft.net
From: Christoph Paasch
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Alexei Starovoitov
Subject: [RFC 08/14] tcp_md5: Detect key inside tcp_v4_send_ack instead of passing it as an argument
Date: Mon, 18 Dec 2017 13:51:03 -0800
Message-id: <20171218215109.38700-9-cpaasch@apple.com>
In-reply-to: <20171218215109.38700-1-cpaasch@apple.com>
References: <20171218215109.38700-1-cpaasch@apple.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

This will make it simpler to consolidate the TCP_MD5 code into a single place.

Signed-off-by: Christoph Paasch
Reviewed-by: Mat Martineau
---
 net/ipv4/tcp_ipv4.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index dee296097b8f..397975203e14 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -764,7 +764,6 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
 static void tcp_v4_send_ack(const struct sock *sk,
 			    struct sk_buff *skb, u32 seq, u32 ack,
 			    u32 win, u32 tsval, u32 tsecr, int oif,
-			    struct tcp_md5sig_key *key,
 			    int reply_flags, u8 tos)
 {
 	const struct tcphdr *th = tcp_hdr(skb);
@@ -773,6 +772,9 @@ static void tcp_v4_send_ack(const struct sock *sk,
 		__be32 opt[(MAX_TCP_OPTION_SPACE >> 2)];
 	} rep;
 	struct hlist_head *extopt_list = NULL;
+#ifdef CONFIG_TCP_MD5SIG
+	struct tcp_md5sig_key *key;
+#endif
 	struct net *net = sock_net(sk);
 	struct ip_reply_arg arg;
 	int offset = 0;
@@ -803,6 +805,17 @@ static void tcp_v4_send_ack(const struct sock *sk,
 	rep.th.ack = 1;
 	rep.th.window = htons(win);

+#ifdef CONFIG_TCP_MD5SIG
+	if (sk->sk_state == TCP_TIME_WAIT) {
+		key = tcp_twsk_md5_key(tcp_twsk(sk));
+	} else if (sk->sk_state == TCP_NEW_SYN_RECV) {
+		key = tcp_md5_do_lookup(sk, (union tcp_md5_addr *)&ip_hdr(skb)->saddr,
+					AF_INET);
+	} else {
+		key = NULL; /* Should not happen */
+	}
+#endif
+
 	if (unlikely(extopt_list && !hlist_empty(extopt_list))) {
 		unsigned int remaining;
 		struct tcp_out_options opts;
@@ -872,7 +885,6 @@ static void tcp_v4_timewait_ack(struct sock *sk, struct sk_buff *skb)
 			tcp_time_stamp_raw() + tcptw->tw_ts_offset,
 			tcptw->tw_ts_recent,
 			tw->tw_bound_dev_if,
-			tcp_twsk_md5_key(tcptw),
 			tw->tw_transparent ? IP_REPLY_ARG_NOSRCCHECK : 0,
 			tw->tw_tos
 			);
@@ -900,8 +912,6 @@ static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
 			tcp_time_stamp_raw() + tcp_rsk(req)->ts_off,
 			req->ts_recent,
 			0,
-			tcp_md5_do_lookup(sk, (union tcp_md5_addr *)&ip_hdr(skb)->saddr,
-					  AF_INET),
 			inet_rsk(req)->no_srccheck ? IP_REPLY_ARG_NOSRCCHECK : 0,
 			ip_hdr(skb)->tos);
 }

From patchwork Mon Dec 18 21:51:04 2017
X-Patchwork-Submitter: Christoph Paasch
X-Patchwork-Id: 850425
X-Patchwork-Delegate: davem@davemloft.net
From: Christoph Paasch
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Alexei Starovoitov
Subject: [RFC 09/14] tcp_md5: Detect key inside tcp_v6_send_response instead of passing it as an argument
Date: Mon, 18 Dec 2017 13:51:04 -0800
Message-id: <20171218215109.38700-10-cpaasch@apple.com>
In-reply-to: <20171218215109.38700-1-cpaasch@apple.com>
References: <20171218215109.38700-1-cpaasch@apple.com>

We want to move all the TCP-MD5 code to a single place, which enables us to factor the TCP-MD5 code out of the TCP stack into the extra-option framework.

Detection of whether or not to drop the segment (as done in tcp_v6_send_reset()) has now been moved to tcp_v6_send_response(). The latter therefore needs to handle the case where we want to exit without sending anything.

Signed-off-by: Christoph Paasch
Reviewed-by: Mat Martineau
---
 net/ipv6/tcp_ipv6.c | 119 +++++++++++++++++++++++++---------------------------
 1 file changed, 57 insertions(+), 62 deletions(-)

diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 5af5dcc1ac83..202a59511950 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -82,12 +82,6 @@ static const struct inet_connection_sock_af_ops ipv6_specific;
 #ifdef CONFIG_TCP_MD5SIG
 static const struct tcp_sock_af_ops tcp_sock_ipv6_specific;
 static const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific;
-#else
-static struct tcp_md5sig_key *tcp_v6_md5_do_lookup(const struct sock *sk,
-						   const struct in6_addr *addr)
-{
-	return NULL;
-}
 #endif

 static void inet6_sk_rx_dst_set(struct sock *sk, const struct sk_buff *skb)
@@ -780,12 +774,11 @@ static const struct tcp_request_sock_ops tcp_request_sock_ipv6_ops = {

 static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 seq,
 				 u32 ack, u32 win, u32 tsval, u32 tsecr,
-				 int oif, struct tcp_md5sig_key *key, int rst,
-				 u8 tclass, __be32 label)
+				 int oif, int rst, u8 tclass, __be32 label)
 {
 	const struct tcphdr *th = tcp_hdr(skb);
 	struct tcphdr *t1;
-	struct sk_buff *buff;
+	struct sk_buff *buff = NULL;
 	struct flowi6 fl6;
 	struct net *net = sk ? sock_net(sk) : dev_net(skb_dst(skb)->dev);
 	struct sock *ctl_sk = net->ipv6.tcp_sk;
@@ -794,10 +787,54 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32
 	__be32 *topt;
 	struct hlist_head *extopt_list = NULL;
 	struct tcp_out_options extraopts;
+#ifdef CONFIG_TCP_MD5SIG
+	struct tcp_md5sig_key *key = NULL;
+	const __u8 *hash_location = NULL;
+	struct ipv6hdr *ipv6h = ipv6_hdr(skb);
+#endif

 	if (tsecr)
 		tot_len += TCPOLEN_TSTAMP_ALIGNED;
 #ifdef CONFIG_TCP_MD5SIG
+	rcu_read_lock();
+	hash_location = tcp_parse_md5sig_option(th);
+	if (sk && sk_fullsock(sk)) {
+		key = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr);
+	} else if (sk && sk->sk_state == TCP_TIME_WAIT) {
+		struct tcp_timewait_sock *tcptw = tcp_twsk(sk);
+
+		key = tcp_twsk_md5_key(tcptw);
+	} else if (sk && sk->sk_state == TCP_NEW_SYN_RECV) {
+		key = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr);
+	} else if (hash_location) {
+		unsigned char newhash[16];
+		struct sock *sk1 = NULL;
+		int genhash;
+
+		/* active side is lost. Try to find listening socket through
+		 * source port, and then find md5 key through listening socket.
+		 * we are not loose security here:
+		 * Incoming packet is checked with md5 hash with finding key,
+		 * no RST generated if md5 hash doesn't match.
+		 */
+		sk1 = inet6_lookup_listener(dev_net(skb_dst(skb)->dev),
+					    &tcp_hashinfo, NULL, 0,
+					    &ipv6h->saddr,
+					    th->source, &ipv6h->daddr,
+					    ntohs(th->source), tcp_v6_iif(skb),
+					    tcp_v6_sdif(skb));
+		if (!sk1)
+			goto out;
+
+		key = tcp_v6_md5_do_lookup(sk1, &ipv6h->saddr);
+		if (!key)
+			goto out;
+
+		genhash = tcp_v6_md5_hash_skb(newhash, key, NULL, skb);
+		if (genhash || memcmp(hash_location, newhash, 16) != 0)
+			goto out;
+	}
+
 	if (key)
 		tot_len += TCPOLEN_MD5SIG_ALIGNED;
 #endif
@@ -824,7 +861,7 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32
 	buff = alloc_skb(MAX_HEADER + sizeof(struct ipv6hdr) + tot_len,
 			 GFP_ATOMIC);
 	if (!buff)
-		return;
+		goto out;

 	skb_reserve(buff, MAX_HEADER + sizeof(struct ipv6hdr) + tot_len);

@@ -901,24 +938,21 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32
 		TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
 		if (rst)
 			TCP_INC_STATS(net, TCP_MIB_OUTRSTS);
-		return;
+		buff = NULL;
 	}

+out:
 	kfree_skb(buff);
+
+#ifdef CONFIG_TCP_MD5SIG
+	rcu_read_unlock();
+#endif
 }

 static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb)
 {
 	const struct tcphdr *th = tcp_hdr(skb);
 	u32 seq = 0, ack_seq = 0;
-	struct tcp_md5sig_key *key = NULL;
-#ifdef CONFIG_TCP_MD5SIG
-	const __u8 *hash_location = NULL;
-	struct ipv6hdr *ipv6h = ipv6_hdr(skb);
-	unsigned char newhash[16];
-	int genhash;
-	struct sock *sk1 = NULL;
-#endif
 	int oif = 0;

 	if (th->rst)
@@ -930,38 +964,6 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb)
 	if (!sk && !ipv6_unicast_destination(skb))
 		return;

-#ifdef CONFIG_TCP_MD5SIG
-	rcu_read_lock();
-	hash_location = tcp_parse_md5sig_option(th);
-	if (sk && sk_fullsock(sk)) {
-		key = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr);
-	} else if (hash_location) {
-		/*
-		 * active side is lost. Try to find listening socket through
-		 * source port, and then find md5 key through listening socket.
-		 * we are not loose security here:
-		 * Incoming packet is checked with md5 hash with finding key,
-		 * no RST generated if md5 hash doesn't match.
-		 */
-		sk1 = inet6_lookup_listener(dev_net(skb_dst(skb)->dev),
-					    &tcp_hashinfo, NULL, 0,
-					    &ipv6h->saddr,
-					    th->source, &ipv6h->daddr,
-					    ntohs(th->source), tcp_v6_iif(skb),
-					    tcp_v6_sdif(skb));
-		if (!sk1)
-			goto out;
-
-		key = tcp_v6_md5_do_lookup(sk1, &ipv6h->saddr);
-		if (!key)
-			goto out;
-
-		genhash = tcp_v6_md5_hash_skb(newhash, key, NULL, skb);
-		if (genhash || memcmp(hash_location, newhash, 16) != 0)
-			goto out;
-	}
-#endif
-
 	if (th->ack)
 		seq = ntohl(th->ack_seq);
 	else
@@ -973,20 +975,14 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb)
 		trace_tcp_send_reset(sk, skb);
 	}

-	tcp_v6_send_response(sk, skb, seq, ack_seq, 0, 0, 0, oif, key, 1, 0, 0);
-
-#ifdef CONFIG_TCP_MD5SIG
-out:
-	rcu_read_unlock();
-#endif
+	tcp_v6_send_response(sk, skb, seq, ack_seq, 0, 0, 0, oif, 1, 0, 0);
 }

 static void tcp_v6_send_ack(const struct sock *sk, struct sk_buff *skb, u32 seq,
 			    u32 ack, u32 win, u32 tsval, u32 tsecr, int oif,
-			    struct tcp_md5sig_key *key, u8 tclass,
-			    __be32 label)
+			    u8 tclass, __be32 label)
 {
-	tcp_v6_send_response(sk, skb, seq, ack, win, tsval, tsecr, oif, key, 0,
+	tcp_v6_send_response(sk, skb, seq, ack, win, tsval, tsecr, oif, 0,
 			     tclass, label);
 }

@@ -998,7 +994,7 @@ static void tcp_v6_timewait_ack(struct sock *sk, struct sk_buff *skb)
 	tcp_v6_send_ack(sk, skb, tcptw->tw_snd_nxt, tcptw->tw_rcv_nxt,
 			tcptw->tw_rcv_wnd >> tw->tw_rcv_wscale,
 			tcp_time_stamp_raw() + tcptw->tw_ts_offset,
-			tcptw->tw_ts_recent, tw->tw_bound_dev_if, tcp_twsk_md5_key(tcptw),
+			tcptw->tw_ts_recent, tw->tw_bound_dev_if,
 			tw->tw_tclass, cpu_to_be32(tw->tw_flowlabel));

 	inet_twsk_put(tw);
@@ -1021,7 +1017,6 @@ static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
 			req->rsk_rcv_wnd >> inet_rsk(req)->rcv_wscale,
 			tcp_time_stamp_raw() + tcp_rsk(req)->ts_off,
 			req->ts_recent, sk->sk_bound_dev_if,
-			tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->saddr),
 			0, 0);
 }

From patchwork Mon Dec 18 21:51:05 2017
X-Patchwork-Submitter: Christoph Paasch
X-Patchwork-Id: 850430
X-Patchwork-Delegate: davem@davemloft.net
From: Christoph Paasch
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Alexei Starovoitov
Subject: [RFC 10/14] tcp_md5: Check for TCP_MD5 after TCP Timestamps in tcp_established_options
Date: Mon, 18 Dec 2017 13:51:05 -0800
Message-id: <20171218215109.38700-11-cpaasch@apple.com>
In-reply-to: <20171218215109.38700-1-cpaasch@apple.com>
References: <20171218215109.38700-1-cpaasch@apple.com>

The ordering really does not matter, because we never use TCP timestamps when TCP_MD5 is enabled (see tcp_syn_options). Moving TCP_MD5 a bit lower allows for easier adoption of the tcp_extra_option framework.

Signed-off-by: Christoph Paasch
Reviewed-by: Mat Martineau
---
 net/ipv4/tcp_output.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 43849ed73b03..7ea65f70e5ec 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -662,6 +662,13 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
 	opts->options = 0;

+	if (likely(tp->rx_opt.tstamp_ok)) {
+		opts->options |= OPTION_TS;
+		opts->tsval = skb ? tcp_skb_timestamp(skb) + tp->tsoffset : 0;
+		opts->tsecr = tp->rx_opt.ts_recent;
+		size += TCPOLEN_TSTAMP_ALIGNED;
+	}
+
 #ifdef CONFIG_TCP_MD5SIG
 	opts->md5 = tp->af_specific->md5_lookup(sk, sk);
 	if (unlikely(opts->md5)) {
@@ -672,13 +679,6 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
 	opts->md5 = NULL;
 #endif

-	if (likely(tp->rx_opt.tstamp_ok)) {
-		opts->options |= OPTION_TS;
-		opts->tsval = skb ? tcp_skb_timestamp(skb) + tp->tsoffset : 0;
-		opts->tsecr = tp->rx_opt.ts_recent;
-		size += TCPOLEN_TSTAMP_ALIGNED;
-	}
-
 	if (unlikely(!hlist_empty(&tp->tcp_option_list)))
 		size += tcp_extopt_prepare(skb, 0, MAX_TCP_OPTION_SPACE - size,
 					   opts, tcp_to_sk(tp));

From patchwork Mon Dec 18 21:51:06 2017
X-Patchwork-Submitter: Christoph Paasch
X-Patchwork-Id: 850429
X-Patchwork-Delegate: davem@davemloft.net
From: Christoph Paasch
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Alexei Starovoitov
Subject: [RFC 11/14] tcp_md5: Move TCP-MD5 code out of TCP itself
Date: Mon, 18 Dec 2017 13:51:06 -0800
Message-id: <20171218215109.38700-12-cpaasch@apple.com>
In-reply-to: <20171218215109.38700-1-cpaasch@apple.com>
References: <20171218215109.38700-1-cpaasch@apple.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

This is all just copy-pasting the TCP_MD5-code into functions that are
placed in net/ipv4/tcp_md5.c.
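Among the code this series moves is tcp_parse_md5sig_option(), which walks the TCP option area looking for kind 19 (TCPOPT_MD5SIG) with a fixed length of 18 (kind + length + 16-byte digest). As a rough userspace sketch of that walk (the name find_md5_option() and the extra `length < 2` guard are additions of this sketch, not the kernel code):

```c
#include <stdint.h>
#include <stddef.h>

#define TCPOPT_EOL     0    /* end of option list */
#define TCPOPT_NOP     1    /* one-byte padding */
#define TCPOPT_MD5SIG  19   /* RFC 2385 MD5 signature option */
#define TCPOLEN_MD5SIG 18   /* kind + len + 16-byte digest */

/* Scan the TCP option bytes; return a pointer to the 16-byte
 * MD5 digest if a well-formed MD5SIG option is found, else NULL.
 */
const uint8_t *find_md5_option(const uint8_t *opts, int length)
{
    const uint8_t *ptr = opts;

    while (length > 0) {
        int opcode = *ptr++;
        int opsize;

        switch (opcode) {
        case TCPOPT_EOL:
            return NULL;
        case TCPOPT_NOP:
            length--;
            continue;
        default:
            if (length < 2)
                return NULL;            /* truncated option */
            opsize = *ptr++;
            if (opsize < 2 || opsize > length)
                return NULL;            /* malformed option */
            if (opcode == TCPOPT_MD5SIG)
                return opsize == TCPOLEN_MD5SIG ? ptr : NULL;
        }
        ptr += opsize - 2;
        length -= opsize;
    }
    return NULL;
}
```

For example, an option area of two NOPs followed by an MD5SIG option yields a pointer to the digest four bytes in.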
Signed-off-by: Christoph Paasch
Reviewed-by: Mat Martineau
---
 include/linux/inet_diag.h |    1 +
 include/linux/tcp_md5.h   |  138 ++++
 include/net/tcp.h         |   77 ---
 net/ipv4/Makefile         |    1 +
 net/ipv4/tcp.c            |  133 +----
 net/ipv4/tcp_diag.c       |   81 +---
 net/ipv4/tcp_input.c      |   38 --
 net/ipv4/tcp_ipv4.c       |  520 ++-------
 net/ipv4/tcp_md5.c        | 1102 +++++++++++++++++++++++++++++++++++++++++++++
 net/ipv4/tcp_minisocks.c  |   27 +-
 net/ipv4/tcp_output.c     |    4 +-
 net/ipv6/tcp_ipv6.c       |  318 +------
 12 files changed, 1305 insertions(+), 1135 deletions(-)
 create mode 100644 include/linux/tcp_md5.h
 create mode 100644 net/ipv4/tcp_md5.c

diff --git a/include/linux/inet_diag.h b/include/linux/inet_diag.h
index 39faaaf843e1..1ef6727e41c9 100644
--- a/include/linux/inet_diag.h
+++ b/include/linux/inet_diag.h
@@ -2,6 +2,7 @@
 #ifndef _INET_DIAG_H_
 #define _INET_DIAG_H_ 1
 
+#include
 #include
 
 struct net;
diff --git a/include/linux/tcp_md5.h b/include/linux/tcp_md5.h
new file mode 100644
index 000000000000..f6a681cdded4
--- /dev/null
+++ b/include/linux/tcp_md5.h
@@ -0,0 +1,138 @@
+#ifndef _LINUX_TCP_MD5_H
+#define _LINUX_TCP_MD5_H
+
+#include
+
+#ifdef CONFIG_TCP_MD5SIG
+#include
+
+#include
+
+union tcp_md5_addr {
+    struct in_addr a4;
+#if IS_ENABLED(CONFIG_IPV6)
+    struct in6_addr a6;
+#endif
+};
+
+/* - key database */
+struct tcp_md5sig_key {
+    struct hlist_node node;
+    u8 keylen;
+    u8 family; /* AF_INET or AF_INET6 */
+    union tcp_md5_addr addr;
+    u8 prefixlen;
+    u8 key[TCP_MD5SIG_MAXKEYLEN];
+    struct rcu_head rcu;
+};
+
+/* - sock block */
+struct tcp_md5sig_info {
+    struct hlist_head head;
+    struct rcu_head rcu;
+};
+
+union tcp_md5sum_block {
+    struct tcp4_pseudohdr ip4;
+#if IS_ENABLED(CONFIG_IPV6)
+    struct tcp6_pseudohdr ip6;
+#endif
+};
+
+/* - pool: digest algorithm, hash description and scratch buffer */
+struct tcp_md5sig_pool {
+    struct ahash_request *md5_req;
+    void *scratch;
+};
+
+extern const struct tcp_sock_af_ops tcp_sock_ipv4_specific;
+extern const struct tcp_sock_af_ops tcp_sock_ipv6_specific;
+extern const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific;
+
+/* - functions */
+int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key,
+            const struct sock *sk, const struct sk_buff *skb);
+
+struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk,
+                     const struct sock *addr_sk);
+
+void tcp_v4_md5_destroy_sock(struct sock *sk);
+
+int tcp_v4_md5_send_response_prepare(struct sk_buff *skb, u8 flags,
+                     unsigned int remaining,
+                     struct tcp_out_options *opts,
+                     const struct sock *sk);
+
+void tcp_v4_md5_send_response_write(__be32 *topt, struct sk_buff *skb,
+                    struct tcphdr *t1,
+                    struct tcp_out_options *opts,
+                    const struct sock *sk);
+
+int tcp_v6_md5_send_response_prepare(struct sk_buff *skb, u8 flags,
+                     unsigned int remaining,
+                     struct tcp_out_options *opts,
+                     const struct sock *sk);
+
+void tcp_v6_md5_send_response_write(__be32 *topt, struct sk_buff *skb,
+                    struct tcphdr *t1,
+                    struct tcp_out_options *opts,
+                    const struct sock *sk);
+
+bool tcp_v4_inbound_md5_hash(const struct sock *sk,
+                 const struct sk_buff *skb);
+
+void tcp_v4_md5_syn_recv_sock(const struct sock *listener, struct sock *sk);
+
+void tcp_v6_md5_syn_recv_sock(const struct sock *listener, struct sock *sk);
+
+void tcp_md5_time_wait(struct sock *sk, struct inet_timewait_sock *tw);
+
+struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk,
+                     const struct sock *addr_sk);
+
+int tcp_v6_md5_hash_skb(char *md5_hash,
+            const struct tcp_md5sig_key *key,
+            const struct sock *sk,
+            const struct sk_buff *skb);
+
+bool tcp_v6_inbound_md5_hash(const struct sock *sk,
+                 const struct sk_buff *skb);
+
+static inline void tcp_md5_twsk_destructor(struct sock *sk)
+{
+    struct tcp_timewait_sock *twsk = tcp_twsk(sk);
+
+    if (twsk->tw_md5_key)
+        kfree_rcu(twsk->tw_md5_key, rcu);
+}
+
+static inline void tcp_md5_add_header_len(const struct sock *listener,
+                      struct sock *sk)
+{
+    struct tcp_sock *tp = tcp_sk(sk);
+
+    if (tp->af_specific->md5_lookup(listener, sk))
+        tp->tcp_header_len += TCPOLEN_MD5SIG_ALIGNED;
+}
+
+int tcp_md5_diag_get_aux(struct sock *sk, bool net_admin, struct sk_buff *skb);
+
+int tcp_md5_diag_get_aux_size(struct sock *sk, bool net_admin);
+
+#else
+
+static inline bool tcp_v4_inbound_md5_hash(const struct sock *sk,
+                       const struct sk_buff *skb)
+{
+    return false;
+}
+
+static inline bool tcp_v6_inbound_md5_hash(const struct sock *sk,
+                       const struct sk_buff *skb)
+{
+    return false;
+}
+
+#endif
+
+#endif /* _LINUX_TCP_MD5_H */
diff --git a/include/net/tcp.h b/include/net/tcp.h
index a5c4856e25c7..e955c5f0997f 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -406,7 +406,6 @@ void tcp_parse_options(const struct net *net, const struct sk_buff *skb,
                struct tcp_options_received *opt_rx,
                int estab, struct tcp_fastopen_cookie *foc,
                struct sock *sk);
-const u8 *tcp_parse_md5sig_option(const struct tcphdr *th);
 
 /*
  *    TCP v4 functions exported for the inet6 API
@@ -1415,30 +1414,6 @@ static inline void tcp_clear_all_retrans_hints(struct tcp_sock *tp)
     tp->retransmit_skb_hint = NULL;
 }
 
-union tcp_md5_addr {
-    struct in_addr a4;
-#if IS_ENABLED(CONFIG_IPV6)
-    struct in6_addr a6;
-#endif
-};
-
-/* - key database */
-struct tcp_md5sig_key {
-    struct hlist_node node;
-    u8 keylen;
-    u8 family; /* AF_INET or AF_INET6 */
-    union tcp_md5_addr addr;
-    u8 prefixlen;
-    u8 key[TCP_MD5SIG_MAXKEYLEN];
-    struct rcu_head rcu;
-};
-
-/* - sock block */
-struct tcp_md5sig_info {
-    struct hlist_head head;
-    struct rcu_head rcu;
-};
-
 /* - pseudo header */
 struct tcp4_pseudohdr {
     __be32 saddr;
@@ -1455,58 +1430,6 @@ struct tcp6_pseudohdr {
     __be32 protocol; /* including padding */
 };
 
-union tcp_md5sum_block {
-    struct tcp4_pseudohdr ip4;
-#if IS_ENABLED(CONFIG_IPV6)
-    struct tcp6_pseudohdr ip6;
-#endif
-};
-
-/* - pool: digest algorithm, hash description and scratch buffer */
-struct tcp_md5sig_pool {
-    struct ahash_request *md5_req;
-    void *scratch;
-};
-
-/* - functions */
-int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key,
-            const struct sock *sk, const struct sk_buff *skb);
-int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
-           int family, u8 prefixlen, const u8 *newkey, u8 newkeylen,
-           gfp_t gfp);
-int tcp_md5_do_del(struct sock *sk, const union tcp_md5_addr *addr,
-           int family, u8 prefixlen);
-struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk,
-                     const struct sock *addr_sk);
-
-#ifdef CONFIG_TCP_MD5SIG
-struct tcp_md5sig_key *tcp_md5_do_lookup(const struct sock *sk,
-                     const union tcp_md5_addr *addr,
-                     int family);
-#define tcp_twsk_md5_key(twsk)    ((twsk)->tw_md5_key)
-#else
-static inline struct tcp_md5sig_key *tcp_md5_do_lookup(const struct sock *sk,
-                     const union tcp_md5_addr *addr,
-                     int family)
-{
-    return NULL;
-}
-#define tcp_twsk_md5_key(twsk)    NULL
-#endif
-
-bool tcp_alloc_md5sig_pool(void);
-
-struct tcp_md5sig_pool *tcp_get_md5sig_pool(void);
-static inline void tcp_put_md5sig_pool(void)
-{
-    local_bh_enable();
-}
-
-int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *, const struct sk_buff *,
-              unsigned int header_len);
-int tcp_md5_hash_key(struct tcp_md5sig_pool *hp,
-             const struct tcp_md5sig_key *key);
-
 /* From tcp_fastopen.c */
 void tcp_fastopen_cache_get(struct sock *sk, u16 *mss,
                 struct tcp_fastopen_cookie *cookie);
diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile
index c6c8ad1d4b6d..9262d9a01035 100644
--- a/net/ipv4/Makefile
+++ b/net/ipv4/Makefile
@@ -61,6 +61,7 @@ obj-$(CONFIG_TCP_CONG_LP) += tcp_lp.o
 obj-$(CONFIG_TCP_CONG_YEAH) += tcp_yeah.o
 obj-$(CONFIG_TCP_CONG_ILLINOIS) += tcp_illinois.o
 obj-$(CONFIG_NETLABEL) += cipso_ipv4.o
+obj-$(CONFIG_TCP_MD5SIG) += tcp_md5.o
 obj-$(CONFIG_XFRM) += xfrm4_policy.o xfrm4_state.o xfrm4_input.o \
               xfrm4_output.o xfrm4_protocol.o
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 0a1cabee6d5e..29f3ce8a0b54 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -271,6 +271,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -3337,138 +3338,6 @@ int compat_tcp_getsockopt(struct sock *sk, int level, int optname,
 EXPORT_SYMBOL(compat_tcp_getsockopt);
 #endif
 
-#ifdef CONFIG_TCP_MD5SIG
-static DEFINE_PER_CPU(struct tcp_md5sig_pool, tcp_md5sig_pool);
-static DEFINE_MUTEX(tcp_md5sig_mutex);
-static bool tcp_md5sig_pool_populated = false;
-
-static void __tcp_alloc_md5sig_pool(void)
-{
-    struct crypto_ahash *hash;
-    int cpu;
-
-    hash = crypto_alloc_ahash("md5", 0, CRYPTO_ALG_ASYNC);
-    if (IS_ERR(hash))
-        return;
-
-    for_each_possible_cpu(cpu) {
-        void *scratch = per_cpu(tcp_md5sig_pool, cpu).scratch;
-        struct ahash_request *req;
-
-        if (!scratch) {
-            scratch = kmalloc_node(sizeof(union tcp_md5sum_block) +
-                           sizeof(struct tcphdr),
-                           GFP_KERNEL,
-                           cpu_to_node(cpu));
-            if (!scratch)
-                return;
-            per_cpu(tcp_md5sig_pool, cpu).scratch = scratch;
-        }
-        if (per_cpu(tcp_md5sig_pool, cpu).md5_req)
-            continue;
-
-        req = ahash_request_alloc(hash, GFP_KERNEL);
-        if (!req)
-            return;
-
-        ahash_request_set_callback(req, 0, NULL, NULL);
-
-        per_cpu(tcp_md5sig_pool, cpu).md5_req = req;
-    }
-    /* before setting tcp_md5sig_pool_populated, we must commit all writes
-     * to memory. See smp_rmb() in tcp_get_md5sig_pool()
-     */
-    smp_wmb();
-    tcp_md5sig_pool_populated = true;
-}
-
-bool tcp_alloc_md5sig_pool(void)
-{
-    if (unlikely(!tcp_md5sig_pool_populated)) {
-        mutex_lock(&tcp_md5sig_mutex);
-
-        if (!tcp_md5sig_pool_populated)
-            __tcp_alloc_md5sig_pool();
-
-        mutex_unlock(&tcp_md5sig_mutex);
-    }
-    return tcp_md5sig_pool_populated;
-}
-EXPORT_SYMBOL(tcp_alloc_md5sig_pool);
-
-
-/**
- *    tcp_get_md5sig_pool - get md5sig_pool for this user
- *
- *    We use percpu structure, so if we succeed, we exit with preemption
- *    and BH disabled, to make sure another thread or softirq handling
- *    wont try to get same context.
- */
-struct tcp_md5sig_pool *tcp_get_md5sig_pool(void)
-{
-    local_bh_disable();
-
-    if (tcp_md5sig_pool_populated) {
-        /* coupled with smp_wmb() in __tcp_alloc_md5sig_pool() */
-        smp_rmb();
-        return this_cpu_ptr(&tcp_md5sig_pool);
-    }
-    local_bh_enable();
-    return NULL;
-}
-EXPORT_SYMBOL(tcp_get_md5sig_pool);
-
-int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp,
-              const struct sk_buff *skb, unsigned int header_len)
-{
-    struct scatterlist sg;
-    const struct tcphdr *tp = tcp_hdr(skb);
-    struct ahash_request *req = hp->md5_req;
-    unsigned int i;
-    const unsigned int head_data_len = skb_headlen(skb) > header_len ?
-                       skb_headlen(skb) - header_len : 0;
-    const struct skb_shared_info *shi = skb_shinfo(skb);
-    struct sk_buff *frag_iter;
-
-    sg_init_table(&sg, 1);
-
-    sg_set_buf(&sg, ((u8 *) tp) + header_len, head_data_len);
-    ahash_request_set_crypt(req, &sg, NULL, head_data_len);
-    if (crypto_ahash_update(req))
-        return 1;
-
-    for (i = 0; i < shi->nr_frags; ++i) {
-        const struct skb_frag_struct *f = &shi->frags[i];
-        unsigned int offset = f->page_offset;
-        struct page *page = skb_frag_page(f) + (offset >> PAGE_SHIFT);
-
-        sg_set_page(&sg, page, skb_frag_size(f),
-                offset_in_page(offset));
-        ahash_request_set_crypt(req, &sg, NULL, skb_frag_size(f));
-        if (crypto_ahash_update(req))
-            return 1;
-    }
-
-    skb_walk_frags(skb, frag_iter)
-        if (tcp_md5_hash_skb_data(hp, frag_iter, 0))
-            return 1;
-
-    return 0;
-}
-EXPORT_SYMBOL(tcp_md5_hash_skb_data);
-
-int tcp_md5_hash_key(struct tcp_md5sig_pool *hp, const struct tcp_md5sig_key *key)
-{
-    struct scatterlist sg;
-
-    sg_init_one(&sg, key->key, key->keylen);
-    ahash_request_set_crypt(hp->md5_req, &sg, NULL, key->keylen);
-    return crypto_ahash_update(hp->md5_req);
-}
-EXPORT_SYMBOL(tcp_md5_hash_key);
-
-#endif
-
 struct hlist_head *tcp_extopt_get_list(const struct sock *sk)
 {
     if (sk_fullsock(sk))
diff --git a/net/ipv4/tcp_diag.c b/net/ipv4/tcp_diag.c
index abbf0edcf6c2..5cfe5dc8f8dd 100644
--- a/net/ipv4/tcp_diag.c
+++ b/net/ipv4/tcp_diag.c
@@ -15,6 +15,7 @@
 #include
 
 #include
+#include
 
 #include
 #include
@@ -37,70 +38,14 @@ static void tcp_diag_get_info(struct sock *sk, struct inet_diag_msg *r,
         tcp_get_info(sk, info);
 }
 
-#ifdef CONFIG_TCP_MD5SIG
-static void tcp_diag_md5sig_fill(struct tcp_diag_md5sig *info,
-                 const struct tcp_md5sig_key *key)
-{
-    info->tcpm_family = key->family;
-    info->tcpm_prefixlen = key->prefixlen;
-    info->tcpm_keylen = key->keylen;
-    memcpy(info->tcpm_key, key->key, key->keylen);
-
-    if (key->family == AF_INET)
-        info->tcpm_addr[0] = key->addr.a4.s_addr;
-    #if IS_ENABLED(CONFIG_IPV6)
-    else if (key->family == AF_INET6)
-        memcpy(&info->tcpm_addr, &key->addr.a6,
-               sizeof(info->tcpm_addr));
-    #endif
-}
-
-static int tcp_diag_put_md5sig(struct sk_buff *skb,
-                   const struct tcp_md5sig_info *md5sig)
-{
-    const struct tcp_md5sig_key *key;
-    struct tcp_diag_md5sig *info;
-    struct nlattr *attr;
-    int md5sig_count = 0;
-
-    hlist_for_each_entry_rcu(key, &md5sig->head, node)
-        md5sig_count++;
-    if (md5sig_count == 0)
-        return 0;
-
-    attr = nla_reserve(skb, INET_DIAG_MD5SIG,
-               md5sig_count * sizeof(struct tcp_diag_md5sig));
-    if (!attr)
-        return -EMSGSIZE;
-
-    info = nla_data(attr);
-    memset(info, 0, md5sig_count * sizeof(struct tcp_diag_md5sig));
-    hlist_for_each_entry_rcu(key, &md5sig->head, node) {
-        tcp_diag_md5sig_fill(info++, key);
-        if (--md5sig_count == 0)
-            break;
-    }
-
-    return 0;
-}
-#endif
-
 static int tcp_diag_get_aux(struct sock *sk, bool net_admin,
                 struct sk_buff *skb)
 {
 #ifdef CONFIG_TCP_MD5SIG
-    if (net_admin) {
-        struct tcp_md5sig_info *md5sig;
-        int err = 0;
-
-        rcu_read_lock();
-        md5sig = rcu_dereference(tcp_sk(sk)->md5sig_info);
-        if (md5sig)
-            err = tcp_diag_put_md5sig(skb, md5sig);
-        rcu_read_unlock();
-        if (err < 0)
-            return err;
-    }
+    int err = tcp_md5_diag_get_aux(sk, net_admin, skb);
+
+    if (err < 0)
+        return err;
 #endif
 
     return 0;
@@ -111,21 +56,7 @@ static size_t tcp_diag_get_aux_size(struct sock *sk, bool net_admin)
     size_t size = 0;
 
 #ifdef CONFIG_TCP_MD5SIG
-    if (net_admin && sk_fullsock(sk)) {
-        const struct tcp_md5sig_info *md5sig;
-        const struct tcp_md5sig_key *key;
-        size_t md5sig_count = 0;
-
-        rcu_read_lock();
-        md5sig = rcu_dereference(tcp_sk(sk)->md5sig_info);
-        if (md5sig) {
-            hlist_for_each_entry_rcu(key, &md5sig->head, node)
-                md5sig_count++;
-        }
-        rcu_read_unlock();
-        size += nla_total_size(md5sig_count *
-                       sizeof(struct tcp_diag_md5sig));
-    }
+    size += tcp_md5_diag_get_aux_size(sk, net_admin);
 #endif
 
     return size;
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index af8f4f9fd098..db54bdbdee51 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3847,44 +3847,6 @@ static bool tcp_fast_parse_options(const struct net *net,
     return false;
 }
 
-#ifdef CONFIG_TCP_MD5SIG
-/*
- * Parse MD5 Signature option
- */
-const u8 *tcp_parse_md5sig_option(const struct tcphdr *th)
-{
-    int length = (th->doff << 2) - sizeof(*th);
-    const u8 *ptr = (const u8 *)(th + 1);
-
-    /* If the TCP option is too short, we can short cut */
-    if (length < TCPOLEN_MD5SIG)
-        return NULL;
-
-    while (length > 0) {
-        int opcode = *ptr++;
-        int opsize;
-
-        switch (opcode) {
-        case TCPOPT_EOL:
-            return NULL;
-        case TCPOPT_NOP:
-            length--;
-            continue;
-        default:
-            opsize = *ptr++;
-            if (opsize < 2 || opsize > length)
-                return NULL;
-            if (opcode == TCPOPT_MD5SIG)
-                return opsize == TCPOLEN_MD5SIG ? ptr : NULL;
-        }
-        ptr += opsize - 2;
-        length -= opsize;
-    }
-    return NULL;
-}
-EXPORT_SYMBOL(tcp_parse_md5sig_option);
-#endif
-
 /* Sorry, PAWS as specified is broken wrt. pure-ACKs -DaveM
  *
  * It is not fatal. If this ACK does _not_ change critical state (seqs, window)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 397975203e14..143e1f66a24a 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -62,6 +62,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -87,11 +88,6 @@
 #include
 
-#ifdef CONFIG_TCP_MD5SIG
-static int tcp_v4_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key,
-                   __be32 daddr, __be32 saddr, const struct tcphdr *th);
-#endif
-
 struct inet_hashinfo tcp_hashinfo;
 EXPORT_SYMBOL(tcp_hashinfo);
 
@@ -603,16 +599,13 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
         __be32 opt[(MAX_TCP_OPTION_SPACE >> 2)];
     } rep;
     struct hlist_head *extopt_list = NULL;
+    struct tcp_out_options opts;
     struct ip_reply_arg arg;
-#ifdef CONFIG_TCP_MD5SIG
-    struct tcp_md5sig_key *key = NULL;
-    const __u8 *hash_location = NULL;
-    unsigned char newhash[16];
-    int genhash;
-    struct sock *sk1 = NULL;
-#endif
     struct net *net;
    int offset = 0;
+#ifdef CONFIG_TCP_MD5SIG
+    int ret;
+#endif
 
     /* Never send a reset in response to a reset. */
     if (th->rst)
@@ -627,6 +620,8 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
     if (sk)
         extopt_list = tcp_extopt_get_list(sk);
 
+    memset(&opts, 0, sizeof(opts));
+
     /* Swap the send and the receive. */
     memset(&rep, 0, sizeof(rep));
     rep.th.dest = th->source;
@@ -647,55 +642,28 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
     arg.iov[0].iov_len = sizeof(rep.th);
 
     net = sk ? sock_net(sk) : dev_net(skb_dst(skb)->dev);
-#ifdef CONFIG_TCP_MD5SIG
-    rcu_read_lock();
-    hash_location = tcp_parse_md5sig_option(th);
-    if (sk && sk_fullsock(sk)) {
-        key = tcp_md5_do_lookup(sk, (union tcp_md5_addr *)
-                    &ip_hdr(skb)->saddr, AF_INET);
-    } else if (hash_location) {
-        /*
-         * active side is lost. Try to find listening socket through
-         * source port, and then find md5 key through listening socket.
-         * we are not loose security here:
-         * Incoming packet is checked with md5 hash with finding key,
-         * no RST generated if md5 hash doesn't match.
-         */
-        sk1 = __inet_lookup_listener(net, &tcp_hashinfo, NULL, 0,
-                         ip_hdr(skb)->saddr,
-                         th->source, ip_hdr(skb)->daddr,
-                         ntohs(th->source), inet_iif(skb),
-                         tcp_v4_sdif(skb));
-        /* don't send rst if it can't find key */
-        if (!sk1)
-            goto out;
-
-        key = tcp_md5_do_lookup(sk1, (union tcp_md5_addr *)
-                    &ip_hdr(skb)->saddr, AF_INET);
-        if (!key)
-            goto out;
+#ifdef CONFIG_TCP_MD5SIG
+    ret = tcp_v4_md5_send_response_prepare(skb, 0,
+                           MAX_TCP_OPTION_SPACE - arg.iov[0].iov_len,
+                           &opts, sk);
 
-        genhash = tcp_v4_md5_hash_skb(newhash, key, NULL, skb);
-        if (genhash || memcmp(hash_location, newhash, 16) != 0)
-            goto out;
+    if (ret == -1)
+        return;
 
-    }
+    arg.iov[0].iov_len += ret;
 #endif
 
     if (unlikely(extopt_list && !hlist_empty(extopt_list))) {
         unsigned int remaining;
-        struct tcp_out_options opts;
         int used;
 
         remaining = sizeof(rep.opt);
 #ifdef CONFIG_TCP_MD5SIG
-        if (key)
+        if (opts.md5)
             remaining -= TCPOLEN_MD5SIG_ALIGNED;
 #endif
 
-        memset(&opts, 0, sizeof(opts));
-
         used = tcp_extopt_response_prepare(skb, TCPHDR_RST, remaining,
                            &opts, sk);
@@ -707,19 +675,7 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
     }
 
 #ifdef CONFIG_TCP_MD5SIG
-    if (key) {
-        rep.opt[offset++] = htonl((TCPOPT_NOP << 24) |
-                      (TCPOPT_NOP << 16) |
-                      (TCPOPT_MD5SIG << 8) |
-                      TCPOLEN_MD5SIG);
-        /* Update length and the length the header thinks exists */
-        arg.iov[0].iov_len += TCPOLEN_MD5SIG_ALIGNED;
-        rep.th.doff = arg.iov[0].iov_len / 4;
-
-        tcp_v4_md5_hash_hdr((__u8 *)&rep.opt[offset],
-                    key, ip_hdr(skb)->saddr,
-                    ip_hdr(skb)->daddr, &rep.th);
-    }
+    tcp_v4_md5_send_response_write(&rep.opt[offset], skb, &rep.th, &opts, sk);
 #endif
     arg.csum = csum_tcpudp_nofold(ip_hdr(skb)->daddr,
                       ip_hdr(skb)->saddr, /* XXX */
@@ -750,11 +706,6 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
     __TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
     __TCP_INC_STATS(net, TCP_MIB_OUTRSTS);
     local_bh_enable();
-
-#ifdef CONFIG_TCP_MD5SIG
-out:
-    rcu_read_unlock();
-#endif
 }
 
 /* The code following below sending ACKs in SYN-RECV and TIME-WAIT states
@@ -772,17 +723,19 @@ static void tcp_v4_send_ack(const struct sock *sk,
         __be32 opt[(MAX_TCP_OPTION_SPACE >> 2)];
     } rep;
     struct hlist_head *extopt_list = NULL;
-#ifdef CONFIG_TCP_MD5SIG
-    struct tcp_md5sig_key *key;
-#endif
+    struct tcp_out_options opts;
     struct net *net = sock_net(sk);
     struct ip_reply_arg arg;
     int offset = 0;
+#ifdef CONFIG_TCP_MD5SIG
+    int ret;
+#endif
 
     extopt_list = tcp_extopt_get_list(sk);
 
     memset(&rep.th, 0, sizeof(struct tcphdr));
     memset(&arg, 0, sizeof(arg));
+    memset(&opts, 0, sizeof(opts));
 
     arg.iov[0].iov_base = (unsigned char *)&rep;
     arg.iov[0].iov_len = sizeof(rep.th);
@@ -806,25 +759,24 @@ static void tcp_v4_send_ack(const struct sock *sk,
     rep.th.window = htons(win);
 
 #ifdef CONFIG_TCP_MD5SIG
-    if (sk->sk_state == TCP_TIME_WAIT) {
-        key = tcp_twsk_md5_key(tcp_twsk(sk));
-    } else if (sk->sk_state == TCP_NEW_SYN_RECV) {
-        key = tcp_md5_do_lookup(sk, (union tcp_md5_addr *)&ip_hdr(skb)->saddr,
-                    AF_INET);
-    } else {
-        key = NULL; /* Should not happen */
-    }
+    ret = tcp_v4_md5_send_response_prepare(skb, 0,
+                           MAX_TCP_OPTION_SPACE - arg.iov[0].iov_len,
+                           &opts, sk);
+
+    if (ret == -1)
+        return;
+
+    arg.iov[0].iov_len += ret;
 #endif
 
     if (unlikely(extopt_list && !hlist_empty(extopt_list))) {
         unsigned int remaining;
-        struct tcp_out_options opts;
         int used;
 
         remaining = sizeof(rep.th) + sizeof(rep.opt) - arg.iov[0].iov_len;
 #ifdef CONFIG_TCP_MD5SIG
-        if (key)
+        if (opts.md5)
             remaining -= TCPOLEN_MD5SIG_ALIGNED;
 #endif
 
@@ -841,18 +793,11 @@ static void tcp_v4_send_ack(const struct sock *sk,
     }
 
 #ifdef CONFIG_TCP_MD5SIG
-    if (key) {
-        rep.opt[offset++] = htonl((TCPOPT_NOP << 24) |
-                      (TCPOPT_NOP << 16) |
-                      (TCPOPT_MD5SIG << 8) |
-                      TCPOLEN_MD5SIG);
+    if (opts.md5) {
         arg.iov[0].iov_len += TCPOLEN_MD5SIG_ALIGNED;
         rep.th.doff = arg.iov[0].iov_len / 4;
-
-        tcp_v4_md5_hash_hdr((__u8 *) &rep.opt[offset],
-                    key, ip_hdr(skb)->saddr,
-                    ip_hdr(skb)->daddr, &rep.th);
     }
+    tcp_v4_md5_send_response_write(&rep.opt[offset], skb, &rep.th, &opts, sk);
 #endif
     arg.flags = reply_flags;
@@ -961,374 +906,6 @@ static void tcp_v4_reqsk_destructor(struct request_sock *req)
     kfree(rcu_dereference_protected(inet_rsk(req)->ireq_opt, 1));
 }
 
-#ifdef CONFIG_TCP_MD5SIG
-/*
- * RFC2385 MD5 checksumming requires a mapping of
- * IP address->MD5 Key.
- * We need to maintain these in the sk structure.
- */
-
-/* Find the Key structure for an address.  */
-struct tcp_md5sig_key *tcp_md5_do_lookup(const struct sock *sk,
-                     const union tcp_md5_addr *addr,
-                     int family)
-{
-    const struct tcp_sock *tp = tcp_sk(sk);
-    struct tcp_md5sig_key *key;
-    const struct tcp_md5sig_info *md5sig;
-    __be32 mask;
-    struct tcp_md5sig_key *best_match = NULL;
-    bool match;
-
-    /* caller either holds rcu_read_lock() or socket lock */
-    md5sig = rcu_dereference_check(tp->md5sig_info,
-                       lockdep_sock_is_held(sk));
-    if (!md5sig)
-        return NULL;
-
-    hlist_for_each_entry_rcu(key, &md5sig->head, node) {
-        if (key->family != family)
-            continue;
-
-        if (family == AF_INET) {
-            mask = inet_make_mask(key->prefixlen);
-            match = (key->addr.a4.s_addr & mask) ==
-                (addr->a4.s_addr & mask);
-#if IS_ENABLED(CONFIG_IPV6)
-        } else if (family == AF_INET6) {
-            match = ipv6_prefix_equal(&key->addr.a6, &addr->a6,
-                          key->prefixlen);
-#endif
-        } else {
-            match = false;
-        }
-
-        if (match && (!best_match ||
-                  key->prefixlen > best_match->prefixlen))
-            best_match = key;
-    }
-    return best_match;
-}
-EXPORT_SYMBOL(tcp_md5_do_lookup);
-
-static struct tcp_md5sig_key *tcp_md5_do_lookup_exact(const struct sock *sk,
-                              const union tcp_md5_addr *addr,
-                              int family, u8 prefixlen)
-{
-    const struct tcp_sock *tp = tcp_sk(sk);
-    struct tcp_md5sig_key *key;
-    unsigned int size = sizeof(struct in_addr);
-    const struct tcp_md5sig_info *md5sig;
-
-    /* caller either holds rcu_read_lock() or socket lock */
-    md5sig = rcu_dereference_check(tp->md5sig_info,
-                       lockdep_sock_is_held(sk));
-    if (!md5sig)
-        return NULL;
-#if IS_ENABLED(CONFIG_IPV6)
-    if (family == AF_INET6)
-        size = sizeof(struct in6_addr);
-#endif
-    hlist_for_each_entry_rcu(key, &md5sig->head, node) {
-        if (key->family != family)
-            continue;
-        if (!memcmp(&key->addr, addr, size) &&
-            key->prefixlen == prefixlen)
-            return key;
-    }
-    return NULL;
-}
-
-struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk,
-                     const struct sock *addr_sk)
-{
-    const union tcp_md5_addr *addr;
-
-    addr = (const union tcp_md5_addr *)&addr_sk->sk_daddr;
-    return tcp_md5_do_lookup(sk, addr, AF_INET);
-}
-EXPORT_SYMBOL(tcp_v4_md5_lookup);
-
-/* This can be called on a newly created socket, from other files */
-int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
-           int family, u8 prefixlen, const u8 *newkey, u8 newkeylen,
-           gfp_t gfp)
-{
-    /* Add Key to the list */
-    struct tcp_md5sig_key *key;
-    struct tcp_sock *tp = tcp_sk(sk);
-    struct tcp_md5sig_info *md5sig;
-
-    key = tcp_md5_do_lookup_exact(sk, addr, family, prefixlen);
-    if (key) {
-        /* Pre-existing entry - just update that one. */
-        memcpy(key->key, newkey, newkeylen);
-        key->keylen = newkeylen;
-        return 0;
-    }
-
-    md5sig = rcu_dereference_protected(tp->md5sig_info,
-                       lockdep_sock_is_held(sk));
-    if (!md5sig) {
-        md5sig = kmalloc(sizeof(*md5sig), gfp);
-        if (!md5sig)
-            return -ENOMEM;
-
-        sk_nocaps_add(sk, NETIF_F_GSO_MASK);
-        INIT_HLIST_HEAD(&md5sig->head);
-        rcu_assign_pointer(tp->md5sig_info, md5sig);
-    }
-
-    key = sock_kmalloc(sk, sizeof(*key), gfp);
-    if (!key)
-        return -ENOMEM;
-    if (!tcp_alloc_md5sig_pool()) {
-        sock_kfree_s(sk, key, sizeof(*key));
-        return -ENOMEM;
-    }
-
-    memcpy(key->key, newkey, newkeylen);
-    key->keylen = newkeylen;
-    key->family = family;
-    key->prefixlen = prefixlen;
-    memcpy(&key->addr, addr,
-           (family == AF_INET6) ? sizeof(struct in6_addr) :
-                      sizeof(struct in_addr));
-    hlist_add_head_rcu(&key->node, &md5sig->head);
-    return 0;
-}
-EXPORT_SYMBOL(tcp_md5_do_add);
-
-int tcp_md5_do_del(struct sock *sk, const union tcp_md5_addr *addr, int family,
-           u8 prefixlen)
-{
-    struct tcp_md5sig_key *key;
-
-    key = tcp_md5_do_lookup_exact(sk, addr, family, prefixlen);
-    if (!key)
-        return -ENOENT;
-    hlist_del_rcu(&key->node);
-    atomic_sub(sizeof(*key), &sk->sk_omem_alloc);
-    kfree_rcu(key, rcu);
-    return 0;
-}
-EXPORT_SYMBOL(tcp_md5_do_del);
-
-static void tcp_clear_md5_list(struct sock *sk)
-{
-    struct tcp_sock *tp = tcp_sk(sk);
-    struct tcp_md5sig_key *key;
-    struct hlist_node *n;
-    struct tcp_md5sig_info *md5sig;
-
-    md5sig = rcu_dereference_protected(tp->md5sig_info, 1);
-
-    hlist_for_each_entry_safe(key, n, &md5sig->head, node) {
-        hlist_del_rcu(&key->node);
-        atomic_sub(sizeof(*key), &sk->sk_omem_alloc);
-        kfree_rcu(key, rcu);
-    }
-}
-
-static int tcp_v4_parse_md5_keys(struct sock *sk, int optname,
-                 char __user *optval, int optlen)
-{
-    struct tcp_md5sig cmd;
-    struct sockaddr_in *sin = (struct sockaddr_in *)&cmd.tcpm_addr;
-    u8 prefixlen = 32;
-
-    if (optlen < sizeof(cmd))
-        return -EINVAL;
-
-    if (copy_from_user(&cmd, optval, sizeof(cmd)))
-        return -EFAULT;
-
-    if (sin->sin_family != AF_INET)
-        return -EINVAL;
-
-    if (optname == TCP_MD5SIG_EXT &&
-        cmd.tcpm_flags & TCP_MD5SIG_FLAG_PREFIX) {
-        prefixlen = cmd.tcpm_prefixlen;
-        if (prefixlen > 32)
-            return -EINVAL;
-    }
-
-    if (!cmd.tcpm_keylen)
-        return tcp_md5_do_del(sk, (union tcp_md5_addr *)&sin->sin_addr.s_addr,
-                      AF_INET, prefixlen);
-
-    if (cmd.tcpm_keylen > TCP_MD5SIG_MAXKEYLEN)
-        return -EINVAL;
-
-    return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin->sin_addr.s_addr,
-                  AF_INET, prefixlen, cmd.tcpm_key, cmd.tcpm_keylen,
-                  GFP_KERNEL);
-}
-
-static int tcp_v4_md5_hash_headers(struct tcp_md5sig_pool *hp,
-                   __be32 daddr, __be32 saddr,
-                   const struct tcphdr *th, int nbytes)
-{
-    struct tcp4_pseudohdr *bp;
-    struct scatterlist sg;
-    struct tcphdr *_th;
-
-    bp = hp->scratch;
-    bp->saddr = saddr;
-    bp->daddr = daddr;
-    bp->pad = 0;
-    bp->protocol = IPPROTO_TCP;
-    bp->len = cpu_to_be16(nbytes);
-
-    _th = (struct tcphdr *)(bp + 1);
-    memcpy(_th, th, sizeof(*th));
-    _th->check = 0;
-
-    sg_init_one(&sg, bp, sizeof(*bp) + sizeof(*th));
-    ahash_request_set_crypt(hp->md5_req, &sg, NULL,
-                sizeof(*bp) + sizeof(*th));
-    return crypto_ahash_update(hp->md5_req);
-}
-
-static int tcp_v4_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key,
-                   __be32 daddr, __be32 saddr, const struct tcphdr *th)
-{
-    struct tcp_md5sig_pool *hp;
-    struct ahash_request *req;
-
-    hp = tcp_get_md5sig_pool();
-    if (!hp)
-        goto clear_hash_noput;
-    req = hp->md5_req;
-
-    if (crypto_ahash_init(req))
-        goto clear_hash;
-    if (tcp_v4_md5_hash_headers(hp, daddr, saddr, th, th->doff << 2))
-        goto clear_hash;
-    if (tcp_md5_hash_key(hp, key))
-        goto clear_hash;
-    ahash_request_set_crypt(req, NULL, md5_hash, 0);
-    if (crypto_ahash_final(req))
-        goto clear_hash;
-
-    tcp_put_md5sig_pool();
-    return 0;
-
-clear_hash:
-    tcp_put_md5sig_pool();
-clear_hash_noput:
-    memset(md5_hash, 0, 16);
-    return 1;
-}
-
-int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key,
-            const struct sock *sk,
-            const struct sk_buff *skb)
-{
-    struct tcp_md5sig_pool *hp;
-    struct ahash_request *req;
-    const struct tcphdr *th = tcp_hdr(skb);
-    __be32 saddr, daddr;
-
-    if (sk) { /* valid for establish/request sockets */
-        saddr = sk->sk_rcv_saddr;
-        daddr = sk->sk_daddr;
-    } else {
-        const struct iphdr *iph = ip_hdr(skb);
-        saddr = iph->saddr;
-        daddr = iph->daddr;
-    }
-
-    hp = tcp_get_md5sig_pool();
-    if (!hp)
-        goto clear_hash_noput;
-    req = hp->md5_req;
-
-    if (crypto_ahash_init(req))
-        goto clear_hash;
-
-    if (tcp_v4_md5_hash_headers(hp, daddr, saddr, th, skb->len))
-        goto clear_hash;
-    if (tcp_md5_hash_skb_data(hp, skb, th->doff << 2))
-        goto clear_hash;
-    if (tcp_md5_hash_key(hp, key))
-        goto clear_hash;
-    ahash_request_set_crypt(req, NULL, md5_hash, 0);
-    if (crypto_ahash_final(req))
-        goto clear_hash;
-
-    tcp_put_md5sig_pool();
-    return 0;
-
-clear_hash:
-    tcp_put_md5sig_pool();
-clear_hash_noput:
-    memset(md5_hash, 0, 16);
-    return 1;
-}
-EXPORT_SYMBOL(tcp_v4_md5_hash_skb);
-
-#endif
-
-/* Called with rcu_read_lock() */
-static bool tcp_v4_inbound_md5_hash(const struct sock *sk,
-                    const struct sk_buff *skb)
-{
-#ifdef CONFIG_TCP_MD5SIG
-    /*
-     * This gets called for each TCP segment that arrives
-     * so we want to be efficient.
-     * We have 3 drop cases:
-     * o No MD5 hash and one expected.
-     * o MD5 hash and we're not expecting one.
-     * o MD5 hash and its wrong.
-     */
-    const __u8 *hash_location = NULL;
-    struct tcp_md5sig_key *hash_expected;
-    const struct iphdr *iph = ip_hdr(skb);
-    const struct tcphdr *th = tcp_hdr(skb);
-    int genhash;
-    unsigned char newhash[16];
-
-    hash_expected = tcp_md5_do_lookup(sk, (union tcp_md5_addr *)&iph->saddr,
-                      AF_INET);
-    hash_location = tcp_parse_md5sig_option(th);
-
-    /* We've parsed the options - do we have a hash? */
-    if (!hash_expected && !hash_location)
-        return false;
-
-    if (hash_expected && !hash_location) {
-        NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND);
-        return true;
-    }
-
-    if (!hash_expected && hash_location) {
-        NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5UNEXPECTED);
-        return true;
-    }
-
-    /* Okay, so this is hash_expected and hash_location -
-     * so we need to calculate the checksum.
-     */
-    genhash = tcp_v4_md5_hash_skb(newhash,
-                      hash_expected,
-                      NULL, skb);
-
-    if (genhash || memcmp(hash_location, newhash, 16) != 0) {
-        NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5FAILURE);
-        net_info_ratelimited("MD5 Hash failed for (%pI4, %d)->(%pI4, %d)%s\n",
-                     &iph->saddr, ntohs(th->source),
-                     &iph->daddr, ntohs(th->dest),
-                     genhash ? " tcp_v4_calc_md5_hash failed"
-                         : "");
-        return true;
-    }
-    return false;
-#endif
-    return false;
-}
-
 static void tcp_v4_init_req(struct request_sock *req,
                 const struct sock *sk_listener,
                 struct sk_buff *skb)
@@ -1404,9 +981,6 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb,
     struct inet_sock *newinet;
     struct tcp_sock *newtp;
     struct sock *newsk;
-#ifdef CONFIG_TCP_MD5SIG
-    struct tcp_md5sig_key *key;
-#endif
     struct ip_options_rcu *inet_opt;
 
     if (sk_acceptq_is_full(sk))
@@ -1453,20 +1027,7 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb,
     tcp_initialize_rcv_mss(newsk);
 
 #ifdef CONFIG_TCP_MD5SIG
-    /* Copy over the MD5 key from the original socket */
-    key = tcp_md5_do_lookup(sk, (union tcp_md5_addr *)&newinet->inet_daddr,
-                AF_INET);
-    if (key) {
-        /*
-         * We're using one, so create a matching key
-         * on the newsk structure. If we fail to get
-         * memory, then we end up not copying the key
-         * across. Shucks.
-         */
-        tcp_md5_do_add(newsk, (union tcp_md5_addr *)&newinet->inet_daddr,
-                   AF_INET, 32, key->key, key->keylen, GFP_ATOMIC);
-        sk_nocaps_add(newsk, NETIF_F_GSO_MASK);
-    }
+    tcp_v4_md5_syn_recv_sock(sk, newsk);
 #endif
 
     if (__inet_inherit_port(sk, newsk) < 0)
@@ -1930,14 +1491,6 @@ const struct inet_connection_sock_af_ops ipv4_specific = {
 };
 EXPORT_SYMBOL(ipv4_specific);
 
-#ifdef CONFIG_TCP_MD5SIG
-static const struct tcp_sock_af_ops tcp_sock_ipv4_specific = {
-    .md5_lookup    = tcp_v4_md5_lookup,
-    .calc_md5_hash    = tcp_v4_md5_hash_skb,
-    .md5_parse    = tcp_v4_parse_md5_keys,
-};
-#endif
-
 /* NOTE: A lot of things set to zero explicitly by call to
  *       sk_alloc() so need not be done here.
  */
@@ -1980,12 +1533,7 @@ void tcp_v4_destroy_sock(struct sock *sk)
     if (unlikely(!hlist_empty(&tp->tcp_option_list)))
         tcp_extopt_destroy(sk);
 #ifdef CONFIG_TCP_MD5SIG
-    /* Clean up the MD5 key list, if any */
-    if (tp->md5sig_info) {
-        tcp_clear_md5_list(sk);
-        kfree_rcu(tp->md5sig_info, rcu);
-        tp->md5sig_info = NULL;
-    }
+    tcp_v4_md5_destroy_sock(sk);
 #endif
 
     /* Clean up a referenced TCP bind bucket. */
diff --git a/net/ipv4/tcp_md5.c b/net/ipv4/tcp_md5.c
new file mode 100644
index 000000000000..a31b404e6dbf
--- /dev/null
+++ b/net/ipv4/tcp_md5.c
@@ -0,0 +1,1102 @@
+#include
+#include
+#include
+#include
+
+#include
+
+#include
+
+static DEFINE_PER_CPU(struct tcp_md5sig_pool, tcp_md5sig_pool);
+static DEFINE_MUTEX(tcp_md5sig_mutex);
+static bool tcp_md5sig_pool_populated;
+
+#define tcp_twsk_md5_key(twsk)    ((twsk)->tw_md5_key)
+
+static void __tcp_alloc_md5sig_pool(void)
+{
+    struct crypto_ahash *hash;
+    int cpu;
+
+    hash = crypto_alloc_ahash("md5", 0, CRYPTO_ALG_ASYNC);
+    if (IS_ERR(hash))
+        return;
+
+    for_each_possible_cpu(cpu) {
+        void *scratch = per_cpu(tcp_md5sig_pool, cpu).scratch;
+        struct ahash_request *req;
+
+        if (!scratch) {
+            scratch = kmalloc_node(sizeof(union tcp_md5sum_block) +
+                           sizeof(struct tcphdr),
+                           GFP_KERNEL,
+                           cpu_to_node(cpu));
+            if (!scratch)
+                return;
+            per_cpu(tcp_md5sig_pool, cpu).scratch = scratch;
+        }
+        if (per_cpu(tcp_md5sig_pool, cpu).md5_req)
+            continue;
+
+        req = ahash_request_alloc(hash, GFP_KERNEL);
+        if (!req)
+            return;
+
+        ahash_request_set_callback(req, 0, NULL, NULL);
+
+        per_cpu(tcp_md5sig_pool, cpu).md5_req = req;
+    }
+    /* before setting tcp_md5sig_pool_populated, we must commit all writes
See smp_rmb() in tcp_get_md5sig_pool()
+ */
+	smp_wmb();
+	tcp_md5sig_pool_populated = true;
+}
+
+static bool tcp_alloc_md5sig_pool(void)
+{
+	if (unlikely(!tcp_md5sig_pool_populated)) {
+		mutex_lock(&tcp_md5sig_mutex);
+
+		if (!tcp_md5sig_pool_populated)
+			__tcp_alloc_md5sig_pool();
+
+		mutex_unlock(&tcp_md5sig_mutex);
+	}
+	return tcp_md5sig_pool_populated;
+}
+
+static void tcp_put_md5sig_pool(void)
+{
+	local_bh_enable();
+}
+
+/**
+ * tcp_get_md5sig_pool - get md5sig_pool for this user
+ *
+ * We use a percpu structure, so if we succeed, we exit with preemption
+ * and BH disabled, to make sure another thread or softirq handler
+ * won't try to get the same context.
+ */
+static struct tcp_md5sig_pool *tcp_get_md5sig_pool(void)
+{
+	local_bh_disable();
+
+	if (tcp_md5sig_pool_populated) {
+		/* coupled with smp_wmb() in __tcp_alloc_md5sig_pool() */
+		smp_rmb();
+		return this_cpu_ptr(&tcp_md5sig_pool);
+	}
+	local_bh_enable();
+	return NULL;
+}
+
+static struct tcp_md5sig_key *tcp_md5_do_lookup_exact(const struct sock *sk,
+						      const union tcp_md5_addr *addr,
+						      int family, u8 prefixlen)
+{
+	const struct tcp_sock *tp = tcp_sk(sk);
+	struct tcp_md5sig_key *key;
+	unsigned int size = sizeof(struct in_addr);
+	const struct tcp_md5sig_info *md5sig;
+
+	/* caller either holds rcu_read_lock() or socket lock */
+	md5sig = rcu_dereference_check(tp->md5sig_info,
+				       lockdep_sock_is_held(sk));
+	if (!md5sig)
+		return NULL;
+#if IS_ENABLED(CONFIG_IPV6)
+	if (family == AF_INET6)
+		size = sizeof(struct in6_addr);
+#endif
+	hlist_for_each_entry_rcu(key, &md5sig->head, node) {
+		if (key->family != family)
+			continue;
+		if (!memcmp(&key->addr, addr, size) &&
+		    key->prefixlen == prefixlen)
+			return key;
+	}
+	return NULL;
+}
+
+/* This can be called on a newly created socket, from other files */
+static int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
+			  int family, u8 prefixlen, const u8 *newkey,
+			  u8 newkeylen, gfp_t gfp)
+{
+	/* Add Key to the list */
+	struct
tcp_md5sig_key *key; + struct tcp_sock *tp = tcp_sk(sk); + struct tcp_md5sig_info *md5sig; + + key = tcp_md5_do_lookup_exact(sk, addr, family, prefixlen); + if (key) { + /* Pre-existing entry - just update that one. */ + memcpy(key->key, newkey, newkeylen); + key->keylen = newkeylen; + return 0; + } + + md5sig = rcu_dereference_protected(tp->md5sig_info, + lockdep_sock_is_held(sk)); + if (!md5sig) { + md5sig = kmalloc(sizeof(*md5sig), gfp); + if (!md5sig) + return -ENOMEM; + + sk_nocaps_add(sk, NETIF_F_GSO_MASK); + INIT_HLIST_HEAD(&md5sig->head); + rcu_assign_pointer(tp->md5sig_info, md5sig); + } + + key = sock_kmalloc(sk, sizeof(*key), gfp); + if (!key) + return -ENOMEM; + if (!tcp_alloc_md5sig_pool()) { + sock_kfree_s(sk, key, sizeof(*key)); + return -ENOMEM; + } + + memcpy(key->key, newkey, newkeylen); + key->keylen = newkeylen; + key->family = family; + key->prefixlen = prefixlen; + memcpy(&key->addr, addr, + (family == AF_INET6) ? sizeof(struct in6_addr) : + sizeof(struct in_addr)); + hlist_add_head_rcu(&key->node, &md5sig->head); + return 0; +} + +static void tcp_clear_md5_list(struct sock *sk) +{ + struct tcp_sock *tp = tcp_sk(sk); + struct tcp_md5sig_key *key; + struct hlist_node *n; + struct tcp_md5sig_info *md5sig; + + md5sig = rcu_dereference_protected(tp->md5sig_info, 1); + + hlist_for_each_entry_safe(key, n, &md5sig->head, node) { + hlist_del_rcu(&key->node); + atomic_sub(sizeof(*key), &sk->sk_omem_alloc); + kfree_rcu(key, rcu); + } +} + +static int tcp_md5_do_del(struct sock *sk, const union tcp_md5_addr *addr, + int family, u8 prefixlen) +{ + struct tcp_md5sig_key *key; + + key = tcp_md5_do_lookup_exact(sk, addr, family, prefixlen); + if (!key) + return -ENOENT; + hlist_del_rcu(&key->node); + atomic_sub(sizeof(*key), &sk->sk_omem_alloc); + kfree_rcu(key, rcu); + return 0; +} + +static int tcp_md5_hash_key(struct tcp_md5sig_pool *hp, + const struct tcp_md5sig_key *key) +{ + struct scatterlist sg; + + sg_init_one(&sg, key->key, key->keylen); + 
ahash_request_set_crypt(hp->md5_req, &sg, NULL, key->keylen); + return crypto_ahash_update(hp->md5_req); +} + +static int tcp_v4_parse_md5_keys(struct sock *sk, int optname, + char __user *optval, int optlen) +{ + struct tcp_md5sig cmd; + struct sockaddr_in *sin = (struct sockaddr_in *)&cmd.tcpm_addr; + u8 prefixlen = 32; + + if (optlen < sizeof(cmd)) + return -EINVAL; + + if (copy_from_user(&cmd, optval, sizeof(cmd))) + return -EFAULT; + + if (sin->sin_family != AF_INET) + return -EINVAL; + + if (optname == TCP_MD5SIG_EXT && + cmd.tcpm_flags & TCP_MD5SIG_FLAG_PREFIX) { + prefixlen = cmd.tcpm_prefixlen; + if (prefixlen > 32) + return -EINVAL; + } + + if (!cmd.tcpm_keylen) + return tcp_md5_do_del(sk, (union tcp_md5_addr *)&sin->sin_addr.s_addr, + AF_INET, prefixlen); + + if (cmd.tcpm_keylen > TCP_MD5SIG_MAXKEYLEN) + return -EINVAL; + + return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin->sin_addr.s_addr, + AF_INET, prefixlen, cmd.tcpm_key, cmd.tcpm_keylen, + GFP_KERNEL); +} + +#if IS_ENABLED(CONFIG_IPV6) +static int tcp_v6_parse_md5_keys(struct sock *sk, int optname, + char __user *optval, int optlen) +{ + struct tcp_md5sig cmd; + struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&cmd.tcpm_addr; + u8 prefixlen; + + if (optlen < sizeof(cmd)) + return -EINVAL; + + if (copy_from_user(&cmd, optval, sizeof(cmd))) + return -EFAULT; + + if (sin6->sin6_family != AF_INET6) + return -EINVAL; + + if (optname == TCP_MD5SIG_EXT && + cmd.tcpm_flags & TCP_MD5SIG_FLAG_PREFIX) { + prefixlen = cmd.tcpm_prefixlen; + if (prefixlen > 128 || (ipv6_addr_v4mapped(&sin6->sin6_addr) && + prefixlen > 32)) + return -EINVAL; + } else { + prefixlen = ipv6_addr_v4mapped(&sin6->sin6_addr) ? 
32 : 128; + } + + if (!cmd.tcpm_keylen) { + if (ipv6_addr_v4mapped(&sin6->sin6_addr)) + return tcp_md5_do_del(sk, (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3], + AF_INET, prefixlen); + return tcp_md5_do_del(sk, (union tcp_md5_addr *)&sin6->sin6_addr, + AF_INET6, prefixlen); + } + + if (cmd.tcpm_keylen > TCP_MD5SIG_MAXKEYLEN) + return -EINVAL; + + if (ipv6_addr_v4mapped(&sin6->sin6_addr)) + return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3], + AF_INET, prefixlen, cmd.tcpm_key, + cmd.tcpm_keylen, GFP_KERNEL); + + return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin6->sin6_addr, + AF_INET6, prefixlen, cmd.tcpm_key, + cmd.tcpm_keylen, GFP_KERNEL); +} +#endif + +static int tcp_v4_md5_hash_headers(struct tcp_md5sig_pool *hp, + __be32 daddr, __be32 saddr, + const struct tcphdr *th, int nbytes) +{ + struct tcp4_pseudohdr *bp; + struct scatterlist sg; + struct tcphdr *_th; + + bp = hp->scratch; + bp->saddr = saddr; + bp->daddr = daddr; + bp->pad = 0; + bp->protocol = IPPROTO_TCP; + bp->len = cpu_to_be16(nbytes); + + _th = (struct tcphdr *)(bp + 1); + memcpy(_th, th, sizeof(*th)); + _th->check = 0; + + sg_init_one(&sg, bp, sizeof(*bp) + sizeof(*th)); + ahash_request_set_crypt(hp->md5_req, &sg, NULL, + sizeof(*bp) + sizeof(*th)); + return crypto_ahash_update(hp->md5_req); +} + +#if IS_ENABLED(CONFIG_IPV6) +static int tcp_v6_md5_hash_headers(struct tcp_md5sig_pool *hp, + const struct in6_addr *daddr, + const struct in6_addr *saddr, + const struct tcphdr *th, int nbytes) +{ + struct tcp6_pseudohdr *bp; + struct scatterlist sg; + struct tcphdr *_th; + + bp = hp->scratch; + /* 1. 
TCP pseudo-header (RFC2460) */ + bp->saddr = *saddr; + bp->daddr = *daddr; + bp->protocol = cpu_to_be32(IPPROTO_TCP); + bp->len = cpu_to_be32(nbytes); + + _th = (struct tcphdr *)(bp + 1); + memcpy(_th, th, sizeof(*th)); + _th->check = 0; + + sg_init_one(&sg, bp, sizeof(*bp) + sizeof(*th)); + ahash_request_set_crypt(hp->md5_req, &sg, NULL, + sizeof(*bp) + sizeof(*th)); + return crypto_ahash_update(hp->md5_req); +} +#endif + +static int tcp_v4_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key, + __be32 daddr, __be32 saddr, + const struct tcphdr *th) +{ + struct tcp_md5sig_pool *hp; + struct ahash_request *req; + + hp = tcp_get_md5sig_pool(); + if (!hp) + goto clear_hash_noput; + req = hp->md5_req; + + if (crypto_ahash_init(req)) + goto clear_hash; + if (tcp_v4_md5_hash_headers(hp, daddr, saddr, th, th->doff << 2)) + goto clear_hash; + if (tcp_md5_hash_key(hp, key)) + goto clear_hash; + ahash_request_set_crypt(req, NULL, md5_hash, 0); + if (crypto_ahash_final(req)) + goto clear_hash; + + tcp_put_md5sig_pool(); + return 0; + +clear_hash: + tcp_put_md5sig_pool(); +clear_hash_noput: + memset(md5_hash, 0, 16); + return 1; +} + +#if IS_ENABLED(CONFIG_IPV6) +static int tcp_v6_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key, + const struct in6_addr *daddr, + struct in6_addr *saddr, const struct tcphdr *th) +{ + struct tcp_md5sig_pool *hp; + struct ahash_request *req; + + hp = tcp_get_md5sig_pool(); + if (!hp) + goto clear_hash_noput; + req = hp->md5_req; + + if (crypto_ahash_init(req)) + goto clear_hash; + if (tcp_v6_md5_hash_headers(hp, daddr, saddr, th, th->doff << 2)) + goto clear_hash; + if (tcp_md5_hash_key(hp, key)) + goto clear_hash; + ahash_request_set_crypt(req, NULL, md5_hash, 0); + if (crypto_ahash_final(req)) + goto clear_hash; + + tcp_put_md5sig_pool(); + return 0; + +clear_hash: + tcp_put_md5sig_pool(); +clear_hash_noput: + memset(md5_hash, 0, 16); + return 1; +} +#endif + +/* RFC2385 MD5 checksumming requires a mapping of + * IP 
address->MD5 Key. + * We need to maintain these in the sk structure. + */ + +/* Find the Key structure for an address. */ +static struct tcp_md5sig_key *tcp_md5_do_lookup(const struct sock *sk, + const union tcp_md5_addr *addr, + int family) +{ + const struct tcp_sock *tp = tcp_sk(sk); + struct tcp_md5sig_key *key; + const struct tcp_md5sig_info *md5sig; + __be32 mask; + struct tcp_md5sig_key *best_match = NULL; + bool match; + + /* caller either holds rcu_read_lock() or socket lock */ + md5sig = rcu_dereference_check(tp->md5sig_info, + lockdep_sock_is_held(sk)); + if (!md5sig) + return NULL; + + hlist_for_each_entry_rcu(key, &md5sig->head, node) { + if (key->family != family) + continue; + + if (family == AF_INET) { + mask = inet_make_mask(key->prefixlen); + match = (key->addr.a4.s_addr & mask) == + (addr->a4.s_addr & mask); +#if IS_ENABLED(CONFIG_IPV6) + } else if (family == AF_INET6) { + match = ipv6_prefix_equal(&key->addr.a6, &addr->a6, + key->prefixlen); +#endif + } else { + match = false; + } + + if (match && (!best_match || + key->prefixlen > best_match->prefixlen)) + best_match = key; + } + return best_match; +} + +/* Parse MD5 Signature option */ +static const u8 *tcp_parse_md5sig_option(const struct tcphdr *th) +{ + int length = (th->doff << 2) - sizeof(*th); + const u8 *ptr = (const u8 *)(th + 1); + + /* If the TCP option is too short, we can short cut */ + if (length < TCPOLEN_MD5SIG) + return NULL; + + while (length > 0) { + int opcode = *ptr++; + int opsize; + + switch (opcode) { + case TCPOPT_EOL: + return NULL; + case TCPOPT_NOP: + length--; + continue; + default: + opsize = *ptr++; + if (opsize < 2 || opsize > length) + return NULL; + if (opcode == TCPOPT_MD5SIG) + return opsize == TCPOLEN_MD5SIG ? 
ptr : NULL; + } + ptr += opsize - 2; + length -= opsize; + } + return NULL; +} + +#if IS_ENABLED(CONFIG_IPV6) +static struct tcp_md5sig_key *tcp_v6_md5_do_lookup(const struct sock *sk, + const struct in6_addr *addr) +{ + return tcp_md5_do_lookup(sk, (union tcp_md5_addr *)addr, AF_INET6); +} +#endif + +static int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp, + const struct sk_buff *skb, + unsigned int header_len) +{ + struct scatterlist sg; + const struct tcphdr *tp = tcp_hdr(skb); + struct ahash_request *req = hp->md5_req; + unsigned int i; + const unsigned int head_data_len = skb_headlen(skb) > header_len ? + skb_headlen(skb) - header_len : 0; + const struct skb_shared_info *shi = skb_shinfo(skb); + struct sk_buff *frag_iter; + + sg_init_table(&sg, 1); + + sg_set_buf(&sg, ((u8 *)tp) + header_len, head_data_len); + ahash_request_set_crypt(req, &sg, NULL, head_data_len); + if (crypto_ahash_update(req)) + return 1; + + for (i = 0; i < shi->nr_frags; ++i) { + const struct skb_frag_struct *f = &shi->frags[i]; + unsigned int offset = f->page_offset; + struct page *page = skb_frag_page(f) + (offset >> PAGE_SHIFT); + + sg_set_page(&sg, page, skb_frag_size(f), + offset_in_page(offset)); + ahash_request_set_crypt(req, &sg, NULL, skb_frag_size(f)); + if (crypto_ahash_update(req)) + return 1; + } + + skb_walk_frags(skb, frag_iter) + if (tcp_md5_hash_skb_data(hp, frag_iter, 0)) + return 1; + + return 0; +} + +int tcp_v4_md5_send_response_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk) +{ + const struct tcphdr *th = tcp_hdr(skb); + const struct iphdr *iph = ip_hdr(skb); + const __u8 *hash_location = NULL; + + rcu_read_lock(); + hash_location = tcp_parse_md5sig_option(th); + if (sk && sk_fullsock(sk)) { + opts->md5 = tcp_md5_do_lookup(sk, + (union tcp_md5_addr *)&iph->saddr, + AF_INET); + } else if (sk && sk->sk_state == TCP_TIME_WAIT) { + struct tcp_timewait_sock *tcptw = tcp_twsk(sk); + + opts->md5 = 
tcp_twsk_md5_key(tcptw);
+	} else if (sk && sk->sk_state == TCP_NEW_SYN_RECV) {
+		opts->md5 = tcp_md5_do_lookup(sk,
+					      (union tcp_md5_addr *)&iph->saddr,
+					      AF_INET);
+	} else if (hash_location) {
+		unsigned char newhash[16];
+		struct sock *sk1;
+		int genhash;
+
+		/* The active side is lost. Try to find the listening socket
+		 * via the source port, and then find the md5 key via that
+		 * socket. We are not losing security here:
+		 * the incoming packet is checked against the md5 hash of the
+		 * key we find; no RST is generated if the hash doesn't match.
+		 */
+		sk1 = __inet_lookup_listener(dev_net(skb_dst(skb)->dev),
+					     &tcp_hashinfo, NULL, 0,
+					     iph->saddr,
+					     th->source, iph->daddr,
+					     ntohs(th->source), inet_iif(skb),
+					     tcp_v4_sdif(skb));
+		/* don't send an RST if we can't find a key */
+		if (!sk1)
+			goto out_err;
+
+		opts->md5 = tcp_md5_do_lookup(sk1, (union tcp_md5_addr *)
+					      &iph->saddr, AF_INET);
+		if (!opts->md5)
+			goto out_err;
+
+		genhash = tcp_v4_md5_hash_skb(newhash, opts->md5, NULL, skb);
+		if (genhash || memcmp(hash_location, newhash, 16) != 0)
+			goto out_err;
+	}
+
+	if (opts->md5)
+		return TCPOLEN_MD5SIG_ALIGNED;
+
+	rcu_read_unlock();
+	return 0;
+
+out_err:
+	rcu_read_unlock();
+	return -1;
+}
+
+void tcp_v4_md5_send_response_write(__be32 *topt, struct sk_buff *skb,
+				    struct tcphdr *t1,
+				    struct tcp_out_options *opts,
+				    const struct sock *sk)
+{
+	if (opts->md5) {
+		*topt++ = htonl((TCPOPT_NOP << 24) |
+				(TCPOPT_NOP << 16) |
+				(TCPOPT_MD5SIG << 8) |
+				TCPOLEN_MD5SIG);
+
+		tcp_v4_md5_hash_hdr((__u8 *)topt, opts->md5,
+				    ip_hdr(skb)->saddr,
+				    ip_hdr(skb)->daddr, t1);
+		rcu_read_unlock();
+	}
+}
+
+#if IS_ENABLED(CONFIG_IPV6)
+int tcp_v6_md5_send_response_prepare(struct sk_buff *skb, u8 flags,
+				     unsigned int remaining,
+				     struct tcp_out_options *opts,
+				     const struct sock *sk)
+{
+	const struct tcphdr *th = tcp_hdr(skb);
+	struct ipv6hdr *ipv6h = ipv6_hdr(skb);
+	const __u8 *hash_location = NULL;
+
+	rcu_read_lock();
+	hash_location = tcp_parse_md5sig_option(th);
+	if (sk &&
sk_fullsock(sk)) {
+		opts->md5 = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr);
+	} else if (sk && sk->sk_state == TCP_TIME_WAIT) {
+		struct tcp_timewait_sock *tcptw = tcp_twsk(sk);
+
+		opts->md5 = tcp_twsk_md5_key(tcptw);
+	} else if (sk && sk->sk_state == TCP_NEW_SYN_RECV) {
+		opts->md5 = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr);
+	} else if (hash_location) {
+		unsigned char newhash[16];
+		struct sock *sk1;
+		int genhash;
+
+		/* The active side is lost. Try to find the listening socket
+		 * via the source port, and then find the md5 key via that
+		 * socket. We are not losing security here:
+		 * the incoming packet is checked against the md5 hash of the
+		 * key we find; no RST is generated if the hash doesn't match.
+		 */
+		sk1 = inet6_lookup_listener(dev_net(skb_dst(skb)->dev),
+					    &tcp_hashinfo, NULL, 0,
+					    &ipv6h->saddr,
+					    th->source, &ipv6h->daddr,
+					    ntohs(th->source), tcp_v6_iif(skb),
+					    tcp_v6_sdif(skb));
+		if (!sk1)
+			goto out_err;
+
+		opts->md5 = tcp_v6_md5_do_lookup(sk1, &ipv6h->saddr);
+		if (!opts->md5)
+			goto out_err;
+
+		genhash = tcp_v6_md5_hash_skb(newhash, opts->md5, NULL, skb);
+		if (genhash || memcmp(hash_location, newhash, 16) != 0)
+			goto out_err;
+	}
+
+	if (opts->md5)
+		return TCPOLEN_MD5SIG_ALIGNED;
+
+	rcu_read_unlock();
+	return 0;
+
+out_err:
+	rcu_read_unlock();
+	return -1;
+}
+EXPORT_SYMBOL_GPL(tcp_v6_md5_send_response_prepare);
+
+void tcp_v6_md5_send_response_write(__be32 *topt, struct sk_buff *skb,
+				    struct tcphdr *t1,
+				    struct tcp_out_options *opts,
+				    const struct sock *sk)
+{
+	if (opts->md5) {
+		*topt++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) |
+				(TCPOPT_MD5SIG << 8) | TCPOLEN_MD5SIG);
+		tcp_v6_md5_hash_hdr((__u8 *)topt, opts->md5,
+				    &ipv6_hdr(skb)->saddr,
+				    &ipv6_hdr(skb)->daddr, t1);
+
+		rcu_read_unlock();
+	}
+}
+EXPORT_SYMBOL_GPL(tcp_v6_md5_send_response_write);
+#endif
+
+struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk,
+					 const struct sock *addr_sk)
+{
+	const union tcp_md5_addr *addr;
+
+	addr = (const union tcp_md5_addr
*)&addr_sk->sk_daddr; + return tcp_md5_do_lookup(sk, addr, AF_INET); +} +EXPORT_SYMBOL(tcp_v4_md5_lookup); + +int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key, + const struct sock *sk, + const struct sk_buff *skb) +{ + struct tcp_md5sig_pool *hp; + struct ahash_request *req; + const struct tcphdr *th = tcp_hdr(skb); + __be32 saddr, daddr; + + if (sk) { /* valid for establish/request sockets */ + saddr = sk->sk_rcv_saddr; + daddr = sk->sk_daddr; + } else { + const struct iphdr *iph = ip_hdr(skb); + + saddr = iph->saddr; + daddr = iph->daddr; + } + + hp = tcp_get_md5sig_pool(); + if (!hp) + goto clear_hash_noput; + req = hp->md5_req; + + if (crypto_ahash_init(req)) + goto clear_hash; + + if (tcp_v4_md5_hash_headers(hp, daddr, saddr, th, skb->len)) + goto clear_hash; + if (tcp_md5_hash_skb_data(hp, skb, th->doff << 2)) + goto clear_hash; + if (tcp_md5_hash_key(hp, key)) + goto clear_hash; + ahash_request_set_crypt(req, NULL, md5_hash, 0); + if (crypto_ahash_final(req)) + goto clear_hash; + + tcp_put_md5sig_pool(); + return 0; + +clear_hash: + tcp_put_md5sig_pool(); +clear_hash_noput: + memset(md5_hash, 0, 16); + return 1; +} +EXPORT_SYMBOL(tcp_v4_md5_hash_skb); + +#if IS_ENABLED(CONFIG_IPV6) +int tcp_v6_md5_hash_skb(char *md5_hash, + const struct tcp_md5sig_key *key, + const struct sock *sk, + const struct sk_buff *skb) +{ + const struct in6_addr *saddr, *daddr; + struct tcp_md5sig_pool *hp; + struct ahash_request *req; + const struct tcphdr *th = tcp_hdr(skb); + + if (sk) { /* valid for establish/request sockets */ + saddr = &sk->sk_v6_rcv_saddr; + daddr = &sk->sk_v6_daddr; + } else { + const struct ipv6hdr *ip6h = ipv6_hdr(skb); + + saddr = &ip6h->saddr; + daddr = &ip6h->daddr; + } + + hp = tcp_get_md5sig_pool(); + if (!hp) + goto clear_hash_noput; + req = hp->md5_req; + + if (crypto_ahash_init(req)) + goto clear_hash; + + if (tcp_v6_md5_hash_headers(hp, daddr, saddr, th, skb->len)) + goto clear_hash; + if (tcp_md5_hash_skb_data(hp, skb, 
th->doff << 2))
+		goto clear_hash;
+	if (tcp_md5_hash_key(hp, key))
+		goto clear_hash;
+	ahash_request_set_crypt(req, NULL, md5_hash, 0);
+	if (crypto_ahash_final(req))
+		goto clear_hash;
+
+	tcp_put_md5sig_pool();
+	return 0;
+
+clear_hash:
+	tcp_put_md5sig_pool();
+clear_hash_noput:
+	memset(md5_hash, 0, 16);
+	return 1;
+}
+EXPORT_SYMBOL_GPL(tcp_v6_md5_hash_skb);
+#endif
+
+/* Called with rcu_read_lock() */
+bool tcp_v4_inbound_md5_hash(const struct sock *sk,
+			     const struct sk_buff *skb)
+{
+	/* This gets called for each TCP segment that arrives,
+	 * so we want to be efficient.
+	 * We have 3 drop cases:
+	 * o No MD5 hash and one expected.
+	 * o MD5 hash and we're not expecting one.
+	 * o MD5 hash and it's wrong.
+	 */
+	const __u8 *hash_location = NULL;
+	struct tcp_md5sig_key *hash_expected;
+	const struct iphdr *iph = ip_hdr(skb);
+	const struct tcphdr *th = tcp_hdr(skb);
+	int genhash;
+	unsigned char newhash[16];
+
+	hash_expected = tcp_md5_do_lookup(sk, (union tcp_md5_addr *)&iph->saddr,
+					  AF_INET);
+	hash_location = tcp_parse_md5sig_option(th);
+
+	/* We've parsed the options - do we have a hash? */
+	if (!hash_expected && !hash_location)
+		return false;
+
+	if (hash_expected && !hash_location) {
+		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND);
+		return true;
+	}
+
+	if (!hash_expected && hash_location) {
+		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5UNEXPECTED);
+		return true;
+	}
+
+	/* Okay, so both hash_expected and hash_location are set -
+	 * we need to calculate the hash.
+	 */
+	genhash = tcp_v4_md5_hash_skb(newhash,
+				      hash_expected,
+				      NULL, skb);
+
+	if (genhash || memcmp(hash_location, newhash, 16) != 0) {
+		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5FAILURE);
+		net_info_ratelimited("MD5 Hash failed for (%pI4, %d)->(%pI4, %d)%s\n",
+				     &iph->saddr, ntohs(th->source),
+				     &iph->daddr, ntohs(th->dest),
+				     genhash ?
" tcp_v4_calc_md5_hash failed" + : ""); + return true; + } + return false; +} + +#if IS_ENABLED(CONFIG_IPV6) +bool tcp_v6_inbound_md5_hash(const struct sock *sk, + const struct sk_buff *skb) +{ + const __u8 *hash_location = NULL; + struct tcp_md5sig_key *hash_expected; + const struct ipv6hdr *ip6h = ipv6_hdr(skb); + const struct tcphdr *th = tcp_hdr(skb); + int genhash; + u8 newhash[16]; + + hash_expected = tcp_v6_md5_do_lookup(sk, &ip6h->saddr); + hash_location = tcp_parse_md5sig_option(th); + + /* We've parsed the options - do we have a hash? */ + if (!hash_expected && !hash_location) + return false; + + if (hash_expected && !hash_location) { + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND); + return true; + } + + if (!hash_expected && hash_location) { + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5UNEXPECTED); + return true; + } + + /* check the signature */ + genhash = tcp_v6_md5_hash_skb(newhash, + hash_expected, + NULL, skb); + + if (genhash || memcmp(hash_location, newhash, 16) != 0) { + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5FAILURE); + net_info_ratelimited("MD5 Hash %s for [%pI6c]:%u->[%pI6c]:%u\n", + genhash ? "failed" : "mismatch", + &ip6h->saddr, ntohs(th->source), + &ip6h->daddr, ntohs(th->dest)); + return true; + } + + return false; +} +EXPORT_SYMBOL_GPL(tcp_v6_inbound_md5_hash); +#endif + +void tcp_v4_md5_destroy_sock(struct sock *sk) +{ + struct tcp_sock *tp = tcp_sk(sk); + + /* Clean up the MD5 key list, if any */ + if (tp->md5sig_info) { + tcp_clear_md5_list(sk); + kfree_rcu(tp->md5sig_info, rcu); + tp->md5sig_info = NULL; + } +} + +void tcp_v4_md5_syn_recv_sock(const struct sock *listener, struct sock *sk) +{ + struct inet_sock *inet = inet_sk(sk); + struct tcp_md5sig_key *key; + + /* Copy over the MD5 key from the original socket */ + key = tcp_md5_do_lookup(listener, (union tcp_md5_addr *)&inet->inet_daddr, + AF_INET); + if (key) { + /* We're using one, so create a matching key + * on the sk structure. 
If we fail to get + * memory, then we end up not copying the key + * across. Shucks. + */ + tcp_md5_do_add(sk, (union tcp_md5_addr *)&inet->inet_daddr, + AF_INET, 32, key->key, key->keylen, GFP_ATOMIC); + sk_nocaps_add(sk, NETIF_F_GSO_MASK); + } +} + +#if IS_ENABLED(CONFIG_IPV6) +void tcp_v6_md5_syn_recv_sock(const struct sock *listener, struct sock *sk) +{ + struct tcp_md5sig_key *key; + + /* Copy over the MD5 key from the original socket */ + key = tcp_v6_md5_do_lookup(listener, &sk->sk_v6_daddr); + if (key) { + /* We're using one, so create a matching key + * on the newsk structure. If we fail to get + * memory, then we end up not copying the key + * across. Shucks. + */ + tcp_md5_do_add(sk, (union tcp_md5_addr *)&sk->sk_v6_daddr, + AF_INET6, 128, key->key, key->keylen, + sk_gfp_mask(sk, GFP_ATOMIC)); + } +} +EXPORT_SYMBOL_GPL(tcp_v6_md5_syn_recv_sock); + +struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk, + const struct sock *addr_sk) +{ + return tcp_v6_md5_do_lookup(sk, &addr_sk->sk_v6_daddr); +} +EXPORT_SYMBOL_GPL(tcp_v6_md5_lookup); +#endif + +void tcp_md5_time_wait(struct sock *sk, struct inet_timewait_sock *tw) +{ + struct tcp_timewait_sock *tcptw = tcp_twsk((struct sock *)tw); + struct tcp_sock *tp = tcp_sk(sk); + struct tcp_md5sig_key *key; + + /* The timewait bucket does not have the key DB from the + * sock structure. We just make a quick copy of the + * md5 key being used (if indeed we are using one) + * so the timewait ack generating code has the key. 
+ */ + tcptw->tw_md5_key = NULL; + key = tp->af_specific->md5_lookup(sk, sk); + if (key) { + tcptw->tw_md5_key = kmemdup(key, sizeof(*key), GFP_ATOMIC); + BUG_ON(tcptw->tw_md5_key && !tcp_alloc_md5sig_pool()); + } +} + +static void tcp_diag_md5sig_fill(struct tcp_diag_md5sig *info, + const struct tcp_md5sig_key *key) +{ + info->tcpm_family = key->family; + info->tcpm_prefixlen = key->prefixlen; + info->tcpm_keylen = key->keylen; + memcpy(info->tcpm_key, key->key, key->keylen); + + if (key->family == AF_INET) + info->tcpm_addr[0] = key->addr.a4.s_addr; + #if IS_ENABLED(CONFIG_IPV6) + else if (key->family == AF_INET6) + memcpy(&info->tcpm_addr, &key->addr.a6, + sizeof(info->tcpm_addr)); + #endif +} + +static int tcp_diag_put_md5sig(struct sk_buff *skb, + const struct tcp_md5sig_info *md5sig) +{ + const struct tcp_md5sig_key *key; + struct tcp_diag_md5sig *info; + struct nlattr *attr; + int md5sig_count = 0; + + hlist_for_each_entry_rcu(key, &md5sig->head, node) + md5sig_count++; + if (md5sig_count == 0) + return 0; + + attr = nla_reserve(skb, INET_DIAG_MD5SIG, + md5sig_count * sizeof(struct tcp_diag_md5sig)); + if (!attr) + return -EMSGSIZE; + + info = nla_data(attr); + memset(info, 0, md5sig_count * sizeof(struct tcp_diag_md5sig)); + hlist_for_each_entry_rcu(key, &md5sig->head, node) { + tcp_diag_md5sig_fill(info++, key); + if (--md5sig_count == 0) + break; + } + + return 0; +} + +int tcp_md5_diag_get_aux(struct sock *sk, bool net_admin, struct sk_buff *skb) +{ + if (net_admin) { + struct tcp_md5sig_info *md5sig; + int err = 0; + + rcu_read_lock(); + md5sig = rcu_dereference(tcp_sk(sk)->md5sig_info); + if (md5sig) + err = tcp_diag_put_md5sig(skb, md5sig); + rcu_read_unlock(); + if (err < 0) + return err; + } + + return 0; +} +EXPORT_SYMBOL_GPL(tcp_md5_diag_get_aux); + +int tcp_md5_diag_get_aux_size(struct sock *sk, bool net_admin) +{ + int size = 0; + + if (net_admin && sk_fullsock(sk)) { + const struct tcp_md5sig_info *md5sig; + const struct tcp_md5sig_key *key; + 
size_t md5sig_count = 0; + + rcu_read_lock(); + md5sig = rcu_dereference(tcp_sk(sk)->md5sig_info); + if (md5sig) { + hlist_for_each_entry_rcu(key, &md5sig->head, node) + md5sig_count++; + } + rcu_read_unlock(); + size += nla_total_size(md5sig_count * + sizeof(struct tcp_diag_md5sig)); + } + + return size; +} +EXPORT_SYMBOL_GPL(tcp_md5_diag_get_aux_size); + +const struct tcp_sock_af_ops tcp_sock_ipv4_specific = { + .md5_lookup = tcp_v4_md5_lookup, + .calc_md5_hash = tcp_v4_md5_hash_skb, + .md5_parse = tcp_v4_parse_md5_keys, +}; + +#if IS_ENABLED(CONFIG_IPV6) +const struct tcp_sock_af_ops tcp_sock_ipv6_specific = { + .md5_lookup = tcp_v6_md5_lookup, + .calc_md5_hash = tcp_v6_md5_hash_skb, + .md5_parse = tcp_v6_parse_md5_keys, +}; +EXPORT_SYMBOL_GPL(tcp_sock_ipv6_specific); + +const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific = { + .md5_lookup = tcp_v4_md5_lookup, + .calc_md5_hash = tcp_v4_md5_hash_skb, + .md5_parse = tcp_v6_parse_md5_keys, +}; +EXPORT_SYMBOL_GPL(tcp_sock_ipv6_mapped_specific); +#endif diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index aa2ff9aadad0..f33214b29167 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -22,6 +22,7 @@ #include #include #include +#include #include #include #include @@ -295,21 +296,7 @@ void tcp_time_wait(struct sock *sk, int state, int timeo) INIT_HLIST_HEAD(&tp->tcp_option_list); } #ifdef CONFIG_TCP_MD5SIG - /* - * The timewait bucket does not have the key DB from the - * sock structure. We just make a quick copy of the - * md5 key being used (if indeed we are using one) - * so the timewait ack generating code has the key. - */ - do { - struct tcp_md5sig_key *key; - tcptw->tw_md5_key = NULL; - key = tp->af_specific->md5_lookup(sk, sk); - if (key) { - tcptw->tw_md5_key = kmemdup(key, sizeof(*key), GFP_ATOMIC); - BUG_ON(tcptw->tw_md5_key && !tcp_alloc_md5sig_pool()); - } - } while (0); + tcp_md5_time_wait(sk, tw); #endif /* Get the TIME_WAIT timeout firing. 
*/ @@ -346,13 +333,10 @@ void tcp_time_wait(struct sock *sk, int state, int timeo) void tcp_twsk_destructor(struct sock *sk) { #ifdef CONFIG_TCP_MD5SIG - struct tcp_timewait_sock *twsk = tcp_twsk(sk); - - if (twsk->tw_md5_key) - kfree_rcu(twsk->tw_md5_key, rcu); + tcp_md5_twsk_destructor(sk); #endif - if (unlikely(!hlist_empty(&twsk->tcp_option_list))) + if (unlikely(!hlist_empty(&tcp_twsk(sk)->tcp_option_list))) tcp_extopt_destroy(sk); } EXPORT_SYMBOL_GPL(tcp_twsk_destructor); @@ -538,8 +522,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk, newtp->tsoffset = treq->ts_off; #ifdef CONFIG_TCP_MD5SIG newtp->md5sig_info = NULL; /*XXX*/ - if (newtp->af_specific->md5_lookup(sk, newsk)) - newtp->tcp_header_len += TCPOLEN_MD5SIG_ALIGNED; + tcp_md5_add_header_len(sk, newsk); #endif if (unlikely(!hlist_empty(&treq->tcp_option_list))) newtp->tcp_header_len += tcp_extopt_add_header(req_to_sk(req), newsk); diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 7ea65f70e5ec..137645753abb 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -42,6 +42,7 @@ #include #include #include +#include #include @@ -3238,8 +3239,7 @@ static void tcp_connect_init(struct sock *sk) tp->tcp_header_len += TCPOLEN_TSTAMP_ALIGNED; #ifdef CONFIG_TCP_MD5SIG - if (tp->af_specific->md5_lookup(sk, sk)) - tp->tcp_header_len += TCPOLEN_MD5SIG_ALIGNED; + tcp_md5_add_header_len(sk, sk); #endif if (unlikely(!hlist_empty(&tp->tcp_option_list))) diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 202a59511950..e9b72d794140 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -43,6 +43,7 @@ #include #include #include +#include #include #include @@ -79,10 +80,6 @@ static int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb); static const struct inet_connection_sock_af_ops ipv6_mapped; static const struct inet_connection_sock_af_ops ipv6_specific; -#ifdef CONFIG_TCP_MD5SIG -static const struct tcp_sock_af_ops tcp_sock_ipv6_specific; -static const struct 
tcp_sock_af_ops tcp_sock_ipv6_mapped_specific; -#endif static void inet6_sk_rx_dst_set(struct sock *sk, const struct sk_buff *skb) { @@ -501,218 +498,6 @@ static void tcp_v6_reqsk_destructor(struct request_sock *req) kfree_skb(inet_rsk(req)->pktopts); } -#ifdef CONFIG_TCP_MD5SIG -static struct tcp_md5sig_key *tcp_v6_md5_do_lookup(const struct sock *sk, - const struct in6_addr *addr) -{ - return tcp_md5_do_lookup(sk, (union tcp_md5_addr *)addr, AF_INET6); -} - -static struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk, - const struct sock *addr_sk) -{ - return tcp_v6_md5_do_lookup(sk, &addr_sk->sk_v6_daddr); -} - -static int tcp_v6_parse_md5_keys(struct sock *sk, int optname, - char __user *optval, int optlen) -{ - struct tcp_md5sig cmd; - struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&cmd.tcpm_addr; - u8 prefixlen; - - if (optlen < sizeof(cmd)) - return -EINVAL; - - if (copy_from_user(&cmd, optval, sizeof(cmd))) - return -EFAULT; - - if (sin6->sin6_family != AF_INET6) - return -EINVAL; - - if (optname == TCP_MD5SIG_EXT && - cmd.tcpm_flags & TCP_MD5SIG_FLAG_PREFIX) { - prefixlen = cmd.tcpm_prefixlen; - if (prefixlen > 128 || (ipv6_addr_v4mapped(&sin6->sin6_addr) && - prefixlen > 32)) - return -EINVAL; - } else { - prefixlen = ipv6_addr_v4mapped(&sin6->sin6_addr) ? 
32 : 128; - } - - if (!cmd.tcpm_keylen) { - if (ipv6_addr_v4mapped(&sin6->sin6_addr)) - return tcp_md5_do_del(sk, (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3], - AF_INET, prefixlen); - return tcp_md5_do_del(sk, (union tcp_md5_addr *)&sin6->sin6_addr, - AF_INET6, prefixlen); - } - - if (cmd.tcpm_keylen > TCP_MD5SIG_MAXKEYLEN) - return -EINVAL; - - if (ipv6_addr_v4mapped(&sin6->sin6_addr)) - return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3], - AF_INET, prefixlen, cmd.tcpm_key, - cmd.tcpm_keylen, GFP_KERNEL); - - return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin6->sin6_addr, - AF_INET6, prefixlen, cmd.tcpm_key, - cmd.tcpm_keylen, GFP_KERNEL); -} - -static int tcp_v6_md5_hash_headers(struct tcp_md5sig_pool *hp, - const struct in6_addr *daddr, - const struct in6_addr *saddr, - const struct tcphdr *th, int nbytes) -{ - struct tcp6_pseudohdr *bp; - struct scatterlist sg; - struct tcphdr *_th; - - bp = hp->scratch; - /* 1. TCP pseudo-header (RFC2460) */ - bp->saddr = *saddr; - bp->daddr = *daddr; - bp->protocol = cpu_to_be32(IPPROTO_TCP); - bp->len = cpu_to_be32(nbytes); - - _th = (struct tcphdr *)(bp + 1); - memcpy(_th, th, sizeof(*th)); - _th->check = 0; - - sg_init_one(&sg, bp, sizeof(*bp) + sizeof(*th)); - ahash_request_set_crypt(hp->md5_req, &sg, NULL, - sizeof(*bp) + sizeof(*th)); - return crypto_ahash_update(hp->md5_req); -} - -static int tcp_v6_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key, - const struct in6_addr *daddr, struct in6_addr *saddr, - const struct tcphdr *th) -{ - struct tcp_md5sig_pool *hp; - struct ahash_request *req; - - hp = tcp_get_md5sig_pool(); - if (!hp) - goto clear_hash_noput; - req = hp->md5_req; - - if (crypto_ahash_init(req)) - goto clear_hash; - if (tcp_v6_md5_hash_headers(hp, daddr, saddr, th, th->doff << 2)) - goto clear_hash; - if (tcp_md5_hash_key(hp, key)) - goto clear_hash; - ahash_request_set_crypt(req, NULL, md5_hash, 0); - if (crypto_ahash_final(req)) - goto clear_hash; - - 
tcp_put_md5sig_pool(); - return 0; - -clear_hash: - tcp_put_md5sig_pool(); -clear_hash_noput: - memset(md5_hash, 0, 16); - return 1; -} - -static int tcp_v6_md5_hash_skb(char *md5_hash, - const struct tcp_md5sig_key *key, - const struct sock *sk, - const struct sk_buff *skb) -{ - const struct in6_addr *saddr, *daddr; - struct tcp_md5sig_pool *hp; - struct ahash_request *req; - const struct tcphdr *th = tcp_hdr(skb); - - if (sk) { /* valid for establish/request sockets */ - saddr = &sk->sk_v6_rcv_saddr; - daddr = &sk->sk_v6_daddr; - } else { - const struct ipv6hdr *ip6h = ipv6_hdr(skb); - saddr = &ip6h->saddr; - daddr = &ip6h->daddr; - } - - hp = tcp_get_md5sig_pool(); - if (!hp) - goto clear_hash_noput; - req = hp->md5_req; - - if (crypto_ahash_init(req)) - goto clear_hash; - - if (tcp_v6_md5_hash_headers(hp, daddr, saddr, th, skb->len)) - goto clear_hash; - if (tcp_md5_hash_skb_data(hp, skb, th->doff << 2)) - goto clear_hash; - if (tcp_md5_hash_key(hp, key)) - goto clear_hash; - ahash_request_set_crypt(req, NULL, md5_hash, 0); - if (crypto_ahash_final(req)) - goto clear_hash; - - tcp_put_md5sig_pool(); - return 0; - -clear_hash: - tcp_put_md5sig_pool(); -clear_hash_noput: - memset(md5_hash, 0, 16); - return 1; -} - -#endif - -static bool tcp_v6_inbound_md5_hash(const struct sock *sk, - const struct sk_buff *skb) -{ -#ifdef CONFIG_TCP_MD5SIG - const __u8 *hash_location = NULL; - struct tcp_md5sig_key *hash_expected; - const struct ipv6hdr *ip6h = ipv6_hdr(skb); - const struct tcphdr *th = tcp_hdr(skb); - int genhash; - u8 newhash[16]; - - hash_expected = tcp_v6_md5_do_lookup(sk, &ip6h->saddr); - hash_location = tcp_parse_md5sig_option(th); - - /* We've parsed the options - do we have a hash? 
*/ - if (!hash_expected && !hash_location) - return false; - - if (hash_expected && !hash_location) { - NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND); - return true; - } - - if (!hash_expected && hash_location) { - NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5UNEXPECTED); - return true; - } - - /* check the signature */ - genhash = tcp_v6_md5_hash_skb(newhash, - hash_expected, - NULL, skb); - - if (genhash || memcmp(hash_location, newhash, 16) != 0) { - NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5FAILURE); - net_info_ratelimited("MD5 Hash %s for [%pI6c]:%u->[%pI6c]:%u\n", - genhash ? "failed" : "mismatch", - &ip6h->saddr, ntohs(th->source), - &ip6h->daddr, ntohs(th->dest)); - return true; - } -#endif - return false; -} - static void tcp_v6_init_req(struct request_sock *req, const struct sock *sk_listener, struct sk_buff *skb) @@ -787,56 +572,24 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 __be32 *topt; struct hlist_head *extopt_list = NULL; struct tcp_out_options extraopts; -#ifdef CONFIG_TCP_MD5SIG - struct tcp_md5sig_key *key = NULL; - const __u8 *hash_location = NULL; - struct ipv6hdr *ipv6h = ipv6_hdr(skb); -#endif + + memset(&extraopts, 0, sizeof(extraopts)); if (tsecr) tot_len += TCPOLEN_TSTAMP_ALIGNED; #ifdef CONFIG_TCP_MD5SIG - rcu_read_lock(); - hash_location = tcp_parse_md5sig_option(th); - if (sk && sk_fullsock(sk)) { - key = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr); - } else if (sk && sk->sk_state == TCP_TIME_WAIT) { - struct tcp_timewait_sock *tcptw = tcp_twsk(sk); - - key = tcp_twsk_md5_key(tcptw); - } else if (sk && sk->sk_state == TCP_NEW_SYN_RECV) { - key = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr); - } else if (hash_location) { - unsigned char newhash[16]; - struct sock *sk1 = NULL; - int genhash; - - /* active side is lost. Try to find listening socket through - * source port, and then find md5 key through listening socket. 
- * we are not loose security here: - * Incoming packet is checked with md5 hash with finding key, - * no RST generated if md5 hash doesn't match. - */ - sk1 = inet6_lookup_listener(dev_net(skb_dst(skb)->dev), - &tcp_hashinfo, NULL, 0, - &ipv6h->saddr, - th->source, &ipv6h->daddr, - ntohs(th->source), tcp_v6_iif(skb), - tcp_v6_sdif(skb)); - if (!sk1) - goto out; +{ + int ret; - key = tcp_v6_md5_do_lookup(sk1, &ipv6h->saddr); - if (!key) - goto out; + ret = tcp_v6_md5_send_response_prepare(skb, 0, + MAX_TCP_OPTION_SPACE - tot_len, + &extraopts, sk); - genhash = tcp_v6_md5_hash_skb(newhash, key, NULL, skb); - if (genhash || memcmp(hash_location, newhash, 16) != 0) - goto out; - } + if (ret == -1) + goto out; - if (key) - tot_len += TCPOLEN_MD5SIG_ALIGNED; + tot_len += ret; +} #endif if (sk) @@ -850,8 +603,6 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 if (!rst || !th->ack) extraflags |= TCPHDR_ACK; - memset(&extraopts, 0, sizeof(extraopts)); - used = tcp_extopt_response_prepare(skb, extraflags, remaining, &extraopts, sk); @@ -889,13 +640,8 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 } #ifdef CONFIG_TCP_MD5SIG - if (key) { - *topt++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) | - (TCPOPT_MD5SIG << 8) | TCPOLEN_MD5SIG); - tcp_v6_md5_hash_hdr((__u8 *)topt, key, - &ipv6_hdr(skb)->saddr, - &ipv6_hdr(skb)->daddr, t1); - } + if (extraopts.md5) + tcp_v6_md5_send_response_write(topt, skb, t1, &extraopts, sk); #endif if (unlikely(extopt_list && !hlist_empty(extopt_list))) @@ -943,10 +689,6 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 out: kfree_skb(buff); - -#ifdef CONFIG_TCP_MD5SIG - rcu_read_unlock(); -#endif } static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) @@ -1072,9 +814,6 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff * struct inet_sock *newinet; struct tcp_sock *newtp; struct sock *newsk; 
-#ifdef CONFIG_TCP_MD5SIG - struct tcp_md5sig_key *key; -#endif struct flowi6 fl6; if (skb->protocol == htons(ETH_P_IP)) { @@ -1219,18 +958,7 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff * newinet->inet_rcv_saddr = LOOPBACK4_IPV6; #ifdef CONFIG_TCP_MD5SIG - /* Copy over the MD5 key from the original socket */ - key = tcp_v6_md5_do_lookup(sk, &newsk->sk_v6_daddr); - if (key) { - /* We're using one, so create a matching key - * on the newsk structure. If we fail to get - * memory, then we end up not copying the key - * across. Shucks. - */ - tcp_md5_do_add(newsk, (union tcp_md5_addr *)&newsk->sk_v6_daddr, - AF_INET6, 128, key->key, key->keylen, - sk_gfp_mask(sk, GFP_ATOMIC)); - } + tcp_v6_md5_syn_recv_sock(sk, newsk); #endif if (__inet_inherit_port(sk, newsk) < 0) { @@ -1692,14 +1420,6 @@ static const struct inet_connection_sock_af_ops ipv6_specific = { .mtu_reduced = tcp_v6_mtu_reduced, }; -#ifdef CONFIG_TCP_MD5SIG -static const struct tcp_sock_af_ops tcp_sock_ipv6_specific = { - .md5_lookup = tcp_v6_md5_lookup, - .calc_md5_hash = tcp_v6_md5_hash_skb, - .md5_parse = tcp_v6_parse_md5_keys, -}; -#endif - /* * TCP over IPv4 via INET6 API */ @@ -1722,14 +1442,6 @@ static const struct inet_connection_sock_af_ops ipv6_mapped = { .mtu_reduced = tcp_v4_mtu_reduced, }; -#ifdef CONFIG_TCP_MD5SIG -static const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific = { - .md5_lookup = tcp_v4_md5_lookup, - .calc_md5_hash = tcp_v4_md5_hash_skb, - .md5_parse = tcp_v6_parse_md5_keys, -}; -#endif - /* NOTE: A lot of things set to zero explicitly by call to * sk_alloc() so need not be done here. 
*/

From patchwork Mon Dec 18 21:51:07 2017
X-Patchwork-Submitter: Christoph Paasch
X-Patchwork-Id: 850446
X-Patchwork-Delegate: davem@davemloft.net
From: Christoph Paasch
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Alexei Starovoitov
Subject: [RFC 12/14] tcp_md5: Use tcp_extra_options in output path
Date: Mon, 18 Dec 2017 13:51:07 -0800
Message-id: <20171218215109.38700-13-cpaasch@apple.com>
In-reply-to: <20171218215109.38700-1-cpaasch@apple.com>
References: <20171218215109.38700-1-cpaasch@apple.com>
X-Mailing-List: netdev@vger.kernel.org

This patch starts making use of the extra_option framework for TCP_MD5. One tricky part is that extra_options are called at the end of tcp_syn_options(), while TCP_MD5 is called at the beginning. TCP_MD5 is called at the beginning because it wants to disable TCP-timestamps (for option-space reasons). So, in the _prepare-function of the extra options we need to undo the work that was done when enabling TCP timestamps.

Another thing to note is that in tcp_v4_send_reset (and its IPv6 counterpart), we previously looked up the listening socket (if sk == NULL) in case there was an MD5 signature in the TCP option space of the incoming packet. With the extra-option framework we can't do this anymore, because extra-options are part of the TCP socket's tcp_option_list. If there is no socket, it means we can't parse the option.
This shouldn't have an impact, because when we receive a segment and there is not established socket, we will match on the listening socket (if it's still there). Then, when we decide to respond with a RST in tcp_rcv_state_process, we will give to tcp_v4_send_reset() the listening-socket and thus will parse the TCP_MD5 option. Signed-off-by: Christoph Paasch Reviewed-by: Mat Martineau --- include/linux/tcp.h | 10 +- include/linux/tcp_md5.h | 64 ----- net/ipv4/tcp_ipv4.c | 55 ---- net/ipv4/tcp_md5.c | 696 +++++++++++++++++++++++++++++++++-------------- net/ipv4/tcp_minisocks.c | 12 - net/ipv4/tcp_output.c | 68 +---- net/ipv6/tcp_ipv6.c | 23 -- 7 files changed, 488 insertions(+), 440 deletions(-) diff --git a/include/linux/tcp.h b/include/linux/tcp.h index b0b38f7100a4..034fbd9e0a38 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -127,11 +127,11 @@ struct tcp_out_options { u16 mss; /* 0 to disable */ u8 ws; /* window scale, 0 to disable */ u8 num_sack_blocks; /* number of SACK blocks to include */ - u8 hash_size; /* bytes in hash_location */ - __u8 *hash_location; /* temporary pointer, overloaded */ __u32 tsval, tsecr; /* need to include OPTION_TS */ struct tcp_fastopen_cookie *fastopen_cookie; /* Fast open cookie */ +#ifdef CONFIG_TCP_MD5SIG struct tcp_md5sig_key *md5; /* TCP_MD5 signature key */ +#endif }; /* This is the max number of SACKS that we'll generate and process. 
It's safe @@ -380,9 +380,6 @@ struct tcp_sock { #ifdef CONFIG_TCP_MD5SIG /* TCP AF-Specific parts; only used by MD5 Signature support so far */ const struct tcp_sock_af_ops *af_specific; - -/* TCP MD5 Signature Option information */ - struct tcp_md5sig_info __rcu *md5sig_info; #endif /* TCP fastopen related information */ @@ -440,9 +437,6 @@ struct tcp_timewait_sock { long tw_ts_recent_stamp; struct hlist_head tcp_option_list; -#ifdef CONFIG_TCP_MD5SIG - struct tcp_md5sig_key *tw_md5_key; -#endif }; static inline struct tcp_timewait_sock *tcp_twsk(const struct sock *sk) diff --git a/include/linux/tcp_md5.h b/include/linux/tcp_md5.h index f6a681cdded4..8dee4fc3dc7f 100644 --- a/include/linux/tcp_md5.h +++ b/include/linux/tcp_md5.h @@ -26,25 +26,6 @@ struct tcp_md5sig_key { struct rcu_head rcu; }; -/* - sock block */ -struct tcp_md5sig_info { - struct hlist_head head; - struct rcu_head rcu; -}; - -union tcp_md5sum_block { - struct tcp4_pseudohdr ip4; -#if IS_ENABLED(CONFIG_IPV6) - struct tcp6_pseudohdr ip6; -#endif -}; - -/* - pool: digest algorithm, hash description and scratch buffer */ -struct tcp_md5sig_pool { - struct ahash_request *md5_req; - void *scratch; -}; - extern const struct tcp_sock_af_ops tcp_sock_ipv4_specific; extern const struct tcp_sock_af_ops tcp_sock_ipv6_specific; extern const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific; @@ -56,37 +37,9 @@ int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key, struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk, const struct sock *addr_sk); -void tcp_v4_md5_destroy_sock(struct sock *sk); - -int tcp_v4_md5_send_response_prepare(struct sk_buff *skb, u8 flags, - unsigned int remaining, - struct tcp_out_options *opts, - const struct sock *sk); - -void tcp_v4_md5_send_response_write(__be32 *topt, struct sk_buff *skb, - struct tcphdr *t1, - struct tcp_out_options *opts, - const struct sock *sk); - -int tcp_v6_md5_send_response_prepare(struct sk_buff *skb, u8 flags, - unsigned 
int remaining, - struct tcp_out_options *opts, - const struct sock *sk); - -void tcp_v6_md5_send_response_write(__be32 *topt, struct sk_buff *skb, - struct tcphdr *t1, - struct tcp_out_options *opts, - const struct sock *sk); - bool tcp_v4_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb); -void tcp_v4_md5_syn_recv_sock(const struct sock *listener, struct sock *sk); - -void tcp_v6_md5_syn_recv_sock(const struct sock *listener, struct sock *sk); - -void tcp_md5_time_wait(struct sock *sk, struct inet_timewait_sock *tw); - struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk, const struct sock *addr_sk); @@ -98,23 +51,6 @@ int tcp_v6_md5_hash_skb(char *md5_hash, bool tcp_v6_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb); -static inline void tcp_md5_twsk_destructor(struct sock *sk) -{ - struct tcp_timewait_sock *twsk = tcp_twsk(sk); - - if (twsk->tw_md5_key) - kfree_rcu(twsk->tw_md5_key, rcu); -} - -static inline void tcp_md5_add_header_len(const struct sock *listener, - struct sock *sk) -{ - struct tcp_sock *tp = tcp_sk(sk); - - if (tp->af_specific->md5_lookup(listener, sk)) - tp->tcp_header_len += TCPOLEN_MD5SIG_ALIGNED; -} - int tcp_md5_diag_get_aux(struct sock *sk, bool net_admin, struct sk_buff *skb); int tcp_md5_diag_get_aux_size(struct sock *sk, bool net_admin); diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 143e1f66a24a..356bf41ec73a 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -603,9 +603,6 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) struct ip_reply_arg arg; struct net *net; int offset = 0; -#ifdef CONFIG_TCP_MD5SIG - int ret; -#endif /* Never send a reset in response to a reset. */ if (th->rst) @@ -643,26 +640,11 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) net = sk ? 
sock_net(sk) : dev_net(skb_dst(skb)->dev); -#ifdef CONFIG_TCP_MD5SIG - ret = tcp_v4_md5_send_response_prepare(skb, 0, - MAX_TCP_OPTION_SPACE - arg.iov[0].iov_len, - &opts, sk); - - if (ret == -1) - return; - - arg.iov[0].iov_len += ret; -#endif - if (unlikely(extopt_list && !hlist_empty(extopt_list))) { unsigned int remaining; int used; remaining = sizeof(rep.opt); -#ifdef CONFIG_TCP_MD5SIG - if (opts.md5) - remaining -= TCPOLEN_MD5SIG_ALIGNED; -#endif used = tcp_extopt_response_prepare(skb, TCPHDR_RST, remaining, &opts, sk); @@ -674,9 +656,6 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) offset += used / 4; } -#ifdef CONFIG_TCP_MD5SIG - tcp_v4_md5_send_response_write(&rep.opt[offset], skb, &rep.th, &opts, sk); -#endif arg.csum = csum_tcpudp_nofold(ip_hdr(skb)->daddr, ip_hdr(skb)->saddr, /* XXX */ arg.iov[0].iov_len, IPPROTO_TCP, 0); @@ -727,9 +706,6 @@ static void tcp_v4_send_ack(const struct sock *sk, struct net *net = sock_net(sk); struct ip_reply_arg arg; int offset = 0; -#ifdef CONFIG_TCP_MD5SIG - int ret; -#endif extopt_list = tcp_extopt_get_list(sk); @@ -758,28 +734,12 @@ static void tcp_v4_send_ack(const struct sock *sk, rep.th.ack = 1; rep.th.window = htons(win); -#ifdef CONFIG_TCP_MD5SIG - ret = tcp_v4_md5_send_response_prepare(skb, 0, - MAX_TCP_OPTION_SPACE - arg.iov[0].iov_len, - &opts, sk); - - if (ret == -1) - return; - - arg.iov[0].iov_len += ret; -#endif - if (unlikely(extopt_list && !hlist_empty(extopt_list))) { unsigned int remaining; int used; remaining = sizeof(rep.th) + sizeof(rep.opt) - arg.iov[0].iov_len; -#ifdef CONFIG_TCP_MD5SIG - if (opts.md5) - remaining -= TCPOLEN_MD5SIG_ALIGNED; -#endif - memset(&opts, 0, sizeof(opts)); used = tcp_extopt_response_prepare(skb, TCPHDR_ACK, remaining, &opts, sk); @@ -792,14 +752,6 @@ static void tcp_v4_send_ack(const struct sock *sk, offset += used / 4; } -#ifdef CONFIG_TCP_MD5SIG - if (opts.md5) { - arg.iov[0].iov_len += TCPOLEN_MD5SIG_ALIGNED; - rep.th.doff = 
arg.iov[0].iov_len / 4; - } - tcp_v4_md5_send_response_write(&rep.opt[offset], skb, &rep.th, &opts, sk); -#endif - arg.flags = reply_flags; arg.csum = csum_tcpudp_nofold(ip_hdr(skb)->daddr, ip_hdr(skb)->saddr, /* XXX */ @@ -1026,10 +978,6 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb, tcp_initialize_rcv_mss(newsk); -#ifdef CONFIG_TCP_MD5SIG - tcp_v4_md5_syn_recv_sock(sk, newsk); -#endif - if (__inet_inherit_port(sk, newsk) < 0) goto put_and_exit; *own_req = inet_ehash_nolisten(newsk, req_to_sk(req_unhash)); @@ -1532,9 +1480,6 @@ void tcp_v4_destroy_sock(struct sock *sk) if (unlikely(!hlist_empty(&tp->tcp_option_list))) tcp_extopt_destroy(sk); -#ifdef CONFIG_TCP_MD5SIG - tcp_v4_md5_destroy_sock(sk); -#endif /* Clean up a referenced TCP bind bucket. */ if (inet_csk(sk)->icsk_bind_hash) diff --git a/net/ipv4/tcp_md5.c b/net/ipv4/tcp_md5.c index a31b404e6dbf..64e5b4420ce9 100644 --- a/net/ipv4/tcp_md5.c +++ b/net/ipv4/tcp_md5.c @@ -7,11 +7,119 @@ #include +struct tcp_md5sig_info { + struct hlist_head head; + struct rcu_head rcu; +}; + +union tcp_md5sum_block { + struct tcp4_pseudohdr ip4; +#if IS_ENABLED(CONFIG_IPV6) + struct tcp6_pseudohdr ip6; +#endif +}; + +/* - pool: digest algorithm, hash description and scratch buffer */ +struct tcp_md5sig_pool { + struct ahash_request *md5_req; + void *scratch; +}; + static DEFINE_PER_CPU(struct tcp_md5sig_pool, tcp_md5sig_pool); static DEFINE_MUTEX(tcp_md5sig_mutex); static bool tcp_md5sig_pool_populated; -#define tcp_twsk_md5_key(twsk) ((twsk)->tw_md5_key) +static unsigned int tcp_md5_extopt_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store); + +static __be32 *tcp_md5_extopt_write(__be32 *ptr, struct sk_buff *skb, + struct tcp_out_options *opts, + struct sock *sk, + struct tcp_extopt_store *store); + +static int tcp_md5_send_response_prepare(struct sk_buff *orig, u8 flags, + unsigned int 
remaining, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store); + +static __be32 *tcp_md5_send_response_write(__be32 *ptr, struct sk_buff *orig, + struct tcphdr *th, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store); + +static int tcp_md5_extopt_add_header_len(const struct sock *orig, + const struct sock *sk, + struct tcp_extopt_store *store); + +static struct tcp_extopt_store *tcp_md5_extopt_copy(struct sock *listener, + struct request_sock *req, + struct tcp_options_received *opt, + struct tcp_extopt_store *store); + +static struct tcp_extopt_store *tcp_md5_extopt_move(struct sock *from, + struct sock *to, + struct tcp_extopt_store *store); + +static void tcp_md5_extopt_destroy(struct tcp_extopt_store *store); + +struct tcp_md5_extopt { + struct tcp_extopt_store store; + struct tcp_md5sig_info __rcu *md5sig_info; + struct sock *sk; + struct rcu_head rcu; +}; + +static const struct tcp_extopt_ops tcp_md5_extra_ops = { + .option_kind = TCPOPT_MD5SIG, + .prepare = tcp_md5_extopt_prepare, + .write = tcp_md5_extopt_write, + .response_prepare = tcp_md5_send_response_prepare, + .response_write = tcp_md5_send_response_write, + .add_header_len = tcp_md5_extopt_add_header_len, + .copy = tcp_md5_extopt_copy, + .move = tcp_md5_extopt_move, + .destroy = tcp_md5_extopt_destroy, + .owner = THIS_MODULE, +}; + +static struct tcp_md5_extopt *tcp_extopt_to_md5(struct tcp_extopt_store *store) +{ + return container_of(store, struct tcp_md5_extopt, store); +} + +static struct tcp_md5_extopt *tcp_md5_opt_find(const struct sock *sk) +{ + struct tcp_extopt_store *ext_opt; + + ext_opt = tcp_extopt_find_kind(TCPOPT_MD5SIG, sk); + + return tcp_extopt_to_md5(ext_opt); +} + +static int tcp_md5_register(struct sock *sk, + struct tcp_md5_extopt *md5_opt) +{ + return tcp_register_extopt(&md5_opt->store, sk); +} + +static struct tcp_md5_extopt *tcp_md5_alloc_store(struct sock *sk) +{ + struct tcp_md5_extopt *md5_opt; 
+ + md5_opt = kzalloc(sizeof(*md5_opt), GFP_ATOMIC); + if (!md5_opt) + return NULL; + + md5_opt->store.ops = &tcp_md5_extra_ops; + md5_opt->sk = sk; + + return md5_opt; +} static void __tcp_alloc_md5sig_pool(void) { @@ -91,18 +199,18 @@ static struct tcp_md5sig_pool *tcp_get_md5sig_pool(void) return NULL; } -static struct tcp_md5sig_key *tcp_md5_do_lookup_exact(const struct sock *sk, +static struct tcp_md5sig_key *tcp_md5_do_lookup_exact(const struct tcp_md5_extopt *md5_opt, const union tcp_md5_addr *addr, int family, u8 prefixlen) { - const struct tcp_sock *tp = tcp_sk(sk); struct tcp_md5sig_key *key; unsigned int size = sizeof(struct in_addr); const struct tcp_md5sig_info *md5sig; + const struct sock *sk = md5_opt->sk; /* caller either holds rcu_read_lock() or socket lock */ - md5sig = rcu_dereference_check(tp->md5sig_info, - lockdep_sock_is_held(sk)); + md5sig = rcu_dereference_check(md5_opt->md5sig_info, + sk_fullsock(sk) && lockdep_sock_is_held(sk)); if (!md5sig) return NULL; #if IS_ENABLED(CONFIG_IPV6) @@ -125,11 +233,26 @@ static int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr, u8 newkeylen, gfp_t gfp) { /* Add Key to the list */ - struct tcp_md5sig_key *key; - struct tcp_sock *tp = tcp_sk(sk); struct tcp_md5sig_info *md5sig; + struct tcp_md5_extopt *md5_opt; + struct tcp_md5sig_key *key; - key = tcp_md5_do_lookup_exact(sk, addr, family, prefixlen); + md5_opt = tcp_md5_opt_find(sk); + if (!md5_opt) { + int ret; + + md5_opt = tcp_md5_alloc_store(sk); + if (!md5_opt) + return -ENOMEM; + + ret = tcp_md5_register(sk, md5_opt); + if (ret) { + kfree(md5_opt); + return ret; + } + } + + key = tcp_md5_do_lookup_exact(md5_opt, addr, family, prefixlen); if (key) { /* Pre-existing entry - just update that one. 
*/ memcpy(key->key, newkey, newkeylen); @@ -137,8 +260,8 @@ static int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr, return 0; } - md5sig = rcu_dereference_protected(tp->md5sig_info, - lockdep_sock_is_held(sk)); + md5sig = rcu_dereference_protected(md5_opt->md5sig_info, + sk_fullsock(sk) && lockdep_sock_is_held(sk)); if (!md5sig) { md5sig = kmalloc(sizeof(*md5sig), gfp); if (!md5sig) @@ -146,7 +269,7 @@ static int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr, sk_nocaps_add(sk, NETIF_F_GSO_MASK); INIT_HLIST_HEAD(&md5sig->head); - rcu_assign_pointer(tp->md5sig_info, md5sig); + rcu_assign_pointer(md5_opt->md5sig_info, md5sig); } key = sock_kmalloc(sk, sizeof(*key), gfp); @@ -168,18 +291,18 @@ static int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr, return 0; } -static void tcp_clear_md5_list(struct sock *sk) +static void tcp_clear_md5_list(struct tcp_md5_extopt *md5_opt) { - struct tcp_sock *tp = tcp_sk(sk); + struct tcp_md5sig_info *md5sig; struct tcp_md5sig_key *key; struct hlist_node *n; - struct tcp_md5sig_info *md5sig; - md5sig = rcu_dereference_protected(tp->md5sig_info, 1); + md5sig = rcu_dereference_protected(md5_opt->md5sig_info, 1); hlist_for_each_entry_safe(key, n, &md5sig->head, node) { hlist_del_rcu(&key->node); - atomic_sub(sizeof(*key), &sk->sk_omem_alloc); + if (md5_opt->sk && sk_fullsock(md5_opt->sk)) + atomic_sub(sizeof(*key), &md5_opt->sk->sk_omem_alloc); kfree_rcu(key, rcu); } } @@ -187,9 +310,14 @@ static void tcp_clear_md5_list(struct sock *sk) static int tcp_md5_do_del(struct sock *sk, const union tcp_md5_addr *addr, int family, u8 prefixlen) { + struct tcp_md5_extopt *md5_opt; struct tcp_md5sig_key *key; - key = tcp_md5_do_lookup_exact(sk, addr, family, prefixlen); + md5_opt = tcp_md5_opt_find(sk); + if (!md5_opt) + return -ENOENT; + + key = tcp_md5_do_lookup_exact(md5_opt, addr, family, prefixlen); if (!key) return -ENOENT; hlist_del_rcu(&key->node); @@ -421,16 +549,20 @@ static struct 
tcp_md5sig_key *tcp_md5_do_lookup(const struct sock *sk, const union tcp_md5_addr *addr, int family) { - const struct tcp_sock *tp = tcp_sk(sk); - struct tcp_md5sig_key *key; + struct tcp_md5sig_key *best_match = NULL; const struct tcp_md5sig_info *md5sig; + struct tcp_md5_extopt *md5_opt; + struct tcp_md5sig_key *key; __be32 mask; - struct tcp_md5sig_key *best_match = NULL; bool match; + md5_opt = tcp_md5_opt_find(sk); + if (!md5_opt) + return NULL; + /* caller either holds rcu_read_lock() or socket lock */ - md5sig = rcu_dereference_check(tp->md5sig_info, - lockdep_sock_is_held(sk)); + md5sig = rcu_dereference_check(md5_opt->md5sig_info, + sk_fullsock(sk) && lockdep_sock_is_held(sk)); if (!md5sig) return NULL; @@ -538,75 +670,30 @@ static int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp, return 0; } -int tcp_v4_md5_send_response_prepare(struct sk_buff *skb, u8 flags, - unsigned int remaining, - struct tcp_out_options *opts, - const struct sock *sk) +static int tcp_v4_md5_send_response_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk) { - const struct tcphdr *th = tcp_hdr(skb); const struct iphdr *iph = ip_hdr(skb); - const __u8 *hash_location = NULL; rcu_read_lock(); - hash_location = tcp_parse_md5sig_option(th); - if (sk && sk_fullsock(sk)) { - opts->md5 = tcp_md5_do_lookup(sk, - (union tcp_md5_addr *)&iph->saddr, - AF_INET); - } else if (sk && sk->sk_state == TCP_TIME_WAIT) { - struct tcp_timewait_sock *tcptw = tcp_twsk(sk); - - opts->md5 = tcp_twsk_md5_key(tcptw); - } else if (sk && sk->sk_state == TCP_NEW_SYN_RECV) { - opts->md5 = tcp_md5_do_lookup(sk, - (union tcp_md5_addr *)&iph->saddr, - AF_INET); - } else if (hash_location) { - unsigned char newhash[16]; - struct sock *sk1; - int genhash; - - /* active side is lost. Try to find listening socket through - * source port, and then find md5 key through listening socket. 
- * we are not loose security here: - * Incoming packet is checked with md5 hash with finding key, - * no RST generated if md5 hash doesn't match. - */ - sk1 = __inet_lookup_listener(dev_net(skb_dst(skb)->dev), - &tcp_hashinfo, NULL, 0, - iph->saddr, - th->source, iph->daddr, - ntohs(th->source), inet_iif(skb), - tcp_v4_sdif(skb)); - /* don't send rst if it can't find key */ - if (!sk1) - goto out_err; - - opts->md5 = tcp_md5_do_lookup(sk1, (union tcp_md5_addr *) - &iph->saddr, AF_INET); - if (!opts->md5) - goto out_err; - - genhash = tcp_v4_md5_hash_skb(newhash, opts->md5, NULL, skb); - if (genhash || memcmp(hash_location, newhash, 16) != 0) - goto out_err; - } + opts->md5 = tcp_md5_do_lookup(sk, + (union tcp_md5_addr *)&iph->saddr, + AF_INET); if (opts->md5) + /* rcu_read_unlock() is in _response_write */ return TCPOLEN_MD5SIG_ALIGNED; rcu_read_unlock(); return 0; - -out_err: - rcu_read_unlock(); - return -1; } -void tcp_v4_md5_send_response_write(__be32 *topt, struct sk_buff *skb, - struct tcphdr *t1, - struct tcp_out_options *opts, - const struct sock *sk) +static __be32 *tcp_v4_md5_send_response_write(__be32 *topt, struct sk_buff *skb, + struct tcphdr *t1, + struct tcp_out_options *opts, + const struct sock *sk) { if (opts->md5) { *topt++ = htonl((TCPOPT_NOP << 24) | @@ -617,75 +704,39 @@ void tcp_v4_md5_send_response_write(__be32 *topt, struct sk_buff *skb, tcp_v4_md5_hash_hdr((__u8 *)topt, opts->md5, ip_hdr(skb)->saddr, ip_hdr(skb)->daddr, t1); + + topt += 4; + + /* Unlocking from _response_prepare */ rcu_read_unlock(); } + + return topt; } #if IS_ENABLED(CONFIG_IPV6) -int tcp_v6_md5_send_response_prepare(struct sk_buff *skb, u8 flags, - unsigned int remaining, - struct tcp_out_options *opts, - const struct sock *sk) +static int tcp_v6_md5_send_response_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk) { - const struct tcphdr *th = tcp_hdr(skb); struct ipv6hdr *ipv6h = ipv6_hdr(skb); - 
const __u8 *hash_location = NULL; rcu_read_lock(); - hash_location = tcp_parse_md5sig_option(th); - if (sk && sk_fullsock(sk)) { - opts->md5 = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr); - } else if (sk && sk->sk_state == TCP_TIME_WAIT) { - struct tcp_timewait_sock *tcptw = tcp_twsk(sk); - - opts->md5 = tcp_twsk_md5_key(tcptw); - } else if (sk && sk->sk_state == TCP_NEW_SYN_RECV) { - opts->md5 = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr); - } else if (hash_location) { - unsigned char newhash[16]; - struct sock *sk1; - int genhash; - - /* active side is lost. Try to find listening socket through - * source port, and then find md5 key through listening socket. - * we are not loose security here: - * Incoming packet is checked with md5 hash with finding key, - * no RST generated if md5 hash doesn't match. - */ - sk1 = inet6_lookup_listener(dev_net(skb_dst(skb)->dev), - &tcp_hashinfo, NULL, 0, - &ipv6h->saddr, - th->source, &ipv6h->daddr, - ntohs(th->source), tcp_v6_iif(skb), - tcp_v6_sdif(skb)); - if (!sk1) - goto out_err; - - opts->md5 = tcp_v6_md5_do_lookup(sk1, &ipv6h->saddr); - if (!opts->md5) - goto out_err; - - genhash = tcp_v6_md5_hash_skb(newhash, opts->md5, NULL, skb); - if (genhash || memcmp(hash_location, newhash, 16) != 0) - goto out_err; - } + opts->md5 = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr); if (opts->md5) + /* rcu_read_unlock() is in _response_write */ return TCPOLEN_MD5SIG_ALIGNED; rcu_read_unlock(); return 0; - -out_err: - rcu_read_unlock(); - return -1; } -EXPORT_SYMBOL_GPL(tcp_v6_md5_send_response_prepare); -void tcp_v6_md5_send_response_write(__be32 *topt, struct sk_buff *skb, - struct tcphdr *t1, - struct tcp_out_options *opts, - const struct sock *sk) +static __be32 *tcp_v6_md5_send_response_write(__be32 *topt, struct sk_buff *skb, + struct tcphdr *t1, + struct tcp_out_options *opts, + const struct sock *sk) { if (opts->md5) { *topt++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) | @@ -694,12 +745,45 @@ void tcp_v6_md5_send_response_write(__be32 
*topt, struct sk_buff *skb, &ipv6_hdr(skb)->saddr, &ipv6_hdr(skb)->daddr, t1); + topt += 4; + + /* Unlocking from _response_prepare */ rcu_read_unlock(); } + + return topt; } -EXPORT_SYMBOL_GPL(tcp_v6_md5_send_response_write); #endif +static int tcp_md5_send_response_prepare(struct sk_buff *orig, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store) +{ +#if IS_ENABLED(CONFIG_IPV6) + if (orig->protocol != htons(ETH_P_IP)) + return tcp_v6_md5_send_response_prepare(orig, flags, remaining, + opts, sk); + else +#endif + return tcp_v4_md5_send_response_prepare(orig, flags, remaining, + opts, sk); +} + +static __be32 *tcp_md5_send_response_write(__be32 *ptr, struct sk_buff *orig, + struct tcphdr *th, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store) +{ +#if IS_ENABLED(CONFIG_IPV6) + if (orig->protocol != htons(ETH_P_IP)) + return tcp_v6_md5_send_response_write(ptr, orig, th, opts, sk); +#endif + return tcp_v4_md5_send_response_write(ptr, orig, th, opts, sk); +} + struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk, const struct sock *addr_sk) { @@ -909,59 +993,6 @@ bool tcp_v6_inbound_md5_hash(const struct sock *sk, return false; } EXPORT_SYMBOL_GPL(tcp_v6_inbound_md5_hash); -#endif - -void tcp_v4_md5_destroy_sock(struct sock *sk) -{ - struct tcp_sock *tp = tcp_sk(sk); - - /* Clean up the MD5 key list, if any */ - if (tp->md5sig_info) { - tcp_clear_md5_list(sk); - kfree_rcu(tp->md5sig_info, rcu); - tp->md5sig_info = NULL; - } -} - -void tcp_v4_md5_syn_recv_sock(const struct sock *listener, struct sock *sk) -{ - struct inet_sock *inet = inet_sk(sk); - struct tcp_md5sig_key *key; - - /* Copy over the MD5 key from the original socket */ - key = tcp_md5_do_lookup(listener, (union tcp_md5_addr *)&inet->inet_daddr, - AF_INET); - if (key) { - /* We're using one, so create a matching key - * on the sk structure. 
If we fail to get - * memory, then we end up not copying the key - * across. Shucks. - */ - tcp_md5_do_add(sk, (union tcp_md5_addr *)&inet->inet_daddr, - AF_INET, 32, key->key, key->keylen, GFP_ATOMIC); - sk_nocaps_add(sk, NETIF_F_GSO_MASK); - } -} - -#if IS_ENABLED(CONFIG_IPV6) -void tcp_v6_md5_syn_recv_sock(const struct sock *listener, struct sock *sk) -{ - struct tcp_md5sig_key *key; - - /* Copy over the MD5 key from the original socket */ - key = tcp_v6_md5_do_lookup(listener, &sk->sk_v6_daddr); - if (key) { - /* We're using one, so create a matching key - * on the newsk structure. If we fail to get - * memory, then we end up not copying the key - * across. Shucks. - */ - tcp_md5_do_add(sk, (union tcp_md5_addr *)&sk->sk_v6_daddr, - AF_INET6, 128, key->key, key->keylen, - sk_gfp_mask(sk, GFP_ATOMIC)); - } -} -EXPORT_SYMBOL_GPL(tcp_v6_md5_syn_recv_sock); struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk, const struct sock *addr_sk) @@ -971,25 +1002,6 @@ struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk, EXPORT_SYMBOL_GPL(tcp_v6_md5_lookup); #endif -void tcp_md5_time_wait(struct sock *sk, struct inet_timewait_sock *tw) -{ - struct tcp_timewait_sock *tcptw = tcp_twsk((struct sock *)tw); - struct tcp_sock *tp = tcp_sk(sk); - struct tcp_md5sig_key *key; - - /* The timewait bucket does not have the key DB from the - * sock structure. We just make a quick copy of the - * md5 key being used (if indeed we are using one) - * so the timewait ack generating code has the key. 
- */ - tcptw->tw_md5_key = NULL; - key = tp->af_specific->md5_lookup(sk, sk); - if (key) { - tcptw->tw_md5_key = kmemdup(key, sizeof(*key), GFP_ATOMIC); - BUG_ON(tcptw->tw_md5_key && !tcp_alloc_md5sig_pool()); - } -} - static void tcp_diag_md5sig_fill(struct tcp_diag_md5sig *info, const struct tcp_md5sig_key *key) { @@ -1039,13 +1051,17 @@ static int tcp_diag_put_md5sig(struct sk_buff *skb, int tcp_md5_diag_get_aux(struct sock *sk, bool net_admin, struct sk_buff *skb) { if (net_admin) { + struct tcp_md5_extopt *md5_opt; struct tcp_md5sig_info *md5sig; int err = 0; rcu_read_lock(); - md5sig = rcu_dereference(tcp_sk(sk)->md5sig_info); - if (md5sig) - err = tcp_diag_put_md5sig(skb, md5sig); + md5_opt = tcp_md5_opt_find(sk); + if (md5_opt) { + md5sig = rcu_dereference(md5_opt->md5sig_info); + if (md5sig) + err = tcp_diag_put_md5sig(skb, md5sig); + } rcu_read_unlock(); if (err < 0) return err; @@ -1060,15 +1076,19 @@ int tcp_md5_diag_get_aux_size(struct sock *sk, bool net_admin) int size = 0; if (net_admin && sk_fullsock(sk)) { + struct tcp_md5_extopt *md5_opt; const struct tcp_md5sig_info *md5sig; const struct tcp_md5sig_key *key; size_t md5sig_count = 0; rcu_read_lock(); - md5sig = rcu_dereference(tcp_sk(sk)->md5sig_info); - if (md5sig) { - hlist_for_each_entry_rcu(key, &md5sig->head, node) - md5sig_count++; + md5_opt = tcp_md5_opt_find(sk); + if (md5_opt) { + md5sig = rcu_dereference(md5_opt->md5sig_info); + if (md5sig) { + hlist_for_each_entry_rcu(key, &md5sig->head, node) + md5sig_count++; + } } rcu_read_unlock(); size += nla_total_size(md5sig_count * @@ -1079,6 +1099,260 @@ int tcp_md5_diag_get_aux_size(struct sock *sk, bool net_admin) } EXPORT_SYMBOL_GPL(tcp_md5_diag_get_aux_size); +static int tcp_md5_extopt_add_header_len(const struct sock *orig, + const struct sock *sk, + struct tcp_extopt_store *store) +{ + struct tcp_sock *tp = tcp_sk(sk); + + if (tp->af_specific->md5_lookup(orig, sk)) + return TCPOLEN_MD5SIG_ALIGNED; + + return 0; +} + +static unsigned int 
tcp_md5_extopt_prepare(struct sk_buff *skb, u8 flags, + unsigned int remaining, + struct tcp_out_options *opts, + const struct sock *sk, + struct tcp_extopt_store *store) +{ + int ret = 0; + + if (sk_fullsock(sk)) { + struct tcp_sock *tp = tcp_sk(sk); + + opts->md5 = tp->af_specific->md5_lookup(sk, sk); + } else { + struct request_sock *req = inet_reqsk(sk); + struct sock *listener = req->rsk_listener; + + /* Coming from tcp_make_synack, unlock is in + * tcp_md5_extopt_write + */ + rcu_read_lock(); + + opts->md5 = tcp_rsk(req)->af_specific->req_md5_lookup(listener, sk); + + if (!opts->md5) + rcu_read_unlock(); + } + + if (unlikely(opts->md5)) { + ret = TCPOLEN_MD5SIG_ALIGNED; + opts->options |= OPTION_MD5; + + /* Don't use TCP timestamps with TCP_MD5 */ + if ((opts->options & OPTION_TS)) { + ret -= TCPOLEN_TSTAMP_ALIGNED; + + /* When TS are enabled, Linux puts the SACK_OK + * next to the timestamp option, thus not accounting + * for its space. Here, we disable timestamps, thus + * we need to account for the space. 
+ */ + if (opts->options & OPTION_SACK_ADVERTISE) + ret += TCPOLEN_SACKPERM_ALIGNED; + } + + opts->options &= ~OPTION_TS; + opts->tsval = 0; + opts->tsecr = 0; + + if (!sk_fullsock(sk)) { + struct request_sock *req = inet_reqsk(sk); + + inet_rsk(req)->tstamp_ok = 0; + } + } + + return ret; +} + +static __be32 *tcp_md5_extopt_write(__be32 *ptr, struct sk_buff *skb, + struct tcp_out_options *opts, + struct sock *sk, + struct tcp_extopt_store *store) +{ + if (unlikely(OPTION_MD5 & opts->options)) { +#if IS_ENABLED(CONFIG_IPV6) + const struct in6_addr *addr6; + + if (sk_fullsock(sk)) { + addr6 = &sk->sk_v6_daddr; + } else { + BUG_ON(sk->sk_state != TCP_NEW_SYN_RECV); + addr6 = &inet_rsk(inet_reqsk(sk))->ir_v6_rmt_addr; + } +#endif + + *ptr++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) | + (TCPOPT_MD5SIG << 8) | TCPOLEN_MD5SIG); + + if (sk_fullsock(sk)) + sk_nocaps_add(sk, NETIF_F_GSO_MASK); + + /* Calculate the MD5 hash, as we have all we need now */ +#if IS_ENABLED(CONFIG_IPV6) + if (sk->sk_family == AF_INET6 && !ipv6_addr_v4mapped(addr6)) + tcp_v6_md5_hash_skb((__u8 *)ptr, opts->md5, sk, skb); + else +#endif + tcp_v4_md5_hash_skb((__u8 *)ptr, opts->md5, sk, skb); + + ptr += 4; + + /* Coming from tcp_make_synack */ + if (!sk_fullsock(sk)) + rcu_read_unlock(); + } + + return ptr; +} + +static struct tcp_md5_extopt *__tcp_md5_extopt_copy(struct request_sock *req, + const struct tcp_md5sig_key *key, + const union tcp_md5_addr *addr, + int family) +{ + struct tcp_md5_extopt *md5_opt = NULL; + struct tcp_md5sig_info *md5sig; + struct tcp_md5sig_key *newkey; + + md5_opt = tcp_md5_alloc_store(req_to_sk(req)); + if (!md5_opt) + goto err; + + md5sig = kmalloc(sizeof(*md5sig), GFP_ATOMIC); + if (!md5sig) + goto err_md5sig; + + INIT_HLIST_HEAD(&md5sig->head); + rcu_assign_pointer(md5_opt->md5sig_info, md5sig); + + newkey = kmalloc(sizeof(*newkey), GFP_ATOMIC); + if (!newkey) + goto err_newkey; + + memcpy(newkey->key, key->key, key->keylen); + newkey->keylen = key->keylen; + 
newkey->family = family; + newkey->prefixlen = 32; + memcpy(&newkey->addr, addr, + (family == AF_INET6) ? sizeof(struct in6_addr) : + sizeof(struct in_addr)); + hlist_add_head_rcu(&newkey->node, &md5sig->head); + + return md5_opt; + +err_newkey: + kfree(md5sig); +err_md5sig: + kfree_rcu(md5_opt, rcu); +err: + return NULL; +} + +static struct tcp_extopt_store *tcp_md5_v4_extopt_copy(const struct sock *listener, + struct request_sock *req) +{ + struct inet_request_sock *ireq = inet_rsk(req); + struct tcp_md5sig_key *key; + + /* Copy over the MD5 key from the original socket */ + key = tcp_md5_do_lookup(listener, + (union tcp_md5_addr *)&ireq->ir_rmt_addr, + AF_INET); + if (!key) + return NULL; + + return (struct tcp_extopt_store *)__tcp_md5_extopt_copy(req, key, + (union tcp_md5_addr *)&ireq->ir_rmt_addr, + AF_INET); +} + +#if IS_ENABLED(CONFIG_IPV6) +static struct tcp_extopt_store *tcp_md5_v6_extopt_copy(const struct sock *listener, + struct request_sock *req) +{ + struct inet_request_sock *ireq = inet_rsk(req); + struct tcp_md5sig_key *key; + + /* Copy over the MD5 key from the original socket */ + key = tcp_v6_md5_do_lookup(listener, &ireq->ir_v6_rmt_addr); + if (!key) + return NULL; + + return (struct tcp_extopt_store *)__tcp_md5_extopt_copy(req, key, + (union tcp_md5_addr *)&ireq->ir_v6_rmt_addr, + AF_INET6); +} +#endif + +/* We are creating a new request-socket, based on the listener's key that + * matches the IP-address. Thus, we need to create a new tcp_extopt_store, and + * store the matching key in there for the request-sock. 
+ */ +static struct tcp_extopt_store *tcp_md5_extopt_copy(struct sock *listener, + struct request_sock *req, + struct tcp_options_received *opt, + struct tcp_extopt_store *store) +{ +#if IS_ENABLED(CONFIG_IPV6) + struct inet_request_sock *ireq = inet_rsk(req); + + if (ireq->ireq_family == AF_INET6) + return tcp_md5_v6_extopt_copy(listener, req); +#endif + return tcp_md5_v4_extopt_copy(listener, req); +} + +/* Moving from a request-sock to a full socket means we need to account for + * the memory and set GSO-flags. When moving from a full socket to a time-wait + * socket we also need to adjust the memory accounting. + */ +static struct tcp_extopt_store *tcp_md5_extopt_move(struct sock *from, + struct sock *to, + struct tcp_extopt_store *store) +{ + struct tcp_md5_extopt *md5_opt = tcp_extopt_to_md5(store); + unsigned int size = sizeof(struct tcp_md5sig_key); + + if (sk_fullsock(to)) { + /* From request-sock to full socket */ + + if (size > sysctl_optmem_max || + atomic_read(&to->sk_omem_alloc) + size >= sysctl_optmem_max) { + tcp_md5_extopt_destroy(store); + return NULL; + } + + sk_nocaps_add(to, NETIF_F_GSO_MASK); + atomic_add(size, &to->sk_omem_alloc); + } else if (sk_fullsock(from)) { + /* From full socket to time-wait-socket */ + atomic_sub(size, &from->sk_omem_alloc); + } + + md5_opt->sk = to; + + return store; +} + +static void tcp_md5_extopt_destroy(struct tcp_extopt_store *store) +{ + struct tcp_md5_extopt *md5_opt = tcp_extopt_to_md5(store); + + /* Clean up the MD5 key list, if any */ + if (md5_opt) { + tcp_clear_md5_list(md5_opt); + kfree_rcu(md5_opt->md5sig_info, rcu); + md5_opt->md5sig_info = NULL; + + kfree_rcu(md5_opt, rcu); + } +} + const struct tcp_sock_af_ops tcp_sock_ipv4_specific = { .md5_lookup = tcp_v4_md5_lookup, .calc_md5_hash = tcp_v4_md5_hash_skb, diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index f33214b29167..3da1c823240b 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -22,7 +22,6 @@ #include
#include #include -#include #include #include #include @@ -295,9 +294,6 @@ void tcp_time_wait(struct sock *sk, int state, int timeo) tcp_extopt_move(sk, (struct sock *)tw); INIT_HLIST_HEAD(&tp->tcp_option_list); } -#ifdef CONFIG_TCP_MD5SIG - tcp_md5_time_wait(sk, tw); -#endif /* Get the TIME_WAIT timeout firing. */ if (timeo < rto) @@ -332,10 +328,6 @@ void tcp_time_wait(struct sock *sk, int state, int timeo) void tcp_twsk_destructor(struct sock *sk) { -#ifdef CONFIG_TCP_MD5SIG - tcp_md5_twsk_destructor(sk); -#endif - if (unlikely(!hlist_empty(&tcp_twsk(sk)->tcp_option_list))) tcp_extopt_destroy(sk); } @@ -520,10 +512,6 @@ struct sock *tcp_create_openreq_child(const struct sock *sk, newtp->tcp_header_len = sizeof(struct tcphdr); } newtp->tsoffset = treq->ts_off; -#ifdef CONFIG_TCP_MD5SIG - newtp->md5sig_info = NULL; /*XXX*/ - tcp_md5_add_header_len(sk, newsk); -#endif if (unlikely(!hlist_empty(&treq->tcp_option_list))) newtp->tcp_header_len += tcp_extopt_add_header(req_to_sk(req), newsk); diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 137645753abb..41bd8a791b0d 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -42,7 +42,6 @@ #include #include #include -#include #include @@ -424,14 +423,6 @@ static void tcp_options_write(__be32 *ptr, struct sk_buff *skb, struct sock *sk, extopt_list = tcp_extopt_get_list(sk); - if (unlikely(OPTION_MD5 & options)) { - *ptr++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) | - (TCPOPT_MD5SIG << 8) | TCPOLEN_MD5SIG); - /* overload cookie hash location */ - opts->hash_location = (__u8 *)ptr; - ptr += 4; - } - if (unlikely(opts->mss)) { *ptr++ = htonl((TCPOPT_MSS << 24) | (TCPOLEN_MSS << 16) | @@ -527,14 +518,6 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb, unsigned int remaining = MAX_TCP_OPTION_SPACE; struct tcp_fastopen_request *fastopen = tp->fastopen_req; -#ifdef CONFIG_TCP_MD5SIG - opts->md5 = tp->af_specific->md5_lookup(sk, sk); - if (opts->md5) { - opts->options |= 
OPTION_MD5; - remaining -= TCPOLEN_MD5SIG_ALIGNED; - } -#endif - /* We always get an MSS option. The option bytes which will be seen in * normal data packets should timestamps be used, must be in the MSS * advertised. But we subtract them from tp->mss_cache so that @@ -547,7 +530,7 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb, opts->mss = tcp_advertise_mss(sk); remaining -= TCPOLEN_MSS_ALIGNED; - if (likely(sock_net(sk)->ipv4.sysctl_tcp_timestamps && !opts->md5)) { + if (likely(sock_net(sk)->ipv4.sysctl_tcp_timestamps)) { opts->options |= OPTION_TS; opts->tsval = tcp_skb_timestamp(skb) + tp->tsoffset; opts->tsecr = tp->rx_opt.ts_recent; @@ -596,20 +579,6 @@ static unsigned int tcp_synack_options(const struct sock *sk, struct inet_request_sock *ireq = inet_rsk(req); unsigned int remaining = MAX_TCP_OPTION_SPACE; -#ifdef CONFIG_TCP_MD5SIG - if (opts->md5) { - opts->options |= OPTION_MD5; - remaining -= TCPOLEN_MD5SIG_ALIGNED; - - /* We can't fit any SACK blocks in a packet with MD5 + TS - * options. There was discussion about disabling SACK - * rather than TS in order to fit in better with old, - * buggy kernels, but that was deemed to be unnecessary. - */ - ireq->tstamp_ok &= !ireq->sack_ok; - } -#endif - /* We always send an MSS option. 
*/ opts->mss = mss; remaining -= TCPOLEN_MSS_ALIGNED; @@ -670,16 +639,6 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb size += TCPOLEN_TSTAMP_ALIGNED; } -#ifdef CONFIG_TCP_MD5SIG - opts->md5 = tp->af_specific->md5_lookup(sk, sk); - if (unlikely(opts->md5)) { - opts->options |= OPTION_MD5; - size += TCPOLEN_MD5SIG_ALIGNED; - } -#else - opts->md5 = NULL; -#endif - if (unlikely(!hlist_empty(&tp->tcp_option_list))) size += tcp_extopt_prepare(skb, 0, MAX_TCP_OPTION_SPACE - size, opts, tcp_to_sk(tp)); @@ -1082,14 +1041,6 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it, th->window = htons(min(tp->rcv_wnd, 65535U)); } tcp_options_write((__be32 *)(th + 1), skb, sk, &opts); -#ifdef CONFIG_TCP_MD5SIG - /* Calculate the MD5 hash, as we have all we need now */ - if (opts.md5) { - sk_nocaps_add(sk, NETIF_F_GSO_MASK); - tp->af_specific->calc_md5_hash(opts.hash_location, - opts.md5, sk, skb); - } -#endif icsk->icsk_af_ops->send_check(sk, skb); @@ -3159,10 +3110,6 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst, #endif skb->skb_mstamp = tcp_clock_us(); -#ifdef CONFIG_TCP_MD5SIG - rcu_read_lock(); - opts.md5 = tcp_rsk(req)->af_specific->req_md5_lookup(sk, req_to_sk(req)); -#endif skb_set_hash(skb, tcp_rsk(req)->txhash, PKT_HASH_TYPE_L4); tcp_header_size = tcp_synack_options(sk, req, mss, skb, &opts, foc) + sizeof(*th); @@ -3189,15 +3136,6 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst, tcp_options_write((__be32 *)(th + 1), skb, req_to_sk(req), &opts); __TCP_INC_STATS(sock_net(sk), TCP_MIB_OUTSEGS); -#ifdef CONFIG_TCP_MD5SIG - /* Okay, we have all we need - do the md5 hash if needed */ - if (opts.md5) - tcp_rsk(req)->af_specific->calc_md5_hash(opts.hash_location, - opts.md5, - req_to_sk(req), skb); - rcu_read_unlock(); -#endif - /* Do not fool tcpdump (if any), clean our debris */ skb->tstamp = 0; return skb; @@ -3238,10 +3176,6 @@ static void 
tcp_connect_init(struct sock *sk) if (sock_net(sk)->ipv4.sysctl_tcp_timestamps) tp->tcp_header_len += TCPOLEN_TSTAMP_ALIGNED; -#ifdef CONFIG_TCP_MD5SIG - tcp_md5_add_header_len(sk, sk); -#endif - if (unlikely(!hlist_empty(&tp->tcp_option_list))) tp->tcp_header_len += tcp_extopt_add_header(sk, sk); diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index e9b72d794140..16cbd6ec2063 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -577,20 +577,6 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 if (tsecr) tot_len += TCPOLEN_TSTAMP_ALIGNED; -#ifdef CONFIG_TCP_MD5SIG -{ - int ret; - - ret = tcp_v6_md5_send_response_prepare(skb, 0, - MAX_TCP_OPTION_SPACE - tot_len, - &extraopts, sk); - - if (ret == -1) - goto out; - - tot_len += ret; -} -#endif if (sk) extopt_list = tcp_extopt_get_list(sk); @@ -639,11 +625,6 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 *topt++ = htonl(tsecr); } -#ifdef CONFIG_TCP_MD5SIG - if (extraopts.md5) - tcp_v6_md5_send_response_write(topt, skb, t1, &extraopts, sk); -#endif - if (unlikely(extopt_list && !hlist_empty(extopt_list))) tcp_extopt_response_write(topt, skb, t1, &extraopts, sk); @@ -957,10 +938,6 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff * newinet->inet_daddr = newinet->inet_saddr = LOOPBACK4_IPV6; newinet->inet_rcv_saddr = LOOPBACK4_IPV6; -#ifdef CONFIG_TCP_MD5SIG - tcp_v6_md5_syn_recv_sock(sk, newsk); -#endif - if (__inet_inherit_port(sk, newsk) < 0) { inet_csk_prepare_forced_close(newsk); tcp_done(newsk); From patchwork Mon Dec 18 21:51:08 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Paasch X-Patchwork-Id: 850428 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) 
From: Christoph Paasch To: netdev@vger.kernel.org Cc: Eric Dumazet , Mat Martineau , Alexei Starovoitov Subject: [RFC 13/14] tcp_md5: Cleanup TCP-code Date: Mon, 18 Dec 2017 13:51:08 -0800 Message-id: <20171218215109.38700-14-cpaasch@apple.com> X-Mailer: git-send-email 2.15.0 In-reply-to: <20171218215109.38700-1-cpaasch@apple.com> References: <20171218215109.38700-1-cpaasch@apple.com>
lsTb2+2sILaIgJTExx3b2UFsZoFmRomdc0Qg6vUkmlq+g9WwCKhKrJ+7HKyXV8Bc4uyJPYwg tgTQqsXfd7KB2JxA8Z6vj9gh9ppJnP/5kmkCI98shIsWMDKuYhQoSs1JrDTSg/t1EyM4LAud dzAeW2Z1iFGAg1GJh3fGVfMoIdbEsuLKXKDXOJiVRHj9zgKFeFMSK6tSi/Lji0pzUosPMfoA 3TaRWUo0OR8YM3kl8YbGFsaWJhYGBiaWZiY4hJXEed1XAc0SSE8sSc1OTS1ILYIZx8TBKdXA WC0z+e7voOdrKlmPn3HIclfXl/upzDN747vnvxYlT7q13/BZf94Ph26b/j0nuHtcy882VjPq 1v/53l4y+31W6u5wnp9bFryzVJs1/cas3P9pqqXS36cH1qbrSbRMmy6Uu7Od7eEPQWsHl0MS N29YXL4z+y+L33wr61bNbHfvf+9PBcn9qWU6qsQCTBWGWsxFxYkApGmwSHgCAAA= Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Now that we have consolidated the TCP_MD5 output path, we can cleanup TCP and its callbacks to MD5. These callbacks are solely there to handle the different address-familiese (v4, v6 and v4mapped). Now that we have isolated the TCP_MD5-code it is acceptable to add a bit more complexity inside tcp_md5.c to handle these address-families at the benefit of getting rid of these callbacks in tcp_sock, together with its assignments in tcp_v4/6_connect,... Signed-off-by: Christoph Paasch Reviewed-by: Mat Martineau --- include/linux/tcp.h | 5 - include/linux/tcp_md5.h | 18 +-- include/net/tcp.h | 24 ---- net/ipv4/tcp.c | 2 +- net/ipv4/tcp_ipv4.c | 8 -- net/ipv4/tcp_md5.c | 340 ++++++++++++++++++++++-------------------------- net/ipv6/tcp_ipv6.c | 17 --- 7 files changed, 155 insertions(+), 259 deletions(-) diff --git a/include/linux/tcp.h b/include/linux/tcp.h index 034fbd9e0a38..5278387fabe7 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -377,11 +377,6 @@ struct tcp_sock { * while socket was owned by user. 
 */
-#ifdef CONFIG_TCP_MD5SIG
-/* TCP AF-Specific parts; only used by MD5 Signature support so far */
-	const struct tcp_sock_af_ops	*af_specific;
-#endif
-
 /* TCP fastopen related information */
 	struct tcp_fastopen_request *fastopen_req;
 	/* fastopen_rsk points to request_sock that resulted in this big
diff --git a/include/linux/tcp_md5.h b/include/linux/tcp_md5.h
index 8dee4fc3dc7f..509fc36335e7 100644
--- a/include/linux/tcp_md5.h
+++ b/include/linux/tcp_md5.h
@@ -26,28 +26,14 @@ struct tcp_md5sig_key {
 	struct rcu_head rcu;
 };
 
-extern const struct tcp_sock_af_ops tcp_sock_ipv4_specific;
-extern const struct tcp_sock_af_ops tcp_sock_ipv6_specific;
-extern const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific;
-
 /* - functions */
-int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key,
-			const struct sock *sk, const struct sk_buff *skb);
-struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk,
-					 const struct sock *addr_sk);
+int tcp_md5_parse_keys(struct sock *sk, int optname, char __user *optval,
+		       int optlen);
 
 bool tcp_v4_inbound_md5_hash(const struct sock *sk,
			     const struct sk_buff *skb);
 
-struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk,
-					 const struct sock *addr_sk);
-
-int tcp_v6_md5_hash_skb(char *md5_hash,
-			const struct tcp_md5sig_key *key,
-			const struct sock *sk,
-			const struct sk_buff *skb);
-
 bool tcp_v6_inbound_md5_hash(const struct sock *sk,
			     const struct sk_buff *skb);
diff --git a/include/net/tcp.h b/include/net/tcp.h
index e955c5f0997f..baf0a6989a79 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1729,32 +1729,8 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 		     const struct tcp_request_sock_ops *af_ops,
 		     struct sock *sk, struct sk_buff *skb);
 
-/* TCP af-specific functions */
-struct tcp_sock_af_ops {
-#ifdef CONFIG_TCP_MD5SIG
-	struct tcp_md5sig_key	*(*md5_lookup) (const struct sock *sk,
-						const struct sock *addr_sk);
-	int		(*calc_md5_hash)(char *location,
-					 const struct tcp_md5sig_key *md5,
-					 const struct sock *sk,
-					 const struct sk_buff *skb);
-	int		(*md5_parse)(struct sock *sk,
-				     int optname,
-				     char __user *optval,
-				     int optlen);
-#endif
-};
-
 struct tcp_request_sock_ops {
 	u16 mss_clamp;
-#ifdef CONFIG_TCP_MD5SIG
-	struct tcp_md5sig_key *(*req_md5_lookup)(const struct sock *sk,
-						 const struct sock *addr_sk);
-	int		(*calc_md5_hash) (char *location,
-					  const struct tcp_md5sig_key *md5,
-					  const struct sock *sk,
-					  const struct sk_buff *skb);
-#endif
 	void (*init_req)(struct request_sock *req,
			 const struct sock *sk_listener,
			 struct sk_buff *skb);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 29f3ce8a0b54..8b6f5efe9509 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2795,7 +2795,7 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
 	case TCP_MD5SIG:
 	case TCP_MD5SIG_EXT:
 		/* Read the IP->Key mappings from userspace */
-		err = tp->af_specific->md5_parse(sk, optname, optval, optlen);
+		err = tcp_md5_parse_keys(sk, optname, optval, optlen);
 		break;
 #endif
 	case TCP_USER_TIMEOUT:
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 356bf41ec73a..670d7751f814 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -889,10 +889,6 @@ struct request_sock_ops tcp_request_sock_ops __read_mostly = {
 static const struct tcp_request_sock_ops tcp_request_sock_ipv4_ops = {
 	.mss_clamp	=	TCP_MSS_DEFAULT,
-#ifdef CONFIG_TCP_MD5SIG
-	.req_md5_lookup	=	tcp_v4_md5_lookup,
-	.calc_md5_hash	=	tcp_v4_md5_hash_skb,
-#endif
 	.init_req	=	tcp_v4_init_req,
 #ifdef CONFIG_SYN_COOKIES
 	.cookie_init_seq =	cookie_v4_init_sequence,
@@ -1450,10 +1446,6 @@ static int tcp_v4_init_sock(struct sock *sk)
 
 	icsk->icsk_af_ops = &ipv4_specific;
 
-#ifdef CONFIG_TCP_MD5SIG
-	tcp_sk(sk)->af_specific = &tcp_sock_ipv4_specific;
-#endif
-
 	return 0;
 }
diff --git a/net/ipv4/tcp_md5.c b/net/ipv4/tcp_md5.c
index 64e5b4420ce9..052f5a587783 100644
--- a/net/ipv4/tcp_md5.c
+++ b/net/ipv4/tcp_md5.c
@@ -336,12 +336,13 @@ static int tcp_md5_hash_key(struct tcp_md5sig_pool *hp,
 	return crypto_ahash_update(hp->md5_req);
 }
 
-static int tcp_v4_parse_md5_keys(struct sock *sk, int optname,
-				 char __user *optval, int optlen)
+int tcp_md5_parse_keys(struct sock *sk, int optname, char __user *optval,
+		       int optlen)
 {
+	u8 prefixlen = 32, maxprefixlen;
+	union tcp_md5_addr *tcpmd5addr;
 	struct tcp_md5sig cmd;
-	struct sockaddr_in *sin = (struct sockaddr_in *)&cmd.tcpm_addr;
-	u8 prefixlen = 32;
+	unsigned short family;
 
 	if (optlen < sizeof(cmd))
 		return -EINVAL;
@@ -349,76 +350,48 @@ static int tcp_v4_parse_md5_keys(struct sock *sk, int optname,
 	if (copy_from_user(&cmd, optval, sizeof(cmd)))
 		return -EFAULT;
 
-	if (sin->sin_family != AF_INET)
-		return -EINVAL;
-
-	if (optname == TCP_MD5SIG_EXT &&
-	    cmd.tcpm_flags & TCP_MD5SIG_FLAG_PREFIX) {
-		prefixlen = cmd.tcpm_prefixlen;
-		if (prefixlen > 32)
-			return -EINVAL;
-	}
+	family = cmd.tcpm_addr.ss_family;
 
-	if (!cmd.tcpm_keylen)
-		return tcp_md5_do_del(sk, (union tcp_md5_addr *)&sin->sin_addr.s_addr,
-				      AF_INET, prefixlen);
-
-	if (cmd.tcpm_keylen > TCP_MD5SIG_MAXKEYLEN)
+	if (family != AF_INET && family != AF_INET6)
 		return -EINVAL;
 
-	return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin->sin_addr.s_addr,
-			      AF_INET, prefixlen, cmd.tcpm_key, cmd.tcpm_keylen,
-			      GFP_KERNEL);
-}
-
-#if IS_ENABLED(CONFIG_IPV6)
-static int tcp_v6_parse_md5_keys(struct sock *sk, int optname,
-				 char __user *optval, int optlen)
-{
-	struct tcp_md5sig cmd;
-	struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&cmd.tcpm_addr;
-	u8 prefixlen;
-
-	if (optlen < sizeof(cmd))
+	if (sk->sk_family != family)
 		return -EINVAL;
 
-	if (copy_from_user(&cmd, optval, sizeof(cmd)))
-		return -EFAULT;
+	if (family == AF_INET6) {
+		struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&cmd.tcpm_addr;
 
-	if (sin6->sin6_family != AF_INET6)
-		return -EINVAL;
+		if (!ipv6_addr_v4mapped(&sin6->sin6_addr)) {
+			tcpmd5addr = (union tcp_md5_addr *)&sin6->sin6_addr;
+			maxprefixlen = 128;
+		} else {
+			tcpmd5addr = (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3];
+			family = AF_INET;
+			maxprefixlen = 32;
+		}
+	} else {
+		struct sockaddr_in *sin = (struct sockaddr_in *)&cmd.tcpm_addr;
+
+		tcpmd5addr = (union tcp_md5_addr *)&sin->sin_addr;
+		maxprefixlen = 32;
+	}
 
 	if (optname == TCP_MD5SIG_EXT &&
 	    cmd.tcpm_flags & TCP_MD5SIG_FLAG_PREFIX) {
 		prefixlen = cmd.tcpm_prefixlen;
-		if (prefixlen > 128 || (ipv6_addr_v4mapped(&sin6->sin6_addr) &&
-					prefixlen > 32))
+		if (prefixlen > maxprefixlen)
 			return -EINVAL;
-	} else {
-		prefixlen = ipv6_addr_v4mapped(&sin6->sin6_addr) ? 32 : 128;
 	}
 
-	if (!cmd.tcpm_keylen) {
-		if (ipv6_addr_v4mapped(&sin6->sin6_addr))
-			return tcp_md5_do_del(sk, (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3],
-					      AF_INET, prefixlen);
-		return tcp_md5_do_del(sk, (union tcp_md5_addr *)&sin6->sin6_addr,
-				      AF_INET6, prefixlen);
-	}
+	if (!cmd.tcpm_keylen)
+		return tcp_md5_do_del(sk, tcpmd5addr, family, prefixlen);
 
 	if (cmd.tcpm_keylen > TCP_MD5SIG_MAXKEYLEN)
 		return -EINVAL;
 
-	if (ipv6_addr_v4mapped(&sin6->sin6_addr))
-		return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3],
-				      AF_INET, prefixlen, cmd.tcpm_key,
-				      cmd.tcpm_keylen, GFP_KERNEL);
-
-	return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin6->sin6_addr,
-			      AF_INET6, prefixlen, cmd.tcpm_key,
+	return tcp_md5_do_add(sk, tcpmd5addr, family, prefixlen, cmd.tcpm_key,
 			      cmd.tcpm_keylen, GFP_KERNEL);
 }
-#endif
 
 static int tcp_v4_md5_hash_headers(struct tcp_md5sig_pool *hp,
				    __be32 daddr, __be32 saddr,
@@ -670,6 +643,102 @@ static int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp,
 	return 0;
 }
 
+static int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key,
+			       const struct sock *sk, const struct sk_buff *skb)
+{
+	struct tcp_md5sig_pool *hp;
+	struct ahash_request *req;
+	const struct tcphdr *th = tcp_hdr(skb);
+	__be32 saddr, daddr;
+
+	if (sk) { /* valid for establish/request sockets */
+		saddr = sk->sk_rcv_saddr;
+		daddr = sk->sk_daddr;
+	} else {
+		const struct iphdr *iph = ip_hdr(skb);
+
+		saddr = iph->saddr;
+		daddr = iph->daddr;
+	}
+
+	hp = tcp_get_md5sig_pool();
+	if (!hp)
+		goto clear_hash_noput;
+	req = hp->md5_req;
+
+	if (crypto_ahash_init(req))
+		goto clear_hash;
+
+	if (tcp_v4_md5_hash_headers(hp, daddr, saddr, th, skb->len))
+		goto clear_hash;
+	if (tcp_md5_hash_skb_data(hp, skb, th->doff << 2))
+		goto clear_hash;
+	if (tcp_md5_hash_key(hp, key))
+		goto clear_hash;
+	ahash_request_set_crypt(req, NULL, md5_hash, 0);
+	if (crypto_ahash_final(req))
+		goto clear_hash;
+
+	tcp_put_md5sig_pool();
+	return 0;
+
+clear_hash:
+	tcp_put_md5sig_pool();
+clear_hash_noput:
+	memset(md5_hash, 0, 16);
+	return 1;
+}
+
+#if IS_ENABLED(CONFIG_IPV6)
+static int tcp_v6_md5_hash_skb(char *md5_hash,
+			       const struct tcp_md5sig_key *key,
+			       const struct sock *sk,
+			       const struct sk_buff *skb)
+{
+	const struct in6_addr *saddr, *daddr;
+	struct tcp_md5sig_pool *hp;
+	struct ahash_request *req;
+	const struct tcphdr *th = tcp_hdr(skb);
+
+	if (sk) { /* valid for establish/request sockets */
+		saddr = &sk->sk_v6_rcv_saddr;
+		daddr = &sk->sk_v6_daddr;
+	} else {
+		const struct ipv6hdr *ip6h = ipv6_hdr(skb);
+
+		saddr = &ip6h->saddr;
+		daddr = &ip6h->daddr;
+	}
+
+	hp = tcp_get_md5sig_pool();
+	if (!hp)
+		goto clear_hash_noput;
+	req = hp->md5_req;
+
+	if (crypto_ahash_init(req))
+		goto clear_hash;
+
+	if (tcp_v6_md5_hash_headers(hp, daddr, saddr, th, skb->len))
+		goto clear_hash;
+	if (tcp_md5_hash_skb_data(hp, skb, th->doff << 2))
+		goto clear_hash;
+	if (tcp_md5_hash_key(hp, key))
+		goto clear_hash;
+	ahash_request_set_crypt(req, NULL, md5_hash, 0);
+	if (crypto_ahash_final(req))
+		goto clear_hash;
+
+	tcp_put_md5sig_pool();
+	return 0;
+
+clear_hash:
+	tcp_put_md5sig_pool();
clear_hash_noput:
+	memset(md5_hash, 0, 16);
+	return 1;
+}
+#endif
+
 static int tcp_v4_md5_send_response_prepare(struct sk_buff *skb, u8 flags,
					     unsigned int remaining,
					     struct tcp_out_options *opts,
@@ -784,114 +853,14 @@ static __be32 *tcp_md5_send_response_write(__be32 *ptr, struct sk_buff *orig,
 	return tcp_v4_md5_send_response_write(ptr, orig, th, opts, sk);
 }
 
-struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk,
-					 const struct sock *addr_sk)
+static struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk,
+						const struct sock *addr_sk)
 {
 	const union tcp_md5_addr *addr;
 
 	addr = (const union tcp_md5_addr *)&addr_sk->sk_daddr;
 	return tcp_md5_do_lookup(sk, addr, AF_INET);
 }
-EXPORT_SYMBOL(tcp_v4_md5_lookup);
-
-int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key,
-			const struct sock *sk,
-			const struct sk_buff *skb)
-{
-	struct tcp_md5sig_pool *hp;
-	struct ahash_request *req;
-	const struct tcphdr *th = tcp_hdr(skb);
-	__be32 saddr, daddr;
-
-	if (sk) { /* valid for establish/request sockets */
-		saddr = sk->sk_rcv_saddr;
-		daddr = sk->sk_daddr;
-	} else {
-		const struct iphdr *iph = ip_hdr(skb);
-
-		saddr = iph->saddr;
-		daddr = iph->daddr;
-	}
-
-	hp = tcp_get_md5sig_pool();
-	if (!hp)
-		goto clear_hash_noput;
-	req = hp->md5_req;
-
-	if (crypto_ahash_init(req))
-		goto clear_hash;
-
-	if (tcp_v4_md5_hash_headers(hp, daddr, saddr, th, skb->len))
-		goto clear_hash;
-	if (tcp_md5_hash_skb_data(hp, skb, th->doff << 2))
-		goto clear_hash;
-	if (tcp_md5_hash_key(hp, key))
-		goto clear_hash;
-	ahash_request_set_crypt(req, NULL, md5_hash, 0);
-	if (crypto_ahash_final(req))
-		goto clear_hash;
-
-	tcp_put_md5sig_pool();
-	return 0;
-
-clear_hash:
-	tcp_put_md5sig_pool();
-clear_hash_noput:
-	memset(md5_hash, 0, 16);
-	return 1;
-}
-EXPORT_SYMBOL(tcp_v4_md5_hash_skb);
-
-#if IS_ENABLED(CONFIG_IPV6)
-int tcp_v6_md5_hash_skb(char *md5_hash,
-			const struct tcp_md5sig_key *key,
-			const struct sock *sk,
-			const struct sk_buff *skb)
-{
-	const struct in6_addr *saddr, *daddr;
-	struct tcp_md5sig_pool *hp;
-	struct ahash_request *req;
-	const struct tcphdr *th = tcp_hdr(skb);
-
-	if (sk) { /* valid for establish/request sockets */
-		saddr = &sk->sk_v6_rcv_saddr;
-		daddr = &sk->sk_v6_daddr;
-	} else {
-		const struct ipv6hdr *ip6h = ipv6_hdr(skb);
-
-		saddr = &ip6h->saddr;
-		daddr = &ip6h->daddr;
-	}
-
-	hp = tcp_get_md5sig_pool();
-	if (!hp)
-		goto clear_hash_noput;
-	req = hp->md5_req;
-
-	if (crypto_ahash_init(req))
-		goto clear_hash;
-
-	if (tcp_v6_md5_hash_headers(hp, daddr, saddr, th, skb->len))
-		goto clear_hash;
-	if (tcp_md5_hash_skb_data(hp, skb, th->doff << 2))
-		goto clear_hash;
-	if (tcp_md5_hash_key(hp, key))
-		goto clear_hash;
-	ahash_request_set_crypt(req, NULL, md5_hash, 0);
-	if (crypto_ahash_final(req))
-		goto clear_hash;
-
-	tcp_put_md5sig_pool();
-	return 0;
-
-clear_hash:
-	tcp_put_md5sig_pool();
-clear_hash_noput:
-	memset(md5_hash, 0, 16);
-	return 1;
-}
-EXPORT_SYMBOL_GPL(tcp_v6_md5_hash_skb);
-#endif
 
 /* Called with rcu_read_lock() */
 bool tcp_v4_inbound_md5_hash(const struct sock *sk,
@@ -994,8 +963,8 @@ bool tcp_v6_inbound_md5_hash(const struct sock *sk,
 }
 EXPORT_SYMBOL_GPL(tcp_v6_inbound_md5_hash);
 
-struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk,
-					 const struct sock *addr_sk)
+static struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk,
+						const struct sock *addr_sk)
 {
 	return tcp_v6_md5_do_lookup(sk, &addr_sk->sk_v6_daddr);
 }
@@ -1103,10 +1072,17 @@ static int tcp_md5_extopt_add_header_len(const struct sock *orig,
					 const struct sock *sk,
					 struct tcp_extopt_store *store)
 {
-	struct tcp_sock *tp = tcp_sk(sk);
-
-	if (tp->af_specific->md5_lookup(orig, sk))
+#if IS_ENABLED(CONFIG_IPV6)
+	if (sk->sk_family == AF_INET6 &&
+	    !ipv6_addr_v4mapped(&sk->sk_v6_daddr)) {
+		if (tcp_v6_md5_lookup(orig, sk))
+			return TCPOLEN_MD5SIG_ALIGNED;
+	} else
+#endif
+	{
+		if (tcp_v4_md5_lookup(orig, sk))
 			return TCPOLEN_MD5SIG_ALIGNED;
+	}
 
 	return 0;
 }
@@ -1120,19 +1096,29 @@ static unsigned int tcp_md5_extopt_prepare(struct sk_buff *skb, u8 flags,
 	int ret = 0;
 
 	if (sk_fullsock(sk)) {
-		struct tcp_sock *tp = tcp_sk(sk);
-
-		opts->md5 = tp->af_specific->md5_lookup(sk, sk);
+#if IS_ENABLED(CONFIG_IPV6)
+		if (sk->sk_family == AF_INET6 && !ipv6_addr_v4mapped(&sk->sk_v6_daddr))
+			opts->md5 = tcp_v6_md5_lookup(sk, sk);
+		else
+#endif
+			opts->md5 = tcp_v4_md5_lookup(sk, sk);
 	} else {
 		struct request_sock *req = inet_reqsk(sk);
 		struct sock *listener = req->rsk_listener;
+		struct inet_request_sock *ireq = inet_rsk(req);
 
 		/* Coming from tcp_make_synack, unlock is in
		 * tcp_md5_extopt_write
		 */
 		rcu_read_lock();
-		opts->md5 = tcp_rsk(req)->af_specific->req_md5_lookup(listener, sk);
+#if IS_ENABLED(CONFIG_IPV6)
+		if (ireq->ireq_family == AF_INET6 &&
+		    !ipv6_addr_v4mapped(&ireq->ir_v6_rmt_addr))
+			opts->md5 = tcp_v6_md5_lookup(listener, sk);
+		else
+#endif
+			opts->md5 = tcp_v4_md5_lookup(listener, sk);
 
 		if (!opts->md5)
 			rcu_read_unlock();
@@ -1352,25 +1338,3 @@ static void tcp_md5_extopt_destroy(struct tcp_extopt_store *store)
 		kfree_rcu(md5_opt, rcu);
 	}
 }
-
-const struct tcp_sock_af_ops tcp_sock_ipv4_specific = {
-	.md5_lookup	= tcp_v4_md5_lookup,
-	.calc_md5_hash	= tcp_v4_md5_hash_skb,
-	.md5_parse	= tcp_v4_parse_md5_keys,
-};
-
-#if IS_ENABLED(CONFIG_IPV6)
-const struct tcp_sock_af_ops tcp_sock_ipv6_specific = {
-	.md5_lookup	= tcp_v6_md5_lookup,
-	.calc_md5_hash	= tcp_v6_md5_hash_skb,
-	.md5_parse	= tcp_v6_parse_md5_keys,
-};
-EXPORT_SYMBOL_GPL(tcp_sock_ipv6_specific);
-
-const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific = {
-	.md5_lookup	= tcp_v4_md5_lookup,
-	.calc_md5_hash	= tcp_v4_md5_hash_skb,
-	.md5_parse	= tcp_v6_parse_md5_keys,
-};
-EXPORT_SYMBOL_GPL(tcp_sock_ipv6_mapped_specific);
-#endif
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 16cbd6ec2063..890616fc5591 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -208,9 +208,6 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 		icsk->icsk_af_ops = &ipv6_mapped;
 		sk->sk_backlog_rcv = tcp_v4_do_rcv;
-#ifdef CONFIG_TCP_MD5SIG
-		tp->af_specific = &tcp_sock_ipv6_mapped_specific;
-#endif
 
 		err = tcp_v4_connect(sk, (struct sockaddr *)&sin, sizeof(sin));
 
@@ -218,9 +215,6 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 			icsk->icsk_ext_hdr_len = exthdrlen;
 			icsk->icsk_af_ops = &ipv6_specific;
 			sk->sk_backlog_rcv = tcp_v6_do_rcv;
-#ifdef CONFIG_TCP_MD5SIG
-			tp->af_specific = &tcp_sock_ipv6_specific;
-#endif
 			goto failure;
 		}
 		np->saddr = sk->sk_v6_rcv_saddr;
@@ -543,10 +537,6 @@ struct request_sock_ops tcp6_request_sock_ops __read_mostly = {
 static const struct tcp_request_sock_ops tcp_request_sock_ipv6_ops = {
 	.mss_clamp	=	IPV6_MIN_MTU - sizeof(struct tcphdr) -
				sizeof(struct ipv6hdr),
-#ifdef CONFIG_TCP_MD5SIG
-	.req_md5_lookup	=	tcp_v6_md5_lookup,
-	.calc_md5_hash	=	tcp_v6_md5_hash_skb,
-#endif
 	.init_req	=	tcp_v6_init_req,
 #ifdef CONFIG_SYN_COOKIES
 	.cookie_init_seq =	cookie_v6_init_sequence,
@@ -821,9 +811,6 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
 		inet_csk(newsk)->icsk_af_ops = &ipv6_mapped;
 		newsk->sk_backlog_rcv = tcp_v4_do_rcv;
-#ifdef CONFIG_TCP_MD5SIG
-		newtp->af_specific = &tcp_sock_ipv6_mapped_specific;
-#endif
 
 		newnp->ipv6_mc_list = NULL;
 		newnp->ipv6_ac_list = NULL;
@@ -1430,10 +1417,6 @@ static int tcp_v6_init_sock(struct sock *sk)
 
 	icsk->icsk_af_ops = &ipv6_specific;
 
-#ifdef CONFIG_TCP_MD5SIG
-	tcp_sk(sk)->af_specific = &tcp_sock_ipv6_specific;
-#endif
-
 	return 0;
 }

From patchwork Mon Dec 18 21:51:09 2017
X-Patchwork-Submitter: Christoph Paasch
X-Patchwork-Id: 850426
X-Patchwork-Delegate: davem@davemloft.net
From: Christoph Paasch
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Mat Martineau, Alexei Starovoitov
Subject: [RFC 14/14] tcp_md5: Use TCP extra-options on the input path
Date: Mon, 18 Dec 2017 13:51:09 -0800
Message-id: <20171218215109.38700-15-cpaasch@apple.com>
X-Mailer: git-send-email 2.15.0
In-reply-to: <20171218215109.38700-1-cpaasch@apple.com>
References: <20171218215109.38700-1-cpaasch@apple.com>

The checks are now done through the extra-option framework. For TCP MD5 this
means that the check happens a bit later than usual.

Signed-off-by: Christoph Paasch
Reviewed-by: Mat Martineau
---
 include/linux/tcp_md5.h | 23 +----------------------
 net/ipv4/tcp_input.c    |  8 --------
 net/ipv4/tcp_ipv4.c     |  9 ---------
 net/ipv4/tcp_md5.c      | 29 ++++++++++++++++++++++-----
 net/ipv6/tcp_ipv6.c     |  9 ---------
 5 files changed, 25 insertions(+), 53 deletions(-)

diff --git a/include/linux/tcp_md5.h b/include/linux/tcp_md5.h
index 509fc36335e7..bef277f55b36 100644
--- a/include/linux/tcp_md5.h
+++ b/include/linux/tcp_md5.h
@@ -31,30 +31,9 @@ struct tcp_md5sig_key {
 int tcp_md5_parse_keys(struct sock *sk, int optname, char __user *optval,
		       int optlen);
 
-bool tcp_v4_inbound_md5_hash(const struct sock *sk,
-			     const struct sk_buff *skb);
-
-bool tcp_v6_inbound_md5_hash(const struct sock *sk,
-			     const struct sk_buff *skb);
-
 int tcp_md5_diag_get_aux(struct sock *sk, bool net_admin, struct sk_buff *skb);
 
 int tcp_md5_diag_get_aux_size(struct sock *sk, bool net_admin);
 
-#else
-
-static inline bool tcp_v4_inbound_md5_hash(const struct sock *sk,
-					   const struct sk_buff *skb)
-{
-	return false;
-}
-
-static inline bool tcp_v6_inbound_md5_hash(const struct sock *sk,
-					   const struct sk_buff *skb)
-{
-	return false;
-}
-
-#endif
-
+#endif /* CONFIG_TCP_MD5SIG */
 #endif /* _LINUX_TCP_MD5_H */
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index db54bdbdee51..e4de06e28a85 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3754,14 +3754,6 @@ void tcp_parse_options(const struct net *net,
 					TCP_SKB_CB(skb)->sacked = (ptr - 2) - (unsigned char *)th;
 				}
 				break;
-#ifdef CONFIG_TCP_MD5SIG
-			case TCPOPT_MD5SIG:
-				/*
-				 * The MD5 Hash has already been
-				 * checked (see tcp_v{4,6}_do_rcv()).
-				 */
-				break;
-#endif
 			case TCPOPT_FASTOPEN:
 				tcp_parse_fastopen_option(
					opsize - TCPOLEN_FASTOPEN_BASE,
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 670d7751f814..707ad1a343ba 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -62,7 +62,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
@@ -1249,11 +1248,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
 		struct sock *nsk;
 
 		sk = req->rsk_listener;
-		if (unlikely(tcp_v4_inbound_md5_hash(sk, skb))) {
-			sk_drops_add(sk, skb);
-			reqsk_put(req);
-			goto discard_it;
-		}
 		if (unlikely(sk->sk_state != TCP_LISTEN)) {
 			inet_csk_reqsk_queue_drop_and_put(sk, req);
 			goto lookup;
@@ -1293,9 +1287,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb))
 		goto discard_and_relse;
 
-	if (tcp_v4_inbound_md5_hash(sk, skb))
-		goto discard_and_relse;
-
 	nf_reset(skb);
 
 	if (tcp_filter(sk, skb))
diff --git a/net/ipv4/tcp_md5.c b/net/ipv4/tcp_md5.c
index 052f5a587783..723320d0741a 100644
--- a/net/ipv4/tcp_md5.c
+++ b/net/ipv4/tcp_md5.c
@@ -29,6 +29,10 @@ static DEFINE_PER_CPU(struct tcp_md5sig_pool, tcp_md5sig_pool);
 static DEFINE_MUTEX(tcp_md5sig_mutex);
 static bool tcp_md5sig_pool_populated;
 
+static bool tcp_inbound_md5_hash(struct sock *sk, const struct sk_buff *skb,
+				 struct tcp_options_received *opt_rx,
+				 struct tcp_extopt_store *store);
+
 static unsigned int tcp_md5_extopt_prepare(struct sk_buff *skb, u8 flags,
					    unsigned int remaining,
					    struct tcp_out_options *opts,
@@ -76,6 +80,7 @@ struct tcp_md5_extopt {
 static const struct tcp_extopt_ops tcp_md5_extra_ops = {
 	.option_kind		= TCPOPT_MD5SIG,
+	.check			= tcp_inbound_md5_hash,
 	.prepare		= tcp_md5_extopt_prepare,
 	.write			= tcp_md5_extopt_write,
 	.response_prepare	= tcp_md5_send_response_prepare,
@@ -863,8 +868,8 @@ static struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk,
 }
 
 /* Called with rcu_read_lock() */
-bool tcp_v4_inbound_md5_hash(const struct sock *sk,
-			     const struct sk_buff *skb)
+static bool tcp_v4_inbound_md5_hash(const struct sock *sk,
+				    const struct sk_buff *skb)
 {
 	/* This gets called for each TCP segment that arrives
	 * so we want to be efficient.
@@ -918,8 +923,8 @@ bool tcp_v4_inbound_md5_hash(const struct sock *sk,
 }
 
 #if IS_ENABLED(CONFIG_IPV6)
-bool tcp_v6_inbound_md5_hash(const struct sock *sk,
-			     const struct sk_buff *skb)
+static bool tcp_v6_inbound_md5_hash(const struct sock *sk,
+				    const struct sk_buff *skb)
 {
 	const __u8 *hash_location = NULL;
 	struct tcp_md5sig_key *hash_expected;
@@ -961,7 +966,6 @@ bool tcp_v6_inbound_md5_hash(const struct sock *sk,
 	return false;
 }
-EXPORT_SYMBOL_GPL(tcp_v6_inbound_md5_hash);
 
 static struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk,
						const struct sock *addr_sk)
@@ -971,6 +975,21 @@ static struct tcp_md5sig_key *tcp_v6_md5_lookup(const struct sock *sk,
 EXPORT_SYMBOL_GPL(tcp_v6_md5_lookup);
 #endif
 
+static bool tcp_inbound_md5_hash(struct sock *sk, const struct sk_buff *skb,
+				 struct tcp_options_received *opt_rx,
+				 struct tcp_extopt_store *store)
+{
+	if (skb->protocol == htons(ETH_P_IP)) {
+		return tcp_v4_inbound_md5_hash(sk, skb);
+#if IS_ENABLED(CONFIG_IPV6)
+	} else {
+		return tcp_v6_inbound_md5_hash(sk, skb);
+#endif
+	}
+
+	return false;
+}
+
 static void tcp_diag_md5sig_fill(struct tcp_diag_md5sig *info,
				  const struct tcp_md5sig_key *key)
 {
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 890616fc5591..f5dc730d3abc 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -43,7 +43,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
@@ -1173,11 +1172,6 @@ static int tcp_v6_rcv(struct sk_buff *skb)
 		struct sock *nsk;
 
 		sk = req->rsk_listener;
-		if (tcp_v6_inbound_md5_hash(sk, skb)) {
-			sk_drops_add(sk, skb);
-			reqsk_put(req);
-			goto discard_it;
-		}
 		if (unlikely(sk->sk_state != TCP_LISTEN)) {
 			inet_csk_reqsk_queue_drop_and_put(sk, req);
 			goto lookup;
@@ -1214,9 +1208,6 @@ static int tcp_v6_rcv(struct sk_buff *skb)
 	if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb))
 		goto discard_and_relse;
 
-	if (tcp_v6_inbound_md5_hash(sk, skb))
-		goto discard_and_relse;
-
 	if (tcp_filter(sk, skb))
 		goto discard_and_relse;
 
 	th = (const struct tcphdr *)skb->data;