From: Craig Gallek
To: netdev@vger.kernel.org, David Miller
Subject: [PATCH net-next 5/7] soreuseport: Prep for fast reuseport TCP socket selection
Date: Thu, 4 Feb 2016 10:35:16 -0500
Message-Id: <1454600118-30152-6-git-send-email-kraigatgoog@gmail.com>
In-Reply-To: <1454600118-30152-1-git-send-email-kraigatgoog@gmail.com>
References: <1454600118-30152-1-git-send-email-kraigatgoog@gmail.com>
X-Patchwork-Id: 579005

Both of the lines in this patch probably should have been included in
the initial implementation of this code for generic socket support, but
weren't technically necessary since only UDP sockets were supported.

First, the sk_reuseport_cb points to a structure which assumes each
socket in the group has this pointer assigned at the same time it's
added to the array in the structure. The sk_clone_lock function breaks
this assumption. Since a child socket shouldn't implicitly be in a
reuseport group, the simple fix is to clear the field in the clone.

Second, the SO_ATTACH_REUSEPORT_xBPF socket options require that
SO_REUSEPORT also be set first. For UDP sockets, this is easily
enforced at bind time, since that process both puts the socket in the
appropriate receive hlist and updates the reuseport structures. Since
these operations can happen at two different times for TCP sockets
(bind and listen), the setsockopt call must explicitly check that
SO_REUSEPORT is set before it accepts SO_ATTACH_REUSEPORT_xBPF.

Signed-off-by: Craig Gallek
Acked-by: Eric Dumazet
---
 net/core/filter.c | 2 +-
 net/core/sock.c   | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index 94d26201080d..2a6e9562f1ab 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1181,7 +1181,7 @@ static int __reuseport_attach_prog(struct bpf_prog *prog, struct sock *sk)
 	if (bpf_prog_size(prog->len) > sysctl_optmem_max)
 		return -ENOMEM;
 
-	if (sk_unhashed(sk)) {
+	if (sk_unhashed(sk) && sk->sk_reuseport) {
 		err = reuseport_alloc(sk);
 		if (err)
 			return err;
diff --git a/net/core/sock.c b/net/core/sock.c
index 6c1c8bc93412..46dc8ad7d050 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1531,6 +1531,7 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
 				newsk = NULL;
 				goto out;
 			}
+		RCU_INIT_POINTER(newsk->sk_reuseport_cb, NULL);
 
 		newsk->sk_err = 0;
 		newsk->sk_priority = 0;
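
For anyone who wants to look at the two invariants in isolation, here is a
minimal userspace toy model in C. It is only a sketch and not kernel code:
every toy_* name is hypothetical, the `hashed` flag stands in for
!sk_unhashed(sk), group allocation is replaced by a caller-supplied struct,
and the -EINVAL fallback for a socket that has no group is an assumption,
since the filter.c hunk does not show the else branch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_GROUP_MAX 8

struct toy_sock;

/* Hypothetical stand-in for the reuseport group: an array of members. */
struct toy_group {
	struct toy_sock *socks[TOY_GROUP_MAX];
	size_t num_socks;
};

/* Hypothetical stand-in for struct sock, reduced to the fields involved. */
struct toy_sock {
	int reuseport;              /* models sk->sk_reuseport */
	int hashed;                 /* models !sk_unhashed(sk) */
	struct toy_group *group_cb; /* models sk->sk_reuseport_cb */
};

/* Returns 1 if sk is listed in the group's array, 0 otherwise. */
static int toy_is_member(const struct toy_group *g, const struct toy_sock *sk)
{
	size_t i;

	for (i = 0; i < g->num_socks; i++)
		if (g->socks[i] == sk)
			return 1;
	return 0;
}

/*
 * Models the patched check: a group is only set up for an unhashed
 * socket that already has reuseport set. The array slot and the back
 * pointer are assigned together, which is the invariant the commit
 * message describes. The -EINVAL branch is an assumption.
 */
static int toy_attach_prog(struct toy_sock *sk, struct toy_group *fresh)
{
	assert(sk != NULL && fresh != NULL);

	if (!sk->hashed && sk->reuseport) {
		fresh->socks[0] = sk;
		fresh->num_socks = 1;
		sk->group_cb = fresh;
	} else if (sk->group_cb == NULL) {
		return -EINVAL;
	}
	return 0;
}

/*
 * Models the clone fix: the child starts as a copy of the parent, so
 * the inherited group pointer has to be cleared explicitly. The child
 * was never added to the group's array.
 */
static void toy_clone(struct toy_sock *newsk, const struct toy_sock *sk)
{
	assert(newsk != NULL && sk != NULL);

	*newsk = *sk;
	newsk->group_cb = NULL;
}
```

In this model, attaching to a socket without `reuseport` set fails, and a
clone of a group member keeps the `reuseport` flag but ends up with a NULL
group pointer, so no socket ever points at a group that does not list it.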