From patchwork Mon Sep 21 16:52:04 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Mack X-Patchwork-Id: 520370 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 0CEC8140187 for ; Tue, 22 Sep 2015 02:52:16 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932867AbbIUQwL (ORCPT ); Mon, 21 Sep 2015 12:52:11 -0400 Received: from svenfoo.org ([82.94.215.22]:53115 "EHLO mail.zonque.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932395AbbIUQwJ (ORCPT ); Mon, 21 Sep 2015 12:52:09 -0400 Received: from localhost (localhost [127.0.0.1]) by mail.zonque.de (Postfix) with ESMTP id 96E7EC02BF; Mon, 21 Sep 2015 18:52:05 +0200 (CEST) Received: from mail.zonque.de ([127.0.0.1]) by localhost (rambrand.bugwerft.de [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id zo-vyLAAfZel; Mon, 21 Sep 2015 18:52:05 +0200 (CEST) Received: from [192.168.178.71] (p5DDC7790.dip0.t-ipconnect.de [93.220.119.144]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.zonque.de (Postfix) with ESMTPSA id F2103C0197; Mon, 21 Sep 2015 18:52:04 +0200 (CEST) From: Daniel Mack Subject: Re: [PATCH RFC 0/3] Allow postponed netfilter handling for socket matches To: Florian Westphal References: <1442418180-14322-1-git-send-email-daniel@zonque.org> <20150916212100.GT24810@breakpoint.cc> <55FA902F.50006@zonque.org> <20150917160008.GA32700@breakpoint.cc> Cc: pablo@netfilter.org, daniel@iogearbox.net, netfilter-devel@vger.kernel.org, netdev@vger.kernel.org, balazs.scheidler@balabit.com X-Enigmail-Draft-Status: N1110 Message-ID: <560035B4.9010504@zonque.org> Date: Mon, 21 Sep 2015 18:52:04 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 In-Reply-To: <20150917160008.GA32700@breakpoint.cc> Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Hi, Thanks for your feedback, Florian! On 09/17/2015 06:00 PM, Florian Westphal wrote: > Daniel Mack wrote: >> That would be a new netfilter hook then, something that is called after >> LOCAL_IN, for ingress only? In a sense, it would be called from the >> protocol handlers, just as my patches do right now, but instead of >> conditionally re-iterating the same rules again, we would walk a >> different chain? > > Yes, something like that. Obviously, you'll need to dru^W brib^W > convince a LOT of people before that could ever fly. > > I think we should not do this and that this 'match on ingress sk > properties' is just bad[tm]. > > f.e. you'd also have to move all of the stuff you want into > sock_common ... 8-( Hmm, I'm not sure whether I understand which problems you see, or which corner cases I am missing in my assessment. I did a quick test with the attached 4 patches that 1) Allow hook callbacks to look at the socket passed to nf_hook(), so skb->sk does not have to be set 2) Make nft_meta look at pkt->sk rather that skb->sk (only for cgroups as proof of concept) 3) Introduce a new POST_DEMUX netfilter chain (the name is not perfect, admittedly) 4) Iterate POST_DEMUX chains for v4 TCP and UDP unicast+multicast sockets. With some really trivial modifications to libnftnl/nftables (which just map strings to the new enum value), this works fine in my tests. Multicast receivers that match a netclass ID in the ruleset won't see any packets, while others do. Some more considerations: if we cannot determine a socket for a packet and hence don't deliver it, it's IMO perfectly fine not to run the netfilter rules for them. All we need to achieve with this chain is that for packets that _are_ delivered to a socket, all the necessary rules have been processed, at a time when we know who the final receiver of the skb is. I'm happy to discuss the side effects of such an approach. Thanks, Daniel From 1898df7d6a35967972bae412994623a8d7c262cd Mon Sep 17 00:00:00 2001 From: Daniel Mack Date: Wed, 16 Sep 2015 14:58:08 +0200 Subject: [PATCH RFC 4/4] net: tcp_ipv4, udp_ipv4: hook up post demux netfilter chains Run the POST_DEMUX netfilter chain rules after the destination socket has been looked up. Signed-off-by: Daniel Mack --- net/ipv4/tcp_ipv4.c | 8 ++++++++ net/ipv4/udp.c | 15 +++++++++++++++ 2 files changed, 23 insertions(+) diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 93898e0..33f968e 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -78,6 +78,7 @@ #include #include +#include #include #include #include @@ -1594,6 +1595,13 @@ int tcp_v4_rcv(struct sk_buff *skb) if (!sk) goto no_tcp_socket; + ret = nf_hook(NFPROTO_IPV4, NF_INET_POST_DEMUX, sk, + skb, skb->dev, NULL, NULL); + if (ret != 1) { + sock_put(sk); + return 0; + } + process: if (sk->sk_state == TCP_TIME_WAIT) goto do_time_wait; diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index c0a15e7..0056c20 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -97,6 +97,7 @@ #include #include #include +#include #include #include #include @@ -1632,7 +1633,14 @@ static void flush_stack(struct sock **stack, unsigned int count, struct sock *sk; for (i = 0; i < count; i++) { + int ret; sk = stack[i]; + + ret = nf_hook(NFPROTO_IPV4, NF_INET_POST_DEMUX, sk, + skb, skb->dev, NULL, NULL); + if (ret != 1) + continue; + if (likely(!skb1)) skb1 = (i == final) ? skb : skb_clone(skb, GFP_ATOMIC); @@ -1819,6 +1827,13 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct udp_table *udptable, if (sk) { int ret; + ret = nf_hook(NFPROTO_IPV4, NF_INET_POST_DEMUX, sk, + skb, skb->dev, NULL, NULL); + if (ret != 1) { + sock_put(sk); + return 0; + } + if (inet_get_convert_csum(sk) && uh->check && !IS_UDPLITE(sk)) skb_checksum_try_convert(skb, IPPROTO_UDP, uh->check, inet_compute_pseudo); -- 2.5.0