Message ID | 1246379267.3749.42.camel@blaa |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
Headers | show |
On Tue, Jun 30, 2009 at 05:27:47PM +0100, Mark McLoughlin wrote: > > However, because nf_conntrack introduces an skb_orphan(), it is now > recommended that bridge-nf-call-iptables be disabled completely so as > to ensure features like TUNSETSNDBUF work as expected. Patrick, does conntrack ever make sense for bridging? Perhaps we should get rid of that completely? Cheers,
From: Herbert Xu <herbert@gondor.apana.org.au> Date: Wed, 1 Jul 2009 01:00:27 +0800 Adding appropriate lists and persons to CC: > On Tue, Jun 30, 2009 at 05:27:47PM +0100, Mark McLoughlin wrote: >> >> However, because nf_conntrack introduces an skb_orphan(), it is now >> recommended that bridge-nf-call-iptables be disabled completely so as >> to ensure features like TUNSETSNDBUF work as expected. > > Patrick, does conntrack ever make sense for bridging? Perhaps > we should get rid of that completely? > > Cheers, > -- > Visit Openswan at http://www.openswan.org/ > Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> > Home Page: http://gondor.apana.org.au/~herbert/ > PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tuesday 2009-06-30 21:06, David Miller wrote: >Adding appropriate lists and persons to CC: > >> On Tue, Jun 30, 2009 at 05:27:47PM +0100, Mark McLoughlin wrote: >>> >>> However, because nf_conntrack introduces an skb_orphan(), it is now >>> recommended that bridge-nf-call-iptables be disabled completely so as >>> to ensure features like TUNSETSNDBUF work as expected. >> >> Patrick, does conntrack ever make sense for bridging? Perhaps >> we should get rid of that completely? It makes sense absolutely. Consider: * packet enters bridge * NF_HOOK(PF_INET6, NF_INET_PRE_ROUTING, ...) is called by nr_netfilter.c * (connection tracking entry is set up) * let bridging decision be "local delivery" -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi, On Tue, 30 Jun 2009 22:16:35 +0200 (CEST) Jan Engelhardt <jengelh@medozas.de> wrote: > > On Tuesday 2009-06-30 21:06, David Miller wrote: > >Adding appropriate lists and persons to CC: > > > >> On Tue, Jun 30, 2009 at 05:27:47PM +0100, Mark McLoughlin wrote: > >>> > >>> However, because nf_conntrack introduces an skb_orphan(), it is now > >>> recommended that bridge-nf-call-iptables be disabled completely so as > >>> to ensure features like TUNSETSNDBUF work as expected. > >> > >> Patrick, does conntrack ever make sense for bridging? Perhaps > >> we should get rid of that completely? > > It makes sense absolutely. Consider: > > * packet enters bridge > * NF_HOOK(PF_INET6, NF_INET_PRE_ROUTING, ...) is called by nr_netfilter.c > * (connection tracking entry is set up) > * let bridging decision be "local delivery" I really like this feature, although it is only because I've spent time thinking about it, and it's usefulness, after having been burnt quite a lot by it, due to it's quite strange side effects on traffic. e.g. it'll defragment bridged IP packets, and then, if the outbound bridge interface MTU is big enough, will send off large single ethernet frames. If you're not expecting that, or don't work out what is going on, you'll believe you're seeing the input traffic in the form it arrived, and you'll believe it for quite a long time :-( I'm not sure if it supposed to work on IP traffic carried within bridge PPPoE/PPP, but it does - and that was very, very confusing to work out what had happened too. PPPoE comes up, as does PPP and IPCP, but forwarded IP packets are dropped unless there is a FORWARD iptables rule. I do agree though, either it should default to off, or the behaviour be made far more prominent somehow. Regards, Mark. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jun 30, 2009 at 10:16:35PM +0200, Jan Engelhardt wrote: > > It makes sense absolutely. Consider: > > * packet enters bridge > * NF_HOOK(PF_INET6, NF_INET_PRE_ROUTING, ...) is called by nr_netfilter.c > * (connection tracking entry is set up) > * let bridging decision be "local delivery" No, my question is does it ever make sense to use conntrack as part of bridge netfilter. That is, do you ever want to test it in your rules that are run as part of bridge netfilter. conntrack is inherently a security hole when used as part of bridging, because it ignores the Ethernet header so two unrelated connections can be tracked as one. It used to be an even bigger security hole when we ignored VLAN and PPPOE headers but at least that's now off by default. Cheers,
On Wed, Jul 01, 2009 at 06:27:37AM +0930, Mark Smith wrote: > > I'm not sure if it supposed to work on IP traffic carried within > bridge PPPoE/PPP, but it does - and that was very, very confusing to > work out what had happened too. PPPoE comes up, as does PPP and IPCP, > but forwarded IP packets are dropped unless there is a FORWARD > iptables rule. At least this bit should be off by default now, along as VLAN decoding which was a gaping hole. Cheers,
From: Mark McLoughlin <markmc@redhat.com> Date: Tue, 30 Jun 2009 17:27:47 +0100 > For these reasons, it makes sense to allow distributions to disable > netfilter on the bridge by default and require those specialized users > to enable it explicitly via sysctl. I heard that distributions ship some file, what's it called... something like /etc/sysctl.conf :-) Really, if someone thinkgs the default stinks and dists don't like it for their users, they can use sysctl.conf to set it how they please. Notwithstanding that changing this default can break working setups and scripts. Yes they can "change", but they were just (rightly) using the kernel as it came to them. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wednesday 2009-07-01 03:15, Herbert Xu wrote: >On Tue, Jun 30, 2009 at 10:16:35PM +0200, Jan Engelhardt wrote: >> >> It makes sense absolutely. Consider: >> >> * packet enters bridge >> * NF_HOOK(PF_INET6, NF_INET_PRE_ROUTING, ...) is called by nr_netfilter.c >> * (connection tracking entry is set up) >> * let bridging decision be "local delivery" > >No, my question is does it ever make sense to use conntrack as >part of bridge netfilter. That is, do you ever want to test it >in your rules that are run as part of bridge netfilter. There is the possibility that some users have -m conntrack in their mangle table in the PREROUTING chain. However, I am pretty sure that if there are such users, they do it because of the layer-3/4/5/7 part and not care about bridge so much. >conntrack is inherently a security hole when used as part of >bridging, because it ignores the Ethernet header so two unrelated >connections can be tracked as one. On secondary thought, one could also argue that because conntrack ignores the interface, two unrelated connections happening to be routed through the same machine(*) are tracked as one, too. (*) ip rule iif eth0 table 7 ip rule iif eth1 table 7 ip rule iif eth2 table 8 ip rule iif eth3 table 8 ip route add 2003::1/128 dev eth0 table 7 ip route add 2003::2/128 dev eth1 table 7 ip route add 2003::1/128 dev eth2 table 8 ip route add 2003::2/128 dev eth3 table 8 (four single nets, four single hosts) In Unicode: A 0 ┌─║───┐ B1══╝ │ │ ╔══2C └───║─┘ 3 D -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jul 01, 2009 at 05:50:18AM +0200, Jan Engelhardt wrote: > > On secondary thought, one could also argue that because conntrack > ignores the interface, two unrelated connections happening to be routed > through the same machine(*) are tracked as one, too. Good point. We really should make these risks much more explicit. However, I still think the risk with bridging is higher, especially in the presence of virtualisation. Consider the scenario where you have to VMs on the one host, each with a dedicated bridge with the intention that neither should know anything about the other's traffic. With conntrack running as part of bridging, the traffic can now cross over which is a serious security hole. Cheers,
Jan Engelhardt wrote: > On Wednesday 2009-07-01 03:15, Herbert Xu wrote: > >> On Tue, Jun 30, 2009 at 10:16:35PM +0200, Jan Engelhardt wrote: >> >>> It makes sense absolutely. Consider: >>> >>> * packet enters bridge >>> * NF_HOOK(PF_INET6, NF_INET_PRE_ROUTING, ...) is called by nr_netfilter.c >>> * (connection tracking entry is set up) >>> * let bridging decision be "local delivery" >>> >> No, my question is does it ever make sense to use conntrack as >> part of bridge netfilter. That is, do you ever want to test it >> in your rules that are run as part of bridge netfilter. >> > > There is the possibility that some users have -m conntrack in their > mangle table in the PREROUTING chain. However, I am pretty sure that > if there are such users, they do it because of the layer-3/4/5/7 part > and not care about bridge so much. > > >> conntrack is inherently a security hole when used as part of >> bridging, because it ignores the Ethernet header so two unrelated >> connections can be tracked as one. >> > > On secondary thought, one could also argue that because conntrack > ignores the interface, two unrelated connections happening to be routed > through the same machine(*) are tracked as one, too. > I had a similar problem while trying to implement virtual routers using different routing tables and something like 'veth' to connect them. My solution was to add a 'mark' field to the netdevice and allow user-space to set the mark on the device. This mark was included as part of the connection identifier. The mark is set before the pkt hits the bridge code on ingress, so a pkt entering from eth1 can get a different connection hash from a pkt entering eth2, even if all other data in the packet is the same. I'll dig it out of my monster patch if something like this is deemed useful upstream. Thanks, Ben
Mark McLoughlin wrote: > With BRIDGE_NETFILTER enabled, bridge traffic is passed through > netfilter as it is forwarded across the bridge. This is a useful > feature in specialized cases where the admin wishes to filter bridge > traffic based on higher-level protocol headers. > > However, in a lot of cases, it causes a large amount of confusion > since it is so counter-intuitive - nobody expects their IP firewall > rules to also apply to traffic on their bridges. > > This is especially true for virtualization, where users create a > bridge and find that some types of traffic work and others don't, and > it can take quite some time to identify iptables as the culprit. Users > are often recommended to configure their iptables rules to ACCEPT > "physdev-is-bridged" in order to avoid this confusion. > > However, because nf_conntrack introduces an skb_orphan(), it is now > recommended that bridge-nf-call-iptables be disabled completely so as > to ensure features like TUNSETSNDBUF work as expected. > > For these reasons, it makes sense to allow distributions to disable > netfilter on the bridge by default and require those specialized users > to enable it explicitly via sysctl. I agree that this makes sense, at least temporarily. Mid-term we should really fix the defaults, so it would be good to have a feature-removal-schedule and maybe a runtime warning stating that these defaults will change. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Herbert Xu wrote: > On Tue, Jun 30, 2009 at 05:27:47PM +0100, Mark McLoughlin wrote: >> However, because nf_conntrack introduces an skb_orphan(), it is now >> recommended that bridge-nf-call-iptables be disabled completely so as >> to ensure features like TUNSETSNDBUF work as expected. > > Patrick, does conntrack ever make sense for bridging? Perhaps > we should get rid of that completely? People are apparently using this for stateful tracking and even NAT on bridges, so I'm afraid we can't get rid of it completely. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Herbert Xu wrote: > On Tue, Jun 30, 2009 at 10:16:35PM +0200, Jan Engelhardt wrote: >> It makes sense absolutely. Consider: >> >> * packet enters bridge >> * NF_HOOK(PF_INET6, NF_INET_PRE_ROUTING, ...) is called by nr_netfilter.c >> * (connection tracking entry is set up) >> * let bridging decision be "local delivery" > > No, my question is does it ever make sense to use conntrack as > part of bridge netfilter. That is, do you ever want to test it > in your rules that are run as part of bridge netfilter. Probably not, but thats not how its used currently. The packets are passed to IP netfilter, which performs connection tracking. I'm not sure how we could avoid the negative effects while still allowing this. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jul 01, 2009 at 10:57:02AM +0200, Patrick McHardy wrote: > > People are apparently using this for stateful tracking and even > NAT on bridges, so I'm afraid we can't get rid of it completely. I think we should print out a nasty message warning the user if we detect that conntrack is used by bridge netfilter for the first time. If the user persists then at least they can't sue us for not warning them :) Cheers,
Herbert Xu wrote: > On Wed, Jul 01, 2009 at 10:57:02AM +0200, Patrick McHardy wrote: >> People are apparently using this for stateful tracking and even >> NAT on bridges, so I'm afraid we can't get rid of it completely. > > I think we should print out a nasty message warning the user > if we detect that conntrack is used by bridge netfilter for the > first time. If the user persists then at least they can't sue > us for not warning them :) Agreed, at least as long as this is still the default behaviour. Mark, could you add this to your patch? br_nf_pre_routing_finish() looks like a good place to print a warning when skb->nfct != NULL. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2009-06-30 at 20:16 -0700, David Miller wrote: > From: Mark McLoughlin <markmc@redhat.com> > Date: Tue, 30 Jun 2009 17:27:47 +0100 > > > For these reasons, it makes sense to allow distributions to disable > > netfilter on the bridge by default and require those specialized users > > to enable it explicitly via sysctl. > > I heard that distributions ship some file, what's it called... > something like /etc/sysctl.conf :-) > > Really, if someone thinkgs the default stinks and dists don't like it > for their users, they can use sysctl.conf to set it how they please. If upstream agrees the default stinks, then upstream could start making moves towards rectifying it :-) I like Patrick's idea of adding it to feature-removal-schedule > Notwithstanding that changing this default can break working > setups and scripts. Yes they can "change", but they were just > (rightly) using the kernel as it came to them. Yep, that's a valid concern. Weighing up the impact on the small number of people who use it versus the ongoing impact on everyone else, I think it's best to make the change. Cheers, Mark. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Mark McLoughlin wrote: > On Tue, 2009-06-30 at 20:16 -0700, David Miller wrote: >> From: Mark McLoughlin <markmc@redhat.com> >> Date: Tue, 30 Jun 2009 17:27:47 +0100 >> >>> For these reasons, it makes sense to allow distributions to disable >>> netfilter on the bridge by default and require those specialized users >>> to enable it explicitly via sysctl. >> I heard that distributions ship some file, what's it called... >> something like /etc/sysctl.conf :-) >> >> Really, if someone thinkgs the default stinks and dists don't like it >> for their users, they can use sysctl.conf to set it how they please. > > If upstream agrees the default stinks, then upstream could start making > moves towards rectifying it :-) > > I like Patrick's idea of adding it to feature-removal-schedule > >> Notwithstanding that changing this default can break working >> setups and scripts. Yes they can "change", but they were just >> (rightly) using the kernel as it came to them. > > Yep, that's a valid concern. > > Weighing up the impact on the small number of people who use it versus > the ongoing impact on everyone else, I think it's best to make the > change. I agree, this has already caused an endless amount of problems for users due to the unexpected behaviour and, in some cases, bugs. Dave's point is certainly valid as well, until we change the defaults, distributions can use sysctl.conf. But I think we should move towards changing the defaults. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Mark McLoughlin <markmc@redhat.com> Date: Wed, 01 Jul 2009 11:45:58 +0100 > Weighing up the impact on the small number of people who use it versus > the ongoing impact on everyone else, I think it's best to make the > change. Consider the posibility that the people using it aren't saying anything because things just-work for them right now. If you change this default, I guarentee we will hear from them. "everyone else" is making noise only because the default isn't what they want. There is no other reason. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Patrick McHardy <kaber@trash.net> Date: Wed, 01 Jul 2009 12:51:06 +0200 > I agree, this has already caused an endless amount of problems for > users due to the unexpected behaviour and, in some cases, bugs. > > Dave's point is certainly valid as well, until we change the defaults, > distributions can use sysctl.conf. But I think we should move towards > changing the defaults. You must do this via something like your suggestion, the feature-removal-schedule.txt thing. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller wrote: > From: Patrick McHardy <kaber@trash.net> > Date: Wed, 01 Jul 2009 12:51:06 +0200 > >> I agree, this has already caused an endless amount of problems for >> users due to the unexpected behaviour and, in some cases, bugs. >> >> Dave's point is certainly valid as well, until we change the defaults, >> distributions can use sysctl.conf. But I think we should move towards >> changing the defaults. > > You must do this via something like your suggestion, the > feature-removal-schedule.txt thing. Yes, of course. But I prefer the runtime warning (at least on top), my feeling is not many people actually read feature-removal-schedule. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Patrick McHardy <kaber@trash.net> Date: Wed, 01 Jul 2009 18:05:40 +0200 > David Miller wrote: >> From: Patrick McHardy <kaber@trash.net> >> Date: Wed, 01 Jul 2009 12:51:06 +0200 >> >>> I agree, this has already caused an endless amount of problems for >>> users due to the unexpected behaviour and, in some cases, bugs. >>> >>> Dave's point is certainly valid as well, until we change the defaults, >>> distributions can use sysctl.conf. But I think we should move towards >>> changing the defaults. >> You must do this via something like your suggestion, the >> feature-removal-schedule.txt thing. > > Yes, of course. But I prefer the runtime warning (at least on top), > my feeling is not many people actually read feature-removal-schedule. Ok. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jul 01, 2009 at 09:02:02AM -0700, David Miller wrote: > > Consider the posibility that the people using it aren't saying > anything because things just-work for them right now. > > If you change this default, I guarentee we will hear from them. > > "everyone else" is making noise only because the default isn't > what they want. There is no other reason. FWIW I don't really care what we have as the default for bridge netfilter. I just want to make sure that people who do have bridge netfilter (and in particular, conntrack + bridge) active on their machines are aware of the security implications. Otherwise we'd be negligent. As you said distros can change the default regardless of what the kernel does. Cheers,
On Wed, Jul 01, 2009 at 11:21:44AM +0200, Patrick McHardy wrote: > > Agreed, at least as long as this is still the default behaviour. > Mark, could you add this to your patch? br_nf_pre_routing_finish() > looks like a good place to print a warning when skb->nfct != NULL. Here's a suggestion: Can we add another field to the conntrack tuple? This would be used to ensure that every bridge's conntrack is distinct from each other, as well as that of the system IP stack. Cheers,
Herbert Xu wrote: > On Wed, Jul 01, 2009 at 11:21:44AM +0200, Patrick McHardy wrote: >> Agreed, at least as long as this is still the default behaviour. >> Mark, could you add this to your patch? br_nf_pre_routing_finish() >> looks like a good place to print a warning when skb->nfct != NULL. > > Here's a suggestion: > > Can we add another field to the conntrack tuple? This would be used > to ensure that every bridge's conntrack is distinct from each other, > as well as that of the system IP stack. We would need a key that can be uniquely determined at all points and that can be inverted (taking into account ebtables NAT, NAT to a different bridge etc) - I can't think of a suitable one right now. But besides the conntrack size increase, I don't think this is the correct solution for this problem. Defragmentation (before conntrack) would still allow fragments to cross boundaries, unless we key the defragmentation queues using the same key. And generally defragmenting bridged packets by default, possibly passing them through NAT, IP routing etc. is simply wrong and only (somewhat) works in certain scenarios. Helpers might get confused when the same packet is flooded to multiple output ports, IPsec policies might magically get applied, etc etc. The best way to make people aware of all these implications and avoid unsuspecting people running into this again and again would be to change the defaults and have people think before they use this. Long term I think this needs to be completely redesigned. And for the record, I don't believe that this is used a lot and we're just not aware because it simply works. The fact is it always had major problems that we fixed as good as possible over the years, but I'm pretty certain I've heard from just about every user of this at least once :) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 01 Jul 2009 18:05:40 +0200 Patrick McHardy <kaber@trash.net> wrote: > David Miller wrote: > > From: Patrick McHardy <kaber@trash.net> > > Date: Wed, 01 Jul 2009 12:51:06 +0200 > > > >> I agree, this has already caused an endless amount of problems for > >> users due to the unexpected behaviour and, in some cases, bugs. > >> > >> Dave's point is certainly valid as well, until we change the defaults, > >> distributions can use sysctl.conf. But I think we should move towards > >> changing the defaults. > > > > You must do this via something like your suggestion, the > > feature-removal-schedule.txt thing. > > Yes, of course. But I prefer the runtime warning (at least on top), > my feeling is not many people actually read feature-removal-schedule. > Reviewing my earlier email, I realised I didn't say why I liked it. Yes, I've been burned by it quite a bit, but that was because I wasn't aware of it. I do see a lot of value in it for "layer 3 transparent" firewalling. Adding a firewall to a network can be a bit of an effort as it may involve changing the networks routing configuration, and consequently all the things that involves e.g. renumbering hosts, spitting up subnets or adding new ones. Being able to insert a layer 3 transparent firewalling device between the upstream router and the downstream hosts would be far, far easier. With it being able to firewall bridged PPPoE/PPP traffic, potentially made it even more useful, although in less common cases. For example, I have a number of devices at home that are themselves running PPPoE/PPP, rather than having a single upstream router running it. If I wasn't confident of the firewalling capabilities of each of those devices, I could insert a layer 3 transparent iptables firewall, and add another level of firewalling to the PPPoE/PPP encapsulated traffic. So, I'd certainly like the feature to stay. It just needs to either not be on by default, or the default made more obvious and a method added to make it easy to switch off. Thanks, Mark. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/Kconfig b/net/Kconfig index 7051b97..b5f4379 100644 --- a/net/Kconfig +++ b/net/Kconfig @@ -153,6 +153,18 @@ config BRIDGE_NETFILTER If unsure, say N. +config BRIDGE_NETFILTER_DEFAULT_ON + def_bool y + prompt "Enable netfilter on the bridge by default" + depends on BRIDGE_NETFILTER && SYSCTL + ---help--- + Selecting this option will enable netfilter iptables + etc. rules on bridges by default. This means that netfilter + iptables rules will apply to frames forwarded across the + bridge. If this option is not selected, it can be enabled at + runtime using the net.bridge.bridge-nf-call-*tables sysctl + settings. + source "net/netfilter/Kconfig" source "net/ipv4/netfilter/Kconfig" source "net/ipv6/netfilter/Kconfig" diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c index d22f611..ed53e21 100644 --- a/net/bridge/br_netfilter.c +++ b/net/bridge/br_netfilter.c @@ -55,9 +55,15 @@ #ifdef CONFIG_SYSCTL static struct ctl_table_header *brnf_sysctl_header; +#ifdef CONFIG_BRIDGE_NETFILTER_DEFAULT_ON static int brnf_call_iptables __read_mostly = 1; static int brnf_call_ip6tables __read_mostly = 1; static int brnf_call_arptables __read_mostly = 1; +#else +static int brnf_call_iptables __read_mostly = 0; +static int brnf_call_ip6tables __read_mostly = 0; +static int brnf_call_arptables __read_mostly = 0; +#endif static int brnf_filter_vlan_tagged __read_mostly = 0; static int brnf_filter_pppoe_tagged __read_mostly = 0; #else
With BRIDGE_NETFILTER enabled, bridge traffic is passed through netfilter as it is forwarded across the bridge. This is a useful feature in specialized cases where the admin wishes to filter bridge traffic based on higher-level protocol headers. However, in a lot of cases, it causes a large amount of confusion since it is so counter-intuitive - nobody expects their IP firewall rules to also apply to traffic on their bridges. This is especially true for virtualization, where users create a bridge and find that some types of traffic work and others don't, and it can take quite some time to identify iptables as the culprit. Users are often recommended to configure their iptables rules to ACCEPT "physdev-is-bridged" in order to avoid this confusion. However, because nf_conntrack introduces an skb_orphan(), it is now recommended that bridge-nf-call-iptables be disabled completely so as to ensure features like TUNSETSNDBUF work as expected. For these reasons, it makes sense to allow distributions to disable netfilter on the bridge by default and require those specialized users to enable it explicitly via sysctl. Signed-off-by: Mark McLoughlin <markmc@redhat.com> --- net/Kconfig | 12 ++++++++++++ net/bridge/br_netfilter.c | 6 ++++++ 2 files changed, 18 insertions(+), 0 deletions(-)