Message ID | 20240726015837.14572-2-fw@strlen.de |
---|---|
State | Changes Requested |
Headers | show |
Series | list hooks refactoring | expand |
On Fri, Jul 26, 2024 at 03:58:28AM +0200, Florian Westphal wrote: > Add a brief segment about 'nft list hooks' and a summary > of the output format. > > As nft.txt is quite large, split the additonal commands > into their own file. > > The existing listing section is removed; list subcommand is > already mentioned in the relevant statement sections. > > Reported-by: Phil Sutter <phil@nwl.cc> > Signed-off-by: Florian Westphal <fw@strlen.de> > --- > Makefile.am | 1 + > doc/additional-commands.txt | 115 ++++++++++++++++++++++++++++++++++++ > doc/nft.txt | 63 +------------------- > 3 files changed, 117 insertions(+), 62 deletions(-) > create mode 100644 doc/additional-commands.txt > > diff --git a/Makefile.am b/Makefile.am > index 9088170bfc68..ef198dafcbc8 100644 > --- a/Makefile.am > +++ b/Makefile.am > @@ -322,6 +322,7 @@ A2X_OPTS_MANPAGE = \ > ASCIIDOC_MAIN = doc/nft.txt > > ASCIIDOC_INCLUDES = \ > + doc/additional-commands.txt \ > doc/data-types.txt \ > doc/payload-expression.txt \ > doc/primary-expression.txt \ > diff --git a/doc/additional-commands.txt b/doc/additional-commands.txt > new file mode 100644 > index 000000000000..dd1b3d2d87d4 > --- /dev/null > +++ b/doc/additional-commands.txt > @@ -0,0 +1,115 @@ > +LIST HOOKS > +~~~~~~~~~~ > + > +This shows the low-level netfilter processing pipeline, including > +functions registered by kernel modules such as nf_conntrack. + > + > +[verse] > +____ > +*list hooks* ['family'] > +*list hooks netdev device* 'DEVICE_NAME' > +____ > + > +*list hooks* is enough to display everything that is active > +on the system, however, it does currently omit hooks that are > +tied to a specific network device (netdev family). To obtain > +those, the network device needs to be queried by name. IIRC, the idea is to display the ingress path pipeline according to the device (if specified) list hooks netdev eth0 as for egress, as it is not possible to know where the packet is going, it is probably good to allow the user to specify the output device, so it gets the entire pipeline for ingress and egress paths, ie. list hooks netdev eth0 eth1 Note that this is not implemented. This has limitations, discovering eth{0,1} belongs to bridge device would need more work (not asking to do this now, but it could be a nice usability feature to discover the pipeline?).
Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > +*list hooks* is enough to display everything that is active > > +on the system, however, it does currently omit hooks that are > > +tied to a specific network device (netdev family). To obtain > > +those, the network device needs to be queried by name. > > IIRC, the idea is to display the ingress path pipeline according to > the device (if specified) > > list hooks netdev eth0 > > as for egress, as it is not possible to know where the packet is > going, it is probably good to allow the user to specify the output > device, so it gets the entire pipeline for ingress and egress > paths, ie. > > list hooks netdev eth0 eth1 Not really, why would eth0 and eth1 be related here? What would make more sense to me is to allow list hooks netdev and then have nft fetch list of all network devices and then query them all. If a packet coming in on devX will be forwarded to devY depends on the type of packet and the configuration, e.g. arp/ip vs. bridge/routing or even encapsulation... > Note that this is not implemented. This has limitations, discovering > eth{0,1} belongs to bridge device would need more work (not asking to > do this now, but it could be a nice usability feature to discover the > pipeline?). Bridge? I don't think we have bridge family support for netdev hooks? AFAIU its only netdev and inet. This thing should only list the nf hooks registered for the device, and not start to guess. So for "list hooks br0", return ingress and egress hooks for the virtual device, not the bridge ports.
Hi Florian, On Fri, Jul 26, 2024 at 02:31:52PM +0200, Florian Westphal wrote: > Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > > +*list hooks* is enough to display everything that is active > > > +on the system, however, it does currently omit hooks that are > > > +tied to a specific network device (netdev family). To obtain > > > +those, the network device needs to be queried by name. > > > > IIRC, the idea is to display the ingress path pipeline according to > > the device (if specified) > > > > list hooks netdev eth0 > > > > as for egress, as it is not possible to know where the packet is > > going, it is probably good to allow the user to specify the output > > device, so it gets the entire pipeline for ingress and egress > > paths, ie. > > > > list hooks netdev eth0 eth1 > > Not really, why would eth0 and eth1 be related here? Note that you can specify: list hooks ip device enp0s25 this shows the hooks that will be exercised for a given packet family, ie. IPv4 packets will exercise the following hooks. family ip { hook ingress { 0000000000 chain netdev x y [nf_tables] } hook prerouting { -0000000400 ipv4_conntrack_defrag [nf_defrag_ipv4] -0000000200 ipv4_conntrack_in [nf_conntrack] } hook input { 0000000000 chain ip filter in [nf_tables] +2147483647 nf_confirm [nf_conntrack] } hook forward { -0000000225 selinux_ip_forward } hook output { -0000000400 ipv4_conntrack_defrag [nf_defrag_ipv4] -0000000225 selinux_ip_output -0000000200 ipv4_conntrack_local [nf_conntrack] } hook postrouting { +0000000225 selinux_ip_postroute +2147483647 nf_confirm [nf_conntrack] } } This is _not_ showing the list of hooks for a given family. What I meant is that user could filter out by ingress and egress device to fetch the hooks that are traversed in such case, ie. list hooks ip iifname eth0 oifname eth1 to get the traversal of hooks for IPv4 packets, assuming eth0 as ingress device and eth1 as egress device. > What would make more sense to me is to allow > > list hooks netdev > > and then have nft fetch list of all network devices and then query them > all. Makes sense, it currently fails with EINVAL because no device has been specified. > If a packet coming in on devX will be forwarded to devY depends on the > type of packet and the configuration, e.g. arp/ip vs. bridge/routing > or even encapsulation... > > > Note that this is not implemented. This has limitations, discovering > > eth{0,1} belongs to bridge device would need more work (not asking to > > do this now, but it could be a nice usability feature to discover the > > pipeline?). > > Bridge? I don't think we have bridge family support for netdev hooks? > AFAIU its only netdev and inet. > > This thing should only list the nf hooks registered for the device, > and not start to guess. So for "list hooks br0", return ingress and > egress hooks for the virtual device, not the bridge ports. OK
Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > Not really, why would eth0 and eth1 be related here? > > Note that you can specify: > > list hooks ip device enp0s25 > > this shows the hooks that will be exercised for a given packet family, > ie. IPv4 packets will exercise the following hooks. > > family ip { > hook ingress { > 0000000000 chain netdev x y [nf_tables] > } > hook prerouting { > -0000000400 ipv4_conntrack_defrag [nf_defrag_ipv4] > -0000000200 ipv4_conntrack_in [nf_conntrack] > } > hook input { > 0000000000 chain ip filter in [nf_tables] > +2147483647 nf_confirm [nf_conntrack] > } > hook forward { > -0000000225 selinux_ip_forward > } > hook output { > -0000000400 ipv4_conntrack_defrag [nf_defrag_ipv4] > -0000000225 selinux_ip_output > -0000000200 ipv4_conntrack_local [nf_conntrack] > } > hook postrouting { > +0000000225 selinux_ip_postroute > +2147483647 nf_confirm [nf_conntrack] > } > } > > This is _not_ showing the list of hooks for a given family. I now realize that whats in the tree today is not what I wrote originally. So this is neither showing the hooks that will be execrised (packet can't be input and forward...). But ok. I don't know what to do now. > What I meant is that user could filter out by ingress and egress > device to fetch the hooks that are traversed in such case, ie. > > list hooks ip iifname eth0 oifname eth1 > > to get the traversal of hooks for IPv4 packets, assuming eth0 as > ingress device and eth1 as egress device. No idea how to make this, or I fail to understand. > > What would make more sense to me is to allow > > > > list hooks netdev > > > > and then have nft fetch list of all network devices and then query them > > all. > > Makes sense, it currently fails with EINVAL because no device has been > specified. I'll try to add it, but I don't know if I should toss these patches first or not :/
On Mon, Jul 29, 2024 at 01:37:36AM +0200, Florian Westphal wrote: > Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > > Not really, why would eth0 and eth1 be related here? > > > > Note that you can specify: > > > > list hooks ip device enp0s25 > > > > this shows the hooks that will be exercised for a given packet family, > > ie. IPv4 packets will exercise the following hooks. > > > > family ip { > > hook ingress { > > 0000000000 chain netdev x y [nf_tables] > > } > > hook prerouting { > > -0000000400 ipv4_conntrack_defrag [nf_defrag_ipv4] > > -0000000200 ipv4_conntrack_in [nf_conntrack] > > } > > hook input { > > 0000000000 chain ip filter in [nf_tables] > > +2147483647 nf_confirm [nf_conntrack] > > } > > hook forward { > > -0000000225 selinux_ip_forward > > } > > hook output { > > -0000000400 ipv4_conntrack_defrag [nf_defrag_ipv4] > > -0000000225 selinux_ip_output > > -0000000200 ipv4_conntrack_local [nf_conntrack] > > } > > hook postrouting { > > +0000000225 selinux_ip_postroute > > +2147483647 nf_confirm [nf_conntrack] > > } > > } > > > > This is _not_ showing the list of hooks for a given family. > > I now realize that whats in the tree today is not what I wrote originally. We agreed to change it. > So this is neither showing the hooks that will be execrised (packet > can't be input and forward...). But ok. I don't know what to do now. As it is not possible to know where the packet is going (input / forward), this listing represents what hooks can be potentially exercised for such packet family, hence, netdev hooks are included in the ip family view. If you don't like this behaviour and prefer a strict view per hook family it should be possible to revisit. User will have to get all the pieces together to understand what the hook pipeline is instead. This command has not been documented so far. > > What I meant is that user could filter out by ingress and egress > > device to fetch the hooks that are traversed in such case, ie. > > > > list hooks ip iifname eth0 oifname eth1 > > > > to get the traversal of hooks for IPv4 packets, assuming eth0 as > > ingress device and eth1 as egress device. > > No idea how to make this, or I fail to understand. Not important for this series, this can be extended later. > > > What would make more sense to me is to allow > > > > > > list hooks netdev > > > > > > and then have nft fetch list of all network devices and then query them > > > all. > > > > Makes sense, it currently fails with EINVAL because no device has been > > specified. > > I'll try to add it, but I don't know if I should toss these patches > first or not :/ I think patchset looks fine, but documentation update needs a revamp if you agree in the existing behaviour.
Pablo Neira Ayuso <pablo@netfilter.org> wrote: > On Mon, Jul 29, 2024 at 01:37:36AM +0200, Florian Westphal wrote: > > Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > > > Not really, why would eth0 and eth1 be related here? > > > > > > Note that you can specify: > > > > > > list hooks ip device enp0s25 > > > > > > this shows the hooks that will be exercised for a given packet family, > > > ie. IPv4 packets will exercise the following hooks. > > > > > > family ip { > > > hook ingress { > > > 0000000000 chain netdev x y [nf_tables] > > > } > > > hook prerouting { > > > -0000000400 ipv4_conntrack_defrag [nf_defrag_ipv4] > > > -0000000200 ipv4_conntrack_in [nf_conntrack] > > > } > > > hook input { > > > 0000000000 chain ip filter in [nf_tables] > > > +2147483647 nf_confirm [nf_conntrack] > > > } > > > hook forward { > > > -0000000225 selinux_ip_forward > > > } > > > hook output { > > > -0000000400 ipv4_conntrack_defrag [nf_defrag_ipv4] > > > -0000000225 selinux_ip_output > > > -0000000200 ipv4_conntrack_local [nf_conntrack] > > > } > > > hook postrouting { > > > +0000000225 selinux_ip_postroute > > > +2147483647 nf_confirm [nf_conntrack] > > > } > > > } > > > > > > This is _not_ showing the list of hooks for a given family. > > > > I now realize that whats in the tree today is not what I wrote originally. > > We agreed to change it. Sorry. I'm old and do not remember. TL;DR: I think we should revert to strict behaviour, i.e. nft list hooks foo queries hooks registered as NFPROTO_FOO. NFPROTO_INET expands to shorthand for 'list hooks ip and ip6'. AFAICS the problem is that ip, ip6 and arp are both NFPROTO_ values (protocol families), but also l3 protocols, whereas netdev and bridge are only families. Hence, what to do on bridge becomes murky, there is just a large number of possible l3 protocols that can pass through these hooks. Netdev is simple because its scoped only to one single interface. Sorry for the wall of text below, but maybe it helps to understand why i think the above (no-guesses, treat strictly as nfproto) makes sense. > > So this is neither showing the hooks that will be execrised (packet > > can't be input and forward...). But ok. I don't know what to do now. > > As it is not possible to know where the packet is going (input / > forward), this listing represents what hooks can be potentially > exercised for such packet family, hence, netdev hooks are included in > the ip family view. I was confused because it lists netdev hooks when asking for bridge, and when I made this facility my intent was for this to return hooks that were registered with NFPROTO_$QUERY. So i did not expect to see netdev hooks. list hooks arp device eth0 my expectation: 1. no output OR 2. NFPROTO_ARP in and output hooks OR 3. error, e.g. "device not supported for arp family". But I get this instead: family arp { hook ingress { 0000000000 chain netdev ingress in_public [nf_tables] } } Which I considered a bug but if we say that the query should return hooks that this protocol family is exposed to, then that might be ok. Should it list bridge hooks too? Same rationale would apply as to why it shows the netdev family ingress hook: ARP packets will be picked up by bridge hooks if they arrive on interfaces that serve as a bridge port. I think that original code was simpler, less guesswork on userspace side, what-you-ask-for-is-what-you-get, i.e. for "list hooks arp" shows only hooks registered with NFPROTO_ARP on kernel side, and nothing else. For "list hooks arp device eth0", we could either return an error, or pass this to kernel in case we gain arp+device at some point, even though i don't see why anyone would add that. inet ingress is already awkward in my opinion, I'm not sure why it got added. > If you don't like this behaviour and prefer a strict view per hook > family it should be possible to revisit. User will have to get all the > pieces together to understand what the hook pipeline is instead. This > command has not been documented so far. Yes. In theory it should be possible to do this, so to go with arp example: nft list hooks arp device foo would list: 1. netdev ingress and egress hook for the queried device 2. arp in and output hooks (there is no forward for arp) 3. bridge pre,in,forward,output and postrouting but does that make sense? I don't think so, because it gets more complicated for bridge query: nft list hooks bridge 1. can't show netdev, we lack device to query -> query all interfaces? But why would one clutter output with netdev in/egress when bridge was asked for? 2. show pre/in/fwd/out/postrouting NFPROTO_BRIDGE hooks 3. should it show ip/ip6 hooks? They could be relevant in case of broute expression in nftables. ip/ip6 Output+postrouting are travesed in case of local packets. and it would have to show arp hooks, no? for arp packet, we might pass them up to local stack. Local arp query pass through arp:output. So I'm just worried that this gets complicated/murky as to what is shown (and what is omitted). For bridge, we would have to end up dumping everything and treat it like 'list hooks'.... I do admit that it would be nice to have something like: nft list pipeline meta input eth0 ip saddr 1.2.3.4 ip daddr 5.6.7.8 and then have nft: 1. list NFPROTO_NETDEV for eth0 ingress only (no egress) 2. list NFPROTO_INET hooks for ingress eth0 3. list NFPROTO_IPV4 hooks for prerouting 4. query FIB to get output device for 1.2.3.4->5.6.7.8 5. list NFPROTO_IPV4 for EITHER input or forward+postrouting 6. for forward case, list NFPROTO_NETDEV egress hooks for the fib-derviced output device But thats hard, because there might be rules in place that alter ip daddr or ip saddr, or packet mark etc etc so the pipeline shown can be a wrong. And bridge needs extra work, we need to figure out if eth0 is a bridge port and then dump NFPROTO_BRIDGE. And query l2 address of the ip address to query bridge fib to figure out of this enters local delivery or forwarding case, or both... Or, require use of 'ether daddr' for bridge. But still, its a lot of work to make this. > I think patchset looks fine, but documentation update needs a revamp > if you agree in the existing behaviour. I agree that we first need to figure out what 'nft list hooks xxxx' should do. I would prefer 'no guesses' approach, i.e. 1. nft list hooks dump everything EXCEPT netdev families/devices 2. nft list hooks netdev device foo dump ingress/egress netdev hooks, INCLUDING inet ingress (its scoped to the device). 3. nft list hooks arp dump arp input and output, if any 4. nft list hooks bridge dump bridge pre/input/forward/out/postrouting 5. nft list hooks ip dump ip pre/input/forward/out/postrouting 6. nft list hooks ip6 -> see above 7. nft list hooks inet -> alias for dumping both ip+ip6 hooks. This also means that i'd make 'inet device foo' return a warning that the device will be ignored because "inet ingress" is syntactic frontend sugar similar to inet pseudo-family. We could try the 'detect pipeline' later. But, as per example above, I don't think its easy, unless one omits all packet rewrites (stateless nat, dnat, redirect) and everything else that causes re-routing, e.g. policy routing, tproxy, tc(x), etc. And then there is l3 and l2 multicasting... But, admittingly, it might be nice to have something and it might be good enough if pipeline alterations by ruleset and other entities are omitted.
On Mon, Jul 29, 2024 at 05:32:11PM +0200, Florian Westphal wrote: [...] > TL;DR: I think we should revert to strict behaviour, i.e. > nft list hooks foo Agreed. > queries hooks registered as NFPROTO_FOO. > > NFPROTO_INET expands to shorthand for 'list hooks ip and ip6'. > > AFAICS the problem is that ip, ip6 and arp are both NFPROTO_ values > (protocol families), but also l3 protocols, whereas netdev and bridge > are only families. > > Hence, what to do on bridge becomes murky, there is just a large number > of possible l3 protocols that can pass through these hooks. > > Netdev is simple because its scoped only to one single interface. > > Sorry for the wall of text below, but maybe it helps to understand > why i think the above (no-guesses, treat strictly as nfproto) makes sense. Makes sense, thanks for explaining. [...] > > If you don't like this behaviour and prefer a strict view per hook > > family it should be possible to revisit. User will have to get all the > > pieces together to understand what the hook pipeline is instead. This > > command has not been documented so far. > > Yes. In theory it should be possible to do this, so to go with arp example: > > nft list hooks arp device foo > > would list: > > 1. netdev ingress and egress hook for the queried device > 2. arp in and output hooks (there is no forward for arp) > 3. bridge pre,in,forward,output and postrouting > > but does that make sense? I don't think so, because it gets more > complicated for bridge query: > > nft list hooks bridge > > 1. can't show netdev, we lack device to query -> query all interfaces? > But why would one clutter output with netdev in/egress when bridge > was asked for? > 2. show pre/in/fwd/out/postrouting NFPROTO_BRIDGE hooks > 3. should it show ip/ip6 hooks? They could be relevant in case > of broute expression in nftables. > ip/ip6 Output+postrouting are travesed in case of local packets. > > and it would have to show arp hooks, no? for arp packet, we might pass > them up to local stack. Local arp query pass through arp:output. > > So I'm just worried that this gets complicated/murky as to what is shown > (and what is omitted). For bridge, we would have to end up dumping > everything and treat it like 'list hooks'.... > > I do admit that it would be nice to have something like: > > nft list pipeline meta input eth0 ip saddr 1.2.3.4 ip daddr 5.6.7.8 > > and then have nft: > 1. list NFPROTO_NETDEV for eth0 ingress only (no egress) > 2. list NFPROTO_INET hooks for ingress eth0 > 3. list NFPROTO_IPV4 hooks for prerouting > 4. query FIB to get output device for 1.2.3.4->5.6.7.8 > 5. list NFPROTO_IPV4 for EITHER input or forward+postrouting > 6. for forward case, list NFPROTO_NETDEV egress hooks for the > fib-derviced output device > > But thats hard, because there might be rules in place that > alter ip daddr or ip saddr, or packet mark etc etc so the > pipeline shown can be a wrong. Yes, providing a pipeline description is a much wider task than just listing hooks. Please, move on with restoring list hooks. Thanks Florian.
Hi, On Mon, Jul 29, 2024 at 05:32:11PM +0200, Florian Westphal wrote: [...] > inet ingress is already awkward in my opinion, I'm not sure why it > got added. I suppose the ability to reuse other inet family ruleset parts for ingress was a sought-after feature. In hindsight, making tables family-agnostic and thus pure "containers" for grouping ruleset parts might be a superior design choice. (Or maybe I just miss the rub in doing that. :) [...] > I agree that we first need to figure out what 'nft list hooks xxxx' > should do. I would prefer 'no guesses' approach, i.e. Let me suggest a deviation which seems more user-friendly: > 1. nft list hooks > dump everything EXCEPT netdev families/devices Include netdev here, make it really list *all* hooks. Iterating over the list of currently existing NICs in this netns is no big deal, is it? > 2. nft list hooks netdev device foo > dump ingress/egress netdev hooks, > INCLUDING inet ingress (its scoped to the device). Drop 'netdev' from the syntax here. The output really is "all hooks specific to that NIC", not necessarily only netdev ones. (And "device" is a distinct identifier for network interfaces in nftables syntax.) > 3. nft list hooks arp > dump arp input and output, if any > 4. nft list hooks bridge > dump bridge pre/input/forward/out/postrouting > 5. nft list hooks ip > dump ip pre/input/forward/out/postrouting > 6. nft list hooks ip6 -> see above > 7. nft list hooks inet -> alias for dumping both ip+ip6 hooks. I wonder if this is more confusing than not - on the netfilter hooks layer, it doesn't quite exist. The only thing which persists a tad longer is inet ingress, indicated by nf_register_net_hook() passing it down to __nf_register_net_hook for nf_hook_entry_head() to return the same value as for netdev ingress. I guess they could be spliced even further up. > This also means that i'd make 'inet device foo' return a warning > that the device will be ignored because "inet ingress" is syntactic > frontend sugar similar to inet pseudo-family. Iff my claim holds true, there is no such thing as an inet hook and thus also no inet device one. :) > We could try the 'detect pipeline' later. But, as per example > above, I don't think its easy, unless one omits all packet rewrites > (stateless nat, dnat, redirect) and everything else that causes > re-routing, e.g. policy routing, tproxy, tc(x), etc. > > And then there is l3 and l2 multicasting... > > But, admittingly, it might be nice to have something and it might be > good enough if pipeline alterations by ruleset and other entities > are omitted. I seem to recall us discussing something like that on NFWS, but to simulate traversal of a packet with given properties through the ruleset. Which would also identify the hooks involved, I guess? Cheers, Phil
Hi Phil, Florian, @Florian, could you push out what you have to flush your queue in this front? On Tue, Aug 13, 2024 at 01:06:49PM +0200, Phil Sutter wrote: > On Mon, Jul 29, 2024 at 05:32:11PM +0200, Florian Westphal wrote: > [...] > Let me suggest a deviation which seems more user-friendly: > > > 1. nft list hooks > > dump everything EXCEPT netdev families/devices > > Include netdev here, make it really list *all* hooks. Iterating over > the list of currently existing NICs in this netns is no big deal, is > it? I like this suggestion. I think it should be possible to incrementally extend with these suggestions, it should be possible to submit patches for your review. > > 2. nft list hooks netdev device foo > > dump ingress/egress netdev hooks, > > INCLUDING inet ingress (its scoped to the device). > > Drop 'netdev' from the syntax here. The output really is "all hooks > specific to that NIC", not necessarily only netdev ones. (And "device" > is a distinct identifier for network interfaces in nftables syntax.) I think allowing 'device foo' without family would be good. > > 3. nft list hooks arp > > dump arp input and output, if any > > 4. nft list hooks bridge > > dump bridge pre/input/forward/out/postrouting > > 5. nft list hooks ip > > dump ip pre/input/forward/out/postrouting > > 6. nft list hooks ip6 -> see above > > 7. nft list hooks inet -> alias for dumping both ip+ip6 hooks. > > I wonder if this is more confusing than not - on the netfilter hooks > layer, it doesn't quite exist. The only thing which persists a tad > longer is inet ingress, indicated by nf_register_net_hook() passing it > down to __nf_register_net_hook for nf_hook_entry_head() to return the > same value as for netdev ingress. I guess they could be spliced even > further up. inet is an "alias". I think dumpg both ip+ip6 hooks should be fine. > > This also means that i'd make 'inet device foo' return a warning > > that the device will be ignored because "inet ingress" is syntactic > > frontend sugar similar to inet pseudo-family. > > Iff my claim holds true, there is no such thing as an inet hook and > thus also no inet device one. :) > > > We could try the 'detect pipeline' later. But, as per example > > above, I don't think its easy, unless one omits all packet rewrites > > (stateless nat, dnat, redirect) and everything else that causes > > re-routing, e.g. policy routing, tproxy, tc(x), etc. > > > > And then there is l3 and l2 multicasting... > > > > But, admittingly, it might be nice to have something and it might be > > good enough if pipeline alterations by ruleset and other entities > > are omitted. > > I seem to recall us discussing something like that on NFWS, but to > simulate traversal of a packet with given properties through the > ruleset. Which would also identify the hooks involved, I guess? Yes, this all is more complicated and it needs a whole lot of work to add this feature in a complete way, which makes sense to schedule this for the _future_.
Pablo Neira Ayuso <pablo@netfilter.org> wrote: > Hi Phil, Florian, > > @Florian, could you push out what you have to flush your queue in this front? OK, I pushed the patches to nftables.git. > > > 1. nft list hooks > > > dump everything EXCEPT netdev families/devices > > > > Include netdev here, make it really list *all* hooks. Iterating over > > the list of currently existing NICs in this netns is no big deal, is > > it? > > I like this suggestion. Fail enough, I will send a patch for this later this week. > > > 2. nft list hooks netdev device foo > > > dump ingress/egress netdev hooks, > > > INCLUDING inet ingress (its scoped to the device). > > > > Drop 'netdev' from the syntax here. The output really is "all hooks > > specific to that NIC", not necessarily only netdev ones. (And "device" > > is a distinct identifier for network interfaces in nftables syntax.) > > I think allowing 'device foo' without family would be good. OK, I'm still unclear however because internally only netdev families exist at the device level, so I'm not sure how to represent this. But dumping the existing network devices and querying them all is not and issue so I will make a patch for this.
diff --git a/Makefile.am b/Makefile.am index 9088170bfc68..ef198dafcbc8 100644 --- a/Makefile.am +++ b/Makefile.am @@ -322,6 +322,7 @@ A2X_OPTS_MANPAGE = \ ASCIIDOC_MAIN = doc/nft.txt ASCIIDOC_INCLUDES = \ + doc/additional-commands.txt \ doc/data-types.txt \ doc/payload-expression.txt \ doc/primary-expression.txt \ diff --git a/doc/additional-commands.txt b/doc/additional-commands.txt new file mode 100644 index 000000000000..dd1b3d2d87d4 --- /dev/null +++ b/doc/additional-commands.txt @@ -0,0 +1,115 @@ +LIST HOOKS +~~~~~~~~~~ + +This shows the low-level netfilter processing pipeline, including +functions registered by kernel modules such as nf_conntrack. + + +[verse] +____ +*list hooks* ['family'] +*list hooks netdev device* 'DEVICE_NAME' +____ + +*list hooks* is enough to display everything that is active +on the system, however, it does currently omit hooks that are +tied to a specific network device (netdev family). To obtain +those, the network device needs to be queried by name. +Example Usage: + +.List all active netfilter hooks in either the ip or ip6 stack +-------------------------------------------------------------- +% nft list hooks inet +family ip { + hook prerouting { + -0000000400 ipv4_conntrack_defrag [nf_defrag_ipv4] + -0000000200 ipv4_conntrack_in [nf_conntrack] + -0000000100 nf_nat_ipv4_pre_routing [nf_nat] + } + hook input { + 0000000000 chain inet filter input [nf_tables] + +0000000100 nf_nat_ipv4_local_in [nf_nat] +[..] +-------------------------------------------------------------- + +The above shows a host that has nat, conntrack and ipv4 packet +defragmentation enabled. +For each hook location for the queried family a list of active hooks +using the format + + +*priority* *identifier* [*module_name*] + +will be shown. + +The *priority* value dictates the order in which the hooks are called. +The list is sorted, the lowest number is run first. + +The priority value of hooks registered by the kernel cannot be changed. +For basechains registered by nftables, this value corresponds to the +*priority* value specified in the base chain definition. + +After the numerical value, information about the hook is shown. +For basechains defined in nftables this includes the table family, +the table name and the basechains name. +For hooks coming from kernel modules, the function name is used +instead. + +If a *module name* is given, the hook was registered by the kernel +module with this name. You can use 'modinfo *module name*' to +obtain more information about the module. + +This functionality requires a kernel built with the option + +CONFIG_NETFILTER_NETLINK_HOOK +enabled, either as a module or builtin. The module is named +*nfnetlink_hook*. + +MONITOR +~~~~~~~ +The monitor command allows you to listen to Netlink events produced by the +nf_tables subsystem. These are either related to creation and deletion of +objects or to packets for which *meta nftrace* was enabled. When they +occur, nft will print to stdout the monitored events in either JSON or +native nft format. + + +[verse] +____ +*monitor* [*new* | *destroy*] 'MONITOR_OBJECT' +*monitor* *trace* + +'MONITOR_OBJECT' := *tables* | *chains* | *sets* | *rules* | *elements* | *ruleset* +____ + +To filter events related to a concrete object, use one of the keywords in +'MONITOR_OBJECT'. + +To filter events related to a concrete action, use keyword *new* or *destroy*. + +The second form of invocation takes no further options and exclusively prints +events generated for packets with *nftrace* enabled. + +Hit ^C to finish the monitor operation. + +.Listen to all events, report in native nft format +-------------------------------------------------- +% nft monitor +-------------------------------------------------- + +.Listen to deleted rules, report in JSON format +----------------------------------------------- +% nft -j monitor destroy rules +----------------------------------------------- + +.Listen to both new and destroyed chains, in native nft format +----------------------------------------------------------------- +% nft monitor chains +------------------------------- + +.Listen to ruleset events such as table, chain, rule, set, counters and quotas, in native nft format +---------------------------------------------------------------------------------------------------- +% nft monitor ruleset +--------------------- + +.Trace incoming packets from host 10.0.0.1 +------------------------------------------ +% nft add rule filter input ip saddr 10.0.0.1 meta nftrace set 1 +% nft monitor trace +------------------------------------------ diff --git a/doc/nft.txt b/doc/nft.txt index 3f4593a29831..7e8c8695522d 100644 --- a/doc/nft.txt +++ b/doc/nft.txt @@ -766,17 +766,6 @@ and subtraction can be used to set relative priority, e.g. filter + 5 equals to *destroy*:: Delete the specified flowtable, it does not fail if it does not exist. *list*:: List all flowtables. -LISTING -------- -[verse] -*list { secmarks | synproxys | flow tables | meters | hooks }* ['family'] -*list { secmarks | synproxys | flow tables | meters | hooks } table* ['family'] 'table' -*list ct { timeout | expectation | helper | helpers } table* ['family'] 'table' - -Inspect configured objects. -*list hooks* shows the full hook pipeline, including those registered by -kernel modules, such as nf_conntrack. - STATEFUL OBJECTS ---------------- [verse] @@ -908,57 +897,7 @@ ADDITIONAL COMMANDS ------------------- These are some additional commands included in nft. -MONITOR -~~~~~~~~ -The monitor command allows you to listen to Netlink events produced by the -nf_tables subsystem. These are either related to creation and deletion of -objects or to packets for which *meta nftrace* was enabled. When they -occur, nft will print to stdout the monitored events in either JSON or -native nft format. + - -[verse] -____ -*monitor* [*new* | *destroy*] 'MONITOR_OBJECT' -*monitor* *trace* - -'MONITOR_OBJECT' := *tables* | *chains* | *sets* | *rules* | *elements* | *ruleset* -____ - -To filter events related to a concrete object, use one of the keywords in -'MONITOR_OBJECT'. - -To filter events related to a concrete action, use keyword *new* or *destroy*. - -The second form of invocation takes no further options and exclusively prints -events generated for packets with *nftrace* enabled. - -Hit ^C to finish the monitor operation. - -.Listen to all events, report in native nft format --------------------------------------------------- -% nft monitor --------------------------------------------------- - -.Listen to deleted rules, report in JSON format ------------------------------------------------ -% nft -j monitor destroy rules ------------------------------------------------ - -.Listen to both new and destroyed chains, in native nft format ------------------------------------------------------------------ -% nft monitor chains -------------------------------- - -.Listen to ruleset events such as table, chain, rule, set, counters and quotas, in native nft format ----------------------------------------------------------------------------------------------------- -% nft monitor ruleset ---------------------- - -.Trace incoming packets from host 10.0.0.1 ------------------------------------------- -% nft add rule filter input ip saddr 10.0.0.1 meta nftrace set 1 -% nft monitor trace ------------------------------------------- +include::additional-commands.txt[] ERROR REPORTING ---------------
Add a brief segment about 'nft list hooks' and a summary of the output format. As nft.txt is quite large, split the additonal commands into their own file. The existing listing section is removed; list subcommand is already mentioned in the relevant statement sections. Reported-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de> --- Makefile.am | 1 + doc/additional-commands.txt | 115 ++++++++++++++++++++++++++++++++++++ doc/nft.txt | 63 +------------------- 3 files changed, 117 insertions(+), 62 deletions(-) create mode 100644 doc/additional-commands.txt