mbox series

[nf-next,v4,0/5] netfilter: nf_tables: reduce set element transaction size

Message ID 20241107174415.4690-1-fw@strlen.de
Headers show
Series netfilter: nf_tables: reduce set element transaction size | expand

Message

Florian Westphal Nov. 7, 2024, 5:44 p.m. UTC
v4: fix a typo in patch 3 commit message.
    rebase on nf-next:main.

No other changes.

Changes in v3:
I failed to realize that nft_audit leaks one implementation detail
to userspace: the length of the transaction log.
This gets fixed by patch 3 which adds needed helper to increment
the count variable by the number of elements carried by the compacted
set update.

Also fix up notifications, for update case, notifications were
skipped but currently newsetelem notifications are done even if
existing set element is updated.

Most patches are unchanged.
"prefer nft_trans_elem_alloc helper" is already upstreamed so
its dropped from this batch.

v2: only change is in patch 3, and by extension, the last one:
During transaction abort, we need to handle an aggregate container to
contain both new set elements and updates.  The latter must be
skipped, else we remove element that already existed at start of the
transaction.

original cover letter:

When doing a flush on a set or mass adding/removing elements from a
set, each element needs to allocate 96 bytes to hold the transactional
state.

In such cases, virtually all the information in struct nft_trans_elem
is the same.

Change nft_trans_elem to a flex-array, i.e. a single nft_trans_elem
can hold multiple set element pointers.

The number of elements that can be stored in one nft_trans_elem is limited
by the slab allocator, this series limits the compaction to at most 62
elements as it caps the reallocation to 2048 bytes of memory.

Florian Westphal (5):
  netfilter: nf_tables: add nft_trans_commit_list_add_elem helper
  netfilter: nf_tables: prepare for multiple elements in nft_trans_elem
    structure
  netfilter: nf_tables: preemptive fix for audit selftest failure
  netfilter: nf_tables: switch trans_elem to real flex array
  netfilter: nf_tables: allocate element update information dynamically

 include/net/netfilter/nf_tables.h |  25 +-
 net/netfilter/nf_tables_api.c     | 368 +++++++++++++++++++++++-------
 2 files changed, 304 insertions(+), 89 deletions(-)

Comments

Pablo Neira Ayuso Nov. 12, 2024, 6:42 p.m. UTC | #1
Hi Florian,

This series looks good to me.

Regarding 3/5, I don't see any fix or anything silly in this.

>nftables audit log format unfortunately leaks an implementation detail, the
>transaction log size, to userspace:
>
>    table=t1 family=2 entries=4 op=nft_register_set
>                      ~~~~~~~~~
>
>This 'entries' key is the number of transactions that will be applied.

To my understanding, entries= is the number of entries that are either
added or updated in this transaction.

Before this patch, there was a 1:1 mapping between transaction and
elements, now this is not the case anymore.

If entries= exposes only the number of transactions, then this becomes
useless to userspace?

In iptables, it shows the number of entries in the table after the
update.
Florian Westphal Nov. 12, 2024, 8:44 p.m. UTC | #2
Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> >nftables audit log format unfortunately leaks an implementation detail, the
> >transaction log size, to userspace:
> >
> >    table=t1 family=2 entries=4 op=nft_register_set
> >                      ~~~~~~~~~
> >
> >This 'entries' key is the number of transactions that will be applied.
> 
> To my understanding, entries= is the number of entries that are either
> added or updated in this transaction.
> 
> Before this patch, there was a 1:1 mapping between transaction and
> elements, now this is not the case anymore.
> 
> If entries= exposes only the number of transactions, then this becomes
> useless to userspace?

Hmm, I would need to know what this is supposed to be.
Its not going to be the same in either case,
iptables-legacy -A ... vs iptables-nft -A won't result in same
entries due to the whole-table-replace paradigm and introduction
of "update" mechanism also changes entries count.

I think its fine now, but please feel free to rewrite the commit
message if you think its needed.
Pablo Neira Ayuso Nov. 13, 2024, 10:19 a.m. UTC | #3
On Tue, Nov 12, 2024 at 09:44:36PM +0100, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > >nftables audit log format unfortunately leaks an implementation detail, the
> > >transaction log size, to userspace:
> > >
> > >    table=t1 family=2 entries=4 op=nft_register_set
> > >                      ~~~~~~~~~
> > >
> > >This 'entries' key is the number of transactions that will be applied.
> > 
> > To my understanding, entries= is the number of entries that are either
> > added or updated in this transaction.
> > 
> > Before this patch, there was a 1:1 mapping between transaction and
> > elements, now this is not the case anymore.
> > 
> > If entries= exposes only the number of transactions, then this becomes
> > useless to userspace?
> 
> Hmm, I would need to know what this is supposed to be.
> Its not going to be the same in either case,
> iptables-legacy -A ... vs iptables-nft -A won't result in same
> entries due to the whole-table-replace paradigm and introduction
> of "update" mechanism also changes entries count.

Right, there is a change between -legacy and -nft regarding audit.

> I think its fine now, but please feel free to rewrite the commit
> message if you think its needed.

Thanks, I will make an edit.