Message ID | 20241026105030.75254-1-fw@strlen.de |
---|---|
State | Superseded, archived |
Headers | show |
Series | [nf-next] netfilter: conntrack: collect start time as early as possible | expand |
Hi Florian, On Sat, Oct 26, 2024 at 12:50:13PM +0200, Florian Westphal wrote: > Sample start time at allocation time, not when the conntrack entry > is inserted into the hashtable. Back at the time, long time ago, I remember to have measured a performance impact on this. > In most cases this makes very little difference, but there are > cases where there is significant delay beteen allocation and > confirmation, e.g. when packets get queued to userspace. I delayed this to insertion time because packet could dropped before, rendering this conntrack timestamp useless? There is no event reporting for conntrack that never get confirmed.
Pablo Neira Ayuso <pablo@netfilter.org> wrote: > On Sat, Oct 26, 2024 at 12:50:13PM +0200, Florian Westphal wrote: > > Sample start time at allocation time, not when the conntrack entry > > is inserted into the hashtable. > > Back at the time, long time ago, I remember to have measured a > performance impact on this. You mean when enabling timestamp + conntracks get dropped before confirm, correct? > > In most cases this makes very little difference, but there are > > cases where there is significant delay beteen allocation and > > confirmation, e.g. when packets get queued to userspace. > > I delayed this to insertion time because packet could dropped before, > rendering this conntrack timestamp useless? There is no event > reporting for conntrack that never get confirmed. Sure, but the "issue" is that the reported start time doesn't account for a possible delay. I did not measure huge delta before/after this patch but if you have e.g. nfqueue in between alloc+confirm then the start timestamp will account for that delay after this patch.
On Tue, Oct 29, 2024 at 08:16:24AM +0100, Florian Westphal wrote: > Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > On Sat, Oct 26, 2024 at 12:50:13PM +0200, Florian Westphal wrote: > > > Sample start time at allocation time, not when the conntrack entry > > > is inserted into the hashtable. > > > > Back at the time, long time ago, I remember to have measured a > > performance impact on this. > > You mean when enabling timestamp + conntracks get dropped before > confirm, correct? Yes. > > > In most cases this makes very little difference, but there are > > > cases where there is significant delay beteen allocation and > > > confirmation, e.g. when packets get queued to userspace. > > > > I delayed this to insertion time because packet could dropped before, > > rendering this conntrack timestamp useless? There is no event > > reporting for conntrack that never get confirmed. > > Sure, but the "issue" is that the reported start time doesn't account > for a possible delay. I did not measure huge delta before/after this > patch but if you have e.g. nfqueue in between alloc+confirm then the > start timestamp will account for that delay after this patch. I see. I think the question is what this start timestamp is. For me, it is the start time since the conntrack is _confirmed_ which is what we expose to userspace via ctnetlink and /proc interface. Is this user trying to trying to profile nfqueue? Why does this user assume conntrack allocation time is the right spot to push the timestamp on the ct? On top of this, at that time I made this, I measured ~20-25% performance drop to get this accurate timestamp, probably this is cheaper now in modern equipment?
Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > > I delayed this to insertion time because packet could dropped before, > > > rendering this conntrack timestamp useless? There is no event > > > reporting for conntrack that never get confirmed. > > > > Sure, but the "issue" is that the reported start time doesn't account > > for a possible delay. I did not measure huge delta before/after this > > patch but if you have e.g. nfqueue in between alloc+confirm then the > > start timestamp will account for that delay after this patch. > > I see. I think the question is what this start timestamp is. For me, > it is the start time since the conntrack is _confirmed_ which is what > we expose to userspace via ctnetlink and /proc interface. Sure, its a question on definition as to what "flow start time" means. See below for yet another proposal. > Is this user trying to trying to profile nfqueue? Why does this user > assume conntrack allocation time is the right spot to push the > timestamp on the ct? If I understood correctly its about using conntrack events + timestamp extension to get (passive) RTT measurements. > On top of this, at that time I made this, I measured ~20-25% > performance drop to get this accurate timestamp, probably this is > cheaper now in modern equipment? I think cost depends on the clock source, hopefully most systems do not have a too expensive one nowadays. What about using skb_tstamp() first? If skb already had rx tstamp enabled, we'd get an even more accurate start time (packet arrival), even before ct got allocated.
diff --git a/include/net/netfilter/nf_conntrack_timestamp.h b/include/net/netfilter/nf_conntrack_timestamp.h index 57138d974a9f..76515353829d 100644 --- a/include/net/netfilter/nf_conntrack_timestamp.h +++ b/include/net/netfilter/nf_conntrack_timestamp.h @@ -23,18 +23,18 @@ struct nf_conn_tstamp *nf_conn_tstamp_find(const struct nf_conn *ct) #endif } -static inline -struct nf_conn_tstamp *nf_ct_tstamp_ext_add(struct nf_conn *ct, gfp_t gfp) +static inline void nf_ct_tstamp_ext_add(struct nf_conn *ct, gfp_t gfp) { #ifdef CONFIG_NF_CONNTRACK_TIMESTAMP struct net *net = nf_ct_net(ct); + struct nf_conn_tstamp *tstamp; if (!net->ct.sysctl_tstamp) - return NULL; + return; - return nf_ct_ext_add(ct, NF_CT_EXT_TSTAMP, gfp); -#else - return NULL; + tstamp = nf_ct_ext_add(ct, NF_CT_EXT_TSTAMP, gfp); + if (tstamp) + tstamp->start = ktime_get_real_ns(); #endif }; diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index 9db3e2b0b1c3..ed1870096519 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -976,18 +976,6 @@ static void nf_ct_acct_merge(struct nf_conn *ct, enum ip_conntrack_info ctinfo, } } -static void __nf_conntrack_insert_prepare(struct nf_conn *ct) -{ - struct nf_conn_tstamp *tstamp; - - refcount_inc(&ct->ct_general.use); - - /* set conntrack timestamp, if enabled. */ - tstamp = nf_conn_tstamp_find(ct); - if (tstamp) - tstamp->start = ktime_get_real_ns(); -} - /** * nf_ct_match_reverse - check if ct1 and ct2 refer to identical flow * @ct1: conntrack in hash table to check against @@ -1111,7 +1099,7 @@ static int nf_ct_resolve_clash_harder(struct sk_buff *skb, u32 repl_idx) */ loser_ct->status |= IPS_FIXED_TIMEOUT | IPS_NAT_CLASH; - __nf_conntrack_insert_prepare(loser_ct); + refcount_inc(&loser_ct->ct_general.use); /* fake add for ORIGINAL dir: we want lookups to only find the entry * already in the table. This also hides the clashing entry from @@ -1295,7 +1283,7 @@ __nf_conntrack_confirm(struct sk_buff *skb) weird delay cases. */ ct->timeout += nfct_time_stamp; - __nf_conntrack_insert_prepare(ct); + refcount_inc(&ct->ct_general.use); /* Since the lookup is lockless, hash insertion must be done after * starting the timer and setting the CONFIRMED bit. The RCU barriers
Sample start time at allocation time, not when the conntrack entry is inserted into the hashtable. In most cases this makes very little difference, but there are cases where there is significant delay beteen allocation and confirmation, e.g. when packets get queued to userspace. Sampling as early as possible exposes this extra delay to userspace via ctnetlink. Reported-by: Nadia Pinaeva <n.m.pinaeva@gmail.com> Fixes: a992ca2a0498 ("netfilter: nf_conntrack_tstamp: add flow-based timestamp extension") Signed-off-by: Florian Westphal <fw@strlen.de> --- include/net/netfilter/nf_conntrack_timestamp.h | 12 ++++++------ net/netfilter/nf_conntrack_core.c | 16 ++-------------- 2 files changed, 8 insertions(+), 20 deletions(-)