[ovs-dev] conntrack: Do clean instead of forece expire.

Message ID	20240305134655.641196-1-lic121@chinatelecom.cn
State	Changes Requested
Delegated to:	aaron conole
Headers	show Return-Path: <ovs-dev-bounces@openvswitch.org> DMARC-Filter: OpenDMARC Filter v1.4.2 smtp4.osuosl.org 3432041602 Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=182.42.158.78; helo=chinatelecom.cn; envelope-from=lic121@chinatelecom.cn; receiver=<UNKNOWN> From: Cheng Li <lic121@chinatelecom.cn> To: dev@openvswitch.org, grive@u256.net, pvalerio@redhat.com Date: Tue, 5 Mar 2024 13:46:55 +0000 Message-Id: <20240305134655.641196-1-lic121@chinatelecom.cn> MIME-Version: 1.0 Subject: [ovs-dev] [PATCH] conntrack: Do clean instead of forece expire. Precedence: list Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" <ovs-dev-bounces@openvswitch.org>
Series	[ovs-dev] conntrack: Do clean instead of forece expire. \| expand [ovs-dev] conntrack: Do clean instead of forece expire.

Message ID

20240305134655.641196-1-lic121@chinatelecom.cn

State

Changes Requested

Delegated to:

aaron conole

Headers

DMARC-Filter: OpenDMARC Filter v1.4.2 smtp4.osuosl.org 3432041602
Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=182.42.158.78;
 helo=chinatelecom.cn; envelope-from=lic121@chinatelecom.cn;
 receiver=<UNKNOWN>
HMM_SOURCE_IP: 192.168.137.232:6294.1410184662
HMM_ATTACHE_NUM: 0000
HMM_SOURCE_TYPE: SMTP
Received: from clientip-118.114.58.94 (unknown [192.168.137.232])
 by chinatelecom.cn (HERMES) with SMTP id B90A09BF2E;
 Tue,  5 Mar 2024 21:47:05 +0800 (CST)
X-189-SAVE-TO-SEND: +lic121@chinatelecom.cn
Received: from  ([118.114.58.94])
 by gateway-ssl-dep-77bc75f6c8-dplxv with ESMTP id
 7c89ba6ac914456ba8954a3a8b209a79 for dev@openvswitch.org;
 Tue, 05 Mar 2024 21:47:10 CST
X-Transaction-ID: 7c89ba6ac914456ba8954a3a8b209a79
X-Real-From: lic121@chinatelecom.cn
X-Receive-IP: 118.114.58.94
X-MEDUSA-Status: 0
From: Cheng Li <lic121@chinatelecom.cn>
To: dev@openvswitch.org,
	grive@u256.net,
	pvalerio@redhat.com
Date: Tue,  5 Mar 2024 13:46:55 +0000
Message-Id: <20240305134655.641196-1-lic121@chinatelecom.cn>
X-Mailer: git-send-email 2.39.3
MIME-Version: 1.0
Subject: [ovs-dev] [PATCH] conntrack: Do clean instead of forece expire.
X-BeenThere: ovs-dev@openvswitch.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: <ovs-dev.openvswitch.org>
List-Unsubscribe: <https://mail.openvswitch.org/mailman/options/ovs-dev>,
 <mailto:ovs-dev-request@openvswitch.org?subject=unsubscribe>
List-Archive: <http://mail.openvswitch.org/pipermail/ovs-dev/>
List-Post: <mailto:ovs-dev@openvswitch.org>
List-Help: <mailto:ovs-dev-request@openvswitch.org?subject=help>
List-Subscribe: <https://mail.openvswitch.org/mailman/listinfo/ovs-dev>,
 <mailto:ovs-dev-request@openvswitch.org?subject=subscribe>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: ovs-dev-bounces@openvswitch.org
Sender: "dev" <ovs-dev-bounces@openvswitch.org>

Series

[ovs-dev] conntrack: Do clean instead of forece expire. | expand

Checks

Context	Check	Description
ovsrobot/apply-robot	success	apply and check: success
ovsrobot/github-robot-_Build_and_Test	success	github build: passed
ovsrobot/intel-ovs-compilation	success	test: success

Context

Check

Description

ovsrobot/apply-robot

success

apply and check: success

ovsrobot/github-robot-_Build_and_Test

success

github build: passed

ovsrobot/intel-ovs-compilation

success

test: success

Commit Message

Cheng Li March 5, 2024, 1:46 p.m. UTC

Force expire a connection and then create new connection of the same
tuple(cmap hash). This makes ct->conns cmap operation expensive(
within ct->ct_lock).

This patch cover the scenario by doing the clean immediately instead
of setting expire. Also this patch fix ct_clean debug log.

Signed-off-by: Cheng Li <lic121@chinatelecom.cn>
---
 lib/conntrack.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

Comments

Aaron Conole March 27, 2024, 5:37 p.m. UTC | #1

Cheng Li <lic121@chinatelecom.cn> writes:

> Force expire a connection and then create new connection of the same
> tuple(cmap hash). This makes ct->conns cmap operation expensive(
> within ct->ct_lock).
>
> This patch cover the scenario by doing the clean immediately instead
> of setting expire. Also this patch fix ct_clean debug log.
>
> Signed-off-by: Cheng Li <lic121@chinatelecom.cn>
> ---

I'm having trouble with this commit message.  You don't include any
details about any performance characteristics that you measured
anywhere, while (rightly) pointing out that this has a larger
performance impact.  Please include these details (how you measured the
impact, what changed, etc).

>  lib/conntrack.c | 25 ++++++++++++-------------
>  1 file changed, 12 insertions(+), 13 deletions(-)
>
> diff --git a/lib/conntrack.c b/lib/conntrack.c
> index 8a7056bac..6e137daa6 100644
> --- a/lib/conntrack.c
> +++ b/lib/conntrack.c
> @@ -461,12 +461,6 @@ conn_clean(struct conntrack *ct, struct conn *conn)
>      atomic_count_dec(&ct->n_conn);
>  }
> 
> -static void
> -conn_force_expire(struct conn *conn)
> -{
> -    atomic_store_relaxed(&conn->expiration, 0);
> -}
> -
>  /* Destroys the connection tracker 'ct' and frees all the allocated memory.
>   * The caller of this function must already have shut down packet input
>   * and PMD threads (which would have been quiesced).  */
> @@ -1030,7 +1024,10 @@ conn_update_state(struct conntrack *ct, struct dp_packet *pkt,
>          case CT_UPDATE_NEW:
>              if (conn_lookup(ct, &conn->key_node[CT_DIR_FWD].key,
>                              now, NULL, NULL)) {
> -                conn_force_expire(conn);
> +                /* Instead of setting conn->expiration and let ct_clean thread
> +                 * clean the conn, doing the clean now to avoid duplicate hash
> +                 * in ct->conns for performance reason. */
> +                conn_clean(ct, conn);

I guess the idea is that the clean can happen from the unlocked context,
but that creates a lock in the datapath (which is why the expire list
update is here).  The conn_key_lookup() has to iterate over the list
more, but then it doesn't need to take an expensive lock.

In this case, we will have to acquire the big CT lock, and if there are
two lookups happening simultaneously we will pend processing one packet
for the other.  Ideally, the adaptive nature of the pthread will prevent
us from having to call into kernel, but that isn't a guarantee - just a
best effort.

Have you done any kind of measurement on this?  I'm quite worried about
the worst-case that this introduces into the packet processing path.
The existing design avoids this worst-case lock performance precisely
because it marks the connection as needing expiration so that future
lookups will skip.

>              }
>              create_new_conn = true;
>              break;
> @@ -1242,7 +1239,7 @@ process_one(struct conntrack *ct, struct dp_packet *pkt,
>      if (OVS_UNLIKELY(force && ctx->reply && conn)) {
>          if (conn_lookup(ct, &conn->key_node[CT_DIR_FWD].key,
>                          now, NULL, NULL)) {
> -            conn_force_expire(conn);
> +            conn_clean(ct, conn);
>          }
>          conn = NULL;
>      }
> @@ -1429,7 +1426,8 @@ conntrack_get_sweep_interval(struct conntrack *ct)
>  }
> 
>  static size_t
> -ct_sweep(struct conntrack *ct, struct rculist *list, long long now)
> +ct_sweep(struct conntrack *ct, struct rculist *list, long long now,
> +         size_t *cleaned_count)
>      OVS_NO_THREAD_SAFETY_ANALYSIS
>  {
>      struct conn *conn;
> @@ -1438,6 +1436,7 @@ ct_sweep(struct conntrack *ct, struct rculist *list, long long now)
>      RCULIST_FOR_EACH (conn, node, list) {
>          if (conn_expired(conn, now)) {
>              conn_clean(ct, conn);
> +            (*cleaned_count) ++;

A fix somewhat like this may be needed anyway.  The log seems to have become
inaccurate with:

3d9c1b855a5f ("conntrack: Replace timeout based expiration lists with rculists.")

Please submit it as a separate patch so we can backport.

>          }
> 
>          count++;
> @@ -1454,7 +1453,7 @@ conntrack_clean(struct conntrack *ct, long long now)
>  {
>      long long next_wakeup = now + conntrack_get_sweep_interval(ct);
>      unsigned int n_conn_limit, i;
> -    size_t clean_end, count = 0;
> +    size_t clean_end, count = 0, cleaned_count = 0;
> 
>      atomic_read_relaxed(&ct->n_conn_limit, &n_conn_limit);
>      clean_end = n_conn_limit / 64;
> @@ -1465,13 +1464,13 @@ conntrack_clean(struct conntrack *ct, long long now)
>              break;
>          }
> 
> -        count += ct_sweep(ct, &ct->exp_lists[i], now);
> +        count += ct_sweep(ct, &ct->exp_lists[i], now, &cleaned_count);
>      }
> 
>      ct->next_sweep = (i < N_EXP_LISTS) ? i : 0;
>  
> -    VLOG_DBG("conntrack cleanup %"PRIuSIZE" entries in %lld msec", count,
> -             time_msec() - now);
> +    VLOG_DBG("conntrack cleanup %"PRIuSIZE"/%"PRIuSIZE" entries in %lld msec",
> +             cleaned_count, count, time_msec() - now);
>  
>      return next_wakeup;
>  }

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 8a7056bac..6e137daa6 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -461,12 +461,6 @@  conn_clean(struct conntrack *ct, struct conn *conn)
     atomic_count_dec(&ct->n_conn);
 }
 
-static void
-conn_force_expire(struct conn *conn)
-{
-    atomic_store_relaxed(&conn->expiration, 0);
-}
-
 /* Destroys the connection tracker 'ct' and frees all the allocated memory.
  * The caller of this function must already have shut down packet input
  * and PMD threads (which would have been quiesced).  */
@@ -1030,7 +1024,10 @@  conn_update_state(struct conntrack *ct, struct dp_packet *pkt,
         case CT_UPDATE_NEW:
             if (conn_lookup(ct, &conn->key_node[CT_DIR_FWD].key,
                             now, NULL, NULL)) {
-                conn_force_expire(conn);
+                /* Instead of setting conn->expiration and let ct_clean thread
+                 * clean the conn, doing the clean now to avoid duplicate hash
+                 * in ct->conns for performance reason. */
+                conn_clean(ct, conn);
             }
             create_new_conn = true;
             break;
@@ -1242,7 +1239,7 @@  process_one(struct conntrack *ct, struct dp_packet *pkt,
     if (OVS_UNLIKELY(force && ctx->reply && conn)) {
         if (conn_lookup(ct, &conn->key_node[CT_DIR_FWD].key,
                         now, NULL, NULL)) {
-            conn_force_expire(conn);
+            conn_clean(ct, conn);
         }
         conn = NULL;
     }
@@ -1429,7 +1426,8 @@  conntrack_get_sweep_interval(struct conntrack *ct)
 }
 
 static size_t
-ct_sweep(struct conntrack *ct, struct rculist *list, long long now)
+ct_sweep(struct conntrack *ct, struct rculist *list, long long now,
+         size_t *cleaned_count)
     OVS_NO_THREAD_SAFETY_ANALYSIS
 {
     struct conn *conn;
@@ -1438,6 +1436,7 @@  ct_sweep(struct conntrack *ct, struct rculist *list, long long now)
     RCULIST_FOR_EACH (conn, node, list) {
         if (conn_expired(conn, now)) {
             conn_clean(ct, conn);
+            (*cleaned_count) ++;
         }
 
         count++;
@@ -1454,7 +1453,7 @@  conntrack_clean(struct conntrack *ct, long long now)
 {
     long long next_wakeup = now + conntrack_get_sweep_interval(ct);
     unsigned int n_conn_limit, i;
-    size_t clean_end, count = 0;
+    size_t clean_end, count = 0, cleaned_count = 0;
 
     atomic_read_relaxed(&ct->n_conn_limit, &n_conn_limit);
     clean_end = n_conn_limit / 64;
@@ -1465,13 +1464,13 @@  conntrack_clean(struct conntrack *ct, long long now)
             break;
         }
 
-        count += ct_sweep(ct, &ct->exp_lists[i], now);
+        count += ct_sweep(ct, &ct->exp_lists[i], now, &cleaned_count);
     }
 
     ct->next_sweep = (i < N_EXP_LISTS) ? i : 0;
 
-    VLOG_DBG("conntrack cleanup %"PRIuSIZE" entries in %lld msec", count,
-             time_msec() - now);
+    VLOG_DBG("conntrack cleanup %"PRIuSIZE"/%"PRIuSIZE" entries in %lld msec",
+             cleaned_count, count, time_msec() - now);
 
     return next_wakeup;
 }

[ovs-dev] conntrack: Do clean instead of forece expire.

Checks

Commit Message

Comments

Patch