From patchwork Sat Jun 17 17:42:27 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wei Wang
X-Patchwork-Id: 777350
X-Patchwork-Delegate: davem@davemloft.net
From: Wei Wang
To: David Miller, netdev@vger.kernel.org
Cc: Eric Dumazet, Martin KaFai Lau, Wei Wang
Subject: [PATCH v2 net-next 04/21] net: introduce DST_NOGC in dst_release() to destroy dst based on refcnt
Date: Sat, 17 Jun 2017 10:42:27 -0700
Message-Id: <20170617174244.132862-5-tracywwnj@gmail.com>
X-Mailer: git-send-email 2.13.1.518.g3df882009-goog
In-Reply-To: <20170617174244.132862-1-tracywwnj@gmail.com>
References: <20170617174244.132862-1-tracywwnj@gmail.com>
X-Mailing-List: netdev@vger.kernel.org

From: Wei Wang

The current mechanism for freeing a dst is somewhat complicated. A dst
has a ref count, and when a user grabs a reference to the dst the ref
count is taken properly in most cases, except in the
IPv4/IPv6/decnet/xfrm routing code for historical reasons.

If the reference to a dst is always taken properly, we should be able to
simplify the logic in dst_release() to destroy the dst when
dst->__refcnt drops from 1 to 0. This should be the only condition that
determines whether dst_destroy() can be called.
And since every dst is then ref counted, there is no need for a dst
garbage list to hold dst entries that have already been removed by the
routing code but are still held by other users.
The task that periodically checks the list and frees a dst once its ref
count becomes 0 is not needed anymore either.

This patch introduces a temporary flag, DST_NOGC (no garbage collector).
If it is set on a dst, dst_release() destroys the dst after an RCU grace
period (via call_rcu()) when dst->__refcnt drops to 0.
dst_hold_safe() also checks for this flag and uses
atomic_inc_not_zero(), as it already does for DST_NOCACHE, to avoid a
double free.
The temporary flag exists mainly so that the transition can be made
component by component without breaking other parts. It will be removed
once all components have been converted.

This patch also introduces a new function, dst_release_immediate(),
which destroys the dst without waiting for an RCU grace period when the
refcnt drops to 0. It will be used in later patches.

Follow-up patches will fix all the places that need to take a ref count
on the dst properly and mark those dsts with DST_NOGC. dst_release() or
dst_release_immediate() will then be used to release the dst instead of
dst_free() and its related functions.
A final clean-up patch will remove the DST_NOGC flag.

Signed-off-by: Wei Wang
Acked-by: Martin KaFai Lau
---
 include/net/dst.h |  5 ++++-
 net/core/dst.c    | 20 ++++++++++++++++++--
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/include/net/dst.h b/include/net/dst.h
index 1969008783d8..2735d5a1e774 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -58,6 +58,7 @@ struct dst_entry {
 #define DST_XFRM_TUNNEL		0x0080
 #define DST_XFRM_QUEUE		0x0100
 #define DST_METADATA		0x0200
+#define DST_NOGC		0x0400
 
 	short			error;
 
@@ -278,6 +279,8 @@ static inline struct dst_entry *dst_clone(struct dst_entry *dst)
 
 void dst_release(struct dst_entry *dst);
 
+void dst_release_immediate(struct dst_entry *dst);
+
 static inline void refdst_drop(unsigned long refdst)
 {
 	if (!(refdst & SKB_DST_NOREF))
@@ -334,7 +337,7 @@ static inline void skb_dst_force(struct sk_buff *skb)
  */
 static inline bool dst_hold_safe(struct dst_entry *dst)
 {
-	if (dst->flags & DST_NOCACHE)
+	if (dst->flags & (DST_NOCACHE | DST_NOGC))
 		return atomic_inc_not_zero(&dst->__refcnt);
 	dst_hold(dst);
 	return true;
diff --git a/net/core/dst.c b/net/core/dst.c
index 13ba4a090c41..551834c3363f 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -300,18 +300,34 @@ void dst_release(struct dst_entry *dst)
 {
 	if (dst) {
 		int newrefcnt;
-		unsigned short nocache = dst->flags & DST_NOCACHE;
+		unsigned short destroy_after_rcu = dst->flags &
+						   (DST_NOCACHE | DST_NOGC);
 
 		newrefcnt = atomic_dec_return(&dst->__refcnt);
 		if (unlikely(newrefcnt < 0))
 			net_warn_ratelimited("%s: dst:%p refcnt:%d\n",
 					     __func__, dst, newrefcnt);
-		if (!newrefcnt && unlikely(nocache))
+		if (!newrefcnt && unlikely(destroy_after_rcu))
 			call_rcu(&dst->rcu_head, dst_destroy_rcu);
 	}
 }
 EXPORT_SYMBOL(dst_release);
 
+void dst_release_immediate(struct dst_entry *dst)
+{
+	if (dst) {
+		int newrefcnt;
+
+		newrefcnt = atomic_dec_return(&dst->__refcnt);
+		if (unlikely(newrefcnt < 0))
+			net_warn_ratelimited("%s: dst:%p refcnt:%d\n",
+					     __func__, dst, newrefcnt);
+		if (!newrefcnt)
+			dst_destroy(dst);
+	}
+}
+EXPORT_SYMBOL(dst_release_immediate);
+
 u32 *dst_cow_metrics_generic(struct dst_entry *dst, unsigned long old)
 {
 	struct dst_metrics *p = kmalloc(sizeof(*p), GFP_ATOMIC);
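
Illustration (not part of the patch): the release rules in the diff above
can be sketched in userspace with C11 atomics. This is a minimal
single-threaded model under stated assumptions, not kernel code. struct
obj, obj_hold_safe(), obj_release(), obj_release_immediate(),
run_deferred() and the deferred list are hypothetical names; the list only
stands in for call_rcu() and does not model a real grace period. Only the
flag value 0x0400 is taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define OBJ_NOGC 0x0400	/* stand-in for DST_NOGC; value copied from the patch */

/* Hypothetical stand-in for struct dst_entry. */
struct obj {
	atomic_int refcnt;
	unsigned short flags;
	bool destroyed;			/* set by obj_destroy() so callers can observe it */
	struct obj *next_deferred;	/* link in the simulated deferred queue */
};

/* Simulated call_rcu() queue (assumption: single-threaded model only). */
static struct obj *deferred_head;

static void obj_destroy(struct obj *o)
{
	o->destroyed = true;
}

/* Take a reference only while refcnt is non-zero (inc-not-zero). */
static bool obj_hold_safe(struct obj *o)
{
	int old = atomic_load(&o->refcnt);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return true;
	}
	return false;	/* already at 0: the object is on its way out */
}

/* Drop a reference; on the 1 -> 0 transition, defer destruction. */
static void obj_release(struct obj *o)
{
	unsigned short deferred;
	int prev;

	if (!o)
		return;
	deferred = o->flags & OBJ_NOGC;	/* read flags before dropping the ref */
	prev = atomic_fetch_sub(&o->refcnt, 1);
	assert(prev > 0);	/* the patch warns on underflow instead */
	if (prev == 1 && deferred) {
		o->next_deferred = deferred_head;
		deferred_head = o;
	}
	/* Without the flag, the object is left to the old gc path (not modeled). */
}

/* Drop a reference; on the 1 -> 0 transition, destroy right away. */
static void obj_release_immediate(struct obj *o)
{
	if (o && atomic_fetch_sub(&o->refcnt, 1) == 1)
		obj_destroy(o);
}

/* Simulate the end of a grace period: run every queued destructor. */
static void run_deferred(void)
{
	while (deferred_head) {
		struct obj *o = deferred_head;

		deferred_head = o->next_deferred;
		obj_destroy(o);
	}
}
```

In this model, an object flagged OBJ_NOGC that loses its last reference
through obj_release() stays intact until run_deferred() is called, and
obj_hold_safe() refuses to revive it in the meantime. That refusal is the
reason the patch routes DST_NOGC entries through atomic_inc_not_zero() in
dst_hold_safe(). obj_release_immediate() skips the queue entirely.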