From patchwork Sat Jun 17 17:42:29 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wei Wang
X-Patchwork-Id: 777345
X-Patchwork-Delegate: davem@davemloft.net
From: Wei Wang
To: David Miller, netdev@vger.kernel.org
Cc: Eric Dumazet, Martin KaFai Lau, Wei Wang
Subject: [PATCH v2 net-next 06/21] ipv4: take dst->__refcnt when caching dst in fib
Date: Sat, 17 Jun 2017 10:42:29 -0700
Message-Id: <20170617174244.132862-7-tracywwnj@gmail.com>
X-Mailer: git-send-email 2.13.1.518.g3df882009-goog
In-Reply-To: <20170617174244.132862-1-tracywwnj@gmail.com>
References: <20170617174244.132862-1-tracywwnj@gmail.com>
X-Mailing-List: netdev@vger.kernel.org

From: Wei Wang

In the IPv4 routing code, fib_nh and fib_nh_exception can hold pointers
to struct rtable, but they never increment dst->__refcnt. This is why
the dst garbage collector is needed: when a user is done with a dst and
calls dst_release(), it can only decrement dst->__refcnt. It cannot free
the dst even if it sees dst->__refcnt drop from 1 to 0 (unless the
DST_NOCACHE flag is set), because the routing code might still hold a
reference to it. And when the routing code deletes a route, it has to
put the dst on the gc_list if dst->__refcnt is not yet 0, and a gc
thread has to run periodically to check dst->__refcnt and finally free
the dst once the refcnt reaches 0.

This patch increments dst->__refcnt when fib_nh/fib_nh_exception takes
a reference to a dst, and properly releases the old dst when
fib_nh/fib_nh_exception is updated with a new one.
It is preparation for fully getting rid of the dst gc later.

Signed-off-by: Wei Wang
Acked-by: Martin KaFai Lau
---
 net/ipv4/fib_semantics.c |  5 ++++-
 net/ipv4/route.c         | 19 ++++++++++++++++---
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 2157dc08c407..53b3e9c2da4c 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -152,6 +152,7 @@ static void rt_fibinfo_free(struct rtable __rcu **rtp)
 	 * free_fib_info_rcu()
 	 */
 
+	dst_release(&rt->dst);
 	dst_free(&rt->dst);
 }
 
@@ -194,8 +195,10 @@ static void rt_fibinfo_free_cpus(struct rtable __rcu * __percpu *rtp)
 		struct rtable *rt;
 
 		rt = rcu_dereference_protected(*per_cpu_ptr(rtp, cpu), 1);
-		if (rt)
+		if (rt) {
+			dst_release(&rt->dst);
 			dst_free(&rt->dst);
+		}
 	}
 	free_percpu(rtp);
 }
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 0a843ef2b709..3dee0043117e 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -603,11 +603,13 @@ static void fnhe_flush_routes(struct fib_nh_exception *fnhe)
 	rt = rcu_dereference(fnhe->fnhe_rth_input);
 	if (rt) {
 		RCU_INIT_POINTER(fnhe->fnhe_rth_input, NULL);
+		dst_release(&rt->dst);
 		rt_free(rt);
 	}
 	rt = rcu_dereference(fnhe->fnhe_rth_output);
 	if (rt) {
 		RCU_INIT_POINTER(fnhe->fnhe_rth_output, NULL);
+		dst_release(&rt->dst);
 		rt_free(rt);
 	}
 }
@@ -1332,9 +1334,12 @@ static bool rt_bind_exception(struct rtable *rt, struct fib_nh_exception *fnhe,
 			rt->rt_gateway = daddr;
 
 		if (!(rt->dst.flags & DST_NOCACHE)) {
+			dst_hold(&rt->dst);
 			rcu_assign_pointer(*porig, rt);
-			if (orig)
+			if (orig) {
+				dst_release(&orig->dst);
 				rt_free(orig);
+			}
 			ret = true;
 		}
 
@@ -1357,12 +1362,20 @@ static bool rt_cache_route(struct fib_nh *nh, struct rtable *rt)
 	}
 	orig = *p;
 
+	/* hold dst before doing cmpxchg() to avoid race condition
+	 * on this dst
+	 */
+	dst_hold(&rt->dst);
 	prev = cmpxchg(p, orig, rt);
 	if (prev == orig) {
-		if (orig)
+		if (orig) {
+			dst_release(&orig->dst);
 			rt_free(orig);
-	} else
+		}
+	} else {
+		dst_release(&rt->dst);
 		ret = false;
+	}
 
 	return ret;
 }
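
Illustration only, not part of the patch: the rt_cache_route() hunk takes
the cache's reference before the cmpxchg() and drops it again if the
cmpxchg() loses. A minimal userspace sketch of that ordering follows,
using C11 atomics instead of kernel primitives. Every name in it (struct
obj, obj_hold, obj_put, cache_publish) is hypothetical, the "freed" flag
stands in for actually freeing the object, and the caller passes in the
previously observed pointer (the patch reads it as "orig = *p") so the
losing path can be exercised without threads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted dst: only the count matters here. */
struct obj {
	atomic_int refcnt;
	bool freed;	/* set instead of calling free(), so callers can inspect it */
};

static void obj_hold(struct obj *o)
{
	atomic_fetch_add(&o->refcnt, 1);
}

/* Drop one reference; dropping the last one marks the object freed. */
static void obj_put(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refcnt, 1);

	assert(old > 0);	/* a put without a matching hold is a bug */
	if (old == 1)
		o->freed = true;
}

/*
 * Try to publish @new in @slot, given that the caller last saw @seen there.
 * The slot's reference on @new is taken BEFORE the compare-exchange, so
 * another thread that swaps @new back out right after it becomes visible
 * can drop the slot's reference without the count hitting zero early.
 */
static bool cache_publish(struct obj *_Atomic *slot, struct obj *seen,
			  struct obj *new)
{
	struct obj *expected = seen;

	obj_hold(new);
	if (atomic_compare_exchange_strong(slot, &expected, new)) {
		if (seen)
			obj_put(seen);	/* slot no longer references the old object */
		return true;
	}
	obj_put(new);	/* lost the race: undo the speculative hold */
	return false;
}
```

With this shape, an object that is both cached and in use by one caller
carries a count of 2, and it is only marked freed once the cache slot and
the caller have both dropped their references.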