From patchwork Tue Apr 19 15:25:03 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hans Schillstrom
X-Patchwork-Id: 91995
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by ozlabs.org (Postfix) with ESMTP id 4D3DEB6FEA
	for ; Wed, 20 Apr 2011 01:25:48 +1000 (EST)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752737Ab1DSPZR (ORCPT );
	Tue, 19 Apr 2011 11:25:17 -0400
Received: from smtp-gw21.han.skanova.net ([81.236.55.21]:50976 "EHLO
	smtp-gw21.han.skanova.net" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1752702Ab1DSPZP (ORCPT );
	Tue, 19 Apr 2011 11:25:15 -0400
Received: from dmz.mlab.se (213.65.94.224) by smtp-gw21.han.skanova.net
	(8.5.133) id 4D651A8801A3CF63; Tue, 19 Apr 2011 17:25:13 +0200
Received: from quad.mlab.se (quad.mlab.se [172.24.1.70])
	by dmz.mlab.se (8.13.8/8.13.8) with ESMTP id p3JFPAr2024021;
	Tue, 19 Apr 2011 17:25:10 +0200
Received: from quad.mlab.se (localhost.localdomain [127.0.0.1])
	by quad.mlab.se (8.14.4/8.14.4) with ESMTP id p3JFP9oG029214;
	Tue, 19 Apr 2011 17:25:09 +0200
Received: (from hans@localhost)
	by quad.mlab.se (8.14.4/8.14.4/Submit) id p3JFP8OV029212;
	Tue, 19 Apr 2011 17:25:08 +0200
From: Hans Schillstrom
To: horms@verge.net.au, ja@ssi.bg, ebiederm@xmission.com,
	lvs-devel@vger.kernel.org, netdev@vger.kernel.org,
	netfilter-devel@vger.kernel.org
Cc: hans.schillstrom@ericsson.com, Hans Schillstrom
Subject: [PATCH 1/3] IPVS: Change of socket usage to enable name space exit.
Date: Tue, 19 Apr 2011 17:25:03 +0200
Message-Id: <1303226705-29178-1-git-send-email-hans@schillstrom.com>
X-Mailer: git-send-email 1.7.2.3
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

This is the first patch in a series of three.
Cleanup does not work when IPVS is not shut down cleanly with ipvsadm.
Killing a namespace leaves IPVS hanging; this series fixes that.

If the sync daemons are running in a namespace when it crashes or gets
killed, there is no way to stop them short of a reboot.

Kernel threads should not increment the namespace use count through
their sockets. Calling sk_change_net() after creating the socket avoids
this. sock_release() can't be used on such a socket; sk_release_kernel()
should be used instead.

Thanks to Eric W Biederman.

This patch is based on net-next-2.6 ver 2.6.39-rc2

Signed-off-by: Hans Schillstrom
---
 net/netfilter/ipvs/ip_vs_sync.c |   28 +++++++++++++++++++---------
 1 files changed, 19 insertions(+), 9 deletions(-)

-- 
1.7.2.3

diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index 3e7961e..3f87555 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -1309,7 +1309,12 @@ static struct socket *make_send_sock(struct net *net)
 		pr_err("Error during creation of socket; terminating\n");
 		return ERR_PTR(result);
 	}
-
+	/*
+	 * Kernel sockets that are a part of a namespace, should not
+	 * hold a reference to a namespace in order to allow to stop it.
+	 * After sk_change_net should be released using sk_release_kernel.
+	 */
+	sk_change_net(sock->sk, net);
 	result = set_mcast_if(sock->sk, ipvs->master_mcast_ifn);
 	if (result < 0) {
 		pr_err("Error setting outbound mcast interface\n");
@@ -1334,8 +1339,8 @@ static struct socket *make_send_sock(struct net *net)
 
 	return sock;
 
-  error:
-	sock_release(sock);
+error:
+	sk_release_kernel(sock->sk);
 	return ERR_PTR(result);
 }
 
@@ -1355,7 +1360,12 @@ static struct socket *make_receive_sock(struct net *net)
 		pr_err("Error during creation of socket; terminating\n");
 		return ERR_PTR(result);
 	}
-
+	/*
+	 * Kernel sockets that are a part of a namespace, should not
+	 * hold a reference to a namespace in order to allow to stop it.
+	 * After sk_change_net should be released using sk_release_kernel.
+	 */
+	sk_change_net(sock->sk, net);
 	/* it is equivalent to the REUSEADDR option in user-space */
 	sock->sk->sk_reuse = 1;
 
@@ -1377,8 +1387,8 @@ static struct socket *make_receive_sock(struct net *net)
 
 	return sock;
 
-  error:
-	sock_release(sock);
+error:
+	sk_release_kernel(sock->sk);
 	return ERR_PTR(result);
 }
 
@@ -1473,7 +1483,7 @@ static int sync_thread_master(void *data)
 		ip_vs_sync_buff_release(sb);
 
 	/* release the sending multicast socket */
-	sock_release(tinfo->sock);
+	sk_release_kernel(tinfo->sock->sk);
 	kfree(tinfo);
 
 	return 0;
@@ -1513,7 +1523,7 @@ static int sync_thread_backup(void *data)
 	}
 
 	/* release the sending multicast socket */
-	sock_release(tinfo->sock);
+	sk_release_kernel(tinfo->sock->sk);
 	kfree(tinfo->buf);
 	kfree(tinfo);
 
@@ -1601,7 +1611,7 @@ outtinfo:
 outbuf:
 	kfree(buf);
 outsocket:
-	sock_release(sock);
+	sk_release_kernel(sock->sk);
 out:
 	return result;
 }
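
For readers who want to see why the extra call matters, here is a small
userspace sketch of the reference counting described in the commit
message. It is an illustration only, not kernel code: every model_*
name is hypothetical, the "counted" flag does not exist in the kernel,
and the socket is created directly in the target namespace to keep the
example short. It shows that a socket which keeps its counted reference
pins the namespace, while one that has gone through the
sk_change_net()-like step does not.

```c
/* Userspace model only; none of these are the kernel functions. */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct model_net {
	int count;		/* counted references pinning the namespace */
};

struct model_sock {
	struct model_net *net;
	bool counted;		/* model-only flag: is a counted ref held? */
};

/* Models socket creation: the new socket pins its namespace. */
static struct model_sock *model_sock_create(struct model_net *net)
{
	struct model_sock *sk = malloc(sizeof(*sk));

	if (!sk)
		return NULL;
	sk->net = net;
	sk->counted = true;
	net->count++;
	return sk;
}

/* Models sk_change_net(): drop the counted ref, keep a plain pointer. */
static void model_sk_change_net(struct model_sock *sk, struct model_net *net)
{
	if (sk->counted)
		sk->net->count--;
	sk->net = net;
	sk->counted = false;
}

/* Models the release path: only drops a reference that is still held. */
static void model_sk_release(struct model_sock *sk)
{
	if (sk->counted)
		sk->net->count--;
	free(sk);
}

/* Can the namespace exit while the sync daemon's socket is still open? */
static bool model_ns_can_exit_with_open_socket(bool use_change_net)
{
	struct model_net ns = { .count = 0 };
	struct model_sock *sk = model_sock_create(&ns);
	bool can_exit;

	assert(sk != NULL);
	if (use_change_net)
		model_sk_change_net(sk, &ns);
	can_exit = (ns.count == 0);
	model_sk_release(sk);
	return can_exit;
}
```

In this model, model_ns_can_exit_with_open_socket(false) reports that
the namespace is still pinned, and the same call with true reports that
it is free to exit. The model folds both release variants into one
function by checking the flag. As far as I understand the kernel code,
there is no such flag there, which is why a socket that went through
sk_change_net() has to be paired with sk_release_kernel() rather than
sock_release().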