From patchwork Mon Jul 13 08:54:47 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Konstantin Khlebnikov
X-Patchwork-Id: 494275
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by ozlabs.org (Postfix) with ESMTP id 9D32C140778
	for ; Mon, 13 Jul 2015 18:56:55 +1000 (AEST)
Authentication-Results: ozlabs.org;
	dkim=fail reason="signature verification failed" (1024-bit key; unprotected)
	header.d=yandex-team.ru header.i=@yandex-team.ru header.b=Rp0PC3jU;
	dkim-atps=neutral
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751511AbbGMI4v (ORCPT );
	Mon, 13 Jul 2015 04:56:51 -0400
Received: from forward-corp1m.cmail.yandex.net ([5.255.216.100]:51258
	"EHLO forward-corp1m.cmail.yandex.net" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1751004AbbGMI4u (ORCPT );
	Mon, 13 Jul 2015 04:56:50 -0400
Received: from smtpcorp1m.mail.yandex.net (smtpcorp1m.mail.yandex.net [77.88.61.150])
	by forward-corp1m.cmail.yandex.net (Yandex) with ESMTP id 87BBF60B36;
	Mon, 13 Jul 2015 11:54:48 +0300 (MSK)
Received: from smtpcorp1m.mail.yandex.net (localhost [127.0.0.1])
	by smtpcorp1m.mail.yandex.net (Yandex) with ESMTP id 1DD672CA02EE;
	Mon, 13 Jul 2015 11:54:48 +0300 (MSK)
Received: from unknown (unknown [2a02:6b8:0:408:55be:d924:24e7:8b42])
	by smtpcorp1m.mail.yandex.net (nwsmtp/Yandex) with ESMTPSA id p20AhH0fMP-smTSJPWA;
	Mon, 13 Jul 2015 11:54:48 +0300
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(Client certificate not present)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru;
	s=default; t=1436777688;
	bh=Vacs9upr1Ynwf/OpTA/FYwvy8vJ5mfXLN9VY69YvlAk=;
	h=Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject:
	 References:In-Reply-To:Content-Type:Content-Transfer-Encoding;
	b=Rp0PC3jUGl9Pt4OeilHXbkRDbDSvRYQXdjJtgO0snNtigh57VALTyf7H8RcWoYwDa
	 KyhG1g6JU7LIBJYu+PKUsDaM6bLZSxZPp6waL5HskT2hGqHSas40bARFdezo0HcSyH
	 t+bRVj3W+vK32Vq9cXy8Zx5OkKxs5IlCYxxSRY6M=
Authentication-Results: smtpcorp1m.mail.yandex.net; dkim=pass
	header.i=@yandex-team.ru
Message-ID: <55A37CD7.9050104@yandex-team.ru>
Date: Mon, 13 Jul 2015 11:54:47 +0300
From: Konstantin Khlebnikov
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: Herbert Xu
CC: netdev@vger.kernel.org, "David S. Miller" , Eric Dumazet
Subject: Re: [PATCH] netlink: enable skb header refcounting before sending first broadcast
References: <20150710115141.12980.88829.stgit@buzz> <20150713072352.GA8485@gondor.apana.org.au>
In-Reply-To: <20150713072352.GA8485@gondor.apana.org.au>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

On 13.07.2015 10:23, Herbert Xu wrote:
> On Fri, Jul 10, 2015 at 02:51:41PM +0300, Konstantin Khlebnikov wrote:
>> This fixes race between non-atomic updates of adjacent bit-fields:
>> skb->cloned could be lost because netlink broadcast clones skb after
>> sending it to the first listener who sets skb->peeked at the same skb.
>> As a result atomic refcounting of skb header stays disabled and
>> skb_release_data() frees it twice. Race leads to double-free in kmalloc-xxx.
>>
>> Signed-off-by: Konstantin Khlebnikov
>> Fixes: b19372273164 ("net: reorganize sk_buff for faster __copy_skb_header()")
>> ---
>>  net/netlink/af_netlink.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
>> index dea925388a5b..921e0d8dfe3a 100644
>> --- a/net/netlink/af_netlink.c
>> +++ b/net/netlink/af_netlink.c
>> @@ -2028,6 +2028,12 @@ int netlink_broadcast_filtered(struct sock *ssk, struct sk_buff *skb, u32 portid
>>  	info.tx_filter = filter;
>>  	info.tx_data = filter_data;
>>
>> +	/* Enable atomic refcounting in skb_release_data() before first send:
>> +	 * non-atomic set of that bit-field in __skb_clone() could race with
>> +	 * __skb_recv_datagram() which touches the same set of bit-fields.
>> +	 */
>> +	skb->cloned = 1;
>> +
>>  	/* While we sleep in clone, do not allow to change socket list */
>>
>>  	netlink_lock_table();
>
> Your effort in finding this bug is wonderful. However I think
> the fix is a bit dirty.
>
> The real issue here is that the recv path no longer handles shared
> skbs. So either we need to fix the recv path to not touch skbs
> without cloning them, or we need to get rid of the use of shared
> skbs in netlink.

I don't think the recv path should care about shared skbs -- an skb
can be delivered to only one socket anyway.

A less dirty fix for that: do not send the original skb.
That adds one extra clone but makes the code much cleaner.
>
> In fact it looks I introduced the bug way back in
>
> commit a59322be07c964e916d15be3df473fb7ba20c41e
> Author: Herbert Xu
> Date: Wed Dec 5 01:53:40 2007 -0800
>
>     [UDP]: Only increment counter on first peek/recv
>
> I will try to mend this error :)
>
> Cheers,
>

--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1957,17 +1957,16 @@ static void do_one_broadcast(struct sock *sk,
 	}
 
 	sock_hold(sk);
-	if (p->skb2 == NULL) {
-		if (skb_shared(p->skb)) {
-			p->skb2 = skb_clone(p->skb, p->allocation);
-		} else {
-			p->skb2 = skb_get(p->skb);
-			/*
-			 * skb ownership may have been set when
-			 * delivered to a previous socket.
-			 */
-			skb_orphan(p->skb2);
-		}
+	if (p->skb2 == NULL || skb_shared(p->skb2)) {
+		kfree_skb(p->skb2);
+		p->skb2 = skb_clone(p->skb, p->allocation);
+	} else {
+		skb_get(p->skb2);
+		/*
+		 * skb ownership may have been set when
+		 * delivered to a previous socket.
+		 */
+		skb_orphan(p->skb2);
 	}
 	if (p->skb2 == NULL) {
 		netlink_overrun(sk);
@@ -1997,7 +1996,6 @@ static void do_one_broadcast(struct sock *sk,
 	} else {
 		p->congested |= val;
 		p->delivered = 1;
-		p->skb2 = NULL;
 	}
 out:
 	sock_put(sk);
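
For readers following the thread, the lost update described in the quoted commit message can be replayed deterministically in user space. The sketch below is only a model, not kernel code: struct fake_skb, the FLAG_* masks and the helper names are hypothetical, and the two flags are packed into a single byte on the assumption that a non-atomic bit-field write compiles to a load, a modify and a store of the shared storage unit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for two adjacent bit-fields of struct sk_buff. */
#define FLAG_CLONED 0x01u	/* models skb->cloned */
#define FLAG_PEEKED 0x02u	/* models skb->peeked */

struct fake_skb {
	uint8_t flags;		/* both flags share one storage unit */
};

/* First half of a non-atomic bit-field write: read the whole unit. */
uint8_t flags_load(const struct fake_skb *skb)
{
	return skb->flags;
}

/* Second half: write back the (possibly stale) snapshot with one bit set. */
void flags_store(struct fake_skb *skb, uint8_t snapshot, uint8_t bit)
{
	skb->flags = (uint8_t)(snapshot | bit);
}

/* Racy order: the receiver and the cloning sender interleave their writes. */
uint8_t replay_racy_interleaving(void)
{
	struct fake_skb skb = { 0 };
	uint8_t receiver_view = flags_load(&skb);	/* recv path reads flags */
	uint8_t sender_view = flags_load(&skb);		/* clone path reads flags */

	flags_store(&skb, sender_view, FLAG_CLONED);	/* sender sets cloned */
	flags_store(&skb, receiver_view, FLAG_PEEKED);	/* stale store drops cloned */
	return skb.flags;
}

/* Order enforced by the original patch: cloned is set before any receiver
 * can see the skb, so the receiver's snapshot already contains it. */
uint8_t replay_fixed_ordering(void)
{
	struct fake_skb skb = { 0 };
	uint8_t receiver_view;

	flags_store(&skb, flags_load(&skb), FLAG_CLONED);
	receiver_view = flags_load(&skb);
	flags_store(&skb, receiver_view, FLAG_PEEKED);
	return skb.flags;
}
```

Calling replay_racy_interleaving() from a small main() returns only FLAG_PEEKED, while replay_fixed_ordering() returns both bits. Sending only clones, as in the diff above, aims at the same effect from the other side: the original skb never reaches a receiver that could write to its flags.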