From patchwork Tue Jan 26 07:45:52 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Zhu Yanjun
X-Patchwork-Id: 573112
X-Patchwork-Delegate: davem@davemloft.net
Subject: Re: [PATCH] net: take care of bonding in build_skb_flow_key (v4)
To: Jiri Pirko, Wengang Wang
References: <1453354378-3018-1-git-send-email-wen.gang.wang@oracle.com>
 <20160121083506.GA2251@nanopsycho.orion> <56A1AE48.4000908@oracle.com>
 <20160122065207.GA2211@nanopsycho.orion>
Cc: netdev@vger.kernel.org, sd@queasysnail.net, jay.vosburgh@canonical.com,
 zhuyj
From: zhuyj
Message-ID: <56A72430.4030107@gmail.com>
Date: Tue, 26 Jan 2016 15:45:52 +0800
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:38.0) Gecko/20100101
 Thunderbird/38.4.0
In-Reply-To: <20160122065207.GA2211@nanopsycho.orion>
X-Mailing-List: netdev@vger.kernel.org

On 01/22/2016 02:52 PM, Jiri Pirko wrote:
> Fri, Jan 22, 2016 at 05:21:28AM CET, wen.gang.wang@oracle.com wrote:
>>
>> On 2016-01-21 16:35, Jiri Pirko wrote:
>>> Thu, Jan 21, 2016 at 06:32:58AM CET, wen.gang.wang@oracle.com wrote:
>>>> In a bonding setting, we determine fragment size according to MTU and
>>>> PMTU associated to the bonding master.
>>>> If the slave finds the fragment size is too big, it drops the fragment
>>>> and calls ip_rt_update_pmtu(), passing _skb_ and _pmtu_, trying to
>>>> update the path MTU.
>>>> Problem is that the target device that function ip_rt_update_pmtu actually
>>>> tries to update is the slave (skb->dev), not the master. Thus since no
>>>> PMTU change happens on master, the fragment size for later packets doesn't
>>>> change so all later fragments/packets are dropped too.
>>>>
>>>> The fix is letting build_skb_flow_key() take care of the transition of
>>>> device index from bonding slave to the master. That makes the master become
>>>> the target device that ip_rt_update_pmtu tries to update PMTU to.
>>>>
>>>> Signed-off-by: Wengang Wang
>>>> ---
>>>>  net/ipv4/route.c | 9 +++++++++
>>>>  1 file changed, 9 insertions(+)
>>>>
>>>> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
>>>> index 85f184e..7e766b5 100644
>>>> --- a/net/ipv4/route.c
>>>> +++ b/net/ipv4/route.c
>>>> @@ -524,10 +524,19 @@ static void build_skb_flow_key(struct flowi4 *fl4, const struct sk_buff *skb,
>>>>  {
>>>>  	const struct iphdr *iph = ip_hdr(skb);
>>>>  	int oif = skb->dev->ifindex;
>>>> +	struct net_device *master;
>>>>  	u8 tos = RT_TOS(iph->tos);
>>>>  	u8 prot = iph->protocol;
>>>>  	u32 mark = skb->mark;
>>>>
>>>> +	if (netif_is_bond_slave(skb->dev)) {
>>>> +		rcu_read_lock();
>>>> +		master = netdev_master_upper_dev_get_rcu(skb->dev);
>>>> +		if (master)
>>>> +			oif = master->ifindex;
>>>> +		rcu_read_unlock();
>>>> +	}
>>> This is certainly not correct as it should not be bond-specific but
>>> rather generic.
>> Then what would you suggest to fix it?
>>> Note that you may have bond over bond or bridge over
>>> bond or other scenarios, which this patch ignores.
>> I don't think bond over bond is a good configuration. Do you have a real use
>> case for that configuration?
> Stacking of multiple master devices is absolutely common.
>
> You have to go in the upper tree all the way up, for all master device
> types.

I am not sure whether the following works or not. It is just a test patch.

Thanks a lot.
Zhu Yanjun

>
>
>> thanks,
>> wengang
>>

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 85f184e..12b4982 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -523,10 +523,19 @@ static void build_skb_flow_key(struct flowi4 *fl4, const struct sk_buff *skb,
 			       const struct sock *sk)
 {
 	const struct iphdr *iph = ip_hdr(skb);
-	int oif = skb->dev->ifindex;
+	struct net_device *master = NULL;
 	u8 tos = RT_TOS(iph->tos);
 	u8 prot = iph->protocol;
 	u32 mark = skb->mark;
+	int oif = skb->dev->ifindex;
+
+	if (skb->dev->flags & IFF_SLAVE) {
+		rcu_read_lock();
+		master = skb_dst(skb)->dev;
+		if (master)
+			oif = master->ifindex;
+		rcu_read_unlock();
+	}
 
 	__build_flow_key(fl4, sk, iph, oif, tos, prot, mark, 0);
 }
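
For reference, Jiri's remark about going up the upper tree all the way can
be illustrated outside the kernel. What follows is only a minimal userspace
sketch, not kernel code and not a proposed fix: struct model_dev,
model_top_master() and model_flow_oif() are hypothetical names made up for
this illustration, and the single "master" pointer is an assumed
simplification of how the kernel links a device to its upper devices.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for a network device. Only the fields
 * needed for the illustration are modeled; this is not struct net_device.
 */
struct model_dev {
	const char *name;
	int ifindex;
	struct model_dev *master;	/* NULL when the device has no master */
};

/* Follow the master links until a device without a master is reached. */
static const struct model_dev *model_top_master(const struct model_dev *dev)
{
	assert(dev != NULL);

	while (dev->master)
		dev = dev->master;

	return dev;
}

/*
 * Pick the interface index for the flow key: the index of the topmost
 * master, or the device's own index when it is not enslaved at all.
 */
static int model_flow_oif(const struct model_dev *dev)
{
	return model_top_master(dev)->ifindex;
}
```

With eth0 enslaved to bond0 and bond0 in turn attached to br0, the walk ends
at br0, so model_flow_oif() yields the ifindex of br0 rather than that of
bond0. A lookup that stops at the first master would return bond0, which is
the stacked case the review comment says the original patch ignores.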