From patchwork Thu Dec 22 01:30:16 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mahesh Bandewar
X-Patchwork-Id: 708015
X-Patchwork-Delegate: davem@davemloft.net
From: Mahesh Bandewar
To: netdev, Eric Dumazet, David Miller
Cc: Mahesh Bandewar
Subject: [PATCH net] ipvlan: fix multicast processing
Date: Wed, 21 Dec 2016 17:30:16 -0800
Message-Id: <1482370216-14833-1-git-send-email-mahesh@bandewar.net>
X-Mailer: git-send-email 2.8.0.rc3.226.g39d4020
Sender: netdev-owner@vger.kernel.org
X-Mailing-List: netdev@vger.kernel.org

From: Mahesh Bandewar

In an IPvlan setup, the kernel crashes when the master device is put
in loopback mode, e.g.

  ethtool -K eth0 set loopback on

where eth0 is the master device for the IPvlan setup.

The failure is caused by faulty logic that decides whether a packet
came from the TX path or the RX path just by looking at the MAC
addresses on the packet while processing multicast packets.

In loopback mode, where this crash was happening, packets that are
sent out are reflected by the NIC and processed on the RX path, but
the MAC-address check is tricked into treating them as TX-path
packets and wrongly uses dev_forward_skb() to pass them to the slave
(virtual) devices.

This patch records the path when a packet is queued and eliminates
the MAC-address comparison for that decision.

------------[ cut here ]------------
kernel BUG at include/linux/skbuff.h:1737!
Call Trace:
 [] dev_forward_skb+0x92/0xd0
 [] ipvlan_process_multicast+0x395/0x4c0 [ipvlan]
 [] ? ipvlan_process_multicast+0xd7/0x4c0 [ipvlan]
 [] ? process_one_work+0x147/0x660
 [] process_one_work+0x1a9/0x660
 [] ? process_one_work+0x147/0x660
 [] worker_thread+0x11d/0x360
 [] ? rescuer_thread+0x350/0x350
 [] kthread+0xdb/0xe0
 [] ? _raw_spin_unlock_irq+0x30/0x50
 [] ? flush_kthread_worker+0xc0/0xc0
 [] ret_from_fork+0x9a/0xd0
 [] ? flush_kthread_worker+0xc0/0xc0

Fixes: ba35f8588f47 ("ipvlan: Defer multicast / broadcast processing to a work-queue")
Signed-off-by: Mahesh Bandewar
CC: Eric Dumazet
---
Note that this is on top of Eric's patch sent earlier.

 drivers/net/ipvlan/ipvlan.h      |  5 +++++
 drivers/net/ipvlan/ipvlan_core.c | 26 +++++++++++++++-----------
 2 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h
index 031093e1c25f..dbfbb33ac66c 100644
--- a/drivers/net/ipvlan/ipvlan.h
+++ b/drivers/net/ipvlan/ipvlan.h
@@ -99,6 +99,11 @@ struct ipvl_port {
 	int			count;
 };
 
+struct ipvl_skb_cb {
+	bool tx_pkt;
+};
+#define IPVL_SKB_CB(_skb) ((struct ipvl_skb_cb *)&((_skb)->cb[0]))
+
 static inline struct ipvl_port *ipvlan_port_get_rcu(const struct net_device *d)
 {
 	return rcu_dereference(d->rx_handler_data);
diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
index ea6bc1e12cdf..83ce74acf82d 100644
--- a/drivers/net/ipvlan/ipvlan_core.c
+++ b/drivers/net/ipvlan/ipvlan_core.c
@@ -198,7 +198,7 @@ void ipvlan_process_multicast(struct work_struct *work)
 	unsigned int mac_hash;
 	int ret;
 	u8 pkt_type;
-	bool hlocal, dlocal;
+	bool tx_pkt;
 
 	__skb_queue_head_init(&list);
 
@@ -211,7 +211,7 @@ void ipvlan_process_multicast(struct work_struct *work)
 		bool consumed = false;
 
 		ethh = eth_hdr(skb);
-		hlocal = ether_addr_equal(ethh->h_source, port->dev->dev_addr);
+		tx_pkt = IPVL_SKB_CB(skb)->tx_pkt;
 		mac_hash = ipvlan_mac_hash(ethh->h_dest);
 
 		if (ether_addr_equal(ethh->h_dest, port->dev->broadcast))
@@ -219,13 +219,10 @@ void ipvlan_process_multicast(struct work_struct *work)
 		else
 			pkt_type = PACKET_MULTICAST;
 
-		dlocal = false;
 		rcu_read_lock();
 		list_for_each_entry_rcu(ipvlan, &port->ipvlans, pnode) {
-			if (hlocal && (ipvlan->dev == dev)) {
-				dlocal = true;
+			if (tx_pkt && (ipvlan->dev == skb->dev))
 				continue;
-			}
 			if (!test_bit(mac_hash, ipvlan->mac_filters))
 				continue;
 			if (!(ipvlan->dev->flags & IFF_UP))
@@ -238,7 +235,7 @@ void ipvlan_process_multicast(struct work_struct *work)
 				consumed = true;
 				nskb->pkt_type = pkt_type;
 				nskb->dev = ipvlan->dev;
-				if (hlocal)
+				if (tx_pkt)
 					ret = dev_forward_skb(ipvlan->dev, nskb);
 				else
 					ret = netif_rx(nskb);
@@ -248,7 +245,7 @@ void ipvlan_process_multicast(struct work_struct *work)
 		}
 		rcu_read_unlock();
 
-		if (dlocal) {
+		if (tx_pkt) {
 			/* If the packet originated here, send it out. */
 			skb->dev = port->dev;
 			skb->pkt_type = pkt_type;
@@ -480,13 +477,20 @@ static int ipvlan_process_outbound(struct sk_buff *skb)
 }
 
 static void ipvlan_multicast_enqueue(struct ipvl_port *port,
-				     struct sk_buff *skb)
+				     struct sk_buff *skb, bool tx_pkt)
 {
 	if (skb->protocol == htons(ETH_P_PAUSE)) {
 		kfree_skb(skb);
 		return;
 	}
 
+	/* Record that the deferred packet is from TX or RX path. Looking
+	 * at mac-addresses on the packet will lead to erroneous decisions.
+	 * (This would be true for a loopback-mode on master device or a
+	 * hair-pin mode of the switch.)
+	 */
+	IPVL_SKB_CB(skb)->tx_pkt = tx_pkt;
+
 	spin_lock(&port->backlog.lock);
 	if (skb_queue_len(&port->backlog) < IPVLAN_QBACKLOG_LIMIT) {
 		if (skb->dev)
@@ -549,7 +553,7 @@ static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
 
 	} else if (is_multicast_ether_addr(eth->h_dest)) {
 		ipvlan_skb_crossing_ns(skb, NULL);
-		ipvlan_multicast_enqueue(ipvlan->port, skb);
+		ipvlan_multicast_enqueue(ipvlan->port, skb, true);
 		return NET_XMIT_SUCCESS;
 	}
 
@@ -646,7 +650,7 @@ static rx_handler_result_t ipvlan_handle_mode_l2(struct sk_buff **pskb,
 			 */
 			if (nskb) {
 				ipvlan_skb_crossing_ns(nskb, NULL);
-				ipvlan_multicast_enqueue(port, nskb);
+				ipvlan_multicast_enqueue(port, nskb, false);
 			}
 		}
 	} else {
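
Illustrative note, not part of the patch: the fix relies on tagging each
queued packet with its origin instead of inferring the origin later from
MAC addresses. The user-space sketch below models only that idea. All
fake_* names are hypothetical stand-ins rather than kernel symbols, and
the sketch copies the control block with memcpy() where the patch casts
skb->cb in place, an assumption made to keep the example free of
aliasing concerns outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define FAKE_CB_SIZE 48	/* the kernel's skb->cb is a 48-byte scratch area */

/* Hypothetical stand-in for struct sk_buff; only the scratch area is modelled. */
struct fake_skb {
	char cb[FAKE_CB_SIZE];
};

/* Mirrors the shape of struct ipvl_skb_cb from the patch. */
struct fake_cb {
	bool tx_pkt;
};

_Static_assert(sizeof(struct fake_cb) <= FAKE_CB_SIZE,
	       "control block must fit in cb[]");

/* Hypothetical labels for the two delivery choices made in the worker. */
enum fake_delivery { FAKE_FORWARD, FAKE_NETIF_RX };

/* Enqueue side: the caller knows which path it is on, so it records it. */
static void fake_multicast_enqueue(struct fake_skb *skb, bool tx_pkt)
{
	struct fake_cb cb = { .tx_pkt = tx_pkt };

	memcpy(skb->cb, &cb, sizeof(cb));
}

/* Worker side: read the recorded flag; no MAC addresses are inspected. */
static enum fake_delivery fake_process_multicast(const struct fake_skb *skb)
{
	struct fake_cb cb;

	memcpy(&cb, skb->cb, sizeof(cb));
	return cb.tx_pkt ? FAKE_FORWARD : FAKE_NETIF_RX;
}
```

In this model a packet enqueued with tx_pkt set to true is always
delivered with FAKE_FORWARD and one enqueued with false is always
delivered with FAKE_NETIF_RX, regardless of what addresses the frame
carries. That independence is what makes a frame reflected by a
loopback-mode NIC safe to handle on the RX path.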