From patchwork Thu Mar 14 10:15:50 2019
X-Patchwork-Submitter: Sabrina Dubroca
X-Patchwork-Id: 1056435
X-Patchwork-Delegate: davem@davemloft.net
From: Sabrina Dubroca
To: netdev@vger.kernel.org
Cc: Eric Dumazet, Sabrina Dubroca, Jianlin Shi, Stefano Brivio
Subject: [PATCH net] net: enforce xmit_recursion for devices with a queue
Date: Thu, 14 Mar 2019 11:15:50 +0100
Message-Id: <6d9ed6c448a5c855e05abf19c205f33a66b6ff40.1552557395.git.sd@queasysnail.net>

Commit 745e20f1b626 ("net: add a recursion limit in xmit path") introduced
a recursion limit, but it only applies to devices without a queue. Virtual
devices with a queue (either because they don't have the IFF_NO_QUEUE flag,
or because the administrator added one) can still cause unbounded
recursion, via __dev_queue_xmit -> __dev_xmit_skb -> qdisc_run ->
__qdisc_run -> qdisc_restart -> sch_direct_xmit -> dev_hard_start_xmit.
Jianlin reported this in a setup with 16 gretap devices stacked on top of
one another.

This patch prevents the stack overflow by incrementing xmit_recursion in
the code paths that can call dev_hard_start_xmit() (as commit 745e20f1b626
did). If the recursion limit is exceeded, the packet is enqueued and the
qdisc is scheduled instead.

Reported-by: Jianlin Shi
Signed-off-by: Sabrina Dubroca
Reviewed-by: Stefano Brivio
---
No Fixes tag; I think this is at least as old as the issue that
745e20f1b626 ("net: add a recursion limit in xmit path") fixed.
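For anyone unfamiliar with the pattern, below is a rough user-space sketch
of the recursion guard this patch applies. It is an illustration only, not
kernel code: the names defer_to_softirq() and xmit_one_level() are made up,
and the real code uses a per-CPU xmit_recursion counter and
__netif_schedule() rather than a plain static variable.

/*
 * Rough user-space sketch (not kernel code) of the recursion guard:
 * a depth counter is bumped around the direct-transmit path, and once
 * the limit is exceeded the packet is deferred instead of recursing
 * further down a stack of virtual devices.
 */
#include <stdio.h>

#define XMIT_RECURSION_LIMIT 8	/* stand-in for the kernel's limit */

static int xmit_recursion;	/* stand-in for the per-CPU counter */

/* stand-in for enqueueing the skb and scheduling the qdisc */
static void defer_to_softirq(int pkt)
{
	printf("packet %d deferred at depth %d\n", pkt, xmit_recursion);
}

/* each "level" models one stacked virtual device re-entering xmit */
static void xmit_one_level(int pkt, int levels_left)
{
	if (xmit_recursion > XMIT_RECURSION_LIMIT) {
		defer_to_softirq(pkt);
		return;
	}

	xmit_recursion++;
	if (levels_left > 0)
		xmit_one_level(pkt, levels_left - 1);
	else
		printf("packet %d transmitted at depth %d\n", pkt, xmit_recursion);
	xmit_recursion--;
}

int main(void)
{
	xmit_one_level(1, 4);	/* shallow stack: transmitted directly */
	xmit_one_level(2, 16);	/* deep stack (16 stacked devices): deferred */
	return 0;
}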
 include/net/pkt_sched.h | 14 ++++++++++++++
 net/core/dev.c          |  4 +++-
 net/sched/sch_generic.c |  7 +++++++
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index a16fbe9a2a67..9384c1749bf3 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -113,6 +113,20 @@ bool sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q,
 		     struct net_device *dev, struct netdev_queue *txq,
 		     spinlock_t *root_lock, bool validate);
 
+static inline bool sch_direct_xmit_rec(struct sk_buff *skb, struct Qdisc *q,
+				       struct net_device *dev,
+				       struct netdev_queue *txq,
+				       spinlock_t *root_lock, bool validate)
+{
+	bool ret;
+
+	__this_cpu_inc(xmit_recursion);
+	ret = sch_direct_xmit(skb, q, dev, txq, root_lock, validate);
+	__this_cpu_dec(xmit_recursion);
+
+	return ret;
+}
+
 void __qdisc_run(struct Qdisc *q);
 
 static inline void qdisc_run(struct Qdisc *q)
diff --git a/net/core/dev.c b/net/core/dev.c
index 2b67f2aa59dd..74b77a773712 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3493,6 +3493,8 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
 		__qdisc_drop(skb, &to_free);
 		rc = NET_XMIT_DROP;
 	} else if ((q->flags & TCQ_F_CAN_BYPASS) && !qdisc_qlen(q) &&
+		   likely(__this_cpu_read(xmit_recursion) <=
+			  XMIT_RECURSION_LIMIT) &&
 		   qdisc_run_begin(q)) {
 		/*
 		 * This is a work-conserving queue; there are no old skbs
@@ -3502,7 +3504,7 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
 
 		qdisc_bstats_update(q, skb);
 
-		if (sch_direct_xmit(skb, q, dev, txq, root_lock, true)) {
+		if (sch_direct_xmit_rec(skb, q, dev, txq, root_lock, true)) {
 			if (unlikely(contended)) {
 				spin_unlock(&q->busylock);
 				contended = false;
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index a117d9260558..2565ef18d4cc 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -395,6 +395,12 @@ void __qdisc_run(struct Qdisc *q)
 	int quota = dev_tx_weight;
 	int packets;
 
+	if (unlikely(__this_cpu_read(xmit_recursion) > XMIT_RECURSION_LIMIT)) {
+		__netif_schedule(q);
+		return;
+	}
+
+	__this_cpu_inc(xmit_recursion);
 	while (qdisc_restart(q, &packets)) {
 		/*
 		 * Ordered by possible occurrence: Postpone processing if
@@ -407,6 +413,7 @@ void __qdisc_run(struct Qdisc *q)
 			break;
 		}
 	}
+	__this_cpu_dec(xmit_recursion);
 }
 
 unsigned long dev_trans_start(struct net_device *dev)