Message ID | a59f92670c72db738d91b639ecc72ef8daf69300.1585866258.git.marcelo.leitner@gmail.com |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Series | [net] net: sched: reduce amount of log messages in act_mirred |
From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Date: Thu, 2 Apr 2020 19:26:12 -0300

> @@ -245,8 +245,8 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
>  	}
>
>  	if (unlikely(!(dev->flags & IFF_UP))) {
> -		net_notice_ratelimited("tc mirred to Houston: device %s is down\n",
> -				       dev->name);
> +		pr_notice_once("tc mirred: device %s is down\n",
> +			       dev->name);

This reduction is too extreme.

If someone causes this problem, reconfigures everything thinking that the
problem will be fixed, they won't see this message the second time and
mistakenly think it's working.
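To make the objection concrete, here is a minimal userspace sketch of the two logging policies being compared. It is only a model: `log_once()`, `log_ratelimited()`, and both structs are hypothetical names invented for illustration, not the kernel's `pr_notice_once()` or `net_notice_ratelimited()` macros, and the interval and burst values are arbitrary ticks rather than kernel defaults.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; these helpers imitate the policies, not the kernel macros. */

struct once_state {
	bool printed;		/* set after the first message, never cleared */
};

/* "Log once": returns true only for the very first call. */
static bool log_once(struct once_state *s)
{
	if (s->printed)
		return false;
	s->printed = true;
	return true;
}

struct ratelimit_model {
	long interval;		/* window length, in abstract ticks */
	int burst;		/* messages allowed per window */
	long begin;		/* start of the current window */
	int printed;		/* messages emitted in the current window */
};

/* "Rate limited": at most `burst` messages per `interval`, then silence
 * until a new window opens. Returns true if the message would be emitted.
 */
static bool log_ratelimited(struct ratelimit_model *rs, long now)
{
	assert(rs->interval > 0 && rs->burst > 0);

	if (now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
	}
	if (rs->printed >= rs->burst)
		return false;
	rs->printed++;
	return true;
}
```

In this model, a second incident long after the first is still reported by `log_ratelimited()` but never by `log_once()`, which is the case described above: after reconfiguring, the admin gets no further signal from the log.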
On Thu, Apr 02, 2020 at 06:04:17PM -0700, David Miller wrote:
> From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> Date: Thu, 2 Apr 2020 19:26:12 -0300
>
> > @@ -245,8 +245,8 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
> >  	}
> >
> >  	if (unlikely(!(dev->flags & IFF_UP))) {
> > -		net_notice_ratelimited("tc mirred to Houston: device %s is down\n",
> > -				       dev->name);
> > +		pr_notice_once("tc mirred: device %s is down\n",
> > +			       dev->name);
>
> This reduction is too extreme.
>
> If someone causes this problem, reconfigures everything thinking that the
> problem will be fixed, they won't see this message the second time and
> mistakenly think it's working.

Fair point. Then what about removing it entirely? printk's are not the
best way to debug packet drops anyway and the action already registers
the drops in its stats.

Or perhaps a marker in the message, stating that it is logged only
once per boot. I'm leaning to the one above, to just remove it.
On Thu, Apr 2, 2020 at 6:14 PM Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> wrote:
>
> On Thu, Apr 02, 2020 at 06:04:17PM -0700, David Miller wrote:
> > From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> > Date: Thu, 2 Apr 2020 19:26:12 -0300
> >
> > > @@ -245,8 +245,8 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
> > >  	}
> > >
> > >  	if (unlikely(!(dev->flags & IFF_UP))) {
> > > -		net_notice_ratelimited("tc mirred to Houston: device %s is down\n",
> > > -				       dev->name);
> > > +		pr_notice_once("tc mirred: device %s is down\n",
> > > +			       dev->name);
> >
> > This reduction is too extreme.
> >
> > If someone causes this problem, reconfigures everything thinking that the
> > problem will be fixed, they won't see this message the second time and
> > mistakenly think it's working.
>
> Fair point. Then what about removing it entirely? printk's are not the
> best way to debug packet drops anyway and the action already registers
> the drops in its stats.
>
> Or perhaps a marker in the message, stating that it is logged only
> once per boot. I'm leaning to the one above, to just remove it.

I think the reason why we print that is we do not handle NETDEV_DOWN
event in mirred_device_event() or check IFF_UP in tcf_mirred_init().
I think if we can do both, we can remove this message entirely.

I am not sure whether the latter would break existing expectations,
as users may want to add a down device as a target and bring it up
afterward.

Thanks.
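One possible reading of this suggestion is that the "device is down" condition gets noticed on the event path, once per state change, so the per-packet path only has to count drops. The sketch below is a userspace model of that idea under that assumption. Every name in it (`struct mirred_model`, `model_device_event()`, `model_act()`, `EV_DOWN`, `EV_UP`) is hypothetical, and it does not show what `mirred_device_event()` or `tcf_mirred_init()` actually do in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */

enum dev_event { EV_DOWN, EV_UP };

struct mirred_model {
	bool target_up;		/* cached target state, updated from events */
	unsigned long drops;	/* stands in for the action's drop statistics */
	unsigned int notices;	/* messages that would have reached the log */
};

/* Event path: runs once per state change, so a notice here cannot flood. */
static void model_device_event(struct mirred_model *m, enum dev_event ev)
{
	assert(m);

	if (ev == EV_DOWN && m->target_up) {
		m->target_up = false;
		m->notices++;	/* one notice per up-to-down transition */
	} else if (ev == EV_UP) {
		m->target_up = true;
	}
}

/* Packet path: drops silently and only bumps the counter.
 * Returns true if the packet would have been forwarded.
 */
static bool model_act(struct mirred_model *m)
{
	assert(m);

	if (!m->target_up) {
		m->drops++;
		return false;
	}
	return true;
}
```

In this model a thousand packets sent to a down target produce a thousand counted drops but a single notice, and a later down event is reported again. It deliberately leaves open the question raised above about targets that are added while down and brought up afterward.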
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index 83dd82fc9f40ce800b99eae5c0b279dce5b2c1c9..bd1e2c98aaaefc689e52840b9be53ef9de4dd86d 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -245,8 +245,8 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
 	}

 	if (unlikely(!(dev->flags & IFF_UP))) {
-		net_notice_ratelimited("tc mirred to Houston: device %s is down\n",
-				       dev->name);
+		pr_notice_once("tc mirred: device %s is down\n",
+			       dev->name);
 		goto out;
 	}
The OVS bridge is usually left down, so when using OVS offload it is
quite common to trigger this message. Some cards, for example, can't
offload broadcasts because they can't output to more than 2 ports. Due
to this, act_mirred will try to output to the OVS bridge itself, which
is often down, and this floods the log. (Yes, the ratelimit is not
enough.)

As act_mirred is already incrementing the overlimit counter for each
drop, there is no need to keep flooding the logs here. Let's log it
once, warn the sysadmin, and let the counters do the rest.

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
---
 net/sched/act_mirred.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)