From patchwork Sun Mar 3 08:57:29 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eran Ben Elisha
X-Patchwork-Id: 1050815
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: patchwork-incoming-netdev@ozlabs.org
Delivered-To: patchwork-incoming-netdev@ozlabs.org
Authentication-Results: ozlabs.org;
 spf=none (mailfrom) smtp.mailfrom=vger.kernel.org
 (client-ip=209.132.180.67; helo=vger.kernel.org;
 envelope-from=netdev-owner@vger.kernel.org; receiver=)
Authentication-Results: ozlabs.org;
 dmarc=fail (p=none dis=none) header.from=mellanox.com
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by ozlabs.org (Postfix) with ESMTP id 44BxsX3Fnlz9s4Y
 for ; Sun, 3 Mar 2019 19:58:56 +1100 (AEDT)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1726017AbfCCI5v (ORCPT ); Sun, 3 Mar 2019 03:57:51 -0500
Received: from mail-il-dmz.mellanox.com ([193.47.165.129]:36984 "EHLO
 mellanox.co.il" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
 id S1725933AbfCCI5u (ORCPT ); Sun, 3 Mar 2019 03:57:50 -0500
Received: from Internal Mail-Server by MTLPINE1 (envelope-from
 eranbe@mellanox.com) with ESMTPS (AES256-SHA encrypted);
 3 Mar 2019 10:57:39 +0200
Received: from dev-l-vrt-198.mtl.labs.mlnx (dev-l-vrt-198.mtl.labs.mlnx
 [10.134.198.1]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP
 id x238vcjd005449; Sun, 3 Mar 2019 10:57:39 +0200
From: Eran Ben Elisha
To: netdev@vger.kernel.org, "David S. Miller"
Cc: Saeed Mahameed, Jiri Pirko, Eran Ben Elisha
Subject: [PATCH net-next 1/3] devlink: Update reporter state to error even
 if recover aborted
Date: Sun, 3 Mar 2019 10:57:29 +0200
Message-Id: <1551603451-22637-2-git-send-email-eranbe@mellanox.com>
X-Mailer: git-send-email 1.8.4.3
In-Reply-To: <1551603451-22637-1-git-send-email-eranbe@mellanox.com>
References: <1551603451-22637-1-git-send-email-eranbe@mellanox.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

If devlink_health_report() aborted the recover flow due to the grace
period checker, it left the reporter state as
DEVLINK_HEALTH_REPORTER_STATE_HEALTHY, which is a bug. Fix that by always
setting the reporter state to DEVLINK_HEALTH_REPORTER_STATE_ERROR prior to
running the checker mentioned above.

In addition, save the previous health_state in a temporary variable, then
use it in the abort check comparison instead of reporter->health_state,
which has already been overwritten by that point.
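
The ordering described above can be modelled in plain userspace C. This is only a sketch under stated assumptions: `model_reporter`, `model_state` and `model_health_report()` are hypothetical stand-ins invented for illustration, time is passed in as milliseconds instead of being read from jiffies, and the dump and recover steps are left out entirely.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; not the real devlink API. */
enum model_state { MODEL_STATE_HEALTHY, MODEL_STATE_ERROR };

struct model_reporter {
	enum model_state health_state;
	bool auto_recover;
	uint64_t last_recovery_ms;   /* time of the last recovery */
	uint64_t graceful_period_ms; /* minimum gap between recoveries */
	uint64_t error_count;
};

/*
 * Models the patched abort check. Returns -ECANCELED when recovery would
 * be aborted and 0 when it would go ahead. Dump and recovery themselves
 * are not modelled, so the state stays ERROR on both paths.
 */
static int model_health_report(struct model_reporter *r, uint64_t now_ms)
{
	/* Remember the state before this report overwrites it. */
	enum model_state prev_state = r->health_state;

	r->error_count++;
	r->health_state = MODEL_STATE_ERROR;

	/* The check must use the saved state, not the one just written. */
	if (r->auto_recover &&
	    (prev_state != MODEL_STATE_HEALTHY ||
	     now_ms - r->last_recovery_ms < r->graceful_period_ms))
		return -ECANCELED;

	return 0;
}
```

With a healthy reporter whose last recovery was 100 ms ago and a 500 ms grace period, the call returns -ECANCELED and the model still ends up in the error state, which is the behaviour the fix is after.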

Fixes: c8e1da0bf923 ("devlink: Add health report functionality")
Signed-off-by: Eran Ben Elisha
Acked-by: Jiri Pirko
---
 net/core/devlink.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/core/devlink.c b/net/core/devlink.c
index 6515fbec0dcd..376e01a70c6d 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -4569,16 +4569,19 @@ static int devlink_health_do_dump(struct devlink_health_reporter *reporter,
 int devlink_health_report(struct devlink_health_reporter *reporter,
 			  const char *msg, void *priv_ctx)
 {
+	enum devlink_health_reporter_state prev_health_state;
 	struct devlink *devlink = reporter->devlink;

 	/* write a log message of the current error */
 	WARN_ON(!msg);
 	trace_devlink_health_report(devlink, reporter->ops->name, msg);
 	reporter->error_count++;
+	prev_health_state = reporter->health_state;
+	reporter->health_state = DEVLINK_HEALTH_REPORTER_STATE_ERROR;

 	/* abort if the previous error wasn't recovered */
 	if (reporter->auto_recover &&
-	    (reporter->health_state != DEVLINK_HEALTH_REPORTER_STATE_HEALTHY ||
+	    (prev_health_state != DEVLINK_HEALTH_REPORTER_STATE_HEALTHY ||
 	     jiffies - reporter->last_recovery_ts <
 	     msecs_to_jiffies(reporter->graceful_period))) {
 		trace_devlink_health_recover_aborted(devlink,

From patchwork Sun Mar 3 08:57:30 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eran Ben Elisha
X-Patchwork-Id: 1050814
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: patchwork-incoming-netdev@ozlabs.org
Delivered-To: patchwork-incoming-netdev@ozlabs.org
Authentication-Results: ozlabs.org;
 spf=none (mailfrom) smtp.mailfrom=vger.kernel.org
 (client-ip=209.132.180.67; helo=vger.kernel.org;
 envelope-from=netdev-owner@vger.kernel.org; receiver=)
Authentication-Results: ozlabs.org;
 dmarc=fail (p=none dis=none) header.from=mellanox.com
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by ozlabs.org (Postfix) with ESMTP id 44BxsW5Z9Dz9sBF
 for ; Sun, 3 Mar 2019 19:58:55 +1100 (AEDT)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1726046AbfCCI5v (ORCPT ); Sun, 3 Mar 2019 03:57:51 -0500
Received: from mail-il-dmz.mellanox.com ([193.47.165.129]:36983 "EHLO
 mellanox.co.il" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
 id S1725971AbfCCI5v (ORCPT ); Sun, 3 Mar 2019 03:57:51 -0500
Received: from Internal Mail-Server by MTLPINE1 (envelope-from
 eranbe@mellanox.com) with ESMTPS (AES256-SHA encrypted);
 3 Mar 2019 10:57:39 +0200
Received: from dev-l-vrt-198.mtl.labs.mlnx (dev-l-vrt-198.mtl.labs.mlnx
 [10.134.198.1]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP
 id x238vcje005449; Sun, 3 Mar 2019 10:57:39 +0200
From: Eran Ben Elisha
To: netdev@vger.kernel.org, "David S. Miller"
Cc: Saeed Mahameed, Jiri Pirko, Eran Ben Elisha
Subject: [PATCH net-next 2/3] devlink: Add support for direct reporter
 health state update
Date: Sun, 3 Mar 2019 10:57:30 +0200
Message-Id: <1551603451-22637-3-git-send-email-eranbe@mellanox.com>
X-Mailer: git-send-email 1.8.4.3
In-Reply-To: <1551603451-22637-1-git-send-email-eranbe@mellanox.com>
References: <1551603451-22637-1-git-send-email-eranbe@mellanox.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

A reporter's state may need to be updated because of a recover flow that
was not triggered by a devlink health operation, but happened as a side
effect of some other operation in the system. Expose a devlink health API
for a direct update of a reporter's state.

Move the devlink_health_reporter_state enum definition to devlink.h so it
can be used by drivers as a parameter of
devlink_health_reporter_state_update.

In addition, add trace_devlink_health_reporter_state_update to notify
users of a reporter state change.
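
The semantics of the direct update (reject unknown states, do nothing when the state is unchanged, notify only on a real transition) can be sketched in userspace C. The names below are hypothetical and chosen for illustration; in particular, the `printf()` call and the `notifications` counter merely stand in for the tracepoint, and the kernel's WARN_ON() is reduced to a silent return.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the kernel types; not the real devlink API. */
enum model_state { MODEL_STATE_HEALTHY, MODEL_STATE_ERROR };

struct model_reporter {
	const char *name;
	enum model_state health_state;
	unsigned int notifications; /* counts emitted state-change events */
};

/* Models a direct state update requested by a driver. */
static void model_state_update(struct model_reporter *r,
			       enum model_state state)
{
	/* Reject anything that is not a known state. */
	if (state != MODEL_STATE_HEALTHY && state != MODEL_STATE_ERROR)
		return;

	/* No transition, so no notification. */
	if (r->health_state == state)
		return;

	r->health_state = state;
	r->notifications++;
	/* Stand-in for the state update tracepoint. */
	printf("reporter_name=%s: new_state=%d\n", r->name, (int)state);
}
```

Calling the helper twice with the same state emits a single notification, so a driver can call it unconditionally from a success path without flooding the trace.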

Signed-off-by: Eran Ben Elisha
Acked-by: Jiri Pirko
---
 include/net/devlink.h          | 14 ++++++++++++++
 include/trace/events/devlink.h | 31 +++++++++++++++++++++++++++++++
 net/core/devlink.c             | 22 +++++++++++++++++-----
 3 files changed, 62 insertions(+), 5 deletions(-)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index 7f5a0bdca228..63de99e09f04 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -447,6 +447,11 @@ typedef void devlink_snapshot_data_dest_t(const void *data);
 struct devlink_fmsg;
 struct devlink_health_reporter;

+enum devlink_health_reporter_state {
+	DEVLINK_HEALTH_REPORTER_STATE_HEALTHY,
+	DEVLINK_HEALTH_REPORTER_STATE_ERROR,
+};
+
 /**
  * struct devlink_health_reporter_ops - Reporter operations
  * @name: reporter name
@@ -715,6 +720,9 @@ void *
 devlink_health_reporter_priv(struct devlink_health_reporter *reporter);
 int devlink_health_report(struct devlink_health_reporter *reporter,
 			  const char *msg, void *priv_ctx);
+void
+devlink_health_reporter_state_update(struct devlink_health_reporter *reporter,
+				     enum devlink_health_reporter_state state);

 void devlink_compat_running_version(struct net_device *dev,
 				    char *buf, size_t len);
@@ -1204,6 +1212,12 @@ devlink_health_report(struct devlink_health_reporter *reporter,
 	return 0;
 }

+static inline void
+devlink_health_reporter_state_update(struct devlink_health_reporter *reporter,
+				     enum devlink_health_reporter_state state)
+{
+}
+
 static inline void
 devlink_compat_running_version(struct net_device *dev, char *buf, size_t len)
 {
diff --git a/include/trace/events/devlink.h b/include/trace/events/devlink.h
index 191ddf67d769..6f60a78d9a7e 100644
--- a/include/trace/events/devlink.h
+++ b/include/trace/events/devlink.h
@@ -140,6 +140,37 @@ TRACE_EVENT(devlink_health_recover_aborted,
 		  __entry->time_since_last_recover)
 );

+/*
+ * Tracepoint for devlink health reporter state update:
+ */
+TRACE_EVENT(devlink_health_reporter_state_update,
+	TP_PROTO(const struct devlink *devlink, const char *reporter_name,
+		 bool new_state),
+
+	TP_ARGS(devlink, reporter_name, new_state),
+
+	TP_STRUCT__entry(
+		__string(bus_name, devlink->dev->bus->name)
+		__string(dev_name, dev_name(devlink->dev))
+		__string(driver_name, devlink->dev->driver->name)
+		__string(reporter_name, reporter_name)
+		__field(u8, new_state)
+	),
+
+	TP_fast_assign(
+		__assign_str(bus_name, devlink->dev->bus->name);
+		__assign_str(dev_name, dev_name(devlink->dev));
+		__assign_str(driver_name, devlink->dev->driver->name);
+		__assign_str(reporter_name, reporter_name);
+		__entry->new_state = new_state;
+	),
+
+	TP_printk("bus_name=%s dev_name=%s driver_name=%s reporter_name=%s: new_state=%d",
+		  __get_str(bus_name), __get_str(dev_name),
+		  __get_str(driver_name), __get_str(reporter_name),
+		  __entry->new_state)
+);
+
 #endif /* _TRACE_DEVLINK_H */

 /* This part must be outside protection */
diff --git a/net/core/devlink.c b/net/core/devlink.c
index 376e01a70c6d..78e22cea4cc7 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -4409,11 +4409,6 @@ struct devlink_health_reporter {
 	u64 last_recovery_ts;
 };

-enum devlink_health_reporter_state {
-	DEVLINK_HEALTH_REPORTER_STATE_HEALTHY,
-	DEVLINK_HEALTH_REPORTER_STATE_ERROR,
-};
-
 void *
 devlink_health_reporter_priv(struct devlink_health_reporter *reporter)
 {
@@ -4498,6 +4493,23 @@ devlink_health_reporter_destroy(struct devlink_health_reporter *reporter)
 }
 EXPORT_SYMBOL_GPL(devlink_health_reporter_destroy);

+void
+devlink_health_reporter_state_update(struct devlink_health_reporter *reporter,
+				     enum devlink_health_reporter_state state)
+{
+	if (WARN_ON(state != DEVLINK_HEALTH_REPORTER_STATE_HEALTHY &&
+		    state != DEVLINK_HEALTH_REPORTER_STATE_ERROR))
+		return;
+
+	if (reporter->health_state == state)
+		return;
+
+	reporter->health_state = state;
+	trace_devlink_health_reporter_state_update(reporter->devlink,
+						   reporter->ops->name, state);
+}
+EXPORT_SYMBOL_GPL(devlink_health_reporter_state_update);
+
 static int
 devlink_health_reporter_recover(struct devlink_health_reporter *reporter,
 				void *priv_ctx)

From patchwork Sun Mar 3 08:57:31 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eran Ben Elisha
X-Patchwork-Id: 1050816
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: patchwork-incoming-netdev@ozlabs.org
Delivered-To: patchwork-incoming-netdev@ozlabs.org
Authentication-Results: ozlabs.org;
 spf=none (mailfrom) smtp.mailfrom=vger.kernel.org
 (client-ip=209.132.180.67; helo=vger.kernel.org;
 envelope-from=netdev-owner@vger.kernel.org; receiver=)
Authentication-Results: ozlabs.org;
 dmarc=fail (p=none dis=none) header.from=mellanox.com
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by ozlabs.org (Postfix) with ESMTP id 44BxsY1Hs6z9sBF
 for ; Sun, 3 Mar 2019 19:58:57 +1100 (AEDT)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1725985AbfCCI5v (ORCPT ); Sun, 3 Mar 2019 03:57:51 -0500
Received: from mail-il-dmz.mellanox.com ([193.47.165.129]:36986 "EHLO
 mellanox.co.il" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
 id S1725950AbfCCI5u (ORCPT ); Sun, 3 Mar 2019 03:57:50 -0500
Received: from Internal Mail-Server by MTLPINE1 (envelope-from
 eranbe@mellanox.com) with ESMTPS (AES256-SHA encrypted);
 3 Mar 2019 10:57:39 +0200
Received: from dev-l-vrt-198.mtl.labs.mlnx (dev-l-vrt-198.mtl.labs.mlnx
 [10.134.198.1]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP
 id x238vcjf005449; Sun, 3 Mar 2019 10:57:39 +0200
From: Eran Ben Elisha
To: netdev@vger.kernel.org, "David S. Miller"
Cc: Saeed Mahameed, Jiri Pirko, Eran Ben Elisha
Subject: [PATCH net-next 3/3] net/mlx5e: Update tx reporter status in case
 channels were successfully opened
Date: Sun, 3 Mar 2019 10:57:31 +0200
Message-Id: <1551603451-22637-4-git-send-email-eranbe@mellanox.com>
X-Mailer: git-send-email 1.8.4.3
In-Reply-To: <1551603451-22637-1-git-send-email-eranbe@mellanox.com>
References: <1551603451-22637-1-git-send-email-eranbe@mellanox.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

Once channels have been successfully opened, update the tx reporter health
state to healthy. This is needed for the following scenario:
- An SQ has an unrecovered error reported to devlink health, which leaves
  the tx reporter in the error state.
- The current channels (including this SQ) are closed.
- New channels are opened.

After that flow, the original error is "solved", and the tx reporter state
should be healthy. However, as it was resolved as a side effect, and not
via the tx reporter recover method, the driver needs to inform devlink
health about it.

Signed-off-by: Eran Ben Elisha
Acked-by: Jiri Pirko
Acked-by: Saeed Mahameed
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index e5f74eb986b3..b5fdbd3190d9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2302,6 +2302,10 @@ int mlx5e_open_channels(struct mlx5e_priv *priv,
 		goto err_close_channels;
 	}

+	if (!IS_ERR_OR_NULL(priv->tx_reporter))
+		devlink_health_reporter_state_update(priv->tx_reporter,
+						     DEVLINK_HEALTH_REPORTER_STATE_HEALTHY);
+
 	kvfree(cparam);
 	return 0;
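
The scenario from the 3/3 commit message (error reported, channels closed, channels reopened) can be walked through with a small userspace model. Everything here is a hypothetical sketch: `model_priv`, `model_open_channels()` and the `hw_ok` flag are invented for illustration, a plain NULL check stands in for IS_ERR_OR_NULL(), and no real mlx5 behaviour is reproduced.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real driver API. */
enum model_state { MODEL_STATE_HEALTHY, MODEL_STATE_ERROR };

struct model_reporter {
	enum model_state health_state;
};

struct model_priv {
	struct model_reporter *tx_reporter; /* NULL if it was never created */
	bool channels_open;
};

/* Direct state update, reduced to the part that matters here. */
static void model_state_update(struct model_reporter *r,
			       enum model_state state)
{
	if (r->health_state != state)
		r->health_state = state;
}

/*
 * Models opening channels. hw_ok is a hypothetical knob that decides
 * whether the open succeeds. Returns 0 on success and -1 on failure.
 */
static int model_open_channels(struct model_priv *priv, bool hw_ok)
{
	if (!hw_ok)
		return -1; /* error path: reporter state is left alone */

	priv->channels_open = true;

	/* Fresh channels mean the old SQ error no longer applies. */
	if (priv->tx_reporter != NULL)
		model_state_update(priv->tx_reporter, MODEL_STATE_HEALTHY);

	return 0;
}
```

In this model a failed open leaves the reporter in the error state, while a successful one flips it to healthy without the reporter's own recover path ever running.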