From patchwork Thu May 11 10:03:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frode Nordahl X-Patchwork-Id: 1779957 X-Patchwork-Delegate: horms@verge.net.au Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.137; helo=smtp4.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=canonical.com header.i=@canonical.com header.a=rsa-sha256 header.s=20210705 header.b=CwJ7shRj; dkim-atps=neutral Received: from smtp4.osuosl.org (smtp4.osuosl.org [140.211.166.137]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4QH6tY4by6z20fn for ; Thu, 11 May 2023 20:04:05 +1000 (AEST) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id B0A2D416B7; Thu, 11 May 2023 10:04:02 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp4.osuosl.org B0A2D416B7 Authentication-Results: smtp4.osuosl.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=canonical.com header.i=@canonical.com header.a=rsa-sha256 header.s=20210705 header.b=CwJ7shRj X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id u_ZbLm81MR42; Thu, 11 May 2023 10:03:59 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [IPv6:2605:bc80:3010:104::8cd3:938]) by smtp4.osuosl.org (Postfix) with ESMTPS id EA0F940AB9; Thu, 11 May 2023 10:03:57 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp4.osuosl.org EA0F940AB9 Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id B9F4BC0036; Thu, 11 May 2023 10:03:57 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp3.osuosl.org (smtp3.osuosl.org [140.211.166.136]) by lists.linuxfoundation.org (Postfix) with ESMTP id 33A13C002A for ; Thu, 11 May 2023 10:03:56 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp3.osuosl.org (Postfix) with ESMTP id F3B7B60B25 for ; Thu, 11 May 2023 10:03:55 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp3.osuosl.org F3B7B60B25 Authentication-Results: smtp3.osuosl.org; dkim=pass (2048-bit key) header.d=canonical.com header.i=@canonical.com header.a=rsa-sha256 header.s=20210705 header.b=CwJ7shRj X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp3.osuosl.org ([127.0.0.1]) by localhost (smtp3.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 1eEfA7wgl0tN for ; Thu, 11 May 2023 10:03:54 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0 DKIM-Filter: OpenDKIM Filter v2.11.0 smtp3.osuosl.org A870560AAA Received: from smtp-relay-canonical-1.canonical.com (smtp-relay-canonical-1.canonical.com [185.125.188.121]) by smtp3.osuosl.org (Postfix) with ESMTPS id A870560AAA for ; Thu, 11 May 2023 10:03:53 +0000 (UTC) Received: from frode-threadripper.. (2.general.frode.uk.vpn [10.172.193.251]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by smtp-relay-canonical-1.canonical.com (Postfix) with ESMTPSA id 0F0E842A92; Thu, 11 May 2023 10:03:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=canonical.com; s=20210705; t=1683799431; bh=XjCsBz6/P9V8WG0MXEkw+P3HlJteuukfP2LH/XLzNgs=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=CwJ7shRjahx2jRMaEjwjW1TQQuQxQRnnVqT8o2pgQn+9MNRjJnNGIZgnOL92hRxpl iFe1lmlXZLk61hW1KRHOGFuXHobQguWSCL+MnCVOdVKG0EGy9XqBM6yuGDJ6jP1rF2 zbTINe0ahYRcr/ILV/WWqDmvqLnNvnA1aUF7ayGLjeT+0nNea9H2F6AcOaP+BciGUE 4yTDFtGIrDWXUs0S/a/ZYMpMYpqp5uq8IWAMH0qQrGDOEhZ+wB8JnphL2lVP5D7iwL uqkvuKl08ilNMTgxBPJCYMZMTKcPfUL8pMT19/+g98hKg/+brbNPYORdaRCt4nEBbs TDmd+0OSiUvEQ== From: Frode Nordahl To: dev@openvswitch.org Date: Thu, 11 May 2023 12:03:50 +0200 Message-Id: <20230511100350.3898835-1-frode.nordahl@canonical.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: References: MIME-Version: 1.0 Cc: Simon Horman , Flavio Leitner , Paul Blakey Subject: [ovs-dev] [PATCH v2] tc: fix crash on EAGAIN return from recvmsg on netlink socket. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" The tc module combines the use of the `tc_transact` helper function for communication with the in-kernel tc infrastructure with assertions on the reply data by `ofpbuf_at_assert` on the received data prior to further processing. `tc_transact` in turn calls `nl_transact`, which via `nl_transact_multiple__` ultimately calls and handles return value from `recvmsg`. On error a check for EAGAIN is performed and a consequence of this condition is effectively to provide a non-error (0) result and an empty reply buffer. Before this change, the `tc_transact` and, consumers of it, were unaware of this condition. The use of assertions on the reply buffer can as such lead to a fatal crash of OVS. To be fair, the behavior of `nl_transact` when handling an EAGAIN return is currently not documented, so this change also adds that documentation. While fixing the problem, it led me to find potential problems with several of the one-time initialization functions in the netdev-offload-tc module. Before this patch they would happily accept temporary failure as permanent one, with consequences for the remaining lifetime of the process. For the record, the symptom of the crash is this in the log: EMER|../include/openvswitch/ofpbuf.h:194: assertion offset + size <= b->size failed in ofpbuf_at_assert() And an excerpt of the backtrace looks like this: 0x0000561dac1396d1 in ofpbuf_at_assert (b=0x7fb650005d20, b=0x7fb650005d20, offset=16, size=20) at ../include/openvswitch/ofpbuf.h:194 tc_replace_flower (id=, flower=) at ../lib/tc.c:3223 0x0000561dac128155 in netdev_tc_flow_put (netdev=0x561dacf91840, match=, actions=, actions_len=, ufid=, info=, stats=) at ../lib/netdev-offload-tc.c:2096 0x0000561dac117541 in netdev_flow_put (stats=, info=0x7fb65b7ba780, ufid=, act_len=, actions=, match=0x7fb65b7ba980, netdev=0x561dacf91840) at ../lib/netdev-offload.c:257 parse_flow_put (put=0x7fb65b7bcc50, dpif=0x561dad0ad550) at ../lib/dpif-netlink.c:2297 try_send_to_netdev (op=0x7fb65b7bcc48, dpif=0x561dad0ad550) at ../lib/dpif-netlink.c:2384 Reported-At: https://launchpad.net/bugs/2018500 Fixes f98e418fbdb6 (tc: Add tc flower functions) Fixes 407556ac6c90 (netlink-socket: Avoid forcing a reply for final message in a transaction.) Signed-off-by: Frode Nordahl --- lib/dpif-netlink.c | 4 +++- lib/dpif.c | 2 +- lib/netdev-linux.c | 40 ++++++++++++++++++++++++++-------------- lib/netdev-offload-tc.c | 31 ++++++++++++++++++++++--------- lib/netlink-socket.c | 5 +++++ lib/tc.c | 18 ++++++++++++++++++ 6 files changed, 75 insertions(+), 25 deletions(-) diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c index 60bd39643..75537b2f8 100644 --- a/lib/dpif-netlink.c +++ b/lib/dpif-netlink.c @@ -2340,7 +2340,9 @@ parse_flow_put(struct dpif_netlink *dpif, struct dpif_flow_put *put) } netdev_set_hw_info(oor_netdev, HW_INFO_TYPE_OOR, true); } - level = (err == ENOSPC || err == EOPNOTSUPP) ? VLL_DBG : VLL_ERR; + level = (err == EAGAIN || + err == ENOSPC || + err == EOPNOTSUPP) ? VLL_DBG : VLL_ERR; VLOG_RL(&rl, level, "failed to offload flow: %s: %s", ovs_strerror(err), (oor_netdev ? oor_netdev->name : dev->name)); diff --git a/lib/dpif.c b/lib/dpif.c index 3305401fe..2bbb0aac4 100644 --- a/lib/dpif.c +++ b/lib/dpif.c @@ -1751,7 +1751,7 @@ flow_message_log_level(int error) * Kernels that support flow wildcarding will reject these flows as * duplicates (EEXIST), so lower the log level to debug for these * types of messages. */ - return (error && error != EEXIST) ? VLL_WARN : VLL_DBG; + return (error && error != EEXIST && error != EAGAIN) ? VLL_WARN : VLL_DBG; } static bool diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c index 36620199e..a6065bf69 100644 --- a/lib/netdev-linux.c +++ b/lib/netdev-linux.c @@ -5817,8 +5817,11 @@ tc_get_policer_action(uint32_t index, struct ofputil_meter_stats *stats) error = tc_transact(&request, &replyp); if (error) { - VLOG_ERR_RL(&rl, "Failed to dump police action (index: %u), err=%d", - index, error); + VLOG_RL(&rl, + (error == EAGAIN) ? VLL_DBG : VLL_ERR, + "Failed to dump police action (index: %u), err=%d", + index, + error); return error; } @@ -5849,8 +5852,11 @@ tc_del_policer_action(uint32_t index, struct ofputil_meter_stats *stats) error = tc_transact(&request, &replyp); if (error) { - VLOG_ERR_RL(&rl, "Failed to delete police action (index: %u), err=%d", - index, error); + VLOG_RL(&rl, + (error == EAGAIN) ? VLL_DBG : VLL_ERR, + "Failed to delete police action (index: %u), err=%d", + index, + error); return error; } @@ -6114,11 +6120,13 @@ tc_query_class(const struct netdev *netdev, error = tc_transact(&request, replyp); if (error) { - VLOG_WARN_RL(&rl, "query %s class %u:%u (parent %u:%u) failed (%s)", - netdev_get_name(netdev), - tc_get_major(handle), tc_get_minor(handle), - tc_get_major(parent), tc_get_minor(parent), - ovs_strerror(error)); + VLOG_RL(&rl, + (error == EAGAIN) ? VLL_DBG : VLL_WARN, + "query %s class %u:%u (parent %u:%u) failed (%s)", + netdev_get_name(netdev), + tc_get_major(handle), tc_get_minor(handle), + tc_get_major(parent), tc_get_minor(parent), + ovs_strerror(error)); } return error; } @@ -6140,10 +6148,12 @@ tc_delete_class(const struct netdev *netdev, unsigned int handle) error = tc_transact(&request, NULL); if (error) { - VLOG_WARN_RL(&rl, "delete %s class %u:%u failed (%s)", - netdev_get_name(netdev), - tc_get_major(handle), tc_get_minor(handle), - ovs_strerror(error)); + VLOG_RL(&rl, + (error == EAGAIN) ? VLL_DBG : VLL_WARN, + "delete %s class %u:%u failed (%s)", + netdev_get_name(netdev), + tc_get_major(handle), tc_get_minor(handle), + ovs_strerror(error)); } return error; } @@ -6262,7 +6272,9 @@ tc_query_qdisc(const struct netdev *netdev_) ops = &tc_ops_other; } } - } else if ((!error && !qdisc->size) || error == ENOENT) { + } else if ((!error && !qdisc->size) || + error == ENOENT || error == EAGAIN) + { /* Either it's a built-in qdisc, or (on Linux pre-2.6.35) it's a qdisc * set up by some other entity that doesn't have a handle 1:0. We will * assume that it's the system default qdisc. */ diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c index 4f26dd8cc..4c15989d6 100644 --- a/lib/netdev-offload-tc.c +++ b/lib/netdev-offload-tc.c @@ -2505,10 +2505,11 @@ netdev_tc_flow_get(struct netdev *netdev, err = tc_get_flower(&id, &flower); if (err) { - VLOG_ERR_RL(&error_rl, - "flow get failed (dev %s prio %d handle %d): %s", - netdev_get_name(netdev), id.prio, id.handle, - ovs_strerror(err)); + VLOG_RL(&error_rl, + (err == EAGAIN) ? VLL_DBG : VLL_ERR, + "flow get failed (dev %s prio %d handle %d): %s", + netdev_get_name(netdev), id.prio, id.handle, + ovs_strerror(err)); return err; } @@ -2594,7 +2595,9 @@ probe_multi_mask_per_prio(int ifindex) memset(&flower.mask.dst_mac, 0xff, sizeof flower.mask.dst_mac); id1 = tc_make_tcf_id(ifindex, block_id, prio, TC_INGRESS); - error = tc_replace_flower(&id1, &flower); + do { + error = tc_replace_flower(&id1, &flower); + } while (error == EAGAIN); if (error) { goto out; } @@ -2603,7 +2606,9 @@ probe_multi_mask_per_prio(int ifindex) memset(&flower.mask.src_mac, 0xff, sizeof flower.mask.src_mac); id2 = tc_make_tcf_id(ifindex, block_id, prio, TC_INGRESS); - error = tc_replace_flower(&id2, &flower); + do { + error = tc_replace_flower(&id2, &flower); + } while (error == EAGAIN); tc_del_flower_filter(&id1); if (error) { @@ -2625,6 +2630,7 @@ probe_insert_ct_state_rule(int ifindex, uint16_t ct_state, struct tcf_id *id) { int prio = TC_RESERVED_PRIORITY_MAX + 1; struct tc_flower flower; + int error; memset(&flower, 0, sizeof flower); flower.key.ct_state = ct_state; @@ -2634,7 +2640,10 @@ probe_insert_ct_state_rule(int ifindex, uint16_t ct_state, struct tcf_id *id) flower.mask.eth_type = OVS_BE16_MAX; *id = tc_make_tcf_id(ifindex, 0, prio, TC_INGRESS); - return tc_replace_flower(id, &flower); + do { + error = tc_replace_flower(id, &flower); + } while (error == EAGAIN); + return error; } static void @@ -2657,7 +2666,9 @@ probe_ct_state_support(int ifindex) goto out; } - error = tc_get_flower(&id, &flower); + do { + error = tc_get_flower(&id, &flower); + } while (error == EAGAIN); if (error || flower.mask.ct_state != ct_state) { goto out_del; } @@ -2732,7 +2743,9 @@ probe_tc_block_support(int ifindex) memset(&flower.mask.dst_mac, 0xff, sizeof flower.mask.dst_mac); id = tc_make_tcf_id(ifindex, block_id, prio, TC_INGRESS); - error = tc_replace_flower(&id, &flower); + do { + error = tc_replace_flower(&id, &flower); + } while (error == EAGAIN); tc_add_del_qdisc(ifindex, false, block_id, TC_INGRESS); diff --git a/lib/netlink-socket.c b/lib/netlink-socket.c index 80da20d9f..dea060fc3 100644 --- a/lib/netlink-socket.c +++ b/lib/netlink-socket.c @@ -1798,6 +1798,11 @@ nl_pool_release(struct nl_sock *sock) * * 2. Resending the request causes it to be re-executed, so the request * needs to be idempotent. + * + * 3. In the event that the kernel is too busy to handle the request to + * receive the response (i.e. EAGAIN), this function will still return + * 0. The caller can check for this condition by checking for a zero + * size of the 'replyp' ofpbuf buffer. */ int nl_transact(int protocol, const struct ofpbuf *request, diff --git a/lib/tc.c b/lib/tc.c index 5c32c6f97..7df6a10c5 100644 --- a/lib/tc.c +++ b/lib/tc.c @@ -237,11 +237,29 @@ static void request_from_tcf_id(struct tcf_id *id, uint16_t eth_type, } } +/* The `tc_transact` function is a wrapper around `nl_transact` with the + * addition of: + * + * 1. the 'request' ofpbuf buffer is freed after `nl_transact` returns, + * regardless of success or failure. + * + * 2. When a 'replyp' pointer is provided; in the event of the kernel + * being too busy to process the request for the response, a positive + * error return will be provided with the value of EAGAIN. + * + * Please acquaint yourself with the documentation of the `nl_transact` + * function in the netlink-sock module before making use of this function. + */ int tc_transact(struct ofpbuf *request, struct ofpbuf **replyp) { int error = nl_transact(NETLINK_ROUTE, request, replyp); ofpbuf_uninit(request); + + if (!error && replyp && !(*replyp)->size) { + return EAGAIN; + } + return error; }