diff mbox series

[ovs-dev] netdev-offload-dpdk: Change log level on flow offload failure.

Message ID 20240910151136.4010739-1-ktraynor@redhat.com
State Superseded
Delegated to: Eelco Chaudron
Headers show
Series [ovs-dev] netdev-offload-dpdk: Change log level on flow offload failure. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test fail github build: failed

Commit Message

Kevin Traynor Sept. 10, 2024, 3:11 p.m. UTC
Previously when a flow was attempted to be offloaded, if it
could not be offloaded and did not return an actions error,
a warning was logged.

The reason there was an exception for an actions error was to allow
for failure for full offload of a flow because it will fallback to
partial offload. There are some issues with this approach to logging.

Some NICs do not specify an actions error, because other config in
the NIC may be checked first. e.g. In the case of Mellanox CX-5,
there can be different types of offload configured, so an unspecified
error may be returned.

More generally, enabling hw-offload is best effort per datapath/NIC/flow
as full and partial offload support in NICs is variable and there is
always fallback to software.

So there is likely to be repeated logging about offloading of flows
failing. With this in mind, change the log level to debug.

The status of the offload can still be seen with below command:
$ ovs-appctl dpctl/dump-flows -m
... offloaded:partial ...

Also, remove some duplicated rate limiting.

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
---
 lib/netdev-offload-dpdk.c | 23 +++++++++--------------
 1 file changed, 9 insertions(+), 14 deletions(-)

Comments

David Marchand Sept. 11, 2024, 7:37 a.m. UTC | #1
On Tue, Sep 10, 2024 at 5:11 PM Kevin Traynor <ktraynor@redhat.com> wrote:
> -        enum vlog_level level = VLL_WARN;
> -
> -        if (error->type == RTE_FLOW_ERROR_TYPE_ACTION) {
> -            level = VLL_DBG;
> -        }
> -        VLOG_RL(&rl, level, "%s: rte_flow creation failed: %d (%s).",
> -                netdev_get_name(netdev), error->type, error->message);
> -        if (!vlog_should_drop(&this_module, level, &rl)) {
> +        if (!VLOG_DROP_DBG(&rl)) {
> +            VLOG_DBG("%s: rte_flow creation failed: %d (%s).",
> +                     netdev_get_name(netdev), error->type, error->message);
>              dump_flow(&s, &s_extra, attr, flow_patterns, flow_actions);
>              extra_str = ds_cstr(&s_extra);
> -            VLOG_RL(&rl, level, "%s: Failed flow: %s  flow create %d %s",
> -                    netdev_get_name(netdev), extra_str,
> -                    netdev_dpdk_get_port_id(netdev), ds_cstr(&s));
> +            VLOG_DBG("%s: Failed flow: %s  flow create %d %s",
> +                     netdev_get_name(netdev), extra_str,
> +                     netdev_dpdk_get_port_id(netdev), ds_cstr(&s));

I wonder why we had two log messages so far.
In any case, wrt ratelimit, with this change above, I would say we
should log only one message.
Eelco Chaudron Sept. 11, 2024, 8:41 a.m. UTC | #2
On 10 Sep 2024, at 17:11, Kevin Traynor wrote:

> Previously when a flow was attempted to be offloaded, if it
> could not be offloaded and did not return an actions error,
> a warning was logged.
>
> The reason there was an exception for an actions error was to allow
> for failure for full offload of a flow because it will fallback to
> partial offload. There are some issues with this approach to logging.
>
> Some NICs do not specify an actions error, because other config in
> the NIC may be checked first. e.g. In the case of Mellanox CX-5,
> there can be different types of offload configured, so an unspecified
> error may be returned.
>
> More generally, enabling hw-offload is best effort per datapath/NIC/flow
> as full and partial offload support in NICs is variable and there is
> always fallback to software.
>
> So there is likely to be repeated logging about offloading of flows
> failing. With this in mind, change the log level to debug.
>
> The status of the offload can still be seen with below command:
> $ ovs-appctl dpctl/dump-flows -m
> ... offloaded:partial ...
>
> Also, remove some duplicated rate limiting.

Thanks for fixing this, and sending the patch. One comment below on the log messages.

> Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
> ---
>  lib/netdev-offload-dpdk.c | 23 +++++++++--------------
>  1 file changed, 9 insertions(+), 14 deletions(-)
>
> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
> index 623005b1c..378bee161 100644
> --- a/lib/netdev-offload-dpdk.c
> +++ b/lib/netdev-offload-dpdk.c
> @@ -920,22 +920,17 @@ netdev_offload_dpdk_flow_create(struct netdev *netdev,
>              dump_flow(&s, &s_extra, attr, flow_patterns, flow_actions);
>              extra_str = ds_cstr(&s_extra);
> -            VLOG_DBG_RL(&rl, "%s: rte_flow 0x%"PRIxPTR" %s  flow create %d %s",
> -                        netdev_get_name(netdev), (intptr_t) flow, extra_str,
> -                        netdev_dpdk_get_port_id(netdev), ds_cstr(&s));
> +            VLOG_DBG("%s: rte_flow 0x%"PRIxPTR" %s  flow create %d %s",
> +                     netdev_get_name(netdev), (intptr_t) flow, extra_str,
> +                     netdev_dpdk_get_port_id(netdev), ds_cstr(&s));
>          }
>      } else {
> -        enum vlog_level level = VLL_WARN;
> -
> -        if (error->type == RTE_FLOW_ERROR_TYPE_ACTION) {
> -            level = VLL_DBG;
> -        }
> -        VLOG_RL(&rl, level, "%s: rte_flow creation failed: %d (%s).",
> -                netdev_get_name(netdev), error->type, error->message);
> -        if (!vlog_should_drop(&this_module, level, &rl)) {
> +        if (!VLOG_DROP_DBG(&rl)) {
> +            VLOG_DBG("%s: rte_flow creation failed: %d (%s).",
> +                     netdev_get_name(netdev), error->type, error->message);
>              dump_flow(&s, &s_extra, attr, flow_patterns, flow_actions);
>              extra_str = ds_cstr(&s_extra);
> -            VLOG_RL(&rl, level, "%s: Failed flow: %s  flow create %d %s",
> -                    netdev_get_name(netdev), extra_str,
> -                    netdev_dpdk_get_port_id(netdev), ds_cstr(&s));
> +            VLOG_DBG("%s: Failed flow: %s  flow create %d %s",
> +                     netdev_get_name(netdev), extra_str,
> +                     netdev_dpdk_get_port_id(netdev), ds_cstr(&s));

Would it make sense to combine the two debug log messages?
Something in the lines of:

%s: rte_flow creation failed [%d (%s)]: %s  flow create %d %s”


>          }
>      }
> -- 
> 2.46.0
Kevin Traynor Sept. 11, 2024, 8:57 a.m. UTC | #3
On 11/09/2024 08:37, David Marchand wrote:
> On Tue, Sep 10, 2024 at 5:11 PM Kevin Traynor <ktraynor@redhat.com> wrote:
>> -        enum vlog_level level = VLL_WARN;
>> -
>> -        if (error->type == RTE_FLOW_ERROR_TYPE_ACTION) {
>> -            level = VLL_DBG;
>> -        }
>> -        VLOG_RL(&rl, level, "%s: rte_flow creation failed: %d (%s).",
>> -                netdev_get_name(netdev), error->type, error->message);
>> -        if (!vlog_should_drop(&this_module, level, &rl)) {
>> +        if (!VLOG_DROP_DBG(&rl)) {
>> +            VLOG_DBG("%s: rte_flow creation failed: %d (%s).",
>> +                     netdev_get_name(netdev), error->type, error->message);
>>              dump_flow(&s, &s_extra, attr, flow_patterns, flow_actions);
>>              extra_str = ds_cstr(&s_extra);
>> -            VLOG_RL(&rl, level, "%s: Failed flow: %s  flow create %d %s",
>> -                    netdev_get_name(netdev), extra_str,
>> -                    netdev_dpdk_get_port_id(netdev), ds_cstr(&s));
>> +            VLOG_DBG("%s: Failed flow: %s  flow create %d %s",
>> +                     netdev_get_name(netdev), extra_str,
>> +                     netdev_dpdk_get_port_id(netdev), ds_cstr(&s));
> 
> I wonder why we had two log messages so far.

I'd wager at some point they were intended at different log levels.

> In any case, wrt ratelimit, with this change above, I would say we
> should log only one message.
> 

Yes, make sense, no reason to keep apart anymore.

>
Kevin Traynor Sept. 11, 2024, 8:59 a.m. UTC | #4
On 11/09/2024 09:41, Eelco Chaudron wrote:
> 
> 
> On 10 Sep 2024, at 17:11, Kevin Traynor wrote:
> 
>> Previously when a flow was attempted to be offloaded, if it
>> could not be offloaded and did not return an actions error,
>> a warning was logged.
>>
>> The reason there was an exception for an actions error was to allow
>> for failure for full offload of a flow because it will fallback to
>> partial offload. There are some issues with this approach to logging.
>>
>> Some NICs do not specify an actions error, because other config in
>> the NIC may be checked first. e.g. In the case of Mellanox CX-5,
>> there can be different types of offload configured, so an unspecified
>> error may be returned.
>>
>> More generally, enabling hw-offload is best effort per datapath/NIC/flow
>> as full and partial offload support in NICs is variable and there is
>> always fallback to software.
>>
>> So there is likely to be repeated logging about offloading of flows
>> failing. With this in mind, change the log level to debug.
>>
>> The status of the offload can still be seen with below command:
>> $ ovs-appctl dpctl/dump-flows -m
>> ... offloaded:partial ...
>>
>> Also, remove some duplicated rate limiting.
> 
> Thanks for fixing this, and sending the patch. One comment below on the log messages.
> 
>> Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
>> ---
>>  lib/netdev-offload-dpdk.c | 23 +++++++++--------------
>>  1 file changed, 9 insertions(+), 14 deletions(-)
>>
>> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
>> index 623005b1c..378bee161 100644
>> --- a/lib/netdev-offload-dpdk.c
>> +++ b/lib/netdev-offload-dpdk.c
>> @@ -920,22 +920,17 @@ netdev_offload_dpdk_flow_create(struct netdev *netdev,
>>              dump_flow(&s, &s_extra, attr, flow_patterns, flow_actions);
>>              extra_str = ds_cstr(&s_extra);
>> -            VLOG_DBG_RL(&rl, "%s: rte_flow 0x%"PRIxPTR" %s  flow create %d %s",
>> -                        netdev_get_name(netdev), (intptr_t) flow, extra_str,
>> -                        netdev_dpdk_get_port_id(netdev), ds_cstr(&s));
>> +            VLOG_DBG("%s: rte_flow 0x%"PRIxPTR" %s  flow create %d %s",
>> +                     netdev_get_name(netdev), (intptr_t) flow, extra_str,
>> +                     netdev_dpdk_get_port_id(netdev), ds_cstr(&s));
>>          }
>>      } else {
>> -        enum vlog_level level = VLL_WARN;
>> -
>> -        if (error->type == RTE_FLOW_ERROR_TYPE_ACTION) {
>> -            level = VLL_DBG;
>> -        }
>> -        VLOG_RL(&rl, level, "%s: rte_flow creation failed: %d (%s).",
>> -                netdev_get_name(netdev), error->type, error->message);
>> -        if (!vlog_should_drop(&this_module, level, &rl)) {
>> +        if (!VLOG_DROP_DBG(&rl)) {
>> +            VLOG_DBG("%s: rte_flow creation failed: %d (%s).",
>> +                     netdev_get_name(netdev), error->type, error->message);
>>              dump_flow(&s, &s_extra, attr, flow_patterns, flow_actions);
>>              extra_str = ds_cstr(&s_extra);
>> -            VLOG_RL(&rl, level, "%s: Failed flow: %s  flow create %d %s",
>> -                    netdev_get_name(netdev), extra_str,
>> -                    netdev_dpdk_get_port_id(netdev), ds_cstr(&s));
>> +            VLOG_DBG("%s: Failed flow: %s  flow create %d %s",
>> +                     netdev_get_name(netdev), extra_str,
>> +                     netdev_dpdk_get_port_id(netdev), ds_cstr(&s));
> 
> Would it make sense to combine the two debug log messages?
> Something in the lines of:
> 
> %s: rte_flow creation failed [%d (%s)]: %s  flow create %d %s”
> 

Ack, yes, had same comment from David. Will look to merge them.

> 
>>          }
>>      }
>> -- 
>> 2.46.0
>
diff mbox series

Patch

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 623005b1c..378bee161 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -920,22 +920,17 @@  netdev_offload_dpdk_flow_create(struct netdev *netdev,
             dump_flow(&s, &s_extra, attr, flow_patterns, flow_actions);
             extra_str = ds_cstr(&s_extra);
-            VLOG_DBG_RL(&rl, "%s: rte_flow 0x%"PRIxPTR" %s  flow create %d %s",
-                        netdev_get_name(netdev), (intptr_t) flow, extra_str,
-                        netdev_dpdk_get_port_id(netdev), ds_cstr(&s));
+            VLOG_DBG("%s: rte_flow 0x%"PRIxPTR" %s  flow create %d %s",
+                     netdev_get_name(netdev), (intptr_t) flow, extra_str,
+                     netdev_dpdk_get_port_id(netdev), ds_cstr(&s));
         }
     } else {
-        enum vlog_level level = VLL_WARN;
-
-        if (error->type == RTE_FLOW_ERROR_TYPE_ACTION) {
-            level = VLL_DBG;
-        }
-        VLOG_RL(&rl, level, "%s: rte_flow creation failed: %d (%s).",
-                netdev_get_name(netdev), error->type, error->message);
-        if (!vlog_should_drop(&this_module, level, &rl)) {
+        if (!VLOG_DROP_DBG(&rl)) {
+            VLOG_DBG("%s: rte_flow creation failed: %d (%s).",
+                     netdev_get_name(netdev), error->type, error->message);
             dump_flow(&s, &s_extra, attr, flow_patterns, flow_actions);
             extra_str = ds_cstr(&s_extra);
-            VLOG_RL(&rl, level, "%s: Failed flow: %s  flow create %d %s",
-                    netdev_get_name(netdev), extra_str,
-                    netdev_dpdk_get_port_id(netdev), ds_cstr(&s));
+            VLOG_DBG("%s: Failed flow: %s  flow create %d %s",
+                     netdev_get_name(netdev), extra_str,
+                     netdev_dpdk_get_port_id(netdev), ds_cstr(&s));
         }
     }