diff mbox series

[ovs-dev] dpctl: Netdev leak causes stale VHU & reaching limit.

Message ID PAWPR07MB9877859E07FBAD793EEFD01D90A52@PAWPR07MB9877.eurprd07.prod.outlook.com
State Superseded
Headers show
Series [ovs-dev] dpctl: Netdev leak causes stale VHU & reaching limit. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed

Commit Message

Vipul Ashri July 11, 2024, 8:17 p.m. UTC
While running a test with a continous VM creation/deletion using an
orchestration script with-in cloud environment. In parallel we have
some monitoring script calling ovs-appctl dpctl/show stats commands
every minute.

During VHU port delete, one of netdev references were not reduced to
0 as show_dpif call has not given-up the reference back or doing bad
cleanup. This pending deference preventing VHU deletion sequence, this
is found to be one of corner case inside dpctl code which results in
leaking up netdev which ultimately results in stale VHU entry. After
fixing this problematic cleanup, issue is not seen.

Fixes: fceef2095222 ("dpctl: add ovs-appctl dpctl/* commands to talk to dpif-netdev")
Signed-off-by: Vipul Ashri <vipul.ashri@ericsson.com>
---
 lib/dpctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

David Marchand July 12, 2024, 7:51 a.m. UTC | #1
On Thu, Jul 11, 2024 at 10:18 PM Vipul Ashri via dev
<ovs-dev@openvswitch.org> wrote:
>
>
> While running a test with a continous VM creation/deletion using an
> orchestration script with-in cloud environment. In parallel we have
> some monitoring script calling ovs-appctl dpctl/show stats commands
> every minute.
>
> During VHU port delete, one of netdev references were not reduced to
> 0 as show_dpif call has not given-up the reference back or doing bad
> cleanup. This pending deference preventing VHU deletion sequence, this
> is found to be one of corner case inside dpctl code which results in
> leaking up netdev which ultimately results in stale VHU entry. After
> fixing this problematic cleanup, issue is not seen.
>
> Fixes: fceef2095222 ("dpctl: add ovs-appctl dpctl/* commands to talk to dpif-netdev")
> Signed-off-by: Vipul Ashri <vipul.ashri@ericsson.com>

The issue seems generic as any netdev for which a call to
netdev_get_stats failed would trigger the same leak.
So I would update the commit title with something more generic like:
dpctl: Fix netdev reference leak in "show" command.

This can probably be changed when applying.
In any case, the fix lgtm.

Reviewed-by: David Marchand <david.marchand@redhat.com>
Simon Horman July 15, 2024, 2:05 p.m. UTC | #2
On Fri, Jul 12, 2024 at 09:51:18AM +0200, David Marchand wrote:
> On Thu, Jul 11, 2024 at 10:18 PM Vipul Ashri via dev
> <ovs-dev@openvswitch.org> wrote:
> >
> >
> > While running a test with a continous VM creation/deletion using an
> > orchestration script with-in cloud environment. In parallel we have
> > some monitoring script calling ovs-appctl dpctl/show stats commands
> > every minute.
> >
> > During VHU port delete, one of netdev references were not reduced to
> > 0 as show_dpif call has not given-up the reference back or doing bad
> > cleanup. This pending deference preventing VHU deletion sequence, this
> > is found to be one of corner case inside dpctl code which results in
> > leaking up netdev which ultimately results in stale VHU entry. After
> > fixing this problematic cleanup, issue is not seen.
> >
> > Fixes: fceef2095222 ("dpctl: add ovs-appctl dpctl/* commands to talk to dpif-netdev")
> > Signed-off-by: Vipul Ashri <vipul.ashri@ericsson.com>
> 
> The issue seems generic as any netdev for which a call to
> netdev_get_stats failed would trigger the same leak.
> So I would update the commit title with something more generic like:
> dpctl: Fix netdev reference leak in "show" command.
> 
> This can probably be changed when applying.
> In any case, the fix lgtm.
> 
> Reviewed-by: David Marchand <david.marchand@redhat.com>

Thanks David,

That sounds find to me.

There was another posting of this patch [1].
I'll plan to apply that one in a day or so
unless someone says otherwise.

[1] https://patchwork.ozlabs.org/project/openvswitch/patch/PAWPR07MB9877C192D7E7712C72A403EE90A52@PAWPR07MB9877.eurprd07.prod.outlook.com/
diff mbox series

Patch

diff --git a/lib/dpctl.c b/lib/dpctl.c
index a70df5342..81e9ca0e9 100644
--- a/lib/dpctl.c
+++ b/lib/dpctl.c
@@ -739,7 +739,6 @@  show_dpif(struct dpif *dpif, struct dpctl_params *dpctl_p)
             }
             error = netdev_get_stats(netdev, &s);
             if (!error) {
-                netdev_close(netdev);
                 print_stat(dpctl_p, "    RX packets:", s.rx_packets);
                 print_stat(dpctl_p, " errors:", s.rx_errors);
                 print_stat(dpctl_p, " dropped:", s.rx_dropped);
@@ -770,6 +769,7 @@  show_dpif(struct dpif *dpif, struct dpctl_params *dpctl_p)
                 dpctl_print(dpctl_p, ", could not retrieve stats (%s)",
                             ovs_strerror(error));
             }
+            netdev_close(netdev);
         }
         dpif_port_destroy(&dpif_port);
     }