diff mbox series

[ovs-dev] controller: Fix "use after free" issue in statctrl_run().

Message ID 20240919174251.5136-1-naveen.yerramneni@nutanix.com
State New
Headers show
Series [ovs-dev] controller: Fix "use after free" issue in statctrl_run(). | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/github-robot-_ovn-kubernetes fail github build: failed

Commit Message

Naveen Yerramneni Sept. 19, 2024, 5:42 p.m. UTC
Issue:
=====
OVN controller main() is caching the "sbrec_mac_binding/sbrec_fdb"
pointers in "mac_cache_mac_binding->mac_bindings/fdbs" which are
used to update timestamp in mac_cache_mb_stats_run()/
mac_cache_fdb_stats_run(). There is a condition where cached
"sbrec_mac_binding/sbrec_fdb" pointers gets freed without
removing the refs from "mac_cache_mac_binding->mac_bindings/fdbs"
which is leading to crash due to "use after free".

This condition can occur due to following reason.
  - en_runtime_data is input to en_mac_cache.
  - en_runtime_data went for full recompute but engine_run()
    is called with recompute_allowed as false due to
    ofctrl_has_backlog() returned true. This caused to avoid
    the recompute in the current engine run.
  - If there are any mac_binding/fdb deletes in the same run
    then, entries gets deleted during ovsdb_idl_loop_run() and
    rows are freed in ovsdb_idl_track_clear() but references
    to them are present in mac_cache_data->mac_bindings/fdbs.

Fix:
===
Avoid statctrl_run() when engine recompute is not allowed.

Signed-off-by: Naveen Yerramneni <naveen.yerramneni@nutanix.com>
---
 controller/ovn-controller.c | 57 +++++++++++++++++++------------------
 1 file changed, 30 insertions(+), 27 deletions(-)

Comments

Naveen Yerramneni Sept. 25, 2024, 4:27 a.m. UTC | #1
Hi Team,

Could you please review this patch ?

Thanks,
Naveen

From: Naveen Yerramneni <naveen.yerramneni@nutanix.com>
Date: Thursday, 19 September 2024 at 11:13 PM
To: dev@openvswitch.org <dev@openvswitch.org>
Cc: Mary Manohar <mary.manohar@nutanix.com>, Shibir Basak <shibir.basak@nutanix.com>, Naveen Yerramneni <naveen.yerramneni@nutanix.com>
Subject: [PATCH ovn] controller: Fix "use after free" issue in statctrl_run().
Issue:
=====
OVN controller main() is caching the "sbrec_mac_binding/sbrec_fdb"
pointers in "mac_cache_mac_binding->mac_bindings/fdbs" which are
used to update timestamp in mac_cache_mb_stats_run()/
mac_cache_fdb_stats_run(). There is a condition where cached
"sbrec_mac_binding/sbrec_fdb" pointers gets freed without
removing the refs from "mac_cache_mac_binding->mac_bindings/fdbs"
which is leading to crash due to "use after free".

This condition can occur due to following reason.
  - en_runtime_data is input to en_mac_cache.
  - en_runtime_data went for full recompute but engine_run()
    is called with recompute_allowed as false due to
    ofctrl_has_backlog() returned true. This caused to avoid
    the recompute in the current engine run.
  - If there are any mac_binding/fdb deletes in the same run
    then, entries gets deleted during ovsdb_idl_loop_run() and
    rows are freed in ovsdb_idl_track_clear() but references
    to them are present in mac_cache_data->mac_bindings/fdbs.

Fix:
===
Avoid statctrl_run() when engine recompute is not allowed.

Signed-off-by: Naveen Yerramneni <naveen.yerramneni@nutanix.com>
---
 controller/ovn-controller.c | 57 +++++++++++++++++++------------------
 1 file changed, 30 insertions(+), 27 deletions(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index c48667887..6bb8f07db 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -5363,6 +5363,7 @@ main(int argc, char *argv[])
     /* Main loop. */
     bool sb_monitor_all = false;
     while (!exit_args.exiting) {
+        bool recompute_allowed = true;
         memory_run();
         if (memory_should_report()) {
             struct simap usage = SIMAP_INITIALIZER(&usage);
@@ -5536,34 +5537,34 @@ main(int argc, char *argv[])

                     stopwatch_start(CONTROLLER_LOOP_STOPWATCH_NAME,
                                     time_msec());
-                    if (ovnsb_idl_txn) {
-                        if (ofctrl_has_backlog()) {
-                            /* When there are in-flight messages pending to
-                             * ovs-vswitchd, we should hold on recomputing so
-                             * that the previous flow installations won't be
-                             * delayed.  However, we still want to try if
-                             * recompute is not needed and we can quickly
-                             * incrementally process the new changes, to avoid
-                             * unnecessarily forced recomputes later on.  This
-                             * is because the OVSDB change tracker cannot
-                             * preserve tracked changes across iterations.  If
-                             * change tracking is improved, we can simply skip
-                             * this round of engine_run and continue processing
-                             * acculated changes incrementally later when
-                             * ofctrl_has_backlog() returns false. */
-                            engine_run(false);
-                        } else {
-                            engine_run(true);
-                        }
-                    } else {
+                    if (!ovnsb_idl_txn || ofctrl_has_backlog()) {
+                        /* No ovnsb_idl_txn  */
                         /* Even if there's no SB DB transaction available,
-                         * try to run the engine so that we can handle any
-                         * incremental changes that don't require a recompute.
-                         * If a recompute is required, the engine will cancel,
-                         * triggerring a full run in the next iteration.
-                         */
-                        engine_run(false);
+                        * try to run the engine so that we can handle any
+                        * incremental changes that don't require a recompute.
+                        * If a recompute is required, the engine will cancel,
+                        * triggerring a full run in the next iteration.
+                        */
+
+                        /* OR */
+
+                        /* ofctrl_has_backlog */
+                        /* When there are in-flight messages pending to
+                         * ovs-vswitchd, we should hold on recomputing so
+                         * that the previous flow installations won't be
+                         * delayed.  However, we still want to try if
+                         * recompute is not needed and we can quickly
+                         * incrementally process the new changes, to avoid
+                         * unnecessarily forced recomputes later on.  This
+                         * is because the OVSDB change tracker cannot
+                         * preserve tracked changes across iterations.  If
+                         * change tracking is improved, we can simply skip
+                         * this round of engine_run and continue processing
+                         * acculated changes incrementally later when
+                         * ofctrl_has_backlog() returns false. */
+                        recompute_allowed = false;
                     }
+                    engine_run(recompute_allowed);
                     stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME,
                                    time_msec());
                     if (engine_has_updated()) {
@@ -5685,7 +5686,9 @@ main(int argc, char *argv[])
                         }
                     }

-                    if (mac_cache_data) {
+                    if (mac_cache_data && recompute_allowed) {
+                        /* Run only when engine recompute is allowed
+                         * since mac_binding/FDB rows are cached */
                         statctrl_run(ovnsb_idl_txn, mac_cache_data);
                     }

--
2.36.6
Ales Musil Oct. 2, 2024, 9:04 a.m. UTC | #2
On Thu, Sep 19, 2024 at 7:43 PM Naveen Yerramneni via dev <
ovs-dev@openvswitch.org> wrote:

> Issue:
> =====
> OVN controller main() is caching the "sbrec_mac_binding/sbrec_fdb"
> pointers in "mac_cache_mac_binding->mac_bindings/fdbs" which are
> used to update timestamp in mac_cache_mb_stats_run()/
> mac_cache_fdb_stats_run(). There is a condition where cached
> "sbrec_mac_binding/sbrec_fdb" pointers gets freed without
> removing the refs from "mac_cache_mac_binding->mac_bindings/fdbs"
> which is leading to crash due to "use after free".
>
> This condition can occur due to following reason.
>   - en_runtime_data is input to en_mac_cache.
>   - en_runtime_data went for full recompute but engine_run()
>     is called with recompute_allowed as false due to
>     ofctrl_has_backlog() returned true. This caused to avoid
>     the recompute in the current engine run.
>   - If there are any mac_binding/fdb deletes in the same run
>     then, entries gets deleted during ovsdb_idl_loop_run() and
>     rows are freed in ovsdb_idl_track_clear() but references
>     to them are present in mac_cache_data->mac_bindings/fdbs.
>
> Fix:
> ===
> Avoid statctrl_run() when engine recompute is not allowed.
>
> Signed-off-by: Naveen Yerramneni <naveen.yerramneni@nutanix.com>
> ---
>


Hello Neveen,

thank you for the patch. I have one comment down below. This also
makes me wonder, don't we have potentially the same issue with pinctrl?
I'm pretty sure that there are some SB references too.

 controller/ovn-controller.c | 57 +++++++++++++++++++------------------
>  1 file changed, 30 insertions(+), 27 deletions(-)
>
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index c48667887..6bb8f07db 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -5363,6 +5363,7 @@ main(int argc, char *argv[])
>      /* Main loop. */
>      bool sb_monitor_all = false;
>      while (!exit_args.exiting) {
> +        bool recompute_allowed = true;
>

There is no need for the scope of this variable to be the
whole main loop. It could actually be changed to:

bool recompute_allowed =
        !ovnsb_idl_txn || ofctrl_has_backlog();
engine_run(recompute_allowed);

Keeping the combined comment of course.


         memory_run();
>          if (memory_should_report()) {
>              struct simap usage = SIMAP_INITIALIZER(&usage);
> @@ -5536,34 +5537,34 @@ main(int argc, char *argv[])
>
>                      stopwatch_start(CONTROLLER_LOOP_STOPWATCH_NAME,
>                                      time_msec());
> -                    if (ovnsb_idl_txn) {
> -                        if (ofctrl_has_backlog()) {
> -                            /* When there are in-flight messages pending
> to
> -                             * ovs-vswitchd, we should hold on
> recomputing so
> -                             * that the previous flow installations won't
> be
> -                             * delayed.  However, we still want to try if
> -                             * recompute is not needed and we can quickly
> -                             * incrementally process the new changes, to
> avoid
> -                             * unnecessarily forced recomputes later on.
> This
> -                             * is because the OVSDB change tracker cannot
> -                             * preserve tracked changes across
> iterations.  If
> -                             * change tracking is improved, we can simply
> skip
> -                             * this round of engine_run and continue
> processing
> -                             * acculated changes incrementally later when
> -                             * ofctrl_has_backlog() returns false. */
> -                            engine_run(false);
> -                        } else {
> -                            engine_run(true);
> -                        }
> -                    } else {
> +                    if (!ovnsb_idl_txn || ofctrl_has_backlog()) {
> +                        /* No ovnsb_idl_txn  */
>                          /* Even if there's no SB DB transaction available,
> -                         * try to run the engine so that we can handle any
> -                         * incremental changes that don't require a
> recompute.
> -                         * If a recompute is required, the engine will
> cancel,
> -                         * triggerring a full run in the next iteration.
> -                         */
> -                        engine_run(false);
> +                        * try to run the engine so that we can handle any
> +                        * incremental changes that don't require a
> recompute.
> +                        * If a recompute is required, the engine will
> cancel,
> +                        * triggerring a full run in the next iteration.
> +                        */
> +
> +                        /* OR */
> +
> +                        /* ofctrl_has_backlog */
> +                        /* When there are in-flight messages pending to
> +                         * ovs-vswitchd, we should hold on recomputing so
> +                         * that the previous flow installations won't be
> +                         * delayed.  However, we still want to try if
> +                         * recompute is not needed and we can quickly
> +                         * incrementally process the new changes, to avoid
> +                         * unnecessarily forced recomputes later on.  This
> +                         * is because the OVSDB change tracker cannot
> +                         * preserve tracked changes across iterations.  If
> +                         * change tracking is improved, we can simply skip
> +                         * this round of engine_run and continue
> processing
> +                         * acculated changes incrementally later when
> +                         * ofctrl_has_backlog() returns false. */
> +                        recompute_allowed = false;
>                      }
> +                    engine_run(recompute_allowed);
>                      stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME,
>                                     time_msec());
>                      if (engine_has_updated()) {
> @@ -5685,7 +5686,9 @@ main(int argc, char *argv[])
>                          }
>                      }
>
> -                    if (mac_cache_data) {
> +                    if (mac_cache_data && recompute_allowed) {
> +                        /* Run only when engine recompute is allowed
> +                         * since mac_binding/FDB rows are cached */
>                          statctrl_run(ovnsb_idl_txn, mac_cache_data);
>                      }
>
> --
> 2.36.6
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
Thanks,
Ales
diff mbox series

Patch

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index c48667887..6bb8f07db 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -5363,6 +5363,7 @@  main(int argc, char *argv[])
     /* Main loop. */
     bool sb_monitor_all = false;
     while (!exit_args.exiting) {
+        bool recompute_allowed = true;
         memory_run();
         if (memory_should_report()) {
             struct simap usage = SIMAP_INITIALIZER(&usage);
@@ -5536,34 +5537,34 @@  main(int argc, char *argv[])
 
                     stopwatch_start(CONTROLLER_LOOP_STOPWATCH_NAME,
                                     time_msec());
-                    if (ovnsb_idl_txn) {
-                        if (ofctrl_has_backlog()) {
-                            /* When there are in-flight messages pending to
-                             * ovs-vswitchd, we should hold on recomputing so
-                             * that the previous flow installations won't be
-                             * delayed.  However, we still want to try if
-                             * recompute is not needed and we can quickly
-                             * incrementally process the new changes, to avoid
-                             * unnecessarily forced recomputes later on.  This
-                             * is because the OVSDB change tracker cannot
-                             * preserve tracked changes across iterations.  If
-                             * change tracking is improved, we can simply skip
-                             * this round of engine_run and continue processing
-                             * acculated changes incrementally later when
-                             * ofctrl_has_backlog() returns false. */
-                            engine_run(false);
-                        } else {
-                            engine_run(true);
-                        }
-                    } else {
+                    if (!ovnsb_idl_txn || ofctrl_has_backlog()) {
+                        /* No ovnsb_idl_txn  */
                         /* Even if there's no SB DB transaction available,
-                         * try to run the engine so that we can handle any
-                         * incremental changes that don't require a recompute.
-                         * If a recompute is required, the engine will cancel,
-                         * triggerring a full run in the next iteration.
-                         */
-                        engine_run(false);
+                        * try to run the engine so that we can handle any
+                        * incremental changes that don't require a recompute.
+                        * If a recompute is required, the engine will cancel,
+                        * triggerring a full run in the next iteration.
+                        */
+
+                        /* OR */
+
+                        /* ofctrl_has_backlog */
+                        /* When there are in-flight messages pending to
+                         * ovs-vswitchd, we should hold on recomputing so
+                         * that the previous flow installations won't be
+                         * delayed.  However, we still want to try if
+                         * recompute is not needed and we can quickly
+                         * incrementally process the new changes, to avoid
+                         * unnecessarily forced recomputes later on.  This
+                         * is because the OVSDB change tracker cannot
+                         * preserve tracked changes across iterations.  If
+                         * change tracking is improved, we can simply skip
+                         * this round of engine_run and continue processing
+                         * acculated changes incrementally later when
+                         * ofctrl_has_backlog() returns false. */
+                        recompute_allowed = false;
                     }
+                    engine_run(recompute_allowed);
                     stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME,
                                    time_msec());
                     if (engine_has_updated()) {
@@ -5685,7 +5686,9 @@  main(int argc, char *argv[])
                         }
                     }
 
-                    if (mac_cache_data) {
+                    if (mac_cache_data && recompute_allowed) {
+                        /* Run only when engine recompute is allowed
+                         * since mac_binding/FDB rows are cached */
                         statctrl_run(ovnsb_idl_txn, mac_cache_data);
                     }