diff mbox

[ovs-dev,v2,2/4] ovn-controller: add quiet mode

Message ID 1472656966-30133-3-git-send-email-rmoats@us.ibm.com
State Not Applicable
Headers show

Commit Message

Ryan Moats Aug. 31, 2016, 3:22 p.m. UTC
As discussed in [1], what the incremental processing code
actually accomplished was that the ovn-controller would
be "quiet" and not burn CPU when things weren't changing.
This patch set recreates this state by calculating whether
changes have occured that would require a full calculation
to be performed.  It does this by persisting a copy of
the localvif_to_ofport and tunnel information in the
controller module, rather than in the physical.c module
as was the case with previous commits.

[1] http://openvswitch.org/pipermail/dev/2016-August/078272.html

Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
---
 ovn/controller/ofctrl.c         |   2 +
 ovn/controller/ovn-controller.c |  73 ++++++++++++++++++++---------
 ovn/controller/ovn-controller.h |   1 +
 ovn/controller/patch.c          |  11 +++++
 ovn/controller/physical.c       | 100 ++++++++++++++++++++++++----------------
 ovn/controller/physical.h       |   5 ++
 6 files changed, 131 insertions(+), 61 deletions(-)

Comments

Ben Pfaff Oct. 4, 2016, 5:14 p.m. UTC | #1
On Wed, Aug 31, 2016 at 03:22:44PM +0000, Ryan Moats wrote:
> As discussed in [1], what the incremental processing code
> actually accomplished was that the ovn-controller would
> be "quiet" and not burn CPU when things weren't changing.
> This patch set recreates this state by calculating whether
> changes have occured that would require a full calculation
> to be performed.  It does this by persisting a copy of
> the localvif_to_ofport and tunnel information in the
> controller module, rather than in the physical.c module
> as was the case with previous commits.
> 
> [1] http://openvswitch.org/pipermail/dev/2016-August/078272.html
> 
> Signed-off-by: Ryan Moats <rmoats@us.ibm.com>

Hi Ryan.

I like the idea behind this patch.  However, it no longer applies to
master, so it needs a rebase.

It also seems like this TODO should be addressed:
+        /* TODO (regXboi): this next line is needed for the 3 HVs, 3 LS,
+         * 3 lports/LS, 1 LR test case, but has the potential side effect
+         * of defeating quiet mode once a logical router leads to creating
+         * patch ports. Need to understand the failure mode better and
+         * what is needed to remove this. */
+        force_full_process();

Thanks,

Ben.
Ryan Moats Oct. 4, 2016, 10:11 p.m. UTC | #2
Ben Pfaff <blp@ovn.org> wrote on 10/04/2016 12:14:32 PM:

> From: Ben Pfaff <blp@ovn.org>
> To: Ryan Moats/Omaha/IBM@IBMUS
> Cc: dev@openvswitch.org
> Date: 10/04/2016 12:14 PM
> Subject: Re: [ovs-dev,v2,2/4] ovn-controller: add quiet mode
>
> On Wed, Aug 31, 2016 at 03:22:44PM +0000, Ryan Moats wrote:
> > As discussed in [1], what the incremental processing code
> > actually accomplished was that the ovn-controller would
> > be "quiet" and not burn CPU when things weren't changing.
> > This patch set recreates this state by calculating whether
> > changes have occured that would require a full calculation
> > to be performed.  It does this by persisting a copy of
> > the localvif_to_ofport and tunnel information in the
> > controller module, rather than in the physical.c module
> > as was the case with previous commits.
> >
> > [1] http://openvswitch.org/pipermail/dev/2016-August/078272.html
> >
> > Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
>
> Hi Ryan.
>
> I like the idea behind this patch.  However, it no longer applies to
> master, so it needs a rebase.

So done, but before submitting a new patch....

>
> It also seems like this TODO should be addressed:
> +        /* TODO (regXboi): this next line is needed for the 3 HVs, 3 LS,
> +         * 3 lports/LS, 1 LR test case, but has the potential side
effect
> +         * of defeating quiet mode once a logical router leads to
creating
> +         * patch ports. Need to understand the failure mode better and
> +         * what is needed to remove this. */
> +        force_full_process();

I've been looking at what happens here and I'm seeing some signatures
that concern me.  The test case that fails is no longer the cited one above
but is "2 HVs, 4 lports/HV, localnet ports" ...

What I'm seeing when I peer in is that the information populating the
local_datapath structures doesn't appear to be consistent on each pass
through binding_run: Looking at the ovn-controller process under hv2
for the above test (when it passes), I'll see signatures that look
like the following:

2016-10-04T21:57:58.257Z|00020|physical|INFO|looking at binding record
3736404b-c69d-4878-8d45-81ad9be06f5f
2016-10-04T21:57:58.257Z|00021|physical|INFO|dp_key is 1
2016-10-04T21:57:58.257Z|00022|physical|INFO|looking for ofport of lp11
(LP)
2016-10-04T21:57:58.257Z|00023|physical|INFO|looking for ofport of ln1
(localnet)
...
2016-10-04T21:57:58.259Z|00034|physical|INFO|looking at binding record
3736404b-c69d-4878-8d45-81ad9be06f5f
2016-10-04T21:57:58.259Z|00035|physical|INFO|dp_key is 1
2016-10-04T21:57:58.259Z|00036|physical|INFO|looking for ofport of lp11
(LP)
2016-10-04T21:57:58.259Z|00037|physical|INFO|looking for tunnel to hv1
...
2016-10-04T21:57:58.263Z|00054|physical|INFO|looking at binding record
3736404b-c69d-4878-8d45-81ad9be06f5f
2016-10-04T21:57:58.263Z|00055|physical|INFO|dp_key is 1
2016-10-04T21:57:58.263Z|00056|physical|INFO|looking for ofport of lp11
(LP)
2016-10-04T21:57:58.263Z|00057|physical|INFO|looking for ofport of ln1
(localnet)
...
2016-10-04T21:57:58.275Z|00115|physical|INFO|looking at binding record
3736404b-c69d-4878-8d45-81ad9be06f5f
2016-10-04T21:57:58.275Z|00116|physical|INFO|dp_key is 1
2016-10-04T21:57:58.275Z|00117|physical|INFO|looking for ofport of lp11
(LP)
2016-10-04T21:57:58.275Z|00118|physical|INFO|looking for tunnel to hv1
...
2016-10-04T21:57:58.281Z|00133|physical|INFO|looking at binding record
3736404b-c69d-4878-8d45-81ad9be06f5f
2016-10-04T21:57:58.281Z|00134|physical|INFO|dp_key is 1
2016-10-04T21:57:58.281Z|00135|physical|INFO|looking for ofport of lp11
(LP)
2016-10-04T21:57:58.281Z|00136|physical|INFO|looking for ofport of ln1
(localnet)
...
2016-10-04T21:57:58.296Z|00203|physical|INFO|looking at binding record
3736404b-c69d-4878-8d45-81ad9be06f5f
2016-10-04T21:57:58.296Z|00204|physical|INFO|dp_key is 1
2016-10-04T21:57:58.296Z|00205|physical|INFO|looking for ofport of lp11
(LP)
2016-10-04T21:57:58.296Z|00206|physical|INFO|looking for tunnel to hv1
...
2016-10-04T21:57:58.302Z|00241|physical|INFO|looking at binding record
3736404b-c69d-4878-8d45-81ad9be06f5f
2016-10-04T21:57:58.302Z|00242|physical|INFO|dp_key is 1
2016-10-04T21:57:58.302Z|00243|physical|INFO|looking for ofport of lp11
(LP)
2016-10-04T21:57:58.302Z|00244|physical|INFO|looking for ofport of ln1
(localnet)
...
2016-10-04T21:57:58.327Z|00307|physical|INFO|looking at binding record
3736404b-c69d-4878-8d45-81ad9be06f5f
2016-10-04T21:57:58.327Z|00308|physical|INFO|dp_key is 1
2016-10-04T21:57:58.328Z|00309|physical|INFO|looking for ofport of lp11
(LP)
2016-10-04T21:57:58.328Z|00310|physical|INFO|looking for tunnel to hv1
...
2016-10-04T21:57:58.341Z|00341|physical|INFO|looking at binding record
3736404b-c69d-4878-8d45-81ad9be06f5f
2016-10-04T21:57:58.341Z|00342|physical|INFO|dp_key is 1
2016-10-04T21:57:58.341Z|00343|physical|INFO|looking for ofport of lp11
(LP)
2016-10-04T21:57:58.341Z|00344|physical|INFO|looking for ofport of ln1
(localnet)
...
2016-10-04T21:57:58.452Z|00532|physical|INFO|looking at binding record
3736404b-c69d-4878-8d45-81ad9be06f5f
2016-10-04T21:57:58.452Z|00533|physical|INFO|dp_key is 1
2016-10-04T21:57:58.452Z|00534|physical|INFO|looking for ofport of lp11
(LP)
2016-10-04T21:57:58.452Z|00535|physical|INFO|looking for tunnel to hv1
...
2016-10-04T21:57:58.465Z|00624|physical|INFO|looking at binding record
3736404b-c69d-4878-8d45-81ad9be06f5f
2016-10-04T21:57:58.465Z|00625|physical|INFO|dp_key is 1
2016-10-04T21:57:58.465Z|00626|physical|INFO|looking for ofport of lp11
(LP)
2016-10-04T21:57:58.465Z|00627|physical|INFO|looking for ofport of ln1
(localnet)
...
2016-10-04T21:57:58.884Z|00710|physical|INFO|looking at binding record
3736404b-c69d-4878-8d45-81ad9be06f5f
2016-10-04T21:57:58.884Z|00711|physical|INFO|dp_key is 1
2016-10-04T21:57:58.884Z|00712|physical|INFO|looking for ofport of lp11
(LP)

The failure mode I'm seeing is that ovn-controller on hv2 is thinking that
ports on hv1 are reachable by a tunnel and not via the localnet port, so it
incorrectly programs tunnel rules into table 32...

My expectation (which could be incorrect) is that before the localnet port
is bound, this lookup would go with a tunnel, but once the localnet port
is found, then it would be consistently found...

Any insight on this? (I'm thinking this is bug in binding_run, but want
confirmation before I go down that rabbit hole)

Ryan
Ben Pfaff Oct. 5, 2016, 5:37 p.m. UTC | #3
On Tue, Oct 04, 2016 at 05:11:37PM -0500, Ryan Moats wrote:
> Ben Pfaff <blp@ovn.org> wrote on 10/04/2016 12:14:32 PM:
> 
> > From: Ben Pfaff <blp@ovn.org>
> > To: Ryan Moats/Omaha/IBM@IBMUS
> > Cc: dev@openvswitch.org
> > Date: 10/04/2016 12:14 PM
> > Subject: Re: [ovs-dev,v2,2/4] ovn-controller: add quiet mode
> >
> > On Wed, Aug 31, 2016 at 03:22:44PM +0000, Ryan Moats wrote:
> > > As discussed in [1], what the incremental processing code
> > > actually accomplished was that the ovn-controller would
> > > be "quiet" and not burn CPU when things weren't changing.
> > > This patch set recreates this state by calculating whether
> > > changes have occured that would require a full calculation
> > > to be performed.  It does this by persisting a copy of
> > > the localvif_to_ofport and tunnel information in the
> > > controller module, rather than in the physical.c module
> > > as was the case with previous commits.
> > >
> > > [1] http://openvswitch.org/pipermail/dev/2016-August/078272.html
> > >
> > > Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
> >
> > Hi Ryan.
> >
> > I like the idea behind this patch.  However, it no longer applies to
> > master, so it needs a rebase.
> 
> So done, but before submitting a new patch....
> 
> >
> > It also seems like this TODO should be addressed:
> > +        /* TODO (regXboi): this next line is needed for the 3 HVs, 3 LS,
> > +         * 3 lports/LS, 1 LR test case, but has the potential side
> effect
> > +         * of defeating quiet mode once a logical router leads to
> creating
> > +         * patch ports. Need to understand the failure mode better and
> > +         * what is needed to remove this. */
> > +        force_full_process();
> 
> I've been looking at what happens here and I'm seeing some signatures
> that concern me.  The test case that fails is no longer the cited one above
> but is "2 HVs, 4 lports/HV, localnet ports" ...
> 
> What I'm seeing when I peer in is that the information populating the
> local_datapath structures doesn't appear to be consistent on each pass
> through binding_run: Looking at the ovn-controller process under hv2
> for the above test (when it passes), I'll see signatures that look
> like the following:
> 
> 2016-10-04T21:57:58.257Z|00020|physical|INFO|looking at binding record
> 3736404b-c69d-4878-8d45-81ad9be06f5f
> 2016-10-04T21:57:58.257Z|00021|physical|INFO|dp_key is 1
> 2016-10-04T21:57:58.257Z|00022|physical|INFO|looking for ofport of lp11
> (LP)
> 2016-10-04T21:57:58.257Z|00023|physical|INFO|looking for ofport of ln1
> (localnet)
> ...
> 2016-10-04T21:57:58.259Z|00034|physical|INFO|looking at binding record
> 3736404b-c69d-4878-8d45-81ad9be06f5f
> 2016-10-04T21:57:58.259Z|00035|physical|INFO|dp_key is 1
> 2016-10-04T21:57:58.259Z|00036|physical|INFO|looking for ofport of lp11
> (LP)
> 2016-10-04T21:57:58.259Z|00037|physical|INFO|looking for tunnel to hv1
> ...

This sort of oscillation seems super-weird to me.  I'd also like to
learn more.
Ryan Moats Oct. 5, 2016, 5:52 p.m. UTC | #4
Ben Pfaff <blp@ovn.org> wrote on 10/05/2016 12:37:26 PM:

> From: Ben Pfaff <blp@ovn.org>
> To: Ryan Moats/Omaha/IBM@IBMUS
> Cc: dev@openvswitch.org
> Date: 10/05/2016 12:37 PM
> Subject: Re: [ovs-dev,v2,2/4] ovn-controller: add quiet mode
>
> On Tue, Oct 04, 2016 at 05:11:37PM -0500, Ryan Moats wrote:
> > Ben Pfaff <blp@ovn.org> wrote on 10/04/2016 12:14:32 PM:
> >
> > > From: Ben Pfaff <blp@ovn.org>
> > > To: Ryan Moats/Omaha/IBM@IBMUS
> > > Cc: dev@openvswitch.org
> > > Date: 10/04/2016 12:14 PM
> > > Subject: Re: [ovs-dev,v2,2/4] ovn-controller: add quiet mode
> > >
> > > On Wed, Aug 31, 2016 at 03:22:44PM +0000, Ryan Moats wrote:
> > > > As discussed in [1], what the incremental processing code
> > > > actually accomplished was that the ovn-controller would
> > > > be "quiet" and not burn CPU when things weren't changing.
> > > > This patch set recreates this state by calculating whether
> > > > changes have occured that would require a full calculation
> > > > to be performed.  It does this by persisting a copy of
> > > > the localvif_to_ofport and tunnel information in the
> > > > controller module, rather than in the physical.c module
> > > > as was the case with previous commits.
> > > >
> > > > [1] http://openvswitch.org/pipermail/dev/2016-August/078272.html
> > > >
> > > > Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
> > >
> > > Hi Ryan.
> > >
> > > I like the idea behind this patch.  However, it no longer applies to
> > > master, so it needs a rebase.
> >
> > So done, but before submitting a new patch....
> >
> > >
> > > It also seems like this TODO should be addressed:
> > > +        /* TODO (regXboi): this next line is needed for the 3 HVs, 3
LS,
> > > +         * 3 lports/LS, 1 LR test case, but has the potential side
> > effect
> > > +         * of defeating quiet mode once a logical router leads to
> > creating
> > > +         * patch ports. Need to understand the failure mode better
and
> > > +         * what is needed to remove this. */
> > > +        force_full_process();
> >
> > I've been looking at what happens here and I'm seeing some signatures
> > that concern me.  The test case that fails is no longer the cited one
above
> > but is "2 HVs, 4 lports/HV, localnet ports" ...
> >
> > What I'm seeing when I peer in is that the information populating the
> > local_datapath structures doesn't appear to be consistent on each pass
> > through binding_run: Looking at the ovn-controller process under hv2
> > for the above test (when it passes), I'll see signatures that look
> > like the following:
> >
> > 2016-10-04T21:57:58.257Z|00020|physical|INFO|looking at binding record
> > 3736404b-c69d-4878-8d45-81ad9be06f5f
> > 2016-10-04T21:57:58.257Z|00021|physical|INFO|dp_key is 1
> > 2016-10-04T21:57:58.257Z|00022|physical|INFO|looking for ofport of lp11
> > (LP)
> > 2016-10-04T21:57:58.257Z|00023|physical|INFO|looking for ofport of ln1
> > (localnet)
> > ...
> > 2016-10-04T21:57:58.259Z|00034|physical|INFO|looking at binding record
> > 3736404b-c69d-4878-8d45-81ad9be06f5f
> > 2016-10-04T21:57:58.259Z|00035|physical|INFO|dp_key is 1
> > 2016-10-04T21:57:58.259Z|00036|physical|INFO|looking for ofport of lp11
> > (LP)
> > 2016-10-04T21:57:58.259Z|00037|physical|INFO|looking for tunnel to hv1
> > ...
>
> This sort of oscillation seems super-weird to me.  I'd also like to
> learn more.
>

I found the problem and fixed it in the v3 patch set - some of the methods
(patch_run for example) now include a test of ovs_idl_txn before they run
and some don't (physical_run for example), so you can see guess what was
happening...

(If we had a txn to work with everything is fine, but without one,
physical_run was trying to calculate flows based on local datapaths
with no localnet ports being defined, which leads to the above
oscillation.)

I added a txn check to the logic for whether to call things or not in
ovn-controller.c and I added a note in the email about thinking that a
follow-on patch to clean up all of the ovs_idl_txn gates might make sense
(so that nobody trips on this in the future)

Ryan
Ben Pfaff Oct. 5, 2016, 6:04 p.m. UTC | #5
On Wed, Oct 05, 2016 at 12:52:57PM -0500, Ryan Moats wrote:
> Ben Pfaff <blp@ovn.org> wrote on 10/05/2016 12:37:26 PM:
> 
> > From: Ben Pfaff <blp@ovn.org>
> > To: Ryan Moats/Omaha/IBM@IBMUS
> > Cc: dev@openvswitch.org
> > Date: 10/05/2016 12:37 PM
> > Subject: Re: [ovs-dev,v2,2/4] ovn-controller: add quiet mode
> >
> > On Tue, Oct 04, 2016 at 05:11:37PM -0500, Ryan Moats wrote:
> > > Ben Pfaff <blp@ovn.org> wrote on 10/04/2016 12:14:32 PM:
> > >
> > > > From: Ben Pfaff <blp@ovn.org>
> > > > To: Ryan Moats/Omaha/IBM@IBMUS
> > > > Cc: dev@openvswitch.org
> > > > Date: 10/04/2016 12:14 PM
> > > > Subject: Re: [ovs-dev,v2,2/4] ovn-controller: add quiet mode
> > > >
> > > > On Wed, Aug 31, 2016 at 03:22:44PM +0000, Ryan Moats wrote:
> > > > > As discussed in [1], what the incremental processing code
> > > > > actually accomplished was that the ovn-controller would
> > > > > be "quiet" and not burn CPU when things weren't changing.
> > > > > This patch set recreates this state by calculating whether
> > > > > changes have occured that would require a full calculation
> > > > > to be performed.  It does this by persisting a copy of
> > > > > the localvif_to_ofport and tunnel information in the
> > > > > controller module, rather than in the physical.c module
> > > > > as was the case with previous commits.
> > > > >
> > > > > [1] http://openvswitch.org/pipermail/dev/2016-August/078272.html
> > > > >
> > > > > Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
> > > >
> > > > Hi Ryan.
> > > >
> > > > I like the idea behind this patch.  However, it no longer applies to
> > > > master, so it needs a rebase.
> > >
> > > So done, but before submitting a new patch....
> > >
> > > >
> > > > It also seems like this TODO should be addressed:
> > > > +        /* TODO (regXboi): this next line is needed for the 3 HVs, 3
> LS,
> > > > +         * 3 lports/LS, 1 LR test case, but has the potential side
> > > effect
> > > > +         * of defeating quiet mode once a logical router leads to
> > > creating
> > > > +         * patch ports. Need to understand the failure mode better
> and
> > > > +         * what is needed to remove this. */
> > > > +        force_full_process();
> > >
> > > I've been looking at what happens here and I'm seeing some signatures
> > > that concern me.  The test case that fails is no longer the cited one
> above
> > > but is "2 HVs, 4 lports/HV, localnet ports" ...
> > >
> > > What I'm seeing when I peer in is that the information populating the
> > > local_datapath structures doesn't appear to be consistent on each pass
> > > through binding_run: Looking at the ovn-controller process under hv2
> > > for the above test (when it passes), I'll see signatures that look
> > > like the following:
> > >
> > > 2016-10-04T21:57:58.257Z|00020|physical|INFO|looking at binding record
> > > 3736404b-c69d-4878-8d45-81ad9be06f5f
> > > 2016-10-04T21:57:58.257Z|00021|physical|INFO|dp_key is 1
> > > 2016-10-04T21:57:58.257Z|00022|physical|INFO|looking for ofport of lp11
> > > (LP)
> > > 2016-10-04T21:57:58.257Z|00023|physical|INFO|looking for ofport of ln1
> > > (localnet)
> > > ...
> > > 2016-10-04T21:57:58.259Z|00034|physical|INFO|looking at binding record
> > > 3736404b-c69d-4878-8d45-81ad9be06f5f
> > > 2016-10-04T21:57:58.259Z|00035|physical|INFO|dp_key is 1
> > > 2016-10-04T21:57:58.259Z|00036|physical|INFO|looking for ofport of lp11
> > > (LP)
> > > 2016-10-04T21:57:58.259Z|00037|physical|INFO|looking for tunnel to hv1
> > > ...
> >
> > This sort of oscillation seems super-weird to me.  I'd also like to
> > learn more.
> >
> 
> I found the problem and fixed it in the v3 patch set - some of the methods
> (patch_run for example) now include a test of ovs_idl_txn before they run
> and some don't (physical_run for example), so you can see guess what was
> happening...
> 
> (If we had a txn to work with everything is fine, but without one,
> physical_run was trying to calculate flows based on local datapaths
> with no localnet ports being defined, which leads to the above
> oscillation.)
> 
> I added a txn check to the logic for whether to call things or not in
> ovn-controller.c and I added a note in the email about thinking that a
> follow-on patch to clean up all of the ovs_idl_txn gates might make sense
> (so that nobody trips on this in the future)

It sounds like you found a bug that should be fixed independent of
"quiet mode".  Should it be two patches, one of which we backport to
branch-2.6?
Ryan Moats Oct. 5, 2016, 6:19 p.m. UTC | #6
Ben Pfaff <blp@ovn.org> wrote on 10/05/2016 01:04:36 PM:

> From: Ben Pfaff <blp@ovn.org>
> To: Ryan Moats/Omaha/IBM@IBMUS
> Cc: dev@openvswitch.org
> Date: 10/05/2016 01:04 PM
> Subject: Re: [ovs-dev,v2,2/4] ovn-controller: add quiet mode
>
> On Wed, Oct 05, 2016 at 12:52:57PM -0500, Ryan Moats wrote:
> > Ben Pfaff <blp@ovn.org> wrote on 10/05/2016 12:37:26 PM:
> >
> > > From: Ben Pfaff <blp@ovn.org>
> > > To: Ryan Moats/Omaha/IBM@IBMUS
> > > Cc: dev@openvswitch.org
> > > Date: 10/05/2016 12:37 PM
> > > Subject: Re: [ovs-dev,v2,2/4] ovn-controller: add quiet mode
> > >
> > > On Tue, Oct 04, 2016 at 05:11:37PM -0500, Ryan Moats wrote:
> > > > Ben Pfaff <blp@ovn.org> wrote on 10/04/2016 12:14:32 PM:
> > > >
> > > > > From: Ben Pfaff <blp@ovn.org>
> > > > > To: Ryan Moats/Omaha/IBM@IBMUS
> > > > > Cc: dev@openvswitch.org
> > > > > Date: 10/04/2016 12:14 PM
> > > > > Subject: Re: [ovs-dev,v2,2/4] ovn-controller: add quiet mode
> > > > >
> > > > > On Wed, Aug 31, 2016 at 03:22:44PM +0000, Ryan Moats wrote:
> > > > > > As discussed in [1], what the incremental processing code
> > > > > > actually accomplished was that the ovn-controller would
> > > > > > be "quiet" and not burn CPU when things weren't changing.
> > > > > > This patch set recreates this state by calculating whether
> > > > > > changes have occured that would require a full calculation
> > > > > > to be performed.  It does this by persisting a copy of
> > > > > > the localvif_to_ofport and tunnel information in the
> > > > > > controller module, rather than in the physical.c module
> > > > > > as was the case with previous commits.
> > > > > >
> > > > > > [1]
http://openvswitch.org/pipermail/dev/2016-August/078272.html
> > > > > >
> > > > > > Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
> > > > >
> > > > > Hi Ryan.
> > > > >
> > > > > I like the idea behind this patch.  However, it no longer applies
to
> > > > > master, so it needs a rebase.
> > > >
> > > > So done, but before submitting a new patch....
> > > >
> > > > >
> > > > > It also seems like this TODO should be addressed:
> > > > > +        /* TODO (regXboi): this next line is needed for the 3
HVs, 3
> > LS,
> > > > > +         * 3 lports/LS, 1 LR test case, but has the potential
side
> > > > effect
> > > > > +         * of defeating quiet mode once a logical router leads
to
> > > > creating
> > > > > +         * patch ports. Need to understand the failure mode
better
> > and
> > > > > +         * what is needed to remove this. */
> > > > > +        force_full_process();
> > > >
> > > > I've been looking at what happens here and I'm seeing some
signatures
> > > > that concern me.  The test case that fails is no longer the cited
one
> > above
> > > > but is "2 HVs, 4 lports/HV, localnet ports" ...
> > > >
> > > > What I'm seeing when I peer in is that the information populating
the
> > > > local_datapath structures doesn't appear to be consistent on each
pass
> > > > through binding_run: Looking at the ovn-controller process under
hv2
> > > > for the above test (when it passes), I'll see signatures that look
> > > > like the following:
> > > >
> > > > 2016-10-04T21:57:58.257Z|00020|physical|INFO|looking at binding
record
> > > > 3736404b-c69d-4878-8d45-81ad9be06f5f
> > > > 2016-10-04T21:57:58.257Z|00021|physical|INFO|dp_key is 1
> > > > 2016-10-04T21:57:58.257Z|00022|physical|INFO|looking for ofport of
lp11
> > > > (LP)
> > > > 2016-10-04T21:57:58.257Z|00023|physical|INFO|looking for ofport of
ln1
> > > > (localnet)
> > > > ...
> > > > 2016-10-04T21:57:58.259Z|00034|physical|INFO|looking at binding
record
> > > > 3736404b-c69d-4878-8d45-81ad9be06f5f
> > > > 2016-10-04T21:57:58.259Z|00035|physical|INFO|dp_key is 1
> > > > 2016-10-04T21:57:58.259Z|00036|physical|INFO|looking for ofport of
lp11
> > > > (LP)
> > > > 2016-10-04T21:57:58.259Z|00037|physical|INFO|looking for tunnel to
hv1
> > > > ...
> > >
> > > This sort of oscillation seems super-weird to me.  I'd also like to
> > > learn more.
> > >
> >
> > I found the problem and fixed it in the v3 patch set - some of the
methods
> > (patch_run for example) now include a test of ovs_idl_txn before they
run
> > and some don't (physical_run for example), so you can see guess what
was
> > happening...
> >
> > (If we had a txn to work with everything is fine, but without one,
> > physical_run was trying to calculate flows based on local datapaths
> > with no localnet ports being defined, which leads to the above
> > oscillation.)
> >
> > I added a txn check to the logic for whether to call things or not in
> > ovn-controller.c and I added a note in the email about thinking that a
> > follow-on patch to clean up all of the ovs_idl_txn gates might make
sense
> > (so that nobody trips on this in the future)
>
> It sounds like you found a bug that should be fixed independent of
> "quiet mode".  Should it be two patches, one of which we backport to
> branch-2.6?

I'd say *yes* to two patches, and *yes* to backport to branch-2.6, but
I'd not split what I've done as it is just a quick hack - I'd say let's
fix the issue *right* and then the v3 patch can be rebased on the result.

Ryan
Ryan Moats Oct. 6, 2016, 1:58 a.m. UTC | #7
"dev" <dev-bounces@openvswitch.org> wrote on 10/05/2016 01:19:34 PM:

> From: Ryan Moats/Omaha/IBM@IBMUS
> To: Ben Pfaff <blp@ovn.org>
> Cc: dev@openvswitch.org
> Date: 10/05/2016 01:20 PM
> Subject: Re: [ovs-dev] [ovs-dev,v2,2/4] ovn-controller: add quiet mode
> Sent by: "dev" <dev-bounces@openvswitch.org>
>
> Ben Pfaff <blp@ovn.org> wrote on 10/05/2016 01:04:36 PM:
>
> > From: Ben Pfaff <blp@ovn.org>
> > To: Ryan Moats/Omaha/IBM@IBMUS
> > Cc: dev@openvswitch.org
> > Date: 10/05/2016 01:04 PM
> > Subject: Re: [ovs-dev,v2,2/4] ovn-controller: add quiet mode
> >
> > On Wed, Oct 05, 2016 at 12:52:57PM -0500, Ryan Moats wrote:
> > > Ben Pfaff <blp@ovn.org> wrote on 10/05/2016 12:37:26 PM:
> > >
> > > > From: Ben Pfaff <blp@ovn.org>
> > > > To: Ryan Moats/Omaha/IBM@IBMUS
> > > > Cc: dev@openvswitch.org
> > > > Date: 10/05/2016 12:37 PM
> > > > Subject: Re: [ovs-dev,v2,2/4] ovn-controller: add quiet mode
> > > >
> > > > On Tue, Oct 04, 2016 at 05:11:37PM -0500, Ryan Moats wrote:
> > > > > Ben Pfaff <blp@ovn.org> wrote on 10/04/2016 12:14:32 PM:
> > > > >
> > > > > > From: Ben Pfaff <blp@ovn.org>
> > > > > > To: Ryan Moats/Omaha/IBM@IBMUS
> > > > > > Cc: dev@openvswitch.org
> > > > > > Date: 10/04/2016 12:14 PM
> > > > > > Subject: Re: [ovs-dev,v2,2/4] ovn-controller: add quiet mode
> > > > > >
> > > > > > On Wed, Aug 31, 2016 at 03:22:44PM +0000, Ryan Moats wrote:
> > > > > > > As discussed in [1], what the incremental processing code
> > > > > > > actually accomplished was that the ovn-controller would
> > > > > > > be "quiet" and not burn CPU when things weren't changing.
> > > > > > > This patch set recreates this state by calculating whether
> > > > > > > changes have occured that would require a full calculation
> > > > > > > to be performed.  It does this by persisting a copy of
> > > > > > > the localvif_to_ofport and tunnel information in the
> > > > > > > controller module, rather than in the physical.c module
> > > > > > > as was the case with previous commits.
> > > > > > >
> > > > > > > [1]
> http://openvswitch.org/pipermail/dev/2016-August/078272.html
> > > > > > >
> > > > > > > Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
> > > > > >
> > > > > > Hi Ryan.
> > > > > >
> > > > > > I like the idea behind this patch.  However, it no longer
applies
> to
> > > > > > master, so it needs a rebase.
> > > > >
> > > > > So done, but before submitting a new patch....
> > > > >
> > > > > >
> > > > > > It also seems like this TODO should be addressed:
> > > > > > +        /* TODO (regXboi): this next line is needed for the 3
> HVs, 3
> > > LS,
> > > > > > +         * 3 lports/LS, 1 LR test case, but has the potential
> side
> > > > > effect
> > > > > > +         * of defeating quiet mode once a logical router leads
> to
> > > > > creating
> > > > > > +         * patch ports. Need to understand the failure mode
> better
> > > and
> > > > > > +         * what is needed to remove this. */
> > > > > > +        force_full_process();
> > > > >
> > > > > I've been looking at what happens here and I'm seeing some
> signatures
> > > > > that concern me.  The test case that fails is no longer the cited
> one
> > > above
> > > > > but is "2 HVs, 4 lports/HV, localnet ports" ...
> > > > >
> > > > > What I'm seeing when I peer in is that the information populating
> the
> > > > > local_datapath structures doesn't appear to be consistent on each
> pass
> > > > > through binding_run: Looking at the ovn-controller process under
> hv2
> > > > > for the above test (when it passes), I'll see signatures that
look
> > > > > like the following:
> > > > >
> > > > > 2016-10-04T21:57:58.257Z|00020|physical|INFO|looking at binding
> record
> > > > > 3736404b-c69d-4878-8d45-81ad9be06f5f
> > > > > 2016-10-04T21:57:58.257Z|00021|physical|INFO|dp_key is 1
> > > > > 2016-10-04T21:57:58.257Z|00022|physical|INFO|looking for ofport
of
> lp11
> > > > > (LP)
> > > > > 2016-10-04T21:57:58.257Z|00023|physical|INFO|looking for ofport
of
> ln1
> > > > > (localnet)
> > > > > ...
> > > > > 2016-10-04T21:57:58.259Z|00034|physical|INFO|looking at binding
> record
> > > > > 3736404b-c69d-4878-8d45-81ad9be06f5f
> > > > > 2016-10-04T21:57:58.259Z|00035|physical|INFO|dp_key is 1
> > > > > 2016-10-04T21:57:58.259Z|00036|physical|INFO|looking for ofport
of
> lp11
> > > > > (LP)
> > > > > 2016-10-04T21:57:58.259Z|00037|physical|INFO|looking for tunnel
to
> hv1
> > > > > ...
> > > >
> > > > This sort of oscillation seems super-weird to me.  I'd also like to
> > > > learn more.
> > > >
> > >
> > > I found the problem and fixed it in the v3 patch set - some of the
> methods
> > > (patch_run for example) now include a test of ovs_idl_txn before they
> run
> > > and some don't (physical_run for example), so you can see guess what
> was
> > > happening...
> > >
> > > (If we had a txn to work with everything is fine, but without one,
> > > physical_run was trying to calculate flows based on local datapaths
> > > with no localnet ports being defined, which leads to the above
> > > oscillation.)
> > >
> > > I added a txn check to the logic for whether to call things or not in
> > > ovn-controller.c and I added a note in the email about thinking that
a
> > > follow-on patch to clean up all of the ovs_idl_txn gates might make
> sense
> > > (so that nobody trips on this in the future)
> >
> > It sounds like you found a bug that should be fixed independent of
> > "quiet mode".  Should it be two patches, one of which we backport to
> > branch-2.6?
>
> I'd say *yes* to two patches, and *yes* to backport to branch-2.6, but
> I'd not split what I've done as it is just a quick hack - I'd say let's

The patch for fixing this bug is available at

http://patchwork.ozlabs.org/patch/678692/

I'll rebase the v3 of quiet mode patch onto the above once it merges...

Ryan
diff mbox

Patch

diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c
index d8e111d..5b1434e 100644
--- a/ovn/controller/ofctrl.c
+++ b/ovn/controller/ofctrl.c
@@ -366,6 +366,8 @@  run_S_CLEAR_FLOWS(void)
     queue_msg(encode_group_mod(&gm));
     ofputil_uninit_group_mod(&gm);
 
+    force_full_process();
+
     /* Clear existing groups, to match the state of the switch. */
     if (groups) {
         ovn_group_table_clear(groups, true);
diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
index c2e667b..cb0ae40 100644
--- a/ovn/controller/ovn-controller.c
+++ b/ovn/controller/ovn-controller.c
@@ -304,6 +304,23 @@  get_nb_cfg(struct ovsdb_idl *idl)
     return sb ? sb->nb_cfg : 0;
 }
 
+/* Contains mapping of localvifs to OF ports for detecting if the physical
+ * configuration of the switch has changed. */
+static struct simap localvif_to_ofport =
+    SIMAP_INITIALIZER(&localvif_to_ofport);
+
+/* Contains the list of known tunnels. */
+static struct hmap tunnels = HMAP_INITIALIZER(&tunnels);
+
+/* The last sequence number seen from the southbound IDL. */
+static unsigned int seqno = 0;
+
+void
+force_full_process(void)
+{
+    seqno = 0;
+}
+
 int
 main(int argc, char *argv[])
 {
@@ -427,39 +444,51 @@  main(int argc, char *argv[])
                         &all_lports);
         }
 
+        enum mf_field_id mff_ovn_geneve;
+        bool physical_change = true;
         if (br_int && chassis_id) {
+            mff_ovn_geneve = ofctrl_run(br_int);
+            physical_change = detect_and_save_physical_changes(
+                &localvif_to_ofport, &tunnels, mff_ovn_geneve, br_int,
+                chassis_id);
+            unsigned int cur_seqno = ovsdb_idl_get_seqno(ovnsb_idl_loop.idl);
+
             patch_run(&ctx, br_int, chassis_id, &local_datapaths,
                       &patched_datapaths);
 
             static struct lport_index lports;
-            static struct mcgroup_index mcgroups;
             lport_index_init(&lports, ctx.ovnsb_idl);
-            mcgroup_index_init(&mcgroups, ctx.ovnsb_idl);
-
-            enum mf_field_id mff_ovn_geneve = ofctrl_run(br_int);
-
-            pinctrl_run(&ctx, &lports, br_int, chassis_id, &local_datapaths);
+            pinctrl_run(&ctx, &lports, br_int, chassis_id,
+                        &local_datapaths);
             update_ct_zones(&all_lports, &patched_datapaths, &ct_zones,
                             ct_zone_bitmap);
 
-            struct hmap flow_table = HMAP_INITIALIZER(&flow_table);
-            lflow_run(&ctx, &lports, &mcgroups, &local_datapaths,
-                      &patched_datapaths, &group_table, &ct_zones,
-                      &flow_table);
-
-            physical_run(&ctx, mff_ovn_geneve,
-                         br_int, chassis_id, &ct_zones, &flow_table,
-                         &local_datapaths, &patched_datapaths);
-
-            ofctrl_put(&flow_table, get_nb_cfg(ctx.ovnsb_idl));
-            hmap_destroy(&flow_table);
-            if (ctx.ovnsb_idl_txn) {
-                int64_t cur_cfg = ofctrl_get_cur_cfg();
-                if (cur_cfg && cur_cfg != chassis->nb_cfg) {
-                    sbrec_chassis_set_nb_cfg(chassis, cur_cfg);
+
+            if (physical_change || seqno < cur_seqno) {
+                seqno = cur_seqno;
+
+                static struct mcgroup_index mcgroups;
+                mcgroup_index_init(&mcgroups, ctx.ovnsb_idl);
+
+                struct hmap flow_table = HMAP_INITIALIZER(&flow_table);
+                lflow_run(&ctx, &lports, &mcgroups, &local_datapaths,
+                          &patched_datapaths, &group_table, &ct_zones,
+                          &flow_table);
+
+                physical_run(&ctx, mff_ovn_geneve,
+                             br_int, chassis_id, &ct_zones, &flow_table,
+                             &local_datapaths, &patched_datapaths);
+
+                ofctrl_put(&flow_table, get_nb_cfg(ctx.ovnsb_idl));
+                hmap_destroy(&flow_table);
+                if (ctx.ovnsb_idl_txn) {
+                    int64_t cur_cfg = ofctrl_get_cur_cfg();
+                    if (cur_cfg && cur_cfg != chassis->nb_cfg) {
+                        sbrec_chassis_set_nb_cfg(chassis, cur_cfg);
+                    }
                 }
+                mcgroup_index_destroy(&mcgroups);
             }
-            mcgroup_index_destroy(&mcgroups);
             lport_index_destroy(&lports);
         }
 
diff --git a/ovn/controller/ovn-controller.h b/ovn/controller/ovn-controller.h
index 470b1f5..14cf861 100644
--- a/ovn/controller/ovn-controller.h
+++ b/ovn/controller/ovn-controller.h
@@ -76,5 +76,6 @@  enum chassis_tunnel_type {
 
 uint32_t get_tunnel_type(const char *name);
 
+void force_full_process(void);
 
 #endif /* ovn/ovn-controller.h */
diff --git a/ovn/controller/patch.c b/ovn/controller/patch.c
index c8e47b4..d2bb30f 100644
--- a/ovn/controller/patch.c
+++ b/ovn/controller/patch.c
@@ -96,6 +96,7 @@  create_patch_port(struct controller_ctx *ctx,
     ovsrec_bridge_set_ports(src, ports, src->n_ports + 1);
 
     free(ports);
+    force_full_process();
 }
 
 static void
@@ -124,6 +125,7 @@  remove_port(struct controller_ctx *ctx,
             ovsrec_bridge_set_ports(bridge, new_ports, bridge->n_ports - 1);
             free(new_ports);
             ovsrec_port_delete(port);
+            force_full_process();
             return;
         }
     }
@@ -243,6 +245,12 @@  add_bridge_mappings(struct controller_ctx *ctx,
                           br_int, name1, br_ln, name2, existing_ports);
         create_patch_port(ctx, patch_port_id, binding->logical_port,
                           br_ln, name2, br_int, name1, existing_ports);
+        /* TODO (regXboi): this next line is needed for the 3 HVs, 3 LS,
+         * 3 lports/LS, 1 LR test case, but has the potential side effect
+         * of defeating quiet mode once a logical router leads to creating
+         * patch ports. Need to understand the failure mode better and
+         * what is needed to remove this. */
+        force_full_process();
         free(name1);
         free(name2);
     }
@@ -270,6 +278,7 @@  add_patched_datapath(struct hmap *patched_datapaths,
     /* stale is set to false. */
     hmap_insert(patched_datapaths, &pd->hmap_node,
                 binding_rec->datapath->tunnel_key);
+    force_full_process();
 }
 
 static void
@@ -296,6 +305,7 @@  add_logical_patch_ports_postprocess(struct hmap *patched_datapaths)
             hmap_remove(patched_datapaths, &pd_cur_node->hmap_node);
             free(pd_cur_node->key);
             free(pd_cur_node);
+            force_full_process();
         }
     }
 }
@@ -368,6 +378,7 @@  add_logical_patch_ports(struct controller_ctx *ctx,
             if (local_port) {
                 if (binding->chassis != chassis_rec && ctx->ovnsb_idl_txn) {
                     sbrec_port_binding_set_chassis(binding, chassis_rec);
+                    force_full_process();
                 }
             }
         }
diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
index 9fa6754..24cc277 100644
--- a/ovn/controller/physical.c
+++ b/ovn/controller/physical.c
@@ -55,10 +55,6 @@  physical_register_ovs_idl(struct ovsdb_idl *ovs_idl)
     ovsdb_idl_add_column(ovs_idl, &ovsrec_interface_col_external_ids);
 }
 
-static struct simap localvif_to_ofport =
-    SIMAP_INITIALIZER(&localvif_to_ofport);
-static struct hmap tunnels = HMAP_INITIALIZER(&tunnels);
-
 /* Maps from a chassis to the OpenFlow port number of the tunnel that can be
  * used to reach that chassis. */
 struct chassis_tunnel {
@@ -69,11 +65,11 @@  struct chassis_tunnel {
 };
 
 static struct chassis_tunnel *
-chassis_tunnel_find(const char *chassis_id)
+chassis_tunnel_find(struct hmap *tunnels_p, const char *chassis_id)
 {
     struct chassis_tunnel *tun;
     HMAP_FOR_EACH_WITH_HASH (tun, hmap_node, hash_string(chassis_id, 0),
-                             &tunnels) {
+                             tunnels_p) {
         if (!strcmp(tun->chassis_id, chassis_id)) {
             return tun;
         }
@@ -161,7 +157,9 @@  consider_port_binding(enum mf_field_id mff_ovn_geneve,
                       struct hmap *patched_datapaths,
                       const struct sbrec_port_binding *binding,
                       struct ofpbuf *ofpacts_p,
-                      struct hmap *flow_table)
+                      struct hmap *flow_table,
+                      struct simap *localvif_to_ofport_p,
+                      struct hmap *tunnels_p)
 {
     /* Skip the port binding if the port is on a datapath that is neither
      * local nor with any logical patch port connected, because local ports
@@ -218,13 +216,13 @@  consider_port_binding(enum mf_field_id mff_ovn_geneve,
         if (!binding->tag) {
             return;
         }
-        ofport = u16_to_ofp(simap_get(&localvif_to_ofport,
+        ofport = u16_to_ofp(simap_get(localvif_to_ofport_p,
                                       binding->parent_port));
         if (ofport) {
             tag = *binding->tag;
         }
     } else {
-        ofport = u16_to_ofp(simap_get(&localvif_to_ofport,
+        ofport = u16_to_ofp(simap_get(localvif_to_ofport_p,
                                       binding->logical_port));
         if ((!strcmp(binding->type, "localnet")
             || !strcmp(binding->type, "l2gateway"))
@@ -243,13 +241,13 @@  consider_port_binding(enum mf_field_id mff_ovn_geneve,
             return;
         }
         if (localnet_port) {
-            ofport = u16_to_ofp(simap_get(&localvif_to_ofport,
+            ofport = u16_to_ofp(simap_get(localvif_to_ofport_p,
                                           localnet_port->logical_port));
             if (!ofport) {
                 return;
             }
         } else {
-            tun = chassis_tunnel_find(binding->chassis->name);
+            tun = chassis_tunnel_find(tunnels_p, binding->chassis->name);
             if (!tun) {
                 return;
             }
@@ -514,7 +512,9 @@  consider_mc_group(enum mf_field_id mff_ovn_geneve,
                   const struct sbrec_multicast_group *mc,
                   struct ofpbuf *ofpacts_p,
                   struct ofpbuf *remote_ofpacts_p,
-                  struct hmap *flow_table)
+                  struct hmap *flow_table,
+                  struct simap *localvif_to_ofport_p,
+                  struct hmap *tunnels_p)
 {
     struct sset remote_chassis = SSET_INITIALIZER(&remote_chassis);
     struct match match;
@@ -557,7 +557,7 @@  consider_mc_group(enum mf_field_id mff_ovn_geneve,
             put_load(port->tunnel_key, MFF_LOG_OUTPORT, 0, 32,
                      remote_ofpacts_p);
             put_resubmit(OFTABLE_CHECK_LOOPBACK, remote_ofpacts_p);
-        } else if (simap_contains(&localvif_to_ofport,
+        } else if (simap_contains(localvif_to_ofport_p,
                            (port->parent_port && *port->parent_port)
                            ? port->parent_port : port->logical_port)) {
             put_load(port->tunnel_key, MFF_LOG_OUTPORT, 0, 32, ofpacts_p);
@@ -603,7 +603,7 @@  consider_mc_group(enum mf_field_id mff_ovn_geneve,
         const struct chassis_tunnel *prev = NULL;
         SSET_FOR_EACH (chassis, &remote_chassis) {
             const struct chassis_tunnel *tun
-                = chassis_tunnel_find(chassis);
+                = chassis_tunnel_find(tunnels_p, chassis);
             if (!tun) {
                 continue;
             }
@@ -627,13 +627,16 @@  consider_mc_group(enum mf_field_id mff_ovn_geneve,
     sset_destroy(&remote_chassis);
 }
 
-void
-physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
-             const struct ovsrec_bridge *br_int, const char *this_chassis_id,
-             const struct simap *ct_zones, struct hmap *flow_table,
-             struct hmap *local_datapaths, struct hmap *patched_datapaths)
+/* Scans the bridge 'br_int' and compares it to the supplied expected state.
+ * Returns true and updates the expected state if changes are detected.
+ * Returns false if no changes are found. */
+bool
+detect_and_save_physical_changes(struct simap *localvif_to_ofport_p,
+                                 struct hmap *tunnels_p,
+                                 enum mf_field_id mff_ovn_geneve,
+                                 const struct ovsrec_bridge *br_int,
+                                 const char *this_chassis_id)
 {
-
     /* This bool tracks physical mapping changes. */
     bool physical_map_changed = false;
 
@@ -641,6 +644,7 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
         SIMAP_INITIALIZER(&new_localvif_to_ofport);
     struct simap new_tunnel_to_ofport =
         SIMAP_INITIALIZER(&new_tunnel_to_ofport);
+
     for (int i = 0; i < br_int->n_ports; i++) {
         const struct ovsrec_port *port_rec = br_int->ports[i];
         if (!strcmp(port_rec->name, br_int->name)) {
@@ -709,7 +713,8 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
                 }
 
                 simap_put(&new_tunnel_to_ofport, chassis_id, ofport);
-                struct chassis_tunnel *tun = chassis_tunnel_find(chassis_id);
+                struct chassis_tunnel *tun = chassis_tunnel_find(tunnels_p,
+                                                                 chassis_id);
                 if (tun) {
                     /* If the tunnel's ofport has changed, update. */
                     if (tun->ofport != u16_to_ofp(ofport) ||
@@ -720,7 +725,7 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
                     }
                 } else {
                     tun = xmalloc(sizeof *tun);
-                    hmap_insert(&tunnels, &tun->hmap_node,
+                    hmap_insert(tunnels_p, &tun->hmap_node,
                                 hash_string(chassis_id, 0));
                     tun->chassis_id = chassis_id;
                     tun->ofport = u16_to_ofp(ofport);
@@ -740,9 +745,9 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
 
     /* Remove tunnels that are no longer here. */
     struct chassis_tunnel *tun, *tun_next;
-    HMAP_FOR_EACH_SAFE (tun, tun_next, hmap_node, &tunnels) {
+    HMAP_FOR_EACH_SAFE (tun, tun_next, hmap_node, tunnels_p) {
         if (!simap_find(&new_tunnel_to_ofport, tun->chassis_id)) {
-            hmap_remove(&tunnels, &tun->hmap_node);
+            hmap_remove(tunnels_p, &tun->hmap_node);
             physical_map_changed = true;
             free(tun);
         }
@@ -750,29 +755,43 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
 
     /* Capture changed or removed openflow ports. */
     struct simap_node *vif_name, *vif_name_next;
-    SIMAP_FOR_EACH_SAFE (vif_name, vif_name_next, &localvif_to_ofport) {
+    SIMAP_FOR_EACH_SAFE (vif_name, vif_name_next, localvif_to_ofport_p) {
         int newport;
         if ((newport = simap_get(&new_localvif_to_ofport, vif_name->name))) {
-            if (newport != simap_get(&localvif_to_ofport, vif_name->name)) {
-                simap_put(&localvif_to_ofport, vif_name->name, newport);
+            if (newport != simap_get(localvif_to_ofport_p, vif_name->name)) {
+                simap_put(localvif_to_ofport_p, vif_name->name, newport);
                 physical_map_changed = true;
             }
         } else {
-            simap_find_and_delete(&localvif_to_ofport, vif_name->name);
+            simap_find_and_delete(localvif_to_ofport_p, vif_name->name);
             physical_map_changed = true;
         }
     }
     SIMAP_FOR_EACH (vif_name, &new_localvif_to_ofport) {
-        if (!simap_get(&localvif_to_ofport, vif_name->name)) {
-            simap_put(&localvif_to_ofport, vif_name->name,
+        if (!simap_get(localvif_to_ofport_p, vif_name->name)) {
+            simap_put(localvif_to_ofport_p, vif_name->name,
                       simap_get(&new_localvif_to_ofport, vif_name->name));
             physical_map_changed = true;
         }
     }
-    if (physical_map_changed) {
-        /* Reprocess logical flow table immediately. */
-        poll_immediate_wake();
-    }
+
+    simap_destroy(&new_localvif_to_ofport);
+    simap_destroy(&new_tunnel_to_ofport);
+
+    return physical_map_changed;
+}
+
+void
+physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
+             const struct ovsrec_bridge *br_int, const char *this_chassis_id,
+             const struct simap *ct_zones, struct hmap *flow_table,
+             struct hmap *local_datapaths, struct hmap *patched_datapaths)
+{
+    struct simap localvif_to_ofport = SIMAP_INITIALIZER(&localvif_to_ofport);
+    struct hmap tunnels = HMAP_INITIALIZER(&tunnels);
+
+    detect_and_save_physical_changes(&localvif_to_ofport, &tunnels,
+                                     mff_ovn_geneve, br_int, this_chassis_id);
 
     struct ofpbuf ofpacts;
     ofpbuf_init(&ofpacts, 0);
@@ -786,7 +805,7 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
          * should clear the old flows to avoid collisions. */
         consider_port_binding(mff_ovn_geneve, ct_zones, local_datapaths,
                               patched_datapaths, binding, &ofpacts,
-                              flow_table);
+                              flow_table, &localvif_to_ofport, &tunnels);
     }
 
     /* Handle output to multicast groups, in tables 32 and 33. */
@@ -799,7 +818,7 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
          * so that we avoid warning messages on collisions. */
         consider_mc_group(mff_ovn_geneve, ct_zones,
                           local_datapaths, mc, &ofpacts, &remote_ofpacts,
-                          flow_table);
+                          flow_table, &localvif_to_ofport, &tunnels);
     }
 
     ofpbuf_uninit(&remote_ofpacts);
@@ -815,6 +834,7 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
      * ports.  We set MFF_LOG_DATAPATH, MFF_LOG_INPORT, and
      * MFF_LOG_OUTPORT from the tunnel key data, then resubmit to table
      * 33 to handle packets to the local hypervisor. */
+    struct chassis_tunnel *tun;
     HMAP_FOR_EACH (tun, hmap_node, &tunnels) {
         struct match match = MATCH_CATCHALL_INITIALIZER;
         match_set_in_port(&match, tun->ofport);
@@ -912,7 +932,9 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
     ofctrl_add_flow(flow_table, OFTABLE_SAVE_INPORT, 0, &match, &ofpacts);
 
     ofpbuf_uninit(&ofpacts);
-
-    simap_destroy(&new_localvif_to_ofport);
-    simap_destroy(&new_tunnel_to_ofport);
+    simap_destroy(&localvif_to_ofport);
+    HMAP_FOR_EACH_POP (tun, hmap_node, &tunnels) {
+        free(tun);
+    }
+    hmap_destroy(&tunnels);
 }
diff --git a/ovn/controller/physical.h b/ovn/controller/physical.h
index 86ce93c..d11615e 100644
--- a/ovn/controller/physical.h
+++ b/ovn/controller/physical.h
@@ -41,6 +41,11 @@  struct simap;
 #define OVN_GENEVE_LEN 4
 
 void physical_register_ovs_idl(struct ovsdb_idl *);
+bool detect_and_save_physical_changes(struct simap *localvif_to_ofport_p,
+                                      struct hmap *tunnels_p,
+                                      enum mf_field_id mff_ovn_geneve,
+                                      const struct ovsrec_bridge *br_int,
+                                      const char *this_chassis_id);
 void physical_run(struct controller_ctx *, enum mf_field_id mff_ovn_geneve,
                   const struct ovsrec_bridge *br_int, const char *chassis_id,
                   const struct simap *ct_zones, struct hmap *flow_table,