mbox series

[ovs-dev,v2,0/3] expr: Optimize OR expressions.

Message ID 20230320103809.3121679-1-i.maximets@ovn.org
Headers show
Series expr: Optimize OR expressions. | expand

Message

Ilya Maximets March 20, 2023, 10:38 a.m. UTC
This patch set covers removal of expressions which are subsets of other
wider expressions and aggregation of a few granular expressions into
wider expressions that cover all of them at once.  This allows to avoid
flow explosion in case of negative matches and reduce the total number
of flows required for address sets.  More details are in commit messages.

Version 2:
  * Became a patch set.
  * Added tests and missing bitmap.h include.
  * Code switched to work with bitwise maskable fields only (ORDINAL).
  * Added a new patch to combine smaller expressions into wider ones.
  * Added a patch to fix a crash uncovered with expression aggregation.

Ilya Maximets (3):
  expr: Remove supersets from OR expressions.
  expr: Avoid crash if all sub-expressions crushed down to 'true'.
  expr: Combine OR sub-expressions into supersets.

 controller/lflow.c      |   5 +-
 lib/expr.c              | 188 +++++++++++++------
 tests/ovn-controller.at | 399 ++++++++++++++++++++--------------------
 tests/ovn.at            | 210 +++++++++++----------
 4 files changed, 443 insertions(+), 359 deletions(-)

Comments

Han Zhou March 20, 2023, 10:31 p.m. UTC | #1
On Mon, Mar 20, 2023 at 3:37 AM Ilya Maximets <i.maximets@ovn.org> wrote:
>
> This patch set covers removal of expressions which are subsets of other
> wider expressions and aggregation of a few granular expressions into
> wider expressions that cover all of them at once.  This allows to avoid
> flow explosion in case of negative matches and reduce the total number
> of flows required for address sets.  More details are in commit messages.
>
> Version 2:
>   * Became a patch set.
>   * Added tests and missing bitmap.h include.
>   * Code switched to work with bitwise maskable fields only (ORDINAL).
>   * Added a new patch to combine smaller expressions into wider ones.
>   * Added a patch to fix a crash uncovered with expression aggregation.
>
> Ilya Maximets (3):
>   expr: Remove supersets from OR expressions.
>   expr: Avoid crash if all sub-expressions crushed down to 'true'.
>   expr: Combine OR sub-expressions into supersets.
>
>  controller/lflow.c      |   5 +-
>  lib/expr.c              | 188 +++++++++++++------
>  tests/ovn-controller.at | 399 ++++++++++++++++++++--------------------
>  tests/ovn.at            | 210 +++++++++++----------
>  4 files changed, 443 insertions(+), 359 deletions(-)
>
> --
> 2.39.2
>

Thanks Ilya for working on this. The same problem was also reported and
discussed briefly in the past: [0], which may be mentioned in reported-by
as well.

I reviewed and tested this series. It definitely works great for the flow
explosion problem caused by negations in expressions. With 5 subnets in !=
{} form, which would have generated hundreds of thousands of flows without
the patch, now ended up with almost nothing.

However, I also see scale problems introduced by this change for more
normal use cases. The loop that tries to combine expressions to supersets
is O(n^2), n = size of an address set. In large scale environments, it is
easy to have more than 10k IPs in an address set - such as when there is a
network policy allowing access from all pods of a big tenant/application. I
reused my scripts for testing my earlier address set I-P patch [1] to test
the performance with this series. As mentioned in the commit message, the
old result was:

Before: ~400ms
After: 11-12ms

Now with this series, it takes 70 seconds for the same test!
As we can see, even before address-set I-P, it took just 400ms. In a large
scale environment, since pods come and go very frequently, even 400ms for
each change would make ovn-controller too busy, and that's why we came up
with address-set I-P, which made it O(1) and much faster. Now with this
change, for each IP change it would take 70s, almost 200 times slower when
recomputing the lflow, not to mention comparing with address-set I-P.

So, I would suggest that the patch 1 should add some logic to restrict the
handling for combining expressions generated by negation (!=) only, and
keep the current logic unchanged for regular non-negative matches. I think
it is possible to add a field in the expr structure to indicate that
information while parsing the != operator.

For the same reason, I'd rather drop patch 3, because the penalty of
disabling I-P is far beyond the gains of reduced number of flows, unless
there are other ways to efficiently combine expression, and we also need to
prove at least in most cases the logic can end up with small enough number
of flows that recomputing-every-time is nothing to worry about, which I am
not even sure it is going to be the case. On the other hand, maybe it is
better to leave the task of crushing large address sets to a central
component or to CMS, and keep ovn-controller simple.

[0] https://www.mail-archive.com/ovs-dev@openvswitch.org/msg61321.html
[1]
https://github.com/ovn-org/ovn/commit/7c0b538762f68b21bad41ed46f779d9084b7a679

Thanks,
Han
Ilya Maximets March 21, 2023, 12:24 p.m. UTC | #2
On 3/20/23 23:31, Han Zhou wrote:
> 
> 
> On Mon, Mar 20, 2023 at 3:37 AM Ilya Maximets <i.maximets@ovn.org <mailto:i.maximets@ovn.org>> wrote:
>>
>> This patch set covers removal of expressions which are subsets of other
>> wider expressions and aggregation of a few granular expressions into
>> wider expressions that cover all of them at once.  This allows to avoid
>> flow explosion in case of negative matches and reduce the total number
>> of flows required for address sets.  More details are in commit messages.
>>
>> Version 2:
>>   * Became a patch set.
>>   * Added tests and missing bitmap.h include.
>>   * Code switched to work with bitwise maskable fields only (ORDINAL).
>>   * Added a new patch to combine smaller expressions into wider ones.
>>   * Added a patch to fix a crash uncovered with expression aggregation.
>>
>> Ilya Maximets (3):
>>   expr: Remove supersets from OR expressions.
>>   expr: Avoid crash if all sub-expressions crushed down to 'true'.
>>   expr: Combine OR sub-expressions into supersets.
>>
>>  controller/lflow.c      |   5 +-
>>  lib/expr.c              | 188 +++++++++++++------
>>  tests/ovn-controller.at <http://ovn-controller.at> | 399 ++++++++++++++++++++--------------------
>>  tests/ovn.at <http://ovn.at>            | 210 +++++++++++----------
>>  4 files changed, 443 insertions(+), 359 deletions(-)
>>
>> --
>> 2.39.2
>>
> 
> Thanks Ilya for working on this. The same problem was also reported and discussed briefly in the past: [0], which may be mentioned in reported-by as well.

That one was more about memory issue on the OVS side, but sure.

> 
> I reviewed and tested this series. It definitely works great for the flow explosion problem caused by negations in expressions. With 5 subnets in != {} form, which would have generated hundreds of thousands of flows without the patch, now ended up with almost nothing.

Nice!

> 
> However, I also see scale problems introduced by this change for more normal use cases. The loop that tries to combine expressions to supersets is O(n^2), n = size of an address set. In large scale environments, it is easy to have more than 10k IPs in an address set - such as when there is a network policy allowing access from all pods of a big tenant/application. I reused my scripts for testing my earlier address set I-P patch [1] to test the performance with this series. As mentioned in the commit message, the old result was:
> 
> Before: ~400ms
> After: 11-12ms
> 
> Now with this series, it takes 70 seconds for the same test!
> As we can see, even before address-set I-P, it took just 400ms. In a large scale environment, since pods come and go very frequently, even 400ms for each change would make ovn-controller too busy, and that's why we came up with address-set I-P, which made it O(1) and much faster. Now with this change, for each IP change it would take 70s, almost 200 times slower when recomputing the lflow, not to mention comparing with address-set I-P.
> 
> So, I would suggest that the patch 1 should add some logic to restrict the handling for combining expressions generated by negation (!=) only, and keep the current logic unchanged for regular non-negative matches. I think it is possible to add a field in the expr structure to indicate that information while parsing the != operator.
> 
> For the same reason, I'd rather drop patch 3, because the penalty of disabling I-P is far beyond the gains of reduced number of flows, unless there are other ways to efficiently combine expression, and we also need to prove at least in most cases the logic can end up with small enough number of flows that recomputing-every-time is nothing to worry about, which I am not even sure it is going to be the case. On the other hand, maybe it is better to leave the task of crushing large address sets to a central component or to CMS, and keep ovn-controller simple.

Yeah, all that makes sense.  I didn't do the proper math, but I'd expect
the complexity on average to be less than n^2, because it should really
be possible to combine many addresses in very large address sets.
However, I would agree that even a linear complexity is a problem without
address set I-P.

It would be great if you can share your test script, maybe I can play
around with it and see if we can try to have both: combining (partial?) and
the I-P somehow.  Dropping I-P entirely on expression merges or duplicate
removals seems a bit overly cautious at a first glance.  Might be tricky
to handle removals, I guess.  But worth investigating, IMO.

Best regards, Ilya Maximets.

> 
> [0] https://www.mail-archive.com/ovs-dev@openvswitch.org/msg61321.html <https://www.mail-archive.com/ovs-dev@openvswitch.org/msg61321.html>
> [1] https://github.com/ovn-org/ovn/commit/7c0b538762f68b21bad41ed46f779d9084b7a679 <https://github.com/ovn-org/ovn/commit/7c0b538762f68b21bad41ed46f779d9084b7a679>
> 
> Thanks,
> Han
Ilya Maximets March 24, 2023, 1:33 p.m. UTC | #3
On 3/21/23 13:24, Ilya Maximets wrote:
> On 3/20/23 23:31, Han Zhou wrote:
>>
>>
>> On Mon, Mar 20, 2023 at 3:37 AM Ilya Maximets <i.maximets@ovn.org <mailto:i.maximets@ovn.org>> wrote:
>>>
>>> This patch set covers removal of expressions which are subsets of other
>>> wider expressions and aggregation of a few granular expressions into
>>> wider expressions that cover all of them at once.  This allows to avoid
>>> flow explosion in case of negative matches and reduce the total number
>>> of flows required for address sets.  More details are in commit messages.
>>>
>>> Version 2:
>>>   * Became a patch set.
>>>   * Added tests and missing bitmap.h include.
>>>   * Code switched to work with bitwise maskable fields only (ORDINAL).
>>>   * Added a new patch to combine smaller expressions into wider ones.
>>>   * Added a patch to fix a crash uncovered with expression aggregation.
>>>
>>> Ilya Maximets (3):
>>>   expr: Remove supersets from OR expressions.
>>>   expr: Avoid crash if all sub-expressions crushed down to 'true'.
>>>   expr: Combine OR sub-expressions into supersets.
>>>
>>>  controller/lflow.c      |   5 +-
>>>  lib/expr.c              | 188 +++++++++++++------
>>>  tests/ovn-controller.at <http://ovn-controller.at> | 399 ++++++++++++++++++++--------------------
>>>  tests/ovn.at <http://ovn.at>            | 210 +++++++++++----------
>>>  4 files changed, 443 insertions(+), 359 deletions(-)
>>>
>>> --
>>> 2.39.2
>>>
>>
>> Thanks Ilya for working on this. The same problem was also reported and discussed briefly in the past: [0], which may be mentioned in reported-by as well.
> 
> That one was more about memory issue on the OVS side, but sure.
> 
>>
>> I reviewed and tested this series. It definitely works great for the flow explosion problem caused by negations in expressions. With 5 subnets in != {} form, which would have generated hundreds of thousands of flows without the patch, now ended up with almost nothing.
> 
> Nice!
> 
>>
>> However, I also see scale problems introduced by this change for more normal use cases. The loop that tries to combine expressions to supersets is O(n^2), n = size of an address set. In large scale environments, it is easy to have more than 10k IPs in an address set - such as when there is a network policy allowing access from all pods of a big tenant/application. I reused my scripts for testing my earlier address set I-P patch [1] to test the performance with this series. As mentioned in the commit message, the old result was:
>>
>> Before: ~400ms
>> After: 11-12ms
>>
>> Now with this series, it takes 70 seconds for the same test!
>> As we can see, even before address-set I-P, it took just 400ms. In a large scale environment, since pods come and go very frequently, even 400ms for each change would make ovn-controller too busy, and that's why we came up with address-set I-P, which made it O(1) and much faster. Now with this change, for each IP change it would take 70s, almost 200 times slower when recomputing the lflow, not to mention comparing with address-set I-P.
>>
>> So, I would suggest that the patch 1 should add some logic to restrict the handling for combining expressions generated by negation (!=) only, and keep the current logic unchanged for regular non-negative matches. I think it is possible to add a field in the expr structure to indicate that information while parsing the != operator.

Han, do you see the performance degradation with just the first
patch applied?

I mean, it shouldn't block the I-P, unless users are manually
adding supersets of the same IP match into the address set.
It's not that different from removing duplicates that we do today.

If that's the case, maybe the first two patches can be accepted
as is?  Patch #3 definitely needs more work though, I agree.

What do you think?

Best regards, Ilya Maximets.
Ilya Maximets March 24, 2023, 2:22 p.m. UTC | #4
On 3/24/23 14:33, Ilya Maximets wrote:
> On 3/21/23 13:24, Ilya Maximets wrote:
>> On 3/20/23 23:31, Han Zhou wrote:
>>>
>>>
>>> On Mon, Mar 20, 2023 at 3:37 AM Ilya Maximets <i.maximets@ovn.org <mailto:i.maximets@ovn.org>> wrote:
>>>>
>>>> This patch set covers removal of expressions which are subsets of other
>>>> wider expressions and aggregation of a few granular expressions into
>>>> wider expressions that cover all of them at once.  This allows to avoid
>>>> flow explosion in case of negative matches and reduce the total number
>>>> of flows required for address sets.  More details are in commit messages.
>>>>
>>>> Version 2:
>>>>   * Became a patch set.
>>>>   * Added tests and missing bitmap.h include.
>>>>   * Code switched to work with bitwise maskable fields only (ORDINAL).
>>>>   * Added a new patch to combine smaller expressions into wider ones.
>>>>   * Added a patch to fix a crash uncovered with expression aggregation.
>>>>
>>>> Ilya Maximets (3):
>>>>   expr: Remove supersets from OR expressions.
>>>>   expr: Avoid crash if all sub-expressions crushed down to 'true'.
>>>>   expr: Combine OR sub-expressions into supersets.
>>>>
>>>>  controller/lflow.c      |   5 +-
>>>>  lib/expr.c              | 188 +++++++++++++------
>>>>  tests/ovn-controller.at <http://ovn-controller.at> | 399 ++++++++++++++++++++--------------------
>>>>  tests/ovn.at <http://ovn.at>            | 210 +++++++++++----------
>>>>  4 files changed, 443 insertions(+), 359 deletions(-)
>>>>
>>>> --
>>>> 2.39.2
>>>>
>>>
>>> Thanks Ilya for working on this. The same problem was also reported and discussed briefly in the past: [0], which may be mentioned in reported-by as well.
>>
>> That one was more about memory issue on the OVS side, but sure.
>>
>>>
>>> I reviewed and tested this series. It definitely works great for the flow explosion problem caused by negations in expressions. With 5 subnets in != {} form, which would have generated hundreds of thousands of flows without the patch, now ended up with almost nothing.
>>
>> Nice!
>>
>>>
>>> However, I also see scale problems introduced by this change for more normal use cases. The loop that tries to combine expressions to supersets is O(n^2), n = size of an address set. In large scale environments, it is easy to have more than 10k IPs in an address set - such as when there is a network policy allowing access from all pods of a big tenant/application. I reused my scripts for testing my earlier address set I-P patch [1] to test the performance with this series. As mentioned in the commit message, the old result was:
>>>
>>> Before: ~400ms
>>> After: 11-12ms
>>>
>>> Now with this series, it takes 70 seconds for the same test!
>>> As we can see, even before address-set I-P, it took just 400ms. In a large scale environment, since pods come and go very frequently, even 400ms for each change would make ovn-controller too busy, and that's why we came up with address-set I-P, which made it O(1) and much faster. Now with this change, for each IP change it would take 70s, almost 200 times slower when recomputing the lflow, not to mention comparing with address-set I-P.
>>>
>>> So, I would suggest that the patch 1 should add some logic to restrict the handling for combining expressions generated by negation (!=) only, and keep the current logic unchanged for regular non-negative matches. I think it is possible to add a field in the expr structure to indicate that information while parsing the != operator.
> 
> Han, do you see the performance degradation with just the first
> patch applied?
> 
> I mean, it shouldn't block the I-P, unless users are manually
> adding supersets of the same IP match into the address set.
> It's not that different from removing duplicates that we do today.
> 
> If that's the case, maybe the first two patches can be accepted
> as is?  Patch #3 definitely needs more work though, I agree.
> 
> What do you think?

Nevermind. :)
Even if it doesn't affect I-P, it may affect full recompute time,
which is also not great.  I'll do some testing and restrict the
use to cross-product sets.

> 
> Best regards, Ilya Maximets.
Han Zhou March 24, 2023, 9:57 p.m. UTC | #5
On Fri, Mar 24, 2023 at 7:22 AM Ilya Maximets <i.maximets@ovn.org> wrote:
>
> On 3/24/23 14:33, Ilya Maximets wrote:
> > On 3/21/23 13:24, Ilya Maximets wrote:
> >> On 3/20/23 23:31, Han Zhou wrote:
> >>>
> >>>
> >>> On Mon, Mar 20, 2023 at 3:37 AM Ilya Maximets <i.maximets@ovn.org
<mailto:i.maximets@ovn.org>> wrote:
> >>>>
> >>>> This patch set covers removal of expressions which are subsets of
other
> >>>> wider expressions and aggregation of a few granular expressions into
> >>>> wider expressions that cover all of them at once.  This allows to
avoid
> >>>> flow explosion in case of negative matches and reduce the total
number
> >>>> of flows required for address sets.  More details are in commit
messages.
> >>>>
> >>>> Version 2:
> >>>>   * Became a patch set.
> >>>>   * Added tests and missing bitmap.h include.
> >>>>   * Code switched to work with bitwise maskable fields only
(ORDINAL).
> >>>>   * Added a new patch to combine smaller expressions into wider ones.
> >>>>   * Added a patch to fix a crash uncovered with expression
aggregation.
> >>>>
> >>>> Ilya Maximets (3):
> >>>>   expr: Remove supersets from OR expressions.
> >>>>   expr: Avoid crash if all sub-expressions crushed down to 'true'.
> >>>>   expr: Combine OR sub-expressions into supersets.
> >>>>
> >>>>  controller/lflow.c      |   5 +-
> >>>>  lib/expr.c              | 188 +++++++++++++------
> >>>>  tests/ovn-controller.at <http://ovn-controller.at> | 399
++++++++++++++++++++--------------------
> >>>>  tests/ovn.at <http://ovn.at>            | 210 +++++++++++----------
> >>>>  4 files changed, 443 insertions(+), 359 deletions(-)
> >>>>
> >>>> --
> >>>> 2.39.2
> >>>>
> >>>
> >>> Thanks Ilya for working on this. The same problem was also reported
and discussed briefly in the past: [0], which may be mentioned in
reported-by as well.
> >>
> >> That one was more about memory issue on the OVS side, but sure.
> >>
> >>>
> >>> I reviewed and tested this series. It definitely works great for the
flow explosion problem caused by negations in expressions. With 5 subnets
in != {} form, which would have generated hundreds of thousands of flows
without the patch, now ended up with almost nothing.
> >>
> >> Nice!
> >>
> >>>
> >>> However, I also see scale problems introduced by this change for more
normal use cases. The loop that tries to combine expressions to supersets
is O(n^2), n = size of an address set. In large scale environments, it is
easy to have more than 10k IPs in an address set - such as when there is a
network policy allowing access from all pods of a big tenant/application. I
reused my scripts for testing my earlier address set I-P patch [1] to test
the performance with this series. As mentioned in the commit message, the
old result was:
> >>>
> >>> Before: ~400ms
> >>> After: 11-12ms
> >>>
> >>> Now with this series, it takes 70 seconds for the same test!
> >>> As we can see, even before address-set I-P, it took just 400ms. In a
large scale environment, since pods come and go very frequently, even 400ms
for each change would make ovn-controller too busy, and that's why we came
up with address-set I-P, which made it O(1) and much faster. Now with this
change, for each IP change it would take 70s, almost 200 times slower when
recomputing the lflow, not to mention comparing with address-set I-P.
> >>>
> >>> So, I would suggest that the patch 1 should add some logic to
restrict the handling for combining expressions generated by negation (!=)
only, and keep the current logic unchanged for regular non-negative
matches. I think it is possible to add a field in the expr structure to
indicate that information while parsing the != operator.
> >
> > Han, do you see the performance degradation with just the first
> > patch applied?
> >
> > I mean, it shouldn't block the I-P, unless users are manually
> > adding supersets of the same IP match into the address set.
> > It's not that different from removing duplicates that we do today.
> >
> > If that's the case, maybe the first two patches can be accepted
> > as is?  Patch #3 definitely needs more work though, I agree.
> >
> > What do you think?
>
> Nevermind. :)
> Even if it doesn't affect I-P, it may affect full recompute time,
> which is also not great.  I'll do some testing and restrict the
> use to cross-product sets.
>
Right, and it doesn't only affect full recompute, but also affect lflow
level recompute, which may be triggered by e.g. creating a new ACL, or a
local port-group update for an existing ACL.

Here is my test script for your reference:
https://github.com/hzhou8/ovn-test-script

The example in the readme is what I used (skip step 5 because
ovn-controller is not the focus here) for this benchmark test. You may use
ovn-heater, or something that simply creates a big address set for the same
purpose.

Thanks,
Han

> >
> > Best regards, Ilya Maximets.
>
Han Zhou March 25, 2023, 10:41 p.m. UTC | #6
On Fri, Mar 24, 2023 at 2:57 PM Han Zhou <hzhou@ovn.org> wrote:
>
>
>
> On Fri, Mar 24, 2023 at 7:22 AM Ilya Maximets <i.maximets@ovn.org> wrote:
> >
> > On 3/24/23 14:33, Ilya Maximets wrote:
> > > On 3/21/23 13:24, Ilya Maximets wrote:
> > >> On 3/20/23 23:31, Han Zhou wrote:
> > >>>
> > >>>
> > >>> On Mon, Mar 20, 2023 at 3:37 AM Ilya Maximets <i.maximets@ovn.org
<mailto:i.maximets@ovn.org>> wrote:
> > >>>>
> > >>>> This patch set covers removal of expressions which are subsets of
other
> > >>>> wider expressions and aggregation of a few granular expressions
into
> > >>>> wider expressions that cover all of them at once.  This allows to
avoid
> > >>>> flow explosion in case of negative matches and reduce the total
number
> > >>>> of flows required for address sets.  More details are in commit
messages.
> > >>>>
> > >>>> Version 2:
> > >>>>   * Became a patch set.
> > >>>>   * Added tests and missing bitmap.h include.
> > >>>>   * Code switched to work with bitwise maskable fields only
(ORDINAL).
> > >>>>   * Added a new patch to combine smaller expressions into wider
ones.
> > >>>>   * Added a patch to fix a crash uncovered with expression
aggregation.
> > >>>>
> > >>>> Ilya Maximets (3):
> > >>>>   expr: Remove supersets from OR expressions.
> > >>>>   expr: Avoid crash if all sub-expressions crushed down to 'true'.
> > >>>>   expr: Combine OR sub-expressions into supersets.
> > >>>>
> > >>>>  controller/lflow.c      |   5 +-
> > >>>>  lib/expr.c              | 188 +++++++++++++------
> > >>>>  tests/ovn-controller.at <http://ovn-controller.at> | 399
++++++++++++++++++++--------------------
> > >>>>  tests/ovn.at <http://ovn.at>            | 210
+++++++++++----------
> > >>>>  4 files changed, 443 insertions(+), 359 deletions(-)
> > >>>>
> > >>>> --
> > >>>> 2.39.2
> > >>>>
> > >>>
> > >>> Thanks Ilya for working on this. The same problem was also reported
and discussed briefly in the past: [0], which may be mentioned in
reported-by as well.
> > >>
> > >> That one was more about memory issue on the OVS side, but sure.
> > >>
> > >>>
> > >>> I reviewed and tested this series. It definitely works great for
the flow explosion problem caused by negations in expressions. With 5
subnets in != {} form, which would have generated hundreds of thousands of
flows without the patch, now ended up with almost nothing.
> > >>
> > >> Nice!
> > >>
> > >>>
> > >>> However, I also see scale problems introduced by this change for
more normal use cases. The loop that tries to combine expressions to
supersets is O(n^2), n = size of an address set. In large scale
environments, it is easy to have more than 10k IPs in an address set - such
as when there is a network policy allowing access from all pods of a big
tenant/application. I reused my scripts for testing my earlier address set
I-P patch [1] to test the performance with this series. As mentioned in the
commit message, the old result was:
> > >>>
> > >>> Before: ~400ms
> > >>> After: 11-12ms
> > >>>
> > >>> Now with this series, it takes 70 seconds for the same test!
> > >>> As we can see, even before address-set I-P, it took just 400ms. In
a large scale environment, since pods come and go very frequently, even
400ms for each change would make ovn-controller too busy, and that's why we
came up with address-set I-P, which made it O(1) and much faster. Now with
this change, for each IP change it would take 70s, almost 200 times slower
when recomputing the lflow, not to mention comparing with address-set I-P.
> > >>>
> > >>> So, I would suggest that the patch 1 should add some logic to
restrict the handling for combining expressions generated by negation (!=)
only, and keep the current logic unchanged for regular non-negative
matches. I think it is possible to add a field in the expr structure to
indicate that information while parsing the != operator.
> > >
> > > Han, do you see the performance degradation with just the first
> > > patch applied?
> > >
> > > I mean, it shouldn't block the I-P, unless users are manually
> > > adding supersets of the same IP match into the address set.
> > > It's not that different from removing duplicates that we do today.
> > >
> > > If that's the case, maybe the first two patches can be accepted
> > > as is?  Patch #3 definitely needs more work though, I agree.
> > >
> > > What do you think?
> >
> > Nevermind. :)
> > Even if it doesn't affect I-P, it may affect full recompute time,
> > which is also not great.  I'll do some testing and restrict the
> > use to cross-product sets.
> >
> Right, and it doesn't only affect full recompute, but also affect lflow
level recompute, which may be triggered by e.g. creating a new ACL, or a
local port-group update for an existing ACL.
>
> Here is my test script for your reference:
> https://github.com/hzhou8/ovn-test-script
>
> The example in the readme is what I used (skip step 5 because
ovn-controller is not the focus here) for this benchmark test. You may use
ovn-heater, or something that simply creates a big address set for the same
purpose.

Sorry, please just ignore my statements about "skip step 5 because
ovn-controller is not the focus here" above. Apparently ovn-controller is
the focus here. Not sure what was going on in my brain :D

>
> Thanks,
> Han
>
> > >
> > > Best regards, Ilya Maximets.
> >
Ilya Maximets March 29, 2023, 4:12 p.m. UTC | #7
On 3/25/23 23:41, Han Zhou wrote:
> 
> 
> On Fri, Mar 24, 2023 at 2:57 PM Han Zhou <hzhou@ovn.org <mailto:hzhou@ovn.org>> wrote:
>>
>>
>>
>> On Fri, Mar 24, 2023 at 7:22 AM Ilya Maximets <i.maximets@ovn.org <mailto:i.maximets@ovn.org>> wrote:
>> >
>> > On 3/24/23 14:33, Ilya Maximets wrote:
>> > > On 3/21/23 13:24, Ilya Maximets wrote:
>> > >> On 3/20/23 23:31, Han Zhou wrote:
>> > >>>
>> > >>>
>> > >>> On Mon, Mar 20, 2023 at 3:37 AM Ilya Maximets <i.maximets@ovn.org <mailto:i.maximets@ovn.org> <mailto:i.maximets@ovn.org <mailto:i.maximets@ovn.org>>> wrote:
>> > >>>>
>> > >>>> This patch set covers removal of expressions which are subsets of other
>> > >>>> wider expressions and aggregation of a few granular expressions into
>> > >>>> wider expressions that cover all of them at once.  This allows to avoid
>> > >>>> flow explosion in case of negative matches and reduce the total number
>> > >>>> of flows required for address sets.  More details are in commit messages.
>> > >>>>
>> > >>>> Version 2:
>> > >>>>   * Became a patch set.
>> > >>>>   * Added tests and missing bitmap.h include.
>> > >>>>   * Code switched to work with bitwise maskable fields only (ORDINAL).
>> > >>>>   * Added a new patch to combine smaller expressions into wider ones.
>> > >>>>   * Added a patch to fix a crash uncovered with expression aggregation.
>> > >>>>
>> > >>>> Ilya Maximets (3):
>> > >>>>   expr: Remove supersets from OR expressions.
>> > >>>>   expr: Avoid crash if all sub-expressions crushed down to 'true'.
>> > >>>>   expr: Combine OR sub-expressions into supersets.
>> > >>>>
>> > >>>>  controller/lflow.c      |   5 +-
>> > >>>>  lib/expr.c              | 188 +++++++++++++------
>> > >>>>  tests/ovn-controller.at <http://ovn-controller.at> <http://ovn-controller.at <http://ovn-controller.at>> | 399 ++++++++++++++++++++--------------------
>> > >>>>  tests/ovn.at <http://ovn.at> <http://ovn.at <http://ovn.at>>            | 210 +++++++++++----------
>> > >>>>  4 files changed, 443 insertions(+), 359 deletions(-)
>> > >>>>
>> > >>>> --
>> > >>>> 2.39.2
>> > >>>>
>> > >>>
>> > >>> Thanks Ilya for working on this. The same problem was also reported and discussed briefly in the past: [0], which may be mentioned in reported-by as well.
>> > >>
>> > >> That one was more about memory issue on the OVS side, but sure.
>> > >>
>> > >>>
>> > >>> I reviewed and tested this series. It definitely works great for the flow explosion problem caused by negations in expressions. With 5 subnets in != {} form, which would have generated hundreds of thousands of flows without the patch, now ended up with almost nothing.
>> > >>
>> > >> Nice!
>> > >>
>> > >>>
>> > >>> However, I also see scale problems introduced by this change for more normal use cases. The loop that tries to combine expressions to supersets is O(n^2), n = size of an address set. In large scale environments, it is easy to have more than 10k IPs in an address set - such as when there is a network policy allowing access from all pods of a big tenant/application. I reused my scripts for testing my earlier address set I-P patch [1] to test the performance with this series. As mentioned in the commit message, the old result was:
>> > >>>
>> > >>> Before: ~400ms
>> > >>> After: 11-12ms
>> > >>>
>> > >>> Now with this series, it takes 70 seconds for the same test!
>> > >>> As we can see, even before address-set I-P, it took just 400ms. In a large scale environment, since pods come and go very frequently, even 400ms for each change would make ovn-controller too busy, and that's why we came up with address-set I-P, which made it O(1) and much faster. Now with this change, for each IP change it would take 70s, almost 200 times slower when recomputing the lflow, not to mention comparing with address-set I-P.
>> > >>>
>> > >>> So, I would suggest that the patch 1 should add some logic to restrict the handling for combining expressions generated by negation (!=) only, and keep the current logic unchanged for regular non-negative matches. I think it is possible to add a field in the expr structure to indicate that information while parsing the != operator.
>> > >
>> > > Han, do you see the performance degradation with just the first
>> > > patch applied?
>> > >
>> > > I mean, it shouldn't block the I-P, unless users are manually
>> > > adding supersets of the same IP match into the address set.
>> > > It's not that different from removing duplicates that we do today.
>> > >
>> > > If that's the case, maybe the first two patches can be accepted
>> > > as is?  Patch #3 definitely needs more work though, I agree.
>> > >
>> > > What do you think?
>> >
>> > Nevermind. :)
>> > Even if it doesn't affect I-P, it may affect full recompute time,
>> > which is also not great.  I'll do some testing and restrict the
>> > use to cross-product sets.
>> >
>> Right, and it doesn't only affect full recompute, but also affect lflow level recompute, which may be triggered by e.g. creating a new ACL, or a local port-group update for an existing ACL.
>>
>> Here is my test script for your reference:
>> https://github.com/hzhou8/ovn-test-script <https://github.com/hzhou8/ovn-test-script>
>>
>> The example in the readme is what I used (skip step 5 because ovn-controller is not the focus here) for this benchmark test. You may use ovn-heater, or something that simply creates a big address set for the same purpose.
> 
> Sorry, please just ignore my statements about "skip step 5 because ovn-controller is not the focus here" above. Apparently ovn-controller is the focus here. Not sure what was going on in my brain :D

Sure. :)

Nice script!  Thanks.  I posted a re-worked v3 now.  I took a slightly
different approach: instead of adding a flag, I re-worked the algorithm
in a way that we will not perform any extra work for normal sets that
do not actually contain any possible supersets, e.g. all are exact matches
or have the same number of bits in the mask.

Testing in my setup with your scripts doesn't show any noticeable difference
in performance for normal PG sets.

Algorithm in v3 should also be way faster for cases where we actually
need to look for supersets, compared to v2.

Best regards, Ilya Maximets.