Message ID | 20240130212028.1482153-1-numans@ovn.org |
---|---|
Series | northd lflow incremental processing |
On 1/30/24 22:20, numans@ovn.org wrote:
> From: Numan Siddique <numans@ovn.org>
>

Hi Numan,

> [...]
> 35 files changed, 9681 insertions(+), 4645 deletions(-)

I had another look at this series and acked the remaining patches.  I
just had some minor comments that can be easily fixed when applying the
patches to the main branch.

Thanks for all the work on this!  It was a very large change but it
improves northd performance significantly.  I just hope we don't
introduce too many bugs.  Hopefully the time we have until release will
allow us to further test this change on the 24.03 branch.

Regards,
Dumitru

From: Numan Siddique <numans@ovn.org>

This patch series adds incremental processing in the lflow engine
node to handle changes to northd and other engine nodes.
Changes related to load balancers and NAT are mainly handled in
this patch series.

This patch series can also be found here - https://github.com/numansiddique/ovn/tree/northd_lbnatacl_lflow/v5

Prior to this patch series, most of the changes to northd engine
resulted in full recomputation of logical flows.  This series
aims to improve the performance of ovn-northd by adding incremental
processing (I-P) support.  In order to add this support, some of the
northd engine node data (from struct ovn_datapath) is split and moved
over to new engine nodes - mainly related to load balancers, NAT and ACLs.

Below are the scale testing results done with these patches applied
using ovn-heater.  The test ran the scenario -
ocp-500-density-heavy.yml [1].

With all the lflow I-P patches applied, the results are:

-----------------------------------------------------------------------------------------------------------------
                       Min (s)  Median (s)  90%ile (s)  99%ile (s)   Max (s)  Mean (s)   Total (s)  Count  Failed
-----------------------------------------------------------------------------------------------------------------
Iteration Total       0.136883    1.129016    1.192001    1.204167  1.212728  0.665017   83.127099    125       0
Namespace.add_ports   0.005216    0.005736    0.007034    0.015486  0.018978  0.006211    0.776373    125       0
WorkerNode.bind_port  0.035030    0.046082    0.052469    0.058293  0.060311  0.045973   11.493259    250       0
WorkerNode.ping_port  0.005057    0.006727    1.047692    1.069253  1.071336  0.266896   66.724094    250       0
-----------------------------------------------------------------------------------------------------------------

The results with the present main are:

-----------------------------------------------------------------------------------------------------------------
                       Min (s)  Median (s)  90%ile (s)  99%ile (s)   Max (s)  Mean (s)   Total (s)  Count  Failed
-----------------------------------------------------------------------------------------------------------------
Iteration Total       0.135491    2.223805    3.311270    3.339078  3.345346  1.729172  216.146495    125       0
Namespace.add_ports   0.005380    0.005744    0.006819    0.018773  0.020800  0.006292    0.786532    125       0
WorkerNode.bind_port  0.034179    0.046055    0.053488    0.058801  0.071043  0.046117   11.529311    250       0
WorkerNode.ping_port  0.004956    0.006952    3.086952    3.191743  3.192807  0.791544  197.886026    250       0
-----------------------------------------------------------------------------------------------------------------

Please see the link [2] which has a high level description of the
changes done in this patch series.

[1] - https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml
[2] - https://mail.openvswitch.org/pipermail/ovs-dev/2023-December/410053.html

v5 -> v6
--------
  * Applied the first 3 patches of v5 after addressing all the review
    comments (and with the Acks).

  * Rebased to latest main and resolved the conflicts.

  * Addressed almost all of the review comments received for v5 from
    Han and Dumitru.
      - Added detailed documentation on 'struct lflow_ref' and the life
        cycle of 'struct lflow_ref_node'.
      - Added documentation on the thread safety limitations when
        using 'struct lflow_ref'.

v4 -> v5
--------
  * Rebased to latest main and resolved the conflicts.

  * Addressed the review comments from Han in patch 15 (and in p8).
    Removed the assert if SB dp group is missing and handled it by
    returning false so that the lflow engine recomputes.  Added test
    cases to cover this scenario for both lflows (p8) and SB load
    balancers (p15).

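The v4 -> v5 change above (return false instead of asserting, so that the
engine recomputes) can be illustrated with a minimal sketch.  This is not
the code from the series: every type and function name below is
hypothetical, and the "engine" is reduced to a loop that tries a change
handler and falls back to a full recompute as soon as the handler reports
that it could not handle a change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not the OVN types. */
struct sketch_dp_group {
    int id;
};

struct sketch_lflow_change {
    const char *match;
    const struct sketch_dp_group *dpg;  /* NULL models a missing SB dp group. */
};

/* Counters so the effect of each path can be observed. */
static int sketch_n_incremental;
static int sketch_n_recompute;

/* Change handler: returns true if the change was handled incrementally,
 * false if it could not be and the caller has to recompute. */
static bool
sketch_handle_change(const struct sketch_lflow_change *change)
{
    if (!change->dpg) {
        /* An assertion here would abort the process.  Returning false
         * instead lets the caller fall back to a full recompute. */
        return false;
    }
    sketch_n_incremental++;
    return true;
}

/* Stand-in for the full recompute of the node. */
static void
sketch_recompute(void)
{
    sketch_n_recompute++;
}

/* Models one run of the node: try the handler for every tracked change
 * and recompute as soon as one of them returns false.  Returns true if
 * everything was handled incrementally. */
static bool
sketch_run_node(const struct sketch_lflow_change *changes, size_t n)
{
    assert(changes || !n);
    for (size_t i = 0; i < n; i++) {
        if (!sketch_handle_change(&changes[i])) {
            sketch_recompute();
            return false;
        }
    }
    return true;
}
```

Calling sketch_run_node() with changes that all carry a dp group takes only
the incremental path; a single change with a NULL dp group triggers exactly
one recompute and stops the incremental handling for that run.
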
v3 -> v4
--------
  * Addressed most of the review comments from Dumitru and Han.

  * Found a couple of bugs in v3 patch 9 -
    "northd: Refactor lflow management into a separate module."
    and addressed them in v4.
    In brief, if a logical flow L(M, A) is referenced by 2 lflow_ref's
    which belong to the same datapath, then the lflow was deleted as
    soon as one of the lflow_ref's was cleared due to any changes, even
    though the other one still referenced it.  It is addressed now by
    maintaining a reference count in the 'struct ovn_lflow' for each
    datapath it is used by.

  * Moved the v3 patch 14 ("northd: Add I-P for NB_Global and SB_Global.")
    to patch 16 in v4.  There were comments in this patch to not add a
    full I-P for NB_Global and SB_Global.  Made this patch the last
    in the series so that we can discuss further and not block other
    patches in case we want to drop this one.

v2 -> v3
--------
  * Addressed some of the review comments from Han and Dumitru.  There
    are still a few pending review comments which need to be addressed
    or discussed.

  * Renamed the engine node from "lr_lbnat_data" to "lr_stateful"
    (v3 patch 5).

  * Renamed the engine node from "ls_lbacls" to "ls_stateful" (v3 patch 8).

  * Removed v2 patch 2 from the series ("northd: Track ovn_datapaths in
    northd engine track data.").  This patch is now part of v3 patch 7
    (northd: Add a new node 'ls_stateful').

  * Squashed v2 patch 8 (northd: Don't commit dhcp response flows in
    the conntrack.) into v3 patch 7 (northd: Add a new node
    'ls_stateful'.)

v1 -> v2
--------
  * Now also maintaining array indexes for ls_lbacls, lr_nat and
    lr_lb_nat_data tables (similar to ovn_datapaths->array) to
    make the lookup efficient.  The same ovn_datapath->index
    is reused.

  * Made some significant changes to 'struct lflow_ref' in lflow-mgr.c.
    In v2 we don't use objdep_mgr to maintain the resource to lflow
    references.  Instead we maintain the 'struct lflow' pointer.
    With this we don't need to maintain additional hmap of lflows.

Numan Siddique (13):
  northd: Add a new engine 'lr_nat' to manage lr NAT data.
  northd: Add a new engine 'lr_stateful' to manage lr's stateful data.
  northd: Generate router's stateful flows using lr_stateful data.
  northd: Add a new node 'ls_stateful'.
  northd: Refactor lflow management into a separate module.
  northd: Use lflow_ref when adding all logical flows.
  northd: Move ovn_lb_datapaths from lib to northd module.
  northd: Handle lb changes in lflow engine.
  northd: Add lr_stateful handler for lflow engine node.
  northd: Add ls_stateful handler for lflow engine node.
  northd: Add a noop handler for northd SB mac binding.
  northd: Add northd change handler for sync_to_sb_lb node.
  northd: Add I-P for NB_Global and SB_Global.

 controller/automake.mk    |    2 +
 controller/lb.c           |  146 +
 controller/lb.h           |   55 +
 controller/lflow.c        |    1 +
 lib/lb.c                  |  771 +----
 lib/lb.h                  |  199 +-
 lib/ovn-util.c            |   26 +-
 lib/ovn-util.h            |    5 +-
 lib/stopwatch-names.h     |    5 +
 northd/aging.c            |   21 +-
 northd/automake.mk        |   14 +-
 northd/en-global-config.c |  576 ++++
 northd/en-global-config.h |   65 +
 northd/en-lb-data.c       |    1 +
 northd/en-lflow.c         |  109 +-
 northd/en-lflow.h         |    8 +
 northd/en-lr-nat.c        |  397 +++
 northd/en-lr-nat.h        |  135 +
 northd/en-lr-stateful.c   |  702 +++++
 northd/en-lr-stateful.h   |  153 +
 northd/en-ls-stateful.c   |  440 +++
 northd/en-ls-stateful.h   |  113 +
 northd/en-northd.c        |   58 +-
 northd/en-northd.h        |    2 +-
 northd/en-port-group.h    |    3 +
 northd/en-sync-sb.c       |  565 +++-
 northd/inc-proc-northd.c  |   74 +-
 northd/lb.c               |  654 +++++
 northd/lb.h               |  217 ++
 northd/lflow-mgr.c        | 1409 +++++++++
 northd/lflow-mgr.h        |  189 ++
 northd/northd.c           | 5840 ++++++++++++++++---------------------
 northd/northd.h           |  475 ++-
 northd/ovn-northd.c       |    9 +
 tests/ovn-northd.at       |  887 +++++-
 35 files changed, 9681 insertions(+), 4645 deletions(-)
 create mode 100644 controller/lb.c
 create mode 100644 controller/lb.h
 create mode 100644 northd/en-global-config.c
 create mode 100644 northd/en-global-config.h
 create mode 100644 northd/en-lr-nat.c
 create mode 100644 northd/en-lr-nat.h
 create mode 100644 northd/en-lr-stateful.c
 create mode 100644 northd/en-lr-stateful.h
 create mode 100644 northd/en-ls-stateful.c
 create mode 100644 northd/en-ls-stateful.h
 create mode 100644 northd/lb.c
 create mode 100644 northd/lb.h
 create mode 100644 northd/lflow-mgr.c
 create mode 100644 northd/lflow-mgr.h
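
For readers who want a concrete picture of the v3 -> v4 fix (a reference
count per datapath inside the logical flow), here is a minimal sketch.
It is an illustration under assumptions, not the code in lflow-mgr.c:
the struct, the fixed-size array and all function names are hypothetical,
and the datapath is identified by a small integer index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_DATAPATHS 8  /* Arbitrary bound chosen for the sketch. */

/* Hypothetical stand-in for a logical flow shared by several users. */
struct sketch_lflow {
    /* dp_refcnt[i] counts how many references use this flow for the
     * datapath with index i. */
    unsigned int dp_refcnt[SKETCH_MAX_DATAPATHS];
};

/* Records one more reference to the flow from datapath 'dp_index'. */
static void
sketch_lflow_ref(struct sketch_lflow *lflow, size_t dp_index)
{
    assert(dp_index < SKETCH_MAX_DATAPATHS);
    lflow->dp_refcnt[dp_index]++;
}

/* Drops one reference from datapath 'dp_index'.  Returns true only when
 * that datapath no longer uses the flow at all. */
static bool
sketch_lflow_unref(struct sketch_lflow *lflow, size_t dp_index)
{
    assert(dp_index < SKETCH_MAX_DATAPATHS);
    assert(lflow->dp_refcnt[dp_index] > 0);
    return --lflow->dp_refcnt[dp_index] == 0;
}

/* True when no datapath references the flow, so it is safe to delete. */
static bool
sketch_lflow_is_unused(const struct sketch_lflow *lflow)
{
    for (size_t i = 0; i < SKETCH_MAX_DATAPATHS; i++) {
        if (lflow->dp_refcnt[i]) {
            return false;
        }
    }
    return true;
}
```

With two references taken for the same datapath index, the first
sketch_lflow_unref() returns false and the flow stays in use; only the
second one returns true, which is the behaviour the v3 code was missing.
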