Message ID | 20210825073500.20621-2-anton.ivanov@cambridgegreys.com |
---|---|
State | Superseded |
Headers | show |
Series | [ovs-dev,1/2] northd: Resize the hash to correct parameters after build | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/github-robot-_ovn-kubernetes | fail | github build: failed |
On 25/08/2021 08:35, anton.ivanov@cambridgegreys.com wrote: > From: Anton Ivanov <anton.ivanov@cambridgegreys.com> > > When running with dp_groups and parallelization enabled we have > a possible worse case scenario where northd connects for the > first time to a pre-populated database. > If we do not size the hash to a sufficiently big size to > accommodate for this, we end up with severe lock contention > and very slow lookups. > If both dp_groups and parallelization are enabled we have > no choice, but to allocate for a worst case scenario on the > first run and adjust later. > > Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> > --- > northd/ovn-northd.c | 21 ++++++++++++++++++++- > 1 file changed, 20 insertions(+), 1 deletion(-) > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > index 14659c407..3a5ab1d9f 100644 > --- a/northd/ovn-northd.c > +++ b/northd/ovn-northd.c > @@ -13034,7 +13034,7 @@ ovn_sb_set_lflow_logical_dp_group( > sbrec_logical_flow_set_logical_dp_group(sbflow, dpg->dp_group); > } > > -static ssize_t max_seen_lflow_size = 128; > +static ssize_t max_seen_lflow_size = 0; > > /* Updates the Logical_Flow and Multicast_Group tables in the OVN_SB database, > * constructing their contents based on the OVN_NB database. */ > @@ -13048,6 +13048,25 @@ build_lflows(struct northd_context *ctx, struct hmap *datapaths, > { > struct hmap lflows; > > + if (!max_seen_lflow_size) { > + if (use_logical_dp_groups && use_parallel_build) { > + /* We are running for the first time. The user has > + * requested both dp_groups and parallelisation. We > + * may encounter a very large amount of flows on first > + * run and we have no way to guess the flow hash size. > + * We allocate for worst a case scenario on the first > + * run. This will be resized to a sane size later. */ > + max_seen_lflow_size = INT_MAX; This ends up being too big for most systems. First run has to be in single threaded mode. > + } else { > + /* If the build is not parallel, this will be resized > + * to a correct size. If it is parallel, but without > + * dp_groups, the sizing is irrelevant as the hash is > + * not used for lookups during build. We resize it to > + * a correct size after that. */ > + max_seen_lflow_size = 128; > + } > + } > + > fast_hmap_size_for(&lflows, max_seen_lflow_size); > if (use_parallel_build) { > update_hashrow_locks(&lflows, &lflow_locks); >
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index 14659c407..3a5ab1d9f 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -13034,7 +13034,7 @@ ovn_sb_set_lflow_logical_dp_group( sbrec_logical_flow_set_logical_dp_group(sbflow, dpg->dp_group); } -static ssize_t max_seen_lflow_size = 128; +static ssize_t max_seen_lflow_size = 0; /* Updates the Logical_Flow and Multicast_Group tables in the OVN_SB database, * constructing their contents based on the OVN_NB database. */ @@ -13048,6 +13048,25 @@ build_lflows(struct northd_context *ctx, struct hmap *datapaths, { struct hmap lflows; + if (!max_seen_lflow_size) { + if (use_logical_dp_groups && use_parallel_build) { + /* We are running for the first time. The user has + * requested both dp_groups and parallelisation. We + * may encounter a very large amount of flows on first + * run and we have no way to guess the flow hash size. + * We allocate for worst a case scenario on the first + * run. This will be resized to a sane size later. */ + max_seen_lflow_size = INT_MAX; + } else { + /* If the build is not parallel, this will be resized + * to a correct size. If it is parallel, but without + * dp_groups, the sizing is irrelevant as the hash is + * not used for lookups during build. We resize it to + * a correct size after that. */ + max_seen_lflow_size = 128; + } + } + fast_hmap_size_for(&lflows, max_seen_lflow_size); if (use_parallel_build) { update_hashrow_locks(&lflows, &lflow_locks);