Message ID | SY4PR01MB84389B259744BC16CB7F11BDCD489@SY4PR01MB8438.ausprd01.prod.outlook.com |
---|---|
State | Superseded |
Headers | show |
Series | [ovs-dev,v1] dpif-netdev: Update meter timestamp when dp_netdev_run_meter been called. | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/intel-ovs-compilation | success | test: success |
Hi/ilya,/ Could you please review my code? On 5/31/2023 11:41 PM, miterv@outlook.com wrote: > From: Lin Huang <linhuang@ruijie.com.cn> > > Currently, a meter's timestamp 'now' is set by 'pmd->ctx.now' which updated > by pmd_thread_ctx_time_update(). > > Before processing of the new packet batch: > - dpif_netdev_execute() > - dp_netdev_process_rxq_port() > > There is a problem when user want to police the rate to a low pps by meter. > For example, When the traffic is more than 3Mpps, policing the traffic to > 1M pps, the real rate will be 3Mpps not 1Mpps. > > The key point is that a meter's timestamp isn't update in real time. > For example, PMD thread A and B are polled at the same time (t1). > Thread A starts to run meter to police traffic. At the same time, thread B > pause at 'ovs_mutex_lock(&meter->lock)' to wait thread A release the meter > lock. This elapsed a lot of time, especially we have more than 10 PMD threads. > > After thread A release the meter lock, thread B start to count the time diff. > - long_delta_t = now - meter->used' --> long_delta_t = t1 - t1 = 0. > - band->bucket += (uint64_t) delta_t * band->rate --> band->bucket = 0. > - band_exceeded_pkt = band->bucket / 1000; --> band_exceeded_pkt = 0. > > Fix this problem by update the meter timestamp every time. > > Test-by: Zhang Yuhuang <zhangyuhuang@ruijie.com.cn> > Signed-off-by: Lin Huang <linhuang@ruijie.com.cn> > --- > lib/dpif-netdev.c | 12 ++++++++---- > 1 file changed, 8 insertions(+), 4 deletions(-) > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > index 70b953ae6..dbb275cf8 100644 > --- a/lib/dpif-netdev.c > +++ b/lib/dpif-netdev.c > @@ -7169,12 +7169,13 @@ dpif_netdev_meter_get_features(const struct dpif * dpif OVS_UNUSED, > * that exceed a band are dropped in-place. */ > static void > dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_, > - uint32_t meter_id, long long int now) > + uint32_t meter_id) > { > struct dp_meter *meter; > struct dp_meter_band *band; > struct dp_packet *packet; > long long int long_delta_t; /* msec */ > + long long int now; > uint32_t delta_t; /* msec */ > const size_t cnt = dp_packet_batch_size(packets_); > uint32_t bytes, volume; > @@ -7197,8 +7198,12 @@ dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_, > memset(exceeded_rate, 0, cnt * sizeof *exceeded_rate); > > ovs_mutex_lock(&meter->lock); > + > + /* Update now */ > + now = time_msec(); > + > /* All packets will hit the meter at the same time. */ > - long_delta_t = now / 1000 - meter->used / 1000; /* msec */ > + long_delta_t = now - meter->used; /* msec */ > > if (long_delta_t < 0) { > /* This condition means that we have several threads fighting for a > @@ -9170,8 +9175,7 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_, > } > > case OVS_ACTION_ATTR_METER: > - dp_netdev_run_meter(pmd->dp, packets_, nl_attr_get_u32(a), > - pmd->ctx.now); > + dp_netdev_run_meter(pmd->dp, packets_, nl_attr_get_u32(a)); > break; > > case OVS_ACTION_ATTR_PUSH_VLAN:
On 5/31/23 17:41, miterv@outlook.com wrote: > From: Lin Huang <linhuang@ruijie.com.cn> > > Currently, a meter's timestamp 'now' is set by 'pmd->ctx.now' which updated > by pmd_thread_ctx_time_update(). > > Before processing of the new packet batch: > - dpif_netdev_execute() > - dp_netdev_process_rxq_port() > > There is a problem when user want to police the rate to a low pps by meter. > For example, When the traffic is more than 3Mpps, policing the traffic to > 1M pps, the real rate will be 3Mpps not 1Mpps. > > The key point is that a meter's timestamp isn't update in real time. > For example, PMD thread A and B are polled at the same time (t1). > Thread A starts to run meter to police traffic. At the same time, thread B > pause at 'ovs_mutex_lock(&meter->lock)' to wait thread A release the meter > lock. This elapsed a lot of time, especially we have more than 10 PMD threads. Shouldn't the infrequent time update cause more drops instead? Best regards, Ilya Maximets. > > After thread A release the meter lock, thread B start to count the time diff. > - long_delta_t = now - meter->used' --> long_delta_t = t1 - t1 = 0. > - band->bucket += (uint64_t) delta_t * band->rate --> band->bucket = 0. > - band_exceeded_pkt = band->bucket / 1000; --> band_exceeded_pkt = 0. > > Fix this problem by update the meter timestamp every time. > > Test-by: Zhang Yuhuang <zhangyuhuang@ruijie.com.cn> > Signed-off-by: Lin Huang <linhuang@ruijie.com.cn> > --- > lib/dpif-netdev.c | 12 ++++++++---- > 1 file changed, 8 insertions(+), 4 deletions(-) > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > index 70b953ae6..dbb275cf8 100644 > --- a/lib/dpif-netdev.c > +++ b/lib/dpif-netdev.c > @@ -7169,12 +7169,13 @@ dpif_netdev_meter_get_features(const struct dpif * dpif OVS_UNUSED, > * that exceed a band are dropped in-place. */ > static void > dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_, > - uint32_t meter_id, long long int now) > + uint32_t meter_id) > { > struct dp_meter *meter; > struct dp_meter_band *band; > struct dp_packet *packet; > long long int long_delta_t; /* msec */ > + long long int now; > uint32_t delta_t; /* msec */ > const size_t cnt = dp_packet_batch_size(packets_); > uint32_t bytes, volume; > @@ -7197,8 +7198,12 @@ dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_, > memset(exceeded_rate, 0, cnt * sizeof *exceeded_rate); > > ovs_mutex_lock(&meter->lock); > + > + /* Update now */ > + now = time_msec(); > + > /* All packets will hit the meter at the same time. */ > - long_delta_t = now / 1000 - meter->used / 1000; /* msec */ > + long_delta_t = now - meter->used; /* msec */ > > if (long_delta_t < 0) { > /* This condition means that we have several threads fighting for a > @@ -9170,8 +9175,7 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_, > } > > case OVS_ACTION_ATTR_METER: > - dp_netdev_run_meter(pmd->dp, packets_, nl_attr_get_u32(a), > - pmd->ctx.now); > + dp_netdev_run_meter(pmd->dp, packets_, nl_attr_get_u32(a)); > break; > > case OVS_ACTION_ATTR_PUSH_VLAN:
On 6/21/23 18:07, Ilya Maximets wrote: > On 5/31/23 17:41, miterv@outlook.com wrote: >> From: Lin Huang <linhuang@ruijie.com.cn> >> >> Currently, a meter's timestamp 'now' is set by 'pmd->ctx.now' which updated >> by pmd_thread_ctx_time_update(). >> >> Before processing of the new packet batch: >> - dpif_netdev_execute() >> - dp_netdev_process_rxq_port() >> >> There is a problem when user want to police the rate to a low pps by meter. >> For example, When the traffic is more than 3Mpps, policing the traffic to >> 1M pps, the real rate will be 3Mpps not 1Mpps. >> >> The key point is that a meter's timestamp isn't update in real time. >> For example, PMD thread A and B are polled at the same time (t1). >> Thread A starts to run meter to police traffic. At the same time, thread B >> pause at 'ovs_mutex_lock(&meter->lock)' to wait thread A release the meter >> lock. This elapsed a lot of time, especially we have more than 10 PMD threads. > > Shouldn't the infrequent time update cause more drops instead? Also, having 10+ threads waiting on the same lock doesn't sound like a particularly efficient setup. If the locking here indeed a problem, maybe you can try the following patch in your scenario: https://patchwork.ozlabs.org/project/openvswitch/patch/20230621223220.2073152-1-i.maximets@ovn.org/ ? > > Best regards, Ilya Maximets. > >> >> After thread A release the meter lock, thread B start to count the time diff. >> - long_delta_t = now - meter->used' --> long_delta_t = t1 - t1 = 0. >> - band->bucket += (uint64_t) delta_t * band->rate --> band->bucket = 0. >> - band_exceeded_pkt = band->bucket / 1000; --> band_exceeded_pkt = 0. >> >> Fix this problem by update the meter timestamp every time. >> >> Test-by: Zhang Yuhuang <zhangyuhuang@ruijie.com.cn> >> Signed-off-by: Lin Huang <linhuang@ruijie.com.cn> >> --- >> lib/dpif-netdev.c | 12 ++++++++---- >> 1 file changed, 8 insertions(+), 4 deletions(-) >> >> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c >> index 70b953ae6..dbb275cf8 100644 >> --- a/lib/dpif-netdev.c >> +++ b/lib/dpif-netdev.c >> @@ -7169,12 +7169,13 @@ dpif_netdev_meter_get_features(const struct dpif * dpif OVS_UNUSED, >> * that exceed a band are dropped in-place. */ >> static void >> dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_, >> - uint32_t meter_id, long long int now) >> + uint32_t meter_id) >> { >> struct dp_meter *meter; >> struct dp_meter_band *band; >> struct dp_packet *packet; >> long long int long_delta_t; /* msec */ >> + long long int now; >> uint32_t delta_t; /* msec */ >> const size_t cnt = dp_packet_batch_size(packets_); >> uint32_t bytes, volume; >> @@ -7197,8 +7198,12 @@ dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_, >> memset(exceeded_rate, 0, cnt * sizeof *exceeded_rate); >> >> ovs_mutex_lock(&meter->lock); >> + >> + /* Update now */ >> + now = time_msec(); >> + >> /* All packets will hit the meter at the same time. */ >> - long_delta_t = now / 1000 - meter->used / 1000; /* msec */ >> + long_delta_t = now - meter->used; /* msec */ >> >> if (long_delta_t < 0) { >> /* This condition means that we have several threads fighting for a >> @@ -9170,8 +9175,7 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_, >> } >> >> case OVS_ACTION_ATTR_METER: >> - dp_netdev_run_meter(pmd->dp, packets_, nl_attr_get_u32(a), >> - pmd->ctx.now); >> + dp_netdev_run_meter(pmd->dp, packets_, nl_attr_get_u32(a)); >> break; >> >> case OVS_ACTION_ATTR_PUSH_VLAN: >
Hi Ilya, Thanks for reviewing our code. We have 20+ threads indeed, so taking the same meter lock makes 'now' not updated timely. Using lockless meter to police traffic will be a perfect solution. :) Thanks a lot. On 6/22/2023 6:35 AM, Ilya Maximets wrote: > On 6/21/23 18:07, Ilya Maximets wrote: >> On 5/31/23 17:41,miterv@outlook.com wrote: >>> From: Lin Huang<linhuang@ruijie.com.cn> >>> >>> Currently, a meter's timestamp 'now' is set by 'pmd->ctx.now' which updated >>> by pmd_thread_ctx_time_update(). >>> >>> Before processing of the new packet batch: >>> - dpif_netdev_execute() >>> - dp_netdev_process_rxq_port() >>> >>> There is a problem when user want to police the rate to a low pps by meter. >>> For example, When the traffic is more than 3Mpps, policing the traffic to >>> 1M pps, the real rate will be 3Mpps not 1Mpps. >>> >>> The key point is that a meter's timestamp isn't update in real time. >>> For example, PMD thread A and B are polled at the same time (t1). >>> Thread A starts to run meter to police traffic. At the same time, thread B >>> pause at 'ovs_mutex_lock(&meter->lock)' to wait thread A release the meter >>> lock. This elapsed a lot of time, especially we have more than 10 PMD threads. >> Shouldn't the infrequent time update cause more drops instead? > Also, having 10+ threads waiting on the same lock doesn't sound like > a particularly efficient setup. If the locking here indeed a problem, > maybe you can try the following patch in your scenario: > > https://patchwork.ozlabs.org/project/openvswitch/patch/20230621223220.2073152-1-i.maximets@ovn.org/ > > ? > >> Best regards, Ilya Maximets. >> >>> After thread A release the meter lock, thread B start to count the time diff. >>> - long_delta_t = now - meter->used' --> long_delta_t = t1 - t1 = 0. >>> - band->bucket += (uint64_t) delta_t * band->rate --> band->bucket = 0. >>> - band_exceeded_pkt = band->bucket / 1000; --> band_exceeded_pkt = 0. >>> >>> Fix this problem by update the meter timestamp every time. >>> >>> Test-by: Zhang Yuhuang<zhangyuhuang@ruijie.com.cn> >>> Signed-off-by: Lin Huang<linhuang@ruijie.com.cn> >>> --- >>> lib/dpif-netdev.c | 12 ++++++++---- >>> 1 file changed, 8 insertions(+), 4 deletions(-) >>> >>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c >>> index 70b953ae6..dbb275cf8 100644 >>> --- a/lib/dpif-netdev.c >>> +++ b/lib/dpif-netdev.c >>> @@ -7169,12 +7169,13 @@ dpif_netdev_meter_get_features(const struct dpif * dpif OVS_UNUSED, >>> * that exceed a band are dropped in-place. */ >>> static void >>> dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_, >>> - uint32_t meter_id, long long int now) >>> + uint32_t meter_id) >>> { >>> struct dp_meter *meter; >>> struct dp_meter_band *band; >>> struct dp_packet *packet; >>> long long int long_delta_t; /* msec */ >>> + long long int now; >>> uint32_t delta_t; /* msec */ >>> const size_t cnt = dp_packet_batch_size(packets_); >>> uint32_t bytes, volume; >>> @@ -7197,8 +7198,12 @@ dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_, >>> memset(exceeded_rate, 0, cnt * sizeof *exceeded_rate); >>> >>> ovs_mutex_lock(&meter->lock); >>> + >>> + /* Update now */ >>> + now = time_msec(); >>> + >>> /* All packets will hit the meter at the same time. */ >>> - long_delta_t = now / 1000 - meter->used / 1000; /* msec */ >>> + long_delta_t = now - meter->used; /* msec */ >>> >>> if (long_delta_t < 0) { >>> /* This condition means that we have several threads fighting for a >>> @@ -9170,8 +9175,7 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_, >>> } >>> >>> case OVS_ACTION_ATTR_METER: >>> - dp_netdev_run_meter(pmd->dp, packets_, nl_attr_get_u32(a), >>> - pmd->ctx.now); >>> + dp_netdev_run_meter(pmd->dp, packets_, nl_attr_get_u32(a)); >>> break; >>> >>> case OVS_ACTION_ATTR_PUSH_VLAN:
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 70b953ae6..dbb275cf8 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -7169,12 +7169,13 @@ dpif_netdev_meter_get_features(const struct dpif * dpif OVS_UNUSED, * that exceed a band are dropped in-place. */ static void dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_, - uint32_t meter_id, long long int now) + uint32_t meter_id) { struct dp_meter *meter; struct dp_meter_band *band; struct dp_packet *packet; long long int long_delta_t; /* msec */ + long long int now; uint32_t delta_t; /* msec */ const size_t cnt = dp_packet_batch_size(packets_); uint32_t bytes, volume; @@ -7197,8 +7198,12 @@ dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_, memset(exceeded_rate, 0, cnt * sizeof *exceeded_rate); ovs_mutex_lock(&meter->lock); + + /* Update now */ + now = time_msec(); + /* All packets will hit the meter at the same time. */ - long_delta_t = now / 1000 - meter->used / 1000; /* msec */ + long_delta_t = now - meter->used; /* msec */ if (long_delta_t < 0) { /* This condition means that we have several threads fighting for a @@ -9170,8 +9175,7 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_, } case OVS_ACTION_ATTR_METER: - dp_netdev_run_meter(pmd->dp, packets_, nl_attr_get_u32(a), - pmd->ctx.now); + dp_netdev_run_meter(pmd->dp, packets_, nl_attr_get_u32(a)); break; case OVS_ACTION_ATTR_PUSH_VLAN: