diff mbox series

[ovs-dev,v1] dpif-netdev: Update meter timestamp when dp_netdev_run_meter been called.

Message ID SY4PR01MB84389B259744BC16CB7F11BDCD489@SY4PR01MB8438.ausprd01.prod.outlook.com
State Superseded
Headers show
Series [ovs-dev,v1] dpif-netdev: Update meter timestamp when dp_netdev_run_meter been called. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/intel-ovs-compilation success test: success

Commit Message

miter May 31, 2023, 3:41 p.m. UTC
From: Lin Huang <linhuang@ruijie.com.cn>

Currently, a meter's timestamp 'now' is set by 'pmd->ctx.now' which updated
by pmd_thread_ctx_time_update().

Before processing of the new packet batch:
- dpif_netdev_execute()
- dp_netdev_process_rxq_port()

There is a problem when user want to police the rate to a low pps by meter.
For example, When the traffic is more than 3Mpps, policing the traffic to
1M pps, the real rate will be 3Mpps not 1Mpps.

The key point is that a meter's timestamp isn't update in real time.
For example, PMD thread A and B are polled at the same time (t1).
Thread A starts to run meter to police traffic. At the same time, thread B
pause at 'ovs_mutex_lock(&meter->lock)' to wait thread A release the meter
lock. This elapsed a lot of time, especially we have more than 10 PMD threads.

After thread A release the meter lock, thread B start to count the time diff.
- long_delta_t = now - meter->used' --> long_delta_t = t1 - t1 = 0.
- band->bucket += (uint64_t) delta_t * band->rate --> band->bucket = 0.
- band_exceeded_pkt = band->bucket / 1000; --> band_exceeded_pkt = 0.

Fix this problem by update the meter timestamp every time.

Test-by: Zhang Yuhuang <zhangyuhuang@ruijie.com.cn>
Signed-off-by: Lin Huang <linhuang@ruijie.com.cn>
---
 lib/dpif-netdev.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

Comments

miter June 15, 2023, 2:08 p.m. UTC | #1
Hi/ilya,/

Could you please review my code?


On 5/31/2023 11:41 PM, miterv@outlook.com wrote:
> From: Lin Huang <linhuang@ruijie.com.cn>
>
> Currently, a meter's timestamp 'now' is set by 'pmd->ctx.now' which updated
> by pmd_thread_ctx_time_update().
>
> Before processing of the new packet batch:
> - dpif_netdev_execute()
> - dp_netdev_process_rxq_port()
>
> There is a problem when user want to police the rate to a low pps by meter.
> For example, When the traffic is more than 3Mpps, policing the traffic to
> 1M pps, the real rate will be 3Mpps not 1Mpps.
>
> The key point is that a meter's timestamp isn't update in real time.
> For example, PMD thread A and B are polled at the same time (t1).
> Thread A starts to run meter to police traffic. At the same time, thread B
> pause at 'ovs_mutex_lock(&meter->lock)' to wait thread A release the meter
> lock. This elapsed a lot of time, especially we have more than 10 PMD threads.
>
> After thread A release the meter lock, thread B start to count the time diff.
> - long_delta_t = now - meter->used' --> long_delta_t = t1 - t1 = 0.
> - band->bucket += (uint64_t) delta_t * band->rate --> band->bucket = 0.
> - band_exceeded_pkt = band->bucket / 1000; --> band_exceeded_pkt = 0.
>
> Fix this problem by update the meter timestamp every time.
>
> Test-by: Zhang Yuhuang <zhangyuhuang@ruijie.com.cn>
> Signed-off-by: Lin Huang <linhuang@ruijie.com.cn>
> ---
>   lib/dpif-netdev.c | 12 ++++++++----
>   1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 70b953ae6..dbb275cf8 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -7169,12 +7169,13 @@ dpif_netdev_meter_get_features(const struct dpif * dpif OVS_UNUSED,
>    * that exceed a band are dropped in-place. */
>   static void
>   dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_,
> -                    uint32_t meter_id, long long int now)
> +                    uint32_t meter_id)
>   {
>       struct dp_meter *meter;
>       struct dp_meter_band *band;
>       struct dp_packet *packet;
>       long long int long_delta_t; /* msec */
> +    long long int now;
>       uint32_t delta_t; /* msec */
>       const size_t cnt = dp_packet_batch_size(packets_);
>       uint32_t bytes, volume;
> @@ -7197,8 +7198,12 @@ dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_,
>       memset(exceeded_rate, 0, cnt * sizeof *exceeded_rate);
>   
>       ovs_mutex_lock(&meter->lock);
> +
> +    /* Update now */
> +    now = time_msec();
> +
>       /* All packets will hit the meter at the same time. */
> -    long_delta_t = now / 1000 - meter->used / 1000; /* msec */
> +    long_delta_t = now  - meter->used; /* msec */
>   
>       if (long_delta_t < 0) {
>           /* This condition means that we have several threads fighting for a
> @@ -9170,8 +9175,7 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
>       }
>   
>       case OVS_ACTION_ATTR_METER:
> -        dp_netdev_run_meter(pmd->dp, packets_, nl_attr_get_u32(a),
> -                            pmd->ctx.now);
> +        dp_netdev_run_meter(pmd->dp, packets_, nl_attr_get_u32(a));
>           break;
>   
>       case OVS_ACTION_ATTR_PUSH_VLAN:
Ilya Maximets June 21, 2023, 4:07 p.m. UTC | #2
On 5/31/23 17:41, miterv@outlook.com wrote:
> From: Lin Huang <linhuang@ruijie.com.cn>
> 
> Currently, a meter's timestamp 'now' is set by 'pmd->ctx.now' which updated
> by pmd_thread_ctx_time_update().
> 
> Before processing of the new packet batch:
> - dpif_netdev_execute()
> - dp_netdev_process_rxq_port()
> 
> There is a problem when user want to police the rate to a low pps by meter.
> For example, When the traffic is more than 3Mpps, policing the traffic to
> 1M pps, the real rate will be 3Mpps not 1Mpps.
> 
> The key point is that a meter's timestamp isn't update in real time.
> For example, PMD thread A and B are polled at the same time (t1).
> Thread A starts to run meter to police traffic. At the same time, thread B
> pause at 'ovs_mutex_lock(&meter->lock)' to wait thread A release the meter
> lock. This elapsed a lot of time, especially we have more than 10 PMD threads.

Shouldn't the infrequent time update cause more drops instead?

Best regards, Ilya Maximets.

> 
> After thread A release the meter lock, thread B start to count the time diff.
> - long_delta_t = now - meter->used' --> long_delta_t = t1 - t1 = 0.
> - band->bucket += (uint64_t) delta_t * band->rate --> band->bucket = 0.
> - band_exceeded_pkt = band->bucket / 1000; --> band_exceeded_pkt = 0.
> 
> Fix this problem by update the meter timestamp every time.
> 
> Test-by: Zhang Yuhuang <zhangyuhuang@ruijie.com.cn>
> Signed-off-by: Lin Huang <linhuang@ruijie.com.cn>
> ---
>  lib/dpif-netdev.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 70b953ae6..dbb275cf8 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -7169,12 +7169,13 @@ dpif_netdev_meter_get_features(const struct dpif * dpif OVS_UNUSED,
>   * that exceed a band are dropped in-place. */
>  static void
>  dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_,
> -                    uint32_t meter_id, long long int now)
> +                    uint32_t meter_id)
>  {
>      struct dp_meter *meter;
>      struct dp_meter_band *band;
>      struct dp_packet *packet;
>      long long int long_delta_t; /* msec */
> +    long long int now;
>      uint32_t delta_t; /* msec */
>      const size_t cnt = dp_packet_batch_size(packets_);
>      uint32_t bytes, volume;
> @@ -7197,8 +7198,12 @@ dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_,
>      memset(exceeded_rate, 0, cnt * sizeof *exceeded_rate);
>  
>      ovs_mutex_lock(&meter->lock);
> +
> +    /* Update now */
> +    now = time_msec();
> +
>      /* All packets will hit the meter at the same time. */
> -    long_delta_t = now / 1000 - meter->used / 1000; /* msec */
> +    long_delta_t = now  - meter->used; /* msec */
>  
>      if (long_delta_t < 0) {
>          /* This condition means that we have several threads fighting for a
> @@ -9170,8 +9175,7 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
>      }
>  
>      case OVS_ACTION_ATTR_METER:
> -        dp_netdev_run_meter(pmd->dp, packets_, nl_attr_get_u32(a),
> -                            pmd->ctx.now);
> +        dp_netdev_run_meter(pmd->dp, packets_, nl_attr_get_u32(a));
>          break;
>  
>      case OVS_ACTION_ATTR_PUSH_VLAN:
Ilya Maximets June 21, 2023, 10:35 p.m. UTC | #3
On 6/21/23 18:07, Ilya Maximets wrote:
> On 5/31/23 17:41, miterv@outlook.com wrote:
>> From: Lin Huang <linhuang@ruijie.com.cn>
>>
>> Currently, a meter's timestamp 'now' is set by 'pmd->ctx.now' which updated
>> by pmd_thread_ctx_time_update().
>>
>> Before processing of the new packet batch:
>> - dpif_netdev_execute()
>> - dp_netdev_process_rxq_port()
>>
>> There is a problem when user want to police the rate to a low pps by meter.
>> For example, When the traffic is more than 3Mpps, policing the traffic to
>> 1M pps, the real rate will be 3Mpps not 1Mpps.
>>
>> The key point is that a meter's timestamp isn't update in real time.
>> For example, PMD thread A and B are polled at the same time (t1).
>> Thread A starts to run meter to police traffic. At the same time, thread B
>> pause at 'ovs_mutex_lock(&meter->lock)' to wait thread A release the meter
>> lock. This elapsed a lot of time, especially we have more than 10 PMD threads.
> 
> Shouldn't the infrequent time update cause more drops instead?

Also, having 10+ threads waiting on the same lock doesn't sound like
a particularly efficient setup.  If the locking here indeed a problem,
maybe you can try the following patch in your scenario:

 https://patchwork.ozlabs.org/project/openvswitch/patch/20230621223220.2073152-1-i.maximets@ovn.org/

?

> 
> Best regards, Ilya Maximets.
> 
>>
>> After thread A release the meter lock, thread B start to count the time diff.
>> - long_delta_t = now - meter->used' --> long_delta_t = t1 - t1 = 0.
>> - band->bucket += (uint64_t) delta_t * band->rate --> band->bucket = 0.
>> - band_exceeded_pkt = band->bucket / 1000; --> band_exceeded_pkt = 0.
>>
>> Fix this problem by update the meter timestamp every time.
>>
>> Test-by: Zhang Yuhuang <zhangyuhuang@ruijie.com.cn>
>> Signed-off-by: Lin Huang <linhuang@ruijie.com.cn>
>> ---
>>  lib/dpif-netdev.c | 12 ++++++++----
>>  1 file changed, 8 insertions(+), 4 deletions(-)
>>
>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>> index 70b953ae6..dbb275cf8 100644
>> --- a/lib/dpif-netdev.c
>> +++ b/lib/dpif-netdev.c
>> @@ -7169,12 +7169,13 @@ dpif_netdev_meter_get_features(const struct dpif * dpif OVS_UNUSED,
>>   * that exceed a band are dropped in-place. */
>>  static void
>>  dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_,
>> -                    uint32_t meter_id, long long int now)
>> +                    uint32_t meter_id)
>>  {
>>      struct dp_meter *meter;
>>      struct dp_meter_band *band;
>>      struct dp_packet *packet;
>>      long long int long_delta_t; /* msec */
>> +    long long int now;
>>      uint32_t delta_t; /* msec */
>>      const size_t cnt = dp_packet_batch_size(packets_);
>>      uint32_t bytes, volume;
>> @@ -7197,8 +7198,12 @@ dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_,
>>      memset(exceeded_rate, 0, cnt * sizeof *exceeded_rate);
>>  
>>      ovs_mutex_lock(&meter->lock);
>> +
>> +    /* Update now */
>> +    now = time_msec();
>> +
>>      /* All packets will hit the meter at the same time. */
>> -    long_delta_t = now / 1000 - meter->used / 1000; /* msec */
>> +    long_delta_t = now  - meter->used; /* msec */
>>  
>>      if (long_delta_t < 0) {
>>          /* This condition means that we have several threads fighting for a
>> @@ -9170,8 +9175,7 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
>>      }
>>  
>>      case OVS_ACTION_ATTR_METER:
>> -        dp_netdev_run_meter(pmd->dp, packets_, nl_attr_get_u32(a),
>> -                            pmd->ctx.now);
>> +        dp_netdev_run_meter(pmd->dp, packets_, nl_attr_get_u32(a));
>>          break;
>>  
>>      case OVS_ACTION_ATTR_PUSH_VLAN:
>
miter June 22, 2023, 10:40 a.m. UTC | #4
Hi Ilya,

Thanks for reviewing our code.

We have 20+ threads indeed, so taking the same meter lock makes 'now' 
not updated timely.

Using lockless meter to police traffic will be a perfect solution. :)

Thanks a lot.


On 6/22/2023 6:35 AM, Ilya Maximets wrote:
> On 6/21/23 18:07, Ilya Maximets wrote:
>> On 5/31/23 17:41,miterv@outlook.com  wrote:
>>> From: Lin Huang<linhuang@ruijie.com.cn>
>>>
>>> Currently, a meter's timestamp 'now' is set by 'pmd->ctx.now' which updated
>>> by pmd_thread_ctx_time_update().
>>>
>>> Before processing of the new packet batch:
>>> - dpif_netdev_execute()
>>> - dp_netdev_process_rxq_port()
>>>
>>> There is a problem when user want to police the rate to a low pps by meter.
>>> For example, When the traffic is more than 3Mpps, policing the traffic to
>>> 1M pps, the real rate will be 3Mpps not 1Mpps.
>>>
>>> The key point is that a meter's timestamp isn't update in real time.
>>> For example, PMD thread A and B are polled at the same time (t1).
>>> Thread A starts to run meter to police traffic. At the same time, thread B
>>> pause at 'ovs_mutex_lock(&meter->lock)' to wait thread A release the meter
>>> lock. This elapsed a lot of time, especially we have more than 10 PMD threads.
>> Shouldn't the infrequent time update cause more drops instead?
> Also, having 10+ threads waiting on the same lock doesn't sound like
> a particularly efficient setup.  If the locking here indeed a problem,
> maybe you can try the following patch in your scenario:
>
>   https://patchwork.ozlabs.org/project/openvswitch/patch/20230621223220.2073152-1-i.maximets@ovn.org/
>
> ?
>
>> Best regards, Ilya Maximets.
>>
>>> After thread A release the meter lock, thread B start to count the time diff.
>>> - long_delta_t = now - meter->used' --> long_delta_t = t1 - t1 = 0.
>>> - band->bucket += (uint64_t) delta_t * band->rate --> band->bucket = 0.
>>> - band_exceeded_pkt = band->bucket / 1000; --> band_exceeded_pkt = 0.
>>>
>>> Fix this problem by update the meter timestamp every time.
>>>
>>> Test-by: Zhang Yuhuang<zhangyuhuang@ruijie.com.cn>
>>> Signed-off-by: Lin Huang<linhuang@ruijie.com.cn>
>>> ---
>>>   lib/dpif-netdev.c | 12 ++++++++----
>>>   1 file changed, 8 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>>> index 70b953ae6..dbb275cf8 100644
>>> --- a/lib/dpif-netdev.c
>>> +++ b/lib/dpif-netdev.c
>>> @@ -7169,12 +7169,13 @@ dpif_netdev_meter_get_features(const struct dpif * dpif OVS_UNUSED,
>>>    * that exceed a band are dropped in-place. */
>>>   static void
>>>   dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_,
>>> -                    uint32_t meter_id, long long int now)
>>> +                    uint32_t meter_id)
>>>   {
>>>       struct dp_meter *meter;
>>>       struct dp_meter_band *band;
>>>       struct dp_packet *packet;
>>>       long long int long_delta_t; /* msec */
>>> +    long long int now;
>>>       uint32_t delta_t; /* msec */
>>>       const size_t cnt = dp_packet_batch_size(packets_);
>>>       uint32_t bytes, volume;
>>> @@ -7197,8 +7198,12 @@ dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_,
>>>       memset(exceeded_rate, 0, cnt * sizeof *exceeded_rate);
>>>   
>>>       ovs_mutex_lock(&meter->lock);
>>> +
>>> +    /* Update now */
>>> +    now = time_msec();
>>> +
>>>       /* All packets will hit the meter at the same time. */
>>> -    long_delta_t = now / 1000 - meter->used / 1000; /* msec */
>>> +    long_delta_t = now  - meter->used; /* msec */
>>>   
>>>       if (long_delta_t < 0) {
>>>           /* This condition means that we have several threads fighting for a
>>> @@ -9170,8 +9175,7 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
>>>       }
>>>   
>>>       case OVS_ACTION_ATTR_METER:
>>> -        dp_netdev_run_meter(pmd->dp, packets_, nl_attr_get_u32(a),
>>> -                            pmd->ctx.now);
>>> +        dp_netdev_run_meter(pmd->dp, packets_, nl_attr_get_u32(a));
>>>           break;
>>>   
>>>       case OVS_ACTION_ATTR_PUSH_VLAN:
diff mbox series

Patch

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 70b953ae6..dbb275cf8 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -7169,12 +7169,13 @@  dpif_netdev_meter_get_features(const struct dpif * dpif OVS_UNUSED,
  * that exceed a band are dropped in-place. */
 static void
 dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_,
-                    uint32_t meter_id, long long int now)
+                    uint32_t meter_id)
 {
     struct dp_meter *meter;
     struct dp_meter_band *band;
     struct dp_packet *packet;
     long long int long_delta_t; /* msec */
+    long long int now;
     uint32_t delta_t; /* msec */
     const size_t cnt = dp_packet_batch_size(packets_);
     uint32_t bytes, volume;
@@ -7197,8 +7198,12 @@  dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_,
     memset(exceeded_rate, 0, cnt * sizeof *exceeded_rate);
 
     ovs_mutex_lock(&meter->lock);
+
+    /* Update now */
+    now = time_msec();
+
     /* All packets will hit the meter at the same time. */
-    long_delta_t = now / 1000 - meter->used / 1000; /* msec */
+    long_delta_t = now  - meter->used; /* msec */
 
     if (long_delta_t < 0) {
         /* This condition means that we have several threads fighting for a
@@ -9170,8 +9175,7 @@  dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
     }
 
     case OVS_ACTION_ATTR_METER:
-        dp_netdev_run_meter(pmd->dp, packets_, nl_attr_get_u32(a),
-                            pmd->ctx.now);
+        dp_netdev_run_meter(pmd->dp, packets_, nl_attr_get_u32(a));
         break;
 
     case OVS_ACTION_ATTR_PUSH_VLAN: