[net] rds: Fix incorrect statistics counting

Message ID 20170906152950.17766-1-Haakon.Bugge@oracle.com
State Superseded, archived
Delegated to: David Miller
Series [net] rds: Fix incorrect statistics counting

Commit Message

Haakon Bugge Sept. 6, 2017, 3:29 p.m. UTC
In rds_send_xmit() there is logic to batch the sends. However, if
another thread has acquired the lock, it is considered a race and we
yield. The code incrementing the s_send_lock_queue_raced statistics
counter did not count this event correctly.

This commit removes a small race in determining the race and
increments the statistics counter correctly.

Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
---
 net/rds/send.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)
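
For readers skimming the thread, here is a simplified sketch of the pre-patch tail of rds_send_xmit(), reconstructed from the context and '-' lines of the diff at the bottom of this page (everything else in the function is elided). It illustrates the miscount the commit message describes: the counter is bumped on the branch that continues batching, i.e. when this thread was *not* beaten by another sender.

    if (ret == 0) {
            smp_mb();
            if ((test_bit(0, &conn->c_map_queued) ||
                 !list_empty(&cp->cp_send_queue)) &&
                send_gen == READ_ONCE(cp->cp_send_gen)) {
                    /* Reached only when the send queue still has work and
                     * no other sender has advanced cp_send_gen, i.e. when
                     * this thread is about to keep sending itself -- yet
                     * this is where the "raced" counter was incremented.
                     */
                    rds_stats_inc(s_send_lock_queue_raced);
                    if (batch_count < send_batch_count)
                            goto restart;
                    queue_delayed_work(rds_wq, &cp->cp_send_w, 1);
            }
    }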

Comments

Santosh Shilimkar Sept. 6, 2017, 3:58 p.m. UTC | #1
On 9/6/2017 8:29 AM, Håkon Bugge wrote:
> In rds_send_xmit() there is logic to batch the sends. However, if
> another thread has acquired the lock, it is considered a race and we
> yield. The code incrementing the s_send_lock_queue_raced statistics
> counter did not count this event correctly.
> 
> This commit removes a small race in determining the race and
> increments the statistics counter correctly.
> 
> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
> Reviewed-by: Knut Omang <knut.omang@oracle.com>
> ---
>   net/rds/send.c | 16 +++++++++++++---
>   1 file changed, 13 insertions(+), 3 deletions(-)
> 
Those counters are not really meant to be that accurate, so I am
not very keen on adding additional cycles and code to the send
path. Have you seen any real issue, or is this just an
observation? The s_send_lock_queue_raced counter is never used to
check for small increments, hence the question.

Regards,
Santosh
Haakon Bugge Sept. 6, 2017, 4:12 p.m. UTC | #2
> On 6 Sep 2017, at 17:58, Santosh Shilimkar <santosh.shilimkar@oracle.com> wrote:
> 
> On 9/6/2017 8:29 AM, Håkon Bugge wrote:
>> In rds_send_xmit() there is logic to batch the sends. However, if
>> another thread has acquired the lock, it is considered a race and we
>> yield. The code incrementing the s_send_lock_queue_raced statistics
>> counter did not count this event correctly.
>> This commit removes a small race in determining the race and
>> increments the statistics counter correctly.
>> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
>> Reviewed-by: Knut Omang <knut.omang@oracle.com>
>> ---
>>  net/rds/send.c | 16 +++++++++++++---
>>  1 file changed, 13 insertions(+), 3 deletions(-)
> Those counters are not really meant to be that accurate, so I am
> not very keen on adding additional cycles and code to the send
> path. Have you seen any real issue, or is this just an
> observation? The s_send_lock_queue_raced counter is never used to
> check for small increments, hence the question.

Hi Santosh,


Yes, I agree about the accuracy of s_send_lock_queue_raced. But the main point is that the existing code counts some share of the cases where it is _not_ raced.

So, in the critical path, my patch adds a single test_bit(), which hits the local CPU cache when there is no race. If there is a race, some other thread is in control anyway, so I would not expect the added cycles to make any noticeable difference.

I can send a v2 where the race tightening is removed if you like.


Thxs, Håkon
Santosh Shilimkar Sept. 6, 2017, 4:21 p.m. UTC | #3
On 9/6/2017 9:12 AM, Håkon Bugge wrote:
> 
[...]

> 
> Hi Santosh,
> 
> 
> Yes, I agree about the accuracy of s_send_lock_queue_raced. But the main point is that the existing code counts some share of the cases where it is _not_ raced.
> 
> So, in the critical path, my patch adds a single test_bit(), which hits the local CPU cache when there is no race. If there is a race, some other thread is in control anyway, so I would not expect the added cycles to make any noticeable difference.
>
The point is that these are cycles added for no good reason.

> I can send a v2 where the race tightening is removed if you like.
> 
Yes please.

Regards,
Santosh

Patch

diff --git a/net/rds/send.c b/net/rds/send.c
index 058a407..ecfe0b5 100644
--- a/net/rds/send.c
+++ b/net/rds/send.c
@@ -101,6 +101,11 @@  void rds_send_path_reset(struct rds_conn_path *cp)
 }
 EXPORT_SYMBOL_GPL(rds_send_path_reset);
 
+static bool someone_in_xmit(struct rds_conn_path *cp)
+{
+	return test_bit(RDS_IN_XMIT, &cp->cp_flags);
+}
+
 static int acquire_in_xmit(struct rds_conn_path *cp)
 {
 	return test_and_set_bit(RDS_IN_XMIT, &cp->cp_flags) == 0;
@@ -428,14 +433,19 @@  int rds_send_xmit(struct rds_conn_path *cp)
 	 * some work and we will skip our goto
 	 */
 	if (ret == 0) {
+		bool raced;
+
 		smp_mb();
+		raced = someone_in_xmit(cp) ||
+			send_gen != READ_ONCE(cp->cp_send_gen);
+
 		if ((test_bit(0, &conn->c_map_queued) ||
-		     !list_empty(&cp->cp_send_queue)) &&
-			send_gen == READ_ONCE(cp->cp_send_gen)) {
-			rds_stats_inc(s_send_lock_queue_raced);
+				!list_empty(&cp->cp_send_queue)) && !raced) {
 			if (batch_count < send_batch_count)
 				goto restart;
 			queue_delayed_work(rds_wq, &cp->cp_send_w, 1);
+		} else if (raced) {
+			rds_stats_inc(s_send_lock_queue_raced);
 		}
 	}
 out:
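
For comparison, the resulting logic once this hunk is applied looks roughly as follows (a sketch reconstructed by applying the hunk above, not copied from a tree; declarations and the rest of rds_send_xmit() are elided):

    if (ret == 0) {
            bool raced;

            smp_mb();
            raced = someone_in_xmit(cp) ||
                    send_gen != READ_ONCE(cp->cp_send_gen);

            if ((test_bit(0, &conn->c_map_queued) ||
                 !list_empty(&cp->cp_send_queue)) && !raced) {
                    /* Not raced: keep batching, or defer to the workqueue. */
                    if (batch_count < send_batch_count)
                            goto restart;
                    queue_delayed_work(rds_wq, &cp->cp_send_w, 1);
            } else if (raced) {
                    /* Another thread holds RDS_IN_XMIT or has advanced
                     * cp_send_gen: the event the counter is meant to record.
                     */
                    rds_stats_inc(s_send_lock_queue_raced);
            }
    }

Relative to the old code, the only extra work on the non-raced path is the someone_in_xmit() test_bit(), which is the cost discussed in the thread above.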