diff mbox

[BUGFIX,3/3] memcg/tcp: ignore tcp usage before accounting started

Message ID 4F740AEF.7090900@jp.fujitsu.com
State Rejected, archived
Delegated to: David Miller
Headers show

Commit Message

KAMEZAWA Hiroyuki March 29, 2012, 7:10 a.m. UTC
tcp memcontrol starts accouting after res->limit is set. So, if a sockets
starts before setting res->limit, there are already used resource.
After setting res->limit, the resource (already used) will be uncharged and
make res_counter below 0 because they are not charged. This causes warning.

This patch fixes that by adding res_counter_uncharge_nowarn().
(*) We cannot avoid this while we have 'account start' switch.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 include/linux/res_counter.h |    2 ++
 include/net/sock.h          |    3 ++-
 kernel/res_counter.c        |   18 ++++++++++++++++++
 3 files changed, 22 insertions(+), 1 deletions(-)

Comments

Glauber Costa March 29, 2012, 9:21 a.m. UTC | #1
On 03/29/2012 09:10 AM, KAMEZAWA Hiroyuki wrote:
> tcp memcontrol starts accouting after res->limit is set. So, if a sockets
> starts before setting res->limit, there are already used resource.
> After setting res->limit, the resource (already used) will be uncharged and
> make res_counter below 0 because they are not charged. This causes warning.
> 
> This patch fixes that by adding res_counter_uncharge_nowarn().
> (*) We cannot avoid this while we have 'account start' switch.
> 
> Signed-off-by: KAMEZAWA Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>

Fine by me.

Acked-by: Glauber Costa <glommer@parallels.com>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller April 2, 2012, 3:41 a.m. UTC | #2
From: Glauber Costa <glommer@parallels.com>
Date: Thu, 29 Mar 2012 11:21:07 +0200

> On 03/29/2012 09:10 AM, KAMEZAWA Hiroyuki wrote:
>> tcp memcontrol starts accouting after res->limit is set. So, if a sockets
>> starts before setting res->limit, there are already used resource.
>> After setting res->limit, the resource (already used) will be uncharged and
>> make res_counter below 0 because they are not charged. This causes warning.
>> 
>> This patch fixes that by adding res_counter_uncharge_nowarn().
>> (*) We cannot avoid this while we have 'account start' switch.
>> 
>> Signed-off-by: KAMEZAWA Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>
> 
> Fine by me.
> 
> Acked-by: Glauber Costa <glommer@parallels.com>

I'm not applying patches that simply ignore accounting counter
underflows.

You must either:

1) Integrate the socket's existing usage when the limit is set.

2) Avoid accounting completely for a socket that started before
   the limit was set.

No half-way solutions, please.  Otherwise it is impossible to design
validations of the resource usage for a particular socket or group of
sockets, because they can always be potentially "wrong" and over the
limit.  That's a design for a buggy system.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Glauber Costa April 3, 2012, 10:31 p.m. UTC | #3
On 04/02/2012 07:41 AM, David Miller wrote:
> From: Glauber Costa<glommer@parallels.com>
> Date: Thu, 29 Mar 2012 11:21:07 +0200
>
>> On 03/29/2012 09:10 AM, KAMEZAWA Hiroyuki wrote:
>>> tcp memcontrol starts accouting after res->limit is set. So, if a sockets
>>> starts before setting res->limit, there are already used resource.
>>> After setting res->limit, the resource (already used) will be uncharged and
>>> make res_counter below 0 because they are not charged. This causes warning.
>>>
>>> This patch fixes that by adding res_counter_uncharge_nowarn().
>>> (*) We cannot avoid this while we have 'account start' switch.
>>>
>>> Signed-off-by: KAMEZAWA Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>
>>
>> Fine by me.
>>
>> Acked-by: Glauber Costa<glommer@parallels.com>
>
> I'm not applying patches that simply ignore accounting counter
> underflows.
>
> You must either:
>
> 1) Integrate the socket's existing usage when the limit is set.
>
> 2) Avoid accounting completely for a socket that started before
>     the limit was set.
>
> No half-way solutions, please.  Otherwise it is impossible to design
> validations of the resource usage for a particular socket or group of
> sockets, because they can always be potentially "wrong" and over the
> limit.  That's a design for a buggy system.
>
>
Kame,

I agree with Dave FWIW.

We should be able to do this by dropping the reference count when the 
cgroup is finally destroyed, instead of from the remove callback. At 
that point, no more pending sockets should be attached to it.

Prior to increasing the static key, they are all assigned to the global 
cgroup, so we shouldn't care about them.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
KAMEZAWA Hiroyuki April 9, 2012, 12:58 a.m. UTC | #4
(2012/04/04 7:31), Glauber Costa wrote:

> On 04/02/2012 07:41 AM, David Miller wrote:
>> From: Glauber Costa<glommer@parallels.com>
>> Date: Thu, 29 Mar 2012 11:21:07 +0200
>>
>>> On 03/29/2012 09:10 AM, KAMEZAWA Hiroyuki wrote:
>>>> tcp memcontrol starts accouting after res->limit is set. So, if a sockets
>>>> starts before setting res->limit, there are already used resource.
>>>> After setting res->limit, the resource (already used) will be uncharged and
>>>> make res_counter below 0 because they are not charged. This causes warning.
>>>>
>>>> This patch fixes that by adding res_counter_uncharge_nowarn().
>>>> (*) We cannot avoid this while we have 'account start' switch.
>>>>
>>>> Signed-off-by: KAMEZAWA Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>
>>>
>>> Fine by me.
>>>
>>> Acked-by: Glauber Costa<glommer@parallels.com>
>>
>> I'm not applying patches that simply ignore accounting counter
>> underflows.
>>
>> You must either:
>>
>> 1) Integrate the socket's existing usage when the limit is set.
>>
>> 2) Avoid accounting completely for a socket that started before
>>     the limit was set.
>>
>> No half-way solutions, please.  Otherwise it is impossible to design
>> validations of the resource usage for a particular socket or group of
>> sockets, because they can always be potentially "wrong" and over the
>> limit.  That's a design for a buggy system.
>>
>>
> Kame,
> 
> I agree with Dave FWIW.
> 
> We should be able to do this by dropping the reference count when the 
> cgroup is finally destroyed, instead of from the remove callback. At 
> that point, no more pending sockets should be attached to it.
> 
> Prior to increasing the static key, they are all assigned to the global 
> cgroup, so we shouldn't care about them.
> 

Could you do the fix ?

Thanks,
-Kame


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Glauber Costa April 9, 2012, 1:44 a.m. UTC | #5
On 04/08/2012 09:58 PM, KAMEZAWA Hiroyuki wrote:
> (2012/04/04 7:31), Glauber Costa wrote:
>
>> On 04/02/2012 07:41 AM, David Miller wrote:
>>> From: Glauber Costa<glommer@parallels.com>
>>> Date: Thu, 29 Mar 2012 11:21:07 +0200
>>>
>>>> On 03/29/2012 09:10 AM, KAMEZAWA Hiroyuki wrote:
>>>>> tcp memcontrol starts accouting after res->limit is set. So, if a sockets
>>>>> starts before setting res->limit, there are already used resource.
>>>>> After setting res->limit, the resource (already used) will be uncharged and
>>>>> make res_counter below 0 because they are not charged. This causes warning.
>>>>>
>>>>> This patch fixes that by adding res_counter_uncharge_nowarn().
>>>>> (*) We cannot avoid this while we have 'account start' switch.
>>>>>
>>>>> Signed-off-by: KAMEZAWA Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>
>>>>
>>>> Fine by me.
>>>>
>>>> Acked-by: Glauber Costa<glommer@parallels.com>
>>>
>>> I'm not applying patches that simply ignore accounting counter
>>> underflows.
>>>
>>> You must either:
>>>
>>> 1) Integrate the socket's existing usage when the limit is set.
>>>
>>> 2) Avoid accounting completely for a socket that started before
>>>      the limit was set.
>>>
>>> No half-way solutions, please.  Otherwise it is impossible to design
>>> validations of the resource usage for a particular socket or group of
>>> sockets, because they can always be potentially "wrong" and over the
>>> limit.  That's a design for a buggy system.
>>>
>>>
>> Kame,
>>
>> I agree with Dave FWIW.
>>
>> We should be able to do this by dropping the reference count when the
>> cgroup is finally destroyed, instead of from the remove callback. At
>> that point, no more pending sockets should be attached to it.
>>
>> Prior to increasing the static key, they are all assigned to the global
>> cgroup, so we shouldn't care about them.
>>
>
> Could you do the fix ?
>
> Thanks,
> -Kame
>
>
I sent you a version already. Please just make sure it works for you

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/linux/res_counter.h b/include/linux/res_counter.h
index da81af0..e081948 100644
--- a/include/linux/res_counter.h
+++ b/include/linux/res_counter.h
@@ -134,6 +134,8 @@  int __must_check res_counter_charge_nofail(struct res_counter *counter,
 
 void res_counter_uncharge_locked(struct res_counter *counter, unsigned long val);
 void res_counter_uncharge(struct res_counter *counter, unsigned long val);
+void res_counter_uncharge_nowarn(struct res_counter *counter,
+				unsigned long val);
 
 /**
  * res_counter_margin - calculate chargeable space of a counter
diff --git a/include/net/sock.h b/include/net/sock.h
index a6ba1f8..a1b3f4802 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1048,7 +1048,8 @@  static inline void memcg_memory_allocated_add(struct cg_proto *prot,
 static inline void memcg_memory_allocated_sub(struct cg_proto *prot,
 					      unsigned long amt)
 {
-	res_counter_uncharge(prot->memory_allocated, amt << PAGE_SHIFT);
+	res_counter_uncharge_nowarn(prot->memory_allocated,
+				amt << PAGE_SHIFT);
 }
 
 static inline u64 memcg_memory_allocated_read(struct cg_proto *prot)
diff --git a/kernel/res_counter.c b/kernel/res_counter.c
index d508363..2bb01ac 100644
--- a/kernel/res_counter.c
+++ b/kernel/res_counter.c
@@ -113,6 +113,24 @@  void res_counter_uncharge(struct res_counter *counter, unsigned long val)
 	local_irq_restore(flags);
 }
 
+void res_counter_uncharge_nowarn(struct res_counter *counter,
+					unsigned long val)
+{
+	struct res_counter *c;
+	unsigned long flags;
+
+	local_irq_save(flags);
+
+	for (c = counter; c != NULL; c = c->parent) {
+		spin_lock(&c->lock);
+		if (c->usage < val)
+			val = c->usage;
+		res_counter_uncharge_locked(c, val);
+		spin_unlock(&c->lock);
+	}
+	local_irq_restore(flags);
+}
+
 
 static inline unsigned long long *
 res_counter_member(struct res_counter *counter, int member)