diff mbox

[net] net: reduce RECURSION_LIMIT to 8

Message ID 56945B7A.6010601@stressinduktion.org
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Hannes Frederic Sowa Jan. 12, 2016, 1:48 a.m. UTC
On 12.01.2016 01:36, pravin shelar wrote:
> On Mon, Jan 11, 2016 at 4:24 AM, Hannes Frederic Sowa
> <hannes@stressinduktion.org> wrote:
>> On 11.01.2016 07:38, pravin shelar wrote:
>>>
>>> On Thu, Jan 7, 2016 at 9:40 AM, Hannes Frederic Sowa
>>> <hannes@stressinduktion.org> wrote:
>>>>
>>>> When RECURSION_LIMIT was first introduced, Eric proposed a limit of 3.
>>>> This limit was later raised to 10 by DaveM. Nowadays it is observed that
>>>> configuraion errors in openvswitch cause the STACK_END_MAGIC to be
>>>> overwritten shortly after 9 recursion.
>>>>
>>> Major user of stack space in OVS is sw_flow_key in
>>> ovs_vport_receive(). With recent features like IPv6 tunnel support we
>>> have increased the size of the flow-key which could have caused the
>>> stack overflow sooner.
>>> One way to avoid using stack in subsequent recursive call is to use
>>> per-cpu storage for the sw_flow_key object. This is already done for
>>> OVS recursive actions, so we can expand on that facility.
>>
>>
>> Hmmm. This already came up. I think the difficulty is that ovs_vport_receive
>> can be called from actions again with skb_cloned skb before the original's
>> skb callstack is actually finished. Data in the percpu area would be
>> overwritten while still being used. It would need some more logic IMHO.
>>
> You can have stack of flow-keys and allocate a flow-key for each recursive call.

Hmm, I came up with something like that but the other day I find it
unpleasant and think that the kmalloc might anyway end up in the fast
path:

processing
   *
@@ -439,9 +442,19 @@ u32 ovs_vport_find_upcall_portid(const struct vport 
*vport, struct sk_buff *skb)
  int ovs_vport_receive(struct vport *vport, struct sk_buff *skb,
  		      const struct ip_tunnel_info *tun_info)
  {
-	struct sw_flow_key key;
+	struct sw_flow_key *key;
  	int error;

+	if (__this_cpu_inc_return(ovs_recursion) == 1) {
+		key = raw_cpu_ptr(&pcpu_key);
+	} else {
+		key = kmalloc(sizeof(*key), GFP_ATOMIC);
+		if (!key) {
+			__this_cpu_dec(ovs_recursion);
+			return -ENOMEM;
+		}
+	}
+
  	OVS_CB(skb)->input_vport = vport;
  	OVS_CB(skb)->mru = 0;
  	if (unlikely(dev_net(skb->dev) != ovs_dp_get_net(vport->dp))) {
@@ -454,12 +467,15 @@ int ovs_vport_receive(struct vport *vport, struct 
sk_buff *skb,
  	}

  	/* Extract flow from 'skb' into 'key'. */
-	error = ovs_flow_key_extract(tun_info, skb, &key);
+	error = ovs_flow_key_extract(tun_info, skb, key);
  	if (unlikely(error)) {
  		kfree_skb(skb);
-		return error;
+		goto out;
  	}
-	ovs_dp_process_packet(skb, &key);
+	ovs_dp_process_packet(skb, key);
+out:
+	if (__this_cpu_dec_return(ovs_recursion) > 0)
+		kfree(key);
  	return error;
  }
  EXPORT_SYMBOL_GPL(ovs_vport_receive);


>> What are recursive actions in ovs? I couldn't find any use of pcpu data in
>> there? Thanks! :)
>>
> There are couple of recursive actions in OVS, e.g.
> OVS_ACTION_ATTR_RECIRC. But it is implemented by using per-cpu
> flow-key stack to avoid recursive function call.

Ahh, the deferred_action stuff. I understand. So the idea would be to
lift even the first entry point to deferred_action a like?

This sounds like it could work but I fear we should find a solution
for stable, as this seems like a bit more work.

Thanks,
Hannes

Comments

Pravin Shelar Jan. 12, 2016, 8:41 p.m. UTC | #1
On Mon, Jan 11, 2016 at 5:48 PM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> On 12.01.2016 01:36, pravin shelar wrote:
>>
>> On Mon, Jan 11, 2016 at 4:24 AM, Hannes Frederic Sowa
>> <hannes@stressinduktion.org> wrote:
>>>
>>> On 11.01.2016 07:38, pravin shelar wrote:
>>>>
>>>>
>>>> On Thu, Jan 7, 2016 at 9:40 AM, Hannes Frederic Sowa
>>>> <hannes@stressinduktion.org> wrote:
>>>>>
>>>>>
>>>>> When RECURSION_LIMIT was first introduced, Eric proposed a limit of 3.
>>>>> This limit was later raised to 10 by DaveM. Nowadays it is observed
>>>>> that
>>>>> configuraion errors in openvswitch cause the STACK_END_MAGIC to be
>>>>> overwritten shortly after 9 recursion.
>>>>>
>>>> Major user of stack space in OVS is sw_flow_key in
>>>> ovs_vport_receive(). With recent features like IPv6 tunnel support we
>>>> have increased the size of the flow-key which could have caused the
>>>> stack overflow sooner.
>>>> One way to avoid using stack in subsequent recursive call is to use
>>>> per-cpu storage for the sw_flow_key object. This is already done for
>>>> OVS recursive actions, so we can expand on that facility.
>>>
>>>
>>>
>>> Hmmm. This already came up. I think the difficulty is that
>>> ovs_vport_receive
>>> can be called from actions again with skb_cloned skb before the
>>> original's
>>> skb callstack is actually finished. Data in the percpu area would be
>>> overwritten while still being used. It would need some more logic IMHO.
>>>
>> You can have stack of flow-keys and allocate a flow-key for each recursive
>> call.
>
>
> Hmm, I came up with something like that but the other day I find it
> unpleasant and think that the kmalloc might anyway end up in the fast
> path:
>
We can avoid the kmalloc using per-cpu array of flow-key. Infact for
first four recursion can be done using stack to avoid any performance
penalty of accessing percpu variables.

>
>
>>> What are recursive actions in ovs? I couldn't find any use of pcpu data
>>> in
>>> there? Thanks! :)
>>>
>> There are couple of recursive actions in OVS, e.g.
>> OVS_ACTION_ATTR_RECIRC. But it is implemented by using per-cpu
>> flow-key stack to avoid recursive function call.
>
>
> Ahh, the deferred_action stuff. I understand. So the idea would be to
> lift even the first entry point to deferred_action a like?
>
> This sounds like it could work but I fear we should find a solution
> for stable, as this seems like a bit more work.
>

I am fine with OVS specific recursion limit as stable tree fix for this issue.

Thanks.
diff mbox

Patch

diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
index 31cbc8c5c7db82..af9c94c732ce56 100644
--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ -426,6 +426,9 @@  u32 ovs_vport_find_upcall_portid(const struct vport 
*vport, struct sk_buff *skb)
  	return ids->ids[ids_index];
  }

+static DEFINE_PER_CPU(struct sw_flow_key, pcpu_key);
+static DEFINE_PER_CPU(int, ovs_recursion);
+
  /**
   *	ovs_vport_receive - pass up received packet to the datapath for