From patchwork Tue Jan 12 01:48:42 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hannes Frederic Sowa X-Patchwork-Id: 566242 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id A04621402DE for ; Tue, 12 Jan 2016 12:48:54 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=stressinduktion.org header.i=@stressinduktion.org header.b=MPrF3Yn2; dkim=pass (1024-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.b=TUtbVS4U; dkim-atps=neutral Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1762073AbcALBsu (ORCPT ); Mon, 11 Jan 2016 20:48:50 -0500 Received: from out4-smtp.messagingengine.com ([66.111.4.28]:55498 "EHLO out4-smtp.messagingengine.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1762053AbcALBss (ORCPT ); Mon, 11 Jan 2016 20:48:48 -0500 Received: from compute2.internal (compute2.nyi.internal [10.202.2.42]) by mailout.nyi.internal (Postfix) with ESMTP id 9A6A820FD0 for ; Mon, 11 Jan 2016 20:48:47 -0500 (EST) Received: from frontend2 ([10.202.2.161]) by compute2.internal (MEProxy); Mon, 11 Jan 2016 20:48:47 -0500 DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= stressinduktion.org; h=cc:content-transfer-encoding:content-type :date:from:in-reply-to:message-id:mime-version:references :subject:to:x-sasl-enc:x-sasl-enc; s=mesmtp; bh=SRnvVMwzO+sSocVm juDhkoK/IbA=; b=MPrF3Yn25q6VW5ZLvLV79VfL/lelBWSkwUJRjy//jyRTV5Ak syehnDqbHYPGqegHvafC9uw2QSne9ZYB+3DyrH4p3sotUN5HXq8tF3pmb1xFtnIc gJQGvSQZLgxMT3B4yGYtFlU+4LlgG70od0viGfgmUW4eX8CVtw1RnMahES8= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :date:from:in-reply-to:message-id:mime-version:references :subject:to:x-sasl-enc:x-sasl-enc; s=smtpout; bh=SRnvVMwzO+sSocV mjuDhkoK/IbA=; b=TUtbVS4UR2z9GXOzjQT6aqTZckmLs7LggK7s2DiBHr2viJ9 t1J8sYSz28tGZEwdVTEVSTicXvLMiWlPylKPy0lfH4uzl+tQdUkzMr4MLRc5z9HG Riib77S5ZH/KOXKEEkVEOt5UGPws9U92oetu8st5NkTjMjSMpraLJWAoWFBA= X-Sasl-enc: XBk5zDRYGhnkj9oYnv2IBoT4hanAEJtZfHIA4M/aCJGz 1452563327 Received: from z.localhost (unknown [213.55.184.153]) by mail.messagingengine.com (Postfix) with ESMTPA id 41FDD68012C; Mon, 11 Jan 2016 20:48:46 -0500 (EST) Subject: Re: [PATCH net] net: reduce RECURSION_LIMIT to 8 To: pravin shelar References: <1452188453-5523-1-git-send-email-hannes@stressinduktion.org> <56939EEF.70403@stressinduktion.org> From: Hannes Frederic Sowa Cc: Linux Kernel Network Developers , "David S. Miller" , Eric Dumazet Message-ID: <56945B7A.6010601@stressinduktion.org> Date: Tue, 12 Jan 2016 02:48:42 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0 MIME-Version: 1.0 In-Reply-To: Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org On 12.01.2016 01:36, pravin shelar wrote: > On Mon, Jan 11, 2016 at 4:24 AM, Hannes Frederic Sowa > wrote: >> On 11.01.2016 07:38, pravin shelar wrote: >>> >>> On Thu, Jan 7, 2016 at 9:40 AM, Hannes Frederic Sowa >>> wrote: >>>> >>>> When RECURSION_LIMIT was first introduced, Eric proposed a limit of 3. >>>> This limit was later raised to 10 by DaveM. Nowadays it is observed that >>>> configuraion errors in openvswitch cause the STACK_END_MAGIC to be >>>> overwritten shortly after 9 recursion. >>>> >>> Major user of stack space in OVS is sw_flow_key in >>> ovs_vport_receive(). With recent features like IPv6 tunnel support we >>> have increased the size of the flow-key which could have caused the >>> stack overflow sooner. >>> One way to avoid using stack in subsequent recursive call is to use >>> per-cpu storage for the sw_flow_key object. This is already done for >>> OVS recursive actions, so we can expand on that facility. >> >> >> Hmmm. This already came up. I think the difficulty is that ovs_vport_receive >> can be called from actions again with skb_cloned skb before the original's >> skb callstack is actually finished. Data in the percpu area would be >> overwritten while still being used. It would need some more logic IMHO. >> > You can have stack of flow-keys and allocate a flow-key for each recursive call. Hmm, I came up with something like that but the other day I find it unpleasant and think that the kmalloc might anyway end up in the fast path: processing * @@ -439,9 +442,19 @@ u32 ovs_vport_find_upcall_portid(const struct vport *vport, struct sk_buff *skb) int ovs_vport_receive(struct vport *vport, struct sk_buff *skb, const struct ip_tunnel_info *tun_info) { - struct sw_flow_key key; + struct sw_flow_key *key; int error; + if (__this_cpu_inc_return(ovs_recursion) == 1) { + key = raw_cpu_ptr(&pcpu_key); + } else { + key = kmalloc(sizeof(*key), GFP_ATOMIC); + if (!key) { + __this_cpu_dec(ovs_recursion); + return -ENOMEM; + } + } + OVS_CB(skb)->input_vport = vport; OVS_CB(skb)->mru = 0; if (unlikely(dev_net(skb->dev) != ovs_dp_get_net(vport->dp))) { @@ -454,12 +467,15 @@ int ovs_vport_receive(struct vport *vport, struct sk_buff *skb, } /* Extract flow from 'skb' into 'key'. */ - error = ovs_flow_key_extract(tun_info, skb, &key); + error = ovs_flow_key_extract(tun_info, skb, key); if (unlikely(error)) { kfree_skb(skb); - return error; + goto out; } - ovs_dp_process_packet(skb, &key); + ovs_dp_process_packet(skb, key); +out: + if (__this_cpu_dec_return(ovs_recursion) > 0) + kfree(key); return error; } EXPORT_SYMBOL_GPL(ovs_vport_receive); >> What are recursive actions in ovs? I couldn't find any use of pcpu data in >> there? Thanks! :) >> > There are couple of recursive actions in OVS, e.g. > OVS_ACTION_ATTR_RECIRC. But it is implemented by using per-cpu > flow-key stack to avoid recursive function call. Ahh, the deferred_action stuff. I understand. So the idea would be to lift even the first entry point to deferred_action a like? This sounds like it could work but I fear we should find a solution for stable, as this seems like a bit more work. Thanks, Hannes diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c index 31cbc8c5c7db82..af9c94c732ce56 100644 --- a/net/openvswitch/vport.c +++ b/net/openvswitch/vport.c @@ -426,6 +426,9 @@ u32 ovs_vport_find_upcall_portid(const struct vport *vport, struct sk_buff *skb) return ids->ids[ids_index]; } +static DEFINE_PER_CPU(struct sw_flow_key, pcpu_key); +static DEFINE_PER_CPU(int, ovs_recursion); + /** * ovs_vport_receive - pass up received packet to the datapath for