Message ID:   49A8FAFF.7060104@cosmosbay.com
State:        RFC, archived
Delegated to: David Miller
Eric Dumazet a écrit :
> Kenny Chang a écrit :
>> It's been a while since I updated this thread. We've been running
>> through the different suggestions and tabulating their effects, as
>> well as trying out an Intel card. The short story is that setting
>> affinity and MSI works to some extent, and the Intel card doesn't
>> seem to change things significantly. The results don't seem
>> consistent enough for us to be able to point to a smoking gun.
>>
>> It does look like the 2.6.29-rc4 kernel performs okay with the Intel
>> card, but this is not a real-time build and it's not likely to be in
>> a supported Ubuntu distribution real soon. We've reached the point
>> where we'd like to look for an expert dedicated to work on this
>> problem for a period of time, the final result being some sort of
>> solution that produces a realtime configuration with a reasonably
>> "aged" kernel (.24-.28) whose multicast performance is greater than
>> or equal to that of 2.6.15.
>>
>> If anybody is interested in devoting some compensated time to this
>> issue, we're offering up a bounty:
>> http://www.athenacr.com/bounties/multicast-performance/
>>
>> For completeness, here's the table of our experiment results:
>>
>> ====================  ===============  ============  ==============  ==============  ==============  ==============
>> Kernel                flavor           IRQ affinity  *4x mcasttest*  *5x mcasttest*  *6x mcasttest*  *Mtools2* [4]_
>> ====================  ===============  ============  ==============  ==============  ==============  ==============
>> Intel e1000e
>> --------------------  ---------------  ------------  --------------  --------------  --------------  --------------
>> 2.6.24.19             rt               any           OK              Maybe           X
>> 2.6.24.19             rt               CPU0          OK              OK              X
>> 2.6.24.19             generic          any           X
>> 2.6.24.19             generic          CPU0          OK
>> 2.6.29-rc3            vanilla-server   any           X
>> 2.6.29-rc3            vanilla-server   CPU0          OK
>> 2.6.29-rc4            vanilla-generic  any           X                                               OK
>> 2.6.29-rc4            vanilla-generic  CPU0          OK              OK              OK [5]_         OK
>> --------------------  ---------------  ------------  --------------  --------------  --------------  --------------
>> Broadcom BNX2
>> --------------------  ---------------  ------------  --------------  --------------  --------------  --------------
>> 2.6.24-19             rt               MSI any       OK              OK              X
>> 2.6.24-19             rt               MSI CPU0      OK              Maybe           X
>> 2.6.24-19             rt               APIC any      OK              OK              X
>> 2.6.24-19             rt               APIC CPU0     OK              Maybe           X
>> 2.6.24-19-bnx-latest  rt               APIC CPU0     OK              X
>> 2.6.24-19             server           MSI any       X
>> 2.6.24-19             server           MSI CPU0      OK
>> 2.6.24-19             generic          APIC any      X
>> 2.6.24-19             generic          APIC CPU0     OK
>> 2.6.27-11             generic          APIC any      X
>> 2.6.27-11             generic          APIC CPU0     OK              10% drop
>> 2.6.28-8              generic          APIC any      OK              X
>> 2.6.28-8              generic          APIC CPU0     OK              OK              0.5% drop
>> 2.6.29-rc3            vanilla-server   MSI any       X
>> 2.6.29-rc3            vanilla-server   MSI CPU0      X
>> 2.6.29-rc3            vanilla-server   APIC any      OK              X
>> 2.6.29-rc3            vanilla-server   APIC CPU0     OK              OK
>> 2.6.29-rc4            vanilla-generic  APIC any      X
>> 2.6.29-rc4            vanilla-generic  APIC CPU0     OK              3% drop         10% drop        X
>> ====================  ===============  ============  ==============  ==============  ==============  ==============
>>
>> * [4] MTools2 is a test from 29West: http://www.29west.com/docs/TestNet/
>> * [5] In 5 trials, 1 trial dropped 2% and 4 trials dropped nothing.
>>
>> Kenny
>
> Hi Kenny
>
> I am investigating how to reduce contention (and schedule() calls) on
> this workload.
> I bound the NIC (gigabit BNX2) irq to cpu 0, so that oprofile results
> on this cpu can show us where ksoftirqd is spending its time.

We can see the scheduler at work :)

Also, one thing to note is __copy_skb_header() : 9.49 % of cpu0 time.
The problem comes from dst_clone() (6.05 % total, so 2/3 of
__copy_skb_header()), touching a highly contended cache line (the
other cpus are doing the decrement of the dst refcounter).

CPU: Core 2, speed 3000.05 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a
unit mask of 0x00 (Unhalted core cycles) count 100000

Samples on CPU 0 (samples for other cpus 1..7 omitted)

samples  cum. samples  %       cum. %   symbol name
23750     23750        9.8159   9.8159  try_to_wake_up
22972     46722        9.4944  19.3103  __copy_skb_header
20217     66939        8.3557  27.6660  enqueue_task_fair
14565     81504        6.0197  33.6857  sock_def_readable
13454     94958        5.5606  39.2463  task_rq_lock
13381    108339        5.5304  44.7767  resched_task
13090    121429        5.4101  50.1868  udp_queue_rcv_skb
11441    132870        4.7286  54.9154  skb_queue_tail
10109    142979        4.1781  59.0935  sock_queue_rcv_skb
10024    153003        4.1429  63.2364  __wake_up_sync
 9952    162955        4.1132  67.3496  update_curr
 8761    171716        3.6209  70.9705  sched_clock_cpu
 7414    179130        3.0642  74.0347  rb_insert_color
 7381    186511        3.0506  77.0853  select_task_rq_fair
 6749    193260        2.7894  79.8747  __slab_alloc
 5881    199141        2.4306  82.3053  __wake_up_common
 5432    204573        2.2451  84.5504  __skb_clone
 4306    208879        1.7797  86.3300  kmem_cache_alloc
 3524    212403        1.4565  87.7865  place_entity
 2783    215186        1.1502  88.9367  skb_clone
 2576    217762        1.0647  90.0014  __udp4_lib_rcv
 2430    220192        1.0043  91.0057  bnx2_poll_work
 2184    222376        0.9027  91.9084  ipt_do_table
 2090    224466        0.8638  92.7722  ip_route_input
 1877    226343        0.7758  93.5479  __alloc_skb
 1495    227838        0.6179  94.1658  native_sched_clock
 1166    229004        0.4819  94.6477  __update_sched_clock
 1083    230087        0.4476  95.0953  netif_receive_skb
 1062    231149        0.4389  95.5343  activate_task
  644    231793        0.2662  95.8004  __kmalloc_track_caller
  638    232431        0.2637  96.0641  nf_iterate
  549    232980        0.2269  96.2910  skb_put
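The dst refcount bouncing described above is easy to reproduce in user
space. Here is a minimal sketch (not from the thread; thread counts and
sizes are arbitrary): all threads hammering one shared counter shows the
cache-line ping-pong cost, while per-thread padded counters avoid it.

/*
 * Cache-line ping-pong demo (editor's sketch, not kernel code).
 * Build: gcc -O2 -pthread pingpong.c -o pingpong
 * Compare: time ./pingpong        (shared counter, line bounces)
 *          time ./pingpong percpu (padded per-thread counters)
 */
#include <pthread.h>
#include <stdlib.h>

#define NTHREADS 8
#define ITERS    10000000L

/* one refcount shared by every thread: the contended case */
static int shared_ref __attribute__((aligned(64)));

/* one counter per thread, each padded out to its own cache line */
static struct { int v; char pad[60]; } percpu_ref[NTHREADS]
	__attribute__((aligned(64)));

static void *hammer_shared(void *arg)
{
	long i;

	(void)arg;
	for (i = 0; i < ITERS; i++) {
		__sync_add_and_fetch(&shared_ref, 1);	/* dst_clone()   */
		__sync_sub_and_fetch(&shared_ref, 1);	/* dst_release() */
	}
	return NULL;
}

static void *hammer_percpu(void *arg)
{
	int id = (int)(long)arg;
	long i;

	for (i = 0; i < ITERS; i++) {
		__sync_add_and_fetch(&percpu_ref[id].v, 1);
		__sync_sub_and_fetch(&percpu_ref[id].v, 1);
	}
	return NULL;
}

int main(int argc, char **argv)
{
	pthread_t th[NTHREADS];
	void *(*fn)(void *) = (argc > 1) ? hammer_percpu : hammer_shared;
	int i;

	for (i = 0; i < NTHREADS; i++)
		pthread_create(&th[i], NULL, fn, (void *)(long)i);
	for (i = 0; i < NTHREADS; i++)
		pthread_join(th[i], NULL);
	return 0;
}

On a multi-core box the shared-counter run is typically several times
slower, which is the same effect the profile attributes to dst_clone().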
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Sat, 28 Feb 2009 09:51:11 +0100

> David, this is a preliminary work, not meant for inclusion as is,
> comments are welcome.
>
> [PATCH] net: sk_forward_alloc becomes an atomic_t
>
> Commit 95766fff6b9a78d11fc2d3812dd035381690b55d
> (UDP: Add memory accounting) introduced a regression for high rate
> UDP flows, because of the extra lock_sock() in udp_recvmsg().
>
> In order to reduce the need for lock_sock() in the UDP receive path,
> we might need to declare sk_forward_alloc as an atomic_t.
>
> udp_recvmsg() can then avoid a lock_sock()/release_sock() pair.
>
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>

This adds new overhead for TCP, which has to hold the socket
lock for other reasons in these paths.

I don't get how an atomic_t operation is cheaper than a
lock_sock/release_sock. Is it the case that in many
executions of these paths only atomic_read()'s are necessary?

I actually think this scheme is racy. There is a reason we
have to hold the socket lock when doing memory scheduling.
Two threads can get in there and say "hey, I have enough space
already" even though only enough space is allocated for one
of their requests.

What did I miss? :)
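The race David describes can be shown in a few lines of user space code
mirroring the patch's lockless fast path (editor's sketch; the 4096-byte
"budget" stands in for one SK_MEM_QUANTUM of sk_forward_alloc):

/*
 * Check-then-charge without a lock: two threads both pass the test.
 * Build: gcc -O2 -pthread race.c -o race
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static int forward_alloc = 4096;	/* one quantum of budget      */
static int winners;			/* threads that "got" memory  */

static void *claim(void *arg)
{
	int size = 4096;

	(void)arg;
	/* the racy part: read the counter ... */
	if (size <= forward_alloc) {
		usleep(1000);		/* widen the race window */
		/* ... then charge it, believing the budget was ours */
		__sync_sub_and_fetch(&forward_alloc, size);
		__sync_add_and_fetch(&winners, 1);
	}
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, claim, NULL);
	pthread_create(&b, NULL, claim, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	/* typically prints: winners=2 forward_alloc=-4096 */
	printf("winners=%d forward_alloc=%d\n", winners, forward_alloc);
	return 0;
}

Both threads are told the quantum is theirs even though only one was
ever scheduled; under lock_sock() the second checker would have had to
wait and re-test.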
David Miller a écrit :
> From: Eric Dumazet <dada1@cosmosbay.com>
> Date: Sat, 28 Feb 2009 09:51:11 +0100
>
>> David, this is a preliminary work, not meant for inclusion as is,
>> comments are welcome.
>>
>> [PATCH] net: sk_forward_alloc becomes an atomic_t
>>
>> Commit 95766fff6b9a78d11fc2d3812dd035381690b55d
>> (UDP: Add memory accounting) introduced a regression for high rate
>> UDP flows, because of the extra lock_sock() in udp_recvmsg().
>>
>> In order to reduce the need for lock_sock() in the UDP receive path,
>> we might need to declare sk_forward_alloc as an atomic_t.
>>
>> udp_recvmsg() can then avoid a lock_sock()/release_sock() pair.
>>
>> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
>
> This adds new overhead for TCP, which has to hold the socket
> lock for other reasons in these paths.
>
> I don't get how an atomic_t operation is cheaper than a
> lock_sock/release_sock. Is it the case that in many
> executions of these paths only atomic_read()'s are necessary?
>
> I actually think this scheme is racy. There is a reason we
> have to hold the socket lock when doing memory scheduling.
> Two threads can get in there and say "hey, I have enough space
> already" even though only enough space is allocated for one
> of their requests.
>
> What did I miss? :)

I believe you are right, and in fact I was about to post a "don't look
at this patch" note, since it doesn't help multicast reception at all:
I redid the tests more carefully and got nothing but noise.

We have a cache line ping-pong mess here, and need more thinking.

I rewrote Kenny's program to use non-blocking sockets.

Receivers are doing:

	int delay = 50;

	fcntl(s, F_SETFL, O_NDELAY);	/* non-blocking reads */
	while (1) {
		struct sockaddr_in from;
		socklen_t fromlen = sizeof(from);

		res = recvfrom(s, buf, 1000, 0,
			       (struct sockaddr *)&from, &fromlen);
		if (res == -1) {	/* queue empty: back off a little */
			delay++;
			usleep(delay);
			continue;
		}
		if (delay > 40)		/* traffic: poll a little faster */
			delay--;
		++npackets;
	}

With this little user space change and 8 receivers on my dual quad
core, ksoftirqd only takes 8% of one cpu and there are no drops at all
(instead of 100% cpu and 30% drops).

So this is definitely a problem mixing scheduler cache line ping-pongs
with network stack cache line ping-pongs.

We could reorder fields so that fewer cache lines are touched by the
softirq processing; I tried this but still got packet drops.
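For anyone who wants to reproduce this, here is the loop above fleshed
out into a complete receiver. The multicast group, port, and socket
setup are editor's assumptions (the thread's mcasttest parameters are
not given); only the adaptive-delay loop is from the message.

/*
 * Polling multicast receiver (editor's reconstruction; error checks
 * omitted for brevity).  Build: gcc -O2 -o mcrecv mcrecv.c
 */
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
	struct sockaddr_in addr;
	struct ip_mreq mreq;
	char buf[1500];
	long npackets = 0;
	int delay = 50;
	int s = socket(AF_INET, SOCK_DGRAM, 0);

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_ANY);
	addr.sin_port = htons(12345);			/* assumed port  */
	bind(s, (struct sockaddr *)&addr, sizeof(addr));

	mreq.imr_multiaddr.s_addr = inet_addr("239.0.0.1"); /* assumed group */
	mreq.imr_interface.s_addr = htonl(INADDR_ANY);
	setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));

	fcntl(s, F_SETFL, O_NONBLOCK);		/* O_NDELAY in the original */
	while (1) {
		struct sockaddr_in from;
		socklen_t fromlen = sizeof(from);
		ssize_t res = recvfrom(s, buf, 1000, 0,
				       (struct sockaddr *)&from, &fromlen);

		if (res == -1) {	/* queue empty: back off a little */
			delay++;
			usleep(delay);
			continue;
		}
		if (delay > 40)		/* traffic: poll a little faster */
			delay--;
		++npackets;		/* report however you like */
	}
	return 0;
}

The point of the design is that a sleeping poller never enters the
try_to_wake_up()/enqueue_task_fair() path that dominated the cpu0
profile, which is why ksoftirqd load collapses.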
Eric Dumazet a écrit :
> David Miller a écrit :
>> From: Eric Dumazet <dada1@cosmosbay.com>
>> Date: Sat, 28 Feb 2009 09:51:11 +0100
>>
>>> David, this is a preliminary work, not meant for inclusion as is,
>>> comments are welcome.
>>>
>>> [PATCH] net: sk_forward_alloc becomes an atomic_t
>>>
>>> Commit 95766fff6b9a78d11fc2d3812dd035381690b55d
>>> (UDP: Add memory accounting) introduced a regression for high rate
>>> UDP flows, because of the extra lock_sock() in udp_recvmsg().
>>>
>>> In order to reduce the need for lock_sock() in the UDP receive path,
>>> we might need to declare sk_forward_alloc as an atomic_t.
>>>
>>> udp_recvmsg() can then avoid a lock_sock()/release_sock() pair.
>>>
>>> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
>>
>> This adds new overhead for TCP, which has to hold the socket
>> lock for other reasons in these paths.
>>
>> I don't get how an atomic_t operation is cheaper than a
>> lock_sock/release_sock. Is it the case that in many
>> executions of these paths only atomic_read()'s are necessary?
>>
>> I actually think this scheme is racy. There is a reason we
>> have to hold the socket lock when doing memory scheduling.
>> Two threads can get in there and say "hey, I have enough space
>> already" even though only enough space is allocated for one
>> of their requests.
>>
>> What did I miss? :)
>
> I believe you are right, and in fact I was about to post a "don't look
> at this patch" note, since it doesn't help multicast reception at all:
> I redid the tests more carefully and got nothing but noise.
>
> We have a cache line ping-pong mess here, and need more thinking.
>
> I rewrote Kenny's program to use non-blocking sockets.
>
> Receivers are doing:
>
> 	int delay = 50;
>
> 	fcntl(s, F_SETFL, O_NDELAY);	/* non-blocking reads */
> 	while (1) {
> 		struct sockaddr_in from;
> 		socklen_t fromlen = sizeof(from);
>
> 		res = recvfrom(s, buf, 1000, 0,
> 			       (struct sockaddr *)&from, &fromlen);
> 		if (res == -1) {	/* queue empty: back off a little */
> 			delay++;
> 			usleep(delay);
> 			continue;
> 		}
> 		if (delay > 40)		/* traffic: poll a little faster */
> 			delay--;
> 		++npackets;
> 	}
>
> With this little user space change and 8 receivers on my dual quad
> core, ksoftirqd only takes 8% of one cpu and there are no drops at all
> (instead of 100% cpu and 30% drops).
>
> So this is definitely a problem mixing scheduler cache line ping-pongs
> with network stack cache line ping-pongs.
>
> We could reorder fields so that fewer cache lines are touched by the
> softirq processing; I tried this but still got packet drops.

I have more questions:

What is the maximum latency you can afford on the delivery of the
packet(s)?

Are the user apps using real-time scheduling?

I had an idea: keep the cpu handling NIC interrupts only delivering
packets to socket queues, without messing with the scheduler. Do the
fast queueing there, then wake up a workqueue (on another cpu) to
perform the scheduler work. But that means some extra latency (on the
order of 2 or 3 us, I guess); a rough sketch follows.

We could enter this mode automatically if the NIC rx handler *sees*
more than N packets waiting in the NIC queue: in case of moderate or
light traffic, no extra latency would be necessary. This would mean
some changes in the NIC driver.

Hum. Then again, if the NIC rx handler is being run from ksoftirqd, we
already know we are in a stress situation, so maybe no driver changes
are necessary: just test whether we are running in ksoftirqd...
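To make the idea above concrete, here is a rough, untested sketch of
what the deferral could look like. This is not a working patch: the
names deferred_wakeup, wakeup_work, wakeup_fn and WAKEUP_CPU are
invented for illustration, and the per-socket bookkeeping (remembering
which sockets grew a queue during the rx loop) is left out entirely.

/*
 * Sketch only: rx cpu queues skbs, another cpu does the wakeups.
 * schedule_work_on() is real kernel API (2.6.27+); everything else
 * here is hypothetical.
 */
#include <linux/workqueue.h>
#include <net/sock.h>

#define WAKEUP_CPU 1		/* any cpu but the one taking NIC irqs */

static void wakeup_fn(struct work_struct *work);
static DECLARE_WORK(wakeup_work, wakeup_fn);

/* runs on WAKEUP_CPU: pays the try_to_wake_up()/enqueue cost there */
static void wakeup_fn(struct work_struct *work)
{
	/* for each socket remembered by the rx loop:
	 *	sk->sk_data_ready(sk, 0);
	 */
}

/* called from the rx path instead of waking the reader directly */
static void deferred_wakeup(struct sock *sk)
{
	/* remember sk somewhere (e.g. a per-cpu list), then kick the
	 * helper on another cpu */
	schedule_work_on(WAKEUP_CPU, &wakeup_work);
}

The trade-off is exactly the one stated above: the wakeup burns cycles
on a cpu that is not saturated by NIC interrupts, at the price of a few
microseconds of added delivery latency.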
diff --git a/include/net/sock.h b/include/net/sock.h
index 4bb1ff9..c4befb9 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -250,7 +250,7 @@ struct sock {
 	struct sk_buff_head	sk_async_wait_queue;
 #endif
 	int			sk_wmem_queued;
-	int			sk_forward_alloc;
+	atomic_t		sk_forward_alloc;
 	gfp_t			sk_allocation;
 	int			sk_route_caps;
 	int			sk_gso_type;
@@ -823,7 +823,7 @@ static inline int sk_wmem_schedule(struct sock *sk, int size)
 {
 	if (!sk_has_account(sk))
 		return 1;
-	return size <= sk->sk_forward_alloc ||
+	return size <= atomic_read(&sk->sk_forward_alloc) ||
 		__sk_mem_schedule(sk, size, SK_MEM_SEND);
 }
 
@@ -831,7 +831,7 @@ static inline int sk_rmem_schedule(struct sock *sk, int size)
 {
 	if (!sk_has_account(sk))
 		return 1;
-	return size <= sk->sk_forward_alloc ||
+	return size <= atomic_read(&sk->sk_forward_alloc) ||
 		__sk_mem_schedule(sk, size, SK_MEM_RECV);
 }
 
@@ -839,7 +839,7 @@ static inline void sk_mem_reclaim(struct sock *sk)
 {
 	if (!sk_has_account(sk))
 		return;
-	if (sk->sk_forward_alloc >= SK_MEM_QUANTUM)
+	if (atomic_read(&sk->sk_forward_alloc) >= SK_MEM_QUANTUM)
 		__sk_mem_reclaim(sk);
 }
 
@@ -847,7 +847,7 @@ static inline void sk_mem_reclaim_partial(struct sock *sk)
 {
 	if (!sk_has_account(sk))
 		return;
-	if (sk->sk_forward_alloc > SK_MEM_QUANTUM)
+	if (atomic_read(&sk->sk_forward_alloc) > SK_MEM_QUANTUM)
 		__sk_mem_reclaim(sk);
 }
 
@@ -855,14 +855,14 @@ static inline void sk_mem_charge(struct sock *sk, int size)
 {
 	if (!sk_has_account(sk))
 		return;
-	sk->sk_forward_alloc -= size;
+	atomic_sub(size, &sk->sk_forward_alloc);
 }
 
 static inline void sk_mem_uncharge(struct sock *sk, int size)
 {
 	if (!sk_has_account(sk))
 		return;
-	sk->sk_forward_alloc += size;
+	atomic_add(size, &sk->sk_forward_alloc);
 }
 
 static inline void sk_wmem_free_skb(struct sock *sk, struct sk_buff *skb)
diff --git a/net/core/sock.c b/net/core/sock.c
index 0620046..8489105 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1081,7 +1081,7 @@ struct sock *sk_clone(const struct sock *sk, const gfp_t priority)
 
 		newsk->sk_dst_cache	= NULL;
 		newsk->sk_wmem_queued	= 0;
-		newsk->sk_forward_alloc = 0;
+		atomic_set(&newsk->sk_forward_alloc, 0);
 		newsk->sk_send_head	= NULL;
 		newsk->sk_userlocks	= sk->sk_userlocks & ~SOCK_BINDPORT_LOCK;
 
@@ -1479,7 +1479,7 @@ int __sk_mem_schedule(struct sock *sk, int size, int kind)
 	int amt = sk_mem_pages(size);
 	int allocated;
 
-	sk->sk_forward_alloc += amt * SK_MEM_QUANTUM;
+	atomic_add(amt * SK_MEM_QUANTUM, &sk->sk_forward_alloc);
 	allocated = atomic_add_return(amt, prot->memory_allocated);
 
 	/* Under limit. */
@@ -1520,7 +1520,7 @@ int __sk_mem_schedule(struct sock *sk, int size, int kind)
 		if (prot->sysctl_mem[2] > alloc *
 		    sk_mem_pages(sk->sk_wmem_queued +
 				 atomic_read(&sk->sk_rmem_alloc) +
-				 sk->sk_forward_alloc))
+				 atomic_read(&sk->sk_forward_alloc)))
 			return 1;
 	}
 
@@ -1537,7 +1537,7 @@ suppress_allocation:
 	}
 
 	/* Alas. Undo changes. */
-	sk->sk_forward_alloc -= amt * SK_MEM_QUANTUM;
+	atomic_sub(amt * SK_MEM_QUANTUM, &sk->sk_forward_alloc);
 	atomic_sub(amt, prot->memory_allocated);
 	return 0;
 }
@@ -1551,14 +1551,21 @@ EXPORT_SYMBOL(__sk_mem_schedule);
 void __sk_mem_reclaim(struct sock *sk)
 {
 	struct proto *prot = sk->sk_prot;
-
-	atomic_sub(sk->sk_forward_alloc >> SK_MEM_QUANTUM_SHIFT,
-		   prot->memory_allocated);
-	sk->sk_forward_alloc &= SK_MEM_QUANTUM - 1;
-
-	if (prot->memory_pressure && *prot->memory_pressure &&
-	    (atomic_read(prot->memory_allocated) < prot->sysctl_mem[0]))
-		*prot->memory_pressure = 0;
+	int val = atomic_read(&sk->sk_forward_alloc);
+
+begin:
+	val = atomic_read(&sk->sk_forward_alloc);
+	if (val >= SK_MEM_QUANTUM) {
+		if (atomic_cmpxchg(&sk->sk_forward_alloc, val,
+				   val & (SK_MEM_QUANTUM - 1)) != val)
+			goto begin;
+		atomic_sub(val >> SK_MEM_QUANTUM_SHIFT,
+			   prot->memory_allocated);
+
+		if (prot->memory_pressure && *prot->memory_pressure &&
+		    (atomic_read(prot->memory_allocated) < prot->sysctl_mem[0]))
+			*prot->memory_pressure = 0;
+	}
 }
 EXPORT_SYMBOL(__sk_mem_reclaim);
diff --git a/net/core/stream.c b/net/core/stream.c
index 8727cea..4d04d28 100644
--- a/net/core/stream.c
+++ b/net/core/stream.c
@@ -198,7 +198,7 @@ void sk_stream_kill_queues(struct sock *sk)
 	sk_mem_reclaim(sk);
 
 	WARN_ON(sk->sk_wmem_queued);
-	WARN_ON(sk->sk_forward_alloc);
+	WARN_ON(atomic_read(&sk->sk_forward_alloc));
 
 	/* It is _impossible_ for the backlog to contain anything
 	 * when we get here.  All user references to this socket
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 627be4d..7a1475c 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -152,7 +152,7 @@ void inet_sock_destruct(struct sock *sk)
 	WARN_ON(atomic_read(&sk->sk_rmem_alloc));
 	WARN_ON(atomic_read(&sk->sk_wmem_alloc));
 	WARN_ON(sk->sk_wmem_queued);
-	WARN_ON(sk->sk_forward_alloc);
+	WARN_ON(atomic_read(&sk->sk_forward_alloc));
 
 	kfree(inet->opt);
 	dst_release(sk->sk_dst_cache);
diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
index 588a779..903ad66 100644
--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -158,7 +158,7 @@ static int inet_csk_diag_fill(struct sock *sk,
 	if (minfo) {
 		minfo->idiag_rmem = atomic_read(&sk->sk_rmem_alloc);
 		minfo->idiag_wmem = sk->sk_wmem_queued;
-		minfo->idiag_fmem = sk->sk_forward_alloc;
+		minfo->idiag_fmem = atomic_read(&sk->sk_forward_alloc);
 		minfo->idiag_tmem = atomic_read(&sk->sk_wmem_alloc);
 	}
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index a6961d7..5e08f37 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5258,7 +5258,7 @@ int tcp_rcv_established(struct sock *sk, struct sk_buff *skb,
 				tcp_rcv_rtt_measure_ts(sk, skb);
 
-				if ((int)skb->truesize > sk->sk_forward_alloc)
+				if ((int)skb->truesize > atomic_read(&sk->sk_forward_alloc))
 					goto step5;
 
 				NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_TCPHPHITS);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 4bd178a..dcc246a 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -955,9 +955,7 @@ try_again:
 		err = ulen;
 
 out_free:
-	lock_sock(sk);
 	skb_free_datagram(sk, skb);
-	release_sock(sk);
 out:
 	return err;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 84b1a29..582b80a 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -257,9 +257,7 @@ try_again:
 		err = ulen;
 
 out_free:
-	lock_sock(sk);
 	skb_free_datagram(sk, skb);
-	release_sock(sk);
 out:
 	return err;
diff --git a/net/sched/em_meta.c b/net/sched/em_meta.c
index 72cf86e..94d90b6 100644
--- a/net/sched/em_meta.c
+++ b/net/sched/em_meta.c
@@ -383,7 +383,7 @@ META_COLLECTOR(int_sk_wmem_queued)
 META_COLLECTOR(int_sk_fwd_alloc)
 {
 	SKIP_NONLOCAL(skb);
-	dst->value = skb->sk->sk_forward_alloc;
+	dst->value = atomic_read(&skb->sk->sk_forward_alloc);
 }
 
 META_COLLECTOR(int_sk_sndbuf)
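The interesting part of the patch is the lockless reclaim loop in
__sk_mem_reclaim(): snapshot the counter, try to cmpxchg in just the
sub-quantum remainder, and retry if another cpu moved the counter in
between. The same pattern rendered in user space (editor's sketch;
SK_MEM_QUANTUM is PAGE_SIZE in the kernel, assumed 4096 here):

/*
 * Lock-free quantum reclaim, modeled on the patch's __sk_mem_reclaim().
 * Build: gcc -O2 reclaim.c -o reclaim
 */
#include <stdio.h>

#define SK_MEM_QUANTUM       4096
#define SK_MEM_QUANTUM_SHIFT 12

static int forward_alloc = 3 * SK_MEM_QUANTUM + 100;
static int memory_allocated = 10;	/* global pool, in quanta */

static void mem_reclaim(void)
{
	int val;
begin:
	val = forward_alloc;		/* atomic_read() in the patch */
	if (val >= SK_MEM_QUANTUM) {
		/* return whole quanta to the pool, keep the remainder */
		if (__sync_val_compare_and_swap(&forward_alloc, val,
				val & (SK_MEM_QUANTUM - 1)) != val)
			goto begin;	/* lost a race: re-snapshot */
		__sync_sub_and_fetch(&memory_allocated,
				     val >> SK_MEM_QUANTUM_SHIFT);
	}
}

int main(void)
{
	mem_reclaim();
	/* prints: forward_alloc=100 memory_allocated=7 */
	printf("forward_alloc=%d memory_allocated=%d\n",
	       forward_alloc, memory_allocated);
	return 0;
}

The cmpxchg makes the reclaim itself safe against concurrent updates;
what it cannot do, per David's objection, is make the separate
check-then-charge sequence in sk_rmem_schedule() atomic.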