Message ID | 1286547901-10782-1-git-send-email-sgruszka@redhat.com |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
On Fri, Oct 08, 2010 at 04:25:00PM +0200, Stanislaw Gruszka wrote: > We have fedora bug report where driver fail to initialize after > suspend/resume because of memory allocation errors: > https://bugzilla.redhat.com/show_bug.cgi?id=629158 There is also one more thing to do regarding above. Calltraces from bug reports, shows that order 3 allocation fail. On arch with 4kB pages, order 3 mean 32kB allocation. We want to alloc 16kB, but there is also internal sk_buff data what make that we exceed the boundary and take 32kB from allocator, getting almost 50% wastage. To fix we can use similar method as in niu or iwlwifi drivers, alloc pages directly form buddy allocator and attach them to skb (by skb_add_rx_frag for example). I'm going to prepare such patch, but I have one doubt, what happens if page size in system is bigger than 16kB, should I care about such case? Stanislaw -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le vendredi 08 octobre 2010 à 16:52 +0200, Stanislaw Gruszka a écrit : > On Fri, Oct 08, 2010 at 04:25:00PM +0200, Stanislaw Gruszka wrote: > > We have fedora bug report where driver fail to initialize after > > suspend/resume because of memory allocation errors: > > https://bugzilla.redhat.com/show_bug.cgi?id=629158 > > There is also one more thing to do regarding above. Calltraces from bug > reports, shows that order 3 allocation fail. On arch with 4kB pages, > order 3 mean 32kB allocation. We want to alloc 16kB, but there is also > internal sk_buff data what make that we exceed the boundary and take > 32kB from allocator, getting almost 50% wastage. > Or its only an 1460+overhead allocation, and SLUB uses order-3 pages to satisfy 2048 bytes allocations. # grep 2048 /proc/slabinfo kmalloc-2048 8664 8752 2048 16 8 : tunables 0 0 0 : slabdata 547 547 0 8 in the <pagesperslab> column just says that : order-3 pages, even for small allocations. Switch to SLAB -> no more problem ;) > To fix we can use similar method as in niu or iwlwifi drivers, alloc > pages directly form buddy allocator and attach them to skb (by > skb_add_rx_frag for example). I'm going to prepare such patch, but > I have one doubt, what happens if page size in system is bigger > than 16kB, should I care about such case? Seems tricky. Should we patch all drivers to do something like that ? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Oct 08, 2010 at 05:04:07PM +0200, Eric Dumazet wrote: > Le vendredi 08 octobre 2010 à 16:52 +0200, Stanislaw Gruszka a écrit : > > On Fri, Oct 08, 2010 at 04:25:00PM +0200, Stanislaw Gruszka wrote: > > > We have fedora bug report where driver fail to initialize after > > > suspend/resume because of memory allocation errors: > > > https://bugzilla.redhat.com/show_bug.cgi?id=629158 > > > > There is also one more thing to do regarding above. Calltraces from bug > > reports, shows that order 3 allocation fail. On arch with 4kB pages, > > order 3 mean 32kB allocation. We want to alloc 16kB, but there is also > > internal sk_buff data what make that we exceed the boundary and take > > 32kB from allocator, getting almost 50% wastage. > > > > Or its only an 1460+overhead allocation, and SLUB uses order-3 pages to > satisfy 2048 bytes allocations. Rather not, trace show failure in rtl8169_rx_fill, where we allocate rx buffers and these are 16kB big by default. > Switch to SLAB -> no more problem ;) yeh, I wish to, but fedora use SLUB because of some debugging capabilities. > > To fix we can use similar method as in niu or iwlwifi drivers, alloc > > pages directly form buddy allocator and attach them to skb (by > > skb_add_rx_frag for example). I'm going to prepare such patch, but > > I have one doubt, what happens if page size in system is bigger > > than 16kB, should I care about such case? > > Seems tricky. Should we patch all drivers to do something like that ? I think, only on these drivers which do alloc_skb(n*PAGE_SIZE). As alternative we can be smarter in alloc_skb. Stanislaw > > > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le vendredi 08 octobre 2010 à 18:03 +0200, Stanislaw Gruszka a écrit : > On Fri, Oct 08, 2010 at 05:04:07PM +0200, Eric Dumazet wrote: > > Le vendredi 08 octobre 2010 à 16:52 +0200, Stanislaw Gruszka a écrit : > > > On Fri, Oct 08, 2010 at 04:25:00PM +0200, Stanislaw Gruszka wrote: > > > > We have fedora bug report where driver fail to initialize after > > > > suspend/resume because of memory allocation errors: > > > > https://bugzilla.redhat.com/show_bug.cgi?id=629158 > > > > > > There is also one more thing to do regarding above. Calltraces from bug > > > reports, shows that order 3 allocation fail. On arch with 4kB pages, > > > order 3 mean 32kB allocation. We want to alloc 16kB, but there is also > > > internal sk_buff data what make that we exceed the boundary and take > > > 32kB from allocator, getting almost 50% wastage. > > > > > > > Or its only an 1460+overhead allocation, and SLUB uses order-3 pages to > > satisfy 2048 bytes allocations. > > Rather not, trace show failure in rtl8169_rx_fill, where we allocate rx > buffers and these are 16kB big by default. > Only when gfp_t is GFP_KERNEL to fill rx buffers. (after your patch applied of course). This should succeed. If not, driver cannot load and function, since this NIC really needs 16KB buffers in order to avoid a hardware bug. Once allocated for RX rings, we never free them (never give this skb to upper stack) : When we receive a frame, we copybreak it, (using GFP_ATOMIC) so it depends on MTU. With MTU=1500, I am pretty sure we allocate 2048 bytes chunks, not more. > I think, only on these drivers which do alloc_skb(n*PAGE_SIZE). > As alternative we can be smarter in alloc_skb. Only if MTU is non standard, then. I repeat : With standard MTU=1500, we dont allocate huge skbs in rx path, only small (<2048 bytes) ones. For bigger frames, then you might allocate fragments, using pages, and dont care if PAGE_SIZE is 64Kbytes. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le vendredi 08 octobre 2010 à 16:25 +0200, Stanislaw Gruszka a écrit : > We have fedora bug report where driver fail to initialize after > suspend/resume because of memory allocation errors: > https://bugzilla.redhat.com/show_bug.cgi?id=629158 > > To fix use GFP_KERNEL allocation where possible. > > Tested-by: Neal Becker <ndbecker2@gmail.com> > Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Eric Dumazet <eric.dumazet@gmail.com> Date: Sat, 09 Oct 2010 09:54:04 +0200 > Le vendredi 08 octobre 2010 à 16:25 +0200, Stanislaw Gruszka a écrit : >> We have fedora bug report where driver fail to initialize after >> suspend/resume because of memory allocation errors: >> https://bugzilla.redhat.com/show_bug.cgi?id=629158 >> >> To fix use GFP_KERNEL allocation where possible. >> >> Tested-by: Neal Becker <ndbecker2@gmail.com> >> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> > > Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Applied. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 8 Oct 2010, Eric Dumazet wrote: > 8 in the <pagesperslab> column just says that : order-3 pages, even for > small allocations. Those allocations will fallback to smaller allocs if the page allocator has trouble satisfying those requests. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le lundi 11 octobre 2010 à 11:03 -0500, Christoph Lameter a écrit : > On Fri, 8 Oct 2010, Eric Dumazet wrote: > > > 8 in the <pagesperslab> column just says that : order-3 pages, even for > > small allocations. > > Those allocations will fallback to smaller allocs if the page allocator > has trouble satisfying those requests. > Interesting, do you have an idea when this feature was added ? Thanks -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 11 Oct 2010, Eric Dumazet wrote: > Le lundi 11 octobre 2010 à 11:03 -0500, Christoph Lameter a écrit : > > On Fri, 8 Oct 2010, Eric Dumazet wrote: > > > > > 8 in the <pagesperslab> column just says that : order-3 pages, even for > > > small allocations. > > > > Those allocations will fallback to smaller allocs if the page allocator > > has trouble satisfying those requests. > > > > Interesting, do you have an idea when this feature was added ? A couple of years ago.
diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c index fe3b762..a7fb044 100644 --- a/drivers/net/r8169.c +++ b/drivers/net/r8169.c @@ -4006,7 +4006,7 @@ static inline void rtl8169_map_to_asic(struct RxDesc *desc, dma_addr_t mapping, static struct sk_buff *rtl8169_alloc_rx_skb(struct pci_dev *pdev, struct net_device *dev, struct RxDesc *desc, int rx_buf_sz, - unsigned int align) + unsigned int align, gfp_t gfp) { struct sk_buff *skb; dma_addr_t mapping; @@ -4014,7 +4014,7 @@ static struct sk_buff *rtl8169_alloc_rx_skb(struct pci_dev *pdev, pad = align ? align : NET_IP_ALIGN; - skb = netdev_alloc_skb(dev, rx_buf_sz + pad); + skb = __netdev_alloc_skb(dev, rx_buf_sz + pad, gfp); if (!skb) goto err_out; @@ -4045,7 +4045,7 @@ static void rtl8169_rx_clear(struct rtl8169_private *tp) } static u32 rtl8169_rx_fill(struct rtl8169_private *tp, struct net_device *dev, - u32 start, u32 end) + u32 start, u32 end, gfp_t gfp) { u32 cur; @@ -4060,7 +4060,7 @@ static u32 rtl8169_rx_fill(struct rtl8169_private *tp, struct net_device *dev, skb = rtl8169_alloc_rx_skb(tp->pci_dev, dev, tp->RxDescArray + i, - tp->rx_buf_sz, tp->align); + tp->rx_buf_sz, tp->align, gfp); if (!skb) break; @@ -4088,7 +4088,7 @@ static int rtl8169_init_ring(struct net_device *dev) memset(tp->tx_skb, 0x0, NUM_TX_DESC * sizeof(struct ring_info)); memset(tp->Rx_skbuff, 0x0, NUM_RX_DESC * sizeof(struct sk_buff *)); - if (rtl8169_rx_fill(tp, dev, 0, NUM_RX_DESC) != NUM_RX_DESC) + if (rtl8169_rx_fill(tp, dev, 0, NUM_RX_DESC, GFP_KERNEL) != NUM_RX_DESC) goto err_out; rtl8169_mark_as_last_descriptor(tp->RxDescArray + NUM_RX_DESC - 1); @@ -4587,7 +4587,7 @@ static int rtl8169_rx_interrupt(struct net_device *dev, count = cur_rx - tp->cur_rx; tp->cur_rx = cur_rx; - delta = rtl8169_rx_fill(tp, dev, tp->dirty_rx, tp->cur_rx); + delta = rtl8169_rx_fill(tp, dev, tp->dirty_rx, tp->cur_rx, GFP_ATOMIC); if (!delta && count) netif_info(tp, intr, dev, "no Rx buffer allocated\n"); tp->dirty_rx += delta;