Message ID | 20111018080523.16861.55402.sendpatchset@krkumar2.in.ibm.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
On Tue, 2011-10-18 at 09:05 +0100, Krishna Kumar wrote: > The extra reference to the fragment pages causes those pages to > not get freed in skb_release_data(). The following patch fixes > the bug. I have not checked if any other converted driver has > the same issue. Damn. You are completely correct and I appear to have made this same mistake several times. A quick look suggests that at least cxbg, myriage, vmxnet, cassini and bnx2 may potentially have a similar issue :-( (I stopped looking at that point, I'll obviously do a full audit). I considered quite carefully whether (__)skb_frag_set_page should take a reference or not and decided yes but I'm starting to reconsider whether I made the right choice. It seems that is just confusing and violates the principal of least surprise to have a function called "set" take a new reference. In reality all existing drivers expect that adding a page to an SKB frag will just take over the existing reference. I think the best thing might be to remove the additional ref taking from the setter function and audit the previous changes to ensure they conform. I'll do that right away and post a fixup patch ASAP. > Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> This change is correct as things stand today, so: Acked-by: Ian Campbell <ian.campbell@citrix.com> but perhaps it would be better to hold off and let me fix all of these all at once. Thanks for bringing this to my attention Krishna. Ian. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Ian Campbell <Ian.Campbell@citrix.com> wrote on 10/18/2011 02:06:56 PM: Hi Ian, > On Tue, 2011-10-18 at 09:05 +0100, Krishna Kumar wrote: > > The extra reference to the fragment pages causes those pages to > > not get freed in skb_release_data(). The following patch fixes > > the bug. I have not checked if any other converted driver has > > the same issue. > > Damn. You are completely correct and I appear to have made this same > mistake several times. A quick look suggests that at least cxbg, > myriage, vmxnet, cassini and bnx2 may potentially have a similar > issue :-( (I stopped looking at that point, I'll obviously do a full > audit). > > I considered quite carefully whether (__)skb_frag_set_page should take a > reference or not and decided yes but I'm starting to reconsider whether > I made the right choice. It seems that is just confusing and violates > the principal of least surprise to have a function called "set" take a > new reference. In reality all existing drivers expect that adding a page > to an SKB frag will just take over the existing reference. > > I think the best thing might be to remove the additional ref taking from > the setter function and audit the previous changes to ensure they > conform. I'll do that right away and post a fixup patch ASAP. > > > Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> > > This change is correct as things stand today, so: > > Acked-by: Ian Campbell <ian.campbell@citrix.com> > > but perhaps it would be better to hold off and let me fix all of these > all at once. Either way is fine with me. However, besides having a fix, I would like to remove the manual initializations in set_skb_frag(), and substitute with __skb_fill_page_desc, like other drivers do. thanks, - KK -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Krishna Kumar <krkumar2@in.ibm.com> Date: Tue, 18 Oct 2011 13:35:23 +0530 > Commit 86ee8130 ("virtionet: convert to SKB paged frag API") > introduced a bug in guest. During RX testing, guest runs out > of memory within seconds, causing oom-killer; which then > panics the system: "Kernel panic - not syncing: Out of memory > and no killable processes...". /proc/meminfo just before the > panic shows MemFree is a few MB's: > > MemFree: 1928544 kB (starts here) > ... > ... > MemFree: 27488 kB > MemFree: 26248 kB > MemFree: 24636 kB > MemFree: 22632 kB > MemFree: 19580 kB > MemFree: 17928 kB > MemFree: 15548 kB > (Panic) > > The extra reference to the fragment pages causes those pages to > not get freed in skb_release_data(). The following patch fixes > the bug. I have not checked if any other converted driver has > the same issue. > > Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> I'll wait for Ian's full audit, but Krishna please use more appropriate subject lines in future patch submissions. This patch is fixing a problem in the virtio_net driver, so please mention that: "Subject: [PATCH] virtio_net: Fix guest memory leak and panic" -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2011-10-18 at 15:12 -0400, David Miller wrote:
> I'll wait for Ian's full audit,
The result of that is in
<1318931232.16132.65.camel@zakaz.uk.xensource.com> in this thread.
Thanks,
Ian.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
diff -ruNp org/drivers/net/virtio_net.c new/drivers/net/virtio_net.c --- org/drivers/net/virtio_net.c 2011-10-18 08:49:46.000000000 +0530 +++ new/drivers/net/virtio_net.c 2011-10-18 12:55:32.000000000 +0530 @@ -143,18 +143,15 @@ static void skb_xmit_done(struct virtque static void set_skb_frag(struct sk_buff *skb, struct page *page, unsigned int offset, unsigned int *len) { + int size = min((unsigned)PAGE_SIZE - offset, *len); int i = skb_shinfo(skb)->nr_frags; - skb_frag_t *f; - f = &skb_shinfo(skb)->frags[i]; - f->size = min((unsigned)PAGE_SIZE - offset, *len); - f->page_offset = offset; - __skb_frag_set_page(f, page); + __skb_fill_page_desc(skb, i, page, offset, size); - skb->data_len += f->size; - skb->len += f->size; + skb->data_len += size; + skb->len += size; skb_shinfo(skb)->nr_frags++; - *len -= f->size; + *len -= size; } static struct sk_buff *page_to_skb(struct virtnet_info *vi,
Commit 86ee8130 ("virtionet: convert to SKB paged frag API") introduced a bug in guest. During RX testing, guest runs out of memory within seconds, causing oom-killer; which then panics the system: "Kernel panic - not syncing: Out of memory and no killable processes...". /proc/meminfo just before the panic shows MemFree is a few MB's: MemFree: 1928544 kB (starts here) ... ... MemFree: 27488 kB MemFree: 26248 kB MemFree: 24636 kB MemFree: 22632 kB MemFree: 19580 kB MemFree: 17928 kB MemFree: 15548 kB (Panic) The extra reference to the fragment pages causes those pages to not get freed in skb_release_data(). The following patch fixes the bug. I have not checked if any other converted driver has the same issue. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> --- drivers/net/virtio_net.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html