diff mbox

NULL Pointer Deference: NFS & Telnet

Message ID 27F9C60D11D683428E133F85D2BB4A53043E3EDFE6@dlee03.ent.ti.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Arce, Abraham May 26, 2010, 1:48 a.m. UTC
Hi,

I am able to avoid the NULL pointer dereference but not sure if the handling
is the correct one... find the patch below...

> I have 2 scenarios in which I am getting a NULL pointer dereference:
> 
> 1) root filesystem over nfs
> 2) telnet connection
> 
> The issue appeared on this commit
> 
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-
> 2.6.git;a=commit;h=f8965467f366fd18f01feafb5db10512d7b4422c
> 
> The driver I am working with is drivers/net/ks8851.c
> Any help will be highly appreciated...
>
> ---
> 
> Scenario 1 | root filesystem over nfs
> 
> Looking up port of RPC 100005/1 on 10.87.231.229
> VFS: Mounted root (nfs filesystem) on device 0:10.
> Freeing init memory: 128K
> Unable to handle kernel NULL pointer dereference at virtual address 00000000
> [..]
> PC is at put_page+0xc/0x120
> LR is at skb_release_data+0x74/0xb8
> [..]
> Backtrace:
> [<c0086dd0>] (put_page+0x0/0x120)
> [<c01d0a48>] (skb_release_data+0x0/0xb8)
> [<c01d1044>] (skb_release_all+0x0/0x20)
> [<c01d075c>] (__kfree_skb+0x0/0xbc)
> [<c01d0818>] (consume_skb+0x0/0x58)
> [<c01d39cc>] (skb_free_datagram+0x0/0x40)
> [<c023ff74>] (xs_udp_data_ready+0x0/0x1e8)
> [<c01ce034>] (sock_queue_rcv_skb+0x0/0x1c0)
> [<c01fbba8>] (ip_queue_rcv_skb+0x0/0x58)
> [<c02176c0>] (__udp_queue_rcv_skb+0x0/0x18c)
> [<c0218e28>] (udp_queue_rcv_skb+0x0/0x348)
> [<c02195a4>] (__udp4_lib_rcv+0x0/0x564)
> [<c0219b08>] (udp_rcv+0x0/0x20)
> [<c01f5f34>] (ip_local_deliver+0x0/0x264)
> [<c01f586c>] (ip_rcv+0x0/0x6c8)
> [<c01d7ec0>] (__netif_receive_skb+0x0/0x2d0)
> [<c01d8190>] (process_backlog+0x0/0x16c)
> [<c01d8e14>] (net_rx_action+0x0/0x18c)
> [<c00521a0>] (__do_softirq+0x0/0x12c)
> [<c00522cc>] (irq_exit+0x0/0x70)
> [<c0028000>] (asm_do_IRQ+0x0/0xc8)
> 
> Complete log at http://pastebin.mozilla.org/728027
> 
> ---
> 
> Scenario 2
> 
>  1. Root filesystem booted in ram
>  2. eth0 brought up
>  3. telnetd daemon started
>  4. tried to connect through telnet
> 
> # Unable to handle kernel NULL pointer dereference at virtual address 00000000
> pgd = d98e8000
> [..]
> PC is at put_page+0xc/0x120
> LR is at skb_release_data+0x74/0xb8
> [..]
> Backtrace:
> [<c0086dd0>] (put_page+0x0/0x120)
> [<c01d0a48>] (skb_release_data+0x0/0xb8)
> [<c01d1044>] (skb_release_all+0x0/0x20)
> [<c01d075c>] (__kfree_skb+0x0/0xbc)
> [<c0202444>] (tcp_recvmsg+0x0/0x93c)
> [<c02201e8>] (inet_recvmsg+0x0/0xec)
> [<c01c7fd0>] (sock_aio_read+0x0/0xf8)
> [<c00ab3ac>] (do_sync_read+0x0/0xec)
> [<c00abfbc>] (vfs_read+0x0/0x164)
> [<c00ac1a0>] (sys_read+0x0/0x70)
> [<c0029100>] (ret_fast_syscall+0x0/0x30)
> 
> Complete log at http://pastebin.mozilla.org/728028
> 

Check for NULL data in sk_buff before sending to put_page

Signed-off-by: Abraham Arce <x0066660@ti.com>
---
 net/core/skbuff.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Comments

David Miller May 26, 2010, 1:52 a.m. UTC | #1
From: "Arce, Abraham" <x0066660@ti.com>
Date: Tue, 25 May 2010 20:48:02 -0500

> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index f8abf68..eb81f76 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -334,7 +334,7 @@ static void skb_release_data(struct sk_buff *skb)
>  	if (!skb->cloned ||
>  	    !atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
>  			       &skb_shinfo(skb)->dataref)) {
> -		if (skb_shinfo(skb)->nr_frags) {
> +		if (skb_shinfo(skb)->nr_frags && skb_has_frags(skb)) {
>  			int i;
>  			for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
>  				put_page(skb_shinfo(skb)->frags[i].page);

skb_shinfo(skb)->nr_frags counts the number of entries contained
in the skb_shinfo(skb)->frags[] array.

This has nothing to do with the frag list pointer,
skb_shinfo(skb)->frag_list, which is what skb_has_frags()
tests.

You've got some kind of memory corruption going on and it
appears to have nothing to do with the code paths you're
playing with here.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arce, Abraham May 26, 2010, 2:02 a.m. UTC | #2
Thanks David,

> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index f8abf68..eb81f76 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -334,7 +334,7 @@ static void skb_release_data(struct sk_buff *skb)
> >  	if (!skb->cloned ||
> >  	    !atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
> >  			       &skb_shinfo(skb)->dataref)) {
> > -		if (skb_shinfo(skb)->nr_frags) {
> > +		if (skb_shinfo(skb)->nr_frags && skb_has_frags(skb)) {
> >  			int i;
> >  			for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
> >  				put_page(skb_shinfo(skb)->frags[i].page);
> 
> skb_shinfo(skb)->nr_frags counts the number of entries contained
> in the skb_shinfo(skb)->frags[] array.
> 
> This has nothing to do with the frag list pointer,
> skb_shinfo(skb)->frag_list, which is what skb_has_frags()
> tests.
> 
> You've got some kind of memory corruption going on and it
> appears to have nothing to do with the code paths you're
> playing with here.

Do you have any recommendation on debugging technique/tool for this memory corruption issue?

Best Regards
Abraham
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet May 26, 2010, 5:29 a.m. UTC | #3
Le mardi 25 mai 2010 à 21:02 -0500, Arce, Abraham a écrit :
> Thanks David,
> 
> > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > > index f8abf68..eb81f76 100644
> > > --- a/net/core/skbuff.c
> > > +++ b/net/core/skbuff.c
> > > @@ -334,7 +334,7 @@ static void skb_release_data(struct sk_buff *skb)
> > >  	if (!skb->cloned ||
> > >  	    !atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
> > >  			       &skb_shinfo(skb)->dataref)) {
> > > -		if (skb_shinfo(skb)->nr_frags) {
> > > +		if (skb_shinfo(skb)->nr_frags && skb_has_frags(skb)) {
> > >  			int i;
> > >  			for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
> > >  				put_page(skb_shinfo(skb)->frags[i].page);
> > 
> > skb_shinfo(skb)->nr_frags counts the number of entries contained
> > in the skb_shinfo(skb)->frags[] array.
> > 
> > This has nothing to do with the frag list pointer,
> > skb_shinfo(skb)->frag_list, which is what skb_has_frags()
> > tests.
> > 
> > You've got some kind of memory corruption going on and it
> > appears to have nothing to do with the code paths you're
> > playing with here.
> 
> Do you have any recommendation on debugging technique/tool for this memory corruption issue?
> 
> Best Regards
> Abraham
> --

It seems quite strange. You have a skb->nr_frags > 0 value, but a
frags[i].page = 0 value

You might add following function :

shinfo_check(struct sk_buff *skb)
{
	struct skb_shared_info *shinfo = skb_shinfo(skb);
	int i;

	WARN_ON(shinfo->nr_frags >= MAX_SKB_FRAGS);
	for (i = 0; i < shinfo->nr_frags; i++)
		WARN_ON(!shinfo->frags[i].page);
}

And call it from various points, to check who corrupts your skb.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index f8abf68..eb81f76 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -334,7 +334,7 @@  static void skb_release_data(struct sk_buff *skb)
 	if (!skb->cloned ||
 	    !atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
 			       &skb_shinfo(skb)->dataref)) {
-		if (skb_shinfo(skb)->nr_frags) {
+		if (skb_shinfo(skb)->nr_frags && skb_has_frags(skb)) {
 			int i;
 			for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
 				put_page(skb_shinfo(skb)->frags[i].page);