
NFS: Fix infinite loop in gss_create_upcall()

Message ID 4DA48EB0.40600@netapp.com
State Not Applicable, archived
Delegated to: David Miller

Commit Message

Bryan Schumaker April 12, 2011, 5:41 p.m. UTC
On 04/11/2011 05:08 PM, Jiri Slaby wrote:
> 
> Sorry for an extra message. I've just found out that there appears
> messages in dmesg:
> [   58.656048] RPC: AUTH_GSS upcall timed out.
> [   58.656050] Please check user daemon is running.
> [   88.656065] RPC: AUTH_GSS upcall timed out.
> [   88.656068] Please check user daemon is running.
> [  118.656077] RPC: AUTH_GSS upcall timed out.
> [  118.656080] Please check user daemon is running.
> [  148.656049] RPC: AUTH_GSS upcall timed out.
> [  148.656052] Please check user daemon is running.
> [  178.656046] RPC: AUTH_GSS upcall timed out.
> [  178.656049] Please check user daemon is running.
> 
> 
> I instrumented the code and it's stuck with trying RPC_AUTH_GSS_KRB5.
> 
> I don't use GSS at all.
> 
> regards,

Does this patch help?

- Bryan



There can be an infinite loop if gss_create_upcall() is called without
the userspace program running.  To prevent this, we return -EACCES if
we notice that pipe_version hasn't changed (indicating that the pipe
has not been opened).

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
--


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Jiri Slaby April 12, 2011, 6:05 p.m. UTC | #1
On 04/12/2011 07:41 PM, Bryan Schumaker wrote:
> On 04/11/2011 05:08 PM, Jiri Slaby wrote:
>>
>> Sorry for an extra message. I've just found out that there appears
>> messages in dmesg:
>> [   58.656048] RPC: AUTH_GSS upcall timed out.
>> [   58.656050] Please check user daemon is running.
>> [   88.656065] RPC: AUTH_GSS upcall timed out.
>> [   88.656068] Please check user daemon is running.
>> [  118.656077] RPC: AUTH_GSS upcall timed out.
>> [  118.656080] Please check user daemon is running.
>> [  148.656049] RPC: AUTH_GSS upcall timed out.
>> [  148.656052] Please check user daemon is running.
>> [  178.656046] RPC: AUTH_GSS upcall timed out.
>> [  178.656049] Please check user daemon is running.
>>
>>
>> I instrumented the code and it's stuck with trying RPC_AUTH_GSS_KRB5.
>>
>> I don't use GSS at all.
>>
>> regards,
> 
> Does this patch help?
> 
> - Bryan
> 
> 
> 
> There can be an infinite loop if gss_create_upcall() is called without
> the userspace program running.  To prevent this, we return -EACCES if
> we notice that pipe_version hasn't changed (indicating that the pipe
> has not been opened).

Yes, it fixes the problem. But it waits 15s before it times out. This is
unacceptable for automounted NFS dirs.

thanks,
Trond Myklebust April 12, 2011, 6:31 p.m. UTC | #2
On Tue, 2011-04-12 at 20:05 +0200, Jiri Slaby wrote:
> On 04/12/2011 07:41 PM, Bryan Schumaker wrote:
> > On 04/11/2011 05:08 PM, Jiri Slaby wrote:
> >>
> >> Sorry for an extra message. I've just found out that there appears
> >> messages in dmesg:
> >> [   58.656048] RPC: AUTH_GSS upcall timed out.
> >> [   58.656050] Please check user daemon is running.
> >> [   88.656065] RPC: AUTH_GSS upcall timed out.
> >> [   88.656068] Please check user daemon is running.
> >> [  118.656077] RPC: AUTH_GSS upcall timed out.
> >> [  118.656080] Please check user daemon is running.
> >> [  148.656049] RPC: AUTH_GSS upcall timed out.
> >> [  148.656052] Please check user daemon is running.
> >> [  178.656046] RPC: AUTH_GSS upcall timed out.
> >> [  178.656049] Please check user daemon is running.
> >>
> >>
> >> I instrumented the code and it's stuck with trying RPC_AUTH_GSS_KRB5.
> >>
> >> I don't use GSS at all.
> >>
> >> regards,
> > 
> > Does this patch help?
> > 
> > - Bryan
> > 
> > 
> > 
> > There can be an infinite loop if gss_create_upcall() is called without
> > the userspace program running.  To prevent this, we return -EACCES if
> > we notice that pipe_version hasn't changed (indicating that the pipe
> > has not been opened).
> 
> Yes, it fixes the problem. But it waits 15s before it times out. This is
> unacceptable for automounted NFS dirs.

I'm still confused as to why you are hitting it at all. In the normal
autonegotiation case, the client should be trying to use AUTH_SYS first
and then trying rpcsec_gss if and only if that fails.

Are you really exporting a filesystem using AUTH_NULL as the only
supported flavour?
Jiri Slaby April 12, 2011, 6:34 p.m. UTC | #3
On 04/12/2011 08:31 PM, Trond Myklebust wrote:
>> Yes, it fixes the problem. But it waits 15s before it times out. This is
>> unacceptable for automounted NFS dirs.
> 
> I'm still confused as to why you are hitting it at all. In the normal
> autonegotiation case, the client should be trying to use AUTH_SYS first
> and then trying rpcsec_gss if and only if that fails.
> 
> Are you really exporting a filesystem using AUTH_NULL as the only
> supported flavour?

I don't know, I connect to a nfs server which is not maintained by me.
It looks like that. How can I find out?

thanks,
Trond Myklebust April 12, 2011, 6:38 p.m. UTC | #4
On Tue, 2011-04-12 at 20:34 +0200, Jiri Slaby wrote:
> On 04/12/2011 08:31 PM, Trond Myklebust wrote:
> >> Yes, it fixes the problem. But it waits 15s before it times out. This is
> >> unacceptable for automounted NFS dirs.
> > 
> > I'm still confused as to why you are hitting it at all. In the normal
> > autonegotiation case, the client should be trying to use AUTH_SYS first
> > and then trying rpcsec_gss if and only if that fails.
> > 
> > Are you really exporting a filesystem using AUTH_NULL as the only
> > supported flavour?
> 
> I don't know, I connect to a nfs server which is not maintained by me.
> It looks like that. How can I find out?

A wireshark trace of a successful mount would help.
Bryan Schumaker April 12, 2011, 6:43 p.m. UTC | #5
On 04/12/2011 02:34 PM, Jiri Slaby wrote:
> On 04/12/2011 08:31 PM, Trond Myklebust wrote:
>>> Yes, it fixes the problem. But it waits 15s before it times out. This is
>>> unacceptable for automounted NFS dirs.
>>
>> I'm still confused as to why you are hitting it at all. In the normal
>> autonegotiation case, the client should be trying to use AUTH_SYS first
>> and then trying rpcsec_gss if and only if that fails.
>>
>> Are you really exporting a filesystem using AUTH_NULL as the only
>> supported flavour?
> 
> I don't know, I connect to a nfs server which is not maintained by me.
> It looks like that. How can I find out?

If you're not using gss for anything, you could try rmmod-ing rpcsec_gss_krb5 (and other rpcsec_gss_* modules).

- Bryan

> 
> thanks,

Jiri Slaby April 12, 2011, 6:52 p.m. UTC | #6
On 04/12/2011 08:43 PM, Bryan Schumaker wrote:
> On 04/12/2011 02:34 PM, Jiri Slaby wrote:
>> On 04/12/2011 08:31 PM, Trond Myklebust wrote:
>>>> Yes, it fixes the problem. But it waits 15s before it times out. This is
>>>> unacceptable for automounted NFS dirs.
>>>
>>> I'm still confused as to why you are hitting it at all. In the normal
>>> autonegotiation case, the client should be trying to use AUTH_SYS first
>>> and then trying rpcsec_gss if and only if that fails.
>>>
>>> Are you really exporting a filesystem using AUTH_NULL as the only
>>> supported flavour?
>>
>> I don't know, I connect to a nfs server which is not maintained by me.
>> It looks like that. How can I find out?
> 
> If you're not using gss for anything, you could try rmmod-ing rpcsec_gss_krb5 (and other rpcsec_gss_* modules).

I don't have NFS in modules. It's all built-in. And this one is
unconditionally selected because of CONFIG_NFS_V4.

regards,

Patch

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 9bf41ea..8a03ee0 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2224,8 +2224,9 @@  static int nfs4_proc_get_root(struct nfs_server *server, struct nfs_fh *fhandle,
 
 	for (i = 0; i < len; i++) {
 		status = nfs4_lookup_root_sec(server, fhandle, info, flav_array[i]);
-		if (status != -EPERM)
-			break;
+		if (status == -EPERM || status == -EACCES)
+			continue;
+		break;
 	}
 	if (status == 0)
 		status = nfs4_server_capabilities(server, fhandle);
diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
index f3914d0..339ba64 100644
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -520,7 +520,7 @@  gss_refresh_upcall(struct rpc_task *task)
 		warn_gssd();
 		task->tk_timeout = 15*HZ;
 		rpc_sleep_on(&pipe_version_rpc_waitqueue, task, NULL);
-		return 0;
+		return -EAGAIN;
 	}
 	if (IS_ERR(gss_msg)) {
 		err = PTR_ERR(gss_msg);
@@ -563,10 +563,12 @@  retry:
 	if (PTR_ERR(gss_msg) == -EAGAIN) {
 		err = wait_event_interruptible_timeout(pipe_version_waitqueue,
 				pipe_version >= 0, 15*HZ);
+		if (pipe_version < 0) {
+			warn_gssd();
+			err = -EACCES;
+		}
 		if (err)
 			goto out;
-		if (pipe_version < 0)
-			warn_gssd();
 		goto retry;
 	}
 	if (IS_ERR(gss_msg)) {
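
The control flow that the two hunks change can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: `enum flavor`, `fake_wait()`, `upcall_unpatched()`, `upcall_patched()`, `lookup_root_sec()` and `try_flavors()` are hypothetical stand-ins, the wait is modelled as always timing out (the daemon never opens the pipe), and the `-ELOOP` bail-out exists only so the demo of the old behaviour terminates.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flavour identifiers; the kernel uses RPC_AUTH_* constants. */
enum flavor { FLAV_GSS_KRB5, FLAV_SYS };

/* Stand-in for the kernel's pipe_version: negative until gssd opens the pipe. */
static int pipe_version = -1;

/* Stand-in for wait_event_interruptible_timeout(); 0 models "timed out". */
static int fake_wait(void)
{
	return 0;
}

/* Old behaviour: a timeout yields err == 0, so the loop simply retries. */
static int upcall_unpatched(int max_spins)
{
	int spins = 0;

	while (pipe_version < 0) {
		int err = fake_wait();

		if (err)
			return err;
		if (++spins >= max_spins)
			return -ELOOP;	/* demo-only bound, not in the kernel */
	}
	return 0;
}

/* Patched behaviour: a still-closed pipe after the wait becomes -EACCES. */
static int upcall_patched(void)
{
	while (pipe_version < 0) {
		int err = fake_wait();

		if (pipe_version < 0)
			err = -EACCES;
		if (err)
			return err;
	}
	return 0;
}

/* Hypothetical per-flavour lookup: only the GSS flavour needs the upcall. */
static int lookup_root_sec(enum flavor f)
{
	return f == FLAV_GSS_KRB5 ? upcall_patched() : 0;
}

/* Mirrors the nfs4_proc_get_root() hunk: skip on -EPERM or -EACCES. */
static int try_flavors(const enum flavor *flavs, size_t len, size_t *chosen)
{
	int status = -EPERM;

	assert(flavs != NULL && chosen != NULL);
	for (size_t i = 0; i < len; i++) {
		status = lookup_root_sec(flavs[i]);
		if (status == -EPERM || status == -EACCES)
			continue;
		*chosen = i;
		break;
	}
	return status;
}
```

With `pipe_version` left negative, `upcall_unpatched()` only stops at the artificial bound, while `upcall_patched()` fails with `-EACCES` after one wait, and `try_flavors()` moves on to the next flavour instead of stopping. The sketch does not model the 15 second delay per wait that Jiri reports.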