Asterisk deadlocks since Kernel 4.1

Message ID 5664A102.2030602@profihost.ag
State Not Applicable, archived
Delegated to: David Miller

Commit Message

Stefan Priebe - Profihost AG Dec. 6, 2015, 8:56 p.m. UTC
Hi Herbert,

I think I found the issue with netlink in 4.1. Somebody made a mistake
while backporting or cherry-picking your patch "netlink: Fix autobind
race condition that leads to zero port ID" to 4.1.

The 4.1 version is missing a goto in the error path of netlink_insert().
The diff in the Patch section below shows the line that 4.1 lacks.

Can you please confirm that this is incorrect and might cause those
issues?

Stefan

On 05.12.2015 at 02:08, Herbert Xu wrote:
> On Fri, Dec 04, 2015 at 07:26:12PM +0100, Stefan Priebe wrote:
>>
>> * 9f87e0c - (2 months ago) netlink: Replace rhash_portid with bound
>> - Herbert Xu
>> * 35e9890 - (3 months ago) netlink: Fix autobind race condition that
>> leads to zero port ID - Herbert Xu
>> * 30c6472 - (7 months ago) netlink: Use random autobind rover - Herbert Xu
>
> These three patches are absolutely required in any kernel where the
> netlink insertion is lockless.  So yes they should be applied to
> 4.1.
>
> Thanks,
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Herbert Xu Dec. 7, 2015, 1:20 a.m. UTC | #1
On Sun, Dec 06, 2015 at 09:56:34PM +0100, Stefan Priebe wrote:
> Hi Herbert,
> 
> i think i found the issue in 4.1 with netlink. Somebody made a
> mistake while backporting or cherry-picking your patch "netlink: Fix
> autobind race condition that leads to zero port ID" to 4.1.
> 
> It misses a goto in 4.1.
> 
> This goto is missing in 4.1:
> 
> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> index 4017e12..f15c001 100644
> --- a/net/netlink/af_netlink.c
> +++ b/net/netlink/af_netlink.c
> @@ -1132,7 +1132,6 @@ static int netlink_insert(struct sock *sk, u32 portid)
>                 if (err == -EEXIST)
>                         err = -EADDRINUSE;
>                 sock_put(sk);
> -               goto err;
>         }
> 
>         /* We need to ensure that the socket is hashed and visible. */
> 
> Can you please confirm, that this is not correct and might cause
> those issues.

Well spotted! Yes this would be a fatal error and can cause the
problems you guys are seeing.

Thanks,

Stefan Priebe - Profihost AG Dec. 7, 2015, 6:58 a.m. UTC | #2
Hi Herbert,

On 07.12.2015 at 02:20, Herbert Xu wrote:
> On Sun, Dec 06, 2015 at 09:56:34PM +0100, Stefan Priebe wrote:
>> Hi Herbert,
>>
>> i think i found the issue in 4.1 with netlink. Somebody made a
>> mistake while backporting or cherry-picking your patch "netlink: Fix
>> autobind race condition that leads to zero port ID" to 4.1.
>>
>> It misses a goto in 4.1.
>>
>> This goto is missing in 4.1:
>>
>> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
>> index 4017e12..f15c001 100644
>> --- a/net/netlink/af_netlink.c
>> +++ b/net/netlink/af_netlink.c
>> @@ -1132,7 +1132,6 @@ static int netlink_insert(struct sock *sk, u32 portid)
>>                 if (err == -EEXIST)
>>                         err = -EADDRINUSE;
>>                 sock_put(sk);
>> -               goto err;
>>         }
>>
>>         /* We need to ensure that the socket is hashed and visible. */
>>
>> Can you please confirm, that this is not correct and might cause
>> those issues.
> 
> Well spotted! Yes this would be a fatal error and can cause the
> problems you guys are seeing.

Thanks, good. Can you help me get this fix into the stable trees?

Stefan

> 
> Thanks,
> 
Philipp Hahn Dec. 7, 2015, 7:41 a.m. UTC | #3
Hello Stefan,

On 06.12.2015 at 21:56, Stefan Priebe wrote:
> i think i found the issue in 4.1 with netlink. Somebody made a mistake
> while backporting or cherry-picking your patch "netlink: Fix autobind
> race condition that leads to zero port ID" to 4.1.
> 
> It misses a goto in 4.1.
> 
> This goto is missing in 4.1:
> 
> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> index 4017e12..f15c001 100644
> --- a/net/netlink/af_netlink.c
> +++ b/net/netlink/af_netlink.c
> @@ -1132,7 +1132,6 @@ static int netlink_insert(struct sock *sk, u32 portid)
>                 if (err == -EEXIST)
>                         err = -EADDRINUSE;
>                 sock_put(sk);
> -               goto err;
>         }
> 
>         /* We need to ensure that the socket is hashed and visible. */
> 
> Can you please confirm, that this is not correct and might cause those
> issues.

I just tested that patch and it seems to fix our hang. Thank you for
your good work.

Philipp

PS: I guess I can skip testing your other test request as this simple
patch is part of your other hiuq4bsW patch. If I should still test it,
just send a note.

Patch

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 4017e12..f15c001 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1132,7 +1132,6 @@ static int netlink_insert(struct sock *sk, u32 portid)
                if (err == -EEXIST)
                        err = -EADDRINUSE;
                sock_put(sk);
-               goto err;
        }

        /* We need to ensure that the socket is hashed and visible. */
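
To illustrate why the dropped goto matters, here is a minimal userspace
sketch of the control flow in the hunk above. It is not kernel code:
struct toy_sock, toy_hash_insert(), toy_insert() and the with_goto flag
are hypothetical names invented for this example, and the refcount and
"bound" fields only loosely model sock_hold()/sock_put() and the state
set after the "hashed and visible" comment. The assumption is that the
code following that comment marks the socket as bound.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a netlink socket; NOT the kernel's struct sock. */
struct toy_sock {
	int refcnt;          /* loosely models sock_hold()/sock_put() */
	unsigned int portid;
	unsigned int bound;  /* non-zero: treated as hashed and visible */
};

/* Hypothetical hash insert that always reports a duplicate port ID. */
static int toy_hash_insert(struct toy_sock *sk)
{
	(void)sk;
	return -EEXIST;
}

/*
 * Models only the error path from the hunk. with_goto == true
 * corresponds to the code that has the goto; false corresponds to
 * the 4.1 backport that lacks it.
 */
int toy_insert(struct toy_sock *sk, unsigned int portid, bool with_goto)
{
	int err;

	sk->portid = portid;
	sk->refcnt++;                   /* reference taken before insert */

	err = toy_hash_insert(sk);
	if (err) {
		if (err == -EEXIST)
			err = -EADDRINUSE;
		sk->refcnt--;           /* reference dropped on failure */
		if (with_goto)
			goto err;       /* the line missing in 4.1 */
	}

	/* Without the goto, a failed insert falls through to here. */
	sk->bound = portid;

err:
	return err;
}
```

Calling toy_insert() on a fresh socket with with_goto set to false
returns -EADDRINUSE but still leaves the bound field set to the
requested port ID, so a socket that was never inserted is treated as
bound. With with_goto set to true the same call returns -EADDRINUSE and
leaves bound at zero.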