diff mbox

SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict()

Message ID BANLkTikeWzuE-384uT6RhZR6Wn=DBK+CNQ@mail.gmail.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Jacek Luczak May 18, 2011, 7:01 a.m. UTC
During the sctp_close() call, we do not use rcu primitives to
destroy the address list attached to the endpoint.  At the same
time, we do the removal of addresses from this list before
attempting to remove the socket from the port hash

As a result, it is possible for another process to find the socket
in the port hash that is in the process of being closed.  It then
proceeds to traverse the address list to find the conflict, only
to have that address list suddenly disappear without rcu() critical
section.

This can result in a kernel crash with general protection fault or
kernel NULL pointer dereference.

Fix issue by closing address list removal inside RCU critical
section.

Signed-off-by: Jacek Luczak <luczak.jacek@gmail.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>

---
 bind_addr.c |   12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

Comments

Eric Dumazet May 18, 2011, 7:48 a.m. UTC | #1
Le mercredi 18 mai 2011 à 09:01 +0200, Jacek Luczak a écrit :
> During the sctp_close() call, we do not use rcu primitives to
> destroy the address list attached to the endpoint.  At the same
> time, we do the removal of addresses from this list before
> attempting to remove the socket from the port hash
> 
> As a result, it is possible for another process to find the socket
> in the port hash that is in the process of being closed.  It then
> proceeds to traverse the address list to find the conflict, only
> to have that address list suddenly disappear without rcu() critical
> section.
> 
> This can result in a kernel crash with general protection fault or
> kernel NULL pointer dereference.
> 
> Fix issue by closing address list removal inside RCU critical
> section.
> 
> Signed-off-by: Jacek Luczak <luczak.jacek@gmail.com>
> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
> 
> ---
>  bind_addr.c |   12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
> index faf71d1..19d1329 100644
> --- a/net/sctp/bind_addr.c
> +++ b/net/sctp/bind_addr.c
> @@ -155,8 +155,16 @@ static void sctp_bind_addr_clean(struct sctp_bind_addr *bp)
>  /* Dispose of an SCTP_bind_addr structure  */
>  void sctp_bind_addr_free(struct sctp_bind_addr *bp)
>  {
> -       /* Empty the bind address list. */
> -       sctp_bind_addr_clean(bp);
> +       struct sctp_sockaddr_entry *addr;
> +
> +       /* Empty the bind address list inside RCU section. */
> +       rcu_read_lock();
> +       list_for_each_entry_rcu(addr, &bp->address_list, list) {
> +               list_del_rcu(&addr->list);
> +               call_rcu(&addr->rcu, sctp_local_addr_free);
> +               SCTP_DBG_OBJCNT_DEC(addr);
> +       }
> +       rcu_read_unlock();
> 

Sorry this looks odd.

If you're removing items from this list, you must be a writer here, with
exclusive access. So rcu_read_lock()/rcu_read_unlock() is not necessary.

Therefore, I guess following code is better :

list_for_each_entry(addr, &bp->address_list, list) {
	list_del_rcu(&addr->list);
	call_rcu(&addr->rcu, sctp_local_addr_free);
	SCTP_DBG_OBJCNT_DEC(addr);
}

Then, why dont you fix sctp_bind_addr_clean() instead ?

if 'struct sctp_sockaddr_entry' is recu protected, then all frees should
be protected as well.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jacek Luczak May 18, 2011, 8:06 a.m. UTC | #2
2011/5/18 Eric Dumazet <eric.dumazet@gmail.com>:
> Le mercredi 18 mai 2011 à 09:01 +0200, Jacek Luczak a écrit :
>> During the sctp_close() call, we do not use rcu primitives to
>> destroy the address list attached to the endpoint.  At the same
>> time, we do the removal of addresses from this list before
>> attempting to remove the socket from the port hash
>>
>> As a result, it is possible for another process to find the socket
>> in the port hash that is in the process of being closed.  It then
>> proceeds to traverse the address list to find the conflict, only
>> to have that address list suddenly disappear without rcu() critical
>> section.
>>
>> This can result in a kernel crash with general protection fault or
>> kernel NULL pointer dereference.
>>
>> Fix issue by closing address list removal inside RCU critical
>> section.
>>
>> Signed-off-by: Jacek Luczak <luczak.jacek@gmail.com>
>> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
>>
>> ---
>>  bind_addr.c |   12 ++++++++++--
>>  1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
>> index faf71d1..19d1329 100644
>> --- a/net/sctp/bind_addr.c
>> +++ b/net/sctp/bind_addr.c
>> @@ -155,8 +155,16 @@ static void sctp_bind_addr_clean(struct sctp_bind_addr *bp)
>>  /* Dispose of an SCTP_bind_addr structure  */
>>  void sctp_bind_addr_free(struct sctp_bind_addr *bp)
>>  {
>> -       /* Empty the bind address list. */
>> -       sctp_bind_addr_clean(bp);
>> +       struct sctp_sockaddr_entry *addr;
>> +
>> +       /* Empty the bind address list inside RCU section. */
>> +       rcu_read_lock();
>> +       list_for_each_entry_rcu(addr, &bp->address_list, list) {
>> +               list_del_rcu(&addr->list);
>> +               call_rcu(&addr->rcu, sctp_local_addr_free);
>> +               SCTP_DBG_OBJCNT_DEC(addr);
>> +       }
>> +       rcu_read_unlock();
>>
>
> Sorry this looks odd.
>
> If you're removing items from this list, you must be a writer here, with
> exclusive access. So rcu_read_lock()/rcu_read_unlock() is not necessary.

I could agree to some extend ... but strict RCU section IMO is needed here.
I can check this if the issue exists.

> Therefore, I guess following code is better :
>
> list_for_each_entry(addr, &bp->address_list, list) {
>        list_del_rcu(&addr->list);
>        call_rcu(&addr->rcu, sctp_local_addr_free);
>        SCTP_DBG_OBJCNT_DEC(addr);
> }
>
> Then, why dont you fix sctp_bind_addr_clean() instead ?
>
> if 'struct sctp_sockaddr_entry' is recu protected, then all frees should
> be protected as well.

The _clean() as claimed by Vlad is called many times from various places
in code and this could give a overhead. I guess Vlad would need to comment.

-Jacek
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet May 18, 2011, 8:29 a.m. UTC | #3
Le mercredi 18 mai 2011 à 10:06 +0200, Jacek Luczak a écrit :
> 2011/5/18 Eric Dumazet <eric.dumazet@gmail.com>:

> > If you're removing items from this list, you must be a writer here, with
> > exclusive access. So rcu_read_lock()/rcu_read_unlock() is not necessary.
> 
> I could agree to some extend ... but strict RCU section IMO is needed here.
> I can check this if the issue exists.
> 

I can tell you for sure rcu_read_lock() is not needed here. Its only
showing confusion from code's author.

Please read Documentation/RCU/listRCU.txt for concise explanations,
line 117.


> > Therefore, I guess following code is better :
> >
> > list_for_each_entry(addr, &bp->address_list, list) {
> >        list_del_rcu(&addr->list);
> >        call_rcu(&addr->rcu, sctp_local_addr_free);
> >        SCTP_DBG_OBJCNT_DEC(addr);
> > }
> >
> > Then, why dont you fix sctp_bind_addr_clean() instead ?
> >
> > if 'struct sctp_sockaddr_entry' is recu protected, then all frees should
> > be protected as well.
> 
> The _clean() as claimed by Vlad is called many times from various places
> in code and this could give a overhead. I guess Vlad would need to comment.

I guess a full review of this code is needed. You'll have to prove
sctp_bind_addr_clean() is always called after one RCU grace period if
you want to leave it as is.

You cant get RCU for free, some rules must be followed or you risk
crashes.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jacek Luczak May 18, 2011, 12:06 p.m. UTC | #4
2011/5/18 Eric Dumazet <eric.dumazet@gmail.com>:
> Le mercredi 18 mai 2011 à 10:06 +0200, Jacek Luczak a écrit :
>> 2011/5/18 Eric Dumazet <eric.dumazet@gmail.com>:
>
>> > If you're removing items from this list, you must be a writer here, with
>> > exclusive access. So rcu_read_lock()/rcu_read_unlock() is not necessary.
>>
>> I could agree to some extend ... but strict RCU section IMO is needed here.
>> I can check this if the issue exists.
>>
>
> I can tell you for sure rcu_read_lock() is not needed here. Its only
> showing confusion from code's author.
>
> Please read Documentation/RCU/listRCU.txt for concise explanations,
> line 117.
>
>
>> > Therefore, I guess following code is better :
>> >
>> > list_for_each_entry(addr, &bp->address_list, list) {
>> >        list_del_rcu(&addr->list);
>> >        call_rcu(&addr->rcu, sctp_local_addr_free);
>> >        SCTP_DBG_OBJCNT_DEC(addr);
>> > }
>> >
>> > Then, why dont you fix sctp_bind_addr_clean() instead ?
>> >
>> > if 'struct sctp_sockaddr_entry' is recu protected, then all frees should
>> > be protected as well.
>>
>> The _clean() as claimed by Vlad is called many times from various places
>> in code and this could give a overhead. I guess Vlad would need to comment.
>
> I guess a full review of this code is needed. You'll have to prove
> sctp_bind_addr_clean() is always called after one RCU grace period if
> you want to leave it as is.

Eric this is actually good point which I think did not found at first
glance. As the original
issue occurs between _clean() and _conflict() then if the former is
used here and
there we can hit the grace period not only in that case. By that then
_clean() should
be ,,fixed''. Right?

-Jacek
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
index faf71d1..19d1329 100644
--- a/net/sctp/bind_addr.c
+++ b/net/sctp/bind_addr.c
@@ -155,8 +155,16 @@  static void sctp_bind_addr_clean(struct sctp_bind_addr *bp)
 /* Dispose of an SCTP_bind_addr structure  */
 void sctp_bind_addr_free(struct sctp_bind_addr *bp)
 {
-       /* Empty the bind address list. */
-       sctp_bind_addr_clean(bp);
+       struct sctp_sockaddr_entry *addr;
+
+       /* Empty the bind address list inside RCU section. */
+       rcu_read_lock();
+       list_for_each_entry_rcu(addr, &bp->address_list, list) {
+               list_del_rcu(&addr->list);
+               call_rcu(&addr->rcu, sctp_local_addr_free);
+               SCTP_DBG_OBJCNT_DEC(addr);
+       }
+       rcu_read_unlock();

        if (bp->malloced) {
                kfree(bp);