Message ID | 20200224140644.27330-1-ioanna-maria.alifieraki@canonical.com |
---|---|
State | New |
Headers | show |
Series | [SRU,X/B/E/F] Revert "ipc, sem: remove uneeded sem_undo_list lock usage in exit_sem()" | expand |
On 24/02/2020 14:06, Ioanna Alifieraki wrote: > BugLink: https://bugs.launchpad.net/bugs/1858834 > > This reverts commit a97955844807e327df11aa33869009d14d6b7de0. > > Commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage > in exit_sem()") removes a lock that is needed. This leads to a process > looping infinitely in exit_sem() and can also lead to a crash. There is > a reproducer available in [1] and with the commit reverted the issue > does not reproduce anymore. > > Using the reproducer found in [1] is fairly easy to reach a point where > one of the child processes is looping infinitely in exit_sem between > for(;;) and if (semid == -1) block, while it's trying to free its last > sem_undo structure which has already been freed by freeary(). > > Each sem_undo struct is on two lists: one per semaphore set (list_id) > and one per process (list_proc). The list_id list tracks undos by > semaphore set, and the list_proc by process. > > Undo structures are removed either by freeary() or by exit_sem(). The > freeary function is invoked when the user invokes a syscall to remove a > semaphore set. During this operation freeary() traverses the list_id > associated with the semaphore set and removes the undo structures from > both the list_id and list_proc lists. > > For this case, exit_sem() is called at process exit. Each process > contains a struct sem_undo_list (referred to as "ulp") which contains > the head for the list_proc list. When the process exits, exit_sem() > traverses this list to remove each sem_undo struct. As in freeary(), > whenever a sem_undo struct is removed from list_proc, it is also removed > from the list_id list. > > Removing elements from list_id is safe for both exit_sem() and freeary() > due to sem_lock(). Removing elements from list_proc is not safe; > freeary() locks &un->ulp->lock when it performs > list_del_rcu(&un->list_proc) but exit_sem() does not (locking was > removed by commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list > lock usage in exit_sem()"). > > This can result in the following situation while executing the > reproducer [1] : Consider a child process in exit_sem() and the parent > in freeary() (because of semctl(sid[i], NSEM, IPC_RMID)). > > - The list_proc for the child contains the last two undo structs A and > B (the rest have been removed either by exit_sem() or freeary()). > > - The semid for A is 1 and semid for B is 2. > > - exit_sem() removes A and at the same time freeary() removes B. > > - Since A and B have different semid sem_lock() will acquire different > locks for each process and both can proceed. > > The bug is that they remove A and B from the same list_proc at the same > time because only freeary() acquires the ulp lock. When exit_sem() > removes A it makes ulp->list_proc.next to point at B and at the same > time freeary() removes B setting B->semid=-1. > > At the next iteration of for(;;) loop exit_sem() will try to remove B. > > The only way to break from for(;;) is for (&un->list_proc == > &ulp->list_proc) to be true which is not. Then exit_sem() will check if > B->semid=-1 which is and will continue looping in for(;;) until the > memory for B is reallocated and the value at B->semid is changed. > > At that point, exit_sem() will crash attempting to unlink B from the > lists (this can be easily triggered by running the reproducer [1] a > second time). > > To prove this scenario instrumentation was added to keep information > about each sem_undo (un) struct that is removed per process and per > semaphore set (sma). > > CPU0 CPU1 > [caller holds sem_lock(sma for A)] ... > freeary() exit_sem() > ... ... > ... sem_lock(sma for B) > spin_lock(A->ulp->lock) ... > list_del_rcu(un_A->list_proc) list_del_rcu(un_B->list_proc) > > Undo structures A and B have different semid and sem_lock() operations > proceed. However they belong to the same list_proc list and they are > removed at the same time. This results into ulp->list_proc.next > pointing to the address of B which is already removed. > > After reverting commit a97955844807 ("ipc,sem: remove uneeded > sem_undo_list lock usage in exit_sem()") the issue was no longer > reproducible. > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1694779 > > Link: http://lkml.kernel.org/r/20191211191318.11860-1-ioanna-maria.alifieraki@canonical.com > Fixes: a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()") > Signed-off-by: Ioanna Alifieraki <ioanna-maria.alifieraki@canonical.com> > Acked-by: Manfred Spraul <manfred@colorfullife.com> > Acked-by: Herton R. Krzesinski <herton@redhat.com> > Cc: Arnd Bergmann <arnd@arndb.de> > Cc: Catalin Marinas <catalin.marinas@arm.com> > Cc: <malat@debian.org> > Cc: Joel Fernandes (Google) <joel@joelfernandes.org> > Cc: Davidlohr Bueso <dave@stgolabs.net> > Cc: Jay Vosburgh <jay.vosburgh@canonical.com> > Cc: <stable@vger.kernel.org> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org> > Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> > (cherry picked from commit edf28f4061afe4c2d9eb1c3323d90e882c1d6800) > --- > ipc/sem.c | 6 ++---- > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/ipc/sem.c b/ipc/sem.c > index ec97a7072413..fe12ea8dd2b3 100644 > --- a/ipc/sem.c > +++ b/ipc/sem.c > @@ -2368,11 +2368,9 @@ void exit_sem(struct task_struct *tsk) > ipc_assert_locked_object(&sma->sem_perm); > list_del(&un->list_id); > > - /* we are the last process using this ulp, acquiring ulp->lock > - * isn't required. Besides that, we are also protected against > - * IPC_RMID as we hold sma->sem_perm lock now > - */ > + spin_lock(&ulp->lock); > list_del_rcu(&un->list_proc); > + spin_unlock(&ulp->lock); > > /* perform adjustments registered in un */ > for (i = 0; i < sma->sem_nsems; i++) { > Seems very reasonable to me. Thanks Ioanna. Acked-by: Colin Ian King <colin.king@canonical.com>
On 2020-02-24 14:06:44 , Ioanna Alifieraki wrote: > BugLink: https://bugs.launchpad.net/bugs/1858834 > > This reverts commit a97955844807e327df11aa33869009d14d6b7de0. > > Commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage > in exit_sem()") removes a lock that is needed. This leads to a process > looping infinitely in exit_sem() and can also lead to a crash. There is > a reproducer available in [1] and with the commit reverted the issue > does not reproduce anymore. > > Using the reproducer found in [1] is fairly easy to reach a point where > one of the child processes is looping infinitely in exit_sem between > for(;;) and if (semid == -1) block, while it's trying to free its last > sem_undo structure which has already been freed by freeary(). > > Each sem_undo struct is on two lists: one per semaphore set (list_id) > and one per process (list_proc). The list_id list tracks undos by > semaphore set, and the list_proc by process. > > Undo structures are removed either by freeary() or by exit_sem(). The > freeary function is invoked when the user invokes a syscall to remove a > semaphore set. During this operation freeary() traverses the list_id > associated with the semaphore set and removes the undo structures from > both the list_id and list_proc lists. > > For this case, exit_sem() is called at process exit. Each process > contains a struct sem_undo_list (referred to as "ulp") which contains > the head for the list_proc list. When the process exits, exit_sem() > traverses this list to remove each sem_undo struct. As in freeary(), > whenever a sem_undo struct is removed from list_proc, it is also removed > from the list_id list. > > Removing elements from list_id is safe for both exit_sem() and freeary() > due to sem_lock(). Removing elements from list_proc is not safe; > freeary() locks &un->ulp->lock when it performs > list_del_rcu(&un->list_proc) but exit_sem() does not (locking was > removed by commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list > lock usage in exit_sem()"). > > This can result in the following situation while executing the > reproducer [1] : Consider a child process in exit_sem() and the parent > in freeary() (because of semctl(sid[i], NSEM, IPC_RMID)). > > - The list_proc for the child contains the last two undo structs A and > B (the rest have been removed either by exit_sem() or freeary()). > > - The semid for A is 1 and semid for B is 2. > > - exit_sem() removes A and at the same time freeary() removes B. > > - Since A and B have different semid sem_lock() will acquire different > locks for each process and both can proceed. > > The bug is that they remove A and B from the same list_proc at the same > time because only freeary() acquires the ulp lock. When exit_sem() > removes A it makes ulp->list_proc.next to point at B and at the same > time freeary() removes B setting B->semid=-1. > > At the next iteration of for(;;) loop exit_sem() will try to remove B. > > The only way to break from for(;;) is for (&un->list_proc == > &ulp->list_proc) to be true which is not. Then exit_sem() will check if > B->semid=-1 which is and will continue looping in for(;;) until the > memory for B is reallocated and the value at B->semid is changed. > > At that point, exit_sem() will crash attempting to unlink B from the > lists (this can be easily triggered by running the reproducer [1] a > second time). > > To prove this scenario instrumentation was added to keep information > about each sem_undo (un) struct that is removed per process and per > semaphore set (sma). > > CPU0 CPU1 > [caller holds sem_lock(sma for A)] ... > freeary() exit_sem() > ... ... > ... sem_lock(sma for B) > spin_lock(A->ulp->lock) ... > list_del_rcu(un_A->list_proc) list_del_rcu(un_B->list_proc) > > Undo structures A and B have different semid and sem_lock() operations > proceed. However they belong to the same list_proc list and they are > removed at the same time. This results into ulp->list_proc.next > pointing to the address of B which is already removed. > > After reverting commit a97955844807 ("ipc,sem: remove uneeded > sem_undo_list lock usage in exit_sem()") the issue was no longer > reproducible. > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1694779 > > Link: http://lkml.kernel.org/r/20191211191318.11860-1-ioanna-maria.alifieraki@canonical.com > Fixes: a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()") > Signed-off-by: Ioanna Alifieraki <ioanna-maria.alifieraki@canonical.com> > Acked-by: Manfred Spraul <manfred@colorfullife.com> > Acked-by: Herton R. Krzesinski <herton@redhat.com> > Cc: Arnd Bergmann <arnd@arndb.de> > Cc: Catalin Marinas <catalin.marinas@arm.com> > Cc: <malat@debian.org> > Cc: Joel Fernandes (Google) <joel@joelfernandes.org> > Cc: Davidlohr Bueso <dave@stgolabs.net> > Cc: Jay Vosburgh <jay.vosburgh@canonical.com> > Cc: <stable@vger.kernel.org> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org> > Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> > (cherry picked from commit edf28f4061afe4c2d9eb1c3323d90e882c1d6800) > --- > ipc/sem.c | 6 ++---- > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/ipc/sem.c b/ipc/sem.c > index ec97a7072413..fe12ea8dd2b3 100644 > --- a/ipc/sem.c > +++ b/ipc/sem.c > @@ -2368,11 +2368,9 @@ void exit_sem(struct task_struct *tsk) > ipc_assert_locked_object(&sma->sem_perm); > list_del(&un->list_id); > > - /* we are the last process using this ulp, acquiring ulp->lock > - * isn't required. Besides that, we are also protected against > - * IPC_RMID as we hold sma->sem_perm lock now > - */ > + spin_lock(&ulp->lock); > list_del_rcu(&un->list_proc); > + spin_unlock(&ulp->lock); > > /* perform adjustments registered in un */ > for (i = 0; i < sma->sem_nsems; i++) { I think this is needed in the disco kernel as well. Looks good: Acked-by: Khalid Elmously <khalid.elmously@canonical.com>
On Mon, Feb 24, 2020 at 02:06:44PM +0000, Ioanna Alifieraki wrote: > BugLink: https://bugs.launchpad.net/bugs/1858834 > > This reverts commit a97955844807e327df11aa33869009d14d6b7de0. > > Commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage > in exit_sem()") removes a lock that is needed. This leads to a process > looping infinitely in exit_sem() and can also lead to a crash. There is > a reproducer available in [1] and with the commit reverted the issue > does not reproduce anymore. > > Using the reproducer found in [1] is fairly easy to reach a point where > one of the child processes is looping infinitely in exit_sem between > for(;;) and if (semid == -1) block, while it's trying to free its last > sem_undo structure which has already been freed by freeary(). > > Each sem_undo struct is on two lists: one per semaphore set (list_id) > and one per process (list_proc). The list_id list tracks undos by > semaphore set, and the list_proc by process. > > Undo structures are removed either by freeary() or by exit_sem(). The > freeary function is invoked when the user invokes a syscall to remove a > semaphore set. During this operation freeary() traverses the list_id > associated with the semaphore set and removes the undo structures from > both the list_id and list_proc lists. > > For this case, exit_sem() is called at process exit. Each process > contains a struct sem_undo_list (referred to as "ulp") which contains > the head for the list_proc list. When the process exits, exit_sem() > traverses this list to remove each sem_undo struct. As in freeary(), > whenever a sem_undo struct is removed from list_proc, it is also removed > from the list_id list. > > Removing elements from list_id is safe for both exit_sem() and freeary() > due to sem_lock(). Removing elements from list_proc is not safe; > freeary() locks &un->ulp->lock when it performs > list_del_rcu(&un->list_proc) but exit_sem() does not (locking was > removed by commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list > lock usage in exit_sem()"). > > This can result in the following situation while executing the > reproducer [1] : Consider a child process in exit_sem() and the parent > in freeary() (because of semctl(sid[i], NSEM, IPC_RMID)). > > - The list_proc for the child contains the last two undo structs A and > B (the rest have been removed either by exit_sem() or freeary()). > > - The semid for A is 1 and semid for B is 2. > > - exit_sem() removes A and at the same time freeary() removes B. > > - Since A and B have different semid sem_lock() will acquire different > locks for each process and both can proceed. > > The bug is that they remove A and B from the same list_proc at the same > time because only freeary() acquires the ulp lock. When exit_sem() > removes A it makes ulp->list_proc.next to point at B and at the same > time freeary() removes B setting B->semid=-1. > > At the next iteration of for(;;) loop exit_sem() will try to remove B. > > The only way to break from for(;;) is for (&un->list_proc == > &ulp->list_proc) to be true which is not. Then exit_sem() will check if > B->semid=-1 which is and will continue looping in for(;;) until the > memory for B is reallocated and the value at B->semid is changed. > > At that point, exit_sem() will crash attempting to unlink B from the > lists (this can be easily triggered by running the reproducer [1] a > second time). > > To prove this scenario instrumentation was added to keep information > about each sem_undo (un) struct that is removed per process and per > semaphore set (sma). > > CPU0 CPU1 > [caller holds sem_lock(sma for A)] ... > freeary() exit_sem() > ... ... > ... sem_lock(sma for B) > spin_lock(A->ulp->lock) ... > list_del_rcu(un_A->list_proc) list_del_rcu(un_B->list_proc) > > Undo structures A and B have different semid and sem_lock() operations > proceed. However they belong to the same list_proc list and they are > removed at the same time. This results into ulp->list_proc.next > pointing to the address of B which is already removed. > > After reverting commit a97955844807 ("ipc,sem: remove uneeded > sem_undo_list lock usage in exit_sem()") the issue was no longer > reproducible. > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1694779 > > Link: http://lkml.kernel.org/r/20191211191318.11860-1-ioanna-maria.alifieraki@canonical.com > Fixes: a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()") > Signed-off-by: Ioanna Alifieraki <ioanna-maria.alifieraki@canonical.com> > Acked-by: Manfred Spraul <manfred@colorfullife.com> > Acked-by: Herton R. Krzesinski <herton@redhat.com> > Cc: Arnd Bergmann <arnd@arndb.de> > Cc: Catalin Marinas <catalin.marinas@arm.com> > Cc: <malat@debian.org> > Cc: Joel Fernandes (Google) <joel@joelfernandes.org> > Cc: Davidlohr Bueso <dave@stgolabs.net> > Cc: Jay Vosburgh <jay.vosburgh@canonical.com> > Cc: <stable@vger.kernel.org> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org> > Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> > (cherry picked from commit edf28f4061afe4c2d9eb1c3323d90e882c1d6800) Applied to focal/master-next, thanks!
On 2020-02-24 14:06:44 , Ioanna Alifieraki wrote: > BugLink: https://bugs.launchpad.net/bugs/1858834 > > This reverts commit a97955844807e327df11aa33869009d14d6b7de0. > > Commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage > in exit_sem()") removes a lock that is needed. This leads to a process > looping infinitely in exit_sem() and can also lead to a crash. There is > a reproducer available in [1] and with the commit reverted the issue > does not reproduce anymore. > > Using the reproducer found in [1] is fairly easy to reach a point where > one of the child processes is looping infinitely in exit_sem between > for(;;) and if (semid == -1) block, while it's trying to free its last > sem_undo structure which has already been freed by freeary(). > > Each sem_undo struct is on two lists: one per semaphore set (list_id) > and one per process (list_proc). The list_id list tracks undos by > semaphore set, and the list_proc by process. > > Undo structures are removed either by freeary() or by exit_sem(). The > freeary function is invoked when the user invokes a syscall to remove a > semaphore set. During this operation freeary() traverses the list_id > associated with the semaphore set and removes the undo structures from > both the list_id and list_proc lists. > > For this case, exit_sem() is called at process exit. Each process > contains a struct sem_undo_list (referred to as "ulp") which contains > the head for the list_proc list. When the process exits, exit_sem() > traverses this list to remove each sem_undo struct. As in freeary(), > whenever a sem_undo struct is removed from list_proc, it is also removed > from the list_id list. > > Removing elements from list_id is safe for both exit_sem() and freeary() > due to sem_lock(). Removing elements from list_proc is not safe; > freeary() locks &un->ulp->lock when it performs > list_del_rcu(&un->list_proc) but exit_sem() does not (locking was > removed by commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list > lock usage in exit_sem()"). > > This can result in the following situation while executing the > reproducer [1] : Consider a child process in exit_sem() and the parent > in freeary() (because of semctl(sid[i], NSEM, IPC_RMID)). > > - The list_proc for the child contains the last two undo structs A and > B (the rest have been removed either by exit_sem() or freeary()). > > - The semid for A is 1 and semid for B is 2. > > - exit_sem() removes A and at the same time freeary() removes B. > > - Since A and B have different semid sem_lock() will acquire different > locks for each process and both can proceed. > > The bug is that they remove A and B from the same list_proc at the same > time because only freeary() acquires the ulp lock. When exit_sem() > removes A it makes ulp->list_proc.next to point at B and at the same > time freeary() removes B setting B->semid=-1. > > At the next iteration of for(;;) loop exit_sem() will try to remove B. > > The only way to break from for(;;) is for (&un->list_proc == > &ulp->list_proc) to be true which is not. Then exit_sem() will check if > B->semid=-1 which is and will continue looping in for(;;) until the > memory for B is reallocated and the value at B->semid is changed. > > At that point, exit_sem() will crash attempting to unlink B from the > lists (this can be easily triggered by running the reproducer [1] a > second time). > > To prove this scenario instrumentation was added to keep information > about each sem_undo (un) struct that is removed per process and per > semaphore set (sma). > > CPU0 CPU1 > [caller holds sem_lock(sma for A)] ... > freeary() exit_sem() > ... ... > ... sem_lock(sma for B) > spin_lock(A->ulp->lock) ... > list_del_rcu(un_A->list_proc) list_del_rcu(un_B->list_proc) > > Undo structures A and B have different semid and sem_lock() operations > proceed. However they belong to the same list_proc list and they are > removed at the same time. This results into ulp->list_proc.next > pointing to the address of B which is already removed. > > After reverting commit a97955844807 ("ipc,sem: remove uneeded > sem_undo_list lock usage in exit_sem()") the issue was no longer > reproducible. > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1694779 > > Link: http://lkml.kernel.org/r/20191211191318.11860-1-ioanna-maria.alifieraki@canonical.com > Fixes: a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()") > Signed-off-by: Ioanna Alifieraki <ioanna-maria.alifieraki@canonical.com> > Acked-by: Manfred Spraul <manfred@colorfullife.com> > Acked-by: Herton R. Krzesinski <herton@redhat.com> > Cc: Arnd Bergmann <arnd@arndb.de> > Cc: Catalin Marinas <catalin.marinas@arm.com> > Cc: <malat@debian.org> > Cc: Joel Fernandes (Google) <joel@joelfernandes.org> > Cc: Davidlohr Bueso <dave@stgolabs.net> > Cc: Jay Vosburgh <jay.vosburgh@canonical.com> > Cc: <stable@vger.kernel.org> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org> > Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> > (cherry picked from commit edf28f4061afe4c2d9eb1c3323d90e882c1d6800) > --- > ipc/sem.c | 6 ++---- > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/ipc/sem.c b/ipc/sem.c > index ec97a7072413..fe12ea8dd2b3 100644 > --- a/ipc/sem.c > +++ b/ipc/sem.c > @@ -2368,11 +2368,9 @@ void exit_sem(struct task_struct *tsk) > ipc_assert_locked_object(&sma->sem_perm); > list_del(&un->list_id); > > - /* we are the last process using this ulp, acquiring ulp->lock > - * isn't required. Besides that, we are also protected against > - * IPC_RMID as we hold sma->sem_perm lock now > - */ > + spin_lock(&ulp->lock); > list_del_rcu(&un->list_proc); > + spin_unlock(&ulp->lock); > > /* perform adjustments registered in un */ > for (i = 0; i < sma->sem_nsems; i++) { > -- > 2.17.1 > > > -- > kernel-team mailing list > kernel-team@lists.ubuntu.com > https://lists.ubuntu.com/mailman/listinfo/kernel-team
diff --git a/ipc/sem.c b/ipc/sem.c index ec97a7072413..fe12ea8dd2b3 100644 --- a/ipc/sem.c +++ b/ipc/sem.c @@ -2368,11 +2368,9 @@ void exit_sem(struct task_struct *tsk) ipc_assert_locked_object(&sma->sem_perm); list_del(&un->list_id); - /* we are the last process using this ulp, acquiring ulp->lock - * isn't required. Besides that, we are also protected against - * IPC_RMID as we hold sma->sem_perm lock now - */ + spin_lock(&ulp->lock); list_del_rcu(&un->list_proc); + spin_unlock(&ulp->lock); /* perform adjustments registered in un */ for (i = 0; i < sma->sem_nsems; i++) {