Message ID | 1496439810-11240-1-git-send-email-leitao@debian.org (mailing list archive) |
---|---|
State | Accepted |
Commit | 1195892c091a15cc862f4e202482a36adc924e12 |
Headers | show |
Hi Breno, > Currently tsk->thread->load_vec and load_fp are not initialized > during a task creation, which set garbage to these variables > (non-zero value). Nice catch! It seems like we should zero load_tm too though? Acked-by: Anton Blanchard <anton@samba.org> Anton > These variables will be checked later at restore_math() to validate > if the FP and vectors are being utilized. Since these values might be > non-zero, the restore_math() will continue to save the FP and vectors > even if they were never utilized before the userspace application. > load_fp and load_vec counters will then overflow and the FP and > Altivec will be finally disabled, but before that condition is > reached (counter overflow) several context switches restored FP and > vector registers without need, causing a performance degradation. > > Signed-off-by: Breno Leitao <leitao@debian.org> > Signed-off-by: Gustavo Romero <gusbromero@gmail.com> > --- > arch/powerpc/kernel/process.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/arch/powerpc/kernel/process.c > b/arch/powerpc/kernel/process.c index baae104b16c7..a9435397eab8 > 100644 --- a/arch/powerpc/kernel/process.c > +++ b/arch/powerpc/kernel/process.c > @@ -1666,6 +1666,7 @@ void start_thread(struct pt_regs *regs, > unsigned long start, unsigned long sp) #ifdef CONFIG_VSX > current->thread.used_vsr = 0; > #endif > + current->thread.load_fp = 0; > memset(¤t->thread.fp_state, 0, > sizeof(current->thread.fp_state)); current->thread.fp_save_area = > NULL; #ifdef CONFIG_ALTIVEC > @@ -1674,6 +1675,7 @@ void start_thread(struct pt_regs *regs, > unsigned long start, unsigned long sp) current->thread.vr_save_area = > NULL; current->thread.vrsave = 0; > current->thread.used_vr = 0; > + current->thread.load_vec = 0; > #endif /* CONFIG_ALTIVEC */ > #ifdef CONFIG_SPE > memset(current->thread.evr, 0, sizeof(current->thread.evr));
Hi Anton, On Sat, Jun 03, 2017 at 08:04:11AM +1000, Anton Blanchard wrote: > Hi Breno, > > > Currently tsk->thread->load_vec and load_fp are not initialized > > during a task creation, which set garbage to these variables > > (non-zero value). > > Nice catch! It seems like we should zero load_tm too though? Yes, it seems we need to zero load_tm also, since it does not seem to be zeroed anywhere else. But I did some tests, and load_tm is always zero after start_thread() is being called. In fact, start_thread() is being called and pt_regs->load_tm is already zero since the function start. I also wrote a SystemTap script[1] to investigate it better, and I've never seen a single load_tm != 0 in a my machine. I tested on both POWER8 bare metal and KVM guests. (load_vec and load_fp happened to have garbage all the time) Any idea if this is just occasional event, or, if there is someone zeroing it in an obscure code? [1] https://github.com/leitao/htm_torture/blob/master/systemtap/load_tm_at_start_thread.stap
On Sat, 3 Jun 2017 19:42:14 -0300 Breno Leitao <leitao@debian.org> wrote: > Hi Anton, > > On Sat, Jun 03, 2017 at 08:04:11AM +1000, Anton Blanchard wrote: > > Hi Breno, > > > > > Currently tsk->thread->load_vec and load_fp are not initialized > > > during a task creation, which set garbage to these variables > > > (non-zero value). > > > > Nice catch! It seems like we should zero load_tm too though? > > Yes, it seems we need to zero load_tm also, since it does not seem to > be zeroed anywhere else. > > But I did some tests, and load_tm is always zero after start_thread() > is being called. > > In fact, start_thread() is being called and pt_regs->load_tm is > already zero since the function start. > > I also wrote a SystemTap script[1] to investigate it better, and I've > never seen a single load_tm != 0 in a my machine. I tested on both > POWER8 bare metal and KVM guests. (load_vec and load_fp happened to > have garbage all the time) > > Any idea if this is just occasional event, or, if there is someone > zeroing it in an obscure code? Quite likely no one uses TM :) Try: #include <unistd.h> int main(void) { __builtin_tbegin(0); execlp("/bin/true", "/bin/true", NULL); } Anton
On Sun, Jun 04, 2017 at 11:38:14AM +1000, Anton Blanchard wrote: > On Sat, 3 Jun 2017 19:42:14 -0300 > Breno Leitao <leitao@debian.org> wrote: > > > Hi Anton, > > > > On Sat, Jun 03, 2017 at 08:04:11AM +1000, Anton Blanchard wrote: > > > Hi Breno, > > > > > > > Currently tsk->thread->load_vec and load_fp are not initialized > > > > during a task creation, which set garbage to these variables > > > > (non-zero value). > > > > > > Nice catch! It seems like we should zero load_tm too though? > > > > Yes, it seems we need to zero load_tm also, since it does not seem to > > be zeroed anywhere else. > > > > But I did some tests, and load_tm is always zero after start_thread() > > is being called. > > > > In fact, start_thread() is being called and pt_regs->load_tm is > > already zero since the function start. > > > > I also wrote a SystemTap script[1] to investigate it better, and I've > > never seen a single load_tm != 0 in a my machine. I tested on both > > POWER8 bare metal and KVM guests. (load_vec and load_fp happened to > > have garbage all the time) > > > > Any idea if this is just occasional event, or, if there is someone > > zeroing it in an obscure code? > > Quite likely no one uses TM :) Try: In fact, I had tested with TM[1] and haven't seen any issue, but I was not calling a nested application (through execve() syscall). Somehow if I call "$ ./tm_application ; /bin/true", I do not see a non-zero load_tm in the new task->thread. On the other side, I see the corruption with your test case, mainly if I sleep after 'tbegin.' and before execlp(), giving a chance to have load_tm incremented, and this value is being inherited in the new task->thread. This is obviously wrong, I will send a patch to have it fixed. Thanks for the guidance! [1] https://github.com/leitao/htm_torture
Breno Leitao <leitao@debian.org> writes: > Currently tsk->thread->load_vec and load_fp are not initialized during a > task creation, which set garbage to these variables (non-zero value). > > These variables will be checked later at restore_math() to validate if the > FP and vectors are being utilized. Since these values might be non-zero, > the restore_math() will continue to save the FP and vectors even if they > were never utilized before the userspace application. load_fp and load_vec > counters will then overflow and the FP and Altivec will be finally > disabled, but before that condition is reached (counter overflow) several > context switches restored FP and vector registers without need, causing a > performance degradation. > > Signed-off-by: Breno Leitao <leitao@debian.org> > Signed-off-by: Gustavo Romero <gusbromero@gmail.com> Thanks, I tweaked the wording a little and added: Fixes: 70fe3d980f5f ("powerpc: Restore FPU/VEC/VSX if previously used") Cc: stable@vger.kernel.org # v4.6+ cheers
On Fri, 2017-06-02 at 21:43:30 UTC, Breno Leitao wrote: > Currently tsk->thread->load_vec and load_fp are not initialized during a > task creation, which set garbage to these variables (non-zero value). > > These variables will be checked later at restore_math() to validate if the > FP and vectors are being utilized. Since these values might be non-zero, > the restore_math() will continue to save the FP and vectors even if they > were never utilized before the userspace application. load_fp and load_vec > counters will then overflow and the FP and Altivec will be finally > disabled, but before that condition is reached (counter overflow) several > context switches restored FP and vector registers without need, causing a > performance degradation. > > Signed-off-by: Breno Leitao <leitao@debian.org> > Signed-off-by: Gustavo Romero <gusbromero@gmail.com> > Acked-by: Anton Blanchard <anton@samba.org> Applied to powerpc fixes, thanks. https://git.kernel.org/powerpc/c/1195892c091a15cc862f4e202482a3 cheers
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index baae104b16c7..a9435397eab8 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -1666,6 +1666,7 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp) #ifdef CONFIG_VSX current->thread.used_vsr = 0; #endif + current->thread.load_fp = 0; memset(¤t->thread.fp_state, 0, sizeof(current->thread.fp_state)); current->thread.fp_save_area = NULL; #ifdef CONFIG_ALTIVEC @@ -1674,6 +1675,7 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp) current->thread.vr_save_area = NULL; current->thread.vrsave = 0; current->thread.used_vr = 0; + current->thread.load_vec = 0; #endif /* CONFIG_ALTIVEC */ #ifdef CONFIG_SPE memset(current->thread.evr, 0, sizeof(current->thread.evr));