Message ID | 20170403064102.23245-1-anton@ozlabs.org (mailing list archive) |
---|---|
State | Accepted |
Headers | show |
Anton Blanchard <anton@ozlabs.org> a écrit : > From: Anton Blanchard <anton@samba.org> > > Early on in do_page_fault() we call store_updates_sp(), regardless of > the type of exception. For an instruction miss this doesn't make > sense, because we only use this information to detect if a data miss > is the result of a stack expansion instruction or not. > > Worse still, it results in a data miss within every userspace > instruction miss handler, because we try and load the very instruction > we are about to install a pte for! > > A simple exec microbenchmark runs 6% faster on POWER8 with this fix: > > #include <stdlib.h> > #include <stdio.h> > #include <unistd.h> > > int main(int argc, char *argv[]) > { > unsigned long left = atol(argv[1]); > char leftstr[16]; > > if (left-- == 0) > return 0; > > sprintf(leftstr, "%ld", left); > execlp(argv[0], argv[0], leftstr, NULL); > perror("exec failed\n"); > > return 0; > } > > Pass the number of iterations on the command line (eg 10000) and time > how long it takes to execute. > > Signed-off-by: Anton Blanchard <anton@samba.org> > --- > arch/powerpc/mm/fault.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c > index fd6484fc2fa9..3a7d580fdc59 100644 > --- a/arch/powerpc/mm/fault.c > +++ b/arch/powerpc/mm/fault.c > @@ -287,7 +287,7 @@ int do_page_fault(struct pt_regs *regs, unsigned > long address, > * can result in fault, which will cause a deadlock when called with > * mmap_sem held > */ > - if (user_mode(regs)) > + if (!is_exec && user_mode(regs)) Shouldn't it also check 'is_write' ? If it is a store, is_write should be set, shouldn't it ? Christophe > store_update_sp = store_updates_sp(regs); > > if (user_mode(regs)) > -- > 2.11.0
Hi Christophe, > > - if (user_mode(regs)) > > + if (!is_exec && user_mode(regs)) > > Shouldn't it also check 'is_write' ? > If it is a store, is_write should be set, shouldn't it ? Thanks, Ben had the same suggestion. I'll add that further optimisation in a subsequent patch. Anton
On Mon, 2017-04-03 at 06:41:02 UTC, Anton Blanchard wrote: > From: Anton Blanchard <anton@samba.org> > > Early on in do_page_fault() we call store_updates_sp(), regardless of > the type of exception. For an instruction miss this doesn't make > sense, because we only use this information to detect if a data miss > is the result of a stack expansion instruction or not. > > Worse still, it results in a data miss within every userspace > instruction miss handler, because we try and load the very instruction > we are about to install a pte for! > > A simple exec microbenchmark runs 6% faster on POWER8 with this fix: > > #include <stdlib.h> > #include <stdio.h> > #include <unistd.h> > > int main(int argc, char *argv[]) > { > unsigned long left = atol(argv[1]); > char leftstr[16]; > > if (left-- == 0) > return 0; > > sprintf(leftstr, "%ld", left); > execlp(argv[0], argv[0], leftstr, NULL); > perror("exec failed\n"); > > return 0; > } > > Pass the number of iterations on the command line (eg 10000) and time > how long it takes to execute. > > Signed-off-by: Anton Blanchard <anton@samba.org> Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/a7a9dcd882a67b68568868b988289f cheers
Hi Anton, Le 04/04/2017 à 00:00, Anton Blanchard a écrit : > Hi Christophe, > >>> - if (user_mode(regs)) >>> + if (!is_exec && user_mode(regs)) >> >> Shouldn't it also check 'is_write' ? >> If it is a store, is_write should be set, shouldn't it ? > > Thanks, Ben had the same suggestion. I'll add that further optimisation > in a subsequent patch. > > Anton > For your information, I made some benchmark test using 'perf stat' with your app on MPC8321 and MPC885, and I got the following results: MPC8321 before the change: Performance counter stats for './fault 1000' (10 runs): 4491.971466 cpu-clock (msec) ( +- 0.03% ) 47386 faults ( +- 0.02% ) 4.727864465 seconds time elapsed ( +- 0.17% ) MPC8321 after your change: Performance counter stats for './fault 1000' (10 runs): 4278.738845 cpu-clock (msec) ( +- 0.02% ) 35181 faults ( +- 0.02% ) 4.504443891 seconds time elapsed ( +- 0.19% ) MPC8321 after changing !is_exec by is_write Performance counter stats for './fault 1000' (10 runs): 4268.187261 cpu-clock (msec) ( +- 0.03% ) 35181 faults ( +- 0.01% ) 4.489207922 seconds time elapsed ( +- 0.20% ) MPC885 before the change: Performance counter stats for './fault 500' (10 runs): 726605854 cpu-cycles ( +- 0.03% ) 176067 dTLB-load-misses ( +- 0.08% ) 52722 iTLB-load-misses ( +- 0.01% ) 25718 faults ( +- 0.03% ) 5.795924654 seconds time elapsed ( +- 0.14% ) MPC885 after your change: Performance counter stats for './fault 500' (10 runs): 711233251 cpu-cycles ( +- 0.04% ) 152462 dTLB-load-misses ( +- 0.09% ) 52715 iTLB-load-misses ( +- 0.01% ) 19611 faults ( +- 0.02% ) 5.673784606 seconds time elapsed ( +- 0.14% ) MPC885 after changing !is_exec by is_write Performance counter stats for './fault 500' (10 runs): 710904083 cpu-cycles ( +- 0.05% ) 147162 dTLB-load-misses ( +- 0.06% ) 52716 iTLB-load-misses ( +- 0.01% ) 19610 faults ( +- 0.02% ) 5.672091139 seconds time elapsed ( +- 0.15% ) Christophe
On Thu, 2017-04-06 at 23:06 +1000, Michael Ellerman wrote: > On Mon, 2017-04-03 at 06:41:02 UTC, Anton Blanchard wrote: > > From: Anton Blanchard <anton@samba.org> > > > Applied to powerpc next, thanks. > > https://git.kernel.org/powerpc/c/a7a9dcd882a67b68568868b988289f > FYI: The version you applied does not have checks for is_write Balbir Singh.
Hi Balbir,
> FYI: The version you applied does not have checks for is_write
Yeah, we decided to do that in a follow up patch. I'm ok if someone
gets to it before me :)
Anton
Christophe LEROY <christophe.leroy@c-s.fr> writes: > Hi Anton, > > Le 04/04/2017 à 00:00, Anton Blanchard a écrit : >> Hi Christophe, >> >>>> - if (user_mode(regs)) >>>> + if (!is_exec && user_mode(regs)) >>> >>> Shouldn't it also check 'is_write' ? >>> If it is a store, is_write should be set, shouldn't it ? >> >> Thanks, Ben had the same suggestion. I'll add that further optimisation >> in a subsequent patch. >> > > For your information, I made some benchmark test using 'perf stat' with > your app on MPC8321 and MPC885, and I got the following results: MPC8321: before 47386 faults after 35181 faults -12205 is_write 35181 faults -12205 So that's good. MPC885: before: 176067 dTLB-load-misses 52722 iTLB-load-misses 25718 faults after: 152462 dTLB-load-misses -23605 52715 iTLB-load-misses -7 19611 faults -6107 is_write: 147162 dTLB-load-misses -28905 52716 iTLB-load-misses -6 19610 faults -6108 Also good, and shows that is_write idea would be even better. cheers
Balbir Singh <bsingharora@gmail.com> writes: > On Thu, 2017-04-06 at 23:06 +1000, Michael Ellerman wrote: >> On Mon, 2017-04-03 at 06:41:02 UTC, Anton Blanchard wrote: >> > From: Anton Blanchard <anton@samba.org> >> > >> Applied to powerpc next, thanks. >> >> https://git.kernel.org/powerpc/c/a7a9dcd882a67b68568868b988289f >> > > FYI: The version you applied does not have checks for is_write The version I was sent didn't have it :) cheers
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index fd6484fc2fa9..3a7d580fdc59 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -287,7 +287,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, * can result in fault, which will cause a deadlock when called with * mmap_sem held */ - if (user_mode(regs)) + if (!is_exec && user_mode(regs)) store_update_sp = store_updates_sp(regs); if (user_mode(regs))