Message ID | 88dbbef6-96cb-a32b-fba8-2bb512039c64@redhat.com |
---|---|
State | Not Applicable |
Headers | show |
On 12/07/17 14:53, Thomas Huth wrote: > On 12.07.2017 05:46, Alexey Kardashevskiy wrote: >> On 09/06/17 03:13, Segher Boessenkool wrote: >>> On Thu, Jun 08, 2017 at 08:54:06AM +0200, Thomas Huth wrote: >>>> On 08.06.2017 08:12, Alexey Kardashevskiy wrote: >>>>> /home/aik/p/slof/slof/paflof.c: In function ‘engine’: >>>>> /home/aik/p/slof/slof/paflof.c:84:23: warning: array subscript is below >>>>> array bounds [-Warray-bounds] >>>>> dp = the_data_stack - 1; >>>>> ~~~~~~~~~~~~~~~^~~ >>>>> /home/aik/p/slof/slof/paflof.c:85:22: warning: array subscript is below >>>>> array bounds [-Warray-bounds] >>>>> rp = handler_stack - 1; >>>>> ~~~~~~~~~~~~~~^~~ >>>>> >>>>> with gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1) from Fedora24/BE. >>>>> >>>>> Can you please take a look on this? Thanks. >>>> >>>> See Segher's suggestions here: >>>> >>>> https://lists.ozlabs.org/pipermail/slof/2016-August/001221.html >>>> >>>> IMHO we could also just simply sacrifice the first stack entry and only >>>> use "dp = the_data_stack", without adding the inline asm here (to keep >>>> paflof.c portable). >>> >>> Or you simply make the_*_stack declared as a pointer in paflof.c, not as >>> an array. Where is it actually defined? If it is defined in assembler >>> code all is fine; if it is defined in C code, well, don't lie to the >>> compiler or it will take its revenge (if using full-program optimisation >>> it can still see you're accessing the array out-of-bounds for example, >>> or worse, assume you don't do undefined things and optimise accordingly). >> >> >> slof/paflof.c includes slof/ppc64.c (via #include ISTR(TARG,c)) which >> defines the_data_stack. >> >> handler_stack is defined right in slof/paflof.c's engine(). >> >>> An easy way around is to have the_*_stack just an external symbol, and >>> have the_real_*_stack for the arrays of cells, and then equate the two >>> in a linker script. >> >> Harsh. >> >>> Or, ignore the warning. If ever things break (and they won't), it will >>> do so with lots of fireworks; it won't silently break. >> >> This shuts the gcc up (and we loose 1 cell in each stack): >> >> diff --git a/slof/paflof.c b/slof/paflof.c >> index 50b4adf..1e7874a 100644 >> --- a/slof/paflof.c >> +++ b/slof/paflof.c >> @@ -68,7 +68,8 @@ long engine(int mode, long param_1, long param_2) >> >> cell *restrict ip; >> cell *restrict cfa; >> - static cell handler_stack[160]; >> + static cell handler_stack_[160]; >> + static cell *handler_stack = handler_stack_ + 1; >> static cell c_return[2]; >> static cell dummy; >> >> diff --git a/slof/ppc64.c b/slof/ppc64.c >> index 83a8e82..76c1bdf 100644 >> --- a/slof/ppc64.c >> +++ b/slof/ppc64.c >> @@ -26,7 +26,8 @@ cell the_exception_frame[0x400 / CELLSIZE] __attribute__ >> ((aligned(PAGE_SIZE))); >> cell the_client_frame[0x1000 / CELLSIZE] __attribute__ ((aligned(0x100))); >> cell the_client_stack[0x8000 / CELLSIZE] __attribute__ ((aligned(0x100))); >> /* THE forth stack */ >> -cell the_data_stack[0x2000 / CELLSIZE] __attribute__ ((aligned(0x100))); >> +cell the_data_stack_[0x2000 / CELLSIZE] __attribute__ ((aligned(0x100))); >> +cell *the_data_stack = the_data_stack_ + 1; >> /* the forth return stack */ >> cell the_return_stack[0x2000 / CELLSIZE] __attribute__ ((aligned(0x100))); > > That's kind of ugly. Why not simply: > > diff --git a/slof/paflof.c b/slof/paflof.c > index 50b4adf..ea3c145 100644 > --- a/slof/paflof.c > +++ b/slof/paflof.c > @@ -81,8 +81,8 @@ long engine(int mode, long param_1, long param_2) > LAST_ELEMENT(xt_FORTH_X2d_WORDLIST).a = xt_LASTWORD; > > // stack-pointers > - dp = the_data_stack - 1; > - rp = handler_stack - 1; > + dp = the_data_stack; > + rp = handler_stack; > > // return-address for "evaluate" personality > dummy.a = &&over; > > ? You also need to fix: slof/prim.code|207| PRIM(DEPTH) PUSH; TOS.u = dp - the_data_stack; MIRP as otherwise SLOF stops at its prompt with "1 >" rather than "0 >". What is uglier is a question. > > Thomas >
On Wed, Jul 12, 2017 at 06:53:38AM +0200, Thomas Huth wrote: > That's kind of ugly. Why not simply: > > diff --git a/slof/paflof.c b/slof/paflof.c > index 50b4adf..ea3c145 100644 > --- a/slof/paflof.c > +++ b/slof/paflof.c > @@ -81,8 +81,8 @@ long engine(int mode, long param_1, long param_2) > LAST_ELEMENT(xt_FORTH_X2d_WORDLIST).a = xt_LASTWORD; > > // stack-pointers > - dp = the_data_stack - 1; > - rp = handler_stack - 1; > + dp = the_data_stack; > + rp = handler_stack; > > // return-address for "evaluate" personality > dummy.a = &&over; Paflof originally had options to cache 0, 1, or 2 top-of-stack elements. Now it is always 1; it is usually fastest (2 is theoretically a tiny bit better, but a bit worse in practice). There also was a user-space version. It set up the stacks with unmapped guard pages, so it could trap stack underflows (and overflows). If you run in translation-off mode you of course cannot do such things. (Your patch is incomplete btw, there are five or six or so places that need changes). I would just add a cast to shut up the warning, and not lose the optimisations (or ignore the warning). Segher
On 12.07.2017 07:47, Alexey Kardashevskiy wrote: > On 12/07/17 14:53, Thomas Huth wrote: >> On 12.07.2017 05:46, Alexey Kardashevskiy wrote: >>> On 09/06/17 03:13, Segher Boessenkool wrote: >>>> On Thu, Jun 08, 2017 at 08:54:06AM +0200, Thomas Huth wrote: >>>>> On 08.06.2017 08:12, Alexey Kardashevskiy wrote: >>>>>> /home/aik/p/slof/slof/paflof.c: In function ‘engine’: >>>>>> /home/aik/p/slof/slof/paflof.c:84:23: warning: array subscript is below >>>>>> array bounds [-Warray-bounds] >>>>>> dp = the_data_stack - 1; >>>>>> ~~~~~~~~~~~~~~~^~~ >>>>>> /home/aik/p/slof/slof/paflof.c:85:22: warning: array subscript is below >>>>>> array bounds [-Warray-bounds] >>>>>> rp = handler_stack - 1; >>>>>> ~~~~~~~~~~~~~~^~~ >>>>>> >>>>>> with gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1) from Fedora24/BE. >>>>>> >>>>>> Can you please take a look on this? Thanks. >>>>> >>>>> See Segher's suggestions here: >>>>> >>>>> https://lists.ozlabs.org/pipermail/slof/2016-August/001221.html >>>>> >>>>> IMHO we could also just simply sacrifice the first stack entry and only >>>>> use "dp = the_data_stack", without adding the inline asm here (to keep >>>>> paflof.c portable). >>>> >>>> Or you simply make the_*_stack declared as a pointer in paflof.c, not as >>>> an array. Where is it actually defined? If it is defined in assembler >>>> code all is fine; if it is defined in C code, well, don't lie to the >>>> compiler or it will take its revenge (if using full-program optimisation >>>> it can still see you're accessing the array out-of-bounds for example, >>>> or worse, assume you don't do undefined things and optimise accordingly). >>> >>> >>> slof/paflof.c includes slof/ppc64.c (via #include ISTR(TARG,c)) which >>> defines the_data_stack. >>> >>> handler_stack is defined right in slof/paflof.c's engine(). >>> >>>> An easy way around is to have the_*_stack just an external symbol, and >>>> have the_real_*_stack for the arrays of cells, and then equate the two >>>> in a linker script. >>> >>> Harsh. >>> >>>> Or, ignore the warning. If ever things break (and they won't), it will >>>> do so with lots of fireworks; it won't silently break. >>> >>> This shuts the gcc up (and we loose 1 cell in each stack): >>> >>> diff --git a/slof/paflof.c b/slof/paflof.c >>> index 50b4adf..1e7874a 100644 >>> --- a/slof/paflof.c >>> +++ b/slof/paflof.c >>> @@ -68,7 +68,8 @@ long engine(int mode, long param_1, long param_2) >>> >>> cell *restrict ip; >>> cell *restrict cfa; >>> - static cell handler_stack[160]; >>> + static cell handler_stack_[160]; >>> + static cell *handler_stack = handler_stack_ + 1; >>> static cell c_return[2]; >>> static cell dummy; >>> >>> diff --git a/slof/ppc64.c b/slof/ppc64.c >>> index 83a8e82..76c1bdf 100644 >>> --- a/slof/ppc64.c >>> +++ b/slof/ppc64.c >>> @@ -26,7 +26,8 @@ cell the_exception_frame[0x400 / CELLSIZE] __attribute__ >>> ((aligned(PAGE_SIZE))); >>> cell the_client_frame[0x1000 / CELLSIZE] __attribute__ ((aligned(0x100))); >>> cell the_client_stack[0x8000 / CELLSIZE] __attribute__ ((aligned(0x100))); >>> /* THE forth stack */ >>> -cell the_data_stack[0x2000 / CELLSIZE] __attribute__ ((aligned(0x100))); >>> +cell the_data_stack_[0x2000 / CELLSIZE] __attribute__ ((aligned(0x100))); >>> +cell *the_data_stack = the_data_stack_ + 1; >>> /* the forth return stack */ >>> cell the_return_stack[0x2000 / CELLSIZE] __attribute__ ((aligned(0x100))); >> >> That's kind of ugly. Why not simply: >> >> diff --git a/slof/paflof.c b/slof/paflof.c >> index 50b4adf..ea3c145 100644 >> --- a/slof/paflof.c >> +++ b/slof/paflof.c >> @@ -81,8 +81,8 @@ long engine(int mode, long param_1, long param_2) >> LAST_ELEMENT(xt_FORTH_X2d_WORDLIST).a = xt_LASTWORD; >> >> // stack-pointers >> - dp = the_data_stack - 1; >> - rp = handler_stack - 1; >> + dp = the_data_stack; >> + rp = handler_stack; >> >> // return-address for "evaluate" personality >> dummy.a = &&over; >> >> ? > > > You also need to fix: > > slof/prim.code|207| PRIM(DEPTH) PUSH; TOS.u = dp - the_data_stack; MIRP > > as otherwise SLOF stops at its prompt with "1 >" rather than "0 >". > > What is uglier is a question. Ugh, ok. What about: dp = the_data_stack; rp = handler_stack; barrier(); dp--; rp--; ? That should hopefully shut up the compiler, too, and result in the same initial setting of dp and rp. Thomas
On Wed, Jul 12, 2017 at 02:56:41PM +0200, Thomas Huth wrote: > >>>>>> /home/aik/p/slof/slof/paflof.c: In function ‘engine’: > >>>>>> /home/aik/p/slof/slof/paflof.c:84:23: warning: array subscript is below > >>>>>> array bounds [-Warray-bounds] > >>>>>> dp = the_data_stack - 1; > >>>>>> ~~~~~~~~~~~~~~~^~~ > Ugh, ok. > > What about: > > dp = the_data_stack; > rp = handler_stack; > barrier(); > dp--; > rp--; > > ? > > That should hopefully shut up the compiler, too, and result in the same > initial setting of dp and rp. I don't know what "barrier" is, but it cannot work :-) Decrementing dp is not valid C here, either (exactly the same problem as the original). You could do just dp = (cell *)((uintptr_t)the_data_stack - CELLSIZE); etc. -- that is valid C, standard C even, and will work just fine. It's ugly, but at least it is correct ;-) Segher
On 12/07/17 22:41, Segher Boessenkool wrote: > On Wed, Jul 12, 2017 at 06:53:38AM +0200, Thomas Huth wrote: >> That's kind of ugly. Why not simply: >> >> diff --git a/slof/paflof.c b/slof/paflof.c >> index 50b4adf..ea3c145 100644 >> --- a/slof/paflof.c >> +++ b/slof/paflof.c >> @@ -81,8 +81,8 @@ long engine(int mode, long param_1, long param_2) >> LAST_ELEMENT(xt_FORTH_X2d_WORDLIST).a = xt_LASTWORD; >> >> // stack-pointers >> - dp = the_data_stack - 1; >> - rp = handler_stack - 1; >> + dp = the_data_stack; >> + rp = handler_stack; >> >> // return-address for "evaluate" personality >> dummy.a = &&over; > > Paflof originally had options to cache 0, 1, or 2 top-of-stack elements. > Now it is always 1; it is usually fastest (2 is theoretically a tiny > bit better, but a bit worse in practice). Sorry I am missing the connection between this 1-cell-cache and the size of the stack :) > > There also was a user-space version. It set up the stacks with unmapped > guard pages, so it could trap stack underflows (and overflows). If you > run in translation-off mode you of course cannot do such things. > > (Your patch is incomplete btw, there are five or six or so places that > need changes). > > I would just add a cast to shut up the warning, and not lose the > optimisations (or ignore the warning). > > > Segher >
On 12.07.2017 21:24, Segher Boessenkool wrote: > On Wed, Jul 12, 2017 at 02:56:41PM +0200, Thomas Huth wrote: >>>>>>>> /home/aik/p/slof/slof/paflof.c: In function ‘engine’: >>>>>>>> /home/aik/p/slof/slof/paflof.c:84:23: warning: array subscript is below >>>>>>>> array bounds [-Warray-bounds] >>>>>>>> dp = the_data_stack - 1; >>>>>>>> ~~~~~~~~~~~~~~~^~~ > >> Ugh, ok. >> >> What about: >> >> dp = the_data_stack; >> rp = handler_stack; >> barrier(); >> dp--; >> rp--; >> >> ? >> >> That should hopefully shut up the compiler, too, and result in the same >> initial setting of dp and rp. > > I don't know what "barrier" is, but it cannot work :-) Decrementing dp > is not valid C here, either (exactly the same problem as the original). > > You could do just > > dp = (cell *)((uintptr_t)the_data_stack - CELLSIZE); > > etc. -- that is valid C, standard C even, and will work just fine. > It's ugly, but at least it is correct ;-) OK, that solution looks fine to me, too! Alexey, what do you think? Thomas
Hi! On Thu, Jul 13, 2017 at 12:42:32PM +1000, Alexey Kardashevskiy wrote: > On 12/07/17 22:41, Segher Boessenkool wrote: > > On Wed, Jul 12, 2017 at 06:53:38AM +0200, Thomas Huth wrote: > >> That's kind of ugly. Why not simply: > >> > >> diff --git a/slof/paflof.c b/slof/paflof.c > >> index 50b4adf..ea3c145 100644 > >> --- a/slof/paflof.c > >> +++ b/slof/paflof.c > >> @@ -81,8 +81,8 @@ long engine(int mode, long param_1, long param_2) > >> LAST_ELEMENT(xt_FORTH_X2d_WORDLIST).a = xt_LASTWORD; > >> > >> // stack-pointers > >> - dp = the_data_stack - 1; > >> - rp = handler_stack - 1; > >> + dp = the_data_stack; > >> + rp = handler_stack; > >> > >> // return-address for "evaluate" personality > >> dummy.a = &&over; > > > > Paflof originally had options to cache 0, 1, or 2 top-of-stack elements. > > Now it is always 1; it is usually fastest (2 is theoretically a tiny > > bit better, but a bit worse in practice). > > Sorry I am missing the connection between this 1-cell-cache and the size of > the stack :) I left out a bit, and I didn't check either (it actually caches 0 elements currently). The # elts cached together with where the stack pointer points exactly (see below) matters for the code size and speed of the primitives. The stack grows upward, and it's a full stack (i.e. the stack pointer points to the top used element), which means for an empty stack it points to one below the stack. This means that for pushing to the stack we can use a store-with-update instruction (stdu). This generates good code for most primitives, better than the other stack pointer offsets. It doesn't matter all that much; most of the time you are just waiting for some i/o to happen. If you want to change it, you should look at DEPTH, PICK, ROLL, and the things that change stack and THROW/CATCH and such. And anything else that "manually" accesses the stacks... Segher
On 13/07/17 18:10, Segher Boessenkool wrote: > Hi! > > On Thu, Jul 13, 2017 at 12:42:32PM +1000, Alexey Kardashevskiy wrote: >> On 12/07/17 22:41, Segher Boessenkool wrote: >>> On Wed, Jul 12, 2017 at 06:53:38AM +0200, Thomas Huth wrote: >>>> That's kind of ugly. Why not simply: >>>> >>>> diff --git a/slof/paflof.c b/slof/paflof.c >>>> index 50b4adf..ea3c145 100644 >>>> --- a/slof/paflof.c >>>> +++ b/slof/paflof.c >>>> @@ -81,8 +81,8 @@ long engine(int mode, long param_1, long param_2) >>>> LAST_ELEMENT(xt_FORTH_X2d_WORDLIST).a = xt_LASTWORD; >>>> >>>> // stack-pointers >>>> - dp = the_data_stack - 1; >>>> - rp = handler_stack - 1; >>>> + dp = the_data_stack; >>>> + rp = handler_stack; >>>> >>>> // return-address for "evaluate" personality >>>> dummy.a = &&over; >>> >>> Paflof originally had options to cache 0, 1, or 2 top-of-stack elements. >>> Now it is always 1; it is usually fastest (2 is theoretically a tiny >>> bit better, but a bit worse in practice). >> >> Sorry I am missing the connection between this 1-cell-cache and the size of >> the stack :) > > I left out a bit, and I didn't check either (it actually caches 0 elements > currently). The # elts cached together with where the stack pointer > points exactly (see below) matters for the code size and speed of the > primitives. > > The stack grows upward, and it's a full stack (i.e. the stack pointer > points to the top used element), which means for an empty stack it > points to one below the stack. > > This means that for pushing to the stack we can use a store-with-update > instruction (stdu). This generates good code for most primitives, > better than the other stack pointer offsets. Aha! So this "stdu" actually explains the current behaviour. Thanks for the explanation, I'll add it to the commit log for that type cast you suggested. > > It doesn't matter all that much; most of the time you are just waiting > for some i/o to happen. > > > If you want to change it, you should look at DEPTH, PICK, ROLL, and > the things that change stack and THROW/CATCH and such. And anything > else that "manually" accesses the stacks... > > > Segher >
diff --git a/slof/paflof.c b/slof/paflof.c index 50b4adf..ea3c145 100644 --- a/slof/paflof.c +++ b/slof/paflof.c @@ -81,8 +81,8 @@ long engine(int mode, long param_1, long param_2) LAST_ELEMENT(xt_FORTH_X2d_WORDLIST).a = xt_LASTWORD; // stack-pointers - dp = the_data_stack - 1; - rp = handler_stack - 1; + dp = the_data_stack; + rp = handler_stack; // return-address for "evaluate" personality dummy.a = &&over;