Message ID | DB6PR0801MB2053A36AF986D2F04C1FDE6B83840@DB6PR0801MB2053.eurprd08.prod.outlook.com |
---|---|
State | New |
Headers | show |
On 08/22/2017 08:15 AM, Wilco Dijkstra wrote: > Jeff Law wrote: > On 07/26/2017 05:29 PM, Wilco Dijkstra wrote: > >>> But then the check size_align % MAX_SUPPORTED_STACK_ALIGNMENT != 0 >>> seems wrong too given that round_push uses a different alignment to align to. >> I had started to dig into the history of this code, but just didn't have >> the time to do so fully before needing to leave for the day. To some >> degree I was hoping you knew the rationale behind the test against >> MAX_SUPPORTED_STACK_ALIGNMENT and I wouldn't have to do a ton of digging :-) > > I looked further into this - it appears this works correctly since it is only bypassed if > size_align is already maximally aligned. round_push aligns to the preferred alignment, > which may be lower or equal to MAX_SUPPORTED_STACK_ALIGNMENT (which is > at least STACK_BOUNDARY). > > Here is the updated version: > > ChangeLog: > 2017-08-22 Wilco Dijkstra <wdijkstr@arm.com> > > * explow.c (get_dynamic_stack_size): Improve dynamic alignment. OK. I wonder if this will make it easier to write stack-clash tests of the dynamic space for boundary conditions :-) I was always annoyed that I had to fiddle around with magic adjustments to the sizes of objects to tickle boundary cases. jeff
Jeff Law <law@redhat.com> writes: > On 08/22/2017 08:15 AM, Wilco Dijkstra wrote: >> Jeff Law wrote: >> On 07/26/2017 05:29 PM, Wilco Dijkstra wrote: >> >>>> But then the check size_align % MAX_SUPPORTED_STACK_ALIGNMENT != 0 >>>> seems wrong too given that round_push uses a different alignment to >>>> align to. >>> I had started to dig into the history of this code, but just didn't have >>> the time to do so fully before needing to leave for the day. To some >>> degree I was hoping you knew the rationale behind the test against >>> MAX_SUPPORTED_STACK_ALIGNMENT and I wouldn't have to do a ton of digging :-) >> >> I looked further into this - it appears this works correctly since it is >> only bypassed if >> size_align is already maximally aligned. round_push aligns to the >> preferred alignment, >> which may be lower or equal to MAX_SUPPORTED_STACK_ALIGNMENT (which is >> at least STACK_BOUNDARY). >> >> Here is the updated version: >> >> ChangeLog: >> 2017-08-22 Wilco Dijkstra <wdijkstr@arm.com> >> >> * explow.c (get_dynamic_stack_size): Improve dynamic alignment. > OK. I wonder if this will make it easier to write stack-clash tests of > the dynamic space for boundary conditions :-) I was always annoyed that > I had to fiddle around with magic adjustments to the sizes of objects to > tickle boundary cases. This patch brought back PR libgomp/78468, which had caused its predecessor to be backed out of gcc-7. Rainer
> This patch brought back PR libgomp/78468, which had caused its > predecessor to be backed out of gcc-7. Yes, it's exactly the same mistake: + /* Since the stack is presumed to be aligned before this allocation, + we only need to increase the size of the allocation if the required + alignment is more than the stack alignment. */ The stack is aligned before the allocation but it gets misaligned during the allocation because the dynamic offset is not a multiple of STACK_BOUNDARY. The code had been realigning the stack pointer for more than 2 decades to enforce STACK_BOUNDARY but suddenly stopped again with Wilco's patch. The failure mode is very nasty (random corruption of the stack contents) and there are very likely other affected targets among the ~50 supported ones.
Eric Botcazou wrote: > The stack is aligned before the allocation but it gets misaligned during the > allocation because the dynamic offset is not a multiple of STACK_BOUNDARY. No, the stack never gets misaligned - my patch doesn't change that at all. The issue is that Sparc backend doesn't correctly set STACK_BOUNDSARY and neither aligns the outgoing args. Run my test which proves alloca was broken before my patch. Wilco
Hi Wilco, > Eric Botcazou wrote: > >> The stack is aligned before the allocation but it gets misaligned during the >> allocation because the dynamic offset is not a multiple of STACK_BOUNDARY. > > No, the stack never gets misaligned - my patch doesn't change that at > all. The issue is that Sparc backend doesn't correctly set STACK_BOUNDSARY > and neither aligns the outgoing args. Run my test which proves alloca was > broken before my patch. I'm currently running my SPARC bootstraps with your patch backed out so regressions don't go overboard. The test PASSes this way. Rainer
Hi Rainer, Can you post the disassembly for say the 8-byte aligned tests? It may not be built correctly or hit an offset that is accidentally aligned, however pass/fail status can't change due to my patch as it doesn't change alignment at all. Wilco
> No, the stack never gets misaligned - my patch doesn't change that at all. Yes, it does. Dynamic allocation works like this: the amount to be allocated is added to VIRTUAL_STACK_DYNAMIC_REGNUM and the result is then dynamically aligned. Your patch assumes that VIRTUAL_STACK_DYNAMIC_REGNUM is as aligned as the stack pointer, i.e. STACK_BOUNDARY, but that's wrong for 32-bit SPARC at least (that's why I put the ??? note at line 5746 in emit-rtl.c). The previous attempt by Dominic Vogt was more correct in this respect: 2016-08-23 Dominik Vogt <vogt@linux.vnet.ibm.com> * explow.c (get_dynamic_stack_size): Take known alignment of stack pointer + STACK_DYNAMIC_OFFSET into account when calculating the size needed. since it was using the declared alignment of VIRTUAL_STACK_DYNAMIC_REGNUM and not STACK_BOUNDARY directly. But the outcome was the same, since the declared alignment is still wrong for 32-bit SPARC. > The issue is that Sparc backend doesn't correctly set STACK_BOUNDSARY and > neither aligns the outgoing args. Run my test which proves alloca was > broken before my patch. How could this have been broken for so long, realistically? The SPARC back- end is parameterized according to the ABI and the documented interface between middle-end and back-end.
Eric Botcazou wrote: >> No, the stack never gets misaligned - my patch doesn't change that at all. > > Yes, it does. No. Look at the diffs, there is not a single change in alignment anywhere for all of the alloca variants. If the alignment is incorrect after my patch, it is also incorrect before my patch. This is the diff for pr78668.c on AArch64: >diff pr78668.before pr78668.after < add x0, x0, 18 --- > add x0, x0, 15 90c90 < add x0, x0, 18 --- > add x0, x0, 15 120c120 < add x0, x0, 22 --- > add x0, x0, 15 149c149 < add x0, x0, 22 --- > add x0, x0, 15 179c179 < add x0, x0, 30 --- > add x0, x0, 15 208c208 < add x0, x0, 30 --- > add x0, x0, 15 238c238 < add x0, x0, 46 --- > add x0, x0, 31 268c268 < add x0, x0, 46 --- > add x0, x0, 31 The mid-end always ensures that the stack is decremented by a value that is a multiple of STACK_BOUNDARY. My change does not make any difference here, but if SP is not aligned to STACK_BOUNDARY then it's obviously broken before my patch. For example the relevant instructions for t2_a8 are: add x0, x0, 15 and x0, x0, -16 sub sp, sp, x0 add x0, sp, 32 As you can see, it always rounds up and aligns to STACK_BOUNDARY before adjusting SP. When computing the final alloca address (the last add above) you must also keep that STACK_BOUNDARY alignment. To conclude my change just avoids inserting redundant padding based on the alignment promise that is made by the backend. The alignment itself is unchanged and is already incorrect on Sparc before my change. >> The issue is that Sparc backend doesn't correctly set STACK_BOUNDSARY and >> neither aligns the outgoing args. Run my test which proves alloca was >> broken before my patch. > > How could this have been broken for so long, realistically? The SPARC back- > end is parameterized according to the ABI and the documented interface between > middle-end and back-end. It clearly gets the ABI wrong as it sets STACK_BOUNDARY and then doesn't keep the alignment promise it makes. Wilco
On 09/09/2017 02:51 AM, Eric Botcazou wrote: >> No, the stack never gets misaligned - my patch doesn't change that at all. > > Yes, it does. Dynamic allocation works like this: the amount to be allocated > is added to VIRTUAL_STACK_DYNAMIC_REGNUM and the result is then dynamically > aligned. Your patch assumes that VIRTUAL_STACK_DYNAMIC_REGNUM is as aligned > as the stack pointer, i.e. STACK_BOUNDARY, but that's wrong for 32-bit SPARC > at least (that's why I put the ??? note at line 5746 in emit-rtl.c). This seems like a SPARC target problem to me -- essentially it's claiming a higher STACK_BOUNDARY than it really has. Presumably there's a good reason for this and some kind of hack may be needed to deal with it in dynamically allocated space. But it does not seem like we should be forcing all targets to allocate unnecessary space to deal with this. Jeff
Jeff Law wrote: > On 09/09/2017 02:51 AM, Eric Botcazou wrote: > >> No, the stack never gets misaligned - my patch doesn't change that at all. > > > > Yes, it does. Dynamic allocation works like this: the amount to be allocated > > is added to VIRTUAL_STACK_DYNAMIC_REGNUM and the result is then dynamically > > aligned. Your patch assumes that VIRTUAL_STACK_DYNAMIC_REGNUM is as aligned > > as the stack pointer, i.e. STACK_BOUNDARY, but that's wrong for 32-bit SPARC > > at least (that's why I put the ??? note at line 5746 in emit-rtl.c). > This seems like a SPARC target problem to me -- essentially it's > claiming a higher STACK_BOUNDARY than it really has. > > Presumably there's a good reason for this and some kind of hack may be > needed to deal with it in dynamically allocated space. But it does not > seem like we should be forcing all targets to allocate unnecessary space > to deal with this. It's not just STACK_BOUNDARY, the outgoing argument offset is incorrect too. These snippets of code from PR78468 (comment 20) look very wrong: sub %sp, %g2, %sp add %sp, 108, %g3 ; g3 = fp - 28 (x) sub %sp, %g2, %sp add %sp, 108, %g2 ; g2 = fp - 44 (d) sub %sp, %g1, %sp add %sp, 112, %g1 ; g1 = fp - 56 (e) There are several different outgoing argument offsets used here (108 and 112). This not only results in alloca blocks being unaligned (when they should be at least aligned to STACK_BOUNDARY, ideally PREFERRED_STACK_BOUNDARY), but this also means you end up with different alloca blocks overlapping and corrupting each other in non-trivial ways... Wilco
> This seems like a SPARC target problem to me -- essentially it's > claiming a higher STACK_BOUNDARY than it really has. No, it is not, I can guarantee you that the stack pointer is always aligned to 64-bit boundaries on SPARC, otherwise all hell would break loose... > Presumably there's a good reason for this and some kind of hack may be > needed to deal with it in dynamically allocated space. But it does not > seem like we should be forcing all targets to allocate unnecessary space > to deal with this. I agree but SPARC is presumably not the only affected platform, so I think that it's wrong to sureptitiously change the interface with the ~50 back-ends and hope that the maintainers will repair the damage; they won't and we'll have introduced very nasty bugs for a few wasted bytes on the stack.
On 10/04/2017 08:53 AM, Eric Botcazou wrote: >> This seems like a SPARC target problem to me -- essentially it's >> claiming a higher STACK_BOUNDARY than it really has. > > No, it is not, I can guarantee you that the stack pointer is always aligned to > 64-bit boundaries on SPARC, otherwise all hell would break loose... Then something is inconsistent somehwere. Either the stack is aligned prior to that code or it is not. If it is aligned, then Wilco's patch ought to keep it aligned. If is not properly aligned, then well, that's the problem ISTM. Am I missing something here? jeff
On Thu, Oct 5, 2017 at 1:07 AM, Jeff Law <law@redhat.com> wrote: > On 10/04/2017 08:53 AM, Eric Botcazou wrote: >>> This seems like a SPARC target problem to me -- essentially it's >>> claiming a higher STACK_BOUNDARY than it really has. >> >> No, it is not, I can guarantee you that the stack pointer is always aligned to >> 64-bit boundaries on SPARC, otherwise all hell would break loose... > Then something is inconsistent somehwere. Either the stack is aligned > prior to that code or it is not. If it is aligned, then Wilco's patch > ought to keep it aligned. If is not properly aligned, then well, that's > the problem ISTM. > > Am I missing something here? What I got from the discussion and the PR is that the stack hardregister is properly aligned but what GCC maps to it (virtual or frame or whatever) might not be at all points. allocate_dynamic_stack_space uses virtual_stack_dynamic_rtx and I'm not sure STACK_BOUNDARY applies to it? Not that I know anything about this here ;) Richard. > jeff
Richard Biener wrote: > On Thu, Oct 5, 2017 at 1:07 AM, Jeff Law <law@redhat.com> wrote: > > On 10/04/2017 08:53 AM, Eric Botcazou wrote: >>>> This seems like a SPARC target problem to me -- essentially it's >>>> claiming a higher STACK_BOUNDARY than it really has. >>> >>> No, it is not, I can guarantee you that the stack pointer is always aligned to >>> 64-bit boundaries on SPARC, otherwise all hell would break loose... >> Then something is inconsistent somehwere. Either the stack is aligned >> prior to that code or it is not. If it is aligned, then Wilco's patch >> ought to keep it aligned. If is not properly aligned, then well, that's >> the problem ISTM. >> >> Am I missing something here? > > What I got from the discussion and the PR is that the stack hardregister > is properly aligned but what GCC maps to it (virtual or frame or whatever) > might not be at all points. > > allocate_dynamic_stack_space uses virtual_stack_dynamic_rtx and I'm not > sure STACK_BOUNDARY applies to it? Yes STACK_BOUNDARY applies to virtual_stack_dynamic_rtx and all other virtual frame registers. It appears it's main purpose is to enable alignment optimizations since PREFERRED_STACK_BOUNDARY is used to align local and outgoing argument area etc. So if you don't want the alignment optimizations it is feasible to set STACK_BOUNDARY to a lower value without changing the stack layout. There is also STACK_DYNAMIC_OFFSET which computes the total offset from the stack. It's not obvious whether the default version should align (since outgoing arguments are already aligned there is no easy way to record the extra padding), but we could assert if the offset isn't aligned. Wilco
Wilco Dijkstra wrote: > > Yes STACK_BOUNDARY applies to virtual_stack_dynamic_rtx and all other > virtual frame registers. It appears it's main purpose is to enable alignment > optimizations since PREFERRED_STACK_BOUNDARY is used to align > local and outgoing argument area etc. So if you don't want the alignment > optimizations it is feasible to set STACK_BOUNDARY to a lower value > without changing the stack layout. > > There is also STACK_DYNAMIC_OFFSET which computes the total offset > from the stack. It's not obvious whether the default version should align (since > outgoing arguments are already aligned there is no easy way to record the > extra padding), but we could assert if the offset isn't aligned. Also there is something odd in the sparc backend: /* Given the stack bias, the stack pointer isn't actually aligned. */ #define INIT_EXPANDERS \ do { \ if (crtl->emit.regno_pointer_align && SPARC_STACK_BIAS) \ { \ REGNO_POINTER_ALIGN (STACK_POINTER_REGNUM) = BITS_PER_UNIT; \ REGNO_POINTER_ALIGN (HARD_FRAME_POINTER_REGNUM) = BITS_PER_UNIT; \ } \ } while (0) That lowers the alignment for the stack and frame pointer. So assuming that works and blocks alignment optimizations, why isn't this done for the dynamic offset as well? Wilco
On 10/05/2017 03:16 AM, Richard Biener wrote: > On Thu, Oct 5, 2017 at 1:07 AM, Jeff Law <law@redhat.com> wrote: >> On 10/04/2017 08:53 AM, Eric Botcazou wrote: >>>> This seems like a SPARC target problem to me -- essentially it's >>>> claiming a higher STACK_BOUNDARY than it really has. >>> >>> No, it is not, I can guarantee you that the stack pointer is always aligned to >>> 64-bit boundaries on SPARC, otherwise all hell would break loose... >> Then something is inconsistent somehwere. Either the stack is aligned >> prior to that code or it is not. If it is aligned, then Wilco's patch >> ought to keep it aligned. If is not properly aligned, then well, that's >> the problem ISTM. >> >> Am I missing something here? > > What I got from the discussion and the PR is that the stack hardregister > is properly aligned but what GCC maps to it (virtual or frame or whatever) > might not be at all points. Ah! But I'd probably claim that having the virtual unaligned is erroneous. > > allocate_dynamic_stack_space uses virtual_stack_dynamic_rtx and I'm not > sure STACK_BOUNDARY applies to it? > > Not that I know anything about this here ;) My first thought is that sure it should apply. It just seems wrong that STACK_BOUNDARY wouldn't apply to the virtual. But I doubt we've ever documented that as a requirement/assumption. Jeff
On 10/17/2017 06:04 AM, Wilco Dijkstra wrote: > Wilco Dijkstra wrote: >> >> Yes STACK_BOUNDARY applies to virtual_stack_dynamic_rtx and all other >> virtual frame registers. It appears it's main purpose is to enable alignment >> optimizations since PREFERRED_STACK_BOUNDARY is used to align >> local and outgoing argument area etc. So if you don't want the alignment >> optimizations it is feasible to set STACK_BOUNDARY to a lower value >> without changing the stack layout. >> >> There is also STACK_DYNAMIC_OFFSET which computes the total offset >> from the stack. It's not obvious whether the default version should align (since >> outgoing arguments are already aligned there is no easy way to record the >> extra padding), but we could assert if the offset isn't aligned. > > Also there is something odd in the sparc backend: > > /* Given the stack bias, the stack pointer isn't actually aligned. */ > #define INIT_EXPANDERS \ > do { \ > if (crtl->emit.regno_pointer_align && SPARC_STACK_BIAS) \ > { \ > REGNO_POINTER_ALIGN (STACK_POINTER_REGNUM) = BITS_PER_UNIT; \ > REGNO_POINTER_ALIGN (HARD_FRAME_POINTER_REGNUM) = BITS_PER_UNIT; \ > } \ > } while (0) > > That lowers the alignment for the stack and frame pointer. So assuming that works > and blocks alignment optimizations, why isn't this done for the dynamic offset as well? No clue, but ISTM that it should. Eric, can you try that and see if it addresses these problems? I'd really like to get this wrapped up, but I don't have access to any sparc systems to test it myself. Jeff
On Thu, Oct 26, 2017 at 4:55 PM, Jeff Law <law@redhat.com> wrote: > On 10/17/2017 06:04 AM, Wilco Dijkstra wrote: >> Wilco Dijkstra wrote: >>> >>> Yes STACK_BOUNDARY applies to virtual_stack_dynamic_rtx and all other >>> virtual frame registers. It appears it's main purpose is to enable alignment >>> optimizations since PREFERRED_STACK_BOUNDARY is used to align >>> local and outgoing argument area etc. So if you don't want the alignment >>> optimizations it is feasible to set STACK_BOUNDARY to a lower value >>> without changing the stack layout. >>> >>> There is also STACK_DYNAMIC_OFFSET which computes the total offset >>> from the stack. It's not obvious whether the default version should align (since >>> outgoing arguments are already aligned there is no easy way to record the >>> extra padding), but we could assert if the offset isn't aligned. >> >> Also there is something odd in the sparc backend: >> >> /* Given the stack bias, the stack pointer isn't actually aligned. */ >> #define INIT_EXPANDERS \ >> do { \ >> if (crtl->emit.regno_pointer_align && SPARC_STACK_BIAS) \ >> { \ >> REGNO_POINTER_ALIGN (STACK_POINTER_REGNUM) = BITS_PER_UNIT; \ >> REGNO_POINTER_ALIGN (HARD_FRAME_POINTER_REGNUM) = BITS_PER_UNIT; \ >> } \ >> } while (0) >> >> That lowers the alignment for the stack and frame pointer. So assuming that works >> and blocks alignment optimizations, why isn't this done for the dynamic offset as well? > No clue, but ISTM that it should. Eric, can you try that and see if it > addresses these problems? I'd really like to get this wrapped up, but I > don't have access to any sparc systems to test it myself. Or maybe adjust all non-hardreg stack pointers by the bias so they _are_ aligned. And of course make sure we always use the aligned pointers when allocating. Weird ABI ... Richard. > Jeff
> No clue, but ISTM that it should. Eric, can you try that and see if it > addresses these problems? I'd really like to get this wrapped up, but I > don't have access to any sparc systems to test it myself. Yes, the INIT_EXPANDERS trick works for SPARC (but this has nothing to do with SPARC_STACK_BIAS) and avoid hardcoding the bogus alignment assumption in the get_dynamic_stack_size function. As a matter of fact, this was the approach originally used by Dominik Vogt last year. Of course this doesn't address the same potential issue on other targets but you don't seem to care much about that, so who am I to do it after all? ;-) Tested on x86_64-suse-linux and SPARC/Solaris, applied on the mainline. 2017-12-13 Eric Botcazou <ebotcazou@adacore.com> Dominik Vogt <vogt@linux.vnet.ibm.com> PR middle-end/78468 * emit-rtl.c (init_emit): Remove ??? comment. * explow.c (get_dynamic_stack_size): Take known alignment of stack pointer + STACK_DYNAMIC_OFFSET into account in lieu of STACK_BOUNDARY * config/sparc/sparc.h (INIT_EXPANDERS): In 32-bit mode, lower the alignment of 3 virtual registers to BITS_PER_WORD. * config/sparc/sparc.c (sparc_compute_frame_size): Simplify.
On 12/13/2017 04:17 PM, Eric Botcazou wrote: >> No clue, but ISTM that it should. Eric, can you try that and see if it >> addresses these problems? I'd really like to get this wrapped up, but I >> don't have access to any sparc systems to test it myself. > > Yes, the INIT_EXPANDERS trick works for SPARC (but this has nothing to do with > SPARC_STACK_BIAS) and avoid hardcoding the bogus alignment assumption in the > get_dynamic_stack_size function. As a matter of fact, this was the approach > originally used by Dominik Vogt last year. > > Of course this doesn't address the same potential issue on other targets but > you don't seem to care much about that, so who am I to do it after all? ;-) > > Tested on x86_64-suse-linux and SPARC/Solaris, applied on the mainline. > > > 2017-12-13 Eric Botcazou <ebotcazou@adacore.com> > Dominik Vogt <vogt@linux.vnet.ibm.com> > > PR middle-end/78468 > * emit-rtl.c (init_emit): Remove ??? comment. > * explow.c (get_dynamic_stack_size): Take known alignment of stack > pointer + STACK_DYNAMIC_OFFSET into account in lieu of STACK_BOUNDARY > * config/sparc/sparc.h (INIT_EXPANDERS): In 32-bit mode, lower the > alignment of 3 virtual registers to BITS_PER_WORD. > > * config/sparc/sparc.c (sparc_compute_frame_size): Simplify. > Thanks for taking care of this. jeff
diff --git a/gcc/explow.c b/gcc/explow.c index 50074e281edd5270c76d29feac6b7a92f598d11d..d3148273030a010ece1f8ea1c14eef64bbf4e78a 100644 --- a/gcc/explow.c +++ b/gcc/explow.c @@ -1234,15 +1234,20 @@ get_dynamic_stack_size (rtx *psize, unsigned size_align, example), so we must preventively align the value. We leave space in SIZE for the hole that might result from the alignment operation. */ - extra = (required_align - BITS_PER_UNIT) / BITS_PER_UNIT; - size = plus_constant (Pmode, size, extra); - size = force_operand (size, NULL_RTX); - - if (flag_stack_usage_info && pstack_usage_size) - *pstack_usage_size += extra; + /* Since the stack is presumed to be aligned before this allocation, + we only need to increase the size of the allocation if the required + alignment is more than the stack alignment. */ + if (required_align > STACK_BOUNDARY) + { + extra = (required_align - STACK_BOUNDARY) / BITS_PER_UNIT; + size = plus_constant (Pmode, size, extra); + size = force_operand (size, NULL_RTX); + if (size_align > STACK_BOUNDARY) + size_align = STACK_BOUNDARY; - if (extra && size_align > BITS_PER_UNIT) - size_align = BITS_PER_UNIT; + if (flag_stack_usage_info && pstack_usage_size) + *pstack_usage_size += extra; + } /* Round the size to a multiple of the required stack alignment. Since the stack is presumed to be rounded before this allocation,