LRA: Fix setup_sp_offset

Message ID	6fd0b11d-54e5-99eb-f8c1-1f3e282f4489@suse.de
State	New
Headers	show Return-Path: <gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org> DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 59B1D3860C3B Date: Thu, 22 Aug 2024 17:45:00 +0200 (CEST) From: Michael Matz <matz@suse.de> To: gcc-patches@gcc.gnu.org Subject: LRA: Fix setup_sp_offset Message-ID: <6fd0b11d-54e5-99eb-f8c1-1f3e282f4489@suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Precedence: list Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org
Series	LRA: Fix setup_sp_offset \| expand LRA: Fix setup_sp_offset

Michael Matz Aug. 22, 2024, 3:45 p.m. UTC

This is part of making m68k work with LRA.  See PR116429.
In short: setup_sp_offset is internally inconsistent.  It wants to
setup the sp_offset for newly generated instructions.  sp_offset for
an instruction is always the state of the sp-offset right before that
instruction.  For that it starts at the (assumed correct) sp_offset
of the instruction right after the given (new) sequence, and then
iterates that sequence forward simulating its effects on sp_offset.

That can't ever be right: either it needs to start at the front
and simulate forward, or start at the end and simulate backward.
The former seems to be the more natural way.  Funnily the local
variable holding that instruction is also called 'before'.

This changes it to the first variant: start before the sequence,
do one simulation step to get the sp-offset state in front of the
sequence and then continue simulating.

More details: in the problematic testcase we start with this
situation (sp_off before 550 is 0):

  550: [--sp] = 0             sp_off = 0  {pushexthisi_const}
  551: [--sp] = 37            sp_off = -4 {pushexthisi_const}
  552: [--sp] = r37           sp_off = -8 {movsi_m68k2}
  554: [--sp] = r116 - r37    sp_off = -12 {subsi3}
  556: call                   sp_off = -16

insn 554 doesn't match its constraints and needs some reloads:

      Creating newreg=262, assigning class DATA_REGS to r262
  554: r262:SI=r262:SI-r37:SI
      REG_ARGS_SIZE 0x10
    Inserting insn reload before:
  996: r262:SI=r116:SI
    Inserting insn reload after:
  997: [--%sp:SI]=r262:SI

         Considering alt=0 of insn 997:   (0) =g  (1) damSKT
            1 Non pseudo reload: reject++
          overall=1,losers=0,rld_nregs=0
      Choosing alt 0 in insn 997:  (0) =g  (1) damSKT {*movsi_m68k2} (sp_off=-16)

Note how insn 997 (the after-reload) now has sp_off=-16 already.  It all
goes downhill from there.  We end up with these insns:

  552: [--sp] = r37           sp_off = -8 {movsi_m68k2}
  996: r262 = r116            sp_off = -12
  554: r262 = r262 - r37      sp_off = -12
  997: [--sp] = r262          sp_off = -16  (!!! should be -12)
  556: call                   sp_off = -16

The call insn sp_off remains at the correct -16, but internally it's already
inconsistent here.  If the sp_off before an insn is -16, and that insn
pre_decs sp, then the after-insn sp_off should be -20.

	PR target/116429
	* lra.cc (setup_sp_offset): Start with sp_offset from
	before the new sequence, not from after.
---
 gcc/lra.cc | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)
---
Regstrapped on x86-64-linux.  Okay?

Jeff Law Aug. 25, 2024, 2:29 p.m. UTC | #1

On 8/22/24 9:45 AM, Michael Matz wrote:
> This is part of making m68k work with LRA.  See PR116429.
> In short: setup_sp_offset is internally inconsistent.  It wants to
> setup the sp_offset for newly generated instructions.  sp_offset for
> an instruction is always the state of the sp-offset right before that
> instruction.  For that it starts at the (assumed correct) sp_offset
> of the instruction right after the given (new) sequence, and then
> iterates that sequence forward simulating its effects on sp_offset.
> 
> That can't ever be right: either it needs to start at the front
> and simulate forward, or start at the end and simulate backward.
> The former seems to be the more natural way.  Funnily the local
> variable holding that instruction is also called 'before'.
> 
> This changes it to the first variant: start before the sequence,
> do one simulation step to get the sp-offset state in front of the
> sequence and then continue simulating.
> 
> More details: in the problematic testcase we start with this
> situation (sp_off before 550 is 0):
> 
>    550: [--sp] = 0             sp_off = 0  {pushexthisi_const}
>    551: [--sp] = 37            sp_off = -4 {pushexthisi_const}
>    552: [--sp] = r37           sp_off = -8 {movsi_m68k2}
>    554: [--sp] = r116 - r37    sp_off = -12 {subsi3}
>    556: call                   sp_off = -16
> 
> insn 554 doesn't match its constraints and needs some reloads:
> 
>        Creating newreg=262, assigning class DATA_REGS to r262
>    554: r262:SI=r262:SI-r37:SI
>        REG_ARGS_SIZE 0x10
>      Inserting insn reload before:
>    996: r262:SI=r116:SI
>      Inserting insn reload after:
>    997: [--%sp:SI]=r262:SI
> 
>           Considering alt=0 of insn 997:   (0) =g  (1) damSKT
>              1 Non pseudo reload: reject++
>            overall=1,losers=0,rld_nregs=0
>        Choosing alt 0 in insn 997:  (0) =g  (1) damSKT {*movsi_m68k2} (sp_off=-16)
> 
> Note how insn 997 (the after-reload) now has sp_off=-16 already.  It all
> goes downhill from there.  We end up with these insns:
> 
>    552: [--sp] = r37           sp_off = -8 {movsi_m68k2}
>    996: r262 = r116            sp_off = -12
>    554: r262 = r262 - r37      sp_off = -12
>    997: [--sp] = r262          sp_off = -16  (!!! should be -12)
>    556: call                   sp_off = -16
> 
> The call insn sp_off remains at the correct -16, but internally it's already
> inconsistent here.  If the sp_off before an insn is -16, and that insn
> pre_decs sp, then the after-insn sp_off should be -20.
> 
> 	PR target/116429
> 	* lra.cc (setup_sp_offset): Start with sp_offset from
> 	before the new sequence, not from after.
I think you're right in that the current code isn't correct, but the 
natural question is how in the world has this worked to-date.   Though I 
guess targets which push arguments are a dying breed (though I would 
have expected i386 to have tripped over this at some point).

OK. Though I fear there may be fallout on this one...

jeff

H.J. Lu Aug. 25, 2024, 3:16 p.m. UTC | #2

On Sun, Aug 25, 2024 at 7:30 AM Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
> On 8/22/24 9:45 AM, Michael Matz wrote:
> > This is part of making m68k work with LRA.  See PR116429.
> > In short: setup_sp_offset is internally inconsistent.  It wants to
> > setup the sp_offset for newly generated instructions.  sp_offset for
> > an instruction is always the state of the sp-offset right before that
> > instruction.  For that it starts at the (assumed correct) sp_offset
> > of the instruction right after the given (new) sequence, and then
> > iterates that sequence forward simulating its effects on sp_offset.
> >
> > That can't ever be right: either it needs to start at the front
> > and simulate forward, or start at the end and simulate backward.
> > The former seems to be the more natural way.  Funnily the local
> > variable holding that instruction is also called 'before'.
> >
> > This changes it to the first variant: start before the sequence,
> > do one simulation step to get the sp-offset state in front of the
> > sequence and then continue simulating.
> >
> > More details: in the problematic testcase we start with this
> > situation (sp_off before 550 is 0):
> >
> >    550: [--sp] = 0             sp_off = 0  {pushexthisi_const}
> >    551: [--sp] = 37            sp_off = -4 {pushexthisi_const}
> >    552: [--sp] = r37           sp_off = -8 {movsi_m68k2}
> >    554: [--sp] = r116 - r37    sp_off = -12 {subsi3}
> >    556: call                   sp_off = -16
> >
> > insn 554 doesn't match its constraints and needs some reloads:
> >
> >        Creating newreg=262, assigning class DATA_REGS to r262
> >    554: r262:SI=r262:SI-r37:SI
> >        REG_ARGS_SIZE 0x10
> >      Inserting insn reload before:
> >    996: r262:SI=r116:SI
> >      Inserting insn reload after:
> >    997: [--%sp:SI]=r262:SI
> >
> >           Considering alt=0 of insn 997:   (0) =g  (1) damSKT
> >              1 Non pseudo reload: reject++
> >            overall=1,losers=0,rld_nregs=0
> >        Choosing alt 0 in insn 997:  (0) =g  (1) damSKT {*movsi_m68k2} (sp_off=-16)
> >
> > Note how insn 997 (the after-reload) now has sp_off=-16 already.  It all
> > goes downhill from there.  We end up with these insns:
> >
> >    552: [--sp] = r37           sp_off = -8 {movsi_m68k2}
> >    996: r262 = r116            sp_off = -12
> >    554: r262 = r262 - r37      sp_off = -12
> >    997: [--sp] = r262          sp_off = -16  (!!! should be -12)
> >    556: call                   sp_off = -16
> >
> > The call insn sp_off remains at the correct -16, but internally it's already
> > inconsistent here.  If the sp_off before an insn is -16, and that insn
> > pre_decs sp, then the after-insn sp_off should be -20.
> >
> >       PR target/116429
> >       * lra.cc (setup_sp_offset): Start with sp_offset from
> >       before the new sequence, not from after.
> I think you're right in that the current code isn't correct, but the
> natural question is how in the world has this worked to-date.   Though I
> guess targets which push arguments are a dying breed (though I would
> have expected i386 to have tripped over this at some point).

Is it because i386 pushes the return address on stack?

> OK. Though I fear there may be fallout on this one...
>
> jeff
>

Andreas Schwab Aug. 25, 2024, 3:32 p.m. UTC | #3

On Aug 25 2024, H.J. Lu wrote:

> Is it because i386 pushes the return address on stack?

Like m68k.

Michael Matz Aug. 26, 2024, 2:14 p.m. UTC | #4

Hello,

On Sun, 25 Aug 2024, Jeff Law wrote:

> >    550: [--sp] = 0             sp_off = 0  {pushexthisi_const}
> >    551: [--sp] = 37            sp_off = -4 {pushexthisi_const}
> >    552: [--sp] = r37           sp_off = -8 {movsi_m68k2}
> >    554: [--sp] = r116 - r37    sp_off = -12 {subsi3}
> >    556: call                   sp_off = -16
> > 
> > insn 554 doesn't match its constraints and needs some reloads:
> 
> I think you're right in that the current code isn't correct, but the 
> natural question is how in the world has this worked to-date.  Though I 
> guess targets which push arguments are a dying breed (though I would 
> have expected i386 to have tripped over this at some point).

Yeah, I wondered as well.  For things to go wrong some instructions that 
contain pre/post-inc/dec of the stack pointer need to have reloads in such 
a way that the actual SP-change sideeffect moves to a different 
instruction.  In this case it was:

    554: [--sp] = r116 - r37

-->

    996: r262 = r116
    554: r262 = r262 - r37
    997: [--sp] = r262

And for this to happen the targets needs to have instructions that have 
SP-change sideeffect _and_ accept complicated expressions _and_ constrain 
the operand containing the side-effect in some way to the operands of 
these expressions.  (In this case: the subsi3 accepts a generic 
mem-operand destination, which includes pre-increment, and two generic 
register input operands; but constrains it such that the dest must be same 
as op0 of the minus).

I guess that LRA targets until now, when they have SP-change (e.g. x86 
push/pop) are simple enough that the SP-change doesn't need reload.  E.g. 
the push on i386 only accepts a simple register or memory as input, and 
doesn't otherwise tie the SP-memory operands to the input:

   [--sp] = op0     # a simple push of simple general_operand op0

If any reloads are necessary for some reason then it will be on op0 which 
most likely will simply be a force_reg:

   regT = op0
   [--sp] = regT

The identity of the instruction that does the SP-change doesn't change.
setup_sp_offset will only be called on the new regT setter which doesn't 
contain any interesting effects on SP whatsoever, and the sp_offset value 
of the push will remain correct for it.

But if there are output reloads that contain the [--sp] things will go 
wrong, as here.  Typical RISC ISAs, even if they have SP-changes will have 
them in their load/store insns, in which any reloads are similar to the 
x86 case: the side-effect will remain on the original instruction and 
everything will work.

> OK. Though I fear there may be fallout on this one...

Me as well, but I can have hope, can I? :-)

Ciao,
Michael.

Paul Koning Aug. 26, 2024, 2:21 p.m. UTC | #5

> On Aug 26, 2024, at 10:14 AM, Michael Matz <matz@suse.de> wrote:
> 
> Hello,
> 
> On Sun, 25 Aug 2024, Jeff Law wrote:
> 
>>>   550: [--sp] = 0             sp_off = 0  {pushexthisi_const}
>>>   551: [--sp] = 37            sp_off = -4 {pushexthisi_const}
>>>   552: [--sp] = r37           sp_off = -8 {movsi_m68k2}
>>>   554: [--sp] = r116 - r37    sp_off = -12 {subsi3}
>>>   556: call                   sp_off = -16
>>> 
>>> insn 554 doesn't match its constraints and needs some reloads:
>> 
>> I think you're right in that the current code isn't correct, but the 
>> natural question is how in the world has this worked to-date.  Though I 
>> guess targets which push arguments are a dying breed (though I would 
>> have expected i386 to have tripped over this at some point).
> 
> Yeah, I wondered as well.  For things to go wrong some instructions that 
> contain pre/post-inc/dec of the stack pointer need to have reloads in such 
> a way that the actual SP-change sideeffect moves to a different 
> instruction.  

I think I've seen that in the past on PDP11, and reported it, but I thought that particular issue was fixed not too long after.

	paul

Michael Matz Aug. 26, 2024, 2:40 p.m. UTC | #6

Hello,

On Mon, 26 Aug 2024, Paul Koning wrote:

> >>>   550: [--sp] = 0             sp_off = 0  {pushexthisi_const}
> >>>   551: [--sp] = 37            sp_off = -4 {pushexthisi_const}
> >>>   552: [--sp] = r37           sp_off = -8 {movsi_m68k2}
> >>>   554: [--sp] = r116 - r37    sp_off = -12 {subsi3}
> >>>   556: call                   sp_off = -16
> >>> 
> >>> insn 554 doesn't match its constraints and needs some reloads:
> >> 
> >> I think you're right in that the current code isn't correct, but the 
> >> natural question is how in the world has this worked to-date.  Though I 
> >> guess targets which push arguments are a dying breed (though I would 
> >> have expected i386 to have tripped over this at some point).
> > 
> > Yeah, I wondered as well.  For things to go wrong some instructions that 
> > contain pre/post-inc/dec of the stack pointer need to have reloads in such 
> > a way that the actual SP-change sideeffect moves to a different 
> > instruction.  
> 
> I think I've seen that in the past on PDP11, and reported it, but I 
> thought that particular issue was fixed not too long after.

Do you have a reference handy?  I'd like to take a look, if for nothing 
else than curiosity ;-)


Ciao,
Michael.

Paul Koning Aug. 26, 2024, 8:18 p.m. UTC | #7

> On Aug 26, 2024, at 10:40 AM, Michael Matz <matz@suse.de> wrote:
> 
> Hello,
> 
> On Mon, 26 Aug 2024, Paul Koning wrote:
> 
>>>>>  550: [--sp] = 0             sp_off = 0  {pushexthisi_const}
>>>>>  551: [--sp] = 37            sp_off = -4 {pushexthisi_const}
>>>>>  552: [--sp] = r37           sp_off = -8 {movsi_m68k2}
>>>>>  554: [--sp] = r116 - r37    sp_off = -12 {subsi3}
>>>>>  556: call                   sp_off = -16
>>>>> 
>>>>> insn 554 doesn't match its constraints and needs some reloads:
>>>> 
>>>> I think you're right in that the current code isn't correct, but the 
>>>> natural question is how in the world has this worked to-date.  Though I 
>>>> guess targets which push arguments are a dying breed (though I would 
>>>> have expected i386 to have tripped over this at some point).
>>> 
>>> Yeah, I wondered as well.  For things to go wrong some instructions that 
>>> contain pre/post-inc/dec of the stack pointer need to have reloads in such 
>>> a way that the actual SP-change sideeffect moves to a different 
>>> instruction.  
>> 
>> I think I've seen that in the past on PDP11, and reported it, but I 
>> thought that particular issue was fixed not too long after.
> 
> Do you have a reference handy?  I'd like to take a look, if for nothing 
> else than curiosity ;-)

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87944 <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87944> which says it was fixed in GCC 14 on 5/30/2023.

	paul

Michael Matz Aug. 27, 2024, 1:50 p.m. UTC | #8

Hello,

On Mon, 26 Aug 2024, Paul Koning wrote:

> >>> Yeah, I wondered as well.  For things to go wrong some instructions that 
> >>> contain pre/post-inc/dec of the stack pointer need to have reloads in such 
> >>> a way that the actual SP-change sideeffect moves to a different 
> >>> instruction.  
> >> 
> >> I think I've seen that in the past on PDP11, and reported it, but I 
> >> thought that particular issue was fixed not too long after.
> > 
> > Do you have a reference handy?  I'd like to take a look, if for nothing 
> > else than curiosity ;-)
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87944 <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87944> which says it was fixed in GCC 14 on 5/30/2023.

Ah, yes, thanks.  The referenced patch there changed stuff at the caller 
of setup_sp_offset for the before sequence only and left the after 
sequence alone.  I think it worked for your case only because it was a 
single reload and it was in front of the insn.

Ciao,
Michael.

LRA: Fix setup_sp_offset

Commit Message

Comments

Patch