Message ID | 20210906104018.2697413-2-rearnsha@arm.com |
---|---|
State | New |
Headers | show |
Series | lower more cases of memcpy [PR102125] | expand |
On Mon, Sep 6, 2021 at 12:40 PM Richard Earnshaw <rearnsha@arm.com> wrote: > > > GCC was recently changed to prevent simplify_subreg from simplifying > a subreg of a mem when the mode of the new mem would have stricter alignment > constraints than the inner mem already has when the target requires > STRICT_ALIGNMENT. > > However, such targets may have specialist patterns that can handle > unaligned accesses and this restriction turns out to be unduly restrictive. > So limit this restriction to only apply when the inner mem is naturally > aligned to the inner mode. Hmm, I think this can end up either generating wrong code or recog fails. The specific combination of alignment and mode of 'op' has been validated to be supported, replacing the mode with sth else would need re-validation of the combination. I'm not sure we can for example just query movmisalign support here and hope for LRA to reload the mem with that. So - where do you run into this? Is it possible to catch the situation on a higher level where more context as in the whole insn is visible? Thanks, Richard. > > gcc/ChangeLog: > > PR target/102125 > * simplify-rtx.c (simplify_context::simplify_subreg): Allow > simplifying (subreg (mem())) when the inner mem is already > misaligned for its type. > --- > gcc/simplify-rtx.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) >
On 06/09/2021 11:58, Richard Biener via Gcc-patches wrote: > On Mon, Sep 6, 2021 at 12:40 PM Richard Earnshaw <rearnsha@arm.com> wrote: >> >> >> GCC was recently changed to prevent simplify_subreg from simplifying >> a subreg of a mem when the mode of the new mem would have stricter alignment >> constraints than the inner mem already has when the target requires >> STRICT_ALIGNMENT. >> >> However, such targets may have specialist patterns that can handle >> unaligned accesses and this restriction turns out to be unduly restrictive. >> So limit this restriction to only apply when the inner mem is naturally >> aligned to the inner mode. > > Hmm, I think this can end up either generating wrong code or > recog fails. The specific combination of alignment and mode of 'op' > has been validated to be supported, replacing the mode with sth > else would need re-validation of the combination. I'm not sure > we can for example just query movmisalign support here and > hope for LRA to reload the mem with that. > > So - where do you run into this? Is it possible to catch the > situation on a higher level where more context as in the whole insn > is visible? I ran into it with patch 2 of this series when calling gen_highpart on a misaligned mem. IIRC gen_highpart would end up returning (subreg:SI (mem:DI (addr [A8])) 4), while gen_lowpart would simplify the operation to (mem:SI (addr [A8])) as expected. (subreg:SI (mem:DI (addr [A8])) 4) is really problematic, because it's not a memory_operand (from the manual: it will get reloaded into a register later on). But that's no good here, I don't want this reloading into a wide register later, I need it to be narrowed to the component part now. R. > > Thanks, > Richard. > >> >> gcc/ChangeLog: >> >> PR target/102125 >> * simplify-rtx.c (simplify_context::simplify_subreg): Allow >> simplifying (subreg (mem())) when the inner mem is already >> misaligned for its type. >> --- >> gcc/simplify-rtx.c | 6 +++++- >> 1 file changed, 5 insertions(+), 1 deletion(-) >>
On Mon, Sep 6, 2021 at 1:08 PM Richard Earnshaw <Richard.Earnshaw@foss.arm.com> wrote: > > > > On 06/09/2021 11:58, Richard Biener via Gcc-patches wrote: > > On Mon, Sep 6, 2021 at 12:40 PM Richard Earnshaw <rearnsha@arm.com> wrote: > >> > >> > >> GCC was recently changed to prevent simplify_subreg from simplifying > >> a subreg of a mem when the mode of the new mem would have stricter alignment > >> constraints than the inner mem already has when the target requires > >> STRICT_ALIGNMENT. > >> > >> However, such targets may have specialist patterns that can handle > >> unaligned accesses and this restriction turns out to be unduly restrictive. > >> So limit this restriction to only apply when the inner mem is naturally > >> aligned to the inner mode. > > > > Hmm, I think this can end up either generating wrong code or > > recog fails. The specific combination of alignment and mode of 'op' > > has been validated to be supported, replacing the mode with sth > > else would need re-validation of the combination. I'm not sure > > we can for example just query movmisalign support here and > > hope for LRA to reload the mem with that. > > > > So - where do you run into this? Is it possible to catch the > > situation on a higher level where more context as in the whole insn > > is visible? > > I ran into it with patch 2 of this series when calling gen_highpart on a > misaligned mem. IIRC gen_highpart would end up returning (subreg:SI > (mem:DI (addr [A8])) 4), while gen_lowpart would simplify the operation > to (mem:SI (addr [A8])) as expected. > > (subreg:SI (mem:DI (addr [A8])) 4) is really problematic, because it's > not a memory_operand (from the manual: it will get reloaded into a > register later on). But that's no good here, I don't want this > reloading into a wide register later, I need it to be narrowed to the > component part now. So maybe calling gen_highpart is not what you want then? adjust_address is IIRC what one uses to offset a MEM and change its mode. Richard. > > R. > > > > > Thanks, > > Richard. > > > >> > >> gcc/ChangeLog: > >> > >> PR target/102125 > >> * simplify-rtx.c (simplify_context::simplify_subreg): Allow > >> simplifying (subreg (mem())) when the inner mem is already > >> misaligned for its type. > >> --- > >> gcc/simplify-rtx.c | 6 +++++- > >> 1 file changed, 5 insertions(+), 1 deletion(-) > >>
On 06/09/2021 12:13, Richard Biener wrote: > On Mon, Sep 6, 2021 at 1:08 PM Richard Earnshaw > <Richard.Earnshaw@foss.arm.com> wrote: >> >> >> >> On 06/09/2021 11:58, Richard Biener via Gcc-patches wrote: >>> On Mon, Sep 6, 2021 at 12:40 PM Richard Earnshaw <rearnsha@arm.com> wrote: >>>> >>>> >>>> GCC was recently changed to prevent simplify_subreg from simplifying >>>> a subreg of a mem when the mode of the new mem would have stricter alignment >>>> constraints than the inner mem already has when the target requires >>>> STRICT_ALIGNMENT. >>>> >>>> However, such targets may have specialist patterns that can handle >>>> unaligned accesses and this restriction turns out to be unduly restrictive. >>>> So limit this restriction to only apply when the inner mem is naturally >>>> aligned to the inner mode. >>> >>> Hmm, I think this can end up either generating wrong code or >>> recog fails. The specific combination of alignment and mode of 'op' >>> has been validated to be supported, replacing the mode with sth >>> else would need re-validation of the combination. I'm not sure >>> we can for example just query movmisalign support here and >>> hope for LRA to reload the mem with that. >>> >>> So - where do you run into this? Is it possible to catch the >>> situation on a higher level where more context as in the whole insn >>> is visible? >> >> I ran into it with patch 2 of this series when calling gen_highpart on a >> misaligned mem. IIRC gen_highpart would end up returning (subreg:SI >> (mem:DI (addr [A8])) 4), while gen_lowpart would simplify the operation >> to (mem:SI (addr [A8])) as expected. >> >> (subreg:SI (mem:DI (addr [A8])) 4) is really problematic, because it's >> not a memory_operand (from the manual: it will get reloaded into a >> register later on). But that's no good here, I don't want this >> reloading into a wide register later, I need it to be narrowed to the >> component part now. > > So maybe calling gen_highpart is not what you want then? > adjust_address is IIRC what one uses to offset a MEM and change > its mode. It was based on looking at the patch for PR 100106 (r12-163-c33db31d9ad9). That patch added the MEM_ALIGN constraint when previously there was none here and my call would have been simplified. Are you saying that GCC was always wrong in this respect? All I've done was to tightened the check that Bernd added. R. > > Richard. > >> >> R. >> >>> >>> Thanks, >>> Richard. >>> >>>> >>>> gcc/ChangeLog: >>>> >>>> PR target/102125 >>>> * simplify-rtx.c (simplify_context::simplify_subreg): Allow >>>> simplifying (subreg (mem())) when the inner mem is already >>>> misaligned for its type. >>>> --- >>>> gcc/simplify-rtx.c | 6 +++++- >>>> 1 file changed, 5 insertions(+), 1 deletion(-) >>>>
On Mon, Sep 6, 2021 at 1:23 PM Richard Earnshaw <Richard.Earnshaw@foss.arm.com> wrote: > > > > On 06/09/2021 12:13, Richard Biener wrote: > > On Mon, Sep 6, 2021 at 1:08 PM Richard Earnshaw > > <Richard.Earnshaw@foss.arm.com> wrote: > >> > >> > >> > >> On 06/09/2021 11:58, Richard Biener via Gcc-patches wrote: > >>> On Mon, Sep 6, 2021 at 12:40 PM Richard Earnshaw <rearnsha@arm.com> wrote: > >>>> > >>>> > >>>> GCC was recently changed to prevent simplify_subreg from simplifying > >>>> a subreg of a mem when the mode of the new mem would have stricter alignment > >>>> constraints than the inner mem already has when the target requires > >>>> STRICT_ALIGNMENT. > >>>> > >>>> However, such targets may have specialist patterns that can handle > >>>> unaligned accesses and this restriction turns out to be unduly restrictive. > >>>> So limit this restriction to only apply when the inner mem is naturally > >>>> aligned to the inner mode. > >>> > >>> Hmm, I think this can end up either generating wrong code or > >>> recog fails. The specific combination of alignment and mode of 'op' > >>> has been validated to be supported, replacing the mode with sth > >>> else would need re-validation of the combination. I'm not sure > >>> we can for example just query movmisalign support here and > >>> hope for LRA to reload the mem with that. > >>> > >>> So - where do you run into this? Is it possible to catch the > >>> situation on a higher level where more context as in the whole insn > >>> is visible? > >> > >> I ran into it with patch 2 of this series when calling gen_highpart on a > >> misaligned mem. IIRC gen_highpart would end up returning (subreg:SI > >> (mem:DI (addr [A8])) 4), while gen_lowpart would simplify the operation > >> to (mem:SI (addr [A8])) as expected. > >> > >> (subreg:SI (mem:DI (addr [A8])) 4) is really problematic, because it's > >> not a memory_operand (from the manual: it will get reloaded into a > >> register later on). But that's no good here, I don't want this > >> reloading into a wide register later, I need it to be narrowed to the > >> component part now. > > > > So maybe calling gen_highpart is not what you want then? > > adjust_address is IIRC what one uses to offset a MEM and change > > its mode. > > It was based on looking at the patch for PR 100106 > (r12-163-c33db31d9ad9). That patch added the MEM_ALIGN constraint when > previously there was none here and my call would have been simplified. > Are you saying that GCC was always wrong in this respect? Yes. > All I've done was to tightened the check that Bernd added. The check was supposed to verify that the generated MEM is aligned according to its mode. If you'd have a misaligned valid DImode according to movmisalign the target might not have a move insn to move a misaligned SImode so tightening the check to more closely match the originally motivating testcase isn't correct IMHO. Richard. > > R. > > > > > Richard. > > > >> > >> R. > >> > >>> > >>> Thanks, > >>> Richard. > >>> > >>>> > >>>> gcc/ChangeLog: > >>>> > >>>> PR target/102125 > >>>> * simplify-rtx.c (simplify_context::simplify_subreg): Allow > >>>> simplifying (subreg (mem())) when the inner mem is already > >>>> misaligned for its type. > >>>> --- > >>>> gcc/simplify-rtx.c | 6 +++++- > >>>> 1 file changed, 5 insertions(+), 1 deletion(-) > >>>>
diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c index ebad5cb5a79..1baa50cb1b9 100644 --- a/gcc/simplify-rtx.c +++ b/gcc/simplify-rtx.c @@ -7317,7 +7317,11 @@ simplify_context::simplify_subreg (machine_mode outermode, rtx op, have instruction to move the whole thing. */ && (! MEM_VOLATILE_P (op) || ! have_insn_for (SET, innermode)) - && !(STRICT_ALIGNMENT && MEM_ALIGN (op) < GET_MODE_ALIGNMENT (outermode)) + /* On STRICT_ALIGNMENT targets, don't allow the alignment to be increased + if the inner object is already naturally aligned. */ + && !(STRICT_ALIGNMENT + && MEM_ALIGN (op) >= GET_MODE_ALIGNMENT (innermode) + && MEM_ALIGN (op) < GET_MODE_ALIGNMENT (outermode)) && known_le (outersize, innersize)) return adjust_address_nv (op, outermode, byte);